To better reconstruct 4D cone-beam computed tomography (CBCT) images, a general simultaneous motion estimation and image reconstruction (G-SMEIR) method is proposed to mitigate the local-optimum trapping problem of the original SMEIR method. In addition to the projection-domain motion estimation used in SMEIR, G-SMEIR includes an image-domain motion estimation step within the iteration to achieve better 4D reconstruction. To improve computational efficiency, the computationally intensive image-domain motion estimation is accelerated by parallel computing on graphics processing units (GPUs). The proposed G-SMEIR method is tested in a CBCT simulation study of a 4D XCAT phantom at different noise levels and compared with 3D total variation-based reconstruction (3D TV) and SMEIR. G-SMEIR performs similarly at the regular and low doses. The root mean square error of G-SMEIR is improved by more than 60% over 3D TV and by up to 17% over SMEIR. The structural similarity indices for a representative phase are 0.6418 (3D TV), 0.8893 (SMEIR), and 0.9206 (G-SMEIR). GPU computing shortens the computational time of image-domain motion estimation from about 17 minutes (CPU) to about 40 seconds for each pair of 3D images. The simulation results demonstrate that G-SMEIR yields good image quality at different noise levels in a reasonable time. Further improvement of the motion estimation algorithms and full parallelization of G-SMEIR will be conducted and tested on patient data.
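For readers who wish to reproduce this kind of comparison, the following minimal Python sketch illustrates how a root mean square error, a relative improvement between two methods, and a speedup factor can be computed. It is an illustration only and not the evaluation code used in this work: the function names `rmse`, `relative_improvement`, and `speedup` are hypothetical, and the voxel values are toy numbers rather than results of the study. Only the timing figures (about 17 minutes and about 40 seconds) are taken from the text.

```python
import math


def rmse(recon, reference):
    """Root mean square error between two equally sized voxel sequences."""
    if len(recon) != len(reference):
        raise ValueError("images must have the same number of voxels")
    squared = [(a - b) ** 2 for a, b in zip(recon, reference)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_error, new_error):
    """Fractional reduction of an error measure relative to a baseline."""
    return (baseline_error - new_error) / baseline_error


def speedup(cpu_seconds, gpu_seconds):
    """Ratio of CPU time to GPU time for the same task."""
    return cpu_seconds / gpu_seconds


# Toy voxel values (hypothetical, NOT data from the study).
reference = [0.0, 1.0, 2.0, 3.0]
baseline_recon = [0.5, 1.5, 2.5, 3.5]  # stands in for a baseline method
improved_recon = [0.2, 1.2, 2.2, 3.2]  # stands in for an improved method

baseline_rmse = rmse(baseline_recon, reference)
improved_rmse = rmse(improved_recon, reference)
improvement = relative_improvement(baseline_rmse, improved_rmse)

# Timing figures quoted in the text: about 17 minutes (CPU) and
# about 40 seconds (GPU) per pair of 3D images.
approx_speedup = speedup(17 * 60, 40)

print(f"RMSE improvement on toy data: {improvement:.0%}")
print(f"Approximate GPU speedup: {approx_speedup:.1f}x")
```

Under these rounded timings, the image-domain motion estimation runs roughly 25 times faster on the GPU than on the CPU.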