MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting

Co-corresponding authors
ETRI, South Korea    KAIST, South Korea    Chung-Ang University, South Korea

NVS Visual Comparison with SOTA methods

We compare our method with the state-of-the-art methods SC-GS [20], 4DGS [54], and Deformable 3DGS [57] on monocular video sequences. Alongside the novel view rendering results, we report quantitative metrics: PSNR, SSIM, LPIPS, and storage size. The proposed method achieves superior quantitative and qualitative performance compared to SOTA methods, even with a compact model size.

TL;DR

We propose MoDec-GS, a memory-efficient dynamic 3D Gaussian Splatting (3DGS) framework for novel view reconstruction in complex real-world scenarios. Its core is the Global-to-Local Motion Decomposition (GLMD) method, which captures dynamic motions using Global and Local Canonical Scaffolds with coarse-to-fine adjustments. Temporal Interval Adjustment (TIA) further optimizes temporal segment assignments. Experiments show that MoDec-GS reduces model size by 70% on average compared to state-of-the-art methods while maintaining or improving rendering quality.
Result Graph
Performance visualization graph. The x-axis represents rendering speed (FPS)↑, and the y-axis indicates PSNR↑. Each framework is depicted as a bubble, with the size of the bubble representing the model storage size.

Method Overview

Fig1
Overview of MoDec-GS. To effectively train dynamic 3D Gaussians with complex motion, we introduce Global-to-Local Motion Decomposition (GLMD). We first train a Global Canonical Scaffold-GS (Global CS) with all frames, then apply Global Anchor Deformation (GAD) to obtain a Local Canonical Scaffold-GS (Local CS) dedicated to representing its corresponding temporal segment. Next, to finely adjust the remaining local motion, we apply Local Gaussian Deformation (LGD), which explicitly deforms the reconstructed 3D Gaussians with a shared hexplane. During training, Temporal Interval Adjustment (TIA) is performed, optimizing the temporal intervals into non-uniform intervals that adapt to the scene's level of motion.
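The two-stage pipeline above can be sketched as a small module: a coarse anchor-level deformation per temporal segment (GAD), followed by a fine per-Gaussian residual deformation within the segment (LGD). This is a minimal illustrative sketch with hypothetical module and argument names; the actual MoDec-GS deforms the full set of Scaffold-GS anchor attributes (position, offsets, and local context features) and implements LGD with a shared hexplane rather than a plain MLP.

```python
import torch
import torch.nn as nn

class TwoStageDeformation(nn.Module):
    """Illustrative sketch of Global-to-Local Motion Decomposition (GLMD).
    Names and architecture are assumptions, not the authors' implementation."""

    def __init__(self, feat_dim=32, hidden=64):
        super().__init__()
        # GAD: (anchor position, context feature, segment time) -> anchor offset
        self.gad = nn.Sequential(
            nn.Linear(3 + feat_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )
        # LGD: (Gaussian position, in-segment time) -> per-Gaussian residual offset
        self.lgd = nn.Sequential(
            nn.Linear(3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, anchor_xyz, anchor_feat, gauss_xyz, t_seg, t_local):
        # Stage 1 (coarse): deform anchors from the Global CS into the
        # Local CS of the temporal segment indexed by t_seg.
        t_a = t_seg.expand(anchor_xyz.shape[0], 1)
        local_anchor_xyz = anchor_xyz + self.gad(
            torch.cat([anchor_xyz, anchor_feat, t_a], dim=-1))
        # Stage 2 (fine): explicit residual deformation of each reconstructed
        # 3D Gaussian at the normalized in-segment time t_local.
        t_g = t_local.expand(gauss_xyz.shape[0], 1)
        deformed_gauss_xyz = gauss_xyz + self.lgd(
            torch.cat([gauss_xyz, t_g], dim=-1))
        return local_anchor_xyz, deformed_gauss_xyz
```

The key design point this sketch mirrors is the split of responsibility: global, segment-level motion is carried by a few anchors (cheap to deform), while only the subtle remaining motion is modeled per Gaussian.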

Abstract

3D Gaussian Splatting (3DGS) has made significant strides in scene representation and neural rendering, with intense efforts focused on adapting it for dynamic scenes. Despite delivering remarkable rendering quality and speed, existing methods struggle with storage demands and representing complex real-world motions. To tackle these issues, we propose MoDec-GS, a memory-efficient Gaussian splatting framework designed for reconstructing novel views in challenging scenarios with complex motions. We introduce Global-to-Local Motion Decomposition (GLMD) to effectively capture dynamic motions in a coarse-to-fine manner. This approach leverages Global Canonical Scaffolds (Global CS) and Local Canonical Scaffolds (Local CS), extending the static Scaffold representation to dynamic video reconstruction. For Global CS, we propose Global Anchor Deformation (GAD) to efficiently represent global dynamics along complex motions, by directly deforming the implicit Scaffold attributes, namely anchor position, offset, and local context features. Next, we finely adjust local motions via the explicit Local Gaussian Deformation (LGD) of Local CS. Additionally, we introduce Temporal Interval Adjustment (TIA) to automatically control the temporal coverage of each Local CS during training, allowing MoDec-GS to find optimal interval assignments based on the specified number of temporal segments. Extensive evaluations demonstrate that MoDec-GS achieves an average 70% reduction in model size over state-of-the-art methods for dynamic 3D Gaussians from real-world dynamic videos while maintaining or even improving rendering quality.

2-stage Deformation

Fig1
Concept and Effect of 2-stage Deformation. To represent complex motion of 3D Gaussians, global movement over a time interval can be handled more efficiently through deformation of the anchor itself. In contrast, subtle motions of individual 3D Gaussians within a time interval are effectively addressed by explicit deformation of each Gaussian.

Temporal Interval Adjustment

Fig1
TIA effectiveness. During training, temporal intervals are appropriately adapted to the degree of motion in the scene through TIA. We validate that the accumulated normalized optical flow magnitudes are re-balanced by TIA, which in turn re-balances the degree of motion covered by each interval.
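The re-balancing target described above can be illustrated with a short sketch: given per-frame normalized optical-flow magnitudes, choose non-uniform segment boundaries so that each interval covers roughly the same accumulated motion. This is a simplified, hypothetical illustration of the goal; the actual TIA adjusts interval boundaries progressively during optimization rather than from precomputed flow alone.

```python
import numpy as np

def rebalance_intervals(flow_mags, num_segments):
    """Split a timeline of len(flow_mags) frames into `num_segments`
    non-uniform intervals with roughly equal accumulated flow magnitude.

    flow_mags: per-frame normalized optical-flow magnitudes, shape (T,)
    Returns an array of num_segments + 1 boundary frame indices.
    """
    # Cumulative motion up to (and including) each frame, prefixed with 0.
    cum = np.concatenate([[0.0], np.cumsum(flow_mags)])
    # Equally spaced targets in accumulated-motion space.
    targets = np.linspace(0.0, cum[-1], num_segments + 1)
    # Frame index where cumulative motion first reaches each inner target.
    inner = np.searchsorted(cum, targets[1:-1])
    return np.concatenate([[0], inner, [len(flow_mags)]])
```

For example, with flow magnitudes `[1, 1, 1, 3, 3, 3]` and two segments, the boundary lands after frame 3, so the slow first half gets four frames and the fast second half only two, each covering the same total motion.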

Quantitative Results

quantitative_results
Quantitative results comparison of novel view synthesis on the Dycheck-iPhone [16] and HyperNeRF [45] monocular datasets. All results were locally re-generated in our environment and averaged across all sequences of each dataset. Red and blue denote the best and second-best performances, respectively. For the iPhone dataset, the masked metrics are calculated using the masks provided by the authors.

quantitative_results2
Performance comparison with a NeRF-extension framework, including training and rendering speed, averaged over HyperNeRF's 536×960 vrig sequences [43]. The performance numbers of [11, 18, 22, 44, 45] are sourced from [54].

Demo Video

BibTeX

@misc{kwak2025modecgsglobaltolocalmotiondecomposition,
  title={MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting}, 
  author={Sangwoon Kwak and Joonsoo Kim and Jun Young Jeong and Won-Sik Cheong and Jihyong Oh and Munchurl Kim},
  year={2025},
  eprint={2501.03714},
  archivePrefix={arXiv},
}