TL;DR: We propose MoBluRF, a novel motion deblurring NeRF framework for blurry monocular video, consisting of a Base Ray Initialization (BRI) stage and a Motion Decomposition-based Deblurring (MDD) stage. In the BRI stage, we coarsely reconstruct dynamic 3D scenes from the inaccurate camera poses of the given blurry frames and jointly initialize the base rays, which are later used to predict latent sharp rays. In the MDD stage, we introduce a novel Incremental Latent Sharp-rays Prediction (ILSP) approach for blurry monocular video frames that decomposes the latent sharp rays into global camera motion and local object motion components.
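To make the ray decomposition in ILSP concrete, below is a minimal, hypothetical PyTorch sketch of the idea: each latent sharp ray is obtained from a base ray by first applying a global camera-motion SE(3) warp shared within a frame, and then a per-ray local object-motion SE(3) refinement. The function names, MLP interfaces, and the number of latent sharp rays are illustrative assumptions, not the paper's exact implementation.

```python
import torch

def se3_apply(log_rt, origins, dirs):
    """Apply a small SE(3) transform (axis-angle rotation + translation) to rays.
    log_rt: (..., 6) = [rotation axis-angle (3), translation (3)]."""
    rot_vec, trans = log_rt[..., :3], log_rt[..., 3:]
    theta = rot_vec.norm(dim=-1, keepdim=True).clamp(min=1e-8)
    axis = rot_vec / theta

    def rotate(v):
        # Rodrigues' rotation formula, applied to ray origins and directions.
        cos, sin = torch.cos(theta), torch.sin(theta)
        return (v * cos
                + torch.cross(axis, v, dim=-1) * sin
                + axis * (axis * v).sum(-1, keepdim=True) * (1.0 - cos))

    return rotate(origins) + trans, rotate(dirs)  # directions are only rotated

def predict_latent_sharp_rays(base_origins, base_dirs, t_embed,
                              global_motion_mlp, local_motion_mlp, num_rays=5):
    """Hypothetical ILSP sketch: decompose each latent sharp ray into a global
    camera-motion warp plus a per-ray local object-motion residual."""
    sharp_origins, sharp_dirs = [], []
    for k in range(num_rays):  # one latent sharp ray per blur sub-interval
        k_embed = torch.full_like(t_embed[..., :1], k / max(num_rays - 1, 1))
        # (1) Global camera motion: one SE(3) per (frame, sub-interval).
        g = global_motion_mlp(torch.cat([t_embed, k_embed], dim=-1))            # (..., 6)
        o_g, d_g = se3_apply(g, base_origins, base_dirs)
        # (2) Local object motion: small per-ray SE(3) refinement on top of (1).
        l = local_motion_mlp(torch.cat([o_g, d_g, t_embed, k_embed], dim=-1))   # (..., 6)
        o_l, d_l = se3_apply(l, o_g, d_g)
        sharp_origins.append(o_l)
        sharp_dirs.append(d_l)
    return torch.stack(sharp_origins), torch.stack(sharp_dirs)  # (K, N, 3) each
```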
Overview of our MoBluRF framework. To effectively optimize the sharp radiance field with the imprecise camera poses from blurry video frames, we design MoBluRF with two main procedures (Algo. 2): (a) the Base Ray Initialization (BRI) stage (Sec. III-C and Algo. 1) and (b) the Motion Decomposition-based Deblurring (MDD) stage (Sec. III-D).
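As a complement to the ray-prediction sketch above, the following is a minimal sketch of one MDD-stage training step, assuming the blur formation model commonly used by deblurring NeRFs: renderings along the predicted latent sharp rays are averaged to simulate the blurry observation and supervised with the blurry pixel colors. Here `render_fn`, `predict_latent_sharp_rays`, and the plain MSE loss are placeholders, not the paper's exact losses.

```python
import torch

def mdd_training_step(blurry_gt_rgb, base_origins, base_dirs, t_embed,
                      render_fn, predict_latent_sharp_rays, optimizer):
    """Hypothetical MDD-stage step: render each predicted latent sharp ray,
    average the renderings to simulate the physical blur process, and
    supervise with the blurry ground-truth pixel colors."""
    sharp_o, sharp_d = predict_latent_sharp_rays(base_origins, base_dirs, t_embed)  # (K, N, 3)
    rgb_per_ray = torch.stack(
        [render_fn(o, d, t_embed) for o, d in zip(sharp_o, sharp_d)])               # (K, N, 3)
    blur_rgb = rgb_per_ray.mean(dim=0)        # blur = average of sharp renderings
    loss = torch.nn.functional.mse_loss(blur_rgb, blurry_gt_rgb)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```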
To validate the deblurring and novel view synthesis quality of our MoBluRF on blurry monocular video, we compare it with existing dynamic novel view synthesis methods, including HexPlane, HyperNeRF, and 4DGS, as well as existing static deblurring novel view synthesis methods, DP-NeRF and BAD-NeRF. All methods are optimized on our newly synthesized Blurry iPhone Dataset. Since DP-NeRF and BAD-NeRF are originally designed solely for static novel view synthesis, we incorporate time instances as additional inputs, resulting in DP-NeRF$_t$ and BAD-NeRF$_t$, so that they can synthesize dynamic components for a fair comparison.
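One simple way to realize such time conditioning is to concatenate an encoded time instance to the MLP inputs, as in the illustrative sketch below. The class name, layer sizes, and encoding dimensions are assumptions for illustration and do not reflect the baselines' actual architectures.

```python
import torch
import torch.nn as nn

class TimeConditionedNeRFMLP(nn.Module):
    """Illustrative sketch of the DP-NeRF_t / BAD-NeRF_t adaptation: a NeRF MLP
    that additionally receives a (positionally encoded) time instance so that
    static deblurring baselines can represent dynamic content."""
    def __init__(self, pos_dim=63, time_dim=9, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + time_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + density
        )

    def forward(self, encoded_xyz, encoded_t):
        # Time is injected simply by concatenation with the encoded position.
        return self.mlp(torch.cat([encoded_xyz, encoded_t], dim=-1))
```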
We adopt the co-visibility masked image metrics mPSNR, mSSIM, and mLPIPS, following the protocol introduced by Dycheck. These metrics mask out the regions of the test video frames that are not observed by the training camera. We further report tOF to measure the temporal consistency of the reconstructed video frames.
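As a concrete example of the masked metrics, below is a small sketch of mPSNR under the assumption that a binary co-visibility mask is given per test frame; mSSIM and mLPIPS follow the same masking idea. The function name and interface are ours for illustration.

```python
import torch

def masked_psnr(pred, gt, covis_mask, max_val=1.0):
    """Co-visibility masked PSNR (mPSNR): the squared error is averaged only
    over test pixels that are observed by the training camera.
    pred, gt: (H, W, 3) in [0, max_val]; covis_mask: (H, W) binary mask."""
    mask = covis_mask.bool()
    mse = ((pred - gt)[mask] ** 2).mean()
    return 10.0 * torch.log10(max_val ** 2 / mse)
```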
Our blurry dataset and implementation are built on top of the Dycheck codebase, whose iPhone dataset consists of casual captures under a strict monocular constraint for dynamic view synthesis.
This work was supported by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korean Government [Ministry of Science and ICT (Information and Communications Technology)] (Project Number: RS-2022-00144444, Project Title: Deep Learning Based Visual Representational Learning and Rendering of Static and Dynamic Scenes, 100%).
@ARTICLE{11017407,
author={Bui, Minh-Quan Viet and Park, Jongmin and Oh, Jihyong and Kim, Munchurl},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={MoBluRF: Motion Deblurring Neural Radiance Fields for Blurry Monocular Video},
year={2025},
pages={1-18},
keywords={Neural radiance field;Dynamics;Cameras;Rendering (computer graphics);Optimization;Trajectory;Geometry;Training;Three-dimensional displays;Kernel;Motion Deblurring NeRF;Dynamic NeRF;Video View Synthesis},
doi={10.1109/TPAMI.2025.3574644}}