In this paper, we are the first to incorporate view-dependent effects into single-image-based novel view synthesis (NVS).
To this end, we exploit camera motion priors in NVS to model view-dependent appearance or effects (VDEs) as negative disparity in the scene. Recognizing that specularities 'follow' the camera motion, we infuse VDEs into the input images by aggregating input pixel colors along the negative-depth region of the epipolar lines. We also propose a 'relaxed volumetric rendering' approximation that computes the densities in a single pass, improving efficiency for NVS from single images. Our method learns single-image NVS from image sequences only, making it the first completely self-supervised method that requires neither depth nor camera pose annotations.
We present extensive experimental results showing that our method learns NVS with VDEs and outperforms the SOTA single-view NVS methods on the RealEstate10k and MannequinChallenge datasets.
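As a minimal geometric sketch of the negative-disparity idea, and not the NVSVDE-Net implementation, the block below reprojects one pixel under a camera motion for a signed disparity (inverse depth). It assumes a pinhole camera and the convention X_target = R X_input + t. The function names `matvec`, `reproject`, and `epipolar_samples`, as well as the intrinsics and motion values, are hypothetical and chosen only for illustration. Sweeping the disparity traces the epipolar line; negative values land on the opposite side of the zero-disparity point, which is the region the abstract refers to.

```python
def matvec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def reproject(pixel, disparity, K, K_inv, R, t):
    """Map a pixel to the other view for a signed disparity (1 / depth).

    Assumed convention: X_target = R @ X_input + t. Dividing by depth gives
    R @ K_inv @ [u, v, 1] + disparity * t, which stays defined for
    disparity <= 0 (no physical depth, but a valid point on the epipolar line).
    """
    u, v = pixel
    ray = matvec(R, matvec(K_inv, [u, v, 1.0]))
    point = [ray[i] + disparity * t[i] for i in range(3)]
    x, y, w = matvec(K, point)
    # w == 0 would mean a point at infinity in the image; not handled here.
    return (x / w, y / w)


def epipolar_samples(pixel, disparities, K, K_inv, R, t):
    """Pixel coordinates along the epipolar line for a list of disparities."""
    return [reproject(pixel, d, K, K_inv, R, t) for d in disparities]


# Illustrative values only: focal length 128, principal point (64, 64),
# no rotation, and a small sideways translation.
K = [[128.0, 0.0, 64.0], [0.0, 128.0, 64.0], [0.0, 0.0, 1.0]]
K_inv = [[1 / 128.0, 0.0, -0.5], [0.0, 1 / 128.0, -0.5], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.125, 0.0, 0.0]

samples = epipolar_samples((64.0, 64.0), [0.5, 0.0, -0.5], K, K_inv, R, t)
print(samples)  # [(72.0, 64.0), (64.0, 64.0), (56.0, 64.0)]
```

In this toy setup a positive disparity shifts the pixel to one side of its zero-disparity location and a negative disparity shifts it to the other side along the same line. How the sampled colors are weighted and aggregated is not specified in this summary and is left out of the sketch.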
The proposed NVSVDE-Net models VDEs at the input view as the negative scene disparities under the target camera motion [R_c | t_c]. Novel views are estimated in two stages: first with coarse, fixed ray samples t_i, then with refined, adaptive sampling distances t*_k.
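For context on the two-stage estimate, here is a hedged sketch of standard volumetric rendering along one ray, followed by a coarse-to-fine resampling step. It shows the conventional formulation, not the 'relaxed' approximation proposed in the paper. The refinement uses generic inverse-CDF resampling of the coarse weights as a stand-in, because the mechanism that produces the adaptive distances is not described here. All function names and values are hypothetical.

```python
import bisect
import math


def render_weights(densities, deltas):
    """Standard weights w_i = T_i * (1 - exp(-sigma_i * delta_i))."""
    weights = []
    transmittance = 1.0  # fraction of light not yet absorbed along the ray
    for sigma, delta in zip(densities, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def composite(weights, colors):
    """Blend per-sample RGB colors with the rendering weights."""
    return [sum(w * c[ch] for w, c in zip(weights, colors)) for ch in range(3)]


def refine_samples(coarse_t, weights, num_fine):
    """Pick num_fine distances by inverting the CDF of the coarse weights.

    Stand-in for an adaptive sampler: it reuses coarse positions where the
    weights concentrate instead of interpolating between them.
    """
    total = sum(weights)
    if total <= 0.0:
        # No density anywhere: fall back to evenly spaced coarse positions.
        step = len(coarse_t) / num_fine
        return [coarse_t[int(k * step)] for k in range(num_fine)]
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    fine = []
    for k in range(num_fine):
        u = (k + 0.5) / num_fine  # deterministic midpoint quantiles
        idx = min(bisect.bisect_left(cdf, u), len(coarse_t) - 1)
        fine.append(coarse_t[idx])
    return fine


# Illustrative ray: four coarse samples with a dense surface at the third.
coarse_t = [1.0, 2.0, 3.0, 4.0]
densities = [0.0, 0.0, 50.0, 0.0]
deltas = [1.0, 1.0, 1.0, 1.0]
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]

weights = render_weights(densities, deltas)
pixel_rgb = composite(weights, colors)
fine_t = refine_samples(coarse_t, weights, 3)
```

In this toy ray nearly all of the weight falls on the third sample, so the composited color is close to that sample's color and the refined distances cluster at its depth. A second rendering pass would then evaluate the scene at those refined distances.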
Our results on the RealEstate10k dataset (RE10k).
Our results on the MannequinChallenge dataset (MC).
We trained our NVSVDE-Net to render views that are at most 16 frames apart from the single-image input. In this experiment, we render views that are 40 frames apart from the input view. Despite the inherent challenges of such extreme novel view synthesis, our method consistently produces realistic views, albeit with some observable artifacts, as expected of any single-view NVS framework.
@misc{bello2023novel,
title={Novel View Synthesis with View-Dependent Effects from a Single Image},
author={Juan Luis Gonzalez Bello and Munchurl Kim},
year={2023},
eprint={2312.08071},
archivePrefix={arXiv},
primaryClass={cs.CV}
}