KAIST VICLab

Academic website of KAIST VICLab, advised by Prof. Munchurl Kim, Korea Advanced Institute of Science & Technology (KAIST), Korea.

Our research interests include deep-learning-based computer vision, computational image & video processing, image & video understanding, and 2D/3D video coding.

Email  /  Homepage  /  Contact  /  Github

Research

Our recent work focuses on computer vision research

[1] in the fields of natural image and video restoration: (1) super-resolution, (2) frame interpolation, (3) SDR-to-HDR inverse tone mapping, (4) image inpainting, (5) depth estimation, (6) image deraining, (7) image dehazing, (8) video motion deblurring, (9) generative restoration of old photos;

[2] in the fields of 3D image/video reconstruction: (1) depth estimation, (2) optical flow estimation, (3) camera pose estimation, (4) dynamic neural radiance field (NeRF) and Gaussian splatting learning of video for novel view synthesis;

[3] in the fields of satellite images: (1) PAN sharpening, super-resolution and cloud removal of Electro-Optical (EO) images, (2) super-resolution, detection and classification of Synthetic Aperture Radar (SAR) image targets, (3) SAR-to-EO image-to-image translation learning, etc.

Some papers are highlighted.

SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video
Jongmin Park*, Minh-Quan Viet Bui*, Juan Luis Gonzalez Bello, Jaeho Moon, Jihyong Oh, Munchurl Kim
CVPR, 2025
project page / arXiv

COLMAP-free dynamic 3D Gaussian Splatting (3DGS) framework for high-quality reconstruction and fast rendering from monocular videos.

ABBSPO: Adaptive Bounding Box Scaling and Symmetric Prior based Orientation Prediction for Detecting Aerial Image Objects
Woojin Lee*, Hyugjae Chang*, Jaeho Moon, Jaehyup Lee, Munchurl Kim
CVPR, 2025
project page / arXiv

TBD.

BiM-VFI: Bidirectional Motion Fields-Guided Frame Interpolation for Video with Non-uniform Motions
Wonyong Seo, Jihyong Oh, Munchurl Kim
CVPR, 2025
project page / arXiv

TBD.

U-Know-Diff-PAN: Uncertainty-aware Knowledge Distillation Diffusion Framework with Details Enhancement for PAN-Sharpening
Sungpyo Kim, Jeonghyeok Do, Jaehyup Lee, Munchurl Kim
CVPR, 2025
project page / arXiv

TBD.

MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting
Sangwoon Kwak, Joonsoo Kim, Jun Young Jeong, Won-Sik Cheong, Jihyong Oh, Munchurl Kim
CVPR, 2025
project page / arXiv

TBD.

MIVE: New Design and Benchmark for Multi-Instance Video Editing
Samuel Teodoro*, Agus Gunawan*, Soo Ye Kim, Jihyong Oh, Munchurl Kim
arXiv, 2024
project page / arXiv

TBD.

DAKD: Data Augmentation and Knowledge Distillation using Diffusion Models for SAR Oil Spill Segmentation
Jaeho Moon*, Jeonghwan Yun*, Jaehyun Kim*, Jaehyup Lee, Munchurl Kim
arXiv, 2024
project page / arXiv

TBD.

C-DiffSET: Leveraging Latent Diffusion for SAR-to-EO Image Translation with Confidence-Guided Reliable Object Generation
Jeonghyeok Do, Jaehyup Lee, Munchurl Kim
arXiv, 2024
project page / arXiv

C-DiffSET is the first framework to fine-tune a pretrained latent diffusion model (LDM) for SAR-to-EO image translation (SET) tasks, leveraging its learned representations to overcome the scarcity of SAR-EO image pairs.

TDSM: Triplet Diffusion for Skeleton-Text Matching in Zero-Shot Action Recognition
Jeonghyeok Do, Munchurl Kim
arXiv, 2024
project page / arXiv

TDSM is the first framework to apply diffusion models to skeleton-text matching for zero-shot action recognition. It implicitly aligns skeleton features with text prompts (action labels) by exploiting the strong text-image correspondence learned in the generative diffusion process, and thereby learns fused discriminative features in a unified latent space.

SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition
Jeonghyeok Do, Munchurl Kim
ECCV, 2024
project page / arXiv

SkateFormer proposes a partition-specific attention strategy (Skate-MSA) for skeleton-based action recognition that captures skeletal-temporal relations and reduces computational complexity.

FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring
Geunhyuk Youk, Jihyong Oh, Munchurl Kim
CVPR, 2024 (Oral Presentation)
project page / arXiv

TBD.

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior
Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon, Munchurl Kim
CVPR, 2024
project page / arXiv

Addresses the dynamic-object problem in self-supervised monocular depth estimation using a ground contact prior.

Novel View Synthesis with View-Dependent Effects from a Single Image
Juan Luis Gonzalez Bello, Munchurl Kim
CVPR, 2024
project page / arXiv

TBD.

DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video
Minh-Quan Viet Bui*, Jongmin Park*, Jihyong Oh, Munchurl Kim
arXiv, 2023
project page / arXiv

Dynamic deblurring NeRF framework for reconstructing dynamic scenes from blurry monocular video.

ProNeRF: Learning Efficient Projection-Aware Ray Sampling for Fine-Grained Implicit Neural Radiance Fields
Juan Luis Gonzalez Bello*, Minh-Quan Viet Bui*, Munchurl Kim
IEEE Access
project page / arXiv

Efficient NeRF framework for fine-grained 3D scene reconstruction with few sampling points via projection-aware ray sampling.

COMPASS: High-Efficiency Deep Image Compression with Arbitrary-scale Spatial Scalability
Jongmin Park, Jooyoung Lee, Munchurl Kim
ICCV, 2023
project page / arXiv

The first neural-network-based spatially scalable image compression method to support arbitrary-scale spatial scalability.


This website's source code is adapted from Jon Barron's website.