Academic website of KAIST VICLab, led by Prof. Munchurl Kim, Korea Advanced Institute of Science & Technology (KAIST), Korea.
Our research interests include deep-learning-based computer vision, computational image & video processing, image & video understanding, and 2D/3D video coding.
Our recent work focuses on computer vision research:
[1] in the field of natural image and video restoration: (1) super-resolution, (2) frame interpolation, (3) SDR-to-HDR inverse tone mapping, (4) image inpainting, (5) depth estimation, (6) image deraining, (7) image dehazing, (8) video motion deblurring, (9) generative restoration of old photos;
[2] in the field of 3D image/video reconstruction: (1) depth estimation, (2) optical flow estimation, (3) camera pose estimation, (4) dynamic neural radiance field (NeRF) and Gaussian splatting learning from video for novel view synthesis;
[3] in the field of satellite imagery: (1) PAN sharpening, super-resolution and cloud removal of Electro-Optical (EO) images, (2) super-resolution, detection and classification of Synthetic Aperture Radar (SAR) image targets, (3) SAR-to-EO image-to-image translation learning, etc.
C-DiffSET proposes the first framework to fine-tune a pretrained latent diffusion model (LDM) for SAR-to-EO image translation (SET), effectively leveraging its learned representations to overcome the scarcity of paired SAR-EO images.
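The sketch below illustrates the general idea of conditional LDM fine-tuning for SET: a noise-prediction objective on EO latents, conditioned on the corresponding SAR latent. It is a minimal, hypothetical example; the module and tensor names (TinyDenoiser, eo_latent, sar_latent) are placeholders and do not reflect the actual C-DiffSET implementation, which fine-tunes a pretrained U-Net rather than the toy network shown here.

```python
# Minimal sketch (assumption, not C-DiffSET's code): epsilon-prediction
# diffusion loss on EO latents, conditioned on SAR latents by channel concat.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    """Toy noise predictor; a real LDM would fine-tune a pretrained U-Net.
    Timestep embedding is omitted here for brevity."""
    def __init__(self, latent_ch=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * latent_ch, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, latent_ch, 3, padding=1),
        )
    def forward(self, noisy_eo, sar_cond):
        # Condition on the SAR latent by concatenating along channels.
        return self.net(torch.cat([noisy_eo, sar_cond], dim=1))

def diffusion_loss(model, eo_latent, sar_latent, alphas_cumprod):
    """Standard noise-prediction objective with SAR conditioning."""
    b = eo_latent.size(0)
    t = torch.randint(0, alphas_cumprod.numel(), (b,), device=eo_latent.device)
    a_t = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(eo_latent)
    noisy_eo = a_t.sqrt() * eo_latent + (1 - a_t).sqrt() * noise
    return F.mse_loss(model(noisy_eo, sar_latent), noise)

# Random tensors stand in for VAE-encoded SAR/EO latents of paired images.
model = TinyDenoiser()
alphas_cumprod = torch.linspace(0.999, 0.01, steps=1000)
eo_latent, sar_latent = torch.randn(2, 4, 32, 32), torch.randn(2, 4, 32, 32)
loss = diffusion_loss(model, eo_latent, sar_latent, alphas_cumprod)
loss.backward()
```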
TDSM introduces the first framework that applies diffusion models to implicitly align skeleton features with text prompts (action labels), taking full advantage of the strong text-image correspondence learned in the generative diffusion process and thus learning fused discriminative features in a unified latent space.
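As a rough illustration of such implicit alignment, the sketch below denoises skeleton features under a text-embedding condition, so that the two modalities are fused in a shared latent space. All names, dimensions, and the loss form are assumptions for illustration only, not the TDSM method.

```python
# Hypothetical sketch: conditional noise prediction as an implicit
# skeleton-text alignment objective (not the TDSM implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkeletonTextDenoiser(nn.Module):
    """Predicts the noise added to a skeleton feature, conditioned on the
    action-label text embedding."""
    def __init__(self, skel_dim=256, text_dim=512, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(skel_dim + text_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, skel_dim),
        )
    def forward(self, noisy_skel, text_emb):
        return self.net(torch.cat([noisy_skel, text_emb], dim=-1))

def alignment_loss(model, skel_feat, text_emb, alphas_cumprod):
    """Denoising skeleton features under their text condition implicitly
    pulls the two modalities together in a unified latent space."""
    b = skel_feat.size(0)
    t = torch.randint(0, alphas_cumprod.numel(), (b,), device=skel_feat.device)
    a_t = alphas_cumprod[t].unsqueeze(-1)
    noise = torch.randn_like(skel_feat)
    noisy = a_t.sqrt() * skel_feat + (1 - a_t).sqrt() * noise
    return F.mse_loss(model(noisy, text_emb), noise)

# Random tensors stand in for a skeleton encoder's outputs and a frozen
# text encoder's embeddings of the action labels.
model = SkeletonTextDenoiser()
alphas_cumprod = torch.linspace(0.999, 0.01, steps=1000)
loss = alignment_loss(model, torch.randn(8, 256), torch.randn(8, 512), alphas_cumprod)
loss.backward()
```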
SkateFormer proposes a partition-specific attention strategy (Skate-MSA) for skeleton-based action recognition that captures skeletal-temporal relations and reduces computational complexity.
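The sketch below shows the general principle behind partition-specific attention: self-attention is computed within (frame-block, joint-group) partitions of the skeleton sequence rather than over all T*V tokens, which reduces the quadratic attention cost. Group sizes, shapes, and the class name PartitionAttention are illustrative assumptions and do not reproduce Skate-MSA's actual partition types.

```python
# Hypothetical sketch of partition-specific self-attention on a skeleton
# sequence (illustrative only; not the SkateFormer/Skate-MSA code).
import torch
import torch.nn as nn

class PartitionAttention(nn.Module):
    def __init__(self, dim=64, heads=4, t_block=8, v_group=5):
        super().__init__()
        self.t_block, self.v_group = t_block, v_group
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (B, T, V, C) skeleton features (T frames, V joints, C channels).
        B, T, V, C = x.shape
        tb, vg = self.t_block, self.v_group
        # Split frames into blocks of tb and joints into groups of vg.
        x = x.view(B, T // tb, tb, V // vg, vg, C)
        x = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, tb * vg, C)
        # Self-attention only within each (frame-block, joint-group) partition,
        # so each attention map is (tb*vg) x (tb*vg) instead of (T*V) x (T*V).
        out, _ = self.attn(x, x, x)
        out = out.reshape(B, T // tb, V // vg, tb, vg, C)
        return out.permute(0, 1, 3, 2, 4, 5).reshape(B, T, V, C)

# Example: 64 frames, 25 joints (e.g., an NTU-style skeleton), 64-dim features.
x = torch.randn(2, 64, 25, 64)
y = PartitionAttention(dim=64, heads=4, t_block=8, v_group=5)(x)
print(y.shape)  # torch.Size([2, 64, 25, 64])
```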