Khiem Vuong

I'm a PhD student at Carnegie Mellon University (Robotics Institute), where I'm extremely fortunate to be advised by Deva Ramanan and Srinivasa Narasimhan.

Previously, I received my BS in Computer Science from the University of Minnesota, where I worked with Stergios Roumeliotis and Hyun Soo Park on VI-SLAM and egocentric scene understanding.

Currently, I'm exploring the intersection of video generative models and 3D/4D reconstruction for building a generative world model.

I'm always happy to chat or collaborate — feel free to reach out at kvuong@andrew.cmu.edu!

News
01/2026 I will join Meta Reality Labs as an intern this summer (XR World AI team, working with Peter Kontschieder)!
03/2025 AerialMegaDepth is accepted to CVPR 2025!
02/2025 I will join Apple as an intern this summer (VIO/SLAM team, Video Computer Vision)!
03/2024 WALT3D is accepted to CVPR 2024 as an Oral (top 0.8%)!
10/2023 Toward Planet-Wide Traffic Camera Calibration is accepted to WACV 2024.
Publications
coming soon
FixAnything: 3D-Consistent Rendering Refinement via Video Generative Priors
arXiv 2026   (* equal advising)
arXiv 2026
FrameCrafter adapts pretrained video diffusion models for sparse-view novel view synthesis through lightweight architectural changes, achieving state-of-the-art results with minimal training data.
CVPR 2025   (* equal contribution/advising)
A scalable data generation framework that combines mesh renderings with real images, enabling robust 3D reconstruction across extreme viewpoint variations.
CVPR 2024 (Oral, top 0.8%)   (* equal contribution)
Automatic generation of realistic training data from time-lapse videos for reconstructing dynamic objects under occlusion.
WACV 2024
3D scene reconstruction and precise localization of over 100 real-world traffic cameras.
Tien Do, Khiem Vuong, Hyun Soo Park
CVPR 2022 (Oral, top 4.2%)
Extension of the spatial rectifier to the multi-directional case, applied to depth and surface normal prediction from egocentric views.
ECCV 2020 (Spotlight, top 3%)
Robust surface normal estimation by spatially rectifying the image to the densely distributed orientations.
ICRA 2021
Depth estimation by multiview triangulation, followed by an iterative depth refinement module that preserves estimates with high triangulation confidence.
IROS 2020
Depth estimation by completing a sparse VI-SLAM point cloud using a planar constraint from robust surface normal prediction.
Academic Services

Teaching Assistant: Computer Vision (16-720, Fall 2024) and Computer Vision (16-385, Spring 2025) at CMU

Conference Reviewer: NeurIPS, CVPR, ICCV, ECCV, WACV, IROS, AAAI, ICLR

Journal Reviewer: IJCV, TPAMI

Experience
Research Intern, Meta Reality Labs (XR World AI)
(Coming up) Summer 2026
Building a generative world model
CV/ML Intern, Apple (Video Computer Vision, VIO/SLAM)
Summer 2025
Building 3D foundation models for AR/VR applications
M.S. in Robotics (2021 – 2023)
Ph.D. in Robotics (2023 – Present)
Working on 3D reconstruction and generative world models with Deva Ramanan and Srinivasa Narasimhan
Research Assistant (2019 – 2021)
Working on learning-based VI-SLAM and egocentric 3D vision with Stergios Roumeliotis and Hyun Soo Park
Software Engineering Intern
Summer 2019
Working on Portfolio Management Systems (PMS)