Khiem Vuong

I'm a PhD student at Carnegie Mellon University (Robotics Institute), where I'm extremely fortunate to be advised by Deva Ramanan and Srinivasa Narasimhan.

Previously, I received my BS in Computer Science from the University of Minnesota, working with Stergios Roumeliotis and Hyun Soo Park on VI-SLAM and egocentric scene understanding.

I'm currently interning at Meta Reality Labs, exploring the intersection of video generative models and 3D/4D reconstruction for building world models. I also interned at Apple during my PhD.

I'm always happy to chat or collaborate — feel free to reach out at kvuong@andrew.cmu.edu!

Email Resume GitHub Scholar

News

06/2026 FrameCrafter and FixAnything are accepted to ECCV 2026!

06/2026 Honored to receive the Outstanding Reviewer award at CVPR 2026.

05/2026 Interning at Meta Reality Labs this summer (working with Peter Hedman and Peter Kontschieder)!

03/2025 AerialMegaDepth is accepted to CVPR 2025!

02/2025 I will join Apple as an intern this summer (VIO/SLAM team, Video Computer Vision)!

03/2024 WALT3D is accepted to CVPR 2024 as an Oral (top 0.8%)!

Publications

FixAnything: 3D-Consistent Rendering Refinement via Video Generative Priors

Khiem Vuong, Deva Ramanan*, Srinivasa Narasimhan*

✦ ECCV 2026 (* equal advising) · more details coming soon :)

A single generalist video model that fixes rendering artifacts from any 3D representation — 3DGS, NeRF, mesh, or sparse point clouds — producing photorealistic, 3D-consistent videos.

Novel View Synthesis as Video Completion

Qi Wu, Khiem Vuong, Minsik Jeon, Srinivasa Narasimhan, Deva Ramanan

✦ ECCV 2026

paper project

Adapting pretrained video diffusion models for permutation-invariant sparse-view novel view synthesis, achieving SOTA results with minimal training data.

Dash2Sim: Closed-Loop Driving Simulation from in-the-wild Dashcam Videos

Anurag Ghosh, Francesco Pittaluga, Khiem Vuong, Angela Chen, Juan Alvarez-Padilla, Manmohan Chandraker, Srinivasa Narasimhan

arXiv 2026

paper project

A framework that turns in-the-wild monocular videos into metric, geo-referenced 4D driving logs, enabling closed-loop simulation on long-tail work-zone scenarios.

MapSplat: Feed-Forward Geometry Beyond Visible Surfaces

Mehar Khurana, Nikhil Keetha, Khiem Vuong, Marcel Schreiber, Tarasha Khurana, Deva Ramanan

arXiv 2026 · more details coming soon :)

Extending 2.5D feed-forward geometry predictors to "2.6D" by replacing per-pixel depth supervision with differentiable 3D Gaussian splatting, enabling supervision from novel views.

ROADWork: A Dataset and Benchmark for Learning to Recognize, Observe, Analyze and Drive Through Work Zones

Anurag Ghosh, Shen Zheng, Robert Tamburo, Khiem Vuong, Juan Alvarez-Padilla, Hailiang Zhu, Michael Cardei, Nicholas Dunn, Christoph Mertz, Srinivasa G. Narasimhan

✦ ICCV 2025

paper project code

Largest open-source dataset for studying autonomous driving in work zones.

AerialMegaDepth: Learning Aerial-Ground Reconstruction and View Synthesis

Khiem Vuong, Anurag Ghosh, Deva Ramanan*, Srinivasa Narasimhan*, Shubham Tulsiani*

✦ CVPR 2025 (* equal contribution/advising)

paper project 3D viewer code

A scalable data generation framework that combines mesh renderings with real images, enabling robust 3D reconstruction across extreme viewpoint variations.

WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects under Occlusion

Khiem Vuong*, N Dinesh Reddy*, Robert Tamburo, Srinivasa G. Narasimhan

✦ CVPR 2024 (Oral, top 0.8%) (* equal contribution)

paper project video

Automatically generates realistic training data from time-lapse videos for reconstructing dynamic objects under occlusion.

Toward Planet-Wide Traffic Camera Calibration

Khiem Vuong, Robert Tamburo, Srinivasa G. Narasimhan

✦ WACV 2024

paper project video code

3D scene reconstruction and precise localization of over 100 real-world traffic cameras.

Egocentric Scene Understanding via Multimodal Spatial Rectifier

Tien Do, Khiem Vuong, Hyun Soo Park

✦ CVPR 2022 (Oral, top 4.2%)

paper project video code

Extension of the spatial rectifier to the multi-directional case, applied to depth and surface normal prediction from egocentric views.

Surface Normal Estimation of Tilted Images via Spatial Rectifier

Tien Do, Khiem Vuong, Stergios I. Roumeliotis, Hyun Soo Park

✦ ECCV 2020 (Spotlight, top 3%)

paper project video code

Robust surface normal estimation by spatially rectifying images to densely distributed orientations.

Deep Multi-view Depth Estimation with Predicted Uncertainty

Tong Ke, Tien Do, Khiem Vuong, Kourosh Sartipi, Stergios I. Roumeliotis

✦ ICRA 2021

paper code

Depth estimation by multiview triangulation, followed by an iterative depth refinement module that preserves estimates with high triangulation confidence.