Khiem Vuong

I'm a PhD student at Carnegie Mellon University (Robotics Institute), where I'm extremely fortunate to be advised by Deva Ramanan and Srinivasa Narasimhan.

Previously, I received my BS in Computer Science from the University of Minnesota, working with Stergios Roumeliotis and Hyun Soo Park on VI-SLAM and egocentric scene understanding.

I'm currently interning at Meta Reality Labs, exploring the intersection of video generative models and 3D/4D reconstruction for building a generative world model.

I'm always happy to chat or collaborate — feel free to reach out at kvuong@andrew.cmu.edu!

News
05/2026 Interning at Meta Reality Labs this summer (working with Peter Hedman and Peter Kontschieder)!
03/2025 AerialMegaDepth is accepted to CVPR 2025!
02/2025 I will join Apple as an intern this summer (VIO/SLAM team, Video Computer Vision)!
03/2024 WALT3D is accepted to CVPR 2024 as an Oral (top 0.8%)!
Publications
FixAnything: 3D-Consistent Rendering Refinement via Video Generative Priors
arXiv 2026   (* equal advising)  ·  more details coming soon :)
A single generalist video model that fixes rendering artifacts from any 3D representation — 3DGS, NeRF, mesh, or sparse point clouds — producing photorealistic, 3D-consistent videos.
arXiv 2026
Adapting pretrained video diffusion models for permutation-invariant sparse-view novel view synthesis, achieving SOTA results with minimal training data.
MapSplat: Feed-Forward Geometry Beyond Visible Surfaces
arXiv 2026  ·  more details coming soon :)
Extending 2.5D feed-forward geometry predictors to "2.6D" by replacing per-pixel depth supervision with differentiable 3D Gaussian splatting, enabling supervision from novel views.
CVPR 2025   (* equal contribution/advising)
A scalable data generation framework that combines mesh renderings with real images, enabling robust 3D reconstruction across extreme viewpoint variations.
CVPR 2024 (Oral, top 0.8%)   (* equal contribution)
Automatically generates realistic training data from time-lapse videos for reconstructing dynamic objects under occlusion.
WACV 2024
3D scene reconstruction and precise localization of over 100 real-world traffic cameras.
Tien Do, Khiem Vuong, Hyun Soo Park
CVPR 2022 (Oral, top 4.2%)
Extension of the spatial rectifier to the multi-directional case, applied to depth and surface normal prediction from egocentric views.
ECCV 2020 (Spotlight, top 3%)
Robust surface normal estimation by spatially rectifying the image to densely distributed orientations.
ICRA 2021
Depth estimation by multiview triangulation, followed by an iterative depth refinement module that preserves estimates with high triangulation confidence.
IROS 2020
Depth estimation by completing a sparse VI-SLAM point cloud using a planar constraint derived from robust surface normal prediction.
Academic Services & Awards

Awards: Outstanding Reviewer, CVPR 2026

Teaching Assistant: Computer Vision (16-720, Fall 2024) and Computer Vision (16-385, Spring 2025) at CMU

Conference Reviewer: NeurIPS, CVPR, ICCV, ECCV, WACV, IROS, AAAI, ICLR

Journal Reviewer: IJCV, TPAMI

Experience
Research Scientist Intern — Reality AI Research
Summer 2026
Building a generative world model. (Hosts: Peter Hedman, Peter Kontschieder)
CV/ML Intern — Video Computer Vision (VIO/SLAM)
Summer 2025
Building 3D foundation models for AR/VR applications. (Host: Stergios Roumeliotis)
M.S. in Robotics (2021 – 2023)
Ph.D. in Robotics (2023 – Present)
Working on 3D reconstruction and generative world models with Deva Ramanan and Srinivasa Narasimhan
Research Assistant (2019 – 2021)
Working on learning-based VI-SLAM and egocentric 3D vision with Stergios Roumeliotis and Hyun Soo Park
Software Engineering Intern
Summer 2019
Working on Portfolio Management Systems (PMS)