
Zhiwen ("Aaron") Fan

University of Texas at Austin

About Me

I am a Ph.D. candidate in Electrical and Computer Engineering at The University of Texas at Austin, advised by Prof. Atlas Wang in the VITA group.

I will be joining the Department of Electrical and Computer Engineering at Texas A&M University as an Assistant Professor in the Fall 2025 semester. I am seeking highly motivated students. Interested candidates are encouraged to email me their resume.

Recent News

  • We are organizing the End-to-End 3D Learning workshop at ICCV 2025.
  • I will serve as an Area Chair for NeurIPS 2025.
  • Our ICLR'25 paper (4K4DGen) was selected as a spotlight presentation.
  • Our NeurIPS'24 paper (LightGaussian) was selected as a spotlight presentation.
  • Our Symbolic Visual RL paper was accepted to IEEE Trans. PAMI.
  • Our IROS'24 paper (Multi-modal 3DGS SLAM) was selected as an oral pitch finalist presentation.
  • Our CVPR'24 paper (Feature-3DGS) was selected as a highlight presentation.
  • Our CVPR'23 paper (NeuralLift-360) was selected as a highlight presentation.
  • I was one of the awardees of the Qualcomm Innovation Fellowship (North America) 2022 (QIF 2022). Innovation title: "Real-time Visual Processing for Autonomous Driving via Video Transformer with Data-Model-Accelerator Tri-design".
  • We won 3rd place in the University Demo Best Demonstration at the 59th Design Automation Conference (DAC 2022). We demonstrated a multi-task vision transformer on an FPGA.
  • Our CVPR'22 paper (CADTransformer) was selected as an oral presentation.
  • Our CVPR'20 paper (Cascade Cost Volume) was selected as an oral presentation.

Research Interests

My research goal is to enhance AI agents' ability to interact with the physical world through 3D AI. I focus on developing multi-modal 3D models that enable perception, generation, and action in 3D space. By integrating natural language, images, videos, and 3D data, my work aims to bridge AI agents, human instructions, and the physical world. I am particularly interested in building real-time 3D models that can understand, recreate, and interact with their environment using spatial awareness and common sense.

My research has been demonstrated on platforms such as Quest 3, implemented within IARPA projects, and integrated into multiple commercial products.

Selected Publications
Full publication list at Google Scholar

* denotes equal contribution, † denotes project lead.

3D Modeling

LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS
Zhiwen Fan*†, Kevin Wang*, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang
NeurIPS 2024 (Spotlight, top 2.1% of 15,671)
InstantSplat: Sparse-view Pose‑free Gaussian Splatting in Seconds
Zhiwen Fan*†, Wenyan Cong*, Kairun Wen*, Kevin Wang, Jian Zhang, Xinghao Ding, Danfei Xu, Boris Ivanovic, Marco Pavone, Georgios Pavlakos, Zhangyang Wang, Yue Wang
Preprint
MM3DGS SLAM: Multi‑modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements
Lisong C. Sun, Neel P. Bhatt, Jonathan C. Liu, Zhiwen Fan, Zhangyang Wang, Todd E. Humphreys, Ufuk Topcu
IROS 2024 (Oral Pitch Highlight)
M^3ViT: Mixture‑of‑Experts Vision Transformer for Efficient Multi‑task Learning with Model‑Accelerator Co‑design
Zhiwen Fan*†, Hanxue Liang*, Rishov Sarkar, Ziyu Jiang, Tianlong Chen, Kai Zou, Yu Cheng, Cong Hao, Zhangyang Wang
NeurIPS 2022 (QIF 2022 Award & DAC 3rd best demo)
Cascade Cost Volume for High‑Resolution Multi‑View Stereo and Stereo Matching
Zhiwen Fan*, Xiaodong Gu*, Siyu Zhu, Zuozhuo Dai, Feitong Tan, Ping Tan
CVPR 2020 (Oral Presentation, top 5% of submissions)
CADTransformer: Panoptic Symbol Spotting Transformer for CAD Drawings
Zhiwen Fan, Tianlong Chen, Peihao Wang, Zhangyang Wang
CVPR 2022 (Oral Presentation, top 5% of submissions)

3D VLMs

VLM‑3R: Vision‑Language Models Augmented with Instruction‑Tuned 3D Reconstruction
Zhiwen Fan et al.
Submitted
Large Spatial Model: Real‑time Unposed Images to Semantic 3D
Zhiwen Fan*†, Jian Zhang*, Wenyan Cong, Peihao Wang, Renjie Li, Kairun Wen, Shijie Zhou, Achuta Kadambi, Zhangyang Wang, Danfei Xu, Boris Ivanovic, Marco Pavone, Yue Wang
NeurIPS 2024
Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields
Shijie Zhou, Haoran Chang, Sicheng Jiang, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, Achuta Kadambi
CVPR 2024 (Highlight, 2.8% of 11,532)
NeRF‑SOS: Any‑View Self‑supervised Object Segmentation from Complex Real‑World Scenes
Zhiwen Fan, Peihao Wang, Yifan Jiang, Xinyu Gong, Dejia Xu, Zhangyang Wang
ICLR 2023

3D AIGC

Can Test‑Time Scaling Improve World Foundation Model?
Wenyan Cong, Hanqing Zhu, Peihao Wang, Bangya Liu, Dejia Xu, Kevin Wang, David Z. Pan, Yan Wang, Zhiwen Fan, Zhangyang Wang
Preprint
4K4DGen: Panoramic 4D Generation at 4K Resolution
Renjie Li, Panwang Pan, Dejia Xu, Shijie Zhou, Xuanyang Zhang, Zeming Li, Achuta Kadambi, Zhangyang Wang, Zhiwen Fan
ICLR 2025 (Spotlight, 3.2% of 11,672)
MoonSim: A Photorealistic Lunar Environment Simulator
Ziyu Chen*†, Henghui Bao*, Ting‑Hsuan Chen*, Haozhe Lou, Ge Yang, Zhiwen Fan, Marco Pavone, Yue Wang
Under submission
DreamScene360: Unconstrained Text‑to‑3D Scene Generation with Panoramic Gaussian Splatting
Shijie Zhou*†, Zhiwen Fan*†, Dejia Xu*, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, Achuta Kadambi
ECCV 2024
Unified Implicit Neural Stylization
Zhiwen Fan*†, Yifan Jiang*, Peihao Wang*, Xinyu Gong, Dejia Xu, Zhangyang Wang
ECCV 2022

3D Human

Expressive Gaussian Human Avatars from Monocular RGB Video
Hezhen Hu, Zhiwen Fan, Tianhao Wu, Yihan Xi, Seoyoung Lee, Georgios Pavlakos, Zhangyang Wang
NeurIPS 2024
MMHU: A Massive‑Scale Multimodal Benchmark for Human Behavior Understanding in Autonomous Driving
…, Zhiwen Fan, …
Submitted

Inverse Problems

X2‑Gaussian: 4D Radiative Gaussian Splatting for Continuous‑time Tomographic Reconstruction
Weihao Yu, Yuanhao Cai, Ruyi Zha, Zhiwen Fan, Chenxin Li, Yixuan Yuan
Preprint
Joint CS‑MRI Reconstruction and Segmentation with a Unified Deep Network
Liyan Sun*, Zhiwen Fan*, Xinghao Ding, Yue Huang, John Paisley
IPMI 2019

Invited Talks

  • Scalable 3D/4D Assets Creation @ Duke. November 2024.
  • Efficient 3D Learning for Autonomous System @ UNC, Guest Lecture. November 2024.
  • Empowering Machines to Understand 3D @ Stanford, ASU, JHU, Yale. October 2024.
  • 3D Computer Vision @ TAMU, Guest Lecture. October 2024.
  • From Efficient 3D Learning to 3D Foundation Models @ UCLA and Caltech. October 2024.
  • Towards Universal, Real-Time 3D Construction and Interaction @ TAMU AI Lunch. October 2024.
  • Spatial Intelligence via Reconstruction, Distillation, and Generation @ Shanghai AI Lab. July 2024.
  • Streamlined 3D/4D: From Hours to Seconds to Millisecond @ Google Research, VALSE Webinar. May 2024.
  • Real-Time Few-shot View Synthesis w/ Gaussian Splatting @ IARPA WRIVA Workshop. April 2024.
  • Data-efficient and Rendering-efficient Neural Rendering @ IFML Workshop on Gen AI. November 2023.
  • Unified Implicit Neural Stylization @ Xiamen University; Kungfu.ai. July 2022.

Experience

  • Meta, Reality Labs, Burlingame:
    Research Intern (2024).
  • NVIDIA Research (remote):
    Research Intern (2024).
  • Google AR, San Francisco:
    Research Intern (2022).
  • Alibaba Group, Hangzhou:
    Senior Algorithm Engineer (2019 - 2021).

Services

  • Area Chairs: 
    NeurIPS.
  • Journal Reviewers: 
    TPAMI, TIP, IJCV, Neurocomputing.
  • Conference Reviewers: 
    NeurIPS, ICLR, ICML, CVPR, ICCV, ECCV.