I am a Ph.D. student in Electrical Engineering at The University of Texas at Austin, advised by Prof. Zhangyang Wang in the VITA group. Previously, I was a senior algorithm engineer at Alibaba Cloud, where I worked with Prof. Ping Tan and Siyu Zhu. I am a winner of the 2022 Qualcomm Innovation Fellowship.
INR editing: In our INS paper, we conduct a pilot study on training stylized implicit representations (e.g., SIREN, NeRF, SDF). We obtain faithful stylizations and can interpolate between different styles to generate new mixed styles. In our INR-DSP paper, we propose a theoretically grounded signal processing framework for Implicit Neural Representations (INRs), which analytically manipulates INRs in the weight space through differential operators. In our NeRF-SOS paper, we propose a novel collaborative contrastive loss that lets NeRF segment objects in complex real-world scenes without any annotation.
Sparse-view NeRF: In our SinNeRF paper, we propose semantic and geometry regularizations for training a neural radiance field from only a single view.
NeRF with augmentations: In our Aug-NeRF paper, we propose to augment NeRF with worst-case perturbations at three distinct levels, each with physical grounds.
Efficient MVS: We propose Cas-MVSNet and Cas-StereoNet, which formulate the cost volume in a coarse-to-fine manner.
We obtain a 23.1% improvement on the DTU dataset, with 50.6% and 74.2% reductions in GPU memory and run-time, respectively.
The method also ranks 1st among all learning-based methods on the Tanks and Temples benchmark.
See CVPR2020 for more details.
Efficient MTL: In our M^3-ViT paper, we propose to activate any task of interest by integrating mixture-of-experts (MoE) layers into a ViT backbone, along with hardware-level innovations.
M^3-ViT reduces memory by 2.4x and saves 9.23x energy on the PASCAL-Context and NYUD-v2 datasets.
CAD symbol spotting can be used in the architecture, engineering and construction (AEC) industries to speed up 3D modeling from CAD drawings.
We release the first large-scale real-world dataset of over 10,000 CAD drawings with line-grained annotations (35 classes), covering various types of buildings.
We introduce the new task of Panoptic Symbol Spotting, which is a relaxation of the traditional symbol spotting problem. It spots
and parses both countable object instances (windows, doors, tables, etc.) and uncountable stuff (wall, railing, etc.) from CAD drawings.
Moreover, we propose the Panoptic Quality (PQ) as the evaluation criterion for panoptic symbol spotting results.
PanCADNet: We present a CNN-GCN method in our ICCV2021 paper that unifies a GCN head and a detection head for semantic and instance symbol spotting, respectively.
CADTransformer: We present a transformer-based framework named CADTransformer (CVPR2022),
built by painlessly modifying existing vision transformer (ViT) backbones.
Before 2019, I also worked on low-level computer vision tasks (e.g., Compressed Sensing MRI and Single Image Deraining) using deep neural networks. See IPMI2019, ACM MM2019, ECCV 2018, AAAI 2018 and TIP 2019, MRI 2019, MRI 2019 for details.
I'm interested in developing Implicit Neural Representations, Efficient 3D Reconstruction, Transformer Models and Low-level Computer Vision.
Research Intern
05/2022--present
Senior Algorithm Engineer
07/2019--08/2021