I am a third-year Ph.D. student in Computer Science at the University of Science and Technology of China (USTC), advised by Prof. Tong He, Prof. Wanli Ouyang, and Prof. Xiaogang Wang. I earned my B.S. degree from the Artificial Intelligence Honor Class at Shanghai Jiao Tong University (SJTU), advised by Prof. Cewu Lu. I have also had the privilege of working with Dr. Hao-Shu Fang and Dr. Jim Fan.
Research for fun and truth. My current research interests focus on World Model, Embodied AI and Spatial Intelligence. The ultimate goal of my life is to discover myself, find the truth, and change the world! Feel free to follow me on and for latest research announcements and updates!
In my personal life, I am a passionate (but amateur) fan of football, music, literature, philosophy, traditional Chinese painting, and modern Chinese poetry!
“The philosophers have only interpreted the world, in various ways. The point, however, is to change it.”
✨ News ✨
- Oct. 2025: Aether has won the 🎉 Outstanding Paper Award 🎉 and was given as an oral presentation at the ICCV 2025 RIWM workshop!
- Sep. 2025: OmniWorld has been released! It is a multi-domain and multi-modal dataset for 4D world modeling. Check it out today!
- Sep. 2025: WinT3R has been announced! It is a feed-forward reconstruction model capable of online prediction of precise camera poses and high-quality point maps. Check it out!
- Jul. 2025: has been announced! is a novel feed-forward neural network that revolutionizes visual geometry reconstruction by eliminating the need for a fixed reference view. Paper, code, and demo are all open access. Check it out today!
- Jun. 2025: Aether and VQ-VLA have been accepted by ICCV 2025!
- Jun. 2025: DeepVerse, an auto-regressive 4D world model, has been released!
- Feb. 2025: SPA has been accepted by ICLR 2025 and Tra-MoE has been accepted by CVPR 2025!
- Oct. 2024: SPA has been announced! SPA is a novel representation learning framework that emphasizes the importance of 3D spatial awareness in embodied AI. Paper, code, and pre-trained models are all open-sourced! Check it out!
- Sep. 2024: PointCloudMatters has been accepted by NeurIPS D&B 2024! We show that explicit representations such as point clouds can significantly enhance the performance and generalization ability of robot learning policies. The code is open-sourced!
- Oct. 2023: PonderV2 and UniPAD have been announced! PonderV2 is a universal pre-training paradigm for 3D vision, paving the way for 3D foundation models.
- Jul. 2023: RH20T has been announced! RH20T is a large-scale, open-source, real-world robotic dataset.
- Nov. 2022: MineDojo has won the 🎉 Outstanding Paper Award 🎉 at NeurIPS!
- Nov. 2022: The AlphaPose paper has been accepted by TPAMI! AlphaPose is an accurate multi-person pose estimator, which has received more than 8.3K stars on GitHub.
- Jun. 2022: MineDojo has been announced! MineDojo is a new framework for building generally capable agents with internet-scale knowledge in Minecraft.