Computer Vision

SPA: 3D Spatial-Awareness Enables Effective Embodied Representation

A novel representation learning framework that emphasizes the importance of 3D spatial awareness in embodied AI.

Haoyi Zhu 朱皓怡, Honghui Yang, Yating Wang, Jiange Yang, Limin Wang, Tong He

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm

A general 3D pre-training approach that establishes a pathway to 3D foundation models.

Haoyi Zhu 朱皓怡, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Tong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Wanli Ouyang

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

Abstract: In the context of autonomous driving, the significance of effective feature learning is widely acknowledged. While conventional 3D self-supervised pre-training methods have shown widespread success, most methods follow the ideas originally designed for 2D images.

Honghui Yang, Sha Zhang, Di Huang, Xiaoyang Wu, Haoyi Zhu 朱皓怡, Tong He, Shixiang Tang, Hengshuang Zhao, Qibo Qiu, Binbin Lin, Xiaofei He, Wanli Ouyang

AlphaTracker: a multi-animal tracking and behavioral analysis tool

Abstract: Computer vision has emerged as a powerful tool to elevate behavioral research. This protocol describes a computer vision machine learning pipeline called AlphaTracker, which has minimal hardware requirements and produces reliable tracking of multiple unmarked animals, as well as behavioral clustering.

Zexin Chen, Ruihan Zhang, Hao-Shu Fang, Yu E. Zhang, Aneesh Bal, Haowen Zhou, Rachel R. Rock, Nancy Padilla-Coreano, Laurel R. Keyes, Haoyi Zhu 朱皓怡, Yong-Lu Li, Takaki Komiyama, Kay M. Tye, Cewu Lu