I am a first-year Ph.D. student in the R&L Group at Nanjing University (NJU), under the supervision of Prof. Qi Fan. I obtained my M.S. in Computer Science from the University of Chinese Academy of Sciences (UCAS) in 2024 and my B.S. from Shanghai Jiao Tong University (SJTU) in 2021. I have also been fortunate to intern at 01AI, Huawei, TeleAI, and Kuaishou-Kling.
My research interests lie at the intersection of computer vision and machine learning. I began research in 2021, working on neural architecture search and image captioning. I now focus on designing novel applications for image/video generation, world models, 3D autoregressive generation, and other downstream AIGC tasks. You are welcome to visit my Zhihu homepage for academic discussions on image/video generation.
Education
- 2024.10 - present, Ph.D., School of Intelligence Science and Technology, Nanjing University.

- 2021.09 - 2024.06, M.S. degree, School of Computer Science and Technology, University of Chinese Academy of Sciences.

- 2017.09 - 2021.06, B.S. degree, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University.

Publications

Chang-han Liu, Xunzhi Xiang, Zixuan Duan, Wenbin Li, Yang Gao, Qi Fan.

Proteus-ID: ID-Consistent and Motion-Enhanced Video Customization
Guiyu Zhang, Chen Shi, Zijian Jiang, Xunzhi Xiang, Jingjing Qian, Shaoshuai Shi, Li Jiang.

Human Motion Video Generation: A Survey
Haiwei Xue, Xiangyang Luo, Zhanghao Hu, Xin Zhang, Xunzhi Xiang, Yuqing Dai, Jianzhuang Liu, Zhensong Zhang, Minglei Li, Jian Yang, Fei Ma, Changpeng Yang, Zonghong Dai, and Fei Richard Yu.

ReMask-Animate: Refined Character Image Animation Using Mask-Guided Adapters
Xunzhi Xiang, Haiwei Xue, Zonghong Dai, Di Wang, Minglei Li, Ye Yue, Fei Ma, Weijiang Yu, Heng Chang, Fei Richard Yu.

A Neural Architecture Predictor based on GNN-Enhanced Transformer
Xunzhi Xiang, Kun Jing, Jungang Xu.
Preprints

Denoising Vision Transformer Autoencoder with Spectral Regularization
Xunzhi Xiang, Xingye Tian, Guiyu Zhang, Yabo Chen, Xin Tao, Pengfei Wan, Qi Fan.

Macro-from-Micro Planning for High-Quality and Parallelized Autoregressive Long Video Generation
Xunzhi Xiang, Yabo Chen, Guiyu Zhang, Zhongyu Wang, Zhe Gao, Quanming Xiang, Gonghu Shang, Junqi Liu, Haibin Huang, Yang Gao, Chi Zhang, Qi Fan, et al.

Make It Efficient: Dynamic Sparse Attention for Autoregressive Image Generation
Xunzhi Xiang, Qi Fan.

SmartSAM: Segment Ambiguous Objects like Smart Annotators
Zhe Gao, Shiyu Shen, Xunzhi Xiang, Wenbin Li, Yang Gao, Qi Fan.

Junyuan Ma, Xunzhi Xiang, Wenbin Li, Yang Gao, Qi Fan.
Internships
- 2024.03 - 2024.09, 01AI, China.
- 2024.03 - 2024.09, Guangming-Lab, China.
- 2024.09 - 2025.03, HUAWEI, China.
- 2025.04 - 2025.06, TeleAI, China.
- 2025.06 - present, Kuaishou-Kling, China.
Honors and Awards
- 2021.10 CCF-BDCI award.