About Me

I am currently a second-year Master's student at Tsinghua University. I received my B.Eng. degree in Computer Science (Yingcai Honors College) from the University of Electronic Science and Technology of China in 2024.

My research interests include: Image & Video Generation, Human-Centric Generation, and Reinforcement Learning.


News

2026-02: Our paper "Beyond the Golden Data" is accepted by CVPR 2026.
2025-11: Our paper "FilmWeaver" is accepted by AAAI 2026.
2025-08: The code of CanonSwap is released. Welcome to star it!
2025-07: Our paper "Human Motion Video Generation: A Survey" is accepted by TPAMI.
2025-06: Our paper "CanonSwap" is accepted by ICCV 2025.


Publications

CVPR 2026

Beyond the Golden Data: Resolving the Motion-Vision Quality Dilemma via Timestep Selective Training
Xiangyang Luo, Qingyu Li, Yuming Li, Guanbo Huang, Yongjie Zhu, Wenyu Qin, Meng Wang, Pengfei Wan, Shao-Lun Huang
Paper

We identify the Motion-Vision Quality Dilemma in video data curation and propose Timestep-aware Quality Decoupling (TQD), which skews the training data sampling distribution across timesteps to decouple motion and visual quality, enabling models trained on imbalanced data to surpass those trained on golden data.

AAAI 2026

FilmWeaver: Weaving Consistent Multi-Shot Videos with Cache-Guided Autoregressive Diffusion
Xiangyang Luo, Qingyu Li, Xiaokun Liu, Wenyu Qin, Miao Yang, Meng Wang, Pengfei Wan, Di Zhang, Kun Gai, Shao-Lun Huang
Paper Page

FilmWeaver generates consistent multi-shot videos of arbitrary length via an autoregressive diffusion paradigm, enforcing inter-shot consistency with a Shot Cache and intra-shot coherence with a Temporal Cache, and supports applications such as concept injection and video extension.

TPAMI

Human Motion Video Generation: A Survey
Haiwei Xue, Xiangyang Luo, Zhanghao Hu, Xin Zhang, Xunzhi Xiang, Yuqin Dai, Jianzhuang Liu, Minglei Li, Jian Yang, Fei Ma, Changpeng Yang, Zonghong Dai, Fei Richard Yu
Paper Page

This survey provides a comprehensive review of human motion video generation methods, covering the latest techniques, applications, and future directions.

ICCV 2025

CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation
Xiangyang Luo, Ye Zhu†, Yunfei Liu, Lijian Lin, Cong Wan, Zijian Cai, Shao-Lun Huang†, Yu Li
Paper Page Code

CanonSwap decouples motion information from appearance to enable high-fidelity and consistent video face swapping.

arXiv

Grid: Omni Visual Generation
Cong Wan*, Xiangyang Luo*, Hao Luo, Zijian Cai, Yiren Song, Yunlong Zhao, Yifan Bai, Fan Wang, Yuhang He, Yihong Gong
Paper Code

GRID is an omni-visual generation framework that reformulates temporal tasks like video into grid layouts, enabling a single powerful image model to efficiently handle image, video, and 3D generation.

ICME 2025

Object Isolated Attention for Consistent Story Visualization
Xiangyang Luo, Junhao Cheng, Yifan Xie, Xin Zhang, Tao Feng, Zhou Liu, Fei Ma†, Fei Yu
Paper

A training-free method that uses an isolated attention mechanism to maintain character consistency and prevent feature confusion in story visualization.

ACM MM 2024

CodeSwap: Symmetrically Face Swapping Based on Prior Codebook
Xiangyang Luo, Xin Zhang, Yifan Xie, Xinyi Tong, Weijiang Yu, Heng Chang, Fei Ma†, Fei Richard Yu
Paper

CodeSwap achieves high-fidelity face swapping by symmetrically manipulating codes within a pre-trained, high-quality facial codebook.