2025
  • Zhiguang Lu, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang. Bidirectional Logits Tree: Pursuing Granularity Reconcilement in Fine-Grained Classification. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 19189–19197, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025. PDF
  • Guanqi Ding, Chengyu Yang, Shuhui Wang, Xincheng Li, Jinzhe Zhang, Xin Jin, Qingming Huang. Dis²Booth: Learning Image Distribution with Disentangled Features for Text-to-Image Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 2744–2752, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025. PDF
  • Shuo Cai, Xinzhe Han, Shuhui Wang. Divide-and-Conquer: Tree-structured Strategy with Answer Distribution Estimator for Goal-Oriented Visual Dialogue. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 1917–1925, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025. PDF
  • Yuchen Sun, Qianqian Xu, Zitai Wang, Zhiyong Yang, Junwei He. EDGE: Unknown-aware Multi-label Learning by Energy Distribution Gap Expansion. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 12613–12621, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025. PDF
  • Yunbin Tu, Liang Li, Li Su, Qingming Huang. Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 7464–7472, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025. PDF
  • Xingyu Lyu, Qianqian Xu, Zhiyong Yang, Shaojie Lyu, Qingming Huang. SSE-SAM: Balancing Head and Tail Classes Gradually through Stage-Wise SAM. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 19278–19286, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025. PDF
  • Gaoxiang Cong, Liang Li, Jiadong Pan, Zhedong Zhang, Amin Beheshti, Anton van den Hengel, Yuankai Qi, Qingming Huang. FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing. ACM International Conference on Multimedia (ACM MM), Dublin, Ireland, Oct. 27–31, 2025. PDF
  • Jiadong Pan, Liang Li, Hongcheng Gao, Zhengjun Zha, Qingming Huang, Jiebo Luo. SafeCFG: Controlling Harmful Features with Dynamic Safe Guidance for Safe Generation. ACM International Conference on Multimedia (ACM MM), Dublin, Ireland, Oct. 27–31, 2025. PDF
  • Qiyang Wan, Ruiping Wang, Chengzhi Gao, Xilin Chen. Catch Your Concepts: A Flexible ConceptLocator for Interpretable Visual Recognition. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24–27, 2025. PDF
  • Tianyue Wang, Shuang Yang, Shiguang Shan, Xilin Chen. GLip: A Global-Local Integrated Progressive Framework for Robust Visual Speech Recognition. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24–27, 2025. PDF
  • Yujie Zhao, Jiabei Zeng, Shiguang Shan. Pose-Robust Calibration Strategy for Point-of-Gaze Estimation on Mobile Phones. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24–27, 2025. PDF
  • Gaoxiang Cong, Jiadong Pan, Liang Li, Yuankai Qi, Yuxin Peng, Anton van den Hengel, Jian Yang, Qingming Huang. EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15863–15873, Nashville, TN, USA, Jun. 11–15, 2025. PDF
  • Zonghui Guo, Yingjie Liu, Jie Zhang, Haiyong Zheng, Shiguang Shan. Face Forgery Video Detection via Temporal Forgery Cue Unraveling. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7396–7405, Nashville, TN, USA, Jun. 10–17, 2025. PDF
  • Ziyi Bai, Hanxuan Li, Bin Fu, Chuyan Xiong, Ruiping Wang, Xilin Chen. R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 19456–19466, Nashville, TN, USA, Jun. 10–17, 2025. PDF
  • Zhen Yang, Zhuo Tao, Qi Chen, Yuankai Qi, Liang Li, Anton van den Hengel, Qingming Huang. Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 24753–24762, Nashville, TN, USA, Jun. 10–17, 2025. PDF
  • Yiheng Li, Ruibing Hou, Hong Chang, Shiguang Shan, Xilin Chen. UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 27805–27815, Nashville, TN, USA, Jun. 10–17, 2025. PDF
  • Yue Wu, Zhaobo Qi, Junshu Sun, Yaowei Wang, Qingming Huang, Shuhui Wang. Video Language Model Pretraining with Spatio-temporal Masking. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8557–8567, Nashville, TN, USA, Jun. 10–17, 2025. PDF
  • Yujie Wang, Yunwei Zhao, Jing Yang, Han Han, Shiguang Shan, Jie Zhang. Evaluating Cognitive-Behavioral Fixation via Multimodal User Viewing Patterns on Social Media. The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), Suzhou, China, Nov. 4–9, 2025. PDF
  • Dan Han, Mingjie He, Jie Zhang, Shiguang Shan. Dual-Branch Partial Annotation Learning for Facial Attributes Recognition. IEEE 19th International Conference on Automatic Face and Gesture Recognition (FG), Tampa/Clearwater, FL, USA, May 26–30, 2025. PDF
  • Xinkuan Qiu, Meina Kan, Yongbin Zhou, Shiguang Shan. Benchmarking Multimodal Large Language Models Against Image Corruptions. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19–23, 2025. PDF