Zhiguang Lu, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang. Bidirectional Logits Tree: Pursuing Granularity Reconcilement in Fine-Grained Classification. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 19189–19197, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
PDF
Guanqi Ding, Chengyu Yang, Shuhui Wang, Xincheng Li, Jinzhe Zhang, Xin Jin, Qingming Huang. Dis²Booth: Learning Image Distribution with Disentangled Features for Text-to-Image Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 2744–2752, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
PDF
Shuo Cai, Xinzhe Han, Shuhui Wang. Divide-and-Conquer: Tree-structured Strategy with Answer Distribution Estimator for Goal-Oriented Visual Dialogue. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 1917–1925, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
PDF
Yuchen Sun, Qianqian Xu, Zitai Wang, Zhiyong Yang, Junwei He. EDGE: Unknown-aware Multi-label Learning by Energy Distribution Gap Expansion. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 12613–12621, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
PDF
Yunbin Tu, Liang Li, Li Su, Qingming Huang. Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning. 39th Annual AAAI Conference on Artificial Intelligence (AAAI), pp. 7464–7472, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
PDF
Xingyu Lyu, Qianqian Xu, Zhiyong Yang, Shaojie Lyu, Qingming Huang. SSE-SAM: Balancing Head and Tail Classes Gradually through Stage-Wise SAM. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 19278–19286, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
PDF
Gaoxiang Cong, Liang Li, Jiadong Pan, Zhedong Zhang, Amin Beheshti, Anton van den Hengel, Yuankai Qi, Qingming Huang. FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing. ACM International Conference on Multimedia (ACM MM), Dublin, Ireland, Oct. 27–31, 2025.
PDF
Jiadong Pan, Liang Li, Hongcheng Gao, Zhengjun Zha, Qingming Huang, Jiebo Luo. SafeCFG: Controlling Harmful Features with Dynamic Safe Guidance for Safe Generation. ACM International Conference on Multimedia (ACM MM), Dublin, Ireland, Oct. 27–31, 2025.
PDF
Qiyang Wan, Ruiping Wang, Chengzhi Gao, Xilin Chen. Catch Your Concepts: A Flexible ConceptLocator for Interpretable Visual Recognition. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24–27, 2025.
PDF
Tianyue Wang, Shuang Yang, Shiguang Shan, Xilin Chen. GLip: A Global-Local Integrated Progressive Framework for Robust Visual Speech Recognition. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24–27, 2025.
PDF
Yujie Zhao, Jiabei Zeng, Shiguang Shan. Pose-Robust Calibration Strategy for Point-of-Gaze Estimation on Mobile Phones. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24–27, 2025.
PDF
Gaoxiang Cong, Jiadong Pan, Liang Li, Yuankai Qi, Yuxin Peng, Anton van den Hengel, Jian Yang, Qingming Huang. EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15863–15873, Nashville, TN, USA, Jun. 10–17, 2025.
PDF
Zonghui Guo, Yingjie Liu, Jie Zhang, Haiyong Zheng, Shiguang Shan. Face Forgery Video Detection via Temporal Forgery Cue Unraveling. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7396–7405, Nashville, TN, USA, Jun. 10–17, 2025.
PDF
Ziyi Bai, Hanxuan Li, Bin Fu, Chuyan Xiong, Ruiping Wang, Xilin Chen. R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 19456–19466, Nashville, TN, USA, Jun. 10–17, 2025.
PDF
Zhen Yang, Zhuo Tao, Qi Chen, Yuankai Qi, Liang Li, Anton van den Hengel, Qingming Huang. Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 24753–24762, Nashville, TN, USA, Jun. 10–17, 2025.
PDF
Yiheng Li, Ruibing Hou, Hong Chang, Shiguang Shan, Xilin Chen. UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 27805–27815, Nashville, TN, USA, Jun. 10–17, 2025.
PDF
Yue Wu, Zhaobo Qi, Junshu Sun, Yaowei Wang, Qingming Huang, Shuhui Wang. Video Language Model Pretraining with Spatio-temporal Masking. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8557–8567, Nashville, TN, USA, Jun. 10–17, 2025.
PDF
Yujie Wang, Yunwei Zhao, Jing Yang, Han Han, Shiguang Shan, Jie Zhang. Evaluating Cognitive-Behavioral Fixation via Multimodal User Viewing Patterns on Social Media. The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), Suzhou, China, Nov. 4–9, 2025.
PDF
Dan Han, Mingjie He, Jie Zhang, Shiguang Shan. Dual-Branch Partial Annotation Learning for Facial Attributes Recognition. IEEE 19th International Conference on Automatic Face and Gesture Recognition (FG), Tampa/Clearwater, FL, USA, May 26–30, 2025.
PDF
Xinkuan Qiu, Meina Kan, Yongbin Zhou, Shiguang Shan. Benchmarking Multimodal Large Language Models Against Image Corruptions. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19–23, 2025.
PDF