Visual Modeling----Visual Information Processing and Learning (VIPL)


Leader: Hong Chang (Associate Professor)

Email: changhong [at] ict.ac.cn

Introduction to the research group

The Visual Modeling (VISMOD) group researches innovative machine learning methodologies and technologies for solving real-world computer vision and AI+ problems. Its focus is on human-centric vision scenarios, including image and video representation, understanding, and generation, as well as on applying AI methods in other science and technology domains.

Research

The current main research topics include:

● Models and algorithms in machine learning, especially

1) Learning under data distribution shift: long-tailed learning, few-shot learning, meta-learning, etc.

2) Learning under weak annotations: unsupervised/semi-supervised/weakly supervised learning, etc.

3) Anomaly and generalization dilemma: anomaly detection, out-of-distribution (OOD) detection/generalization, etc.

● Human-centric visual modeling

1) Person re-identification

2) Human pose, motion understanding and generalization

3) Human-centric multimodal large language models

● AI4Science

1) Multimodal large language models with scientific data

2) Deep learning models in specific domains

Papers

Journal Papers

Nan Kang, Hong Chang, Bingpeng Ma, Shiguang Shan. A Comprehensive Framework for Long-tailed Learning via Pretraining and Normalization. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), Vol. 35, No. 3, pp. 3437-3449, 2024.
Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen. Triplet adaptation framework for robust semi-supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 46, No. 12, pp. 8056-8073, 2024.
Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen. Dual Compensation Residual Networks for Class Imbalanced Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 45, No. 10, pp. 11733-11752, 2023.
Ruibing Hou, Hong Chang, Bingpeng Ma, Rui Huang, Shiguang Shan. Temporal Multi-Scale Complementary Feature for Video Person Re-Identification. Chinese Journal of Computers, Vol. 46, No. 1, pp. 31-50, 2023.
Cheng Wang, Bingpeng Ma, Hong Chang, Shiguang Shan and Xilin Chen. Person Search by a Bi-directional Task-Consistent Learning Model. IEEE Transactions on Multimedia (TMM), vol. 25, pp. 1190-1203, 2023.
Xiaoyi Yin, Zhen Cui, Hong Chang, Bingpeng Ma and Shiguang Shan. Extending Generalized Unsupervised Manifold Alignment. SCIENCE CHINA Information Sciences, vol. 65, no. 7, pp. 135-152, 2022.
Ruibing Hou, Bingpeng Ma, Hong Chang, Xinqian Gu, Shiguang Shan and Xilin Chen. Feature Completion for Occluded Person Re-Identification. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 9, pp. 4894-4912, September 2022.
Shutao Bai, Bingpeng Ma, Hong Chang, Rui Huang, Shiguang Shan and Xilin Chen. SANet: Statistic Attention Network for Video-Based Person Re-Identification. IEEE Transactions on Circuits and Systems for Video Technology, vol. 32, no. 6, pp. 3866-3879, June 2022.
Furong Xu, Bingpeng Ma, Hong Chang and Shiguang Shan. PRDP: Person Re-identification with Dirty and Poor Data. IEEE Transactions on Cybernetics, vol. 52, no. 10, pp. 11014-11026, October 2022.
Xinqian Gu, Hong Chang, Bingpeng Ma and Shiguang Shan. Motion Feature Aggregation for Video-based Person Re-identification. IEEE Transactions on Image Processing, vol. 31, pp. 3908-3919, 2022.
Linmao Zhou, Hong Chang, Bingpeng Ma and Shiguang Shan. Interactive Regression and Classification for Dense Object Detector. IEEE Transactions on Image Processing, vol. 31, pp. 3684-3696, 2022.
Fengling Mao, Bingpeng Ma, Hong Chang, Shiguang Shan and Xilin Chen. Learning Efficient Text to Image Synthesis via Interstage Cross-sample Similarity Distillation. SCIENCE CHINA Information Sciences, 64(2): 120102:1-120102:12, 2021.
Ruibing Hou, Bingpeng Ma, Hong Chang, Xinqian Gu, Shiguang Shan and Xilin Chen. IAUnet: Global Context-Aware Feature Learning for Person Re-Identification. IEEE Trans. on Neural Networks and Learning Systems (TNNLS), 32(10):4460-4474, 2021.
Xiangzhou Zhang, Bingpeng Ma, Hong Chang, Shiguang Shan and Xilin Chen. Location Sensitive Network for Human Instance Segmentation. IEEE Transactions on Image Processing (TIP), 30:7649-7662, 2021.
Yucheng Chen, Rui Huang, Hong Chang, Chuanqi Tan, Tao Xue and Bingpeng Ma. Cross-Modal Knowledge Adaptation for Language-Based Person Search. IEEE Transactions on Image Processing (TIP), 30:4057-4069, 2021.

Conference Papers

Jiahe Zhao, Ruibing Hou, Zejie Tian, Hong Chang, Shiguang Shan. HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, Hawaii, USA, Oct. 19-23, 2025. (Accepted)
Zhuo Li*, Mingshuang Luo*, Ruibing Hou, Xin Zhao, Hao Liu, Hong Chang, Zimo Liu, Chen Li. Morph: A Motion-free Physics Optimization Framework for Human Motion Generation. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, Hawaii, USA, Oct. 19-23, 2025. (Accepted)
Mengdi Liu, Zhangyang Gao, Hong Chang, Ziqing Li, Shiguang Shan, Xilin Chen. G2PDiffusion: Cross-species Genotype-to-Phenotype Prediction via Evolutionary Diffusion. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, Hawaii, USA, Oct. 19-23, 2025. (Accepted)
Yiheng Li, Ruibing Hou, Hong Chang, Shiguang Shan, Xilin Chen. UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville TN, USA, Jun. 11-15, 2025. (Accepted)
Jinjing Hu, Wenrui Liu, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen. An Information Theoretical View for Out-Of-Distribution Detection. European Conference on Computer Vision (ECCV), pp. 418-435, Milano, Italy, Sep 29-Oct 4, 2024.
Minyang Hu, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen. Scalable Modular Network: A Framework for Adaptive Learning via Agreement Routing. International Conference on Learning Representations (ICLR), Vienna, Austria, May 7-11, 2024.
Jiachen Liang, Ruibing Hou, Minyang Hu, Hong Chang, Shiguang Shan, Xilin Chen. UMFC: Unsupervised Multi-Domain Feature Calibration for Vision-Language Models. Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, Dec. 10-15, 2024.
Mingshuang Luo, Ruibing Hou, Zhuo Li, Hong Chang, Zimo Liu, Yaowei Wang, Shiguang Shan. M3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation. Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, Dec. 10-15, 2024.
Nan Kang, Hong Chang, Bingpeng Ma, Shutao Bai, Shiguang Shan, Xilin Chen. Predictive Consistency Learning for Long-Tailed Recognition. British Machine Vision Conference (BMVC), Aberdeen, UK, Nov. 20-24, 2023.
Wenrui Liu, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen. Diversity-Measurable Anomaly Detection. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12147-12156, Vancouver, Canada, Jun. 20-22, 2023.
Botao Ye, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen. Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework. Proceedings of the 17th European Conference on Computer Vision (ECCV), Vol. 22, pp. 341-357, Oct. 23-27, 2022, Tel Aviv, Israel / Cyberspace.
Minyang Hu, Hong Chang, Bingpeng Ma and Shiguang Shan. Learning Continuous Graph Structure with Bilevel Programming for Graph Neural Networks. International Joint Conference on Artificial Intelligence (IJCAI), 2022.
Yinqi Li, Hong Chang, Bingpeng Ma, Shiguang Shan and Xilin Chen. Optimal Positive Generation via Latent Transformation for Contrastive Learning. Annual Conference on Neural Information Processing Systems (NeurIPS), 2022.
Zong Guo, Bingpeng Ma, Hong Chang, Xilin Chen. Gradual Domain Adaptation with Sample Transferability Exploitation for Person Re-Identification. Proceedings of the 2022 IEEE International Conference on Multimedia and Expo (ICME), July 18-22, 2022, Taipei, China.
Xinqian Gu, Hong Chang, Bingpeng Ma, Shutao Bai, Shiguang Shan, Xilin Chen. Clothes-Changing Person Re-identification with RGB Modality Only. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1050-1059, June 19-24, 2022, New Orleans, Louisiana, USA / Cyberspace.
Shutao Bai, Bingpeng Ma, Hong Chang, Rui Huang, Xilin Chen. Salient-to-Broad Transition for Video Person Re-identification. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7329-7338, June 19-24, 2022, New Orleans, Louisiana, USA / Cyberspace.
Xinqian Gu, Hong Chang, Bingpeng Ma, Hongkai Zhang, Xilin Chen. Appearance-Preserving 3D Convolution for Video-based Person Re-identification. Proceedings of the 16th European Conference on Computer Vision (ECCV), LNCS 12347, Vol. 2, pp. 228-243, Glasgow, UK / Cyberspace, August 23-28, 2020.
Ruibing Hou, Hong Chang, Bingpeng Ma, Rui Huang and Shiguang Shan. BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2014–2023, Virtual Event, Jun. 19-25, 2021.
Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen. Temporal Complementary Learning for Video Person Re-Identification. Proceedings of the 16th European Conference on Computer Vision (ECCV), LNCS 12370, Vol. 25, pp. 388-405, Glasgow, UK / Cyberspace, August 23-28, 2020.
Hongkai Zhang, Hong Chang, Bingpeng Ma, Shiguang Shan and Xilin Chen. Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection. British Machine Vision Conference (BMVC), Cardiff, UK, September 9-12, 2019.