The group focuses on frontier research in Medical Visual Intelligence, developing approaches that span "Precise Perception", "Cognitive Reasoning", and "Decision Collaboration". To address cross-modal heterogeneity in medical imaging, the dynamic complexity of surgical scenes, and the high reliability demanded by clinical decisions, we investigate Medical Multimodal Large Models (MLMs), self-supervised representation learning, and controllable visual generation. By integrating multi-source healthcare data, we provide key algorithmic support for computer-aided diagnosis, surgical planning and navigation, and clinical decision-making, empowering clinicians with robust perception, insightful reasoning, and optimized intervention.
(1) Precise Multimodal Perception
We investigate robust visual perception methods built on multimodal medical data. This direction aims to overcome the limits of single-modality perception in complex medical environments (e.g., bleeding, smoke, and occlusion), enabling accurate localization, segmentation, and recognition of medical targets under these challenging conditions.
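As a minimal illustration of this idea (not our actual architecture), the sketch below fuses an RGB endoscopic frame with a hypothetical auxiliary modality, assumed here to be depth, through a learned per-pixel gate, so regions degraded in one stream (smoke, bleeding, occlusion) can be recovered from the other. All module names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class MultimodalFusionSeg(nn.Module):
    """Toy two-stream encoder with gated fusion for binary segmentation.

    Illustrative sketch only: fuses an RGB frame with an auxiliary
    modality so pixels unreliable in one modality can be resolved
    from the other.
    """

    def __init__(self, aux_channels: int = 1, hidden: int = 32):
        super().__init__()
        def encoder(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, hidden, 3, padding=1), nn.ReLU(),
                nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            )
        self.rgb_enc = encoder(3)
        self.aux_enc = encoder(aux_channels)
        # Per-pixel gate decides how much to trust each modality.
        self.gate = nn.Conv2d(2 * hidden, 1, 1)
        self.head = nn.Conv2d(hidden, 1, 1)  # binary mask logits

    def forward(self, rgb, aux):
        f_rgb, f_aux = self.rgb_enc(rgb), self.aux_enc(aux)
        g = torch.sigmoid(self.gate(torch.cat([f_rgb, f_aux], dim=1)))
        fused = g * f_rgb + (1 - g) * f_aux  # adaptive modality weighting
        return self.head(fused)

if __name__ == "__main__":
    model = MultimodalFusionSeg()
    rgb = torch.randn(2, 3, 64, 64)    # batch of endoscopic frames
    depth = torch.randn(2, 1, 64, 64)  # synthetic auxiliary modality
    print(model(rgb, depth).shape)     # torch.Size([2, 1, 64, 64])
```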
(2) Insightful Cross-modal Reasoning
We establish associative mappings between medical visual features and semantic knowledge to transcend the limitations of traditional "end-to-end" black-box models. This research focuses on deep reasoning guided by multimodal logical anchors, facilitating a shift from simple pattern recognition to interpretable clinical inference.
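A toy sketch of concept-anchored prediction follows: visual features are scored against a bank of named semantic anchors, so each output decomposes into interpretable concept similarities rather than an opaque end-to-end score. The concept list, feature dimension, and learnable anchor bank are illustrative assumptions; in a real pipeline the anchors would typically come from a text encoder over clinical concept descriptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptAnchoredClassifier(nn.Module):
    """Scores images by similarity to named clinical concept anchors.

    Illustrative sketch: every prediction decomposes into cosine
    similarities against interpretable, named concepts.
    """

    def __init__(self, feat_dim: int, concepts: list[str]):
        super().__init__()
        self.concepts = concepts
        # Placeholder anchors; a real system would derive these from
        # a text encoder over concept descriptions.
        self.anchors = nn.Parameter(torch.randn(len(concepts), feat_dim))

    def forward(self, feats):
        f = F.normalize(feats, dim=-1)
        a = F.normalize(self.anchors, dim=-1)
        return f @ a.t()  # cosine similarity to each concept anchor

if __name__ == "__main__":
    concepts = ["polyp", "ulcer", "normal mucosa"]  # hypothetical concepts
    model = ConceptAnchoredClassifier(feat_dim=128, concepts=concepts)
    feats = torch.randn(4, 128)  # stand-in visual features
    sims = model(feats)          # (4, 3) concept similarities
    for name, s in zip(concepts, sims[0].tolist()):
        print(f"{name}: {s:+.3f}")
```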
(3) Full-process Collaborative Optimization
We model the spatio-temporal evolution of surgical scenes to address the nonlinear deviations between static pre-operative plans and the complex, dynamic intra-operative environment. This work aims to develop personalized surgical navigation and evaluation systems that deliver precise, adaptive intra-operative guidance.
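As a schematic of this plan-versus-reality correction, the sketch below blends a static pre-operative target with streaming intra-operative observations via simple exponential smoothing and reports the drift from the original plan. The smoothing gain `alpha` and the simulated tissue drift are made-up assumptions; a deployed system would instead rely on deformable registration or filtering over the full scene state.

```python
import numpy as np

def update_target_estimate(plan_xyz, observed_xyz, prev_estimate, alpha=0.3):
    """Blend a static pre-operative target with intra-operative observations.

    Exponential-smoothing update: the running estimate drifts toward each
    new observation, modelling tissue shift away from the original plan.
    `alpha` is an illustrative smoothing gain, not a tuned value.
    """
    estimate = (1 - alpha) * prev_estimate + alpha * observed_xyz
    drift = np.linalg.norm(estimate - plan_xyz)  # deviation from the plan
    return estimate, drift

if __name__ == "__main__":
    plan = np.array([10.0, 20.0, 30.0])  # planned target position (mm)
    estimate = plan.copy()
    rng = np.random.default_rng(0)
    for t in range(5):  # simulated tissue drift plus observation noise
        observed = plan + np.array([0.4, -0.2, 0.1]) * (t + 1)
        observed += rng.normal(scale=0.05, size=3)
        estimate, drift = update_target_estimate(plan, observed, estimate)
        print(f"t={t}: drift from plan = {drift:.2f} mm")
```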