Multimedia Computing and Multimodal Intelligence Group
Group Leader: Shuqiang Jiang (蒋树强), Professor
Email: sqjiang [at] ict dot ac dot cn
Group Overview

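The group currently has one professor, one associate professor, one postdoctoral researcher, and more than 10 doctoral and master's students.

The group has undertaken, or is currently undertaking, more than twenty projects, including the National Science Fund for Distinguished Young Scholars and the Excellent Young Scientists Fund of the National Natural Science Foundation of China (NSFC), NSFC Key Program and General Program projects, National 863 Program projects, Beijing municipal science and technology projects, and industry collaboration projects.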


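Awards:

The group's search-based multi-object recognition technology received the ACM ICMR 2013 Best Demo Award; its multi-sensor visual recognition technology won first place in the 2013 ImageCLEF Robot Vision competition; and its work on joint image-language understanding won first place in the ACM Multimedia 2016 Yahoo-Flickr Challenge on Caption Prediction.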


Datasets:

An instance-level image dataset for complex scenes. Homepage: http://vipl.ict.ac.cn/isia/instre/; paper: Shuang Wang, Shuqiang Jiang. INSTRE: A New Benchmark for Instance-Level Object Retrieval and Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Vol. 11(3), pp. 37:1-37:21, 2015.

A multi-sensor hand-held object detection dataset. Homepage: http://vipl.ict.ac.cn/isia/HOD/; paper: Xiong Lv, Shuqiang Jiang, Luis Herranz, Shuang Wang. RGB-D Hand-Held Object Recognition Based on Heterogeneous Feature Fusion. Journal of Computer Science and Technology, Vol. 30(2), pp. 340-352, 2015.

Several food-related datasets for food image recognition and multimodal recipe analysis. Homepage: http://123.57.42.89/FoodComputing__Dataset.html

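Research Areas

Analysis, understanding, and retrieval of multimedia content such as images and video;
Multimodal association, fusion, and understanding across vision, language, knowledge bases, and various kinds of contextual information;
Multimodal intelligent interaction.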


Selected Publications

Journal Papers

  • Xinhang Song, Bohan Wang, Liye Dong, Gongwei Chen, Xinyun Hu, Shuqiang Jiang. Object-to-Manipulation Graph for Affordance Navigation. CAAI Artificial Intelligence Research, 3: 9150032, 2024.
  • Yancun Yang, Weiqing Min, Jingru Song, Guorui Sheng, Lili Wang, Shuqiang Jiang. Lightweight Food Recognition via Aggregation Block and Feature Encoding. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM), Vol. 20, No. 10, pp. 1-25, 2024.
  • Tianyu Zhang, Weiqing Min, Tao Liu, Shuqiang Jiang, Yong Rui. Toward Egocentric Compositional Action Anticipation with Adaptive Semantic Debiasing. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM), Vol. 20, No. 5, pp. 1-21, 2024.
  • Qizheng Wang, Meiyi Yao, Xinhang Song, Yandong Liu, Xiaoying Xing, Yongye Chen, Fangbo Zhao, Ke Liu, Xiaoguang Cheng, Shuqiang Jiang, Ning Lang. Automated Segmentation and Classification of Knee Synovitis Based on MRI Using Deep Learning. Academic Radiology, Vol. 31, No. 4, pp. 1518-1527, 2024.
  • Tingjing Zhang, Mingyu Huang, Liangkai Chen, Yang Xia, Weiqing Min, Shuqiang Jiang. Machine learning and statistical models to predict all-cause mortality in type 2 diabetes: Results from the UK Biobank study. Diabetes & Metabolic Syndrome: Clinical Research & Reviews, 18(9): 103135, 2024.
  • Zhihui Feng, Hao Xiong, Weiqing Min, Sujuan Hou, Huichuan Duan, Zhonghua Liu, Shuqiang Jiang. Ingredient-Guided RGB-D Fusion Network for Nutritional Assessment. IEEE Transactions on AgriFood Electronics, 2024.
  • Guorui Sheng, Weiqing Min, Tao Yao, Jingru Song, Yancun Yang, Lili Wang, Shuqiang Jiang. Lightweight Food Image Recognition with Global Shuffle Convolution. IEEE Transactions on AgriFood Electronics (TAFE), Vol. 2, No. 2, pp. 392-402, 2024.
  • Yuxin Liu, Weiqing Min, Shuqiang Jiang, Yong Rui. Convolution-Enhanced Bi-Branch Adaptive Transformer with Cross-Task Interaction for Food Category and Ingredient Recognition. IEEE Transactions on Image Processing (TIP), Vol. 33, pp. 2572-2586, 2024.
  • Pengfei Zhou, Weiqing Min, Jiajun Song, Yang Zhang, Shuqiang Jiang. Synthesizing knowledge-enhanced features for real-world zero-shot food detection. IEEE Transactions on Image Processing (TIP), Vol. 33, pp. 1285-1298, 2024.
  • Qizheng Wang, Meiyi Yao, Xinhang Song, et al. Automated Segmentation and Classification of Knee Synovitis Based on MRI Using Deep Learning. Academic Radiology, 2023.
  • Sujuan Hou, Jiacheng Li, Weiqing Min, Qiang Hou, Yanna Zhao, Yuanjie Zheng, Shuqiang Jiang. Deep Learning for Logo Detection: A Survey. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM), Vol. 20, No. 3, pp. 1-23, 2023.
  • Jiajun Song, Zhuo Li, Weiqing Min, Shuqiang Jiang. Towards Food Image Retrieval via Generalization-oriented Sampling and Loss Function Design. ACM Transactions on Multimedia Computing, Communications and Applications (TOMM), Vol. 20, No. 1, pp. 1-19, 2023.
  • Wenjing Shao, Weiqing Min, Sujuan Hou, Mengjiang Luo, Tianhao Li, Yuanjie Zheng, Shuqiang Jiang. Vision-based Food Nutrition Estimation via RGB-D Fusion Network. Food Chemistry, Vol. 424, 2023.
  • Weiqing Min, Zhiling Wang, Jiahao Yang, Chunlin Liu, Shuqiang Jiang. Vision-based fruit recognition via multi-scale attention CNN. Computers and Electronics in Agriculture, Vol. 210, 2023.
  • Tianyu Zhang, Weiqing Min, Xinyang Han, Shuqiang Jiang. A Survey on Future Action Anticipation in Videos. Chinese Journal of Computers, Vol. 46, No. 6, pp. 1315-1338, 2023.
  • Mengjiang Luo, Weiqing Min, Zhiling Wang, Jiajun Song, Shuqiang Jiang. Ingredient Prediction via Context Learning Network with Class-Adaptive Asymmetric Loss. IEEE Transactions on Image Processing (TIP), Vol. 32, pp. 5509-5523, 2023.
  • Xinhang Song, Chenlong Liu, Haitao Zeng, Yaohui Zhu, Gongwei Chen, Xiaorong Qin, Shuqiang Jiang. Composite Object Relation Modeling for Few-Shot Scene Recognition. IEEE Transactions on Image Processing (TIP), Vol. 32, pp. 5678-5691, 2023.
  • Tianhao Li, Wensong Wei, Weiqing Min, Shujuan Xing, Chunjiang Zhang, Shuqiang Jiang. Deep Learning-based Near-infrared Hyperspectral Imaging for Food Nutrition Estimation. Foods, Vol. 12, No. 17, 3145, 2023.
  • Haitao Zeng, Xinhang Song, Shuqiang Jiang. Multi-Object Navigation Using Potential Target Position Policy Function. IEEE Transactions on Image Processing (TIP), Vol. 32, pp. 2608-2619, 2023.
  • Jiahao Yang, Xiangyang Li, Mao Zheng, Zihan Wang, Yongqing Zhu, Xiaoqian Guo, Yuchen Yuan, Zifeng Chai, Shuqiang Jiang. MemBridge: Video-Language Pre-training with Memory-Augmented Inter-Modality Bridge. IEEE Transactions on Image Processing (TIP), Vol. 32, pp. 4073-4087, 2023.

Conference Papers

  • Xiaohan Wang, Yuehu Liu, Xinhang Song, Yuyi Liu, Sixian Zhang, Shuqiang Jiang. An Interactive Navigation Method with Effect-oriented Affordance. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16446-16456, Seattle WA, USA, Jun. 17-21, 2024.
  • Zihan Wang, Xiangyang Li, Jiahao Yang, Yeqi Liu, Shuqiang Jiang. Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation. Conference on Robot Learning (CoRL), Munich, Germany, Nov. 6-9, 2024.
  • Yuyi Liu, Xinhang Song, Weijie Li, Xiaohan Wang, Shuqiang Jiang. A Category Agnostic Model for Visual Rearrangement. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16457-16466, Seattle WA, USA, Jun. 17-21, 2024.
  • Sixian Zhang, Xinyao Yu, Xinhang Song, Xiaohan Wang, Shuqiang Jiang. Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16414-16425, Seattle WA, USA, Jun. 17-21, 2024.
  • Zihan Wang, Xiangyang Li, Jiahao Yang, Yeqi Liu, Junjie Hu, Ming Jiang, Shuqiang Jiang. Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13753-13762, Seattle WA, USA, Jun. 17-21, 2024.
  • Xinyao Yu, Sixian Zhang, Xinhang Song, Xiaorong Qin, Shuqiang Jiang. Trajectory Diffusion for ObjectGoal Navigation. Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, Dec. 10-15, 2024.
  • Xiaohan Wang, Yuehu Liu, Xinhang Song, Beibei Wang, Shuqiang Jiang. CaMP: Causal Multi-policy Planning for Interactive Navigation in Multi-room Scenes. Annual Conference on Neural Information Processing Systems (NeurIPS), New Orleans, LA, Dec. 10-16, 2023.
  • Xiaohan Wang, Yuehu Liu, Xinhang Song, Beibei Wang, Shuqiang Jiang. Generating Explanations for Embodied Action Decision from Visual Observation. Proceedings of the 31st ACM International Conference on Multimedia, pp. 2838-2846, Ottawa, Canada, 2023.
  • Xiaorong Qin, Xinhang Song, Shuqiang Jiang. Bi-level Meta-learning for Few-shot Domain Generalization. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15900-15910, Vancouver, Canada, Jun. 18-22, 2023.
  • Sixian Zhang, Xinhang Song, Weijie Li, Yubing Bai, Xinyao Yu, Shuqiang Jiang. Layout-Based Causal Inference for Object Navigation. IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10792-10802, Vancouver, Canada, Jun. 18-22, 2023.
  • Yongqing Zhu, Shuqiang Jiang. “Attention-based Densely Connected LSTM for Video Captioning,” 27th ACM International Conference on Multimedia (ACM Multimedia 2019), Nice, France, October 21-25, 2019.
  • Weiqing Min, Linhu Liu, Zhengdong Luo, Shuqiang Jiang, "Ingredient-Guided Cascaded Multi-Attention Network for Food Recognition," 27th ACM International Conference on Multimedia (ACM Multimedia 2019), Nice, France, October 21-25, 2019.
  • Xinhang Song, Bohan Wang, Gongwei Chen, Shuqiang Jiang, "MUCH: MUtual Coupling enHancement of scene recognition and dense captioning," 27th ACM International Conference on Multimedia (ACM Multimedia 2019), Nice, France, October 21-25, 2019.
  • Shulin Li, Weigang Zhang, Guorong Li, Li Su, Qingming Huang, "Vehicle Detection in UAV Traffic Video Based on Convolution Neural Network," IEEE 1st International Conference on Multimedia Information Processing and Retrieval (MIPR 2018), Miami, FL, USA, April 10-12, 2018.
  • Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang*, “From Common to Special: When Multi-Attribute Learning Meets Personalized Opinions”, 32nd AAAI Conference on Artificial Intelligence (AAAI2018), New Orleans, Louisiana, United States, Feb 2-7, 2018.
  • Yaohui Zhu, Shuqiang Jiang, "Deep Structured Learning for Visual Relationship Detection," Thirty-Second AAAI Conference on Artificial Intelligence (AAAI2018), New Orleans, Louisiana, USA, February 2-7, 2018.
  • Yongjian Xin, Shuhui Wang, Liang Li, Weigang Zhang, Qingming Huang, "Reverse Densely Connected Feature Pyramid Network for Object Detection," 14th Asian Conference on Computer Vision (ACCV2018), Perth, Australia, December 2-6, 2018.
  • Qianqian Xu, Jiechao Xiong, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan Yao, “A Margin-based MLE for Crowdsourced Partial Ranking”, 26th ACM International Conference on Multimedia (ACMMM2018), Seoul, Korea, October 22-26, 2018.
  • Jiangyangbang Yan, Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang, “When to Learn What: Deep Cognitive Subspace Clustering”, 26th ACM International Conference on Multimedia (ACMMM2018), Seoul, Korea, October 22-26, 2018.
  • Yiling Wu, Shuhui Wang, Qingming Huang, “Learning Semantic Structure-preserved Embeddings for Cross-modal Retrieval”, 26th ACM International Conference on Multimedia (ACMMM2018), Seoul, Korea, October 22-26, 2018.