Multimedia Computing and Multimodal Intelligence Group
Group Leader: Shuqiang Jiang, Professor
Email: sqjiang [at] dot ict dot ac dot cn
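Group Overview
    The group currently has 1 professor, 1 associate professor, 1 postdoctoral researcher, and more than 10 doctoral and master's students.
    It has undertaken or is undertaking more than twenty projects, including the National Science Fund for Distinguished Young Scholars and the Excellent Young Scientists Fund of the National Natural Science Foundation of China (NSFC), NSFC Key Program and General Program projects, National 863 Program projects, Beijing municipal science and technology projects, and industry collaboration projects.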

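Awards:
    The group's search-based multi-object recognition technology received the ACM ICMR2013 Best Demo Award; its multi-sensor visual recognition technology won first place in the 2013 ImageClef Robot Vision competition; and its work on joint image-language understanding won first place in the ACM Multimedia 2016 Yahoo-Flickr Challenge on Caption Prediction.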


Datasets:

    An instance-level image dataset for complex scenes. Homepage: http://vipl.ict.ac.cn/isia/instre/; paper: Shuang Wang, Shuqiang Jiang, INSTRE: A New Benchmark for Instance-Level Object Retrieval and Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Vol.11(3), pp. 37:1-37:21, 2015
    A multi-sensor hand-held object detection dataset. Homepage: http://vipl.ict.ac.cn/isia/HOD/; paper: Xiong Lv, Shuqiang Jiang, Luis Herranz, Shuang Wang, RGB-D Hand-Held Object Recognition Based on Heterogeneous Feature Fusion. Journal of Computer Science and Technology, Vol.30(2), pp.340-352, 2015
    Several food-related datasets for food image recognition and multimodal recipe analysis. Homepage: http://123.57.42.89/FoodComputing__Dataset.html
Research Topics

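Analysis, understanding, and retrieval of multimedia information such as images and video;
Multimodal association, fusion, and understanding across vision, language, knowledge bases, and various contextual information;
Multimodal intelligent interaction.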


Selected Publications

Journal Papers

  • Chunjie Zhang, Guibo Zhu, Chao Liang, Yifan Zhang, Qingming Huang, Qi Tian, “Image Class Prediction by Joint Object, Context, and Background Modeling”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 28, Issue 2, 2018, 428-438.
  • Zhe Xue, Guorong Li*, Shuhui Wang, Weigang Zhang, Qingming Huang*, “Bilevel Multiview Latent Space Learning”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 28, Issue 2, 2018, 327-341.
  • Dawei Du, Longyin Wen, Honggang Qi, Qingming Huang, Qi Tian, and Siwei Lyu, “Iterative Graph Seeking for Object Tracking”, IEEE Transactions on Image Processing, Vol. 27, Issue 4, 2018, 1809-1821.
  • Weiqing Min, Bing-Kun Bao, Shuhuan Mei, Yaohui Zhu, Yong Rui, Shuqiang Jiang, "You Are What You Eat: Exploring Rich Recipe Information for Cross-Region Food Analysis," IEEE Trans. Multimedia, 20(4): 950-964, 2018.
  • Shuhuan Mei, Weiqing Min, Hua Duan, Shuqiang Jiang, "Instance-level object retrieval via deep region CNN," Multimedia Tools and Applications (MTAP), 78(10): 13247-13261, 2019.
  • Chengpeng Chen, Weiqing Min, Xue Li, Shuqiang Jiang, "Hybrid incremental learning of new data and new classes for hand-held object recognition," Journal of Visual Communication and Image Representation (JVCI), 58: 138-148, 2019
  • Weiqing Min, Shuqiang Jiang, Linhu Liu, Yong Rui and Ramesh Jain, "A Survey on Food Computing," ACM Computing Surveys (CSUR), 52(2): 92:1–92:36, 2019
  • Xinhang Song, Shuqiang Jiang, Luis Herranz, Chengpeng Chen, "Learning Effective RGB-D Representations for Scene Recognition," IEEE Transactions on Image Processing (TIP), Vol. 28, No. 2, pp. 980-993, February 2019, doi:10.1109/TIP.2018.2872629.
  • Xiangyang Li, Shuqiang Jiang, “Know More Say Less: Image Captioning Based on Scene Graphs,” IEEE Transactions on Multimedia (TMM), vol.21, no.8, pp.2117-2130, Aug.2019.
  • Shuqiang Jiang, Weiqing Min, Shuhuan Mei, "Hierarchy-Dependent Cross-Platform Multi-View Feature Learning for Venue Category Prediction," IEEE Transactions on Multimedia, 21(6): 1609–1619, 2019.
  • Weiqing Min, Shuqiang Jiang, and Ramesh Jain, "Food Recommendation: Framework, Existing Solutions and Challenges," IEEE Transactions on Multimedia (TMM), vol.22, no.10, pp.2659-2671, 2020.
  • Weiqing Min, Shuhuan Mei, Linhu Liu, Yi Wang, and Shuqiang Jiang, "Multi-Task Deep Relative Attribute Learning for Visual Urban Perception," IEEE Trans. on Image Processing, 29(1): 657-669, 2020.
  • Gongwei Chen, Xinhang Song, Haitao Zeng, Shuqiang Jiang, "Scene Recognition With Prototype-Agnostic Scene Layout," IEEE Transactions on Image Processing (TIP), vol.29, pp.5877-5888, 2020.
  • Shuqiang Jiang, Weiqing Min, Linhu Liu, Zhengdong Luo, "Multi-Scale Multi-View Deep Feature Aggregation for Food Recognition," IEEE Trans. on Image Processing, 29(1): 265-276, 2020.
  • Xinhang Song, Shuqiang Jiang, Bohan Wang, Chengpeng Chen, Gongwei Chen, "Image Representations with Spatial Object-to-Object Relations for RGB-D Scene Recognition," IEEE Transactions on Image Processing (TIP), vol.29, pp.525-537, 2020.
  • Xiangyang Li, Luis Herranz, Shuqiang Jiang, "Multifaceted Analysis of Fine-Tuning in a Deep Model for Visual Recognition," ACM Transactions on Data Science (ACM TDS), vol. 1, no. 1, pp. 4:1-4:22, 2020.
  • Shuqiang Jiang, Weiqing Min, Yongqiang Lyu, Linhu Liu, "Few-Shot Food Recognition via Multi-View Representation Learning," ACM Transactions on Multimedia Computing, Communications and Applications, vol.16, no.3, pp.87:1-87:20, 2020.
  • Weiqing Min, Shuhuan Mei, Zhuo Li and Shuqiang Jiang, "A Two-Stage Triplet Network Training Framework for Image Retrieval," IEEE Trans. Multim. 22(12): 3128-3138 (2020)
  • Yaohui Zhu, Weiqing Min and Shuqiang Jiang. Attribute-Guided Feature Learning for Few-Shot Image Recognition. IEEE Transactions on Multimedia (TMM), 23: 1200-1209, 2021.

Conference Papers

  • Shulin Li, Weigang Zhang, Guorong Li, Li Su, Qingming Huang, "Vehicle Detection in UAV Traffic Video Based on Convolution Neural Network," IEEE 1st International Conference on Multimedia Information Processing and Retrieval (MIPR2018), Miami, FL, USA, April 10-12, 2018.
  • Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang*, “From Common to Special: When Multi-Attribute Learning Meets Personalized”, 32nd AAAI Conference on Artificial Intelligence (AAAI2018), New Orleans, Louisiana, United States, Feb 2-7, 2018.
  • Yaohui Zhu, Shuqiang Jiang, "Deep Structured Learning for Visual Relationship Detection," Thirty-Second AAAI Conference on Artificial Intelligence (AAAI2018), New Orleans, Louisiana, USA, February 2-7, 2018.
  • Yongjian Xin, Shuhui Wang, Liang Li, Weigang Zhang, Qingming Huang, "Reverse Densely Connected Feature Pyramid Network for Object Detection," 14th Asian Conference on Computer Vision (ACCV2018), Perth, Australia, December 2-6, 2018.
  • Qianqian Xu, Jiechao Xiong, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan Yao, “A Margin-based MLE for Crowdsourced Partial Ranking”, 26th ACM International Conference on Multimedia (ACMMM2018), Seoul, Korea, October 22-26, 2018.
  • Jiangyangbang Yan, Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang, “When to Learn What: Deep Cognitive Subspace Clustering”, 26th ACM International Conference on Multimedia (ACMMM2018), Seoul, Korea, October 22-26, 2018.
  • Yiling Wu, Shuhui Wang, Qingming Huang, “Learning Semantic Structure-preserved Embeddings for Cross-modal Retrieval”, 26th ACM International Conference on Multimedia (ACMMM2018), Seoul, Korea, October 22-26, 2018.
  • Yongqing Zhu, Shuqiang Jiang. “Attention-based Densely Connected LSTM for Video Captioning,” 27th ACM International Conference on Multimedia (ACM Multimedia 2019), Nice, France, October 21-25, 2019.
  • Weiqing Min, Linhu Liu, Zhengdong Luo, Shuqiang Jiang, "Ingredient-Guided Cascaded Multi-Attention Network for Food Recognition," 27th ACM International Conference on Multimedia (ACM Multimedia 2019), Nice, France, October 21-25, 2019.
  • Xinhang Song, Bohan Wang, Gongwei Chen, Shuqiang Jiang, "MUCH: MUtual Coupling enHancement of scene recognition and dense captioning," 27th ACM International Conference on Multimedia (ACM Multimedia 2019), Nice, France, October 21-25, 2019.
  • Xinhang Song, Sixian Zhang, Yuyun Hua, Shuqiang Jiang, "Aberrance-aware gradient-sensitive attentions for scene recognition with RGB-D videos," 27th ACM International Conference on Multimedia (ACM Multimedia 2019), Nice, France, October 21-25, 2019.
  • Tianyu Zhang, Weiqing Min, Ying Zhu, Yong Rui, Shuqiang Jiang, "An Egocentric Action Anticipation Framework via Fusing Intuition and Analysis," 28th ACM International Conference on Multimedia (ACM Multimedia 2020), pp.402-410, Seattle, United States, October 12-16, 2020.
  • Xinhang Song, Haitao Zeng, Sixian Zhang, Luis Herranz, Shuqiang Jiang, "Generalized Zero-shot Learning with Multi-source Semantic Embeddings for Scene Recognition," 28th ACM International Conference on Multimedia (ACM Multimedia 2020), Seattle, United States, October 12-16, 2020.
  • Weiqing Min, Linhu Liu, Zhiling Wang, Zhengdong Luo, Xiaoming Wei, Xiaolin Wei, Shuqiang Jiang, "ISIA Food-500: A dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network," 28th ACM International Conference on Multimedia (ACM Multimedia 2020), pp.393-401, Seattle, United States, October 12-16, 2020.
  • Xiaoqian Guo, Xiangyang Li, Shuqiang Jiang, "Expressional Region Retrieval," 28th ACM International Conference on Multimedia (ACM Multimedia 2020), pp.2581-2589, Seattle, United States, October 12-16, 2020.
  • Sixian Zhang, Xinhang Song, Yubing Bai, Weijie Li, Yakui Chu and Shuqiang Jiang. Hierarchical Object-to-zone Graph for Object Navigation. IEEE/CVF International Conference on Computer Vision (ICCV), pages 15130–15140, Montreal, Canada, Oct. 11-17, 2021.
  • Tianyu Zhang, Weiqing Min, Jiahao Yang, Tao Liu, Shuqiang Jiang and Yong Rui. What If We Could Not See? Counterfactual Analysis for Egocentric Action Anticipation. International Joint Conference on Artificial Intelligence (IJCAI), pp. 1316-1322, Virtual Event / Montreal Canada, Aug. 19-26, 2021.
  • Weijie Li, Xinhang Song, Yubing Bai, Sixian Zhang and Shuqiang Jiang. ION: Instance-level Object Navigation. ACM International Conference on Multimedia (ACM Multimedia), pp. 4343-4352, Chengdu, China, Oct. 20-24, 2021.
  • Zhuo Li, Weiqing Min, Jiajun Song, Yaohui Zhu, Liping Kang, Xiaoming Wei, Xiaolin Wei and Shuqiang Jiang. Rethinking the Optimization of Average Precision: Only Penalizing Negative Instances before Positive Ones is Enough. AAAI Conference on Artificial Intelligence (AAAI), 2022. (Accepted)
  • Gongwei Chen, Xinhang Song, Bohan Wang and Shuqiang Jiang. See More for Scene: Pairwise Consistency Learning for Scene Classification. Annual Conference on Neural Information Processing Systems (NeurIPS), 2021. (Accepted)