Visual Information Processing and Learning Group, Institute of Computing Technology, Chinese Academy of Sciences



Research
Multimedia Computing and Multimodal Intelligence Group

Group Leader: Shuqiang Jiang (蒋树强), Professor

Email: sqjiang@ict.ac.cn

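*  The group currently has 1 professor, 1 postdoctoral researcher, and more than 10 doctoral and master's students.

*  The group has led or is leading more than ten projects, including a National Natural Science Foundation of China (NSFC) Excellent Young Scientists Fund grant, an NSFC Key Program project, NSFC General Program projects, National 863 Program projects, Beijing municipal science and technology projects, and industry collaboration projects.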

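*Awards:

Search-based multi-object recognition technology won the ACM ICMR 2013 Best Demo Award;
Multi-sensor visual recognition technology won first place in the 2013 ImageCLEF Robot Vision competition;
Work on joint image and language understanding won first place in the ACM Multimedia 2016 Yahoo-Flickr Challenge on Caption Prediction.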
Datasets:

An instance-level image dataset for complex scenes. Homepage: http://vipl.ict.ac.cn/isia/instre/; paper: Shuang Wang, Shuqiang Jiang, INSTRE: A New Benchmark for Instance-Level Object Retrieval and Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications(TOMM), Vol.11(3), pp. 37:1-37:21, 2015
A multi-sensor hand-held object detection dataset. Homepage: http://vipl.ict.ac.cn/isia/HOD/; paper: Xiong Lv, Shuqiang Jiang, Luis Herranz, Shuang Wang, RGB-D Hand-Held Object Recognition Based on Heterogeneous Feature Fusion. Journal of Computer Science and Technology, Vol.30(2), pp.340-352, 2015
A geolocalized multimodal food image dataset. Homepage: http://vipl.ict.ac.cn/isia/datasets_dish/index.html; paper: Ruihan Xu, Luis Herranz, Shuqiang Jiang, Shuang Wang, Xinhang Song, Ramesh Jain, Geolocalized Modeling for Dish Recognition. IEEE Trans. Multimedia, Vol.17(8), pp.1187-1199, 2015
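Research Topics

*Analysis, understanding, and retrieval of multimedia information such as images and video;
*Multimodal association, fusion, and understanding across vision, language, knowledge bases, and various kinds of contextual information;
*Multimodal intelligent interaction.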


Selected Publications

Journal Papers

1.    Liang Zhang, Bingpeng Ma, Guorong Li, Qingming Huang, Qi Tian, "Generalized Semi-Supervised and Structured Subspace Learning for Cross-Modal Retrieval," IEEE Transactions on Multimedia(TMM2017), vol. 20, no. 1, pp. 128-141, 2018. 【pdf】

2.    Xiangyang Li, Shuqiang Jiang, "Bundled Object Context for Referring Expressions," IEEE Transactions on Multimedia, 2018. (Accepted)

3.    Jiaming Zhang, Shuhui Wang, Qingming Huang, "Location-Based Parallel Tag Completion for Geo-Tagged Social Image Retrieval," ACM Transactions on Intelligent Systems and Technology, vol. 8, no. 3, pp. 1-21, 2017. 【pdf】

4.    Siyuan Liu, Shuhui Wang, Qiang Qu, "Trajectory Mining," Encyclopedia of GIS, pp. 2310-2313, 2017.

5.    Liang Zhang, Bingpeng Ma, Guorong Li, Qingming Huang, Qi Tian, "Cross-Modal Retrieval Using Multi-Ordered Discriminative Structured Subspace Learning," IEEE Transactions on Multimedia(TMM2017), vol. 19, no. 6, pp. 1220-1233, 2017. 【pdf】

6.    Weiqing Min, Bingkun Bao, Shuhuan Mei, Yaohui Zhu, Yong Rui, Shuqiang Jiang, "You Are What You Eat: Exploring Multi-Modal and Multi-Attribute Information from Recipes for Cross-Region Food Analysis," IEEE Transactions on Multimedia(TMM2017), 2017.

7.    Weiqing Min, Shuqiang Jiang, Jitao Sang, Huayang Wang, Xinda Liu, Luis Herranz, "Being a Supercook: Joint Food Attributes and Multimodal Content Modeling for Recipe Retrieval and Exploration," IEEE Transactions on Multimedia(TMM2017), 2017. 【pdf】

8.    Yanhao Zhang, Lei Qin, Rongrong Ji, Sicheng Zhao, Qingming Huang, Jiebo Luo, "Exploring Coherent Motion Patterns Via Structured Trajectory Learning for Crowd Mood Modeling," IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 3, pp. 635-648, 2017.

9.    Zhe Xue, Guorong Li, Shuhui Wang, Weigang Zhang, Qingming Huang, "Bi-Level Multi-View Latent Space Learning," IEEE Transactions on Circuits and Systems for Video Technology, (DOI:10.1109/TCSVT.2016.2607842) (Accepted) 【pdf】

10.    Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian, "Multimodal Similarity Gaussian Process Latent Variable Model," IEEE Transactions on Image Processing(TIP2017), vol. 26, no. 9, pp. 4168-4181, 2017. 【pdf】

11.    Xinhang Song, Shuqiang Jiang, Luis Herranz, "Multi-Scale Multi-Feature Context Modeling for Scene Recognition in the Semantic Manifold," IEEE Transactions on Image Processing(TIP2017), 2017. 【pdf】

12.    Siyuan Liu, Shuhui Wang, "Trajectory Community Discovery and Recommendation by Multi-Source Diffusion Modeling," IEEE Transactions on Knowledge and Data Engineering(TKDE2017), vol. 29, no. 4, pp. 898-911, 2017.

13.    Chunjie Zhang, Chao Liang, Liang Li, Jing Liu, Qingming Huang, Qi Tian, "Fine-Grained Image Classification Via Low-Rank Sparse Coding with General and Class-Specific Codebooks," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 7, pp. 1550-1559, 2017. 【pdf】

14.    Junbiao Pang, Jing Huang, Lei Qin, Weigang Zhang, Laiyun Qing, Qingming Huang, Baocai Yin, "Rotative Maximal Pattern: A Local Coloring Descriptor for Object Classification and Recognition," Information Sciences, vol. 405, pp. 190-206, 2017.

15.    Shuqiang Jiang, Liangliang Cao, Jitao Sang, Jiebo Luo, Ramesh Jain, "Guest Editorial: Mobile Visual Tagging with Mobile Context," Multimedia Systems, vol. 23, no. 6, pp. 645-646, 2017. 【pdf】

16.    Weiqing Min, Shuqiang Jiang, Shuhui Wang, Ruihan Xu, Yushan Cao, Luis Herranz, Zhiqiang He, "A Survey on Context-Aware Mobile Visual Recognition," Multimedia Systems, vol. 23, no. 6, pp. 647-665, 2017. 【pdf】

17.    Xiong Lv, Xinda Liu, Xiangyang Li, Xue Li, Shuqiang Jiang, Zhiqiang He, "Modality-Specific and Hierarchical Feature Learning for RGB-D Hand-Held Object Recognition," Multimedia Tools and Applications(MULTIMEDIA TOOLS APPL2017), vol. 76, no. 3, pp. 4273-4290, 2017. 【pdf】

18.    Jun Huang, Guorong Li, Shuhui Wang, Zhe Xue, Qingming Huang, "Multi-label Classification by Exploiting Local Positive and Negative Pairwise Label Correlation," Neurocomputing, vol. 257, pp. 164-174, 2017. 【pdf】

19.    Luis Herranz, Shuqiang Jiang, "Scalable Storyboards in Handheld Devices: Applications and Evaluation Metrics," Multimedia Tools and Applications, vol. 75, no. 10, pp. 12597-12625, 2016.

Conference Papers

1.    Xinhang Song, Luis Herranz, Shuqiang Jiang, "Depth CNNs for RGB-D Scene Recognition: Learning from Scratch Better than Transferring from RGB-CNNs," AAAI Conference on Artificial Intelligence(AAAI2017), 2017. 【pdf】

2.    Junbao Zhuo, Shuhui Wang, Weigang Zhang, Qingming Huang, "Deep Unsupervised Convolutional Domain Adaptation," ACM Multimedia Conference(ACMMM2017), 2017. 【pdf】

3.    Liang Zhang, Bingpeng Ma, Guorong Li, Qingming Huang, Qi Tian, "Multi-Networks Joint Learning for Large-Scale Cross-Modal Retrieval," ACM Multimedia Conference(ACMMM2017), Mountain View, CA, USA, 2017. 【pdf】

4.    Weiqing Min, Shuqiang Jiang, Shuhui Wang, Jitao Sang, Shuhuan Mei, "A Delicious Recipe Analysis Framework for Exploring Multi-Modal Recipes with Various Attributes," ACM Multimedia Conference(ACMMM2017), 2017. 【pdf】

5.    Xinhang Song, Chengpeng Chen, Shuqiang Jiang, "RGB-D Scene Recognition with Object-to-Object Relation," ACM Multimedia Conference(ACMMM2017), 2017. 【pdf】

6.    Shijie Yang, Liang Li, Shuhui Wang, Weigang Zhang, Qingming Huang, "Multi-View Subspace Learning with Diversity Enforced Skeleton Embedding," IEEE International Conference on Multimedia Big Data(BigMM2017), Laguna Hills, USA, pp. 121-128, 2017.

7.    Sisi Liang, Xiangyang Li, Yongqing Zhu, Xue Li, Shuqiang Jiang, "ISIA at the ImageCLEF 2017 Image Caption Task," Conference and Labs of the Evaluation Forum(CLEF2017), 2017. 【pdf】

8.    Shijie Yang, Liang Li, Shuhui Wang, Weigang Zhang, Qingming Huang, "A Graph Regularized Deep Neural Network for Unsupervised Image Representation Learning," IEEE Conference on Computer Vision and Pattern Recognition(CVPR2017), Honolulu, Hawaii, USA, 2017.

9.    Yiling Wu, Shuhui Wang, Weigang Zhang, Qingming Huang, "Online Asymmetric Similarity Learning for Cross-Modal Retrieval," IEEE Conference on Computer Vision and Pattern Recognition(CVPR2017), 2017. 【pdf】

10.    Guoli Song, Shuhui Wang, Qingming Huang, and Qi Tian, "Multimodal Gaussian Process Latent Variable Models with Harmonization," IEEE International Conference on Computer Vision(ICCV2017), pp. 5039-5047, Venice, Italy, 2017. 【pdf】

11.    Liang Zhang, Bingpeng Ma, Guorong Li, Qingming Huang, "Metric based on Multi-order Spaces for Cross-modal Retrieval," IEEE International Conference on Multimedia and Expo(ICME2017), Hong Kong, China, 2017. 【pdf】

12.    Minfeng Zhan, Liang Li, Yugui Liu, Qingming Huang, "Cross-Media Retrieval with Semantics Clustering and Enhancement," IEEE International Conference on Multimedia and Expo(ICME2017), Hong Kong, China, 2017.

13.    Xiaodan Zhang, Shengfeng He, Xinhang Song, Pengxu Wei, Shuqiang Jiang, Qixiang Ye, Jianbin Jiao, Rynson W.H. Lau, "Keyword-Driven Image Captioning Via Context-Dependent Bilateral LSTM," IEEE International Conference on Multimedia and Expo(ICME2017), 2017. 【pdf】

14.    Yaohui Zhu, Shuqiang Jiang, Xiangyang Li, "Visual Relationship Detection with Object Spatial Distribution," IEEE International Conference on Multimedia and Expo(ICME2017), 2017. 【pdf】

15.    Yiling Wu, Shuhui Wang, Weigang Zhang, Qingming Huang, "Online Low-Rank Similarity Function Learning with Adaptive Relative Margin for Cross-Modal Retrieval," IEEE International Conference on Multimedia and Expo(ICME2017), 2017. 【pdf】

16.    Liang Zhang, Bingpeng Ma, Jianfeng He, Guorong Li, Qingming Huang, Qi Tian, "Adaptively Unified Semi-Supervised Learning for Cross-Modal Retrieval," International Joint Conference on Artificial Intelligence(IJCAI2017), Melbourne, Australia, 2017. 【pdf】

17.    Shuqiang Jiang, Weiqing Min, Xue Li, Huayang Wang, Jian Sun, Jiaqi Zhou, "Dual Track Multimodal Automatic Learning through Human-Robot Interaction," International Joint Conference on Artificial Intelligence(IJCAI2017), 2017. 【pdf】

18.    Xinge Zhu, Liang Li, Weigang Zhang, Tianrong Rao, Min Xu, Qingming Huang, Dong Xu, "Dependency Exploitation: A Unified CNN-RNN Approach for Visual Emotion Recognition," International Joint Conference on Artificial Intelligence(IJCAI2017), Melbourne, Australia, 2017.

19.    Xinhang Song, Shuqiang Jiang, Luis Herranz, "Combining Models from Multiple Sources for RGB-D Scene Recognition," International Joint Conference on Artificial Intelligence(IJCAI2017), 2017. 【pdf】


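Visual Information Processing and Learning Group
  • Address: No. 6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing
  • Postal code: 100190
  • Phone: 010-62600514
  • Email: yi.cheng@vipl.ict.ac.cn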