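Shuqiang Jiang's Homepage
Shuqiang Jiang
Ph.D., Professor, Ph.D. Supervisor
Phone:
010-62600505
Email:
sqjiang@ict.ac.cn
Address:
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, No. 6 Kexueyuan South Road, Haidian District, Beijing 100190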

Combining Models from Multiple Sources for RGB-D Scene Recognition.

Xinhang Song, Shuqiang Jiang, Luis Herranz
IJCAI 2017: 4523-4529, Melbourne, Australia, August 19-25, 2017
[PDF]

Abstract

Depth can complement RGB with useful cues about object volumes and scene layout. However, RGB-D image datasets are still too small for directly training deep convolutional neural networks (CNNs), in contrast to the massive monomodal RGB datasets. Previous works in RGB-D recognition typically combine two separate networks for RGB and depth data, pretrained on a large RGB dataset and then fine-tuned on the respective target RGB and depth datasets. These approaches have several limitations: 1) they only use low-level filters learned from RGB data, and thus cannot properly exploit depth-specific patterns, and 2) RGB and depth features are combined only at high levels and rarely at lower levels. In this paper, we propose a framework that leverages both knowledge acquired from large RGB datasets and depth-specific cues learned from the limited depth data, obtaining more effective multi-source and multi-modal representations. We propose a multi-modal combination method that selects discriminative combinations of layers from the different source models and target modalities, capturing both high-level properties of the task and intrinsic low-level properties of both modalities.
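To make the idea of selecting combinations of layers from several source models more concrete, here is a minimal sketch. It is not the method from the paper: it swaps in a simple greedy forward selection with a nearest-centroid classifier, and it runs on synthetic features. The block names (`rgb_pretrained_fc`, `depth_scratch_conv`, `uninformative`) and both helper functions are hypothetical and chosen only for illustration.

```python
import numpy as np


def nearest_centroid_accuracy(train_x, train_y, val_x, val_y):
    """Validation accuracy of a nearest-centroid classifier (a stand-in classifier)."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every validation sample to every centroid.
    dists = ((val_x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == val_y).mean())


def greedy_select(train_feats, val_feats, train_y, val_y):
    """Greedily add feature blocks while the concatenation improves validation accuracy."""
    selected, best_acc = [], 0.0
    remaining = sorted(train_feats)
    while remaining:
        scores = {}
        for name in remaining:
            names = selected + [name]
            tr = np.concatenate([train_feats[n] for n in names], axis=1)
            va = np.concatenate([val_feats[n] for n in names], axis=1)
            scores[name] = nearest_centroid_accuracy(tr, train_y, va, val_y)
        best_name = max(scores, key=scores.get)
        if scores[best_name] <= best_acc:
            break  # no remaining block improves the current combination
        selected.append(best_name)
        best_acc = scores[best_name]
        remaining.remove(best_name)
    return selected, best_acc


def make_split(rng, labels):
    """Synthetic stand-ins for layer activations from different models and modalities."""
    n = len(labels)
    shift = labels[:, None].astype(float)
    return {
        # Hypothetical: a high-level layer of an RGB-pretrained network.
        "rgb_pretrained_fc": 3.0 * shift + rng.normal(size=(n, 4)),
        # Hypothetical: a low-level layer of a network trained on depth data.
        "depth_scratch_conv": 1.0 * shift + rng.normal(size=(n, 4)),
        # A block that carries no class information.
        "uninformative": rng.normal(size=(n, 4)),
    }


rng = np.random.default_rng(0)
train_y = np.repeat([0, 1], 20)
val_y = np.repeat([0, 1], 20)
train_feats = make_split(rng, train_y)
val_feats = make_split(rng, val_y)

selected, acc = greedy_select(train_feats, val_feats, train_y, val_y)
print("selected blocks:", selected, "validation accuracy:", acc)
```

The greedy loop stops as soon as no remaining block improves validation accuracy, so the result is a small subset of blocks instead of a concatenation of everything. A real system would use actual CNN activations and a stronger classifier.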

  • Xinhang Song, Shuqiang Jiang, Luis Herranz. “Combining Models from Multiple Sources for RGB-D Scene Recognition”, in IJCAI 2017, CCF A


