Shuqiang Jiang

Ph.D.

Tel:

010-62600505

Email:

sqjiang@ict.ac.cn

Address:

Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, No.6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing 100190, China

Combining Models from Multiple Sources for RGB-D Scene Recognition.

Xinhang Song, Shuqiang Jiang, Luis Herranz

IJCAI 2017: 4523-4529, Melbourne, Australia, August 19-25, 2017

[PDF]

Abstract

Depth can complement RGB with useful cues about object volumes and scene layout. However, RGB-D image datasets are still too small for directly training deep convolutional neural networks (CNNs), in contrast to the massive monomodal RGB datasets. Previous works in RGB-D recognition typically combine two separate networks for RGB and depth data, pretrained on a large RGB dataset and then fine-tuned on the respective target RGB and depth datasets. These approaches have several limitations: 1) they use only low-level filters learned from RGB data, and thus cannot properly exploit depth-specific patterns, and 2) RGB and depth features are combined only at high levels and rarely at lower levels. In this paper, we propose a framework that leverages knowledge acquired from large RGB datasets together with depth-specific cues learned from the limited depth data, obtaining more effective multi-source and multi-modal representations. We propose a multi-modal combination method that selects discriminative combinations of layers from the different source models and target modalities, capturing both high-level properties of the task and intrinsic low-level properties of both modalities.

Xinhang Song, Shuqiang Jiang, Luis Herranz. “Combining Models from Multiple Sources for RGB-D Scene Recognition”, in IJCAI 2017, CCF A
