Aberrance-aware gradient-sensitive attentions for scene recognition
with RGB-D videos

Xinhang Song, Sixian Zhang, Yuyun Hua, Shuqiang Jiang
(ACMMM 2019)


With the development of deep learning, previous approaches have achieved success in scene recognition using massive RGB data obtained in ideal environments. However, scene recognition in the real world may face various aberrant conditions caused by unavoidable factors, such as lighting variance in the environment and camera limitations, which may degrade the performance of previous models. Our motivation is to investigate robust scene recognition models that work beyond ideal conditions, in unconstrained environments. In this paper, we propose an aberrance-aware framework for RGB-D scene recognition, in which several types of attention (temporal, spatial, and modal) are integrated into spatio-temporal RGB-D CNN models to avoid interference from RGB frame blurring, missing depth, and lighting variance. All attentions are obtained homogeneously by projecting gradient-sensitive maps of the visual data into the corresponding spaces. In particular, the gradient maps are captured by convolutional operations with specially designed kernels, which can be seamlessly integrated into end-to-end CNN training. Experiments under different challenging conditions demonstrate the effectiveness of the proposed method.
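
As a rough illustration of the idea of deriving attentions from gradient maps, the sketch below convolves each frame with fixed kernels and projects the resulting gradient magnitude into a spatial map and a per-frame temporal weight. It is a minimal sketch under stated assumptions, not the method of the paper: the abstract does not name the kernels or the projection, so the Sobel kernels, the sum-to-one normalization, and the helper names (`conv2d_valid`, `gradient_map`, `spatial_attention`, `temporal_attention`) are all hypothetical choices made for this example.

```python
import math

# Sobel kernels: an assumed choice; the abstract does not specify the kernels.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def conv2d_valid(image, kernel):
    """Cross-correlate a 2D list with a small kernel, without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def gradient_map(frame):
    """Gradient magnitude of one grayscale frame."""
    gx = conv2d_valid(frame, SOBEL_X)
    gy = conv2d_valid(frame, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def spatial_attention(frame):
    """Assumed projection: normalize the gradient map over all positions."""
    grad = gradient_map(frame)
    width = len(grad[0])
    flat = normalize([v for row in grad for v in row])
    return [flat[i:i + width] for i in range(0, len(flat), width)]


def temporal_attention(frames):
    """Assumed projection: weight each frame by its mean gradient magnitude."""
    energies = []
    for frame in frames:
        grad = gradient_map(frame)
        energies.append(sum(map(sum, grad)) / (len(grad) * len(grad[0])))
    return normalize(energies)


# Toy 5x5 frames: a sharp step edge, a low-contrast ramp, and a flat frame.
sharp = [[0, 0, 0, 10, 10] for _ in range(5)]
soft = [[0, 1, 2, 3, 4] for _ in range(5)]
flat = [[5] * 5 for _ in range(5)]

weights = temporal_attention([sharp, soft, flat])
attention = spatial_attention(sharp)
```

In this toy setup the frame with the strongest edges receives the largest temporal weight and the textureless frame receives none, which mirrors the intuition of down-weighting blurred or missing data. In an end-to-end CNN the same fixed kernels could be applied as a non-trainable convolution layer, though how the paper does this is not stated in the abstract.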

  • Xinhang Song, Sixian Zhang, Yuyun Hua and Shuqiang Jiang. Aberrance-aware gradient-sensitive attentions for scene recognition with RGB-D videos. (ACM Multimedia 2019), 21-25 October 2019, Nice, France.