Visual Scene Understanding
Leader: Ruiping Wang (Professor)
Email: ruiping.wang [at] vipl.ict.ac.cn
Introduction of research group

Our group focuses on comprehensive scene understanding to enable intelligent perception and understanding of natural visual environments in the open world. More specifically, we aim to develop a vision-based robot system with basic capabilities similar to those of the human visual system for real-world visual scene understanding. These mainly include perceptual tasks such as object detection, object recognition, semantic segmentation, scene classification, attribute learning, and relationship extraction. To support more advanced natural-language description of visual concepts, the system can also incorporate language models and knowledge-based reasoning for cognitive tasks such as image/video captioning (description) and visual question answering.

Research

Research topics of our group mainly cover three aspects: 1) object recognition, e.g. zero-shot learning, incremental/life-long learning, image retrieval, and image classification; 2) scene understanding, e.g. object detection/segmentation, scene classification, relationship detection, and scene graph generation; and 3) language/knowledge-based cognition, e.g. image/video captioning (description), visual question answering, visual concept learning, and knowledge graphs.

Papers

Journal Papers

  • Shishi Qiao, Ruiping Wang, Shiguang Shan and Xilin Chen. Deep Video Code for Efficient Face Video Retrieval. Pattern Recognition, 113:107754, 2021.
  • Zhiwu Huang, Ruiping Wang, Xianqiu Li, Wenxian Liu, Shiguang Shan, Luc Van Gool, Xilin Chen, "Geometry-aware Similarity Learning on SPD Manifolds for Visual Recognition," IEEE Transactions on Circuits and Systems for Video Technology, 28(10):2513–2523, Oct. 2018.
  • Wen Wang, Ruiping Wang, Zhiwu Huang, Shiguang Shan, Xilin Chen, “Discriminant Analysis on Riemannian Manifold of Gaussian Distributions for Face Recognition with Image Sets,” IEEE Transactions on Image Processing (TIP), vol. 27, no. 1, pp. 151-163, Jan. 2018.
  • Haomiao Liu, Ruiping Wang, Shiguang Shan and Xilin Chen, “Deep Supervised Hashing for Fast Image Retrieval,” International Journal of Computer Vision, vol. 127, no. 9, pp. 1217–1234, Sep. 2019.
  • Haomiao Liu, Ruiping Wang, Shiguang Shan, Xilin Chen, “Learning Multifunctional Binary Codes for Personalized Image Retrieval,” International Journal of Computer Vision, vol. 128, no. 8, pp. 2223–2242, Sep. 2020.
  • Difei Gao, Ruiping Wang, Shiguang Shan, and Xilin Chen, "Learning to Recognize Visual Concepts for Visual Question Answering with Structural Label Space," IEEE Journal of Selected Topics in Signal Processing, 14(3):494-505, 2020.
  • Haomiao Liu, Ruiping Wang, Shiguang Shan and Xilin Chen. What is Tabby? Interpretable Model Decisions by Learning Attribute-based Classification Criteria. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 43(5):1791–1807, 2021.
  • Shishi Qiao, Ruiping Wang, Shiguang Shan, Xilin Chen, "Deep Heterogeneous Hashing for Face Video Retrieval," IEEE Transactions on Image Processing, vol. 29, no. 1, pp. 1299-1312, Dec. 2020.
  • Huajie Jiang, Ruiping Wang, Shiguang Shan, Yan Li, Haomiao Liu, Xilin Chen, “Attribute Annotation on Large Scale Image Database by Active Knowledge Transfer,” Image and Vision Computing, vol. 78, pp. 1-13, Oct. 2018.

Conference Papers

  • Wenbin Wang, Ruiping Wang and Xilin Chen. Topic Scene Graph Generation by Attention Distillation from Caption. IEEE/CVF International Conference on Computer Vision (ICCV), pp. 15900-15910, Montreal, Canada, Oct. 11-17, 2021.
  • Jiwei Xiao, Ruiping Wang and Xilin Chen. Holistic Pose Graph: Modeling Geometric Structure among Objects in a Scene using Graph Inference for 3D Object Prediction. IEEE/CVF International Conference on Computer Vision (ICCV), pp. 12717–12726, Montreal, Canada, Oct. 11-17, 2021.
  • Difei Gao, Ruiping Wang, Ziyi Bai and Xilin Chen. Env-QA: A Video Question Answering Benchmark for Comprehensive Understanding of Dynamic Environments. IEEE/CVF International Conference on Computer Vision (ICCV), pp. 1675-1685, Montreal, Canada, Oct. 11-17, 2021.
  • Chen He, Ruiping Wang and Xilin Chen. A Tale of Two CILs: The Connections Between Class Incremental Learning and Class Imbalanced Learning and Beyond. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshop on Continual Learning in Computer Vision (CLVision), pp. 3559–3569, Virtual Event, Jun. 19-25, 2021.
  • Difei Gao, Ke Li, Ruiping Wang, Shiguang Shan, Xilin Chen, "Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text," IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp. 12746–12756, 2020.
  • Sijin Wang, Ziwei Yao, Ruiping Wang, Zhongqin Wu and Xilin Chen. FAIEr: Fidelity and Adequacy Ensured Image Caption Evaluation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 14050–14059, Virtual Event, June 19-25, 2021.
  • Ruikui Wang, Shishi Qiao, Ruiping Wang, Shiguang Shan, Xilin Chen, "Hybrid Video and Image Hashing for Robust Face Retrieval," IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020), pp. 168-175, 2020.
  • Wenbin Wang, Ruiping Wang, Shiguang Shan, Xilin Chen, "Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation," Proceedings of the 16th European Conference on Computer Vision (ECCV), LNCS 12358, Vol.13, pp.222-239, Glasgow, UK / Cyberspace, August 23-28, 2020.
  • Wenbin Wang, Ruiping Wang, Shiguang Shan, Xilin Chen, “Exploring Context and Visual Pattern of Relationship for Scene Graph Generation,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8180–8189, Long Beach, California, USA, June 16-20, 2019.
  • Yirong Mao, Ruiping Wang, Shiguang Shan, Xilin Chen, "COSONet: Compact Second-Order Network for Video Face Recognition," Asian Conference on Computer Vision (ACCV 2018), Perth, Western Australia, Dec. 2-6, 2018.
  • Ruikui Wang, Ruiping Wang, Shishi Qiao, Shiguang Shan, Xilin Chen, “Deep Position-Aware Hashing for Semantic Continuous Image Retrieval,” IEEE Winter Conference on Applications of Computer Vision (WACV 2020), pp. 2493–2502, Aspen, CO, Mar. 2-5, 2020.
  • Huajie Jiang, Ruiping Wang, Shiguang Shan, Xilin Chen, “Transferable Contrastive Network for Generalized Zero-Shot Learning,” 17th IEEE International Conference on Computer Vision (ICCV 2019), pp. 9764-9773, Seoul, Korea, Oct. 27-Nov. 2, 2019.
  • Huajie Jiang, Ruiping Wang, Shiguang Shan, Xilin Chen, “Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition,” 15th European Conference on Computer Vision (ECCV 2018), Munich, Germany, Sep. 8-14, 2018.
  • Sijin Wang, Ruiping Wang, Ziwei Yao, Shiguang Shan, Xilin Chen, “Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval,” IEEE Winter Conference on Applications of Computer Vision (WACV 2020), pp. 1508–1517, Aspen, CO, Mar. 2-5, 2020.
  • Chen He, Ruiping Wang, Shiguang Shan, Xilin Chen, “Exemplar-Supported Generative Reproduction for Class Incremental Learning,” 29th British Machine Vision Conference (BMVC 2018), Newcastle upon Tyne, UK, Sep. 3-6, 2018.
  • Yong Liu, Ruiping Wang, Shiguang Shan, Xilin Chen, “Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR2018), pp. 6985-6994, Salt Lake City, UT, June 18-22, 2018.
  • Difei Gao, Ruiping Wang, Shiguang Shan, Xilin Chen, "Visual Textbook Network: Watch Carefully before Answering Visual Questions," British Machine Vision Conference (BMVC 2017), 2017.