A Survey on Context-Aware Mobile Visual Recognition

Weiqing Min, Shuqiang Jiang, Shuhui Wang, Ruihan Xu, Yushan Cao, Luis Herranz, Zhiqiang He


Human beings have developed a diverse food culture. Many factors, such as ingredients, visual appearance, courses (e.g., breakfast and lunch), flavor, and geographical region, affect our food perception and choices. In this work, we focus on multi-dimensional food analysis based on these food factors to benefit various applications such as summarization and recommendation. To this end, we propose a recipe analysis framework that incorporates various types of continuous and discrete attribute features as well as multi-modal information from recipes. First, we develop a Multi-Attribute Theme Modeling (MATM) method, which can incorporate arbitrary types of attribute features and jointly model them with the textual content. We then utilize a multi-modal embedding method to build the correlation between the textual theme features learned by MATM and visual features extracted by a deep neural network. By learning attribute-theme relations and multi-modal correlations, we are able to support different applications, including (1) flavor analysis and comparison, for better understanding flavor patterns along different dimensions such as region and course, (2) region-oriented multi-dimensional food summarization with both multi-modal and multi-attribute information, and (3) multi-attribute-oriented recipe recommendation. Furthermore, the proposed framework is flexible and enables easy incorporation of arbitrary types of attributes and modalities. Qualitative and quantitative evaluation results validate the effectiveness of the proposed method and framework on the collected Yummly dataset.
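The multi-modal embedding step described above can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes synthetic data and uses a simple ridge-regularized linear mapping from textual theme features (such as those a topic model like MATM would produce) into the visual feature space, so that recipes can be matched across modalities by nearest-neighbor search. All array sizes and the `retrieve` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_recipes, n_themes, n_visual = 200, 20, 64

# Hypothetical data: per-recipe theme distributions (textual modality)
# and visual features as a noisy linear function of the themes.
T = rng.random((n_recipes, n_themes))
T /= T.sum(axis=1, keepdims=True)          # normalize rows to theme distributions
M = rng.standard_normal((n_themes, n_visual))
V = T @ M + 0.01 * rng.standard_normal((n_recipes, n_visual))

# Ridge-regularized least squares: W = argmin ||T W - V||^2 + lam ||W||^2,
# solved in closed form via the normal equations.
lam = 1e-3
W = np.linalg.solve(T.T @ T + lam * np.eye(n_themes), T.T @ V)

def retrieve(theme_vec, k=5):
    """Project a theme vector into visual space and rank images
    by cosine similarity (cross-modal retrieval)."""
    q = theme_vec @ W
    sims = (V @ q) / (np.linalg.norm(V, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)[:k]

top = retrieve(T[0])
print(top)
```

In this toy setting, querying with recipe 0's theme vector should rank its own image first, since the learned mapping recovers the underlying linear relation up to the small noise term. A full system would replace the linear map with the paper's multi-modal embedding and real CNN features.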

  • Weiqing Min, Shuqiang Jiang, Shuhui Wang, Ruihan Xu, Yushan Cao, Luis Herranz, Zhiqiang He. A survey on context-aware mobile visual recognition. Multimedia Systems 23(6): 647-665 (2017)