Resource----Visual Information Processing and Learning (VIPL)

Shuqiang Jiang

Ph.D

Tel: 010-62600505

Email: sqjiang@ict.ac.cn

Address: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, No. 6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing 100190, China

Resource

1 、Dataset name : INSTRE

Dataset description and introduction : The dataset is split into two disjoint subsets, INSTRE-S and INSTRE-M. INSTRE-S contains 200 single-labelled classes and 23070 images and is designed for the single-object case, where only one specific object instance is in view. INSTRE-M is designed for the multiple-object case, where each image displays two different object instances. It contains 5473 images and 100 objects grouped into 50 pairs, each of which is treated as one double-labelled class. Note that INSTRE-S allows the same object instance to appear several times in one image, while each image in INSTRE-M displays exactly two different classes.

Release time : 2015.1.3

Download address : http://123.57.42.89/instre/home.html

Papers cited : Shuang Wang, Shuqiang Jiang: INSTRE: A New Benchmark for Instance-Level Object Retrieval and Recognition. TOMCCAP 11(3): 37:1-37:21 (2015)

Dataset contributors : Shuang Wang, Shuqiang Jiang

2 、Dataset name : Geolocation-food

Dataset description and introduction : The dataset covers six regions with a total of 117504 pictures: Beijing (187 restaurants, 1173 dishes, 45541 pictures), Shanghai (198 restaurants, 1253 dishes, 37590 pictures), Tianjin (78 restaurants, 435 dishes, 10811 pictures), Nanjing (64 restaurants, 328 dishes, 7895 pictures), Hangzhou (62 restaurants, 371 dishes, 9124 pictures) and Guangzhou (57 restaurants, 272 dishes, 6543 pictures).

Release time : 2014

Download address : http://123.57.42.89/Dataset_ict/Geolocation-food%28Dishes%29/Geolocation-food/

Papers cited : Ruihan Xu, Luis Herranz, Shuqiang Jiang, Shuang Wang, Xinhang Song, Ramesh Jain: Geolocalized Modeling for Dish Recognition. IEEE Trans. Multimedia 17(8): 1187-1199 (2015)

Dataset contributors : Ruihan Xu, Luis Herranz, Shuqiang Jiang
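
The per-region figures in the Geolocation-food description can be cross-checked against the stated total of 117504 pictures. The following is a minimal sketch: the counts are copied from the description, while the names `REGION_STATS` and `column_totals` are hypothetical and not part of the dataset release.

```python
# Per-region counts copied from the Geolocation-food description:
# region -> (restaurants, dishes, pictures).
REGION_STATS = {
    "Beijing": (187, 1173, 45541),
    "Shanghai": (198, 1253, 37590),
    "Tianjin": (78, 435, 10811),
    "Nanjing": (64, 328, 7895),
    "Hangzhou": (62, 371, 9124),
    "Guangzhou": (57, 272, 6543),
}


def column_totals(stats):
    """Sum each column (restaurants, dishes, pictures) across all regions."""
    return tuple(sum(column) for column in zip(*stats.values()))


restaurants, dishes, pictures = column_totals(REGION_STATS)
print(f"restaurants={restaurants} dishes={dishes} pictures={pictures}")
# prints "restaurants=646 dishes=3832 pictures=117504"
```

The picture total matches the figure given in the description. The dish total is only a sum of the per-region counts; the description does not say whether the same dish can appear in more than one region, so it should not be read as a count of unique dishes.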

3 、Dataset name : Yummly-28K

Dataset description and introduction : This dataset was crawled from the recipe-sharing website Yummly. It has 27,638 recipes in total. Each recipe contains one recipe image, the ingredients, the cuisine and the course information. There are 16 kinds of cuisines (e.g., “American”, “Italian” and “Mexican”) and 13 kinds of recipe courses (e.g., “Main Dishes”, “Desserts” and “Lunch and Snacks”).

Release time : 2017

Download address : http://123.57.42.89/Dataset_ict/Yummly-66K-28K/Yummly28K.zip

Papers cited : Weiqing Min, Shuqiang Jiang, Jitao Sang, Huayang Wang, Xinda Liu, and Luis Herranz. 2017. Being a Super Cook: Joint Food Attributes and Multi-Modal Content Modeling for Recipe Retrieval and Exploration. IEEE Transactions on Multimedia 19, 5 (2017), 1100 – 1113.

Dataset contributors : Weiqing Min, Shuqiang Jiang

4 、Dataset name : Yummly-66K

Dataset description and introduction : This dataset, named Yummly_66K, consists of 66,615 recipe items from Yummly. Each recipe item includes the recipe name, preprocessed ingredient lines, recipe image, and cuisine and course attribute information, among other fields. In total, the dataset covers 10 kinds of cuisines, 14 kinds of courses and 2,416 ingredients.

Release time : 2018

Download address : https://github.com/minweiqing/You-Are-What-You-Eat-Exploring-Rich-Recipe-Information-for-Cross-Region-Food-Analysis

Papers cited : Weiqing Min, Bing-Kun Bao, Shuhuan Mei, Yaohui Zhu, Yong Rui, Shuqiang Jiang. You Are What You Eat: Exploring Multi-modal and Multi-attribute Information from Recipes for Cross-Region Food Analysis. IEEE Trans. on Multimedia 20(4):950-964 (2018)

Dataset contributors : Weiqing Min, Bing-Kun Bao, Shuhuan Mei, Yaohui Zhu, Yong Rui, Shuqiang Jiang

5 、Dataset name : HOD (16 categories)

Dataset description and introduction : The HOD data were collected using a Kinect (about 1.5 meters from the ground) and comprise 12800 video frames in total. For each frame, we capture an RGB image, a depth map and human skeleton data (obtained using the Kinect API). The dataset consists of 16 commonly used handheld object categories, each containing four instances (64 object instances in total). Two different users collected data in two different scenes, so each instance was recorded four times, once for each combination of user and scene. For each combination, we capture 50 frames (640 x 480 pixels, 30 fps, sub-sampling ratio of 1/30), so each instance has 200 frames and each category has 800 frames. During data collection, the gesture and distance of the handheld object vary, so the dataset covers multiple views of each object.

Release time : 2015

Download address : http://123.57.42.89/hod/home.html

Papers cited : Xiong Lv, Shuqiang Jiang, Luis Herranz, Shuang Wang: RGB-D Hand-Held Object Recognition Based on Heterogeneous Feature Fusion. J. Comput. Sci. Technol. 30(2): 340-352 (2015)

Dataset contributors : Xiong Lv, Shuang Wang, Shuqiang Jiang

6 、Dataset name : HOD (20 categories)

Dataset description and introduction : The HOD data were collected using a Kinect (about 1.5 meters from the ground) and comprise 16000 video frames in total. For each frame, we capture an RGB image, a depth map and human skeleton data (obtained using the Kinect API). The dataset consists of 20 commonly used handheld object categories, each containing four instances (80 object instances in total). Two different users collected data in two different scenes, so each instance was recorded four times, once for each combination of user and scene. For each combination, we capture 50 frames (640 x 480 pixels, 30 fps, sub-sampling ratio of 1/30), so each instance has 200 frames and each category has 800 frames. During data collection, the gesture and distance of the handheld object vary, so the dataset covers multiple views of each object.

Release time : 2017

Download address : http://123.57.42.89/hod/home.html

Papers cited : Xiong Lv, Shuqiang Jiang, Luis Herranz, Shuang Wang: RGB-D Hand-Held Object Recognition Based on Heterogeneous Feature Fusion. J. Comput. Sci. Technol. 30(2): 340-352 (2015)

Dataset contributors : Xiong Lv, Shuang Wang, Shuqiang Jiang
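
The frame counts quoted for the two HOD releases (entries 5 and 6) follow from the collection protocol: four instances per category, four user/scene combinations per instance, and 50 frames per combination. The sketch below reproduces that arithmetic; the function name `hod_frame_counts` and its parameter names are hypothetical and used only for illustration.

```python
def hod_frame_counts(categories,
                     instances_per_category=4,
                     combinations_per_instance=4,
                     frames_per_combination=50):
    """Derive HOD frame counts from the protocol stated in the descriptions.

    Defaults mirror the text: 4 instances per category, 4 user/scene
    combinations (2 users x 2 scenes), 50 frames per combination.
    """
    frames_per_instance = combinations_per_instance * frames_per_combination
    frames_per_category = instances_per_category * frames_per_instance
    return {
        "instances": categories * instances_per_category,
        "frames_per_instance": frames_per_instance,
        "frames_per_category": frames_per_category,
        "total_frames": categories * frames_per_category,
    }


for n_categories in (16, 20):
    print(n_categories, hod_frame_counts(n_categories))
```

For 16 categories this gives 64 instances and 12800 frames, and for 20 categories 80 instances and 16000 frames, in agreement with the totals stated in the two descriptions. In both cases each instance has 200 frames and each category has 800 frames.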

7 、Dataset name : ISIA Food-200

Dataset description and introduction : ISIA Food-200 consists of 197,323 food items. Each item includes the food name, food images and main ingredients. In total, there are 200 kinds of food dishes and 398 ingredients.

Release time : 2019

Download address : http://123.57.42.89/Dataset_ict/WIKI Food/ISIA Food200_v2/

Papers cited : Weiqing Min, Linhu Liu, Zhengdong Luo and Shuqiang Jiang. "Ingredient-Guided Cascaded Multi-Attention Network for Food Recognition.", ACM Multimedia, 1331-1339 (2019)

Dataset contributors : Weiqing Min, Zhengdong Luo and Shuqiang Jiang

8 、Dataset name : ISIA Food-500

Dataset description and introduction : ISIA Food-500 consists of 399,726 food items. Each item includes the food name and food images. In total, there are 500 kinds of food dishes.

Release time : 2020

Download address : http://123.57.42.89/Dataset_ict/ISIA_Food500_Dir/

Papers cited : Weiqing Min, Linhu Liu, Zhiling Wang, Zhengdong Luo, Xiaoming Wei, Xiaolin Wei, Shuqiang Jiang, ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network. ACM Multimedia (CCF-A), (2020 Oral)

Dataset contributors : Weiqing Min, Zhengdong Luo, Shuqiang Jiang