Visual Information Processing and Learning


ICT-TV Dataset

1. Overview

The ICT-TV dataset is designed for studying the video face retrieval problem. It contains two large-scale video collections parsed from the complete first seasons of two popular American TV series: 17 episodes of The Big Bang Theory (BBT) and 22 episodes of Prison Break (PB). The two series differ considerably in filming style. BBT is a sitcom with 5 main characters; its episodes run about 20 minutes each, and most scenes are shot indoors. In contrast, PB episodes run about 42 minutes each and include many outdoor shots, which results in a wide range of illumination conditions. The two collections contain 4,667 and 9,435 parsed video clips, respectively.

2. Data processing

Each video clip in this dataset is a sequence of continuous video frames extracted from one particular episode using several techniques: shot boundary detection, face detection, tracking, and facial landmark localization. The collected clips are organized by the character and episode they belong to and are stored as images of size 150 × 150. Some examples of video frames are given below:



3. Data partition

For each video collection (BBT or PB), we suggest randomly selecting 10 video clips per character for training and leaving the rest as test data (i.e., the database for retrieval). The query set of each collection consists of 10 video clips per main character, randomly selected from the test data. Specifically, the main characters of each collection are:

BBT: Howard Wolowitz, Leonard Hofstadter, Penny, Raj Koothrappali and Sheldon Cooper.

PB: Benjamin Miles 'C-Note' Franklin, Fernando Sucre, John Abruzzi, LJ Burrows, Lincoln Burrows, Michael Scofield, Sara Tancredi and Theodore 'T-Bag' Bagwell.
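
The suggested partition can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' released protocol: the function name `partition_clips`, the `(clip_id, character)` input format, the fixed random seed, and the choice to keep query clips inside the test database are all hypothetical. The dataset description does not say how to handle characters with fewer than 10 clips; in this sketch all of their clips go to training.

```python
import random
from collections import defaultdict


def partition_clips(clips, main_characters, n_train=10, n_query=10, seed=0):
    """Split clips into train / test (retrieval database) / query sets.

    clips: iterable of (clip_id, character) pairs (hypothetical format).
    main_characters: set of character names that contribute query clips.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group clip ids by character.
    by_character = defaultdict(list)
    for clip_id, character in clips:
        by_character[character].append(clip_id)

    train, test, query = [], [], []
    for character in sorted(by_character):
        ids = sorted(by_character[character])
        rng.shuffle(ids)
        # First n_train shuffled clips are for training, the rest for testing.
        train_ids, test_ids = ids[:n_train], ids[n_train:]
        train += [(c, character) for c in train_ids]
        test += [(c, character) for c in test_ids]
        # Queries are drawn from the test data of main characters only.
        # Assumption: query clips also remain in the test database.
        if character in main_characters:
            n = min(n_query, len(test_ids))
            query += [(c, character) for c in rng.sample(test_ids, n)]
    return train, test, query


# Toy example: two characters with 30 synthetic clip ids each.
toy = [(f"{name}_{i:03d}", name)
       for name in ("Penny", "Sheldon Cooper") for i in range(30)]
train, test, query = partition_clips(
    toy, main_characters={"Penny", "Sheldon Cooper"})
print(len(train), len(test), len(query))  # 20 40 20
```

Fixing the seed makes a random split repeatable across runs, which helps when comparing retrieval methods on the same partition.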

4. Contact

Ruiping Wang, Institute of Computing Technology, Chinese Academy of Sciences

Shishi Qiao, Institute of Computing Technology, Chinese Academy of Sciences

5. Download

The ICT-TV dataset is released to universities and research institutes for research purposes only. To request a copy of the ICT-TV dataset, please do the following:

  • Send an email to Dr. Wang. Once we receive your email, we will provide the download link to you.

  • If you use the ICT-TV dataset, please cite the following paper:
    Yan Li, Ruiping Wang, Shiguang Shan, Xilin Chen. Hierarchical Hybrid Statistic based Video Binary Code and Its Application to Face Retrieval in TV-Series. Automatic Face and Gesture Recognition (FG), pp. 1-8, May 2015.


