Visual Information Processing and Learning Group, Institute of Computing Technology, Chinese Academy of Sciences



Shared Resources

LRW-1000: Lip Reading database


1. Overview

LRW-1000 is a naturally distributed, large-scale benchmark for word-level lipreading in the wild. It contains 1000 classes with 718,018 video samples from more than 2000 individual speakers, and more than 1,000,000 Chinese character instances in total. Each class corresponds to the syllables of a Mandarin word composed of one or several Chinese characters. The dataset aims to cover the natural variability of speech modes and imaging conditions, so that it reflects the challenges encountered in practical applications. It varies widely in several respects, including the number of samples per class, video resolution, lighting conditions, and speaker attributes such as pose, age, gender, and make-up, as shown in Fig. 1 and Fig. 2.


Fig. 1. The diversity of the speakers' appearance in LRW-1000


 

Fig. 2. Lip samples in LRW-1000


2. Statistics

  •  >1,000,000 Chinese character instances
  •  718,018 samples, an average of 718 samples per class
  •  1000 classes, each corresponding to the syllables of a Mandarin word
  •  >2000 different speakers, with broad coverage of speech modes and speaker attributes, including speech rate, viewpoint, age, gender, and make-up

3. Evaluation Protocols

We provide two evaluation metrics for experiments. (A) Recognition accuracy over all 1000 classes is the base metric, since this is a classification task. (B) Because the data vary widely in many respects, such as the number of samples per class, we also provide the Kappa coefficient as a second evaluation metric.
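The two metrics can be sketched as follows. This is a minimal illustration, not the official evaluation script: it assumes that the Kappa coefficient refers to Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e is the agreement expected by chance from the class frequencies. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa (assumed definition): accuracy corrected for chance."""
    n = len(y_true)
    p_o = top1_accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Only one class appears in both lists; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Toy imbalanced example: a model that always predicts the majority class.
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "a", "a"]
print(top1_accuracy(y_true, y_pred))  # 0.75
print(cohen_kappa(y_true, y_pred))    # 0.0
```

In the toy example, always predicting the majority class reaches 75% accuracy but a kappa of 0, which shows why a chance-corrected metric is useful when class sizes are unbalanced.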

4. Download 

The LRW-1000 database is available to universities and research institutes for research purposes only. To request a copy of the database, please proceed as follows:

  • Download the database Release Agreement [pdf], read it carefully, and complete it appropriately. Note that the agreement must be signed by a full-time staff member (a student's signature is not acceptable). Then scan the signed agreement and send it to lipreading@vipl.ict.ac.cn. Once we receive it, we will send you the download link.
  • Before using the LRW-1000 dataset, please refer to the following paper:
    Shuang Yang, Yuanhang Zhang, Dalu Feng, Mingmin Yang, Chenhao Wang, Jingyun Xiao, Keyu Long, Shiguang Shan, Xilin Chen, "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild," IEEE FG 2019. [pdf | bibtex | code]

5. Contact Info
    lipreading@vipl.ict.ac.cn

 

 


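Visual Information Processing and Learning Group
  • Address: No. 6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing
  • Postal code: 100190
  • Phone: 010-62600514
  • Email: yi.cheng@vipl.ict.ac.cn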

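Copyright @ Visual Information Processing and Learning Group, Institute of Computing Technology, Chinese Academy of Sciences. ICP filing: 京ICP备05002829号. Public security filing: 京公网安备1101080060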