Visual Information Processing and Learning Group, Institute of Computing Technology, Chinese Academy of Sciences


Resource Sharing

LRW-1000: Lip Reading database

 

(a) Face Samples in LRW-1000

 

(b) Lip Samples in LRW-1000

1. Overview

LRW-1000 is a naturally distributed, large-scale benchmark for word-level lipreading in the wild. It contains 1000 classes with 745,187 video samples from more than 2000 individual speakers. Each class corresponds to the syllables of a Mandarin word composed of one or several Chinese characters. The dataset aims to cover the natural variability of speech modes and imaging conditions, so that it reflects the challenges encountered in practical applications. It varies widely in the number of samples per class, video resolution, lighting conditions, and speaker attributes such as pose, age, gender, and make-up.

2. Statistics

  • 1000 classes, each corresponding to the syllables of a Mandarin word
  • 745,187 samples, an average of 745 samples per class
  • ~2000 different speakers, with broad coverage of speech modes and speaker attributes, including speech rate, viewpoint, age, gender, and make-up
  • >1.14M Chinese character instances
  • About 60 hours of samples, extracted from 500 hours of raw video
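
The figures above imply a few derived quantities, which the following minimal sketch computes. It assumes the "about 60 hours" and ">1.14M" values can be treated as point estimates, and that every character instance belongs to one of the 745,187 samples; the variable names are illustrative only.

```python
# Figures quoted in the Statistics list; two of them are approximate.
NUM_CLASSES = 1000
NUM_SAMPLES = 745_187
TOTAL_HOURS = 60                 # stated as "about 60 hours"
MIN_CHAR_INSTANCES = 1_140_000   # stated as ">1.14M", so a lower bound

# Average number of samples per class.
samples_per_class = NUM_SAMPLES / NUM_CLASSES

# Rough average duration of one sample, in seconds.
seconds_per_sample = TOTAL_HOURS * 3600 / NUM_SAMPLES

# Lower bound on the average number of characters per sample.
chars_per_sample = MIN_CHAR_INSTANCES / NUM_SAMPLES

print(f"samples per class:  {samples_per_class:.0f}")
print(f"seconds per sample: {seconds_per_sample:.2f}")
print(f"chars per sample:   {chars_per_sample:.2f}")
```

Under these assumptions, a sample lasts roughly 0.3 seconds on average and contains about 1.5 characters. The per-class average hides the uneven class sizes described in the overview.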

3. Evaluation Protocols

We provide two evaluation metrics. Because this is a classification task, recognition accuracy over all 1000 classes is the base metric. Because the data is highly diverse in many respects, such as the number of samples in each class, we also provide the Kappa coefficient as a second metric.
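
The two metrics can be sketched as follows. This page does not say which Kappa formulation is used, so the sketch assumes Cohen's kappa between ground-truth and predicted labels; the function names `accuracy` and `cohen_kappa` are illustrative and not part of any released LRW-1000 toolkit.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa (assumed formulation): agreement corrected for chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that two independent labelers with
    # these marginal class frequencies pick the same class.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # undefined when only one class ever occurs
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example with imbalanced classes: always predicting the majority class.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0]
acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

In this toy example the majority-class predictor reaches an accuracy of 2/3 but a kappa of 0, which shows why a chance-corrected metric is informative when class sizes are uneven.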

4. Reference

Shuang Yang, Yuanhang Zhang, Dalu Feng, Mingmin Yang, Chenhao Wang, Jingyun Xiao, Keyu Long, Shiguang Shan, Xilin Chen. LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild. arXiv, 2018. (https://arxiv.org/pdf/1810.06990.pdf)

5. Contact Info
Dalu Feng (dalu.feng@vipl.ict.ac.cn), Institute of Computing Technology, Chinese Academy of Sciences
Shuang Yang (shuang.yang@ict.ac.cn), Institute of Computing Technology, Chinese Academy of Sciences

 

 


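Visual Information Processing and Learning Group
  • Address: No. 6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing
  • Postal code: 100190
  • Phone: 010-62600514
  • Email: yi.cheng@vipl.ict.ac.cn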
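Copyright @ Visual Information Processing and Learning Group, Institute of Computing Technology, Chinese Academy of Sciences. 京ICP备05002829号 京公网安备1101080060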