A List of Free Chinese Speech Datasets

As of January 11, 2019

  1. AISHELL-1
  2. AISHELL-2 (free on application for universities and research institutions)
  3. THCHS30
  4. ST-CMDS
  5. Primewords Chinese Corpus Set 1

Download: all of these are available on OpenSLR except AISHELL-2, which can be requested or purchased from AISHELL (Beijing Shell Shell Technology).

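Dataset: AISHELL-1
Duration (hours): 178
Description: The recording scripts of AISHELL-ASR0009 cover 11 domains, including smart home, autonomous driving, and industrial production. Recording took place in a quiet indoor environment using three devices simultaneously: a high-fidelity microphone (44.1 kHz, 16-bit), an Android phone (16 kHz, 16-bit), and an iOS phone (16 kHz, 16-bit). The high-fidelity microphone audio was downsampled to 16 kHz to produce AISHELL-ASR0009-OS1. 400 speakers from different accent regions of China took part in the recording. After transcription and annotation by professional proofreaders and strict quality inspection, the text accuracy of the corpus is above 95%. The data is split into training, development, and test sets. (Available for academic research; commercial use without permission is prohibited.)

Dataset: AISHELL-2
Duration (hours): 1000
Description: AISHELL-2, the AISHELL Mandarin Chinese speech corpus, contains 1000 hours of speech, of which 718 hours come from AISHELL-ASR0009-[ZH-CN] and 282 hours from AISHELL-ASR0010-[ZH-CN]. The recording scripts cover 12 domains, including wake-up words, voice commands, smart home, autonomous driving, and industrial production. Recording took place in a quiet indoor environment using three devices simultaneously: a high-fidelity microphone (44.1 kHz, 16-bit), an Android phone (16 kHz, 16-bit), and an iOS phone (16 kHz, 16-bit). AISHELL-2 uses the speech recorded on iOS phones. 1991 speakers from different accent regions of China took part in the recording. After transcription and annotation by professional proofreaders and strict quality inspection, the text accuracy of the corpus is above 96%. (Available for academic research; commercial use without permission is prohibited.)

Dataset: AISHELL-EVAL (AISHELL2-2018A-EVAL)
Description:
  Test data: 5000 utterances from 10 speakers
  Dev data: 2500 utterances from 5 speakers
  Sampling rate: 16 kHz
  Sample format: 16-bit
  Environment: indoor
  Speech data type: PCM
  Number of channels: 1
  Recording equipment: iOS / Android / high-fidelity microphone

Dataset: THCHS30
Duration (hours): 30
Description: THCHS30 is an open Chinese speech database published by the Center for Speech and Language Technology (CSLT) at Tsinghua University.

Dataset: ST-CMDS
Duration (hours): 500
Description: A free Mandarin Chinese corpus from Surfingtech (www.surfing.ai) containing 102600 utterances from 855 speakers, 120 utterances per speaker. The corpus was recorded in a quiet indoor environment using cellphones. All utterances were carefully transcribed and checked by humans, and transcription accuracy is guaranteed.

Dataset: Primewords Chinese Corpus Set 1
Duration (hours): 100
Description: A free Mandarin Chinese speech corpus released by Shanghai Primewords Information Technology Co., Ltd. The corpus was recorded on smartphones by 296 native Chinese speakers. The transcription accuracy is above 98% at a 95% confidence level. It is free for academic use. The mapping between transcripts and utterances is given in JSON format.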
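The audio specifications quoted above for AISHELL-EVAL (16 kHz, 16-bit, single-channel PCM) can be checked on a downloaded file with Python's standard `wave` module. The following is only a minimal sketch and not part of the original post: the helper names `matches_spec` and `write_silence` and the file name `sample.wav` are hypothetical, and it assumes the audio is stored as uncompressed PCM WAV.

```python
import wave


def matches_spec(path, rate=16000, sample_bits=16, channels=1):
    """Return True if the WAV file at `path` has the expected format.

    The defaults follow the AISHELL-EVAL figures quoted above.
    `matches_spec` is an illustrative helper, not part of any dataset toolkit.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == rate
            and wav.getsampwidth() * 8 == sample_bits  # getsampwidth() is in bytes
            and wav.getnchannels() == channels
        )


def write_silence(path, rate=16000, sample_bits=16, channels=1, seconds=1):
    """Write a silent PCM WAV file, used here only to exercise the check."""
    bytes_per_sample = sample_bits // 8
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(bytes_per_sample)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (rate * seconds * channels * bytes_per_sample))
```

For example, `matches_spec("sample.wav")` would return `False` for a 44.1 kHz high-fidelity microphone recording that has not yet been downsampled to 16 kHz.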

Reposted from blog.csdn.net/qq_40767896/article/details/86291664