Results 36 resources
-
The M-AILABS Speech Dataset is the first large dataset that we are providing free of charge, freely usable as training data for speech recognition and speech synthesis. Most of the data is based on LibriVox and Project Gutenberg. The training data consist of nearly a thousand hours of audio and the text files in a prepared format. A transcription is provided for each clip. Clips vary in length...
-
This speech corpus contains recordings for 104 monolingual native southern British English speakers aged between 8 and 85 years old while they engaged in a problem-solving picture-based ‘spot the difference’ task (Diapix) with a conversational partner in four listening conditions. In NORM (quiet, no masking), participants heard each other normally. In SPSN (speech-shaped noise), participants...
-
This collection contains the quantitative data resulting from the analysis of the elderLUCID audio corpus – a set of speech recordings collected for 83 adults aged 19 to 84 years inclusive. Recordings were made while participants carried out two types of collaborative tasks with a conversational partner who was a young adult of the same sex: (1) a ‘spot the difference’ picture task (‘diapix’)...
-
Fully annotated corpus of spontaneous speech dialogues for children. Diapix task recorded as stereo WAV files with one speaker per channel. 96 children aged 9 to 14 years, all non-bilingual native Southern British English speakers.
-
The Nijmegen Corpus of Casual Czech contains 30 hours of high-quality recordings featuring 60 Czech speakers conversing among friends. The speech has been orthographically transcribed.
-
The Nijmegen Corpus of Casual French contains 35 hours of high-quality recordings featuring 46 French speakers conversing among friends. The speech has been orthographically annotated by professional transcribers.
-
The Nijmegen Corpus of Casual Spanish contains around 30 hours of high-quality recordings featuring 52 Spanish speakers from Madrid conversing among friends. The speech has been orthographically annotated by professional transcribers.
-
The Nijmegen Corpus of Spanish English (NCSE) contains 38.5 hours of high-quality recordings of English speech produced by 34 native Spanish speakers in interaction with two native Dutch confederates. The NCSE contains a formal and an informal recording for each Spanish speaker. The speech has been orthographically transcribed.
-
Multi-speaker TTS data for Bangladesh Bengali (bn-BD) and Indian Bengali (bn-IN).
-
Multi-speaker TTS data for four South African languages: Afrikaans, Sesotho, Setswana and isiXhosa. This data set contains high-quality, multi-speaker transcribed audio data for four languages of South Africa. The data set consists of wave files and a TSV file transcribing the audio. In each folder, the file line_index.tsv contains a FileID, which in turn contains the UserID and the...
-
Multi-speaker TTS data for Javanese (jv-ID). This data set contains high-quality transcribed audio data for Javanese. The data set consists of wave files, and a TSV file. The file line_index.tsv contains a filename and the transcription of audio in the file. Each filename is prepended with a speaker identification number. The data set has been manually quality checked, but there might still...
-
Multi-speaker TTS data for Khmer (km-KH). This data set contains high-quality transcribed audio data for Khmer. The data set consists of wave files, and a TSV file. The file line_index.tsv contains a filename and the transcription of audio in the file. Each filename is prepended with a speaker identification number. The data set has been manually quality checked, but there might still be...
-
Multi-speaker TTS data for Nepali (ne-NP). This data set contains high-quality transcribed audio data for Nepali. The data set consists of wave files, and a TSV file. The file line_index.tsv contains a filename and the transcription of audio in the file. Each filename is prepended with a speaker identification number. The data set has been manually quality checked, but there might still be...
-
Multi-speaker TTS data for Sundanese (su-ID). This data set contains high-quality transcribed audio data for Sundanese. The data set consists of wave files, and a TSV file. The file line_index.tsv contains a filename and the transcription of audio in the file. Each filename is prepended with a speaker identification number. The data set has been manually quality checked, but there might...
-
Bengali ASR training data set containing ~196K utterances. This data set contains transcribed audio data for Bengali. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, anonymized UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors.
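As a rough illustration of the TSV layout described for the Bengali ASR data set, the following Python sketch reads a three-column file of FileID, UserID and transcription. The column order, the absence of a header row, and the use of tab separators without quoting are assumptions drawn from the description, not a confirmed specification. The names `Utterance` and `read_utterances` and the sample rows are hypothetical.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Utterance(NamedTuple):
    # Assumed column order: FileID, anonymized UserID, transcription.
    file_id: str
    user_id: str
    text: str


def read_utterances(handle) -> Iterator[Utterance]:
    """Yield one Utterance per well-formed three-column TSV row."""
    # QUOTE_NONE treats quote characters in transcriptions as plain text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed rows
        yield Utterance(*(field.strip() for field in row))


# Made-up sample rows standing in for a real utt_spk_text.tsv file.
sample = "f0001\tu01\thello world\nf0002\tu02\tsecond line\n\n"
utterances = list(read_utterances(io.StringIO(sample)))
```

For a real file, the same function could be called on a handle opened with `open(path, encoding="utf-8", newline="")`. The two-column line_index.tsv files described for the TTS data sets would need a correspondingly adjusted row check.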
Explore
Audio Data
- Accents (4)
- Child Speech (4)
- Conversation (9)
- Directed Speech (1)
- Emotional Speech (1)
- Language (7)
- African Languages (1)
- Bi-/Multilingual (1)
- English (3)
- French (1)
- Korean (1)
- Mandarin (1)
- Multiple (1)
- Spanish (1)
- Pathological (2)
- Speech in Noise (5)
Derived & Measured Data
Speech Production Data
- MRI (1)
- Vocal Anatomy (1)
- Vocal Tract (1)
Tags
- transcribed
- audio data (31)
- English (9)
- spontaneous speech (8)
- conversation (7)
- female (4)
- male (4)
- read speech (4)
- child speech (4)
- speech in noise (4)
- adult (4)
- French (3)
- British English (3)
- Mandarin (2)
- Spanish (2)
- speech-language pathology (2)
- multi-language (2)
- Sundanese (2)
- Nepali (2)
- Javanese (2)
- Bengali (2)
- older adult (2)
- American English (2)
- phonetic labels (2)
- open-source (1)
- speech recognition (1)
- environmental noise (1)
- noisy audio (1)
- reverberation (1)
- angry (1)
- emotional speech (1)
- happy (1)
- sad (1)
- surprise (1)
- bilingual (1)
- child-centered audio (1)
- mother-child interaction (1)
- Amyotrophic Lateral Sclerosis (ALS) (1)
- Down syndrome (1)
- Parkinson's disease (1)
- annotated (1)
- cerebral palsy (1)
- stroke (1)
- stutter (1)
- Lombard speech (1)
- clear speech (1)
- computer-directed speech (1)
- infant-directed speech (1)
- non-native-directed speech (1)
- formant measurement (1)
- phone duration (1)
- phone-level alignment (1)
- pitch (1)
- Chinese (1)
- Amharic (1)
- Swahili (1)
- Wolof (1)
- Korean (1)
- Sinhala (1)
- Khmer (1)
- Afrikaans (1)
- Sesotho (1)
- Setswana (1)
- isiXhosa (1)
- L2 English (1)
- Spanish accent (1)
- Czech (1)
- MRI (1)
- real-time MRI (rtMRI) (1)
- volumetric MRI (1)
- Ohio (1)
- Newcastle (1)
- interview (1)
- sociolinguistic (1)
- sociophonetic (1)
- African (1)
- Cameroon (1)
- Chad (1)
- Congo (1)
- Gabon (1)
- Niger (1)
Resource type
- Dataset (32)
- Journal Article (1)
- Software (1)
- Web Page (2)