Results | YorVoice Catalogue

Large Javanese ASR training data set

Javanese ASR training data set containing ~185K utterances. This data set contains transcribed audio data for Javanese. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors. This dataset was collected by Google in...

View on openslr.org

Large Nepali ASR training data set

Nepali ASR training data set containing ~157K utterances. This data set contains transcribed audio data for Nepali. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, anonymized UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors.

View on openslr.org

Large Sinhala ASR training data set

Sinhala ASR training data set containing ~185K utterances. This data set contains transcribed audio data for Sinhala. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, anonymized UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors.

View on openslr.org

Large Sundanese ASR training data set

Sundanese ASR training data set containing ~220K utterances. This data set contains transcribed audio data for Sundanese. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors. This dataset was collected by Google in Indonesia.

View on openslr.org
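
The four OpenSLR entries above all describe the same index file, utt_spk_text.tsv, holding a FileID, a UserID and a transcription. A minimal sketch of reading such a file follows. It assumes tab-separated fields in that order, no header row, and no quoting; none of this is confirmed by the descriptions above. The `Utterance` type, the `read_utterances` helper and the sample rows are hypothetical and only for illustration.

```python
import csv
import io
from collections import Counter
from typing import Iterator, NamedTuple, TextIO


class Utterance(NamedTuple):
    file_id: str  # FileID column (assumed to be first)
    user_id: str  # UserID column (assumed to be second)
    text: str     # transcription column (assumed to be third)


def read_utterances(handle: TextIO) -> Iterator[Utterance]:
    """Yield one Utterance per well-formed three-column row."""
    # QUOTE_NONE keeps quote characters in transcriptions as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            # Skip rows that do not match the assumed layout.
            continue
        yield Utterance(*(field.strip() for field in row))


# Made-up sample rows standing in for a real utt_spk_text.tsv.
sample = io.StringIO(
    "a1\tu1\thello world\n"
    "a2\tu1\tsecond line\n"
    "a3\tu2\tthird\n"
)
utterances = list(read_utterances(sample))

# Count utterances per speaker, e.g. to plan a speaker-disjoint split.
per_speaker = Counter(u.user_id for u in utterances)
print(len(utterances), dict(per_speaker))
```

For a downloaded data set, the same function could be given a file opened with `open("utt_spk_text.tsv", encoding="utf-8", newline="")`; the UTF-8 encoding is also an assumption.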

Zeroth-Korean

Korean Open-source Speech Corpus for Speech Recognition by Zeroth Project. The data set contains transcribed audio data for Korean. There are 51.6 hours of transcribed Korean audio for training (22,263 utterances, 105 people, 3000 sentences) and 1.2 hours of transcribed Korean audio for testing (457 utterances, 10 people). This corpus also contains pre-trained/designed language model,...

View on openslr.org

ALFFA (African Languages in the Field: speech Fundamentals and Automation)

Transcribed speech data in Amharic, Swahili, and Wolof.

View on openslr.org

Aishell

Aishell is an open-source Chinese Mandarin speech corpus published by Beijing Shell Shell Technology Co., Ltd. 400 people from different accent areas in China were invited to participate in the recording, which was conducted in a quiet indoor environment using a high-fidelity microphone and downsampled to 16 kHz. The manual transcription accuracy is above 95%, through professional speech annotation...

View on openslr.org

African Accented French

This corpus consists of approximately 22 hours of speech recordings. Transcripts are provided for all the recordings. The corpus can be divided into 3 parts: (1) Yaounde: collected by a team from the U.S. Military Academy's Center for Technology Enhanced Language Learning (CTELL) in 2003 in Yaoundé, Cameroon. It has recordings from 84 speakers, 48 male and 36...

View on openslr.org

The Sociolinguistic Archive and Analysis Project (SLAAP)

Tyler Kendall

The Sociolinguistic Archive and Analysis Project, at North Carolina State University, is an interactive web-based archive of sociolinguistic recordings, with integrated media playing and annotation features, as well as phonetic analysis and corpus analysis tools designed for enabling and improving empirical linguistic inquiry. The archive continues to grow over time. It currently contains (as...

View on slaap.chass.ncsu.edu

The UCLA Phonetics Lab Archive

Peter Ladefoged

For over half a century, the UCLA Phonetics Laboratory has collected recordings of hundreds of languages from around the world, providing source materials for phonetic and phonological research, of value to scholars, speakers of the languages, and language learners alike. The materials on this site comprise audio recordings illustrating phonetic structures from over 200 languages with phonetic...

View on archive.phonetics.ucla.edu

VoxAngeles

E. Chodroff, B. Pažon, A. Baker + 1 other

VoxAngeles is a corpus of audited phonetic transcriptions and phone-level alignments of the UCLA Phonetics Lab Archive (Ladefoged et al., 2009, http://archive.phonetics.ucla.edu/), along with phonetic measurements including word and phone durations, vowel f0 and vowel formants. The audited portion of the corpus currently contains data from 95 languages across 21 language families. Unaudited...

View on github.com

Acted clear speech corpus

Catherine Mayo

Single male native British English talker recorded producing 25 TIMIT sentences in 5 conditions, two natural: (i) quiet, (ii) while the talker listened to high-intensity speech-shaped noise, and three acted: (i) as if to a non-native listener, (ii) as if to a computer speech-recognition system, (iii) as if to an infant. Accompanied by automatic and hand-corrected phone-level transcription.

View on datashare.ed.ac.uk

Diachronic Electronic Corpus of Tyneside English (DECTE)

Karen P. Corrigan, Isabelle Buchstaller, Adam Mearns + 1 other

DECTE is an amalgamation of the existing Newcastle Electronic Corpus of Tyneside English (NECTE), created between 2001 and 2005, and NECTE2, a collection of interviews conducted in the Tyneside area since 2007. It thereby constitutes a rare example of a publicly available on-line corpus presenting dialect material spanning five decades.

View on research.ncl.ac.uk

University College London’s Archive of Stuttered Speech (UCLASS)

This site allows visitors to access recordings of speakers who stutter and background details about these speakers and the conditions in which the recordings were made. The recordings are available in various formats. The main two sets of recordings were made in normal speaking conditions and the final one was made when the sound of the speaker’s voice was altered as he or she spoke. The three...

View on www.uclass.psychol.ucl.ac.uk

Speech Accessibility Project

The current data package includes 1,090 hours of recorded speech (as .wav files) from about 1,130 participants, including those with ALS, cerebral palsy, Down syndrome, Parkinson’s disease and those who have had a stroke. The download also includes text of the original speech prompts and a transcript of the participants’ responses. A subset includes annotations describing the speech...

View on speechaccessibilityproject.beckman.illinois.edu

Results: 36 resources
