Results | YorVoice Catalogue

High quality TTS data for Javanese

Multi-speaker TTS data for Javanese (jv-ID). This data set contains high-quality transcribed audio data for Javanese. The data set consists of wave files, and a TSV file. The file line_index.tsv contains a filename and the transcription of audio in the file. Each filename is prepended with a speaker identification number. The data set has been manually quality checked, but there might still...

View on openslr.org
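
The line_index.tsv layout described for these TTS sets (a filename followed by its transcription, with the speaker identifier at the front of the filename) could be read roughly as in the sketch below. This is a minimal illustration, not the maintainers' tooling. It assumes tab-separated columns with no header row, and that the speaker prefix ends at the first underscore. The description does not state the separator, so `sep` is a hypothetical parameter, and the sample rows are invented for the demo.

```python
import csv
import io
from collections import defaultdict


def read_line_index(handle):
    """Yield (filename, transcription) pairs from a line_index.tsv-style stream.

    ASSUMPTION: tab-separated, no header row, filename in the first column
    and transcription in the second.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        yield row[0].strip(), row[1].strip()


def speaker_of(filename, sep="_"):
    """Return the leading part of the filename as the speaker identifier.

    ASSUMPTION: the speaker prefix is everything before the first `sep`.
    Adjust `sep` to match the real filenames in the data set.
    """
    return filename.split(sep, 1)[0]


def group_by_speaker(pairs, sep="_"):
    """Collect (filename, transcription) pairs under each speaker prefix."""
    groups = defaultdict(list)
    for filename, text in pairs:
        groups[speaker_of(filename, sep)].append((filename, text))
    return dict(groups)


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    sample = "spk01_0001\tfirst line\nspk02_0001\tsecond line\n"
    for speaker, items in group_by_speaker(read_line_index(io.StringIO(sample))).items():
        print(speaker, len(items))
```

For a real download, pass an open file (for example `open("line_index.tsv", encoding="utf-8", newline="")`) in place of the in-memory sample.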

High quality TTS data for Khmer

Multi-speaker TTS data for Khmer (km-KH). This data set contains high-quality transcribed audio data for Khmer. The data set consists of wave files, and a TSV file. The file line_index.tsv contains a filename and the transcription of audio in the file. Each filename is prepended with a speaker identification number. The data set has been manually quality checked, but there might still be...

View on openslr.org

High quality TTS data for Nepali

Multi-speaker TTS data for Nepali (ne-NP). This data set contains high-quality transcribed audio data for Nepali. The data set consists of wave files, and a TSV file. The file line_index.tsv contains a filename and the transcription of audio in the file. Each filename is prepended with a speaker identification number. The data set has been manually quality checked, but there might still be...

View on openslr.org

High quality TTS data for Sundanese

Multi-speaker TTS data for Sundanese (su-ID). This data set contains high-quality transcribed audio data for Sundanese. The data set consists of wave files, and a TSV file. The file line_index.tsv contains a filename and the transcription of audio in the file. Each filename is prepended with a speaker identification number. The data set has been manually quality checked, but there might...

View on openslr.org

Large Bengali ASR training data set

Bengali ASR training data set containing ~196K utterances. This data set contains transcribed audio data for Bengali. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, anonymized UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors.

View on openslr.org
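
The three-column utt_spk_text.tsv layout described for the ASR sets (FileID, UserID, transcription) could be loaded along the following lines. This is a sketch under stated assumptions: tab-separated columns in that order and no header row. The `Utterance` type and both function names are hypothetical, and the sample rows are invented for the demo.

```python
import csv
import io
from collections import Counter
from typing import Iterable, Iterator, NamedTuple, TextIO


class Utterance(NamedTuple):
    """One row of a utt_spk_text.tsv-style file (hypothetical type)."""
    file_id: str
    user_id: str
    text: str


def read_utt_spk_text(handle: TextIO) -> Iterator[Utterance]:
    """Yield one Utterance per well-formed row.

    ASSUMPTION: columns are FileID, UserID, transcription, separated by
    tabs, with no header row. Rows with any other column count are skipped.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue
        yield Utterance(*(field.strip() for field in row))


def utterances_per_user(utterances: Iterable[Utterance]) -> Counter:
    """Count how many utterances each (anonymized) UserID contributed."""
    return Counter(u.user_id for u in utterances)


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    sample = "f001\tu1\thello\nf002\tu1\tworld\nf003\tu2\tagain\n"
    counts = utterances_per_user(read_utt_spk_text(io.StringIO(sample)))
    for user_id, n in counts.most_common():
        print(user_id, n)
```

Counting per UserID is one simple way to check speaker balance before splitting the data for training and evaluation.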

Large Javanese ASR training data set

Javanese ASR training data set containing ~185K utterances. This data set contains transcribed audio data for Javanese. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors. This dataset was collected by Google in...

View on openslr.org

Large Nepali ASR training data set

Nepali ASR training data set containing ~157K utterances. This data set contains transcribed audio data for Nepali. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, anonymized UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors.

View on openslr.org

Large Sinhala ASR training data set

Sinhala ASR training data set containing ~185K utterances. This data set contains transcribed audio data for Sinhala. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, anonymized UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors.

View on openslr.org

Large Sundanese ASR training data set

Sundanese ASR training data set containing ~220K utterances. This data set contains transcribed audio data for Sundanese. The data set consists of wave files, and a TSV file. The file utt_spk_text.tsv contains a FileID, UserID and the transcription of audio in the file. The data set has been manually quality checked, but there might still be errors. This dataset was collected by Google in Indonesia.

View on openslr.org

Zeroth-Korean

Korean Open-source Speech Corpus for Speech Recognition by Zeroth Project. The data set contains transcribed audio data for Korean. There are 51.6 hours of transcribed Korean audio for training (22,263 utterances, 105 people, 3,000 sentences) and 1.2 hours of transcribed Korean audio for testing (457 utterances, 10 people). This corpus also contains pre-trained/designed language model,...

View on openslr.org

ALFFA (African Languages in the Field: speech Fundamentals and Automation)

This data set contains transcribed speech in Amharic, Swahili, and Wolof.

View on openslr.org

Aishell

Aishell is an open-source Chinese Mandarin speech corpus published by Beijing Shell Shell Technology Co., Ltd. 400 people from different accent areas in China were invited to take part in the recording, which was conducted in a quiet indoor environment using high-fidelity microphones and downsampled to 16 kHz. The manual transcription accuracy is above 95%, through professional speech annotation...

View on openslr.org

African Accented French

This corpus consists of approximately 22 hours of speech recordings. Transcripts are provided for all the recordings. The corpus can be divided into 3 parts: 1. Yaounde: collected by a team from the U.S. Military Academy's Center for Technology Enhanced Language Learning (CTELL) in 2003 in Yaoundé, Cameroon. It has recordings from 84 speakers, 48 male and 36...

View on openslr.org

PhonemeDF: A Synthetic Speech Dataset for Audio Deepfake Detection and Naturalness Evaluation

Vamshi Nallaguntla, Aishwarya Fursule, Shruti Kshirsagar + 1 other

PhonemeDF is a large-scale phoneme-level parallel dataset of real and synthetic speech (approximately 730 hours), designed for audio deepfake detection and speech naturalness evaluation. The dataset consists of real speech samples derived from a subset of the LibriSpeech corpus (train-clean-100) and corresponding synthetic speech generated using four Text-to-Speech (TTS) systems (MeloTTS,...

View on zenodo.org

PartialSpoof Database - Partially Spoofed Audio Dataset for Anti-spoofing

Lin Zhang, Xin Wang, Erica Cooper + 3 others

All existing databases of spoofed speech contain attack data that is spoofed in its entirety. In practice, it is entirely plausible that successful attacks can be mounted with utterances that are only partially spoofed. By definition, partially-spoofed utterances contain a mix of both spoofed and bona fide segments, which will likely degrade the performance of countermeasures trained with...

View on zenodo.org

Results 78 resources
