Results 23 resources
- USC-TIMIT is a database of speech production data under ongoing development, which currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English, and electromagnetic articulography data from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460 sentence corpus. In...
- Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is however limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving...
- These transcripts and video files are samples of Spanish and English caregiver (almost always mother)-child interaction collected at child ages 2 ½, 3, and 3 ½ years as part of a 10-year longitudinal study of the language and literacy development of U.S.-born children raised in Spanish-speaking homes. Each recording is approximately 30 minutes in length. The caregiver and target child are...
- The MSP-Podcast corpus contains speech segments from podcast recordings which are perceptually annotated using crowdsourcing. The collection of this corpus is an ongoing process. Version 1.11 of the corpus has 151,654 speaking turns (237 hours and 56 mins). The proposed partition attempts to create speaker-independent datasets for Train, Development, Test1, Test2, and Test3 sets.
- This dataset contains 350 parallel utterances spoken by 10 native Mandarin speakers and 10 English speakers in 5 emotional states (neutral, happy, angry, sad, and surprise). The transcripts are provided.
- The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of British English, both spoken and written, from the late twentieth century. Access the data here: https://llds.ling-phil.ox.ac.uk/llds/xmlui/handle/20.500.14106/2554
- The Voices Obscured in Complex Environmental Settings (VOiCES) corpus is a Creative Commons speech dataset targeting acoustically challenging and reverberant environments with robust labels and truth data for transcription, denoising, and speaker identification. This is one of the largest corpora to date that has transcriptions and simultaneously recorded real-world noise. The details: -...
- This CSTR VCTK Corpus includes speech data uttered by 110 English speakers with various accents. Each speaker reads out about 400 sentences, which were selected from a newspaper, the rainbow passage and an elicitation paragraph used for the speech accent archive.