Results: 78 resources
- A dataset of ultrasound and audio recordings from children with speech sound disorders. The UXSSD dataset contains 8 speakers (2 female and 6 male), aged 5-10 years.
- A dataset of ultrasound and audio recordings from typically developing children. The UXTD dataset contains 58 speakers (31 female and 27 male), aged 5-12 years.
- The Arizona Child Acoustic Database is a longitudinal collection of audio samples from children between the ages of 2 and 7 years. The long-range goal of this project is to provide new insight into the physical mechanisms of vocal sound production during a critical period of growth and development. These data are being used to inform our efforts to build a model of speech production for child talkers.
- This site allows visitors to access recordings of speakers who stutter, along with background details about these speakers and the conditions in which the recordings were made. The recordings are available in various formats. The two main sets of recordings were made in normal speaking conditions and the final one was made when the sound of the speaker’s voice was altered as he or she spoke. The three...
- The current data package includes 1,090 hours of recorded speech (as .wav files) from about 1,130 participants, including those with ALS, cerebral palsy, Down syndrome, Parkinson’s disease and those who have had a stroke. The download also includes text of the original speech prompts and a transcript of the participants’ responses. A subset includes annotations describing the speech...
- The Buckeye Corpus of conversational speech contains high-quality recordings from 40 speakers in Columbus, OH, conversing freely with an interviewer. The speech has been orthographically transcribed and phonetically labeled. The audio and text files, together with time-aligned phonetic labels, are stored in a format for use with speech analysis software (Xwaves and Wavesurfer).
- The MSP-AVW is an audiovisual whisper corpus for audiovisual speech recognition. The MSP-AVW corpus contains data from 20 female and 20 male speakers. For each subject, three sessions are recorded, consisting of read sentences, isolated digits and spontaneous speech. The data is recorded under neutral and whisper conditions. The corpus was collected in a 13ft x 13ft ASHA certified...
- This 3-year project investigates language change in five urban dialects of Northern England: Derby, Newcastle, York, Leeds and Manchester. Data collection method: Linguistic analysis of speech data (conversational, word list) from samples of different northern English urban communities. Data collection consisted of interviews, which included (1) some structured questions about the interviewee...
- Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are universally used as visual feedback in interventions for articulation disorders or visual cues in speech recognition. Nevertheless, the availability of high-quality audio-ultrasound datasets remains scarce. The present study, therefore, aims to...
- The study of articulatory gestures has a wide spectrum of applications, notably in speech production and recognition. Sets of phonemes, as well as their articulation, are language-specific; however, existing MRI databases mostly include English speakers. In our present work, we introduce a dataset acquired with MRI from 10 healthy native French speakers. A corpus...
- This database contains two non-contemporaneous recordings of each of 68 female speakers of Standard Chinese (a.k.a. Mandarin and Putonghua). 60 of the speakers are from northeastern China, and 8 are from southern China. Each speaker was recorded in three speaking styles: casual telephone conversation (cnv), information exchange task over the telephone (fax), and pseudo-police-style interview (int).
- Multi-laboratory evaluation of forensic voice comparison systems under conditions reflecting those of a real forensic case. There is increasing pressure on forensic laboratories to validate the performance of forensic analysis systems before they are used to assess strength of evidence for presentation in court (including pressure from the recently released report by the President’s Council...
- Forensic database of voice recordings of 500+ Australian English speakers (AusEng 500+). This database contains 3899 recordings totalling 310 hours of speech from 555 Australian-English speakers. 324 female speakers: 91 recorded in one recording session; 69 recorded in two separate recording sessions; 159 recorded in three recording sessions; 5 recorded in more than three recording...
- The MSP-Conversation corpus contains interactions annotated with time-continuous emotional traces for arousal (calm to active), valence (negative to positive), and dominance (weak to strong). Time-continuous annotations offer the flexibility to explore emotional displays at different temporal resolutions while leveraging contextual information. Release 1.0 contains 74 conversations with...
- Twenty-five countries have Arabic as an official language, but the dialects spoken vary greatly, and even within one country different accents are heard. Many features create the impression of 'a different accent', including how particular sounds are pronounced, where stress falls in a word, and what intonation pattern is used. There is extensive prior research on the first two of these for...
Explore
Audio Data
- Accent/Region (2)
  - British English (2)
- Accents (8)
- Child Speech (10)
- Conversation (15)
- Electroglottography / Electrolaryngography (1)
- Emotional Speech (3)
- Forensic (5)
- Language (16)
  - African Languages (1)
  - Bi-/Multilingual (1)
  - English (9)
  - French (1)
  - German (1)
  - Korean (1)
  - L2+ (1)
  - Language Learning (1)
  - Mandarin (1)
  - Multiple (2)
  - Spanish (1)
- Pathological (8)
- Singing (2)
- Speech in Noise (5)
- Synthetic Speech (8)
Derived & Measured Data
- Vocal Tract (1)
Speech Production Data
- Articulography (3)
- Brain Imaging (1)
- EEG (1)
- MRI (9)
- Ultrasound (10)
- Video (3)
- Vocal Anatomy (10)
  - Larynx and Glottis (1)
  - Mandible and Maxilla (1)
  - Vocal Tract (8)
Teaching Resources
Tags
- audio data
- transcribed (31)
- adult (24)
- English (23)
- male (22)
- read speech (21)
- female (18)
- spontaneous speech (15)
- child speech (11)
- conversation (11)
- synthetic speech (8)
- speech-language pathology (8)
- real-time MRI (rtMRI) (7)
- ultrasound (7)
- interview (7)
- magnetic resonance imaging (MRI) (7)
- deepfake (6)
- individual variability (6)
- video (5)
- articulatory data (5)
- older adult (4)
- Mandarin (4)
- forensic (4)
- telephone (4)
- volumetric MRI (4)
- British (3)
- perceptually annotated (3)
- American English (3)
- electromagnetic articulography (EMA) (3)
- speech production (3)
- ultrasound tongue imaging (UTI) (3)
- vowels (3)
- annotated (3)
- speech sound disorder (3)
- L2 English (3)
- text-to-speech (TTS) (3)
- French (3)
- speech in noise (3)
- open-source (2)
- English accents (2)
- singing (2)
- angry (2)
- audiovisual (2)
- emotional speech (2)
- happy (2)
- sad (2)
- Spanish (2)
- pathological speech (2)
- multi-language (2)
- held vowel (2)
- voice conversion (VC) (2)
- Sudanese (2)
- Nepali (2)
- Javanese (2)
- Bengali (2)
- British English (2)
- map task (2)
- articulation (2)
- multimodal (2)
- vocal tract shape (2)
- lip video (2)
- teaching resource (2)
- sociophonetic (2)
- Australian (2)
- logical access (1)
- physical access (1)
- spoof (1)
- speech recognition (1)
- rainbow passage (1)
- labelled (1)
- non-speech (1)
- disgust (1)
- podcast (1)
- bilingual (1)
- child-centered audio (1)
- mother-child interaction (1)
- dysarthria (1)
- digits (1)
- whisper (1)
- Amyotrophic Lateral Sclerosis (ALS) (1)
- Down syndrome (1)
- Parkinson's disease (1)
- cerebral palsy (1)
- stroke (1)
- stutter (1)
- longitudinal (1)
- typically developing (1)
- cleft (1)
- L2 speech (1)
- language learning (1)
- electroglottography (EGG) (1)
- intraoral pressure (1)
- validation (1)
- tenor (1)
- vibrato (1)
- Scottish English (1)
- coarticulation (1)
- within-speaker variability (1)
- professional voice (1)
- silent speech (1)
- 3D head meshes (1)
- German (1)
- acoustic pharyngometry (1)
- electroencephalography (EEG) (1)
- external craniofacial anthropometry (1)
- rhinometry (1)
- syllable sequences (1)
- partial spoof (1)
- phone-level alignment (1)
- Chinese (1)
- Amharic (1)
- Swahili (1)
- Wolof (1)
- Korean (1)
- Sinhala (1)
- Khmer (1)
- Afrikaans (1)
- Sesotho (1)
- Setswana (1)
- isiXhosa (1)
- Spanish accent (1)
- Czech (1)
- Japanese (1)
- Southern standard British English (SSBE) (1)
- Bradford (1)
- Kirklees (1)
- Wakefield (1)
- West Yorkshire (1)
- consonants (1)
- dentition (1)
- mandible (1)
- maxilla (1)
- International Phonetic Alphabet (IPA) (1)
- Arabic (1)
- accent variability (1)
- dialect variability (1)
- Putonghua (1)
- MRI (1)
- Derby (1)
- Leeds (1)
- Manchester (1)
- Newcastle (1)
- York (1)
- Ohio (1)
- phonetic labels (1)
- DICOM (1)
- brain activity (1)
- functional magnetic resonance imaging (fMRI) (1)
- vocal imitation (1)
- sociolinguistic (1)
- World Englishes (1)
- dyadic (1)
- African (1)
- Cameroon (1)
- Chad (1)
- Congo (1)
- Gabon (1)
- Niger (1)
Resource type
- Dataset (70)
- Journal Article (3)
- Report (1)
- Web Page (4)