Results: 11 resources
-
This collection contains behavioural and brain activation data from three laboratory studies of speech imitation. Each study involved behavioural and imaging (MRI) test sessions in which participants were familiarised with novel auditory speech targets and asked to imitate them as closely as possible. Across the three studies, there were variations in the type of sounds...
-
This dataset contains simultaneous recordings of electroglottography (EGG recorded with Glottal Enterprises EG2-PCX2), unfiltered audio, and intraoral pressure (recorded with Glottal Enterprises PG-60) from 14 subjects. It is meant to facilitate the validation of physical models of glottal control during voicing, in which the glottal/source waveform for speech is controlled by a combination of...
-
We introduce the Speak & Improve Corpus 2025, a dataset of L2 learner English data with holistic scores and language error annotation, collected from open (spontaneous) speaking tests on the Speak & Improve learning platform. The aim of the corpus release is to address a major challenge to developing L2 spoken language processing systems, the lack of publicly available data with high-quality...
-
MSP-AVW is an audiovisual whisper corpus for audiovisual speech recognition. The corpus contains data from 20 female and 20 male speakers. For each subject, three sessions were recorded, consisting of read sentences, isolated digits and spontaneous speech. The data were recorded under neutral and whisper conditions. The corpus was collected in a 13ft x 13ft ASHA certified...
-
This 3-year project investigates language change in five urban dialects of Northern England—Derby, Newcastle, York, Leeds and Manchester. Data collection method: Linguistic analysis of speech data (conversational, word list) from samples of different northern English urban communities. Data collection consisted of interviews, which included (1) some structured questions about the interviewee...
-
Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are commonly used as visual feedback in interventions for articulation disorders or as visual cues in speech recognition. Nevertheless, high-quality audio-ultrasound datasets remain scarce. The present study, therefore, aims to...
-
Twenty-five countries have Arabic as an official language, but the dialects spoken vary greatly, and even within one country different accents are heard. Many features create the impression of 'a different accent', including how particular sounds are pronounced, where stress falls in a word, and what intonation pattern is used. There is extensive prior research on the first two of these for...
-
Dynamic Dialects contains an articulatory video-based corpus of speech samples from world-wide accents of English. Videos in this corpus contain synchronised audio, ultrasound-tongue-imaging video and video of the moving lips. We are continuing to augment the database. The website contains three main resources: - A clickable Accent Map: clicking on points of the map will open up links to...
-
The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of British English, both spoken and written, from the late twentieth century. Access the data here: https://llds.ling-phil.ox.ac.uk/llds/xmlui/handle/20.500.14106/2554
-
The Voices Obscured in Complex Environmental Settings (VOiCES) corpus is a Creative Commons speech dataset targeting acoustically challenging and reverberant environments with robust labels and truth data for transcription, denoising, and speaker identification. This is one of the largest corpora to date that has transcriptions and simultaneously recorded real-world noise. The details: -...
-
The CSTR VCTK Corpus includes speech data uttered by 110 English speakers with various accents. Each speaker reads out about 400 sentences, which were selected from a newspaper, the Rainbow Passage and an elicitation paragraph used for the Speech Accent Archive.
Explore
Audio
Language
- Arabic (1)
- English (8)
- L2+ (1)
- Language Learning (2)
- Mandarin (1)
Accent/Region
(5)
- Arabic (1)
- British English (4)
- World Englishes (1)
- Conversation (1)
- Electroglottography / Electrolaryngography (1)
- Multi-Speaker (5)
- Multi-Style (1)
- Pathological (1)
- Speech in Noise (1)
Speech Production & Articulation
- Brain Imaging (1)
- MRI (1)
- Ultrasound (2)
- Video (1)
Teaching Resources
Vocal Anatomy
Tags
- read speech
- audio data (10)
- adult (7)
- English (6)
- female (5)
- male (5)
- spontaneous speech (4)
- English accents (2)
- British (2)
- ultrasound tongue imaging (UTI) (2)
- rainbow passage (1)
- environmental noise (1)
- noisy audio (1)
- reverberation (1)
- transcribed (1)
- accent map (1)
- lip video (1)
- teaching resource (1)
- Arabic (1)
- accent variability (1)
- dialect variability (1)
- older adult (1)
- sociophonetic (1)
- Derby (1)
- Leeds (1)
- Manchester (1)
- Newcastle (1)
- York (1)
- conversation (1)
- audiovisual (1)
- digits (1)
- video (1)
- whisper (1)
- L2 English (1)
- L2 speech (1)
- annotated (1)
- interview (1)
- language learning (1)
- electroglottography (EGG) (1)
- intraoral pressure (1)
- validation (1)
- Mandarin (1)
- dysarthria (1)
- pathological speech (1)
- speech-language pathology (1)
- vowels (1)
- MRI (1)
- brain activity (1)
- fMRI (1)
- rtMRI (1)
- vocal imitation (1)
Resource type
- Dataset (6)
- Journal Article (2)
- Report (1)
- Web Page (2)