Your search
Results 10 resources
-
English is the most widely spoken language in the world, used daily by millions of people as a first or second language in many different contexts. As a result, there are many varieties of English. Although the great many advances in English automatic speech recognition (ASR) over the past decades, results are usually reported based on test datasets which fail to represent the diversity of...
-
The Sociolinguistic Archive and Analysis Project, at North Carolina State University, is an interactive web-based archive of sociolinguistic recordings, with integrated media playing and annotation features, as well as phonetic analysis and corpus analysis tools designed for enabling and improving empirical linguistic inquiry. The archive continues to grow over time. It currently contains (as...
-
A multi-speaker corpus of ultrasound images of the tongue and video images of the lips The Tongue and Lips (TaL) corpus is a multi-speaker corpus of ultrasound images of the tongue and video images of lips. This corpus contains synchronised imaging data of extraoral (lips) and intraoral (tongue) articulators from 82 native speakers of English. The TaL corpus consists of two datasets: - TaL1...
-
We introduce the Speak & Improve Corpus 2025, a dataset of L2 learner English data with holistic scores and language error annotation, collected from open (spontaneous) speaking tests on the Speak & Improve learning platform. The aim of the corpus release is to address a major challenge to developing L2 spoken language processing systems, the lack of publicly available data with high-quality...
-
The current data package includes 1,090 hours of recorded speech (as .wav files) from about 1,130 participants, including those with ALS, cerebral palsy, Down syndrome, Parkinson’s disease and those who have had a stroke. The download also includes text of the original speech prompts and a transcript of the participants’ responses. A subset includes annotations describing the speech...
-
The MSP-AVW is an audiovisual whisper corpus for audiovisual speech recognition purpose. The MSP-AVW corpus contains data from 20 female and 20 male speakers. For each subject, three sessions are recorded consisting of read sentences, isolated digits and spontaneous speech. The data is recorded under neutral and whisper conditions. The corpus was collected in a 13ft x 13ft ASHA certified...
-
Forensic database of voice recordings of 500+ Australian English speakers (AusEng 500+). This database contains 3899 recordings totalling 310 hours of speech from 555 Australian-English speakers. 324 female speakers: - 91 recorded in one recording session - 69 recorded in two separate recording sessions - 159 recorded in three recording sessions - 5 recorded in more than three recording...
-
Twenty five countries have Arabic as an official language, but the dialects spoken vary greatly, and even within one country different accents are heard. Many features create the impression of 'a different accent', including how particular sounds are pronounced, where stress falls in a word, and what intonation pattern is used. There is extensive prior research on the first two of these for...
-
Dynamic Dialects contains an articulatory video-based corpus of speech samples from world-wide accents of English. Videos in this corpus contain synchronised audio, ultrasound-tongue-imaging video and video of the moving lips. We are continuing to augment the database. The website contains three main resources: - A clickable Accent Map: clicking on points of the map will open up links to...
-
Expressive Anechoic Recordings of Speech (EARS). Highlights: - 100 h of speech data from 107 speakers - high-quality recordings at 48 kHz in an anechoic chamber - high speaker diversity with speakers from different ethnicities and age range from 18 to 75 years - full dynamic range of human speech, ranging from whispering to yelling - 18 minutes of freeform monologues per speaker - sentence...
Explore
Audio
-
Accent/Region
(5)
- American English (1)
- Arabic (1)
- Australian English (1)
- British English (1)
- World Englishes (2)
- Conversation (1)
- Emotional Speech (1)
- Forensic (1)
-
Language
(5)
- Arabic (1)
- English (4)
- L2+ (1)
- Language Learning (1)
- Multi-Speaker (4)
- Multi-Style (2)
- Pathological (1)
Speech Production & Articulation
- Ultrasound (2)
- Video (2)
Teaching Resources
Tags
- spontaneous speech
- audio data (9)
- read speech (8)
- English (4)
- adult (3)
- interview (3)
- video (3)
- whisper (2)
- sociophonetic (2)
- conversation (2)
- female (2)
- male (2)
- L2 English (2)
- annotated (2)
- transcribed (2)
- anechoic (1)
- emotional speech (1)
- fast speech (1)
- high pitch (1)
- loud speech (1)
- low pitch (1)
- shout (1)
- slow speech (1)
- accent map (1)
- lip video (1)
- teaching resource (1)
- ultrasound tongue imaging (UTI) (1)
- Arabic (1)
- accent variability (1)
- dialect variability (1)
- older adult (1)
- Australian (1)
- forensic (1)
- audiovisual (1)
- digits (1)
- L2 speech (1)
- language learning (1)
- Amyotrophic Lateral Sclerosis (ALS) (1)
- Down syndrome (1)
- Parkinson's disease (1)
- cerebral palsy (1)
- speech-language pathology (1)
- stroke (1)
- professional voice (1)
- silent speech (1)
- ultrasound (1)
- American English (1)
- sociolinguistic (1)
- World Englishes (1)
- dyadic (1)