Your search
Results 15 resources
-
This speech corpus contains recordings of 104 monolingual native southern British English speakers aged between 8 and 85 years, made while they engaged in a problem-solving picture-based ‘spot the difference’ task (Diapix) with a conversational partner in four listening conditions. In NORM (quiet, no masking), participants heard each other normally. In SPSN (speech-shaped noise), participants...
-
This collection contains the quantitative data resulting from the analysis of the elderLUCID audio corpus – a set of speech recordings collected for 83 adults aged 19 to 84 years inclusive. Recordings were made while participants carried out two types of collaborative tasks with a conversational partner who was a young adult of the same sex: (1) a ‘spot the difference’ picture task (‘diapix’)...
-
Fully annotated corpus of spontaneous speech dialogues for children. The Diapix task was recorded as stereo WAV files with one speaker per channel. The speakers are 96 children aged between 9 and 14 years, all non-bilingual native speakers of Southern British English.
-
The Nijmegen Corpus of Casual Czech contains 30 hours of high-quality recordings featuring 60 Czech speakers conversing among friends. The speech has been orthographically transcribed.
-
The Nijmegen Corpus of Casual French contains 35 hours of high-quality recordings featuring 46 French speakers conversing among friends. The speech has been orthographically annotated by professional transcribers.
-
The Nijmegen Corpus of Casual Spanish contains around 30 hours of high-quality recordings featuring 52 Spanish speakers from Madrid conversing among friends. The speech has been orthographically annotated by professional transcribers.
-
English is the most widely spoken language in the world, used daily by millions of people as a first or second language in many different contexts. As a result, there are many varieties of English. Despite the many advances in English automatic speech recognition (ASR) over the past decades, results are usually reported on test datasets which fail to represent the diversity of...
-
The Buckeye Corpus of conversational speech contains high-quality recordings from 40 speakers in Columbus OH conversing freely with an interviewer. The speech has been orthographically transcribed and phonetically labeled. The audio and text files, together with time-aligned phonetic labels, are stored in a format for use with speech analysis software (Xwaves and Wavesurfer).
-
This 3-year project investigates language change in five urban dialects of Northern England: Derby, Newcastle, York, Leeds and Manchester. Data collection method: Linguistic analysis of speech data (conversational, word list) from samples of different northern English urban communities. Data collection consisted of interviews, which included (1) some structured questions about the interviewee...
-
This database contains two non-contemporaneous recordings of each of 68 female speakers of Standard Chinese (a.k.a. Mandarin and Putonghua). 60 of the speakers are from north eastern China, and 8 are from southern China. Each speaker was recorded in three speaking styles: casual telephone conversation (cnv), information exchange task over the telephone (fax), and pseudo-police-style interview (int).
-
Forensic database of voice recordings of 500+ Australian English speakers (AusEng 500+). This database contains 3899 recordings totalling 310 hours of speech from 555 Australian-English speakers. 324 female speakers: 91 recorded in one recording session; 69 recorded in two separate recording sessions; 159 recorded in three recording sessions; 5 recorded in more than three recording...
-
The MSP-Conversation corpus contains interactions annotated with time-continuous emotional traces for arousal (calm to active), valence (negative to positive), and dominance (weak to strong). Time-continuous annotations offer the flexibility to explore emotional displays at different temporal resolutions while leveraging contextual information. Release 1.0 contains 74 conversations with...
-
These transcripts and video files are samples of Spanish and English caregiver (almost always mother)-child interaction collected at child ages 2 ½, 3, and 3 ½ years as part of a 10-year longitudinal study of the language and literacy development of U.S.-born children raised in Spanish-speaking homes. Each recording is approximately 30 minutes in length. The caregiver and target child are...
-
The West Yorkshire Regional English Database (WYRED) consists of approximately 200 hours of high-quality audio recordings of 180 West Yorkshire (British English) speakers. All participants are male, aged 18 to 30, and are divided evenly (60 per region) across three boroughs within West Yorkshire (Northern England): Bradford, Kirklees, and Wakefield. Speakers participated in four...