Results: 7 resources
-
English is the most widely spoken language in the world, used daily by millions of people as a first or second language in many different contexts. As a result, there are many varieties of English. Despite the many advances in English automatic speech recognition (ASR) over the past decades, results are usually reported based on test datasets which fail to represent the diversity of...
-
A multi-speaker corpus of ultrasound images of the tongue and video images of the lips. The Tongue and Lips (TaL) corpus is a multi-speaker corpus of ultrasound images of the tongue and video images of lips. This corpus contains synchronised imaging data of extraoral (lips) and intraoral (tongue) articulators from 82 native speakers of English. The TaL corpus consists of two datasets: - TaL1...
-
BAGLS-RT is an extension of the BAGLS dataset (DOI 10.5281/zenodo.3762320) intended for (re-)training glottis segmentation models.
-
BAGLS is a benchmark dataset intended to compare performance across automatic glottis segmentation methods.
-
The MSP-AVW is an audiovisual whisper corpus for audiovisual speech recognition purposes. The MSP-AVW corpus contains data from 20 female and 20 male speakers. For each subject, three sessions are recorded consisting of read sentences, isolated digits and spontaneous speech. The data is recorded under neutral and whisper conditions. The corpus was collected in a 13ft x 13ft ASHA certified...
-
This is a corpus of articulatory data of different forms (EMA, MRI, video, 3D scans of upper/lower jaw, audio etc.) acquired from one male British English speaker.
-
CREMA-D is a data set of 7,442 original clips from 91 actors. These clips were from 48 male and 43 female actors between the ages of 20 and 74 coming from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified). Actors spoke from a selection of 12 sentences. The sentences were presented using one of six different emotions (Anger, Disgust, Fear, Happy,...