Results | YorVoice Catalogue

The Edinburgh International Accents of English Corpus

Ramon Sanabria, Nina Markl, Andrea Carmantini + 4 others

English is the most widely spoken language in the world, used daily by millions of people as a first or second language in many different contexts. As a result, there are many varieties of English. Despite the great many advances in English automatic speech recognition (ASR) over the past decades, results are usually reported based on test datasets which fail to represent the diversity of...

View on datashare.ed.ac.uk

The Tongue and Lips Corpus

M. S. Ribeiro, J. Sanger, J.-X. Zhang + 4 others

The Tongue and Lips (TaL) corpus is a multi-speaker corpus of ultrasound images of the tongue and video images of the lips. This corpus contains synchronised imaging data of extraoral (lips) and intraoral (tongue) articulators from 82 native speakers of English. The TaL corpus consists of two datasets: - TaL1...

View on ultrasuite.github.io

Re-Training Extension of the Benchmark for Automatic Glottis Segmentation (BAGLS-RT)

Michael Döllinger, Tobias Schraut, Lea Henrich + 9 others

BAGLS-RT is an extension of the BAGLS dataset (DOI 10.5281/zenodo.3762320) intended for (re-)training glottis segmentation models.

View on zenodo.org

Benchmark for Automatic Glottis Segmentation (BAGLS)

Pablo Gómez, Andreas M Kist, Patrick Schlegel + 10 others

BAGLS is a benchmark dataset intended to compare performance across automatic glottis segmentation methods.

View on zenodo.org

Audiovisual Whisper (AVW) Corpus

MSP-AVW is an audiovisual whisper corpus intended for audiovisual speech recognition. The corpus contains data from 20 female and 20 male speakers. For each subject, three sessions are recorded, consisting of read sentences, isolated digits and spontaneous speech. The data is recorded under neutral and whisper conditions. The corpus was collected in a 13ft x 13ft ASHA certified...

View on ecs.utdallas.edu

mngu0

Korin Richmond

This is a corpus of articulatory data of different forms (EMA, MRI, video, 3D scans of upper/lower jaw, audio etc.) acquired from one male British English speaker.

View on www.mngu0.org

Crowd Sourced Emotional Multimodal Actors Dataset (CREMA-D)

Houwei Cao, David G Cooper, Michael K Keutmann + 3 others

CREMA-D is a data set of 7,442 original clips from 91 actors. These clips were from 48 male and 43 female actors between the ages of 20 and 74 coming from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified). Actors spoke from a selection of 12 sentences. The sentences were presented using one of six different emotions (Anger, Disgust, Fear, Happy,...

View on github.com

Results 7 resources