Results | YorVoice Catalogue

Ultrax Typically Developing Children

A dataset of ultrasound and audio recordings from typically developing children. The UXTD dataset contains 58 speakers (31 female and 27 male), aged 5-12 years.

View on ultrasuite.github.io

Diachronic Electronic Corpus of Tyneside English (DECTE)

Karen P. Corrigan, Isabelle Buchstaller, Adam Mearns + 1 others

DECTE is an amalgamation of the existing Newcastle Electronic Corpus of Tyneside English (NECTE), created between 2001 and 2005, and NECTE2, a collection of interviews conducted in the Tyneside area since 2007. It thereby constitutes a rare example of a publicly available on-line corpus presenting dialect material spanning five decades.

View on research.ncl.ac.uk

University College London’s Archive of Stuttered Speech (UCLASS)

This site allows visitors to access recordings of speakers who stutter and background details about these speakers and the conditions in which the recordings were made. The recordings are available in various formats. The main two sets of recordings were made in normal speaking conditions and the final one was made when the sound of the speaker’s voice was altered as he or she spoke. The three...

View on www.uclass.psychol.ucl.ac.uk

Audiovisual Whisper (AVW) Corpus

The MSP-AVW is an audiovisual whisper corpus for audiovisual speech recognition purpose. The MSP-AVW corpus contains data from 20 female and 20 male speakers. For each subject, three sessions are recorded consisting of read sentences, isolated digits and spontaneous speech. The data is recorded under neutral and whisper conditions. The corpus was collected in a 13ft x 13ft ASHA certified...

View on ecs.utdallas.edu

A comparative study of language change in Northern Englishes

William Haddican, Paul Foulkes

This 3-year project investigates language change in five urban dialects of Northern England—Derby, Newcastle, York, Leeds and Manchester. Data collection method: Linguistic analysis of speech data (conversational, word list) from samples of different northern English urban communities. Data collection consisted of interviews, which included (1) some structured questions about the interviewee...

View on reshare.ukdataservice.ac.uk

An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech

Yudong Yang, Rongfeng Su, Shaofeng Zhao + 4 others

Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are universally used as visual feedback in interventions for articulation disorders or visual cues in speech recognition. Nevertheless, the availability of high-quality audio-ultrasound datasets remains scarce. The present study, therefore, aims to...

View on www.nature.com

Real-time speech MRI datasets with corresponding articulator ground-truth segmentations

Matthieu Ruthven, Agnieszka M. Peplinski, David M. Adams + 2 others

Abstract The use of real-time magnetic resonance imaging (rt-MRI) of speech is increasing in clinical practice and speech science research. Analysis of such images often requires segmentation of articulators and the vocal tract, and the community is turning to deep-learning-based methods to perform this segmentation. While there are publicly available rt-MRI datasets of speech,...

View on www.nature.com

Multimodal dataset of real-time 2D and static 3D MRI of healthy French speakers

Karyna Isaieva, Yves Laprie, Justine Leclère + 3 others

Abstract The study of articulatory gestures has a wide spectrum of applications, notably in speech production and recognition. Sets of phonemes, as well as their articulation, are language-specific; however, existing MRI databases mostly include English speakers. In our present work, we introduce a dataset acquired with MRI from 10 healthy native French speakers. A corpus...

View on www.nature.com

Forensic Voice Comparison Databases: forensic_eval_01

Geoffrey Stewart Morrison, Ewald Enzinger

Multi-laboratory evaluation of forensic voice comparison systems under conditions reflecting those of a real forensic case. There is increasing pressure on forensic laboratories to validate the performance of forensic analysis systems before they are used to assess strength of evidence for presentation in court (including pressure from the recently released report by the President’s Council...

View on forensic-voice-comparison.net

Forensic Voice Comparison Databases: AusEng 500+

G. S. Morrison, C. Zhang, E. Enzinger + 8 others

Forensic database of voice recordings of 500+ Australian English speakers (AusEng 500+). This database contains 3899 recordings totalling 310 hours of speech from 555 Australian-English speakers. 324 female speakers: - 91 recorded in one recording session - 69 recorded in two separate recording sessions - 159 recorded in three recording sessions - 5 recorded in more than three recording...

View on forensic-voice-comparison.net

mngu0

Korin Richmond

This is a corpus of articulatory data of different forms (EMA, MRI, video, 3D scans of upper/lower jaw, audio etc.) acquired from one male British English speaker.

View on www.mngu0.org

USC Speech and Vocal Tract Morphology MRI Database

Tanner Sorensen, Zisis Skordilis, Asterios Toutios + 9 others

The USC Speech and Vocal Tract Morphology MRI Database consists of real-time magnetic resonance images of dynamic vocal tract shaping during read and spontaneous speech with concurrently recorded denoised audio, and 3D volumetric MRI of vocal tract shapes during vowels and continuant consonants sustained for 7 seconds, from 17 speakers.

View on sail.usc.edu

USC-EMO-MRI: An emotional speech production database

Jangwon Kim, Asterios Toutios, Yoon-Chul Kim + 3 others

USC-EMO-MRI is an emotional speech production database which includes real-time magnetic resonance imaging data with synchronized speech audio from five male and five female actors, each producing a passage and a set of sentences in multiple repetitions, while enacting four different target emotions (neutral, happy, angry, sad). The database includes emotion quality evaluation from at least...

View on sail.usc.edu

USC-TIMIT

Shrikanth Narayanan, Asterios Toutios, Vikram Ramanarayanan + 12 others

USC-TIMIT is a database of speech production data under ongoing development, which currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English, and electromagnetic articulography data from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460 sentence corpus. In...

View on sail.usc.edu

A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images

Yongwan Lim, Asterios Toutios, Yannick Bliesener + 16 others

Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is however limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving...

View on figshare.com

Your search

Results 33 resources

Explore

Audio Data

Derived & Measured Data

Software, Processing & Utilities

Speech Production Data

Tags

Resource type