Results 54 resources
- English is the most widely spoken language in the world, used daily by millions of people as a first or second language in many different contexts. As a result, there are many varieties of English. Despite the many advances in English automatic speech recognition (ASR) over the past decades, results are usually reported based on test datasets which fail to represent the diversity of...
- The Sociolinguistic Archive and Analysis Project, at North Carolina State University, is an interactive web-based archive of sociolinguistic recordings, with integrated media playing and annotation features, as well as phonetic analysis and corpus analysis tools designed for enabling and improving empirical linguistic inquiry. The archive continues to grow over time. It currently contains (as...
- This dataset contains the synthetic stimuli used in the study published in the paper "A Comparative Study of 3D and 1D Acoustic Simulations of the Higher Frequencies of Speech". The goal of this study was to evaluate the accuracy of the acoustic wave propagation in the vocal tract in a source-filter synthesis paradigm with two perceptual experiments. The high frequencies (above 4 kHz) of the...
- The Tongue and Lips (TaL) corpus is a multi-speaker corpus of ultrasound images of the tongue and video images of lips. This corpus contains synchronised imaging data of extraoral (lips) and intraoral (tongue) articulators from 82 native speakers of English. The TaL corpus consists of two datasets: - TaL1...
- This collection contains behavioural and brain activation data from 3 laboratory studies of speech imitation. Each of the three studies involved behavioural and imaging (MRI) test sessions in which participants were familiarised with novel auditory speech targets, and were asked to imitate them as closely as possible. Across the three studies, there were variations in the type of sounds...
- This database includes 208 clinically verified voice samples, from 150 pathological voices and 58 healthy voices. The database also includes information such as gender, age, pathology, lifestyle habits (e.g. smoking, alcohol and coffee consumption), occupational status, and the results of two specific medical questionnaires: the Voice Handicap Index (VHI) and Reflux Symptom Index...
- This dataset contains Stereo-Lithographic (STL) surface models of a human vocal tract, derived Finite-Element-Models, numerical results, and scripts for analyzing these results and (re-)running the computation. In the main folder, this dataset contains: 1) Python files (*fig*.py) for the creation of figures and tables (*tab*.py) 2) Python files (*.py) for analyzing Finite-Element (FE)...
- This database was created through generous funding from The Voice Foundation's Advancing Scientific Voice Research Grant and contains voice samples which have been rated by experienced voice professionals (at least 3 different raters with a minimum of 3 years’ clinical experience) in order to provide educators with standardized materials to better train pre-service clinical voice...
- For over half a century, the UCLA Phonetics Laboratory has collected recordings of hundreds of languages from around the world, providing source materials for phonetic and phonological research, of value to scholars, speakers of the languages, and language learners alike. The materials on this site comprise audio recordings illustrating phonetic structures from over 200 languages with phonetic...
- VoxAngeles is a corpus of audited phonetic transcriptions and phone-level alignments of the UCLA Phonetics Lab Archive (Ladefoged et al., 2009, http://archive.phonetics.ucla.edu/), along with phonetic measurements including word and phone durations, vowel f0 and vowel formants. The audited portion of the corpus currently contains data from 95 languages across 21 language families. Unaudited...
- Coarticulation, one of the central issues in experimental phonetic research, refers to the articulatory overlap of neighbouring sounds, resulting in acoustic and perceptual modifications of these sounds. Studies of the development of coarticulatory patterns in children have produced conflicting results concerning adult-child differences. This research compares coarticulatory properties of...
- A single male native British English talker recorded producing 25 TIMIT sentences in 5 conditions. Two are natural: (i) in quiet, (ii) while the talker listened to high-intensity speech-shaped noise. Three are acted: (i) as if to a non-native listener, (ii) as if to a computer speech-recognition system, (iii) as if to an infant. Accompanied by automatic and hand-corrected phone-level transcription.
- BAGLS-RT is an extension of the BAGLS dataset (DOI 10.5281/zenodo.3762320) intended for (re-)training glottis segmentation models.
- BAGLS is a benchmark dataset intended to compare performance across automatic glottis segmentation methods.
- SVQTD (Singing Voice Quality and Technique Database) is a classical tenor singing dataset collected from YouTube. It is mainly used to support supervised machine learning for paralinguistic singing attribute recognition tasks. SVQTD contains nearly 4000 vocal solo segments, each 4 to 20 seconds long, totaling 10.7 hours. These segments are partitioned from 400 audio recordings of 6 famous...
Explore
Audio
- Accent/Region (11)
  - American English (2)
  - Arabic (1)
  - Australian English (2)
  - British English (4)
  - World Englishes (3)
- Child Speech (8)
- Conversation (9)
- Directed Speech (1)
- Electroglottography / Electrolaryngography (1)
- Emotional Speech (3)
- Forensic (5)
- Language (17)
  - Arabic (1)
  - Bi-/Multilingual (1)
  - English (12)
  - Language Learning (1)
  - Mandarin (1)
  - Multiple (2)
  - Spanish (1)
- Multi-Speaker (11)
- Multi-Style (2)
- Pathological (8)
- Singing (1)
- Speech in Noise (2)
- Synthetic Speech (2)
Benchmarks & Validation
- Glottis (2)
Derived & Measured Data
- Formant Measurements (2)
- Phone-Level Alignments (1)
- Vocal Tract (3)
- Voice Quality Measures (1)
Software, Processing & Utilities
Speech Production & Articulation
- Articulography (2)
- Brain Imaging (1)
- MRI (8)
- Ultrasound (9)
- Video (3)
- X-Ray (1)
Teaching Resources
Vocal Anatomy
- Hyoid (1)
- Larynx and Glottis (3)
- Mandible (1)
- Vocal Tract (8)
Tags
- audio data (38)
- adult (24)
- male (22)
- female (19)
- read speech (17)
- English (16)
- spontaneous speech (9)
- child speech (9)
- transcribed (9)
- MRI (8)
- speech-language pathology (8)
- video (7)
- ultrasound (7)
- conversation (6)
- interview (5)
- articulatory data (5)
- real-time MRI (rtMRI) (5)
- volumetric MRI (4)
- vocal tract shape (4)
- vowels (4)
- forensic (3)
- telephone (3)
- emotional speech (3)
- American English (3)
- speech production (3)
- Newcastle (3)
- DICOM (3)
- annotated (3)
- speech sound disorder (3)
- whisper (2)
- synthetic speech (2)
- angry (2)
- audiovisual (2)
- happy (2)
- older adult (2)
- sad (2)
- articulation (2)
- multimodal (2)
- electromagnetic articulography (EMA) (2)
- perceptually annotated (2)
- British (2)
- lip video (2)
- teaching resource (2)
- ultrasound tongue imaging (UTI) (2)
- sociophonetic (2)
- Australian (2)
- English accents (2)
- phonetic labels (2)
- British English (2)
- STL files (2)
- finite element method (FEM) (2)
- child (2)
- computed tomography (CT) (2)
- benchmark (2)
- glottis (2)
- segmentation (2)
- videoendoscopy (2)
- formant measurement (2)
- multi-language (2)
- pathological speech (2)
- numerical acoustic modelling (2)
- Southern standard British English (SSBE) (1)
- map task (1)
- anechoic (1)
- fast speech (1)
- high pitch (1)
- loud speech (1)
- low pitch (1)
- shout (1)
- slow speech (1)
- deepfake (1)
- logical access (1)
- physical access (1)
- spoof (1)
- disgust (1)
- Spanish (1)
- bilingual (1)
- child-centered audio (1)
- mother-child interaction (1)
- consonants (1)
- jaw scans (1)
- accent map (1)
- International Phonetic Alphabet (IPA) (1)
- Arabic (1)
- accent variability (1)
- dialect variability (1)
- arousal (1)
- dominance (1)
- valence (1)
- Mandarin (1)
- Putonghua (1)
- Derby (1)
- Leeds (1)
- Manchester (1)
- York (1)
- digits (1)
- Ohio (1)
- Non-native speech (1)
- adaptation (1)
- diapix (1)
- Middlesbrough (1)
- Sunderland (1)
- longitudinal (1)
- typically developing (1)
- x-ray (1)
- x-ray microbeam (1)
- electroglottography (EGG) (1)
- intraoral pressure (1)
- validation (1)
- mandible (1)
- hyoid (1)
- back placement (1)
- chest resonance (1)
- classical (1)
- front placement (1)
- head resonance (1)
- open throat (1)
- roughness (1)
- singing (1)
- tenor (1)
- vibrato (1)
- open-source (1)
- Amyotrophic Lateral Sclerosis (ALS) (1)
- Down syndrome (1)
- Parkinson's disease (1)
- cerebral palsy (1)
- stroke (1)
- stutter (1)
- cleft (1)
- Lombard speech (1)
- clear speech (1)
- computer-directed speech (1)
- infant-directed speech (1)
- non-native-directed speech (1)
- speech in noise (1)
- Scottish English (1)
- coarticulation (1)
- within-speaker variability (1)
- phone duration (1)
- phone-level alignment (1)
- pitch (1)
- CAPE-V (1)
- GRBAS (1)
- clinical (1)
- voice quality (1)
- Python (1)
- vocal tract transfer function (1)
- held vowel (1)
- brain activity (1)
- fMRI (1)
- rtMRI (1)
- vocal imitation (1)
- professional voice (1)
- silent speech (1)
- sociolinguistic (1)
- L2 English (1)
- World Englishes (1)
- dyadic (1)