Full catalogue (113 resources)
- There have been considerable research efforts in the area of vocal tract modeling, but there is still only a small body of information regarding direct 3-D measurements of the vocal tract shape. The purpose of this study was to acquire, using magnetic resonance imaging (MRI), an inventory of speaker-specific, three-dimensional vocal tract air space shapes that correspond to a particular set of...
- BAGLS-RT is an extension of the BAGLS dataset (DOI 10.5281/zenodo.3762320) intended for (re-)training glottis segmentation models.
- BAGLS is a benchmark dataset intended to compare performance across automatic glottis segmentation methods.
- VOICEBOX is a speech processing toolbox consisting of MATLAB routines maintained by, and mostly written by, Mike Brookes, Department of Electrical & Electronic Engineering, Imperial College, Exhibition Road, London SW7 2BT, UK. The routines are available as a GitHub repository (or as a zip archive, which is often slightly out of date) and are made available under the terms of the GNU Public...
- SVQTD (Singing Voice Quality and Technique Database) is a classical tenor singing dataset collected from YouTube, mainly used to support supervised machine learning for paralinguistic singing attribute recognition tasks. SVQTD contains nearly 4000 vocal solo segments, each 4–20 seconds long, totaling 10.7 hours. These segments are partitioned from 400 audio recordings of 6 famous...
- Relationships between a listener's identification of a spoken vowel and its properties as revealed by acoustic measurement of its sound wave have been a subject of study by many investigators. Both the utterance and the identification of a vowel depend upon the language and dialectal backgrounds and the vocal and auditory characteristics of the individuals concerned. The purpose of this...
- Purpose: The anatomic origin of prepubertal vowel acoustic differences between male and female subjects remains unknown. The purpose of this study is to examine developmental sex differences in vocal tract (VT) length and its oral and pharyngeal portions. Method: Nine VT variables were measured from 605 imaging studies...
- The purpose of this study was to determine the developmental trajectory of the four corner vowels' fundamental frequency (fo) and the first four formant frequencies (F1–F4), and to assess when speaker-sex differences emerge. Five words per vowel, two of which were produced twice, were analyzed for fo and estimates of the first four formant frequencies from 190 (97 female, 93 male) typically...
- This paper presents a large-scale study of subglottal resonances (SGRs) (the resonant frequencies of the tracheo-bronchial tree) and their relations to various acoustical and physiological characteristics of speakers. The paper presents data from a corpus of simultaneous microphone and accelerometer recordings of consonant-vowel-consonant (CVC) words embedded in a carrier phrase spoken by 25...
- Experimental determinations of the acoustic properties of the subglottal airway, from the trachea below the larynx to the lungs, may provide useful information for detecting airway pathologies and aid in the understanding of vocal fold auto-oscillation. Here, minimally invasive, high-precision impedance measurements are made through the lips (7 men, 3 women) over the range 14–4200 Hz during...
- The frequencies, magnitudes, and bandwidths of vocal tract resonances are all important in understanding and synthesizing speech. High-precision acoustic impedance spectra of the vocal tracts of 10 subjects were measured from 10 Hz to 4.2 kHz by injecting a broadband acoustic signal through the lips. Between 300 Hz and 4 kHz the acoustic resonances R (impedance minima measured through the...
- A zip archive of several series of DICOM files from two ex-vivo hyoid specimens: one adult and one child. Each specimen was scanned at different slice thicknesses, as described and used in Cotter et al., 2015.
- A zip archive of several series of DICOM files from three ex-vivo mandible specimens: two adult and one child. Each specimen was scanned at different slice thicknesses, as described and used in Whyms et al., 2013.
Explore
Audio
- Accent/Region (13)
- American English (2)
- Arabic (1)
- Australian English (2)
- British English (6)
- World Englishes (3)
- Child Speech (9)
- Conversation (9)
- Directed Speech (1)
- Electroglottography / Electrolaryngography (1)
- Emotional Speech (5)
- Forensic (5)
- Language (27)
- Arabic (1)
- Bi-/Multilingual (1)
- English (19)
- French (1)
- L2+ (1)
- Language Learning (2)
- Mandarin (3)
- Multiple (2)
- Spanish (1)
- Multi-Speaker (18)
- Multi-Style (2)
- Pathological (9)
- Singing (2)
- Speech in Noise (3)
- Synthetic Speech (2)
Benchmarks & Validation
- Glottis (2)
Derived & Measured Data
- Formant Measurements (7)
- Fundamental Frequency (2)
- Phone-Level Alignments (1)
- Subglottal Tract (3)
- Vocal Tract (10)
- Vocal Tract Resonances (1)
- Voice Quality Measures (1)
Software, Processing & Utilities
- Articulatory Data Processing (2)
- Feature Extraction (4)
- Image and Volume Segmentation (3)
- Numerical Acoustic Modelling (3)
- Phone Apps (1)
- Speech Processing (5)
- Transcription (3)
- Utilities (4)
Speech Production & Articulation
- Articulography (2)
- Brain Imaging (1)
- MRI (11)
- Ultrasound (10)
- Video (3)
- X-Ray (1)
Teaching Resources
- 3D Models (2)
- Articulation Data (3)
- Tutorials (2)
- Videos (2)
Vocal Anatomy
- Hyoid (1)
- Larynx and Glottis (3)
- Mandible (2)
- Mechanical Properties (1)
- Vocal Tract (11)
Tags
- audio data (46)
- adult (40)
- male (33)
- female (28)
- read speech (23)
- English (23)
- transcribed (13)
- vowels (11)
- MRI (11)
- formant measurement (10)
- spontaneous speech (10)
- child speech (10)
- speech-language pathology (9)
- speech processing (7)
- video (7)
- ultrasound (7)
- teaching resource (6)
- interview (6)
- real-time MRI (rtMRI) (6)
- conversation (6)
- child (6)
- MATLAB (5)
- open-source (5)
- articulatory data (5)
- volumetric MRI (5)
- American English (5)
- vocal tract shape (5)
- segmentation (5)
- automatic speech recognition (ASR) (4)
- speech recognition (4)
- emotional speech (4)
- rtMRI (4)
- annotated (4)
- vocal tract area function (4)
- STL files (3)
- forensic (3)
- telephone (3)
- speaker diarization (3)
- audio processing (3)
- transcription (3)
- Python (3)
- English accents (3)
- British (3)
- angry (3)
- happy (3)
- older adult (3)
- sad (3)
- Mandarin (3)
- perceptually annotated (3)
- speech production (3)
- ultrasound tongue imaging (UTI) (3)
- Newcastle (3)
- DICOM (3)
- computed tomography (CT) (3)
- pathological speech (3)
- speech sound disorder (3)
- numerical acoustic modelling (3)
- source-filter model (2)
- tube model (2)
- Praat (2)
- phonetics (2)
- child-centered audio (2)
- audio (2)
- convert (2)
- file format (2)
- feature extraction (2)
- speech to text (2)
- speech activity detection (2)
- voice activity detection (2)
- whisper (2)
- synthetic speech (2)
- singing (2)
- audiovisual (2)
- articulation (2)
- multimodal (2)
- International Phonetic Alphabet (IPA) (2)
- electromagnetic articulography (EMA) (2)
- lip video (2)
- sociophonetic (2)
- Australian (2)
- phonetic labels (2)
- British English (2)
- L2 English (2)
- finite element method (FEM) (2)
- mandible (2)
- impedance (2)
- vocal tract length (2)
- subglottal tract (2)
- fundamental frequency (2)
- benchmark (2)
- glottis (2)
- videoendoscopy (2)
- multi-language (2)
- 3D print (1)
- Southern standard British English (SSBE) (1)
- map task (1)
- TextGrid (1)
- software (1)
- spectrogram (1)
- speech analysis (1)
- language development (1)
- language environment analysis (LENA) (1)
- word count estimation (1)
- record (1)
- stream (1)
- cepstral peak prominence (CPP) (1)
- harmonic-to-noise ratio (HNR) (1)
- C++ (1)
- classification (1)
- emotion recognition (1)
- speaker identification (1)
- conversational AI (1)
- overlapped speech detection (1)
- speaker embedding (1)
- anechoic (1)
- fast speech (1)
- high pitch (1)
- loud speech (1)
- low pitch (1)
- shout (1)
- slow speech (1)
- deepfake (1)
- logical access (1)
- physical access (1)
- spoof (1)
- speaker detection (1)
- two-class recognizer (1)
- rainbow passage (1)
- labelled (1)
- non-speech (1)
- environmental noise (1)
- noisy audio (1)
- reverberation (1)
- disgust (1)
- surprise (1)
- podcast (1)
- Spanish (1)
- bilingual (1)
- mother-child interaction (1)
- speech rate (1)
- syllable (1)
- syllable nuclei (1)
- consonants (1)
- jaw scans (1)
- accent map (1)
- speech synthesis (1)
- Arabic (1)
- accent variability (1)
- dialect variability (1)
- arousal (1)
- dominance (1)
- valence (1)
- Putonghua (1)
- image processing (1)
- French (1)
- Derby (1)
- Leeds (1)
- Manchester (1)
- York (1)
- digits (1)
- Ohio (1)
- Non-native speech (1)
- adaptation (1)
- diapix (1)
- Middlesbrough (1)
- Sunderland (1)
- speech acoustics (1)
- longitudinal (1)
- formant tracking (1)
- anatomy (1)
- app (1)
- larynx (1)
- typically developing (1)
- x-ray (1)
- x-ray microbeam (1)
- L2 speech (1)
- language learning (1)
- electroglottography (EGG) (1)
- intraoral pressure (1)
- validation (1)
- hyoid (1)
- antiresonance (1)
- vocal tract resonance (1)
- resonance (1)
- corner vowels (1)
- developmental trajectory (1)
- sexual dimorphism (1)
- loudness (1)
- subglottal pressure (1)
- back placement (1)
- chest resonance (1)
- classical (1)
- front placement (1)
- head resonance (1)
- open throat (1)
- roughness (1)
- tenor (1)
- vibrato (1)
- dysarthria (1)
- Amyotrophic Lateral Sclerosis (ALS) (1)
- Down syndrome (1)
- Parkinson's disease (1)
- cerebral palsy (1)
- stroke (1)
- stutter (1)
- cleft (1)
- liquids (1)
- nasals (1)
- plosives (1)
- morphometric (1)
- Lombard speech (1)
- clear speech (1)
- computer-directed speech (1)
- infant-directed speech (1)
- non-native-directed speech (1)
- speech in noise (1)
- Scottish English (1)
- coarticulation (1)
- within-speaker variability (1)
- phone duration (1)
- phone-level alignment (1)
- pitch (1)
- CAPE-V (1)
- GRBAS (1)
- clinical (1)
- voice quality (1)
- area function (1)
- vocal fold model (1)
- vocal tract transfer function (1)
- held vowel (1)
- brain activity (1)
- fMRI (1)
- vocal imitation (1)
- professional voice (1)
- silent speech (1)
- sociolinguistic (1)
- World Englishes (1)
- dyadic (1)
Resource type
- Conference Paper (1)
- Dataset (54)
- Journal Article (21)
- Preprint (2)
- Report (1)
- Software (19)
- Web Page (15)