Full catalogue 155 resources
-
VOICEBOX is a speech processing toolbox consisting of MATLAB routines that are maintained and mostly written by Mike Brookes, Department of Electrical & Electronic Engineering, Imperial College, Exhibition Road, London SW7 2BT, UK. The routines are available as a GitHub repository (or as a zip archive, which is often slightly out of date) and are made available under the terms of the GNU Public...
-
SVQTD (Singing Voice Quality and Technique Database) is a classical tenor singing dataset collected from YouTube. It is mainly used to support supervised machine learning for paralinguistic singing attribute recognition tasks. SVQTD contains nearly 4000 vocal solo segments, each 4–20 seconds long, totaling 10.7 hours. These segments are partitioned from 400 audio recordings of 6 famous...
-
Relationships between a listener's identification of a spoken vowel and its properties as revealed from acoustic measurement of its sound wave have been a subject of study by many investigators. Both the utterance and the identification of a vowel depend upon the language and dialectal backgrounds and the vocal and auditory characteristics of the individuals concerned. The purpose of this...
-
Purpose: The anatomic origin for prepubertal vowel acoustic differences between male and female subjects remains unknown. The purpose of this study is to examine developmental sex differences in vocal tract (VT) length and its oral and pharyngeal portions. Method: Nine VT variables were measured from 605 imaging studies...
-
The purpose of this study was to determine the developmental trajectory of the four corner vowels' fundamental frequency (fo) and the first four formant frequencies (F1–F4), and to assess when speaker-sex differences emerge. Five words per vowel, two of which were produced twice, were analyzed for fo and estimates of the first four formant frequencies from 190 (97 female, 93 male) typically...
-
This paper presents a large-scale study of subglottal resonances (SGRs) (the resonant frequencies of the tracheo-bronchial tree) and their relations to various acoustical and physiological characteristics of speakers. The paper presents data from a corpus of simultaneous microphone and accelerometer recordings of consonant-vowel-consonant (CVC) words embedded in a carrier phrase spoken by 25...
-
Experimental determinations of the acoustic properties of the subglottal airway, from the trachea below the larynx to the lungs, may provide useful information for detecting airway pathologies and aid in the understanding of vocal fold auto-oscillation. Here, minimally invasive, high precision impedance measurements are made through the lips (7 men, 3 women) over the range 14–4200 Hz during...
-
The frequencies, magnitudes, and bandwidths of vocal tract resonances are all important in understanding and synthesizing speech. High precision acoustic impedance spectra of the vocal tracts of 10 subjects were measured from 10 Hz to 4.2 kHz by injecting a broadband acoustic signal through the lips. Between 300 Hz and 4 kHz the acoustic resonances R (impedance minima measured through the...
-
A zip archive of several series of DICOM files from two ex-vivo hyoid specimens: one adult and one child. Each specimen was scanned at different slice thicknesses, as described and used in Cotter et al., 2015.
-
A zip archive of several series of DICOM files from three ex-vivo mandible specimens: two adult and one child. Each specimen was scanned at different slice thicknesses, as described and used in Whyms et al., 2013.
-
This dataset contains simultaneous recordings of electroglottography (EGG, recorded with Glottal Enterprises EG2-PCX2), unfiltered audio, and intraoral pressure (recorded with Glottal Enterprises PG-60) from 14 subjects. It is meant to facilitate the validation of physical models of glottal control during voicing, in which the glottal/source waveform for speech is controlled by a combination of...
-
A detailed understanding of how the acoustic patterns of speech sounds are generated by the complex 3D shapes of the vocal tract is a major goal in speech research. The Dresden Vocal Tract Dataset (DVTD) presented here contains geometric and (aero)acoustic data of the vocal tract of 22 German speech sounds (16 vowels, 5 fricatives, 1 lateral), each from one male and one...
Explore
Audio Data
- Accent/Region (3)
- British English (2)
- World Englishes (1)
- Accents (9)
- Child Speech (11)
- Conversation (17)
- Directed Speech (1)
- Electroglottography / Electrolaryngography (1)
- Emotional Speech (5)
- Forensic (5)
- Language (18)
- African Languages (1)
- Bi-/Multilingual (1)
- English (11)
- French (1)
- German (1)
- Korean (1)
- L2+ (1)
- Language Learning (1)
- Mandarin (2)
- Multiple (2)
- Spanish (1)
- Pathological (9)
- Singing (3)
- Speech in Noise (7)
- Synthetic Speech (11)
Derived & Measured Data
- Formant Measurements (7)
- Fundamental Frequency (2)
- Phone-Level Alignments (1)
- Subglottal Tract (3)
- Vocal Tract (10)
- Voice Quality Measures (1)
Software, Processing & Utilities
- Feature Extraction (4)
- Image and Volume Segmentation (3)
- Numerical Acoustic Modelling (3)
- Phone Apps (1)
- Speech Processing (5)
- Transcription (3)
- Utilities (4)
Speech Perception Data
- Brain Imaging (2)
Speech Production Data
- Articulography (3)
- Brain Imaging (2)
- EEG (1)
- MRI (14)
- Ultrasound (10)
- Video (3)
- Vocal Anatomy (23)
- Hyoid (1)
- Larynx and Glottis (3)
- Mandible and Maxilla (3)
- Mechanical Properties (1)
- Models (2)
- Vocal Tract (13)
- X-Ray (1)
Teaching Resources
- 3D Models (2)
- Articulation Data (3)
- Tutorials (2)
- Videos (2)
Tags
- audio data (78)
- adult (41)
- transcribed (36)
- male (34)
- English (30)
- female (29)
- read speech (25)
- spontaneous speech (16)
- magnetic resonance imaging (MRI) (13)
- child speech (12)
- real-time MRI (rtMRI) (12)
- conversation (12)
- vowels (11)
- synthetic speech (11)
- formant measurement (10)
- speech-language pathology (9)
- deepfake (8)
- vocal tract shape (8)
- speech processing (7)
- video (7)
- ultrasound (7)
- interview (7)
- individual variability (7)
- teaching resource (6)
- segmentation (6)
- child (6)
- volumetric MRI (6)
- MATLAB (5)
- open-source (5)
- older adult (5)
- Mandarin (5)
- American English (5)
- articulatory data (5)
- automatic speech recognition (ASR) (4)
- speech recognition (4)
- emotional speech (4)
- speech production (4)
- annotated (4)
- British English (4)
- vocal tract area function (4)
- speech in noise (4)
- text-to-speech (TTS) (4)
- French (4)
- forensic (4)
- telephone (4)
- functional magnetic resonance imaging (fMRI) (4)
- numerical acoustic modelling (3)
- STL files (3)
- speaker diarization (3)
- audio processing (3)
- transcription (3)
- Python (3)
- spoof (3)
- English accents (3)
- singing (3)
- British (3)
- angry (3)
- happy (3)
- sad (3)
- perceptually annotated (3)
- electromagnetic articulography (EMA) (3)
- MRI (3)
- pathological speech (3)
- ultrasound tongue imaging (UTI) (3)
- Newcastle (3)
- speech sound disorder (3)
- L2 English (3)
- computed tomography (CT) (3)
- mandible (3)
- DICOM (3)
- multi-language (3)
- Japanese (3)
- source-filter model (2)
- tube model (2)
- Praat (2)
- phonetics (2)
- child-centered audio (2)
- file format conversion (2)
- feature extraction (2)
- speech to text (2)
- speech activity detection (2)
- voice activity detection (2)
- whisper (2)
- audiovisual (2)
- Spanish (2)
- International Phonetic Alphabet (IPA) (2)
- vocal tract length (2)
- subglottal tract (2)
- fundamental frequency (2)
- benchmark (2)
- glottis (2)
- videoendoscopy (2)
- phone-level alignment (2)
- finite element method (FEM) (2)
- held vowel (2)
- voice conversion (VC) (2)
- Chinese (2)
- Sudanese (2)
- Nepali (2)
- Javanese (2)
- Bengali (2)
- map task (2)
- articulation (2)
- multimodal (2)
- lip video (2)
- sociophonetic (2)
- Australian (2)
- phonetic labels (2)
- speech perception (2)
- area function (1)
- vocal fold model (1)
- 3D print (1)
- TextGrid (1)
- software (1)
- spectrogram (1)
- speech analysis (1)
- language development (1)
- language environment analysis (LENA) (1)
- word count estimation (1)
- record audio (1)
- stream audio (1)
- cepstral peak prominence (CPP) (1)
- harmonic-to-noise ratio (HNR) (1)
- C++ (1)
- classification (1)
- emotion recognition (1)
- speaker identification (1)
- conversational AI (1)
- overlapped speech detection (1)
- speaker embedding (1)
- anechoic (1)
- fast speech (1)
- high pitch (1)
- loud speech (1)
- low pitch (1)
- shout (1)
- slow speech (1)
- logical access (1)
- physical access (1)
- speaker detection (1)
- two-class recognizer (1)
- rainbow passage (1)
- labelled (1)
- non-speech (1)
- environmental noise (1)
- noisy audio (1)
- reverberation (1)
- disgust (1)
- surprise (1)
- podcast (1)
- bilingual (1)
- mother-child interaction (1)
- speech rate (1)
- syllable (1)
- syllable nuclei (1)
- speech synthesis (1)
- image processing (1)
- dysarthria (1)
- digits (1)
- Amyotrophic Lateral Sclerosis (ALS) (1)
- Down syndrome (1)
- Parkinson's disease (1)
- cerebral palsy (1)
- stroke (1)
- stutter (1)
- Non-native speech (1)
- adaptation (1)
- diapix (1)
- Middlesbrough (1)
- Sunderland (1)
- speech acoustics (1)
- longitudinal (1)
- formant tracking (1)
- anatomy (1)
- app (1)
- larynx (1)
- typically developing (1)
- cleft (1)
- x-ray (1)
- x-ray microbeam (1)
- L2 speech (1)
- language learning (1)
- electroglottography (EGG) (1)
- intraoral pressure (1)
- validation (1)
- hyoid (1)
- antiresonance (1)
- vocal tract resonance (1)
- corner vowels (1)
- developmental trajectory (1)
- sexual dimorphism (1)
- loudness (1)
- subglottal pressure (1)
- tenor (1)
- vibrato (1)
- liquids (1)
- nasals (1)
- plosives (1)
- morphometric (1)
- Lombard speech (1)
- clear speech (1)
- computer-directed speech (1)
- infant-directed speech (1)
- non-native-directed speech (1)
- Scottish English (1)
- coarticulation (1)
- within-speaker variability (1)
- phone duration (1)
- pitch (1)
- CAPE-V (1)
- GRBAS (1)
- clinical (1)
- voice quality (1)
- vocal tract transfer function (1)
- professional voice (1)
- silent speech (1)
- 3D head meshes (1)
- German (1)
- acoustic pharyngometry (1)
- electroencephalography (EEG) (1)
- external craniofacial anthropometry (1)
- rhinometry (1)
- syllable sequences (1)
- partial spoof (1)
- ASVspoof (1)
- Amharic (1)
- Swahili (1)
- Wolof (1)
- Korean (1)
- Sinhala (1)
- Khmer (1)
- Afrikaans (1)
- Sesotho (1)
- Setswana (1)
- isiXhosa (1)
- Spanish accent (1)
- Czech (1)
- Southern standard British English (SSBE) (1)
- Bradford (1)
- Kirklees (1)
- Wakefield (1)
- West Yorkshire (1)
- consonants (1)
- dentition (1)
- maxilla (1)
- Arabic (1)
- accent variability (1)
- dialect variability (1)
- Putonghua (1)
- Derby (1)
- Leeds (1)
- Manchester (1)
- York (1)
- Ohio (1)
- brain activity (1)
- vocal imitation (1)
- sociolinguistic (1)
- World Englishes (1)
- dyadic (1)
- African (1)
- Cameroon (1)
- Chad (1)
- Congo (1)
- Gabon (1)
- Niger (1)
- evolution of speech (1)
- speech motor control (1)
- anatomical measurements (1)
Resource type
- Conference Paper (1)
- Dataset (94)
- Journal Article (23)
- Preprint (2)
- Report (1)
- Software (19)
- Web Page (15)