An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech
Resource type
Authors/contributors
- Yang, Yudong (Author)
- Su, Rongfeng (Author)
- Zhao, Shaofeng (Author)
- Wei, Jianguo (Author)
- Ng, Manwa Lawrence (Author)
- Yan, Nan (Author)
- Wang, Lan (Author)
Title
An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech
Abstract
Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are universally used as visual feedback in interventions for articulation disorders or visual cues in speech recognition. Nevertheless, the availability of high-quality audio-ultrasound datasets remains scarce. The present study, therefore, aims to construct a multimodal database designed for Mandarin speech. The dataset integrates synchronized ultrasound images of lingual movement, and the corresponding audio recordings and text annotations elicited from 43 healthy speakers and 11 patients with dysarthria through speech tasks (including vowels, monosyllables, and sentences), with a total duration of 22.31 hours. In addition, a customized helmet structure was employed to stabilize the ultrasound probe, precisely controlling for head movement and minimizing displacement interference. The proposed database carries apparent values in automatic speech recognition, silent interface development, and research in speech pathology and linguistics.
Publication
Scientific Data
Volume
12
Issue
1
Pages
607
Date
2025-04-11
Journal Abbr
Sci Data
Language
en
ISSN
2052-4463
Accessed
25/04/2025, 18:50
Library Catalog
DOI.org (Crossref)
Citation
Yang, Y., Su, R., Zhao, S., Wei, J., Ng, M. L., Yan, N., & Wang, L. (2025). An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech. Scientific Data, 12(1), 607. https://doi.org/10.1038/s41597-025-04917-w
Audio
Speech Production & Articulation
Tags