Your search

  • Real-time magnetic resonance imaging (rtMRI) is a technique that provides high-contrast videographic data of human anatomy in motion. Applied to the vocal tract, it is a powerful method for capturing the dynamics of speech and other vocal behaviours by imaging structures internal to the mouth and throat. These images provide a means of studying the physiological basis for speech, singing,...

  • VoxForge is an open speech dataset that was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).

  • Common Voice is a project to help make voice recognition open to everyone. Developers need an enormous amount of voice data to build voice recognition technologies, and currently most of that data is expensive and proprietary. We want to make voice data freely and publicly available, and make sure the data represents the diversity of real people. Together we can make voice recognition better for everyone.

  • Silero VAD: pre-trained enterprise-grade Voice Activity Detector

  • openSMILE (open-source Speech and Music Interpretation by Large-space Extraction) is a complete and open-source toolkit for audio analysis, processing and classification especially targeted at speech and music applications, e.g. automatic speech recognition, speaker identification, emotion recognition, or beat tracking and chord detection.

Last update from database: 28/05/2025, 04:10 (UTC)