FoR: Fake or Real

Resource type

Dataset

Authors/contributors

Reimao, Ricardo (Author)
Tzerpos, Vassilios (Author)

Title

FoR: Fake or Real

Abstract

The Fake-or-Real (FoR) dataset is a collection of more than 195,000 utterances of real human speech and computer-generated speech. The dataset can be used to train classifiers to detect synthetic speech. It aggregates data from the latest TTS solutions (such as Deep Voice 3 and Google WaveNet TTS) as well as a variety of real human speech, including the Arctic Dataset (http://festvox.org/cmu_arctic/), LJSpeech Dataset (https://keithito.com/LJ-Speech-Dataset/), VoxForge Dataset (http://www.voxforge.org) and our own speech recordings. The dataset is published in four versions: for-original, for-norm, for-2sec and for-rerec. The first version, for-original, contains the files as collected from the speech sources, without any modification (balanced version). The second version, for-norm, contains the same files, but balanced in terms of gender and class and normalized in terms of sample rate, volume and number of channels. The third version, for-2sec, is based on the second, with the files truncated at 2 seconds. The last version, for-rerec, is a rerecorded version of the for-2sec dataset, simulating a scenario where an attacker sends an utterance through a voice channel (e.g. a phone call or a voice message).
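The preparation steps named in the abstract (a single channel, normalized volume, truncation at 2 seconds) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, peak normalization is an assumed stand-in for the unspecified volume normalization, and resampling to a common sample rate is left out.

```python
# Illustrative only: the abstract does not say how FoR was normalized.
# Peak normalization and channel averaging are assumptions.

def to_mono(frames):
    """Average the channels of each frame (one tuple of values per sample)."""
    return [sum(frame) / len(frame) for frame in frames]

def peak_normalize(samples, peak=1.0):
    """Scale samples so the largest absolute value equals `peak`."""
    loudest = max((abs(s) for s in samples), default=0.0)
    if loudest == 0:
        return list(samples)  # silent or empty clip: nothing to scale
    return [s * peak / loudest for s in samples]

def truncate(samples, sample_rate, seconds=2.0):
    """Keep only the first `seconds` of audio."""
    return samples[: int(sample_rate * seconds)]

def prepare_clip(frames, sample_rate, seconds=2.0):
    """Mono conversion, then normalization, then truncation."""
    mono = to_mono(frames)
    return truncate(peak_normalize(mono), sample_rate, seconds)
```

For example, prepare_clip(frames, 16000) would return at most 32,000 samples for a clip sampled at 16 kHz; the 16 kHz figure is only an example, as the record does not state the sample rate used.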

Citation Key

_bj

URL

https://bil.eecs.yorku.ca/datasets/

Citation

Reimao, R., & Tzerpos, V. (n.d.). FoR: Fake or Real [Dataset]. Retrieved from https://bil.eecs.yorku.ca/datasets/

Audio Data

Synthetic Speech