Griko-Italian Parallel Speech Corpus
330 utterances from a real language documentation setting
About
This very small parallel speech corpus pairs speech in Griko, an endangered language, with translations into Italian. It comprises 330 sentences, each provided at the following levels of information: speech, machine-extracted pseudo-phones, transcriptions, translations, and sentence alignment.
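As a rough illustration of how the information levels listed above could be held together in memory, here is a minimal Python sketch. It does not reflect the actual file layout of the release: the class name `Utterance`, all field names, and the helper `is_complete` are hypothetical, and the sentence alignment is assumed to be represented simply by the levels of one sentence sharing the same identifier.

```python
from dataclasses import dataclass
from typing import List

# Corpus size as stated in the description above.
EXPECTED_UTTERANCES = 330


@dataclass(frozen=True)
class Utterance:
    """One corpus entry. All field names are hypothetical."""
    utterance_id: str         # assumed shared key that ties the levels together
    audio_path: str           # speech recording
    pseudo_phones: List[str]  # machine-extracted pseudo-phone units
    transcription: str        # Griko transcription
    translation: str          # Italian translation


def is_complete(utterances: List[Utterance]) -> bool:
    """Return True if there are exactly 330 entries with unique identifiers."""
    ids = {u.utterance_id for u in utterances}
    return len(utterances) == EXPECTED_UTTERANCES and len(ids) == len(utterances)
```

A check like `is_complete` could be run once after loading, to catch a truncated download or duplicated entries before any experiment.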
Downloading the data
Citing us
When using our dataset, please cite the following paper:
@inproceedings{zanonboito18_sltu,
  title     = {A Small Griko-Italian Speech Translation Corpus},
  author    = {Marcely Zanon Boito and Antonios Anastasopoulos and Aline Villavicencio and Laurent Besacier and Marika Lekakou},
  year      = {2018},
  booktitle = {6th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2018)},
  pages     = {36--41},
  doi       = {10.21437/SLTU.2018-8},
}