In this paper we examine the use of an HMM-based polyglot synthesizer for languages for which very limited or no speech data is available. In a previous study, we presented a system that combines monolingual corpora from several languages to create a polyglot synthesizer. This synthesizer can synthesize any of the languages included in the training data with the same output voice and speech quality. Here, we approximate the sounds of languages not included in the training data by those available in the polyglot training data. Since the phonetic inventory of a polyglot synthesizer is wider than that of a monolingual one, the approximation of such non-included sounds becomes more accurate, and perceptual intelligibility increases as a result. Moreover, the performance of a polyglot synthesizer can be further improved by adding a small amount of data from the target language.
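The idea of approximating a non-included sound by the closest available one can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three-feature phone description, the mismatch-count distance, the SAMPA-style symbols, and the names `FEATURES`, `distance`, and `approximate`. It only shows why a wider inventory can yield a closer substitute; the paper's actual mapping procedure may differ.

```python
# Hypothetical articulatory descriptions: (place, manner, voicing).
# These values are illustrative and not taken from the paper.
FEATURES = {
    "p": ("bilabial", "plosive", "voiceless"),
    "t": ("alveolar", "plosive", "voiceless"),
    "k": ("velar", "plosive", "voiceless"),
    "s": ("alveolar", "fricative", "voiceless"),
    "f": ("labiodental", "fricative", "voiceless"),
    "T": ("dental", "fricative", "voiceless"),  # SAMPA "T", as in English "thin"
}


def distance(a, b):
    """Count the articulatory features on which two phones differ."""
    return sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))


def approximate(phone, inventory):
    """Return the inventory phone closest to `phone` (itself if present)."""
    if phone in inventory:
        return phone
    # Sorting makes the choice deterministic when several phones tie.
    return min(sorted(inventory), key=lambda cand: distance(phone, cand))


# Hypothetical inventories: a narrow monolingual one and a wider polyglot one.
MONOLINGUAL = {"p", "t", "k"}
POLYGLOT = {"p", "t", "k", "s", "f"}

mono_choice = approximate("T", MONOLINGUAL)
poly_choice = approximate("T", POLYGLOT)
print(mono_choice, distance("T", mono_choice))
print(poly_choice, distance("T", poly_choice))
```

In this toy setup the polyglot inventory contains fricatives, so the substitute for the missing phone differs in fewer features than the best plosive available in the monolingual inventory.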
@inproceedings{latorre05_interspeech,
  title     = {Cross-language synthesis with a polyglot synthesizer},
  author    = {Javier Latorre and Koji Iwano and Sadaoki Furui},
  year      = {2005},
  booktitle = {Interspeech 2005},
  pages     = {1477--1480},
  doi       = {10.21437/Interspeech.2005-521},
  issn      = {2958-1796},
}
Cite as: Latorre, J., Iwano, K., Furui, S. (2005) Cross-language synthesis with a polyglot synthesizer. Proc. Interspeech 2005, 1477-1480, doi: 10.21437/Interspeech.2005-521