The Merlin toolkit

Merlin is a toolkit for building deep neural network (DNN) models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).
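To illustrate how the three components fit together, here is a minimal sketch of the data flow: a front end turns text into linguistic features, the neural network maps those to vocoder parameters, and the vocoder renders a waveform. Every name below (synthesize, toy_front_end, toy_acoustic_model, toy_vocoder) is a hypothetical stand-in chosen for this example and is not part of Merlin's, Festival's, or WORLD's actual API.

```python
from typing import Callable, List

Frames = List[List[float]]  # one feature vector per frame


def synthesize(
    text: str,
    front_end: Callable[[str], Frames],
    acoustic_model: Callable[[Frames], Frames],
    vocoder: Callable[[Frames], List[float]],
) -> List[float]:
    """Chain the three stages: text -> linguistic features -> vocoder parameters -> samples."""
    linguistic = front_end(text)           # role played by e.g. Festival
    acoustic = acoustic_model(linguistic)  # role played by the DNN that Merlin trains
    return vocoder(acoustic)               # role played by e.g. STRAIGHT or WORLD


# Toy stand-ins (assumptions for illustration only) so the sketch runs end to end.
def toy_front_end(text: str) -> Frames:
    # One "frame" per character, holding its code point.
    return [[float(ord(ch))] for ch in text]


def toy_acoustic_model(features: Frames) -> Frames:
    # Scale each value; a real model would be a trained neural network.
    return [[value / 255.0 for value in frame] for frame in features]


def toy_vocoder(parameters: Frames) -> List[float]:
    # Emit one sample per frame; a real vocoder generates many samples per frame.
    return [frame[0] for frame in parameters]


waveform = synthesize("hi", toy_front_end, toy_acoustic_model, toy_vocoder)
```

Passing the stages in as callables mirrors the point made above: the neural network sits in the middle and depends on separate tools on either side.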

The system is written in Python and relies on the Theano numerical computation library.

Merlin comes with recipes (in the spirit of the Kaldi automatic speech recognition toolkit) to show you how to build state-of-the-art systems.

Merlin is free software, distributed under the Apache License, Version 2.0, which permits both commercial and non-commercial use.

Current version

You can obtain Merlin from GitHub. Please use the facilities on GitHub to report bugs and contribute to the code.

Citation

If you publish work based on Merlin, please cite:

Zhizheng Wu, Oliver Watts, Simon King, “Merlin: An Open Source Neural Network Speech Synthesis System” in Proc. 9th ISCA Speech Synthesis Workshop (SSW9), September 2016, Sunnyvale, CA, USA.

Current Personnel

Zhizheng Wu (now with Apple, formerly with CSTR)
Oliver Watts (CSTR)
Srikanth Ronanki (CSTR)
Simon King (CSTR)

Written by Simon King on September 25, 2016