The Merlin toolkit
Merlin is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).
The system is written in Python and relies on the Theano numerical computation library.
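The division of labour between the three components can be sketched as follows. This is only an illustration of the data flow (text, then linguistic features, then vocoder parameters, then waveform). Every name in it (synthesize, toy_front_end, toy_acoustic_model, toy_vocoder) is a hypothetical placeholder and not part of Merlin, Festival, STRAIGHT or WORLD, and the toy functions do no real signal processing.

```python
from typing import Callable, List

# Hypothetical type aliases for illustration only; not Merlin's data structures.
LinguisticFeatures = List[List[float]]  # one feature vector per frame
AcousticFeatures = List[List[float]]    # one vocoder-parameter vector per frame
Waveform = List[float]                  # audio samples


def synthesize(
    text: str,
    front_end: Callable[[str], LinguisticFeatures],
    acoustic_model: Callable[[LinguisticFeatures], AcousticFeatures],
    vocoder: Callable[[AcousticFeatures], Waveform],
) -> Waveform:
    """Chain the three stages: front end -> acoustic model -> vocoder."""
    linguistic = front_end(text)           # role played by e.g. Festival
    acoustic = acoustic_model(linguistic)  # role played by the neural network
    return vocoder(acoustic)               # role played by e.g. STRAIGHT or WORLD


# Toy stand-ins so the sketch runs end to end (assumptions, not real components).
def toy_front_end(text: str) -> LinguisticFeatures:
    # Pretend each character is one frame with a single numeric feature.
    return [[ord(ch) / 255.0] for ch in text]


def toy_acoustic_model(frames: LinguisticFeatures) -> AcousticFeatures:
    # Stand-in for a trained network: map each frame to two "parameters".
    return [[f[0], 1.0 - f[0]] for f in frames]


def toy_vocoder(params: AcousticFeatures) -> Waveform:
    # Stand-in for waveform generation: one "sample" per frame.
    return [p[0] - p[1] for p in params]


waveform = synthesize("hi", toy_front_end, toy_acoustic_model, toy_vocoder)
```

Passing the stages in as callables mirrors the point that the front end and the vocoder are separate, interchangeable tools. Only the middle stage corresponds to what Merlin itself provides.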
Merlin comes with recipes (in the spirit of the Kaldi automatic speech recognition toolkit) that show you how to build state-of-the-art systems.
Merlin is free software, distributed under the Apache License, Version 2.0, which permits both commercial and non-commercial use.
Current version
You can obtain Merlin from GitHub. Please use the facilities on GitHub to report bugs and contribute to the code.
Citation
If you publish work based on Merlin, please cite:
Zhizheng Wu, Oliver Watts, Simon King, “Merlin: An Open Source Neural Network Speech Synthesis System,” in Proc. 9th ISCA Speech Synthesis Workshop (SSW9), September 2016, Sunnyvale, CA, USA.
Current personnel
- Zhizheng Wu (now with Apple, formerly with CSTR)
- Oliver Watts (CSTR)
- Srikanth Ronanki (CSTR)
- Simon King (CSTR)