Text-to-Speech System Quickly Mimics Hundreds of Accents

By ETCentric
May 26, 2017

As another example of the significant advances we have been following in artificial intelligence and deep learning, Chinese search giant Baidu has introduced Deep Voice 2, the second iteration of its compelling text-to-speech system. The company introduced Deep Voice just three months ago, with the ability to produce speech “in near real time” that was “nearly indistinguishable from an actual human voice,” according to The Verge. While the first system was limited to learning one voice at a time, “and required many hours of audio or more from which to build a sample,” the updated version “can learn the nuances of a person’s voice with just half an hour of audio, and a single system can learn to imitate hundreds of different speakers.”

For more information, visit the Baidu Research page for Deep Voice or download the research paper, “Deep Voice: Real-Time Neural Text-to-Speech.”
