Baidu says that, unlike previous text-to-speech systems, Deep Voice 2 finds shared qualities between the training voices entirely on its own, without any prior guidance. "Deep voice 2 can learn from hundreds of voices and imitate them perfectly," a blog post says.

In a research paper (PDF), Baidu concludes that its neural network can generate speech fairly effectively even from small voice samples drawn from hundreds of different speakers. All of which is to say, it might not be long before we start hearing digital assistants that are more representative of the voices users encounter in their day-to-day lives.

To hear how far the tech has come, and for more information on how the team got to this point, hit the source links below.