Tacotron, Google AI Can Mimic Your Voice
How far can artificial intelligence be pushed? The answer could open the door to a new world. While tech leaders argued over what AI means for the future of the human race, Google was working on one answer to that question. Last month, Google published a research paper describing a new advance in the field. According to the paper, the company has built a system that can mimic the human voice with high accuracy.
Tacotron 2, the text-to-speech system built by Google, uses AI to generate speech that sounds close to a human voice. The system consists of two deep neural networks. The first network translates the text into a spectrogram, a visual representation of audio frequencies over time. The spectrogram is then passed to WaveNet, a model developed by DeepMind, Alphabet’s AI research lab. WaveNet converts the spectrogram into the corresponding speech waveform.
“This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text. The system is composed of a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed by a modified WaveNet model acting as a vocoder to synthesize time-domain waveforms from those spectrograms,” Google wrote in the research paper.
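To make the idea of a mel-scale spectrogram more concrete, here is a minimal sketch in Python using only NumPy. It is not Google's code and has nothing to do with how Tacotron 2 predicts spectrograms from text; it only shows what such a spectrogram is by computing one from a synthetic 440 Hz tone. The function names, the sample rate, the frame size, the hop size and the number of mel bands are all illustrative assumptions.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style formula for converting Hz to the mel scale.
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def stft_magnitude(signal, n_fft=512, hop=128):
    # Slice the signal into overlapping, Hann-windowed frames and take
    # the magnitude of each frame's FFT: frequencies over time.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([
        signal[i * hop : i * hop + n_fft] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, n_fft // 2 + 1)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_mels, len(bin_freqs)))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (bin_freqs - left) / (center - left)
        falling = (right - bin_freqs) / (right - center)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

# Illustrative input: one second of a 440 Hz sine wave at 16 kHz (assumed values).
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

mag = stft_magnitude(tone, n_fft=512, hop=128)
mel_spec = mag @ mel_filterbank(sr, n_fft=512, n_mels=80).T

print(mag.shape, mel_spec.shape)
```

Each row of `mel_spec` describes one short slice of time and each column one mel frequency band. In the system described above, the first network outputs an array of this kind from text, and the vocoder turns it back into audio.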
The voice generated by Tacotron 2 is close to a human voice, and Google has published samples that you can listen to on the samples page. The company provides two samples for comparison: one is a recording of an actual human and the other is generated by the AI. The page does not say which is which, but its source reveals filenames containing ‘GEN’, which most likely mark the AI-generated samples. The researchers also tested Tacotron 2 on difficult cases involving punctuation and emphasis, and it handled them well, for example by placing more stress on important or capitalized words.
Google introduced WaveNet in 2016 and already uses it to generate voices for Google Assistant, and the company keeps exploring AI for further integration. Tacotron 2 could be used to enhance Google Assistant soon, but no specific time frame has been given. For now, the system is trained only to generate a female voice; adding a male voice and more functionality will take time.