Over the last few months, Microsoft has been building artificial intelligence (AI) into its suite of products, from the consumer-focused Microsoft Office to Microsoft 365 Copilot for businesses. At its Ignite 2023 conference, the technology giant announced several new AI-based products, including Copilot Studio and Windows AI Studio, and renamed Bing Chat to simply Copilot. The company also launched a text-to-speech avatar feature in Azure AI Speech that can create talking avatar videos. It is being rolled out in public preview. Here is what you need to know about the new feature.
The Azure AI Speech text-to-speech avatar converts text into a 2D video of a human-like speaking avatar. Microsoft says the neural text-to-speech avatar models are trained by deep neural networks on human video recording samples, and the voice of the avatar is provided by a text-to-speech voice model. Users can use text input to build training videos, product introductions, customer testimonials, and more, enabling more digital interactions.
The Azure AI Speech avatar content generation workflow involves three components: the text analyzer, the TTS audio synthesizer, and the TTS avatar video synthesizer. First, the user provides the text input, and the text analyzer converts it into a phoneme sequence. Then, the TTS audio synthesizer predicts the acoustic features of the input text and synthesizes the voice. Both of these components are powered by text-to-speech voice models.
Lastly, the neural text-to-speech avatar model predicts lip-synced images from the acoustic features, and these images are used to generate the synthetic video.
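To make the three-stage flow concrete, here is a minimal Python sketch of how data might pass from one stage to the next. It is not Microsoft's implementation and does not call the Azure AI Speech API: the toy lexicon, the fixed 80 ms phoneme duration, the vowel-based mouth shapes, and every function and class name are assumptions made purely for illustration, standing in for the neural models the article describes.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon (illustrative only): maps words to phoneme symbols.
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}
# Assumption: vowel-like phonemes open the mouth, everything else closes it.
VOWELS = {"AH", "OW", "ER"}


@dataclass
class AcousticFrame:
    phoneme: str
    duration_ms: int


@dataclass
class VideoFrame:
    time_ms: int
    mouth_shape: str


def analyze_text(text: str) -> List[str]:
    """Stage 1 (text analyzer): turn input text into a phoneme sequence."""
    phonemes: List[str] = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        # Unknown words fall back to a placeholder phoneme.
        phonemes.extend(TOY_LEXICON.get(word, ["UNK"]))
    return phonemes


def synthesize_audio(phonemes: List[str]) -> List[AcousticFrame]:
    """Stage 2 (audio synthesizer): stand-in for acoustic feature prediction.

    A real system predicts features with a neural voice model; here every
    phoneme simply gets a fixed 80 ms duration.
    """
    return [AcousticFrame(p, 80) for p in phonemes]


def synthesize_video(frames: List[AcousticFrame]) -> List[VideoFrame]:
    """Stage 3 (avatar video synthesizer): derive lip shapes from the audio."""
    video: List[VideoFrame] = []
    time_ms = 0
    for frame in frames:
        shape = "open" if frame.phoneme in VOWELS else "closed"
        video.append(VideoFrame(time_ms, shape))
        time_ms += frame.duration_ms
    return video


def run_pipeline(text: str) -> List[VideoFrame]:
    """Chain the three stages in the order the article describes."""
    return synthesize_video(synthesize_audio(analyze_text(text)))


if __name__ == "__main__":
    for video_frame in run_pipeline("Hello world"):
        print(video_frame.time_ms, video_frame.mouth_shape)
```

The point of the sketch is the ordering: the video stage consumes the acoustic output rather than the raw text, which is why the avatar's lip movements stay aligned with the synthesized voice.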
The Azure AI Speech service is being offered in two tiers. The first is