ChatGPT continues to evolve past simple text prompts. Last week, OpenAI announced plans to integrate its latest image generator, and today it added voice and image capabilities to its app.
The chatbot can now accept images in prompts, as well as speak on its own with hyper-realistic AI-generated voices. OpenAI says the goal is to offer a "new, more intuitive type of interface." It is also a way to entice subscriptions to its $20-per-month ChatGPT Plus service.
Curiously, most examples in the announcement are about how children and families can use these features. Perhaps OpenAI has identified a new target audience—or an antidote to the apocalyptic predictions currently associated with its product.
The promotional video for the new AI voice generation capability begins with someone asking to hear a bedtime story about "the super duper sunflower hedgehog named Larry." ChatGPT replies: "Larry was a unique hedgehog unlike any other. He had bright sunflower petals instead of spines." The user then asks, "What was Larry's house like?" It's easy to imagine kids happily asking follow-up questions to expand on the story, giving parents a break.
(At its recent fall event, Amazon previewed a similar feature: Explore with Alexa, an exclusive addition to Amazon Kids+ that will let kids ask Alexa questions about animals and nature.)
"The new voice technology—capable of crafting realistic synthetic voices from just a few seconds of real speech—opens doors to many creative and accessibility-focused applications," OpenAI says. The company also points out that you can use it to "settle a dinner table debate" or have conversations on the go.
To transcribe the voice prompts, OpenAI will use Whisper, its open-source speech recognition system. It warns,
Read more on pcmag.com