Last Updated: September 26, 2023, 08:30 IST
ChatGPT is moving to the next stage of its evolution: OpenAI announced this week that the popular AI chatbot is getting new capabilities, including the ability to speak and to hear your prompts.
These capabilities offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about, the company said in a statement.
OpenAI CEO Sam Altman also shared his excitement about the new updates in a post.
OpenAI is rolling out voice and images in ChatGPT to Plus subscribers and Enterprise users over the next couple of weeks. It also pointed out that the voice feature will be available on both iOS and Android devices, and that you need to enable it manually in the app's settings. OpenAI also confirmed that the images feature will work on all platforms.
So, what has OpenAI done to make ChatGPT speak and hear your prompts?
The company says the voice feature is powered by a new text-to-speech model that can generate human-like audio. All it needs is text and a few seconds of sample speech.
“We collaborated with professional voice actors to create each of the voices. We also use Whisper, our open-source speech recognition system, to transcribe your spoken words into text,” said OpenAI.
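For readers curious how these pieces might fit together, here is a rough sketch of a single voice exchange: speech is transcribed to text, the chat model replies in text, and the reply is turned back into audio. It is only an illustration of the flow described above, not OpenAI's actual implementation. The names `run_voice_turn`, `transcribe`, `respond` and `synthesize` are hypothetical placeholders, and the stand-in functions in the example do no real speech processing.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    transcript: str     # text recognized from the user's speech
    reply_text: str     # text reply produced by the chat model
    reply_audio: bytes  # synthesized speech for the reply


def run_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],   # hypothetical speech-to-text step
    respond: Callable[[str], str],        # hypothetical chat-model step
    synthesize: Callable[[str], bytes],   # hypothetical text-to-speech step
) -> VoiceTurn:
    """Run one spoken exchange: audio in, text reply, audio out."""
    transcript = transcribe(audio_in)
    reply_text = respond(transcript)
    reply_audio = synthesize(reply_text)
    return VoiceTurn(transcript, reply_text, reply_audio)


if __name__ == "__main__":
    # Stand-ins for illustration only: they treat the "audio" as plain bytes.
    turn = run_voice_turn(
        b"hello",
        transcribe=lambda audio: audio.decode("utf-8"),
        respond=lambda text: f"You said: {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(turn.reply_text)
```

In a real system each placeholder would be backed by an actual model, such as a speech recognizer for `transcribe` and a text-to-speech model for `synthesize`.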
As for images, ChatGPT's image understanding is powered by multimodal versions of GPT-3.5 and GPT-4, which apply their language reasoning skills to a wide range of images, including ones that contain text.
OpenAI is obviously excited but equally concerned about the possible misuse of these features. “These capabilities also present new risks, such as the potential for malicious actors to impersonate public figures or commit fraud,” the company noted.
“We’ve also taken technical measures to significantly limit ChatGPT’s ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy,” said the company.