AI news
June 26, 2024

ChatGPT's New Voice Mode Is Delayed By At Least A Month

Fans are furious after OpenAI confirmed the rollout of Voice Mode is delayed.

Jim Clyde Monge

The highly anticipated Voice Mode support in ChatGPT is delayed.

Originally scheduled for late June, the release has been postponed by one month. This delay was confirmed in a post on OpenAI’s Discord channel.

Image by Jim Clyde Monge

According to OpenAI, the delay is due to several improvements needed before the launch. They stated,

We’re sharing an update on the advanced Voice Mode we demoed during our Spring Update, which we remain very excited about:
We had planned to start rolling this out in alpha to a small group of ChatGPT Plus users in late June, but need one more month to reach our bar to launch. For example, we’re improving the model’s ability to detect and refuse certain content. We’re also working on improving the user experience and preparing our infrastructure to scale to millions while maintaining real-time responses.

OpenAI plans to use an iterative deployment strategy for Voice Mode. This means they will start with a small group of alpha testers to gather feedback and make improvements before expanding access.

As part of our iterative deployment strategy, we’ll start the alpha with a small group of users to gather feedback and expand based on what we learn. We are planning for all Plus users to have access in the fall. Exact timelines depend on meeting our high safety and reliability bar. We are also working on rolling out the new video and screen sharing capabilities we demoed separately, and will keep you posted on that timeline.
ChatGPT’s advanced Voice Mode can understand and respond with emotions and non-verbal cues, moving us closer to real-time, natural conversations with AI. Our mission is to bring these new experiences to you thoughtfully.

The Controversy

The delay has sparked speculation about the reasons behind it. Honestly, I doubt OpenAI's stated reason for postponing the Voice Mode rollout.

Remember Scarlett Johansson, the Hollywood actress who voiced Samantha in the 2013 hit film "Her"? She expressed anger over ChatGPT's chatbot voice, which "sounded so eerily similar" to hers.

Image by Jim Clyde Monge

She stated,

As a result of their actions, I was forced to hire legal counsel, who wrote two letters to Mr. Altman and OpenAI, setting out what they had done and asking them to detail the exact process by which they created the “Sky” voice. Consequently, OpenAI reluctantly agreed to take down the “Sky” voice.

The public’s reaction to Johansson’s statement and OpenAI’s actions has been mixed. While some people sympathize with Johansson and agree that her rights were violated, others believe that OpenAI’s use of a similar-sounding voice falls within acceptable bounds of creativity and innovation.

Days after the blunder with Johansson, the voice feature in ChatGPT went down for some users, while others reported that the headphone icon went missing.

Is this what caused the delay of the Voice Mode rollout?

What is Voice Mode in GPT-4o?

For those unfamiliar with the new Voice Mode feature in ChatGPT: OpenAI introduced GPT-4o in May 2024, a new model that's smarter, cheaper, better at coding, multimodal, and mind-blowingly fast.

One of the standout features of the announcement was Voice Mode. It can respond to audio inputs in as little as 232 milliseconds on average, which is comparable to human response times in conversation. It's quick, and the AI voice uncannily mimics a real human voice.

Check out this sample video from OpenAI that showcases some capabilities of GPT-4o’s voice mode.

Aside from the speed, another advantage of GPT-4o over the previous model is that it's 50% cheaper in the API.
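To put that price cut in concrete terms, here's a rough sketch of what it means for a single API request. The per-million-token figures below ($5/$15 for GPT-4o input/output versus $10/$30 for GPT-4 Turbo) are the list prices at the May 2024 announcement and are assumptions here; OpenAI's pricing may have changed since.

```python
# Rough cost comparison of GPT-4o vs. GPT-4 Turbo in the API.
# Assumed list prices (USD per 1M tokens) as of May 2024 -- check
# OpenAI's pricing page for current figures.
PRICES = {
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one API request for the given token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a request with 2,000 input tokens and 500 output tokens.
old = request_cost("gpt-4-turbo", 2_000, 500)
new = request_cost("gpt-4o", 2_000, 500)
print(f"GPT-4 Turbo: ${old:.4f}, GPT-4o: ${new:.4f}")  # GPT-4o costs half as much
```

Because both input and output rates are halved, the 50% savings holds regardless of how your usage splits between prompt and completion tokens.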

The best part? GPT-4o was made available to all ChatGPT users for FREE.

After the announcement, Sam Altman said the new Voice Mode would go live in the coming weeks for ChatGPT Plus users.

It turns out that paying users will have to wait for months.

ChatGPT Plus Subscribers Are Cancelling

Now that ChatGPT with GPT-4o is completely free to use, some paying users, including me, have decided to cancel their subscriptions and use the free tier instead.

I held off on cancelling until today because I was hoping OpenAI would release Voice Mode this month. Given that the only benefits I get from paying 20 USD a month are the higher message limit and advanced data analysis (which I rarely use), it's no longer worth it.

I'll probably renew my monthly subscription once Voice Mode rolls out because I'm eager to try it.

For those looking for alternatives, there are other AI tools available. For data analysis and visualization, Anthropic's Claude 3.5 Sonnet is a great free alternative. The Artifacts feature in Claude 3.5 Sonnet is more powerful than anything ChatGPT currently offers, in my opinion. If you're interested, you can read more about Claude 3.5 Sonnet in the article below.

Anthropic Introduces Claude 3.5 Sonnet — The Most Powerful Language Model
Anthropic is trying to win back its users with Claude 3.5 Sonnet, which they claim is better than GPT-4o. It’s faster…
