Date: 2024-08-05

OpenAI has postponed the launch of its realistic voice conversation experience from late June to July. The Microsoft-backed AI startup announced on Tuesday via a post on X that it is beginning to roll out this advanced voice mode to a limited group of ChatGPT Plus users. During its Spring Update event in May, OpenAI showcased the new Voice Mode on ChatGPT, which utilizes GPT-4o’s audio and video capabilities. The highly anticipated Voice Mode is now available, but only to a select few.

Key Highlights:

OpenAI has launched Advanced Voice Mode (AVM) in alpha.

The new version promises lower latency and more natural conversation than the previous Voice Mode.

AVM is currently available only to a select group of ChatGPT Plus users.

During the alpha phase, only a limited number of features will be accessible.

Launch of Advanced Voice Mode in Alpha

The company revealed in a post on X that it is launching Voice Mode in alpha for a limited group of ChatGPT Plus users. This advanced voice assistant will be able to respond to emotions and manage interruptions, offering a more intelligent interaction experience.

Here is what the post stated:

“We’re starting to roll out advanced Voice Mode to a small group of ChatGPT Plus users. Advanced Voice Mode offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions.”

OpenAI Introduces New Voice Capabilities for ChatGPT Plus Users

The new feature lets ChatGPT Plus users speak directly to ChatGPT and receive responses in real time. Users can also interrupt ChatGPT while it is speaking, which brings the exchange closer to the natural flow of human conversation, an area where AI assistants have traditionally struggled.

OpenAI highlighted this update as part of its ongoing efforts to enhance the user experience. The rollout follows the delay announced in June, which the company said it needed to improve the model’s ability to detect and refuse certain content. As the company continues to innovate and expand its infrastructure, it is positioning itself at the forefront of the rapidly evolving AI industry.

Understanding Voice Mode

Voice Mode is a voice assistant designed to facilitate interactive conversations with ChatGPT. Utilizing a state-of-the-art text-to-speech model, it produces a voice that closely mimics human speech. However, the feature drew controversy, notably when actress Scarlett Johansson considered legal action over a voice that she said closely resembled her own.

The revamped Voice Mode, built on GPT-4o’s audio and video capabilities, promises substantially lower latency. Previously, Voice Mode responded with average delays of 2.8 seconds with GPT-3.5 and 5.4 seconds with GPT-4, because it chained three separate models: one to transcribe speech to text, one to generate a text reply, and one to convert that reply back to audio. The latest iteration uses a single model to process audio, vision, and text, removing those hand-offs.

The updated assistant will include four preset voices and will phase out the Sky voice type. New filters have also been implemented to prevent the generation of copyrighted content such as music.

The new Voice Mode is also designed to help with on-screen content and to use the phone camera for context-based responses. These screen- and video-sharing capabilities are not part of the alpha and are slated for future releases.

OpenAI is committed to continuously improving the model based on user feedback and plans to publish a detailed report on GPT-4o’s performance, including safety assessments and limitations, in August.

Accessing Voice Mode Alpha

Voice Mode Alpha is currently available to a select group of ChatGPT Plus users, as announced by OpenAI. A ChatGPT Plus subscription costs $20 per month. Users chosen for the alpha will receive an email with instructions and a notification in the mobile app. Those who have not been notified yet may still be added, as OpenAI will continue adding users on a rolling basis. The real-time Voice Mode is expected to be available to all ChatGPT Plus users this fall.

