
OpenAI launches Advanced Voice Mode with vision for ChatGPT


Social Samosa

Seven months after OpenAI previewed its real-time video feature for ChatGPT, the company has officially launched the capability. 

During a livestream, OpenAI introduced a new update to the Advanced Voice Mode, which now includes vision capabilities. ChatGPT users with a subscription to ChatGPT Plus, Team, or Pro can use their phones to point at objects, and ChatGPT will respond almost immediately.

The new feature can also interpret what's on a device's screen via screen sharing. For example, it can explain settings menus or provide suggestions for solving a math problem.

To use the new feature, users can tap the voice icon next to the chat bar, then tap the video icon at the bottom left to start video. To share the screen, they can tap the three-dot menu and select ‘Share Screen.’

The company said the rollout of this feature began on Thursday, December 12 and should be completed within the next week. However, not all users will have access right away. ChatGPT Enterprise and Edu subscribers won’t be able to use it until January, and there’s no timeline for availability in the EU, Switzerland, Iceland, Norway, or Liechtenstein.

In a recent demo on CBS’s ‘60 Minutes,’ OpenAI President Greg Brockman showed off the feature by having ChatGPT quiz Anderson Cooper on his knowledge of anatomy. As Cooper drew on a blackboard, ChatGPT correctly identified what he was drawing, commenting on the location and shape of the brain.

Despite these successes, the feature made a mistake while solving a geometry problem, showing it still has room for improvement.

The release of Advanced Voice Mode with vision was delayed multiple times, partly because OpenAI announced the feature before it was fully ready. Initially planned for a spring launch, it took several more months to finalise.

Along with the new vision capabilities, the company also introduced a festive ‘Santa Mode,’ which lets users select Santa’s voice as a preset in ChatGPT. This can be found by clicking the snowflake icon next to the prompt bar in the app.

Rivals like Google and Meta are working on similar features for their own AI systems. This week, Google introduced its Project Astra, which also offers real-time video analysis for Android users.

 
