SoylentNews is people

ChatGPT update enables its AI to “see, hear, and speak,” according to OpenAI

Accepted submission by Freeman at 2023-09-26 18:09:08 from the I'm sorry Dave I'm afraid I can't do that dept.
News

https://arstechnica.com/information-technology/2023/09/chatgpt-goes-multimodal-with-image-recognition-and-speech-synthesis/ [arstechnica.com]

On Monday, OpenAI announced [openai.com] a significant update to ChatGPT that enables its GPT-3.5 and GPT-4 AI models to analyze images and react to them as part of a text conversation. Also, the ChatGPT mobile app will add speech synthesis options that, when paired with its existing speech recognition features, will enable fully verbal conversations with the AI assistant, OpenAI says.

OpenAI is planning to roll out these features in ChatGPT to Plus and Enterprise subscribers "over the next two weeks." It also notes that speech synthesis is coming to iOS and Android only, and image recognition will be available on both the web interface and the mobile apps.
[...]
Despite their drawbacks, in marketing materials, OpenAI is billing these new features as giving ChatGPT the ability to "see, hear, and speak." Not everyone is happy about the anthropomorphism and potential hype language involved. On X, Hugging Face AI researcher Dr. Sasha Luccioni posted [x.com], "The always and forever PSA: stop treating AI models like humans. No, ChatGPT cannot 'see, hear and speak.' It can be integrated with sensors that will feed it information in different modalities."

While ChatGPT and its associated AI models are clearly not human—and hype is a very real thing in marketing—if the updates perform as shown, they potentially represent a significant expansion in capabilities for OpenAI's computer assistant.

