SoylentNews is people

posted by hubie on Saturday March 08, @10:07AM   Printer-friendly
from the dystopia-is-now! dept.

https://arstechnica.com/ai/2025/03/users-report-emotional-bonds-with-startlingly-realistic-ai-voice-demo/

In late 2013, the Spike Jonze film Her imagined a future where people would form emotional connections with AI voice assistants. Nearly 12 years later, that fictional premise has veered closer to reality with the release of a new conversational voice model from AI startup Sesame that has left many users both fascinated and unnerved.

"I tried the demo, and it was genuinely startling how human it felt," wrote one Hacker News user who tested the system.
[...]
In late February, Sesame released a demo for the company's new Conversational Speech Model (CSM) that appears to cross over what many consider the "uncanny valley" of AI-generated speech.
[...]
"At Sesame, our goal is to achieve 'voice presence'—the magical quality that makes spoken interactions feel real, understood, and valued," writes the company in a blog post.
[...]
Sometimes the model tries too hard to sound like a real human. In one demo posted online by a Reddit user called MetaKnowing, the AI model talks about craving "peanut butter and pickle sandwiches."
[...]
"I've been into AI since I was a child, but this is the first time I've experienced something that made me definitively feel like we had arrived," wrote one Reddit user.
[...]
Many other Reddit threads express similar feelings of surprise, with commenters saying it's "jaw-dropping" or "mind-blowing."
[...]
Mark Hachman, a senior editor at PCWorld, wrote about being deeply unsettled by his interaction with the Sesame voice AI. "Fifteen minutes after 'hanging up' with Sesame's new 'lifelike' AI, and I'm still freaked out," Hachman reported.
[...]
Others have compared Sesame's voice model to OpenAI's Advanced Voice Mode for ChatGPT, saying that Sesame's CSM features more realistic voices, and others are pleased that the model in the demo will roleplay angry characters, which ChatGPT refuses to do.
[...]
Under the hood, Sesame's CSM achieves its realism by using two AI models working together (a backbone and a decoder) based on Meta's Llama architecture that processes interleaved text and audio. Sesame trained three AI model sizes, with the largest using 8.3 billion parameters (an 8 billion backbone model plus a 300 million parameter decoder) on approximately 1 million hours of primarily English audio.

[...] Despite CSM's technological impressiveness, advancements in conversational voice AI carry significant risks for deception and fraud. The ability to generate highly convincing human-like speech has already supercharged voice phishing scams, allowing criminals to impersonate family members, colleagues, or authority figures with unprecedented realism.
[...]
Unlike current robocalls that often contain tell-tale signs of artificiality, next-generation voice AI could eliminate these red flags entirely.
[...]
It has inspired some people to share a secret word or phrase with their family for identity verification.
[...]
OpenAI itself held back its own voice technology from wider deployment over fears of misuse.

Sesame sparked a lively discussion on Hacker News about its potential uses and dangers.
[...]
In one case, a parent recounted how their 4-year-old daughter developed an emotional connection with the AI model, crying after not being allowed to talk to it again.
[...]
The company says it plans to open-source "key components" of its research under an Apache 2.0 license, enabling other developers to build upon their work.
[...]
You can try the Sesame demo on the company's website, assuming that it isn't too overloaded with people who want to simulate a rousing [argument].

[Last link in article added by submitter.]



  • (Score: 3, Informative) by pkrasimirov (3358) on Saturday March 08, @11:19AM (#1395696)

    For people like me who don't want to actually talk to a machine, here's an example of someone who did it:

    https://www.youtube.com/watch?v=fgvRn86B5X0 [youtube.com]

    • (Score: 0) by Anonymous Coward on Saturday March 08, @01:44PM (#1395707)

      If this is a real test of Sesame, then it seems to me it's an update on the Jive filter.
    • (Score: 2) by AnonTechie (2275) on Saturday March 08, @09:11PM (#1395732)

      Well, I tried it out, and it seemed quite realistic, although the responses were limited. With time, this will definitely improve. However, I worry that such technology will be used to scam vulnerable people.

      --
      Albert Einstein - "Only two things are infinite, the universe and human stupidity, and I'm not sure about the former."