Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio

Accepted submission by Freeman at 2023-01-10 15:04:25 from the My Voice is no longer my password dept.
News

On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person's voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything, and do it in a way that attempts to preserve the speaker's emotional tone.

Its creators speculate that VALL-E could be used for high-quality text-to-speech applications, for speech editing, where a recording of a person could be altered by editing its text transcript (making them say something they originally didn't), and for audio content creation when combined with other generative AI models like GPT-3.
