SoylentNews
SoylentNews is people
https://soylentnews.org/

Title    Microsoft's New AI Can Simulate Anyone's Voice With Three Seconds of Audio
Date    Monday January 16 2023, @07:56AM
Author    Fnord666
from the my-voice-is-no-longer-my-password dept.
https://soylentnews.org/article.pl?sid=23/01/15/0541236

Freeman writes:

Text-to-speech model can preserve speaker's emotional tone and acoustic environment:

On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person's voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything—and do it in a way that attempts to preserve the speaker's emotional tone.

Its creators speculate that VALL-E could be used for high-quality text-to-speech applications, speech editing where a recording of a person could be edited and changed from a text transcript (making them say something they originally didn't), and audio content creation when combined with other generative AI models like GPT-3.


Original Submission

Links

  1. "Freeman" - https://soylentnews.org/~Freeman/
  2. "Text-to-speech model can preserve speaker's emotional tone and acoustic environment" - https://arstechnica.com/information-technology/2023/01/microsofts-new-ai-can-simulate-anyones-voice-with-3-seconds-of-audio/
  3. "VALL-E" - https://valle-demo.github.io/
  4. "GPT-3" - https://arstechnica.com/information-technology/2022/11/openai-conquers-rhyming-poetry-with-new-gpt-3-update/
  5. "Original Submission" - https://soylentnews.org/submit.pl?op=viewsub&subid=58127

