Microsoft’s new language model Vall-E is reportedly able to imitate any voice using just a three-second sample recording.
The recently unveiled AI tool was trained on 60,000 hours of English speech data. In a paper posted to the arXiv preprint server, which is hosted by Cornell University, the researchers said the model could replicate the emotions and tone of a speaker.
That reportedly held true even when the model generated recordings of words the original speaker never actually said.
“Vall-E emerges in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. Experiment results show that Vall-E significantly outperforms the state-of-the-art zero-shot [text to speech] system in terms of speech naturalness and speaker similarity,” the authors wrote. “In addition, we find Vall-E could preserve the speaker’s emotion and acoustic environment of the acoustic prompt in synthesis.”
The Vall-E samples shared on GitHub are eerily similar to the speaker prompts, though they vary in quality.
In one synthesized sentence from the Emotional Voices Database, Vall-E sleepily says the sentence: “We have to reduce the number of plastic bags.”
However, the research in text-to-speech AI comes with a warning.
“Since Vall-E could synthesize speech that maintains speaker identity, it may carry potential risks in misuse of the model, such as spoofing voice identification or impersonating a specific speaker,” the researchers say on that web page. “We conducted the experiments under the assumption that the user agree to be the target speaker in speech synthesis. When the model is generalized to unseen speakers in the real world, it should include a protocol to ensure that the speaker approves the use of their voice and a synthesized speech detection model.”
At the moment, Vall-E, which Microsoft calls a “neural codec language model,” is not available to the public.
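For readers curious what a "neural codec language model" means in practice: the paper describes converting audio into discrete codec tokens and then having a language model predict new tokens, conditioned on the text to be spoken and the 3-second acoustic prompt. Below is a minimal, hypothetical Python sketch of that idea. The class names, frame rate, and codebook size are illustrative assumptions for the sketch, not Microsoft's actual code, which has not been released.

```python
# Hypothetical sketch of the "neural codec language model" idea behind Vall-E:
# text plus a short acoustic prompt are mapped to discrete audio-codec tokens,
# and a language model continues the token sequence autoregressively.
# CodecTokenizer and TokenLM are illustrative stand-ins, not a real API.

import random

CODEBOOK_SIZE = 1024       # illustrative codebook size for a neural audio codec
PROMPT_SECONDS = 3         # Vall-E conditions on a 3-second enrolled recording
FRAMES_PER_SECOND = 75     # assumed codec frame rate, for illustration only


class CodecTokenizer:
    """Stand-in for a neural audio codec: waveform <-> discrete token ids."""

    def encode(self, waveform_seconds: float) -> list[int]:
        # A real codec emits learned codes per audio frame; we fake them here.
        n_frames = int(waveform_seconds * FRAMES_PER_SECOND)
        return [random.randrange(CODEBOOK_SIZE) for _ in range(n_frames)]

    def decode(self, tokens: list[int]) -> str:
        return f"<waveform reconstructed from {len(tokens)} codec tokens>"


class TokenLM:
    """Stand-in autoregressive language model over codec tokens."""

    def next_token(self, text: str, context: list[int]) -> int:
        # A real model would run a transformer here; we sample randomly.
        return random.randrange(CODEBOOK_SIZE)


def synthesize(text: str, prompt_tokens: list[int], seconds: float) -> list[int]:
    """Continue the acoustic prompt so the output keeps the prompt's voice."""
    lm = TokenLM()
    tokens = list(prompt_tokens)           # the prompt anchors speaker identity
    for _ in range(int(seconds * FRAMES_PER_SECOND)):
        tokens.append(lm.next_token(text, tokens))
    return tokens[len(prompt_tokens):]     # return only the newly generated speech


if __name__ == "__main__":
    codec = CodecTokenizer()
    prompt = codec.encode(PROMPT_SECONDS)  # the 3-second sample of the speaker
    out = synthesize("We have to reduce the number of plastic bags.", prompt, 2.0)
    print(codec.decode(out))
```

Because the model only continues a token sequence that begins with the speaker's own recording, the output tends to inherit that speaker's voice, emotion, and acoustic environment, which is exactly what makes the misuse risks above so acute.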