September 27, 2023
Spotify is using AI to translate podcasts into other languages in what sounds like the podcaster’s own voice, a capability with obvious implications for film and television dubbing. Working with podcast notables including Dax Shepard, Monica Padman and Bill Simmons, Spotify used AI to mimic their voices in Spanish, French and German for several episodes. The proprietary Spotify technology uses OpenAI’s new text-to-speech voice-generation technology as well as OpenAI’s open-source Whisper speech recognition system, which transcribes spoken words into text. The result, Spotify says, is “more authentic” and “more personal and natural” than traditional dubbing.
“Currently, Spotify says, it has more than 100 million podcast listeners globally and more than 5 million podcast titles available in 170-plus markets,” writes Variety, which adds that Spotify is “the largest U.S. podcast network based on reach from Q4 2022 to Q2 2023, according to Edison Research.”
Forbes writes that “other companies have started to use generative AI for its products in recent months,” noting that Meta Platforms, an active AI developer, earlier this year unveiled AudioCraft, “a tool that allows users to create AI-generated music and sounds.”
YouTube announced in August that it is teaming with Universal Music Group to explore and assess frameworks for compensating artists for generative AI music that is derivative of or based on a particular performer’s voice, instrumental sound or songs.
According to Forbes, OpenAI is “gradually releasing its image and voice capabilities for ChatGPT” while warning of deepfake risks, “including ‘the potential for malicious actors to impersonate public figures or commit fraud.’”