Meta AI Seamless Translator Converts Nearly 100 Languages

The research division of Meta AI has developed Seamless Communication, a suite of artificial intelligence models that generate what the company says is natural and authentic communication across languages, facilitating what amounts to real-time universal speech translation. The models were released with accompanying research papers and data. The flagship model, Seamless, merges capabilities from a trio of models (SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2) into a single system that can translate between almost 100 spoken and written languages, preserving idioms, emotion and the speaker’s vocal style, Meta says.

Spotify Uses AI to Copy Host Voices for Podcast Translations

Spotify is using AI to drive podcast language translation in what sounds like the podcaster’s own voice, which has obvious implications for film and television dubbing. Working with podcast notables including Dax Shepard, Monica Padman and Bill Simmons, Spotify used AI to mimic their voices in Spanish, French and German for several episodes. The proprietary Spotify technology uses OpenAI’s new text-to-speech voice-generation technology as well as its open-source Whisper speech recognition system, which transcribes spoken words into text. The result, Spotify says, is “more authentic” and “more personal and natural” than traditional dubbing.

Google Shows Off Impressive Range of AI at NY Media Event

Google Research is touting new advances in artificial intelligence, which can now generate its own code and write fiction, in addition to delivering better text-to-video generation and language translation. At a New York media event at Google’s Pier 57 office, which opened earlier this year as the company’s third Manhattan outpost, roughly a dozen projects in various stages of development were on display, with robot learning, LaMDA (language model for dialogue applications) and text-generated 3D images sharing the spotlight with practical AI for things like disaster management, weather forecasts and healthcare.

OpenAI Rolls Out Open-Source Speech Recognition System

OpenAI has released a new open-source AI speech recognition model called Whisper that can recognize and translate audio at levels it says compare in accuracy and robustness to human abilities. Use cases include transcription of speeches, interviews, podcasts and conversations. “Moreover, it enables transcription in multiple languages, as well as translation from those languages into English,” says OpenAI, which is open-sourcing models and inference code on GitHub “to serve as a foundation for building useful applications and for further research on robust speech processing.”

AI Is Still a Work in Progress When It Comes to Auto-Dubbing

Auto-dubbing, which uses artificial intelligence to translate content into different languages, is a technology on which the global entertainment industry has increasingly come to rely in finding audiences among the planet’s 7.2 billion people, speaking more than 7,000 languages in roughly 200 countries. Companies like Flawless, Deepdub and Papercup use different approaches to offload to computers much of the labor required to fill that distribution pipeline. Another company, Spherex, emphasizes cultural awareness and the need for heightened sensitivity in pursuit of hits that travel across borders.

IBM, Harvard University Develop New Tool for AI Translation

At the IEEE Conference on Visual Analytics Science and Technology in Berlin, IBM and Harvard University researchers presented Seq2Seq-Vis, a tool to debug machine translation tools. Translation tools rely on neural networks, which, because they are opaque, make it difficult to determine how mistakes were made. This opacity is known as the “black box problem.” Seq2Seq-Vis allows deep-learning app creators to visualize AI’s decision-making process as it translates a sequence of words from one language to another.

LinkedIn Unveils Language Translation Tool and QR Codes

LinkedIn is introducing two new features: the ability to use QR codes for quickly sharing profiles and contact details, and a “See Translation” button that will translate posts into different languages. Currently available for iOS and Android, the QR codes offer users a quick option for accessing someone’s profile or sharing their own code via messaging apps, email, websites or printed materials such as business cards, conference badges and company brochures. The translation tool, available for more than 60 languages, is offered through LinkedIn’s desktop and mobile web versions (and soon via iOS and Android).