QuickVid Uses AI to Create Short Videos from Text Prompts

QuickVid is a new AI-driven text-to-video platform aiming for a mass-market user base. The tool draws on various generative AI systems to automatically create short-form videos for YouTube, Instagram, TikTok and other platforms. Created by former Meta Platforms programmer Daniel Habib “in a matter of weeks,” QuickVid is quite rudimentary, though Habib says he plans to continue fine-tuning it and adding features. Whereas Google and Meta introduced their nascent text-to-video systems through research papers and industry previews, QuickVid has bypassed those formalities and jumped directly to a public-facing website.

Google Brings Personalization Features to Your News Update

Google is adding new features to Your News Update, its news aggregation service, to personalize 90-minute news feeds from each user’s preferred sources. The goal is to create a seamless listening experience akin to a customized song playlist. Each news playlist, similar to those on public radio, will begin with short clips on the major headlines before moving into longer stories. The end product, available only in the U.S., will compile radio and podcast clips and text-to-speech translations tailored to the individual user.

Facebook Reveals New AI-Powered Text-to-Speech System

Facebook introduced an AI text-to-speech (TTS) system that produces a second of audio in 500 milliseconds. According to Facebook, the system, which is used with a new approach to data collection, powered the creation of a British accent-inflected voice in six months, versus the more than a year required for other voices. The TTS system is now used for Facebook’s Portal smart display brand. The system can be hosted in real time via ordinary processors and is also available as a service for other apps, including Facebook’s VR.
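The speed figure quoted above can be read as a real-time factor: synthesis time divided by the duration of the audio produced. The short Python sketch below works through that arithmetic using only the two numbers in the summary (0.5 seconds of compute per 1 second of audio). The function name `real_time_factor` is an illustrative assumption, not something taken from Facebook's system.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by audio duration (lower means faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


# Figures from the summary: one second of audio produced in 500 milliseconds.
rtf = real_time_factor(synthesis_seconds=0.5, audio_seconds=1.0)
speedup = 1.0 / rtf  # how many times faster than playback speed

print(f"real-time factor: {rtf:.2f}, speedup: {speedup:.1f}x")
```

A real-time factor below 1.0 means the system generates speech faster than it takes to play back, which is what makes real-time hosting plausible.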

Amazon Licenses Original Interactive Audio Series for Alexa

Amazon has inked an exclusive license for “Tala’s World,” a seven-episode young adult adventure series produced by audio startup Xandra, which has produced Alexa skills for HBO, Sesame Workshop and Ubisoft. In the new adventure series, listeners help elf-like character Blobby find his missing best friend Tala by making decisions, collecting clues, and interrogating suspects. The series is available exclusively on Alexa; Amazon recently released the first episode and plans to release the second on December 13.

Google and IBM Create Advanced Text-to-Speech Systems

Both IBM and Google recently advanced the development of text-to-speech (TTS) systems that create high-quality digital speech. OpenAI found that, since 2012, the compute power needed to train TTS models has grown more than 300,000-fold. IBM created a much less compute-intensive model for speech synthesis, stating that it can synthesize speech in real time and adapt to new speaking styles with little data. Google and Imperial College London created a generative adversarial network (GAN) to produce high-quality synthetic speech.

Publishers and Authors Guild Oppose Audible Text Feature

Audible, the audiobook app owned by Amazon, is using machine learning to transcribe audio recordings so listeners can read along with the narrator. Audible is promoting it as an educational feature, but some publishers are up in arms, demanding their books be excluded because captions are “unauthorized and brazen infringements of the rights of authors and publishers.” Publishers are concerned that fewer people will buy physical books or e-books if they can get the text with an Audible audiobook.

Text-to-Speech System Quickly Mimics Hundreds of Accents

As another example of the significant advances we have been following in artificial intelligence and deep learning, Chinese search giant Baidu has introduced Deep Voice 2, the second iteration of its compelling text-to-speech system. The company introduced Deep Voice just three months ago, with the ability to produce speech “in near real time” that was “nearly indistinguishable from an actual human voice,” according to The Verge. While the first system was limited to learning one voice at a time, “and required many hours of audio or more from which to build a sample,” the updated version “can learn the nuances of a person’s voice with just half an hour of audio, and a single system can learn to imitate hundreds of different speakers.”

Google Redoubles its Cloud Ambitions, Offering AI Programs

Cloud computing is booming, and Google is losing ground to Amazon and Microsoft. As the business of renting computer servers to outside businesses grows more lucrative, Google has decided to promote its artificial intelligence software to enterprise customers. Potential customers of Google’s cloud offering can now also take advantage of two software programs, one converting text to speech and the other extracting meaning from text, that until now have only been used internally. Rivals Amazon and Microsoft offer competing AI products.

VideoDubber Automatically Dubs Video into 30+ Languages

Foreign film fans may have a new reason to get excited. Israeli startup VideoDubber is introducing a new technology that could address complaints about subtitles in media content. The company claims that its TruDub technology can automatically dub films, TV shows and video into more than 30 languages, including Arabic, Chinese, Spanish, and four dialects of English. The service uses synthetic voices that it says sound natural because they are based on professional voice talent.

Newsbeat Creates Custom Radio Show Based on Your Interests

Last week the Tribune Company released a new iOS and Android app called Newsbeat, which aims to change how we consume our daily news by offering a more personalized, podcast-like experience. Newsbeat has access to more than 7,000 sources, from major newspapers to smaller blogs. Users can specify what types of stories and publications they are interested in, and the app will create a customized newscast using Pandora-like artificial intelligence technology.

New StorEbook Reader Uses Natural Voices to Tell Stories

The Web-based reader “StorEbook” has expanded on the idea of computers interacting with users via voice technology. During last week’s Foundry event, the audio book’s “voice synthesis engine” was demonstrated as it recited the classic tale “Goldilocks and the Three Bears.” The Web-based app, which uses AT&T’s Natural Voices, gives story characters multiple voices, adding a new dynamic to the idea of “story time.”

Mobile: Amazon Acquires Voice Recognition Company IVONA

Amazon has acquired IVONA Software for an undisclosed sum. Amazon already uses IVONA voice recognition software on the Kindle Fire, which helps users navigate the touchscreen and enables other voice commands. Amazon may now integrate the software into other Kindle products, and could also use the technology to create a competitor to Siri as rumors persist that Amazon could be working on a smartphone.