ElevenLabs Text-to-Voice AI Tools Now Available for Mobile

ElevenLabs is bringing its powerful AI voice tools to mobile. Previously, the company’s apps and voice libraries were available only via the Web. Now iOS and Android users can tap ElevenLabs tech on the go with a “faster, intuitive, more powerful experience built natively for mobile” rather than awkwardly through a mobile browser. Combining mobility with creativity, the app lets users create realistic voiceovers for social media or narrate video using ElevenLabs’ text-to-speech models, including Eleven v3, now in alpha, which lets users fine-tune vocalizations using tags. The company has also introduced a new voice assistant, 11ai.

Meta Unveils New AI Advertising Tools as Part of Advantage+

Meta Platforms has announced new generative AI features that produce video advertisements for marketers. As part of Meta’s Advantage+ ad suite, marketers can now use up to 20 product stills to create multi-scene video ads with music and text overlay. The company has also added generative AI voices and virtual try-ons to its Advantage+ marketer toolkit. The upgrades were announced at the Cannes Lions International Festival of Creativity, where Meta’s use of AI not only to create ads but also to algorithmically serve them to target audiences was a topic of conversation.

Google Search Live Features Conversational Voice Capability

Google has launched Search Live with voice input, a two-way conversational query function for exploring online resources. Presently available via the Google app for Android and iOS to U.S. users enrolled in Google Labs’ AI Mode experiment, Search Live is designed to handle complex, multi-part questions. Google suggests the new feature is “perfect for when you’re on the go or multitasking, like if you’re packing for a trip.” The discursive voice feature follows Google’s general rollout of AI Mode, recently launched to compete against products such as OpenAI’s ChatGPT Search and Perplexity AI.

Midjourney Launches V7 Image Generator with Voice Prompts

Generative AI program Midjourney has issued V7 in alpha, marking its first new model in almost a year. Notable updates include personalization turned on by default; users must first set it up, a process Midjourney says takes 5 minutes, and can then toggle it on or off at any time. Another new flagship feature, Draft Mode, lets users render lower resolution images at “half the cost and 10 times the speed,” according to Midjourney, which adds that “it’s so fast that we change the prompt bar to a ‘conversational mode’ when you’re using it on Web.” Draft Mode also supports voice prompts.

Sam Altman Reveals Plans to Simplify OpenAI’s Product Line

OpenAI has decided to simplify its product offerings. A month after announcing the in-development o3 as its next frontier model, the company has canceled it as a standalone release, explaining that it would be integrated into the upcoming GPT-5 instead. “A top goal for us is to unify o-series models and GPT-series models by creating systems that can use all our tools, know when to think for a long time or not, and generally be useful for a very wide range of tasks,” OpenAI co-founder and CEO Sam Altman wrote in a social media post this week. Expected to ship later this year, the GPT-5 models will incorporate voice, canvas, search, deep research and more, OpenAI says.

T-Mobile Launches Starlink-Based Mobile Service for Everyone

T‑Mobile is acting to eliminate mobile dead zones by launching T-Mobile Starlink, which it says is “the first and only space‑based mobile network in the U.S. that automatically connects to your phone in areas no cellular network reaches.” For now, the service offers SMS text messaging, with “data and voice calls coming later,” according to T-Mobile. The beta is open to everyone, “even Verizon and AT&T customers,” with registration required for free access through July, at which point added fees will kick in for all but those on the T-Mobile Go5G Next plan, on sale now for $150 per month.

CES: Google TV Integrates Gemini AI for a Conversational Feel

Google TV is incorporating Gemini AI to make it easier to converse with a voice assistant and to generate helpful onscreen information. New Google TV devices will feature an upgraded, Gemini-powered voice experience capable of handling more complex voice commands. “You and your family will be able to gather together and have a natural conversation with your TV,” Google announced at CES 2025, where it shared a preview of the new capabilities. The Gemini model also lets Google TV users create customized artwork, control smart home devices and get an overview of the day’s news.

OpenAI Announces $200 Monthly Subscription for ChatGPT Pro

OpenAI has launched ChatGPT Pro, a $200 per month subscription plan that provides unlimited access to the full version of o1, its new large reasoning model, and all other OpenAI models. The toolkit includes o1-mini, GPT-4o and Advanced Voice. It also includes the new o1 pro mode, “a version of o1 that uses more compute to think harder and provide even better answers to the hardest problems,” OpenAI explains, describing the high-end subscription plan as a path to “research-grade intelligence” for scientists, engineers, enterprises, academics and others who use AI to accelerate productivity.

Hume AI Introduces Voice Control and Claude Interoperability

Artificial voice startup Hume AI has had a busy Q4, introducing Voice Control, a no-code artificial speech interface that gives users control over 10 voice dimensions ranging from “assertiveness” to “buoyancy” and “nasality.” The company also debuted an interface that “creates emotionally intelligent voice interactions” with Anthropic’s foundation model Claude. That release has prompted one observer to ponder the possibility that keyboards will become a thing of the past when it comes to controlling computers. Both advances expand on Hume’s work with its own foundation model, Empathic Voice Interface 2 (EVI 2), which adds emotional timbre to AI voices.

Nvidia AI Model Fugatto a Breakthrough in Generative Sound

Nvidia has unveiled an AI sound model research project called Fugatto that “can create any combination of music, voices and sounds” based on text and audio inputs. Nvidia describes it as “the world’s most flexible sound machine,” and many appear to agree that the new model represents an audio breakthrough, with the potential to generate a wide array of sounds that have not previously existed. While popular sound models from companies including Suno and ElevenLabs “can compose a song or modify a voice, none have the dexterity of the new offering,” Nvidia claims.

DeepL Voice Translates 33 Languages to Captions in Real Time

DeepL, a German company that gained a profile with online text translation, has released DeepL Voice, a B2B tool that translates speech to captions in real time. DeepL Voice debuts in two iterations: DeepL Voice for Meetings, which allows participants to speak in their preferred language while serving colleagues with captions, and DeepL Voice for Conversations, which works on mobile devices, facilitating in-person, one-on-one conversations “with customers, colleagues or anyone else, in the language that works best for them,” the company explains, noting that real-time voice translation poses specific challenges.

Runway’s Act-One Facial Capture Could Be a ‘Game Changer’

Runway is launching Act-One, a motion capture system that uses video and voice recordings to map human facial expressions onto characters using the company’s latest model, Gen-3 Alpha. Runway calls it “a significant step forward in using generative models for expressive live action and animated content.” Compared to past facial capture techniques, which typically require complex rigging, Act-One is driven directly and only by the performance of an actor, requiring “no extra equipment,” making it more likely to capture and preserve an authentic, nuanced performance, according to the company.

Microsoft’s Copilot AI Assistant Update Adds Voice and Vision

Microsoft announced that its Copilot AI assistant has received a major overhaul, gaining voice and vision capabilities. Copilot also now has a virtual news reader mode to present headlines, as well as the ability to see what you see and to interact in a more conversational manner. Before a general release, these tools will be trialed among a subset of Copilot Pro users “to gather feedback” and make them “better and safer.” Microsoft AI Executive VP and CEO Mustafa Suleyman says the changes herald “a calmer, more helpful and supportive era of technology, quite unlike anything we’ve seen before.”

Amazon Is Inviting Audible Narrators to Create AI Voice Clones

Amazon is aiming to speed up production of its Audible audiobooks by inviting a small group of narrators to clone their voices using generative artificial intelligence. The U.S. beta test will roll out later this year, according to Amazon, which announced the move on Audible’s creator marketplace. “There is a vast catalog of books that does not yet exist in audio and as we explore ways to bring more books to life on Audible, we’re committed to thoughtfully balancing the interests of authors, narrators, publishers, and listeners,” Amazon explains.

ElevenLabs Reader App Is Available Globally in 32 Languages

New York-based ElevenLabs is going global with its generative AI text-to-speech reader app, which can narrate writings in 32 languages with thousands of voices from which to choose. The audio startup promises “high quality, human-like” AI voices that are “emotionally and contextually aware,” adapting delivery of written cues “to achieve a high emotional range.” ElevenLabs has focused on “creative workflow,” with a voice isolator and audio effects generator tools. Its catalog includes the voices of celebrities Judy Garland, Laurence Olivier, James Dean and Burt Reynolds. Custom models for translation and voiceover work using contemporary actors are a future possibility.