Adobe Adds Generative Audio and Text-to-Avatar to Firefly AI

Adobe has updated its Firefly Video model with Generate Sound Effects, now in beta, and a text-to-avatar feature that lets users turn scripts into avatar-led videos “in just a few clicks.” Firefly becomes the second video model to generate audio, joining Veo 3, although unlike Google’s AI video tool, Firefly does not yet generate dialogue. What it can do is output foley-like sound and sound effects, while text-to-avatar can generate speech. As with Firefly’s generative visuals, Adobe says Generate Sound Effects is “commercially safe,” which means it is trained only on licensed or publicly available material.

OpenAI Contracts Google Cloud and Debuts ChatGPT Agent

OpenAI is adding Google Cloud to its list of global infrastructure providers for ChatGPT. The company relied exclusively on Microsoft Azure from the chatbot’s 2022 launch until January 2025, when Stargate was announced. Oracle and CoreWeave are also OpenAI cloud providers. Oracle is a Stargate investor, as is Nvidia, which holds a minority interest in CoreWeave. OpenAI has been active as it heads toward a December deadline for transitioning to a for-profit company. Meanwhile, ChatGPT is integrating a payment system to receive commissions on sales it initiates, and yesterday OpenAI launched a new AI agent that can perform complex tasks within a user’s browser.

Google Offers Gemini AI Subscribers Photo-to-Video Function

Google has added photo-to-video capability to its Gemini AI app. Powered by Veo 3, Google’s latest generative video model, launched in May, Gemini AI can now turn images into 8-second videos complete with AI-generated sound including speech, environmental sounds and background noises. Available now via the Web to anyone with a $20 per month Google AI Pro subscription or those on the $125 per quarter Google AI Ultra plan, the new feature is also being released to mobile users this month for both iOS and Android devices. The videos are finished as 720p resolution MP4 files in 16:9 landscape format.

Runway AI Intros Game Worlds Generator in Limited Preview

AI startup Runway has a new tool called Game Worlds that lets users generate simple video game worlds using images and text-based prompts. At the moment, Runway Game Worlds can only help generate simple text-based interactive adventures that include pictures, but the company has plans to enable more complex game creation by the end of the year. Runway CEO Cristóbal Valenzuela says the company is interested in partnering with video game companies who are willing to provide game data that can be used to train the company’s models in exchange for generative capabilities.

Alibaba’s Qwen VLo Generative AI Shows Images in Progress

Chinese e-commerce giant Alibaba has released a new multimodal model called Qwen VLo that can understand and generate images. Available for free in preview through Qwen Chat, it can use image or text prompts to generate pictures, and accepts text in multiple languages, including Chinese and English. It can also edit, change backgrounds and switch styles, handling multiple image edits in sequence. An upgrade over January’s Qwen 2.5-VL release, Qwen VLo uses progressive generation, allowing users to see the image creation in progress, and Alibaba says it’s particularly good at making inline adjustments to fine-tune images.

TikTok Offers Bulletin Boards for Direct-to-Many Broadcasts

In an effort to be more brand and creator friendly, TikTok is launching a broadcast channel feature called Bulletin Boards that shares in-app message updates. Essentially serving as one-to-many DM chats that fans can follow, Bulletin Boards can include text, images and video, with text limited to 1,000 characters and 20 bulletins daily. While fans can react by posting emoji to Bulletin Board posts, they cannot otherwise reply. The move comes as TikTok seeks to expand its brand toolkit, even updating its Symphony advertising suite to allow brands to create content that mimics material posted by influencers.

Google Bows Gemini Command Line Interface for Developers

In a move to attract more developers to Gemini, Google is releasing an open-source command line interface (CLI) that will be free for most developers. CLIs offer a means to communicate with operating systems, and can be used as alternatives or complements to an integrated development environment (IDE). Gemini CLI has agentic capabilities and can code and “so much more,” according to Google, which lists content generation, problem solving, deep research and task management among its uses. Gemini CLI provides “lightweight access to Gemini, giving you the most direct path from your prompt to our model.”

Meta Unveils New AI Advertising Tools as Part of Advantage+

Meta Platforms has announced new generative AI features developed for marketers that produce video advertisements. Announced as part of Meta’s Advantage+ ad suite, the features let marketers use up to 20 product stills to create multi-scene video ads with music and text overlay. The company has also added generative AI voices and virtual try-ons to its Advantage+ marketer toolkit. The upgrades were announced at the Cannes Lions International Festival of Creativity, where Meta’s use of AI not only to create ads but to algorithmically serve them to target audiences was a topic of conversation.

Google Gemini Robotics On-Device Controls Robots Locally

Google DeepMind has released a new vision-language-action (VLA) model, Gemini Robotics On-Device, that can operate robots locally, controlling their movements without requiring an Internet connection or the cloud. Google says the software provides “general-purpose dexterity and fast task adaptation,” building on the March release of the first Gemini Robotics VLA model, which brought “Gemini 2.0’s multimodal reasoning and real-world understanding into the physical world.” Since the model operates independently of a data network, it’s useful for latency-sensitive applications as well as low- or no-connectivity environments. Google is also releasing a Gemini Robotics SDK for developers.

Google Search Live Features Conversational Voice Capability

Google has launched Search Live with voice input, a two-way conversational query function for exploring online resources. Presently available via the Google app for Android and iOS to U.S. users enrolled in Google Labs’ AI Mode experiment, Search Live is designed to handle complex, multi-part questions. Google suggests the new feature is “perfect for when you’re on the go or multitasking, like if you’re packing for a trip.” The discursive voice feature follows Google’s general rollout of AI Mode, recently launched to compete against products such as OpenAI’s ChatGPT Search and Perplexity AI.

Adobe Unveils Firefly Generative AI App for iOS and Android

The redesigned Firefly AI app Adobe released in April with third-party model support is now available on iOS and Android. Text-to-video and background editing are among the features included in the new mobile package, which Adobe claims will help users capture inspiration as it strikes with “the freedom to generate images and videos wherever you are.” Adobe says users of all skill levels will be able to use the app, which was designed “to complement the ways we already interact with our phones.” The company is also rolling out its AI-powered online moodboard creator, Firefly Boards, in public beta, now with video functionality.

Google Is Testing ‘Hosted’ GenAI Audio Summaries in Search

Google is testing podcast-like audio search summaries generated by AI. Audio Overviews uses Google’s latest Gemini models to generate “quick, conversational audio overviews for certain search queries.” It can be enabled through Google Labs, the company’s public-facing portal to AI experiments. An Audio Overview “can help you get a lay of the land, offering a convenient, hands-free way to absorb information,” Google says, noting that the feature displays search results “right within the audio player” to make it easy to delve further. Google already had AI audio summaries in NotebookLM and Gemini. Like those, Search features AI discussion “hosts.”

Meta Rolls Out AI Video Editor Available via App and the Web

Meta is launching a generative AI video editing tool, available in the Meta AI app, via the Meta.AI website and in the Edits app for Facebook and Instagram. Users are now able to transform 10 seconds of video using preset AI prompts that can change an outfit, location, style and more. The company says the feature is “inspired by” its Movie Gen models and promises it is the “first step toward our goal of bringing you AI video generation and editing across our apps and products,” with Meta AI video editing able to handle individualized text prompts later this year. The free tool is now available in the U.S. and about a dozen countries around the world.

Qualcomm Chip Could Be a ‘Breakthrough’ for Smart Glasses

Qualcomm has made no secret of its belief that smart glasses are going to be a significant future product, and during the Augmented World Expo in Long Beach, California, this week, the chipmaker shared its vision for the sector, demonstrating eyewear using its new Snapdragon processor. According to the company, the AR1+ Gen 1 is 26 percent smaller than earlier chips and runs artificial intelligence tools independently of Internet or smartphone connectivity. Qualcomm’s goal is to help smart glasses become “fully independent devices” that can do processing and complete agentic tasks with or without connectivity.

Apple visionOS 26 Features Spatial Widgets, Better Avatars

During WWDC at Apple Park in California this week, Apple unveiled visionOS 26 updates for its mixed reality Vision Pro headset that will up the ante for both consumer and enterprise users, with new spatial widgets and more realistic avatar Personas among the noteworthy updates. The customizable widgets will appear to blend into a headset wearer’s physical environment, “integrating seamlessly into a user’s space” and reappearing exactly where the user left them each time the Apple Vision Pro is activated. A great deal of effort has gone into improved iPhone integration, including the ability to initiate calls directly from the headset.