By Paula Parisi, July 7, 2025
AI startup Runway has a new tool called Game Worlds that lets users generate simple video game worlds using images and text-based prompts. At the moment, Runway Game Worlds can only help generate simple text-based interactive adventures that include pictures, but the company has plans to enable more complex game creation by the end of the year. Runway CEO Cristóbal Valenzuela says the company is interested in partnering with video game companies who are willing to provide game data that can be used to train the company’s models in exchange for generative capabilities. Continue reading Runway AI Intros Game Worlds Generator in Limited Preview
By Paula Parisi, July 2, 2025
Chinese e-commerce giant Alibaba has released a new multimodal model called Qwen VLo that can understand and generate images. Available for free in preview through Qwen Chat, it can use image or text prompts to generate pictures, and accepts text in multiple languages, including Chinese and English. It can also edit, change backgrounds and switch styles, handling multiple image edits in sequence. An upgrade over January’s Qwen 2.5-VL release, Qwen VLo uses progressive generation, allowing users to see the image creation in progress, and Alibaba says it’s particularly good at making inline adjustments to fine-tune images. Continue reading Alibaba’s Qwen VLo Generative AI Shows Images in Progress
By Paula Parisi, June 3, 2025
Google is celebrating 10 years of Google Photos by introducing a redesign of the Photos editor, including helpful new tools. The Photos editor gets some AI editing features previously available only on Pixel phones as part of its generative AI Magic Editor. The Photos platform is also expanding access to its AI-powered text-to-image Reimagine tool, automatic framing and related features first introduced with the Pixel 9. The company says there are currently more than 1.5 billion monthly Photos users, who have stored more than 9 trillion photos and videos. The updates reflect Google’s AI push as it continues to integrate Gemini across its growing family of products and services. Continue reading Google Photos Rolling Out Redesign and New AI Editing Tools
By Paula Parisi, May 15, 2025
TikTok AI Alive is a new image-to-video feature that can add sequential expression to selfies and add progressive hues to sunsets. Accessible through the platform’s Story Camera, AI Alive uses intelligent editing tools that give anyone, regardless of experience, “the ability to transform static images into captivating, short-form videos enhanced with movement, atmospheric and creative effects.” TikTok says it is prioritizing safety and transparency by adding a label to AI Alive stories, which will also have C2PA metadata embedded, traveling with the content even when it’s downloaded and shared elsewhere. Continue reading TikTok Offering ‘AI Alive’ Image-to-Video Generator in Stories
By Paula Parisi, May 2, 2025
Online graphic design platform Freepik has unveiled F Lite, a text-to-image generator that the company says was trained only on licensed content, making it safe for commercial use. The 10 billion-parameter F Lite, currently available in two openly licensed versions, was developed in partnership with Fal.ai, a San Francisco-based AI startup that uses a proprietary inference engine and APIs to enable fast training, inference, and scaling of image, video, audio, and multimodal AI models. Freepik Head of AI Iván de Prado describes F Lite as “a significant milestone in open, responsible AI.” Continue reading Freepik Introduces a Responsibly Trained AI Image Generator
By Paula Parisi, March 27, 2025
OpenAI has activated the multimodal image generation capabilities of GPT-4o, making it available to ChatGPT users on the Plus, Pro, Team and Free tiers. It replaces DALL-E 3 as the default image generator for the popular chatbot. GPT-4o’s accuracy with text, understanding of symbols and precision with prompts, combined with multimodal capabilities that allow the model to take cues from visual material, have transformed its image capabilities from largely unpredictable to “consistent and context-aware,” resulting in “a practical tool with precision and power,” claims OpenAI. Continue reading OpenAI Delivers Native GPT-4o Image Generator to ChatGPT
By Paula Parisi, March 14, 2025
Snapchat has introduced AI Video Lenses for those paying $16 per month for its Platinum tier. Powered by Snap’s custom-built generative video model, the initial three releases are a fox that perches on your shoulder, rambunctious raccoons and a large bouquet of flowers with a zoom-out effect. After selecting an AI Video Lens and applying it to a Snap, the AI video generates in the background, auto-saving to Memories while users are free to continue messaging and Snapping on the app. The resulting video can be shared with friends or to Stories and Spotlight. Continue reading Snap Launches Generative AI Video Lenses for Platinum Subs
By Paula Parisi, February 28, 2025
Alibaba has open-sourced its Wan 2.1 video- and image-generating AI models, heating up an already competitive space. The Wan 2.1 family, which has four models, is said to produce “highly realistic” images and videos from text and images. The company has since December been previewing a new reasoning model, QwQ-Max, indicating it will be open-sourced when fully released. The move comes after another Chinese AI company, DeepSeek, released its R1 reasoning model for free download and use, triggering demand for more open-source artificial intelligence. Continue reading Highly Realistic Alibaba GenVid Models Are Available for Free
By Paula Parisi, February 24, 2025
Barely two weeks after the launch of its OmniHuman-1 AI model, ByteDance has released Goku, a new artificial intelligence model designed to create photorealistic video featuring humanoid actors. Goku uses text prompts to create, among other things, realistic product videos without the need for human actors. The latter is a boon for ByteDance’s social media unit, TikTok. Goku is open source, trained on a large dataset of roughly 36 million video-text pairs and 160 million image-text pairs. Goku’s debut is being received as more bad news for OpenAI in the form of added competition, but a positive step for global enterprise. Continue reading ByteDance’s Goku Video Model Is Latest in Chinese AI Streak
By Paula Parisi, February 7, 2025
Snap has created a lightweight AI text-to-image model that will run on-device and is expected to power some Snapchat mobile features in the months ahead. On an iPhone 16 Pro Max, the model can produce high-resolution images in approximately 1.4 seconds, and because it runs on the phone itself, it reduces computational costs. Snap says the research model “is the continuation of our long-term investment in cutting edge AI and ML technologies that enable some of today’s most advanced interactive developer and consumer experiences.” Among the Snapchat AI features the new model will enhance are AI Snaps and AI Bitmoji Backgrounds. Continue reading Snap Develops a Lightweight Text-to-Video AI Model In-House
By Paula Parisi, December 6, 2024
Google DeepMind’s new Genie 2 is a large foundation world model that generates interactive 3D worlds that are being likened to video games. “Games play a key role in the world of artificial intelligence research,” says Google DeepMind, noting “their engaging nature, challenges and measurable progress make them ideal environments to safely test and advance AI capabilities.” Based on a simple prompt image, Genie 2 is capable of producing “an endless variety of action-controllable, playable 3D environments” — suitable for training and evaluating embodied agents — that can be played by a human or AI agent using keyboard and mouse inputs. Continue reading DeepMind Genie 2 Creates Worlds That Emulate Video Games
By Paula Parisi, August 22, 2024
Google DeepMind has made its latest AI image generator, Imagen 3, free for use in the U.S. via the company’s ImageFX platform. Imagen 3 will be available in multiple versions, “each optimized for different types of tasks, from generating quick sketches to high-resolution images.” Google announced Imagen 3 at Google I/O in May, and in June made it available to enterprise users through Vertex. Using simplified natural language text input rather than “complex prompt engineering,” Google says Imagen 3 generates high-quality images in a range of styles, from photorealistic, painterly and textured to whimsically cartoony. Continue reading Google DeepMind Releases Imagen 3 for Free to U.S. Users
By Paula Parisi, August 20, 2024
ByteDance has debuted a text-to-video mobile app in its native China that is available on the company’s TikTok equivalent there, Douyin. The app is called Jimeng AI, and there is speculation that it will be coming to North America and Europe soon via TikTok or ByteDance’s CapCut editing tool, possibly beating competing U.S. technologies like OpenAI’s Sora to market. Jimeng (translation: “dream”) uses text prompts to generate short videos. For now, it responds only to prompts written in Chinese. In addition to entertainment, the app is described as applicable to education, marketing and other purposes. Continue reading ByteDance Intros Jimeng AI Text-to-Video Generator in China
By Paula Parisi, August 19, 2024
Grok-2 and Grok-2 mini, the latest generative chatbots from Elon Musk’s xAI, create images with seemingly few guardrails. Early pictures of notable personalities such as Bill Gates, Donald Trump and Kamala Harris in questionable or compromising settings may not appear photorealistic to a trained eye, but they are still described in many cases as quite realistic. Powered by the FLUX.1 AI model from Black Forest Labs, Grok-2 and Grok-2 mini are available in beta on the X social platform for Premium and Premium+ subscribers and will be coming to xAI’s enterprise API later this month, according to the company. Continue reading xAI’s Grok-2 Generates Realistic Images with Few Guardrails
By Paula Parisi, August 6, 2024
A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell]. Continue reading Black Forest Labs Announces Suite of Text-to-Image Models