Google Takes New Approach to Create Video with Lumiere AI

Google has come up with a new approach to high-resolution AI video generation with Lumiere. Most GenAI video models output individual high-resolution frames at various points in the sequence (called “distant keyframes”), fill in the missing frames with low-res images to create motion (known as “temporal super-resolution,” or TSR), then up-res that connective tissue of non-overlapping frames (“spatial super-resolution,” or SSR). Lumiere instead uses what Google calls a “Space-Time U-Net architecture,” which processes all frames at once, “without a cascade of TSR models, allowing us to learn globally coherent motion.”

Social Startup Plai Labs Debuts Free Text-to-Video Generator

The entrepreneurs behind the Myspace social network and gaming company Jam City have shifted their focus to generative AI and web3 with a new venture, Plai Labs, a social platform that provides AI tools for collaboration and connectivity. Plai Labs has released a free text-to-video generator, PlaiDay, which will compete with other GenAI image and video tools from the likes of OpenAI (DALL-E 2), Google (Imagen), Meta Platforms (Make-A-Video) and Stability AI (Stable Diffusion). But PlaiDay hopes to set itself apart by offering the ability to personalize videos with selfie likenesses.

Google Introduces an AI Watermark That Cannot Be Removed

Google DeepMind and Google Cloud have teamed up to launch what they claim is an indelible AI watermark tool, which, if it works, would mark an industry first. Called SynthID, the technique for identifying AI-generated images is being launched in beta. The technology embeds its digital watermark “directly into the pixels of an image, making it imperceptible to the human eye, but detectable for identification,” according to DeepMind. SynthID is being released to a limited number of Google’s Vertex AI customers using Imagen, a Google text-to-image model that generates photorealistic images.

Meta In-House Chip Designs Include Processing for AI, Video

Meta Platforms has shared additional details on its next generation of AI infrastructure. The company has designed two custom silicon chips, including one for training and running AI models and eventually powering metaverse functions like virtual reality and augmented reality. Another chip is tailored to optimize video processing. Meta publicly discussed its internal chip development last week ahead of a Thursday virtual event on AI infrastructure. The company also showcased an AI-optimized data center design and talked about phase two of deployment of its 16,000 GPU supercomputer for AI research.

Google Announces Wide Range of New Products, AI Features

While the much-anticipated unveiling of the $1,799 Pixel Fold is generating headlines after yesterday’s Google I/O developer conference, the company made a slew of other announcements, including the $500 Pixel Tablet, the midrange Pixel 7a, AI functionality for Google Search and Android, an AI-powered editing feature for Google Photos, an improved Wear OS 4 (available later this year), and a redesigned Google Home app (available today). In addition, the company announced that its AI-powered chatbot Bard is now available to everyone, whether or not they were on the waitlist. We’ve compiled a helpful list of new products and features, along with links to reviews and related news.

Google and Meta Are Developing AI Text-to-Video Generators

AI image generators like OpenAI’s DALL-E 2 and Google’s Imagen have been generating a lot of attention recently. Now AI text-to-video generators are edging into the spotlight, with Google debuting Imagen Video on the heels of Meta AI’s Make-A-Video rollout last month. Imagen Video has been used to generate videos of up to 5.3 seconds (128 frames) at 24 fps and 1280×768 pixels. Imagen Video was trained “on a combination of an internal dataset consisting of 14 million video-text pairs and 60 million image-text pairs,” resulting in some unusual functionality, according to Google Research.

Stability AI Releases Stable Diffusion Text-to-Image Generator

Stability AI is in the first stage of release of Stable Diffusion, a text-to-image generator similar in functionality to OpenAI’s DALL-E 2, with one important distinction: this open-source newcomer lacks the filters that prevent the earlier system from creating images of public figures or content deemed excessively toxic. Last week the Stable Diffusion code was made available to just over a thousand researchers, and the London-based startup anticipates a public release in the coming weeks. The unfettered unleashing of a powerful imaging system has stirred controversy in the AI community, raising ethical questions.

Businesses Experiment with DALL-E 2, Report Mixed Results

OpenAI’s powerful text-to-image generator DALL-E 2 is still in beta, but businesses are already testing it for commercial use. Apparel firm Stitch Fix has been using it to visualize fabric and color personalization, while Heinz tapped the AI system for a marketing campaign. Cosmopolitan used it to design a magazine cover. Others have leveraged the image engine to generate logos and thumbnails. These early adopters are identifying technical issues that OpenAI says it is addressing as it readies DALL-E 2 for enterprise. Foremost among the complaints is the lack of a dedicated API for public use.

Google’s Imagen AI Model Makes Advances in Text-to-Image

Google has released a research paper on a new text-to-image generator called Imagen, which combines the power of large transformer language models for text with the capabilities of diffusion models in high-fidelity image generation. “Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis,” the company said. Simultaneously, Google is introducing DrawBench, a benchmark for text-to-image models it says was used to compare Imagen with other recent technologies including VQGAN+CLIP, latent diffusion models, and OpenAI’s DALL-E 2.