Google Imagen 2 Now Generates 4-Second Clips on Vertex AI

During Google Cloud Next 2024 in Las Vegas, Google announced an updated version of its text-to-image generator, Imagen 2, on Vertex AI that can generate video clips of up to four seconds. Google calls this feature “text-to-live images,” and it essentially delivers animated GIFs at 24 fps and 360×640 pixel resolution, though Google says there will be “continuous enhancements.” Imagen 2 can also generate text, emblems and logos in different languages, and can overlay those elements on existing images such as business cards, apparel and products.

The improved version of Imagen 2 on Google’s Vertex AI developer platform follows the company’s February removal of the image generator from its Gemini AI platform, which included a public-facing web-based interface. Google released Imagen 2 in preview at Cloud Next 2023, putting it into general release on Vertex AI in December.

As a text-to-image generator, Imagen 2 competes with OpenAI’s DALL-E, Midjourney and Adobe Firefly. Its text-to-video capabilities also put it in competition with apps like Runway, Pika and Irreverent Labs, albeit at lower resolution.

TechCrunch calls text-to-live images “the real meat of the Imagen 2 upgrade,” and notes Google is “pitching live images as a tool for marketers and creatives, such as a GIF generator for ads showing nature, food and animals — subject matter that Imagen 2 was fine-tuned on.”

Google explains in a blog post that Imagen 2’s live images can achieve “a range of camera angles and motions.” Imagen is capable of “supporting consistency over the entire sequence,” while its default resolution of 1024×1024 for stills has prompted “organizations like Shutterstock and Rakuten [to] leverage Imagen 2 to generate high-quality, highly accurate images at enterprise scale.”

As to whether Imagen can viably compete with currently available video generation tools, TechCrunch suggests “not really,” explaining that “Runway can generate 18-second clips in much higher resolutions,” while Stability AI’s Stable Video Diffusion “offers greater customizability (in terms of frame rate).” OpenAI’s Sora, which is not yet commercially available, is “poised to blow away the competition with the photorealism it can achieve.”

“Instead of having a picture of an object, like a static picture of a car, you can see a short image like an animated moving vehicle,” Google Cloud CEO Thomas Kurian said during a press briefing reported in VentureBeat, adding that “many organizations, particularly in areas like media and advertising, are looking at it because it improves engagement with users.”

Imagen 2 was among many topics that made news at Google Cloud Next 2024, held April 9-11 in Las Vegas. Recaps of day one and day two events are available, as are some keynotes.
