Stability AI Advances Image Generation with Stable Cascade

Stability AI, purveyor of the popular Stable Diffusion image generator, has introduced a completely new model called Stable Cascade. Now in preview, Stable Cascade uses a different architecture than Stable Diffusion’s SDXL that the UK company’s researchers say is more efficient. Cascade builds on a compression architecture called Würstchen (German for “sausage”) that Stability began sharing in research papers early last year. Würstchen is a three-stage process that includes two-step encoding. It uses fewer parameters, meaning less data to train on, greater speed and reduced costs. Continue reading Stability AI Advances Image Generation with Stable Cascade

VideoPoet: Google Launches a Multimodal AI Video Generator

Google has unveiled a new large language model designed to advance video generation. VideoPoet is capable of text-to-video, image-to-video, video stylization, video inpainting and outpainting, and video-to-audio. “The leading video generation models are almost exclusively diffusion-based,” Google says, citing Imagen Video as an example. Google finds this counter intuitive, since “LLMs are widely recognized as the de facto standard due to their exceptional learning capabilities across various modalities.” VideoPoet eschews the diffusion approach of relying on separately trained tasks in favor of integrating many video generation capabilities in a single LLM. Continue reading VideoPoet: Google Launches a Multimodal AI Video Generator

OpenAI Expands DALL-E 2 Functionality with Facial Uploads

OpenAI has begun allowing users of its DALL-E 2 image-generating system to work with facial image uploads. The program previously allowed only computer-generated faces in an effort to prevent deepfakes and misuse, but OpenAI says improvements to its safety system succeeded in “minimizing the potential of harm” from things like explicit, political or violent content. OpenAI will continue to prohibit use of unauthorized photos and will seek to protect right of publicity, though it remains to be seen how effective that will be. In the past, customers have complained the company was overzealous in its policing. Continue reading OpenAI Expands DALL-E 2 Functionality with Facial Uploads