New Tech from MIT, Adobe Advances Generative AI Imaging

Researchers from the Massachusetts Institute of Technology and Adobe have unveiled a new AI acceleration technique that makes generative apps like DALL-E 3 and Stable Diffusion up to 30x faster by reducing image generation to a single step. The new approach, called distribution matching distillation, or DMD, maintains or enhances image quality while greatly streamlining the process. The technique "marries the principles of generative adversarial networks (GANs) with those of diffusion models," consolidating "the hundred steps of iterative refinement required by current diffusion models" into one step, says MIT PhD student and project lead Tianwei Yin.
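The speedup claim follows from the step count: a conventional diffusion sampler evaluates the network roughly a hundred times per image, while a distilled student evaluates it once. The toy NumPy sketch below illustrates only that structural difference; the "networks" are random linear maps and the actual DMD training objective (matching the student's output distribution to the teacher's, with a GAN-style critic) is omitted entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the real networks; weights are random, purely illustrative.
W_teacher = rng.standard_normal((64, 64)) * 0.01
W_student = rng.standard_normal((64, 64)) * 0.01

def teacher_denoise_step(x):
    """One iterative refinement step of a 'diffusion teacher' (toy linear update)."""
    return x - 0.05 * (W_teacher @ x)

def teacher_sample(noise, steps=100):
    """Conventional diffusion sampling: ~100 sequential network evaluations."""
    x = noise
    for _ in range(steps):
        x = teacher_denoise_step(x)
    return x

def student_sample(noise):
    """A DMD-style distilled student maps noise to an image in ONE forward pass."""
    return noise - (W_student @ noise)

noise = rng.standard_normal(64)
slow = teacher_sample(noise)   # 100 evaluations
fast = student_sample(noise)   # 1 evaluation -- the source of the speedup claim
print(slow.shape, fast.shape)
```

The cost ratio is simply evaluations per image, which is why collapsing the refinement loop, rather than making each step faster, yields an order-of-magnitude gain.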

Stability AI Advances Image Generation with Stable Cascade

Stability AI, purveyor of the popular Stable Diffusion image generator, has introduced an entirely new model called Stable Cascade. Now in preview, Stable Cascade uses a different architecture from Stable Diffusion's SDXL, one the UK company's researchers say is more efficient. Cascade builds on a compression architecture called Würstchen (German for "sausage") that Stability began sharing in research papers early last year. Würstchen is a three-stage process that includes two-step encoding. It uses fewer parameters, which means less data to train on, greater speed and reduced costs.
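The efficiency argument behind a cascade is that the expensive generative model runs in a heavily compressed latent space, with lighter decoder stages restoring full resolution afterward. The sketch below is a toy stand-in, assuming average-pooling in place of Würstchen's learned encoder stages, just to show how aggressive the compression can be.

```python
import numpy as np

def encode(image, factor=32):
    """Toy encoder stage: average-pool the image down by `factor` (e.g. 512 -> 16).
    A real cascade uses learned networks, not pooling."""
    h, w = image.shape
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def decode(latent, factor=32):
    """Toy decoder stage: nearest-neighbor blow-up back to full resolution."""
    return latent.repeat(factor, axis=0).repeat(factor, axis=1)

image = np.random.default_rng(1).random((512, 512))
latent = encode(image)            # the generative model would operate here
restored = decode(latent)

compression = image.size / latent.size
print(latent.shape, compression)  # (16, 16) 1024.0
```

Generating in a latent space roughly a thousand times smaller than pixel space is what buys the reported speed and cost reductions; the exact compression factors here are illustrative, not Würstchen's.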

Stability AI Intros Real-Time Text-to-Image Generation Model

Stability AI, developer of Stable Diffusion (one of the leading visual content generators, alongside Midjourney and DALL-E), has introduced SDXL Turbo, a new AI model that demonstrates more of the latent possibilities of the common diffusion generation approach: images that update in real time as the user's prompt updates. Regenerating on every prompt change was theoretically possible even with previous diffusion models, but increased efficiency of generation algorithms and the steady accretion of GPUs and TPUs in developers' data centers make the experience feel genuinely responsive, even magical.

AI-Powered Movies in Progress, Writing Makes Major Strides

In the not-so-distant future there will likely be services that let users choose plots, characters and locations, which are then fed into an AI-powered transformer to produce a fully customized movie. The idea of using generative artificial intelligence to create content goes back to DeepDream, the 2015 computer vision program by Google engineer Alexander Mordvintsev. Bringing that fantasy closer to reality is the AI system GPT-3, which creates convincingly coherent and interactive writing, often fooling the experts.

Amazon Developing AI System for Trying on Clothes Virtually

At Amazon Lab126, researchers proposed three related AI algorithms to create Outfit-VITON, an image-based virtual try-on system for apparel. The algorithms could form the basis of an assistant that helps a customer shop for clothes by describing a product's variations, recommending items that go with the one selected, and synthesizing the image of a model wearing the clothes to show how all the items work together as an outfit. The algorithms will be presented at the annual IEEE Conference on Computer Vision and Pattern Recognition (CVPR), which will be held virtually this year, June 14-19.

Google and IBM Create Advanced Text-to-Speech Systems

Both IBM and Google recently advanced development of text-to-speech (TTS) systems that create high-quality digital speech. OpenAI found that, since 2012, the compute power needed to train the largest AI models has grown more than 300,000-fold. IBM created a much less compute-intensive model for speech synthesis, stating that it can run in real time and adapt to new speaking styles with little data. Google and Imperial College London created a generative adversarial network (GAN) that produces high-quality synthetic speech.
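A 300,000-fold increase implies a strikingly short compute doubling time. A quick back-of-envelope check, assuming (as a labeled assumption) a roughly six-year span from 2012, lands near four months per doubling; OpenAI's own published estimate, over a slightly different span, was about 3.4 months.

```python
import math

growth = 300_000          # compute growth factor cited since 2012
span_months = 6 * 12      # assumed span, 2012 to ~2018 (not stated in the article)

doublings = math.log2(growth)            # number of doublings in the span
doubling_time = span_months / doublings  # months per doubling
print(round(doublings, 1), round(doubling_time, 1))  # 18.2 4.0
```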

Machine Learning Is Being Used to Upscale Classic Games

Gamers have discovered a way to use machine learning to improve the graphics of older games. Called "AI upscaling," the technique uses an algorithm to take a low-resolution image and, based on training data, generate a version with more pixels. Although upping the resolution of images is not new, machine learning has improved both the speed and the quality of the end result. On the r/GameUpscale subreddit, which is moderated by Norwegian teacher and student Daniel Trolie, users share "tips and tricks" on the practice.
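The basic operation is the same whether the upscaler is classic or learned: a small pixel grid becomes a larger one. The difference is how the new pixels are filled in. The hypothetical sketch below contrasts a naive pixel-replication baseline with a stand-in for a trained super-resolution network; the "ML" function here just smooths, whereas a real model predicts plausible detail learned from training data.

```python
import numpy as np

def nearest_upscale(img, scale=2):
    """Classic non-ML baseline: replicate each pixel `scale` times in both axes."""
    return img.repeat(scale, axis=0).repeat(scale, axis=1)

def toy_ml_upscale(img, scale=2):
    """Hypothetical stand-in for a trained super-resolution network: upscale,
    then lightly smooth interior pixels. A real model would synthesize detail,
    not merely interpolate."""
    big = nearest_upscale(img, scale).astype(float)
    out = big.copy()
    out[1:-1, 1:-1] = (big[1:-1, 1:-1] + big[:-2, 1:-1] + big[2:, 1:-1]
                       + big[1:-1, :-2] + big[1:-1, 2:]) / 5.0
    return out

low_res = np.arange(16, dtype=float).reshape(4, 4)
high_res = toy_ml_upscale(low_res)
print(low_res.shape, high_res.shape)  # (4, 4) (8, 8)
```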

Amazon Creates AI-Based Tools for Spotting Fashion Trends

Amazon is developing systems based on artificial intelligence algorithms that are aimed at spotting fashion trends and, eventually, shaping them. The effort could boost Amazon's sales in clothing, perhaps even gaining a dominant position in fashion. The e-commerce giant isn't alone in making recommendations based on products appearing in social media, and highlighting the resulting trends; Instagram and Pinterest also pinpoint trends and react quickly to them, as does startup subscription service Stitch Fix.