New Tech from MIT, Adobe Advances Generative AI Imaging

Researchers from the Massachusetts Institute of Technology and Adobe have unveiled a new AI acceleration technique that makes generative image models like DALL-E 3 and Stable Diffusion up to 30x faster by reducing image generation to a single step. The new approach, called distribution matching distillation, or DMD, maintains or enhances image quality while greatly streamlining the process. The technique “marries the principles of generative adversarial networks (GANs) with those of diffusion models,” consolidating “the hundred steps of iterative refinement required by current diffusion models” into one step, MIT PhD student and project lead Tianwei Yin says.
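To make that consolidation concrete, here is a minimal sketch of how a conventional sampler's iterative loop collapses into a single forward pass. The `teacher_denoiser` and `one_step_student` callables are hypothetical stand-ins, and the shapes and step count are illustrative assumptions, not the paper's implementation.

```python
import torch

# Hypothetical stand-ins: the real teacher is a full diffusion model and
# the real student is the distilled one-step network.
teacher_denoiser = lambda x, t: x * 0.99    # pretend single denoising step
one_step_student = lambda z: torch.tanh(z)  # pretend one-step generator

z = torch.randn(1, 3, 64, 64)  # start from pure Gaussian noise

# Conventional diffusion sampling: ~100 iterative refinement steps.
x = z.clone()
for t in reversed(range(100)):
    x = teacher_denoiser(x, t)  # each pass removes a little noise

# DMD-style generation: one forward pass from noise to image.
x_fast = one_step_student(z)
```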

DMD could be a new generative modeling method that saves time while maintaining or improving quality, Yin explains in MIT News, which writes that the single-step DMD model “could enhance design tools, enabling quicker content creation and potentially supporting advancements in drug discovery and 3D modeling, where promptness and efficacy are key.”

Collaborators from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and Adobe have shared their findings in a technical paper on arXiv and a GitHub overview that includes copious comparative images.

“The traditional process of generating images using diffusion models has been complex and time-consuming, often requiring multiple iterations for the algorithm to produce satisfactory results,” writes TechTimes, noting the new approach “leverages a teacher-student model, wherein a new computer model is trained to mimic the behavior of more complex, original models that generate images.”
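As a rough illustration of that teacher-student setup, the sketch below trains a toy student to reproduce, in one pass, outputs produced by a stubbed multi-step teacher. The network, optimizer settings, and `teacher_sample` helper are all assumptions for illustration, not the released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical one-step student: a tiny conv net mapping noise to an image.
student = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
)
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

def teacher_sample(z):
    """Stand-in for the slow multi-step teacher (a real diffusion model
    would run ~100 denoising steps here)."""
    with torch.no_grad():
        return torch.tanh(z)

for step in range(100):            # toy training loop
    z = torch.randn(8, 3, 64, 64)  # shared noise input
    target = teacher_sample(z)     # teacher's multi-step output
    pred = student(z)              # student's single-step mimicry
    loss = F.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```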

DMD utilizes two key components: “a regression loss and a distribution matching loss,” reports TechTimes, explaining that “the regression loss ensures stable training by anchoring the mapping process, while the distribution matching loss aligns the probability of generating images with their real-world occurrence frequency.”
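In code form, the dual objective might combine as follows. The MSE choice for the regression term, the `lambda_dm` weight, and the stubbed distribution matching term are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def regression_loss(student_out, teacher_out):
    # Anchors the student's noise-to-image mapping to the teacher's
    # output for the same noise, keeping training stable.
    return F.mse_loss(student_out, teacher_out)

def distribution_matching_loss(student_out):
    # Placeholder for the term that aligns generated images with the
    # real-image distribution; a fuller sketch of how two diffusion
    # models could estimate it follows the next paragraph.
    return student_out.pow(2).mean() * 0.0

lambda_dm = 0.25  # relative weight: an illustrative assumption
student_out = torch.randn(4, 3, 64, 64, requires_grad=True)
teacher_out = torch.randn(4, 3, 64, 64)
loss = regression_loss(student_out, teacher_out) + \
       lambda_dm * distribution_matching_loss(student_out)
loss.backward()
```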

That dual approach is further assisted by two diffusion models that speed generation by minimizing the divergence between the distributions of real and generated images. The results, TechSpot says, “are comparable to Stable Diffusion, but the speed is out of this world,” noting “the researchers claim their model can generate 20 images per second on modern GPU hardware.”
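One way to read that two-model setup, sketched under stated assumptions: a denoiser trained on real images and a second trained on the generator's own outputs disagree in a direction that tells the one-step generator how to shrink the divergence. Every name below (`real_denoiser`, `fake_denoiser`, the noise schedule) is a hypothetical stand-in, not the paper's code.

```python
import torch

# Hypothetical frozen denoiser trained on real images, and an auxiliary
# denoiser trained on the student's outputs; both are placeholders here.
real_denoiser = lambda x, t: x * 0.95
fake_denoiser = lambda x, t: x * 0.90

def distribution_matching_direction(generated, t):
    """Estimate which way to move generated images so their distribution
    drifts toward the real one: the gap between the two denoisers'
    predictions on the same noised sample."""
    noised = generated + t * torch.randn_like(generated)  # toy noise schedule
    with torch.no_grad():
        return fake_denoiser(noised, t) - real_denoiser(noised, t)

generated = torch.tanh(torch.randn(4, 3, 64, 64))  # stand-in student output
direction = distribution_matching_direction(generated, t=0.5)
# In training, this signal would be pushed back through the one-step
# generator so the gap between real and generated distributions shrinks.
```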

The researchers have “figured out how to make the most popular AI image generators 30 times faster,” condensing them into smaller models without compromising quality, writes Live Science.

The MIT researchers are not alone in applying a single-step approach to generative imaging, which may soon include generative video.

“Stability AI developed a technique known as Adversarial Diffusion Distillation (ADD) to generate 1-megapixel images in real-time,” TechSpot reports, detailing how the company “trained its SDXL Turbo model through ADD, achieving image generation speeds of just 207 ms on a single Nvidia A100 AI GPU accelerator,” using “a similar approach to MIT’s DMD.”
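For comparison with DMD's reported 20 images per second, that latency converts to throughput with simple arithmetic (not a benchmark):

```python
# Converting the reported 207 ms per-image latency into throughput.
latency_s = 0.207             # SDXL Turbo on a single Nvidia A100
throughput = 1.0 / latency_s  # ~4.8 images per second
print(f"{throughput:.1f} images/sec vs. the ~20 images/sec reported for DMD")
```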
