April 8, 2022
OpenAI has developed a new technology that creates and edits images based on written descriptions of the desired result. DALL-E 2, whose name is an homage to the surrealist painter Salvador Dalí and the Pixar film “Wall-E,” is still in development but is already producing impressive results from simple instructions like “kittens playing chess” and “astronaut riding a horse.” OpenAI says the tech “isn’t being directly released to the public” and the hope is “to later make it available for use in third-party apps.” Already, some are expressing worry that such a tool has the potential to exponentially increase the use of deepfakes.
OpenAI has put in place a number of safeguards to prevent misuse. “We’ve limited the ability for DALL-E 2 to generate violent, hate, or adult images,” explains OpenAI’s DALL-E 2 landing page. “By removing the most explicit content from the training data, we minimized DALL-E 2’s exposure to these concepts. We also used advanced techniques to prevent photorealistic generations of real individuals’ faces, including those of public figures.”
Currently, the technology is in preview with “a limited number of trusted users who will help us learn about the technology’s capabilities and limitations,” OpenAI says.
The original DALL-E debuted early last year. This iteration has clearly improved as a result of additional training, as the OpenAI website illustrates. “DALL-E is what artificial intelligence researchers call a neural network, which is a mathematical system loosely modeled on the network of neurons in the brain,” writes The New York Times.
This is “the same technology that recognizes the commands spoken into smartphones and identifies the presence of pedestrians as self-driving cars navigate city streets. A neural network learns skills by analyzing large amounts of data,” notes NYT. DALL-E 2 “generates high-resolution images that in many cases look like photos. Though DALL-E often fails to understand what someone has described and sometimes mangles the image it produces.”
On a related note, NYT points out the Allen Institute has created “a system that can analyze audio as well as imagery and text. After analyzing millions of YouTube videos, including audio tracks and captions, it learned to identify particular moments in TV shows or movies, like a barking dog or a shutting door.”
Such systems may ultimately help companies “improve search engines, digital assistants and other common technologies as well as automate new tasks for graphic artists, programmers and other professionals,” NYT reports.
DALL-E 2 can edit existing images as well as create new ones. “Another feature, variations, is sort of like an image search tool for pictures that don’t exist,” reports The Verge. “Users can upload a starting image and then create a range of variations similar to it. They can also blend two images, generating pictures that have elements of both.”
OpenAI, co-founded by Elon Musk and “backed by a billion dollars in funding from Microsoft,” according to NYT, has opened a waitlist for those interested in previewing DALL-E 2.