New Tech from MIT, Adobe Advances Generative AI Imaging

Researchers from the Massachusetts Institute of Technology and Adobe have unveiled a new AI acceleration tool that makes generative apps like DALL-E 3 and Stable Diffusion up to 30x faster by reducing the process to a single step. The new approach, called distribution matching distillation, or DMD, maintains or enhances image quality while greatly streamlining the process. Theoretically, the technique “marries the principles of generative adversarial networks (GANs) with those of diffusion models,” consolidating “the hundred steps of iterative refinement required by current diffusion models” into one step, MIT PhD student and project lead Tianwei Yin says.

Stable Video 3D Generates Orbital Animation from One Image

Stability AI has released Stable Video 3D, a generative video model based on the company’s foundation model Stable Video Diffusion. SV3D, as it’s called, comes in two versions. Both can generate and animate multi-view 3D meshes from a single image. The more advanced version also lets users set “specified camera paths” for a “filmed” look to the video generation. “By adapting our Stable Video Diffusion image-to-video diffusion model with the addition of camera path conditioning, Stable Video 3D is able to generate multi-view videos of an object,” the company explains.

Apple Unveils Progress in Multimodal Large Language Models

Apple researchers have gone public with new multimodal methods for training large language models using both text and images. The results are said to enable AI systems that are more powerful and flexible, which could have significant ramifications for future Apple products. These new models, which Apple calls MM1, support up to 30 billion parameters. The researchers identify multimodal large language models (MLLMs) as “the next frontier in foundation models,” which exceed the performance of LLMs and “excel at tasks like image captioning, visual question answering and natural language inference.”

Midjourney Creates a Feature to Advance Image Consistency

Artificial intelligence imaging service Midjourney has been embraced by storytellers who have also been clamoring for a feature that enables characters to regenerate consistently across new requests. Now Midjourney is delivering that functionality with the addition of the new “--cref” tag (short for Character Reference), available for those who are using Midjourney v6 on the Discord server. Users can achieve the effect by adding the tag to the end of text prompts, followed by a URL that contains the master image subsequent generations should match. Midjourney will then attempt to repeat the particulars of a character’s face, body and clothing characteristics.
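As a rough illustration of the prompt format described above, the following Python sketch appends the tag and a reference URL to the end of a text prompt. It assumes the tag is typed as two hyphens followed by cref; the function name `with_character_reference` and the example URL are hypothetical and are not part of any Midjourney tooling.

```python
def with_character_reference(prompt: str, reference_url: str) -> str:
    """Append a character-reference tag and image URL to the end of a text prompt.

    Hypothetical helper for illustration only: it formats a string and does
    not contact Midjourney or Discord.
    """
    prompt = prompt.strip()
    reference_url = reference_url.strip()
    if not prompt or not reference_url:
        raise ValueError("both a prompt and a reference image URL are required")
    # The tag goes at the end of the prompt, followed by the master image URL.
    return f"{prompt} --cref {reference_url}"


if __name__ == "__main__":
    example = with_character_reference(
        "a detective walking through a rainy city at night",
        "https://example.com/detective.png",  # placeholder URL, not a real image
    )
    print(example)
```

The resulting string is what a user would paste into the Discord prompt field; whether the generated character matches still depends on the service.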

TikTok Updates Its Code to Sync to Separate ‘TikTok Photos’

Having fended off challenges in the short-form video sphere since its late 2016 launch, TikTok now appears to be playing offense, laying the groundwork for a photo-sharing app that has drawn comparisons to Instagram and Pinterest. Avid TikTok users are probably familiar with a feature that lets them post still images as moving images that can be examined by advancing frame-by-frame. Now TikTok seems to want to improve that approach by building a separate TikTok Photos app for Android and iOS, to which users of the primary platform can export their still images for showcasing.

Alibaba’s EMO Can Generate Performance Video from Images

Alibaba is touting a new artificial intelligence system that can animate portraits, making people sing and talk in realistic fashion. Researchers at the Alibaba Group’s Institute for Intelligent Computing developed the generative video framework, calling it EMO, short for Emote Portrait Alive. Input a single reference image along with “vocal audio,” as in talking or singing, and “our method can generate vocal avatar videos with expressive facial expressions and various head poses,” the researchers say, adding that EMO can generate videos of any duration, “depending on the length of video input.”

AI Video Startup Haiper Announces Funding and Plans for AGI

London-based AI video startup Haiper has emerged from stealth mode with $13.8 million in seed funding and a platform that generates up to two seconds of HD video from text prompts or images. Founded by alumni from Google DeepMind, TikTok and various academic research labs, Haiper is built around a bespoke foundation model that aims to serve the needs of the creative community while the company pursues a path to artificial general intelligence (AGI). Haiper is offering a free trial of what is currently a web-based user interface similar to offerings from Runway and Pika.

Apple’s Keyframer AI Tool Uses LLMs to Prototype Animation

Apple has taken a novel approach to animation with Keyframer, using large language models to add motion to static images through natural language prompts. “The application of LLMs to animation is underexplored,” Apple researchers say in a paper that describes Keyframer as an “animation prototyping tool.” Based on input from animators and engineers, Keyframer lets users refine their work through “a combination of prompting and direct editing,” the paper explains. The LLM can generate CSS animation code. Users can also use natural language to request design variations.

Stability AI Advances Image Generation with Stable Cascade

Stability AI, purveyor of the popular Stable Diffusion image generator, has introduced a completely new model called Stable Cascade. Now in preview, Stable Cascade uses a different architecture than Stable Diffusion’s SDXL that the UK company’s researchers say is more efficient. Cascade builds on a compression architecture called Würstchen (German for “sausage”) that Stability began sharing in research papers early last year. Würstchen is a three-stage process that includes two-step encoding. It uses fewer parameters, meaning less data to train on, greater speed and reduced costs.

Apple Launches Open-Source Language-Based Image Editor

Apple has released MGIE, an open-source AI model that edits images using natural language instructions. MGIE, short for MLLM-Guided Image Editing, can also modify and optimize images. Developed in conjunction with the University of California, Santa Barbara, MGIE is Apple’s first AI model. The multimodal MGIE, which understands text and image input, also crops, resizes, flips, and adds filters based on text instructions. Apple says its instruction set is easier than those of other AI editing programs, and that the tool is simpler and faster than learning a traditional program, like Apple’s own Final Cut Pro.

Yelp Adds 20 Features Plus AI to Help Users and Businesses

Yelp is introducing more than 20 new updates to improve the experience for community members and business owners. Included are AI-powered summaries that make it easier to find businesses, an updated Yelp Elite badge for reviewers who are passionate about specific subjects, and a new visual home feed and search experience geared toward discovery. For those seeking services, the new “Request a Quote” and “Projects” features are available. Artificial intelligence will also power market and competitive insights for business owners, while AI-powered smart budgets provide recommendations to optimize ad spend, “helping local businesses grow.”

AI Poison Pill App Nightshade Has 250K Downloads in 5 Days

Nightshade, a tool designed to combat AI copyright infringement, generated 250,000 downloads shortly after its January release, exceeding the expectations of its creators in the computer science department at the University of Chicago. Nightshade allows artists to keep AI models from scraping and training on their work without consent. The Bureau of Labor Statistics shows more than 2.67 million artists working in the U.S., but social media feedback indicates the downloads have been worldwide. One of the coders says cloud mirror links had to be added so as not to overwhelm the University of Chicago’s web servers.

CES: Session Details the Impact and Future of AI Technology

Dr. Fei-Fei Li, Stanford professor and co-director of Stanford HAI (Human-Centered AI), and Andrew Ng, venture capitalist and managing general partner at Palo Alto-based AI Fund, discussed the current state and expected near-term developments in artificial intelligence. As a general purpose technology, AI development will both deepen, as private sector LLMs are developed for industry-specific needs, and broaden, as open source public sector LLMs emerge to address broad societal problems. Expect exciting advances in image models, or what Li calls “pixel space.” When implementing AI, think about teams rather than individuals, and think about tasks rather than jobs.

Stability AI Is Offering Paid Membership for Commercial Users

As the pressure ratchets up for AI companies to go beyond the wow factor and make money, Stability AI has formalized three subscription tiers as it seeks to expand commercial use of its open-source, multimodal core models. The Stability AI Membership offerings include a free tier for personal and research (i.e., non-commercial) use, a professional tier that costs $20 a month, and a custom-priced enterprise tier for large outfits. The company says that with the three tiers it is “striking a balance between fostering competitiveness and maintaining openness in AI technologies.”

GenAI Lets Snapchat+ Subscribers Create and Share Images

Snapchat+ is rolling out new artificial intelligence features that let subscribers use text prompts to create generative AI images to share with friends. In addition, the Dreams feature, which creates generative AI selfies, is now able to add your friends to those photos. Snapchat+ subscribers get one pack of 8 Dreams per month as part of their $3.99 monthly fee. An onscreen button labeled “AI” lets subscribers access the AI image generator to choose from a menu of prompts (including “sunny day at the beach” and “planet made of cheese”) or they can enter their own descriptions.