Midjourney Creates a Feature to Advance Image Consistency

Artificial intelligence imaging service Midjourney has been embraced by storytellers, who have long been clamoring for a feature that enables characters to regenerate consistently across new requests. Now Midjourney is delivering that functionality with the addition of the new “--cref” parameter (short for Character Reference), available to those using Midjourney v6 on the Discord server. Users achieve the effect by appending the parameter to the end of a text prompt, followed by the URL of the master image that subsequent generations should match. Midjourney will then attempt to reproduce the particulars of a character’s face, body and clothing.
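Based on the description above, a prompt using the feature would look something like the following sketch. The image URL here is a hypothetical placeholder, and `/imagine` is Midjourney’s standard prompt command on Discord:

```
/imagine prompt: a red-haired knight walking through a misty forest --cref https://example.com/my-character.png
```

Midjourney would then generate the new scene while attempting to match the face, body and clothing of the character shown at the referenced URL.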

Narrative continuity has been a persistent challenge in generative imaging, since AI typically creates at least slightly different pictures for each request, even those using identical prompt terms. The --cref parameter represents a tremendous convenience for those using Midjourney to create storyboards or animation.

VentureBeat provides some examples of the new tag at work, and while calling the results “far from exact to the original character (or even our original prompt),” suggests they are “definitely encouraging,” adding that with some further refining the consistency feature “could take Midjourney further from being a cool toy or ideation source into more of a professional tool.”

Based on its own experiment, PetaPixel concludes “all in all, the Character Reference did a pretty good job of keeping the same face across different AI images,” calling it another AI “encroachment on the photography space.”

Midjourney CEO David Holz wrote on Discord that the Character Reference tag is “similar to the ‘Style Reference’ feature, except instead of matching a reference style it tries to make the character match a ‘Character Reference’ image,” PetaPixel writes.

The feature works best with characters generated by Midjourney and is “not designed for real people and photos,” according to Holz, who qualifies that “the precision of this technique is limited, it won’t copy exact dimples/freckles/or t-shirt logos.”

PCMag cites an example of a user applying --cref to a photograph of a real person and concludes the results “definitely resemble” the subject, going on to express concern over how applying this technology to “real people” could make it “even easier for bad actors to make convincing deepfakes.”

Tom’s Guide reports “one of the primary use cases for this could be in creating graphic novels or turning the MidJourney-generated images into short video clips and using something like Pika Labs lip sync to animate the lips” to create a sequence of scenes using the same character.

“One of the holy grails of generative AI storytelling is being able to create consistency across characters in images and video,” writes Tom’s Guide, concluding “MidJourney has made that a little easier.”
