Midjourney, the popular AI image generation startup with more than 21 million users on its Discord server alone, is branching out from AI image creation and editing.
Patchwork revealed
Max Kreminski, leader of Midjourney’s Storytelling Lab, demoed the new tool, called “Patchwork,” in a livestream screenshare on Discord and X via Restream.
He clarified that it would be a standalone app that requires a Midjourney account to log in, and that the URL would be available as a “research preview” in the Midjourney Discord server’s “updates” channel. Users will need to connect their Midjourney Discord account to their Google Account to access Patchwork’s research preview. The company posted instructions for doing so on its X account.
The tool appears to be a web-based, blank white infinite canvas with a “toolbox” on the left side of the browser screen. The toolbox shows a variety of buttons labeled “character,” “event,” “faction,” “place,” “prop,” and “random,” as well as tools such as “note,” “image,” “portal,” “save” and “share.” “Save” downloads a JSON file with links to all the Midjourney images created in the canvas. Midjourney considers each canvas a separate digital “world.”
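For readers who want to work with that saved JSON file outside of Patchwork, here is a minimal Python sketch of one way to pull the image links out of it. The layout of the export is not described in the demo, so the sample data and its field names (“scraps,” “kind,” “images”) are hypothetical placeholders, as are the example.com URLs. To avoid depending on any particular layout, the sketch simply walks the whole file and collects every string that looks like a web address.

```python
import json
from typing import Any, Iterator


def iter_urls(node: Any) -> Iterator[str]:
    """Yield every string value that looks like an http(s) URL, at any depth."""
    if isinstance(node, str):
        if node.startswith(("http://", "https://")):
            yield node
    elif isinstance(node, dict):
        # Recurse into object values; keys are ignored.
        for value in node.values():
            yield from iter_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_urls(item)


# Hypothetical sample: the real export's structure and field names are
# assumptions made for illustration, not taken from Midjourney.
sample = json.loads("""
{
  "scraps": [
    {"kind": "character",
     "images": ["https://example.com/a.png", "https://example.com/b.png"]},
    {"kind": "note", "text": "no image here"},
    {"kind": "place", "image": "https://example.com/c.png"}
  ]
}
""")

urls = list(iter_urls(sample))
print(urls)
```

With a real export, `json.loads` on the sample string would be replaced by `json.load` on the opened file. Because the walker does not assume a schema, it may also pick up links that are not images if the file contains any.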
To switch between worlds, the user creates a “portal,” a small black circular button.
To generate a new world, the user enters a text prompt into an editor bar at the top of the “create” screen and selects one or more of a set of 10 different image styles.
This produces a new whiteboard populated with still image assets and text boxes, entities known as “scraps.” These include input boxes that let the user prompt new images or settings that fit the initial world description, or even entirely new AI-generated character descriptions.
In the demo livestream, the character name automatically populated with Marcus “Dizzy” Gillespie, echoing the name of the famous jazz musician. Dragging the description into a new character image creator box produces four new AI-generated images.
Adding new character boxes, the user can then prompt to create names and characteristics, as well as motivations that can spur a conflict for the basis of a story.
The user can then link characters together with lines that denote connections between them. They can also write action sequences and scene descriptions that each narrate a story. Each character can be used in multiple images, and those images can be gathered together with a single option.
The user can “share” the board with other Midjourney users who can collaborate, purportedly in real-time, with multiple cursors moving across the same shared canvas. A single world can support dozens, even up to 100 users, according to Kreminski. However, he noted that the more users, the more chaotic the experience would be.
Kreminski said only users who are logged in can view boards (for now), but in the future, boards may be viewable by non-users. He mentioned that tabletop roleplaying groups were already using the feature to chart their campaigns.
He also said that Midjourney version 7 (V7) would include a setting to allow multi-character consistency across new and different images.
Moving towards immersive, 3D worlds
Kreminski further revealed that at least three different large language models power the application, including a fine-tuned open-source one unique to Midjourney.
Ultimately, it appears to be a novel, complex, powerful, somewhat overwhelming yet compelling tool for storyboarding. I could easily see it being used by writers and film directors, game designers, comic book creators and even live theater directors and writers.
In the long term, Kreminski said there was a “very clear path in terms of escalation of the details and interactions in the worlds,” including fully immersive 3D virtual reality scenes, but that was likely years away.
The news comes as other AI researchers, startups such as Fei-Fei Li’s World Labs, and big tech companies such as Google seek to develop AI that can create 3D immersive, navigable worlds online from simple prompts or images.
More Midjourney updates coming soon
In addition, Midjourney’s creator David Holz joined the announcement livestream to state the startup would launch multiple model personalization modes in the coming days.
Currently, Midjourney allows users to rate images to personalize the kinds of visuals they want to see in generations, and fine-tune the model to personal preferences. Now, the startup will allow users to have multiple personalized versions they can toggle between.
In addition, Holz shared that Midjourney would allow users to upload and reference multiple images to boards to guide generations.
Furthermore, sometime after Christmas (December 25), Midjourney will be introducing video models and a Midjourney V7 AI image generator that will feature increased prompt understanding.
Holz further revealed that Midjourney is working on three to four new hardware projects and said the startup was “trying to branch out and become a full research lab…it may take us six months to announce all six things.”