Stable Diffusion 3 is here, and it’s freaky.
Medium Rare
After a chaotic few months, embattled AI startup Stability AI has released Stable Diffusion 3 Medium, the latest version of its text-to-image model. According to Stability, it’s the company’s “most sophisticated image-generating model to date.” So why does it keep churning out freakish body horror monstrosities?
As Ars Technica reports, disappointed Stable Diffusion users have taken to Reddit to complain that the new model often refuses to generate a picture of a human that isn’t a horrifyingly mangled, AI-generated mess of incoherent limbs.
“I haven’t been able to generate a single decent image at all outside of the example prompts,” one irritated Redditor wrote in a thread on the r/StableDiffusion subreddit. “I’ve tried highly descriptive prompts with no luck. Even an absolutely basic one like ‘photograph of a person napping in a living room’ leads to Cronenberg-esque monstrosities.”
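For context, reproducing that complaint takes only a few lines. Here’s a minimal sketch of how users typically run SD3 Medium locally, assuming the model’s Hugging Face diffusers release (stabilityai/stable-diffusion-3-medium-diffusers), a CUDA GPU with enough VRAM, and roughly default sampling settings:

```python
# Minimal sketch: generating an image with Stable Diffusion 3 Medium
# via Hugging Face's diffusers library. Assumes you've accepted the
# gated model license on Hugging Face and have a CUDA GPU available.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# The exact prompt the frustrated Redditor quoted above.
image = pipe(
    prompt="photograph of a person napping in a living room",
    num_inference_steps=28,  # illustrative, near the model's defaults
    guidance_scale=7.0,
).images[0]
image.save("napping.png")
```

Whether the saved file contains a napping person or appendage soup is, per the Reddit threads, a roll of the dice.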
It would “be funny if not so depressing :(” added another frustrated user.
A few users in the thread reported that they were able to generate normal-looking humans, but those folks seem to be in a small minority. Want an AI-spun picture of a person? If you’re using Stable Diffusion 3, watch out for appendage soup.
Cronenberg-22
Based on the evidence shared by Redditors, “Cronenberg-esque” is an accurate descriptor for these images.
A simple prompt for “woman laying on a beach,” for example, allegedly resulted in a mess of face-on-arm-on-hair-on-possible-tree-stump, while a number of users shared botched images of women with badly mangled hands. Elsewhere, in another thread, users trying to generate photos of women lying in the grass were repeatedly served nightmare-fuel renderings of Thumb Thumb-like creatures.
Importantly, the AI mostly appears to struggle with humanoid figures. As Redditors reported in the various threads, other prompts produce perfectly fine outputs. Those human-specific failures are likely the result of Stability’s decision to filter NSFW imagery out of the model’s training data.
The question of training on NSFW imagery presents a bit of a catch-22 for AI companies like Stability. Porn makes up vast swaths of the internet, and as Ars notes, researchers have found that scrubbing NSFW material from training data also throws out enormous amounts of benign imagery of people, vastly reducing a model’s ability to reliably and accurately render human forms. At the same time, training on web-scraped NSFW material raises a slew of serious safety and ethics concerns.
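To make that trade-off concrete, here’s a hypothetical sketch of the kind of aggressive dataset filtering at issue. It’s loosely modeled on the LAION-style “punsafe” safety scores Stability used when training earlier models like Stable Diffusion 2.0; the threshold, field names, and sample data below are illustrative, not Stability’s actual SD3 pipeline:

```python
# Hypothetical illustration of NSFW filtering's collateral damage.
# LAION-style metadata assigns each image a "punsafe" score in [0, 1]:
# the classifier's estimated probability that the image is unsafe.
# An aggressive cutoff removes porn, but also discards large numbers
# of benign photos of people, starving the model of human anatomy.
THRESHOLD = 0.1  # illustrative strict cutoff, like the one used for SD 2.0

def keep(sample: dict) -> bool:
    """Keep only samples the safety classifier scores as very safe."""
    return sample.get("punsafe", 1.0) < THRESHOLD

dataset = [
    {"caption": "woman lying on a beach", "punsafe": 0.35},   # dropped
    {"caption": "portrait photo of a man", "punsafe": 0.08},  # kept
    {"caption": "sunset over mountains", "punsafe": 0.01},    # kept
]
filtered = [s for s in dataset if keep(s)]
# Many perfectly ordinary human-centric images land on the wrong side
# of the cutoff, which is one explanation for the mangled anatomy.
```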
In the spirit of safety — and, perhaps, not getting sued — Stability chose to exclude explicit content from its training process. But its users, a large chunk of whom definitely wanted to generate images of women, are clearly frustrated with the final product.
“I guess now they can go bankrupt,” quipped one Redditor, “in a safe and ethically [sic] way.”
More on AI training: AI Companies Running out of Training Data after Burning Through Entire Internet