Bigger isn’t always better, especially when it comes to running generative AI models on commodity hardware.
That’s a lesson that Stability AI is taking to heart with its release today of Stable Diffusion 3 Medium. Stable Diffusion is Stability AI’s flagship model, providing text-to-image generation capabilities. The initial Stable Diffusion 3 release was previewed on Feb. 22, with public availability via an API following on April 17.
The new Stable Diffusion 3 Medium is intended to be a smaller, yet still very capable, model that can run on consumer-grade GPUs. The medium-sized model will make Stable Diffusion 3 an even more attractive option for users and organizations with resource constraints that want to run highly capable image generation technology.
Stable Diffusion 3 Medium is available today for users to try out via API, as well as on the Stable Artisan service via Discord. The model weights will also be available for non-commercial use on Hugging Face.
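For developers planning to run the model locally once the weights land on Hugging Face, the workflow will likely look something like the following minimal sketch using the diffusers library. The repository id, step count and guidance scale here are assumptions rather than details confirmed by Stability AI; check the official model card before use.

# Minimal sketch: generating an image with SD3 Medium via Hugging Face diffusers.
# The repository id below is an assumption; consult the official model card.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",  # assumed repo id
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a watercolor painting of a lighthouse at dusk",
    num_inference_steps=28,   # assumed settings; tune for quality vs. speed
    guidance_scale=7.0,
).images[0]
image.save("sd3_medium_sample.png")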
With the new release, the initial Stable Diffusion 3 release is now known as Stable Diffusion 3 (SD3) Large. Christian Laforte, co-CEO at Stability AI, told VentureBeat that SD3 Large has 8 billion parameters. In contrast, SD3 Medium has only 2 billion parameters.
“Unlike SD3 Large, SD3 Medium is smaller and will run efficiently on consumer hardware,” Laforte said.
Stable Diffusion Medium will run with 5GB of GPU VRAM
While many generative AI workloads, including Stable Diffusion, have long relied on beefy Nvidia GPUs, the new Stability AI model changes the paradigm.
The minimum requirement to run Stable Diffusion 3 Medium is a paltry 5GB of GPU VRAM. At that level, the model will run on a wide variety of consumer PCs as well as high-end laptops. To be fair, the minimum requirement is still just that: a minimum. Stability AI recommends 16GB of GPU VRAM, which might be a stretch for most laptops, but still isn’t an unreasonable amount.
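What fitting into a 5GB card might look like in practice: a hedged sketch, again assuming the diffusers pipeline above. Half-precision weights, optionally dropping the large T5 text encoder, and offloading idle submodules to system RAM are the usual levers for shrinking the GPU footprint, at some cost in speed.

# Hypothetical low-VRAM configuration for SD3 Medium with diffusers.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",  # assumed repo id
    torch_dtype=torch.float16,  # half precision roughly halves weight memory
    text_encoder_3=None,        # optionally skip the heavyweight T5 encoder
    tokenizer_3=None,
)
pipe.enable_model_cpu_offload()  # keep submodules on the GPU only while in use

image = pipe("a macro photo of a dew-covered spiderweb").images[0]
image.save("low_vram_sample.png")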
Even with the smaller parameter count, Stability AI claims that SD3 Medium provides an exceptionally high level of quality, comparable with SD3 Large across a range of features.
According to Laforte, SD3 Medium stands out with the same core capabilities found in SD3 Large. Photorealism, prompt adherence, typography, resource efficiency and fine-tuning are all part of the smaller model.
“SD3 Medium excels at all of the capabilities mentioned, and is comparable to the current version of the SD3 Large API that you love and use today,” Laforte said.
Laforte noted that users can expect highly realistic image outputs from SD3. He explained that, thanks to its 16-channel VAE (Variational Autoencoder), SD3 Medium delivers greater detail per megapixel than any prior model.
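For the curious, the channel count is easy to confirm once the model is loaded; assuming the diffusers pipeline from the first sketch, the VAE reports it in its configuration.

# Assuming the pipeline object from the earlier sketch:
print(pipe.vae.config.latent_channels)  # expected to print 16 for SD3 Medium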
When it comes to prompt adherence, he said that SD3 is capable of a remarkable degree of prompt understanding in natural language. That includes spatial understanding, such as the positioning of elements within an image.
The smaller model is also good at fine-tuning, according to Laforte. He noted that the model is exceptionally adaptable, efficiently capturing details from fine-tuning datasets.
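In the open-weights ecosystem, such fine-tunes typically circulate as LoRA adapters. A hedged sketch of applying one, assuming the diffusers LoRA loader supports SD3 Medium and using a made-up adapter id purely for illustration:

# Hypothetical example: "your-org/sd3-medium-style-lora" is not a real adapter.
pipe.load_lora_weights("your-org/sd3-medium-style-lora")
image = pipe("a product photo in the fine-tuned house style").images[0]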
Among the big additions in SD3 overall is improved typography, and that capability carries forward into SD3 Medium as well.
The standout feature of SD3 Medium, however, is its resource efficiency.
“The relatively smaller size and modularity of the 2B model allows for reduced computational requirements without compromising performance,” Laforte said. “This makes SD3 Medium an ideal choice for environments where resource management and efficiency are critical.”