Google’s next-generation text-to-image foundation model is coming to the company’s Vertex AI platform. Imagen 3 will be available in preview for select customers, offering developers faster image generation, better prompt understanding, photorealistic generation of people, and greater control over text rendering within an image than its predecessor.
Introduced at Google I/O in May, Imagen 3 was initially available to select creators in a private preview in ImageFX. Google promised at the time that the model would also come to Vertex AI.
“It’s our most capable image generation model yet,” Douglas Eck, senior research director of Google DeepMind, said at the time. “Imagen 3 is more photorealistic, with richer details and fewer visual artifacts or distorted images. It understands prompts written the way people write—the more creative and detailed you are, the better. And Imagen 3 remembers to incorporate small details…in longer prompts. Plus, this is our best model yet for rendering text, which has been a challenge for image generation models.”
With its launch on Vertex AI, Imagen 3 comes with support for multiple languages and multiple aspect ratios, along with safety features such as Google DeepMind’s SynthID digital watermarking.
Stock photography provider Shutterstock is one company using this model. “Since adding Imagen to our AI image generator, our users have generated millions of pictures with the model,” Justin Hiza, the company’s vice president of data services, said in a statement. “We’re excited by the enhancements Imagen 3 promises as it enables our users to execute their ideas faster without sacrificing quality. As an important enhancement to Shutterstock’s launch of the first ethically-sourced AI image generator, we also appreciate how safety is built in and that the content that is created is protected under Google Cloud’s indemnification for generative AI.”
And while Google continues to innovate on Imagen, it declined to state when it would allow its Gemini AI to resume generating images following backlash over notable “inaccuracies.” When asked during a press briefing, Google Cloud Chief Executive Thomas Kurian pointed out that Imagen and Gemini are different types of models: “Gemini is a multimodal model, meaning you can give it input of many different modalities it can reason on it, and…allows you to reason across images and video and audio…This is not the same as what we do with Imagen. Imagen is a diffusion model. A diffusion model is…used to generate super high-fidelity text-to-image…Imagen is not a replacement for the image functionality in Gemini. Two different technologies for two different purposes.”
A separate question from another journalist, asking when Google would re-enable Gemini’s image functionality, went unanswered.