Amazon launches Nova AI model family for generating text, images and videos

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

As one of the biggest tech companies in the world, Amazon’s position in the ongoing generative AI race has been mainly focused on building out its developer tools and platforms — as well as providing significant funding for startup Anthropic.

But no longer: as announced today by CEO Andy Jassy at the annual Amazon Web Services (AWS) re:Invent conference, the e-commerce giant is fielding a whole new AI model family called Nova which allows users to generate text, images, and videos — pitting it right up against the likes of OpenAI, Google, and even its own investment Anthropic.

Several of the new models — including the text, image, and video offerings — are available now here, though you’ll need an Amazon Bedrock account to access them, with a speech-to-speech audio generation model said to be coming in 2025.

Super nova

The Amazon Nova suite introduces several models tailored to specific use cases, all supporting more than 200 languages:

• Amazon Nova Micro: A text-only model optimized for low-latency responses at minimal cost.

• Amazon Nova Lite: A multimodal model offering fast processing for text, images, and videos at a very low cost.

• Amazon Nova Pro: A multimodal model combining accuracy, speed, and cost-efficiency, designed for a wide range of tasks.

• Amazon Nova Premier: The most advanced multimodal model for complex reasoning tasks and for distilling custom models (launching in Q1 2025).

• Amazon Nova Canvas: An advanced image generation model for creative content development.

• Amazon Nova Reel: A state-of-the-art video generation model offering dynamic capabilities.

All models support fine-tuning and knowledge distillation, allowing customers to tailor AI tools to their proprietary data for improved accuracy and performance.

These models excel in supporting Retrieval Augmented Generation (RAG), which grounds outputs in specific organizational data to enhance reliability.

An image canvas and complex camera controls

The Nova Canvas and Reel models highlight Amazon’s push into creative content generation:

• Nova Canvas: Users can edit images through natural language text prompts and adjust layouts or color schemes. Built-in safety measures, such as watermarking and content moderation, ensure responsible AI usage.

• Nova Reel: This video generation model supports advanced features, including camera motion controls like panning, zooming, and 360-degree rotations. It allows for the creation of dynamic six-second videos, with additional functionalities expected in the future.

Human evaluations have validated the model’s capabilities. Nova Reel outperformed Runway’s Gen-3 Alpha in A/B testing, achieving winning rates of 61.4% for video quality and 71.6% for video consistency.

Integrated with Bedrock (duh)

Unsurprisingly, the Amazon Nova models are deeply integrated with its Bedrock fully managed service that simplifies access to high-performing AI models through a single API.

Customers can use this platform to experiment, evaluate, and deploy Nova models or other foundation models available on Bedrock.

There are also options for fine-tuning and distillation, allowing users to adapt models to their specific needs.

Designed for brands

Rohit Prasad, Senior Vice President of Amazon Artificial General Intelligence, noted that Amazon Nova is designed to address common challenges faced by application builders.

The models deliver advances in latency, cost-effectiveness, and information grounding, providing flexible and powerful solutions for both internal and external customers.

Brands using Amazon Nova tools in advertising have reported significant improvements, including a fivefold increase in the number of products advertised and a doubling of images per product.

These tools also enable advertisers to explore new strategies, such as keyword-level creative optimization and video advertising.

More to come

Amazon has announced plans to expand the Nova family in 2025 with two additional models:

• A speech-to-speech model for natural, humanlike verbal interactions.

• An any-to-any modality model that can process and generate text, images, audio, and video, enabling seamless translation and editing across modalities.

Amazon emphasizes safety and transparency with integrated protections across all Nova models. The company has introduced AWS AI Service Cards, offering clear documentation on use cases, limitations, and responsible AI practices. Features like watermarking and content moderation are embedded to ensure compliance with ethical standards.

Amazon Nova represents a significant step in the company’s AI journey, bringing innovative generative AI tools to businesses and individuals. As these tools become more widely available, Amazon continues to prioritize delivering real-world value to its customers

VB Daily

Stay in the know! Get the latest news in your inbox daily

By subscribing, you agree to VentureBeat’s Terms of Service.

Thanks for subscribing. Check out more VB newsletters here.

An error occured.

Source link lol