Nvidia’s Nemotron model families will advance AI agents

As part of its bevy of AI announcements at CES 2025 today, Nvidia announced Nemotron model families to advance agentic AI.

Available as Nvidia NIM microservices, open Llama Nemotron large language models and Cosmos Nemotron vision language models can supercharge AI agents on any accelerated system.

Nvidia made the announcement as part of CEO Jensen Huang’s opening keynote today at CES 2025.

Agentic AI

Artificial intelligence is entering a new era — the age of agentic AI — where teams of specialized agents can help people solve complex problems and automate repetitive tasks.

With custom AI agents, enterprises across industries can manufacture intelligence and achieve unprecedented productivity. These advanced AI agents require a system of multiple generative AI models optimized for agentic AI functions and capabilities. This complexity means that the need for powerful, efficient enterprise-grade models has never been greater.

“AI agents are the next robotics industry and likely to be a multibillion-dollar opportunity,” Huang said.

The Llama Nemotron family of open large language models (LLMs) is intended to provide a foundation for enterprise agentic AI. Built with Llama, the models can help developers create and deploy AI agents across a range of applications, including customer support, fraud detection, and product supply chain and inventory management optimization.

To be effective, many AI agents need both language skills and the ability to perceive the world and respond with the appropriate action.

Words and Visuals


With the new Nvidia Cosmos Nemotron vision language models (VLMs) and Nvidia NIM microservices for video search and summarization, developers can build agents that analyze and respond to images and video from autonomous machines, hospitals, stores and warehouses, as well as sports events, movies and news. For developers seeking to generate physics-aware videos for robotics and autonomous vehicles, Nvidia today separately announced Nvidia Cosmos world foundation models.

The Nemotron models optimize compute efficiency and accuracy for AI agents built with Llama foundation models — one of the most popular commercially viable open-source model collections, downloaded over 650 million times — and provide optimized building blocks for AI agent development.

The models are pruned and trained with Nvidia’s latest techniques and high-quality datasets for enhanced agentic capabilities. They excel at instruction following, chat, function calling, coding and math, while being size-optimized to run on a broad range of Nvidia accelerated computing resources.

“Agentic AI is the next frontier of AI development, and delivering on this opportunity requires full-stack optimization across a system of LLMs to deliver efficient, accurate AI agents,” said Ahmad Al-Dahle, vice president and head of GenAI at Meta, in a statement. “Through our collaboration with Nvidia and our shared commitment to open models, the Nvidia Llama Nemotron family built on Llama can help enterprises quickly create their own custom AI agents.”

Early adopters

Leading AI agent platform providers including SAP and ServiceNow are expected to be among the first to use the new Llama Nemotron models.

“AI agents that collaborate to solve complex tasks across multiple lines of the business will unlock a whole new level of enterprise productivity beyond today’s generative AI scenarios,” said Philipp Herzig, chief AI officer at SAP, in a statement. “Through SAP’s Joule, hundreds of millions of enterprise users will interact with these agents to accomplish their goals faster than ever before. Nvidia’s new open Llama Nemotron model family will foster the development of multiple specialized AI agents to transform business processes.”

“AI agents make it possible for organizations to achieve more with less effort, setting new standards for business transformation,” said Jeremy Barnes, vice president of platform AI at ServiceNow, in a statement. “The improved performance and accuracy of Nvidia’s open Llama Nemotron models can help build advanced AI agent services that solve complex problems across functions, in any industry.”

The Nvidia Llama Nemotron models use Nvidia NeMo for distilling, pruning and alignment. Using these techniques, the models are small enough to run on a variety of computing platforms while providing high accuracy as well as increased model throughput.

The Nemotron models will be available as downloadable models and as Nvidia NIM microservices that can be easily deployed on clouds, data centers, PCs and workstations. They are intended to offer enterprises industry-leading performance with reliable, secure and seamless integration into their agentic AI application workflows.
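For a concrete sense of what deploying one of these models as a NIM microservice looks like, here is a minimal sketch. NIM language-model endpoints follow the familiar OpenAI-compatible chat-completions convention; the endpoint URL and model identifier below are placeholders, not the names Nvidia will ship.

```python
# Minimal sketch: querying a Llama Nemotron NIM microservice.
# NIM LLM containers expose an OpenAI-compatible API; the base URL and
# model name here are placeholders -- substitute the endpoint and model
# identifier for your own deployment or the hosted build.nvidia.com API.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local NIM container (placeholder)
    api_key="not-needed-for-local-nim",   # hosted endpoints require a real key
)

response = client.chat.completions.create(
    model="nvidia/llama-nemotron-super",  # hypothetical model id
    messages=[
        {"role": "system", "content": "You are a customer-support agent."},
        {"role": "user", "content": "Summarize the open tickets for account 4512."},
    ],
    temperature=0.2,
    max_tokens=300,
)

print(response.choices[0].message.content)
```

The same request would work against a cloud, data center, PC or workstation deployment; only the base URL and credentials change.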

Customize and connect to business knowledge with Nvidia NeMo

The Llama Nemotron and Cosmos Nemotron model families are coming in Nano, Super and Ultra sizes to provide options for deploying AI agents at every scale.

● Nano: The most cost-effective model optimized for real-time applications with low latency, ideal for deployment on PCs and edge devices

● Super: A high-accuracy model offering exceptional throughput on a single GPU

● Ultra: The highest-accuracy model, designed for data-center-scale applications demanding the highest performance

Enterprises can also customize the models for their specific use cases and domains with Nvidia NeMo microservices to simplify data curation, accelerate model customization and evaluation, and apply guardrails to keep responses on track.

With Nvidia NeMo Retriever, developers can also integrate retrieval-augmented generation (RAG) capabilities to connect models to their enterprise data.
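The sketch below illustrates the shape of that RAG flow, assuming the same placeholder NIM endpoint as above. The `retrieve_passages` function is a hypothetical stand-in for a retrieval service such as NeMo Retriever, not its actual API; it simply represents the step that pulls relevant enterprise passages into the prompt.

```python
# Illustrative RAG flow: retrieve enterprise passages, then pass them as
# grounding context to a Nemotron model served through a NIM endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="local")  # placeholder


def retrieve_passages(query: str, top_k: int = 3) -> list[str]:
    """Hypothetical stand-in for a retrieval service such as NeMo Retriever.
    In a real deployment this would query an index built from enterprise
    documents and return the most relevant chunks for the query."""
    return ["<passage 1>", "<passage 2>", "<passage 3>"][:top_k]


def answer_with_context(question: str) -> str:
    context = "\n\n".join(retrieve_passages(question))
    messages = [
        {"role": "system",
         "content": "Answer using only the provided context.\n\n" + context},
        {"role": "user", "content": question},
    ]
    reply = client.chat.completions.create(
        model="nvidia/llama-nemotron-super",  # hypothetical model id
        messages=messages,
    )
    return reply.choices[0].message.content


print(answer_with_context("What is our return policy for enterprise customers?"))
```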

And using Nvidia Blueprints for agentic AI, enterprises can create their own applications using Nvidia’s advanced AI tools and end-to-end development expertise. In fact, Nvidia Cosmos Nemotron, Nvidia Llama Nemotron and NeMo Retriever supercharge the new Nvidia Blueprint for video search and summarization (announced separately today).

NeMo, NeMo Retriever and Nvidia Blueprints are all available with the Nvidia AI Enterprise software platform.

Availability

Llama Nemotron and Cosmos Nemotron models will be available as hosted APIs and for download on build.nvidia.com and on Hugging Face. Access for development, testing and research is free for members of the Nvidia Developer Program.
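For the downloadable route, a standard Hugging Face workflow should apply. The repository id below is a placeholder; check build.nvidia.com or Nvidia’s Hugging Face organization for the actual model names once they are published.

```python
# Sketch of loading a downloadable Nemotron checkpoint from Hugging Face.
# The repository id is a placeholder, not a confirmed model name.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "nvidia/Llama-Nemotron-Nano"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

prompt = "Draft a short status update on the Q3 inventory audit."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```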

Enterprises can run Llama Nemotron and Cosmos Nemotron NIM microservices in production with the Nvidia AI Enterprise software platform on accelerated data center and cloud infrastructure.


