KEMBAR78
Foundation Models for Agentic AI | NVIDIA Nemotron

NVIDIA Nemotron

Open and efficient multimodal models for agentic AI.

Overview

What Is NVIDIA Nemotron?

NVIDIA Nemotron™ is a family of open models, datasets, and technologies that empower you to build highly efficient, accurate, and specialized agentic AI. The models excel in graduate-level scientific reasoning, advanced math, coding, instruction following, tool calling, visual reasoning, and retrieval-augmented generation (RAG).

Nemotron is integrated by the AI ecosystem, so you can easily customize and deploy models with open-source NVIDIA software anywhere, from edge to cloud.

Open Secret: How NVIDIA Nemotron Models, Datasets, and Techniques Fuel AI Development

Learn how open-source AI technology like Nemotron provides the transparency and trust businesses need to successfully adopt AI.

Industry Pioneers Build Smarter AI Agents With NVIDIA Nemotron and Cosmos™ Reasoning Models

Designed for enterprise and physical AI applications, open reasoning models think up to 9x faster, speeding inference and lowering costs for agent platforms spanning customer service, cybersecurity, manufacturing, logistics, and robotics.

Video

Why NVIDIA Built Nemotron

Hear from Bryan Catanzaro, VP of applied deep learning research at NVIDIA, as he shares the vision behind Nemotron and why open technologies are essential for building trusted, enterprise-ready AI.

Benefits

What Does Nemotron Bring to Agentic AI?

Open Models

NVIDIA’s open data and optimization techniques ensure powerful, transparent, and adaptable models for developers and enterprises. Models and training data are published openly on Hugging Face.

High Compute Efficiency

Through the pruning of larger models, the Nemotron family is optimized for top compute efficiency, using NVIDIA TensorRT™-LLM to deliver higher throughput and on-or-off reasoning capabilities.

High Accuracy

Built on popular open reasoning models for their exceptional knowledge, post-trained with high-quality training data, and aligned to reason like humans, Nemotron models achieve the highest accuracy on leading benchmarks.

Secure and Simple Deployment

The Nemotron model family, available as optimized NIM microservices, offers peak inference performance and flexible deployment options, ensuring superior security, privacy, and portability.

Models

Models for Diverse Workloads

Nemotron models excel in vision for enterprise optical character recognition (OCR) and in reasoning for building agentic AI. Research models are also available for experimentation and customization.

Nano

Provides superior accuracy for PC and edge devices.

The newly announced Nemotron Nano 2 supports a configurable thinking budget, enabling enterprises to control token generation and deploy optimized agents on edge devices.

Super

Offers the highest accuracy and throughput in its size category to run on a single NVIDIA H100 Tensor Core GPU.

The newly announced Llama Nemotron Super 1.5 is optimized for NVIDIA Blackwell architecture with NVFP4 format, delivering up to 6x higher throughput on NVIDIA B200 compared with FP8 on NVIDIA H100.

Ultra

Delivers the highest agentic AI accuracy for complex systems, optimized for multi-GPU data centers.

Technology

Building Blocks for Agentic AI

Start building AI agents with NVIDIA NeMo™ for custom agentic AI, NVIDIA NIM for fast, enterprise-ready deployment, and NVIDIA Blueprints for accelerating development with customizable reference workflows.

NVIDIA NIM

  • Speed up deployment of performance-optimized generative AI models.
  • Run your business applications with stable and secure APIs backed by enterprise-grade support.

NVIDIA Blueprints

  • Quickly get started with reference applications for generative AI use cases, such as enterprise deep research and multimodal retrieval-augmented generation (RAG).
  • Accelerate development with Blueprints, which include partner microservices, one or more AI agents, reference code, customization documentation, and a Helm chart for deployment.

NVIDIA NeMo

  • Build, customize, and deploy generative AI and agentic AI.
  • Deliver enterprise-ready large language models (LLMs) with precise data curation, cutting-edge customization, scalable data ingestion, RAG, and accelerated performance.
  • Easily build data flywheels and continuously optimize AI agents with the latest information.

Starting Options

Ways to Get Started With Nemotron

Start Prototyping for Free

Get started with easy-to-use API endpoints for NIM, powered by DGX™ Cloud.

  • Access fully accelerated AI infrastructure.
  • Ensure your data isn't used for model training.
  • No credits, just a simple path to build, test and deploy.

Get in Touch

Talk to an NVIDIA AI specialist about moving generative AI pilots to production with the security, API stability, and support that comes with NVIDIA AI Enterprise.

  • Explore your generative AI use cases.
  • Discuss your technical requirements.
  • Align NVIDIA AI solutions to your goals and requirements.

Adopters

Enterprises Using Nemotron

Resources

Explore the Latest in Nemotron

Why NVIDIA Built Nemotron

Learn how Nemotron accelerates innovation, empowers developers, and shapes the future of AI.

How ServiceNow Is Pushing Document Intelligence Forward

Learn how access to Nemotron’s model weights, datasets, and training recipes enabled deeper evaluation, what ServiceNow discovered about visual Q&A accuracy, and why openness matters for continuous improvement in multimodal AI.

Reasoning On/Off: Navigating a Wedding Seating Chart With AI Reasoning

See how an LLM with AI reasoning capabilities thinks outside the box to come up with a solution to a wedding seating chart while navigating family dynamics and guest preferences.

FAQs

NVIDIA Nemotron models aren't just open, but truly open source. NVIDIA publishes the training datasets, techniques, and model weights so the open-source community can benefit from our learnings and use these resources to create their own models.

The NVIDIA Open Model License is a permissive license that allows users to use, modify, distribute, and commercially deploy the models and derivatives without crediting NVIDIA, to encourage innovation and further development of generative AI.

Yes, you can download and run NVIDIA Nemotron models from Hugging Face for free in production.

NVIDIA also offers Nemotron models as NVIDIA NIM microservices for secure, scalable deployment, which requires an NVIDIA AI Enterprise license. You can try the Nemotron models and download the NIM microservices from build.nvidia.com.

Yes, NVIDIA is committed to publishing more Nemotron models, datasets, and techniques to enable open-source ecosystems.

NVIDIA Nemotron models are built on top of frontier open models, making it possible to build better models faster. Additionally, NVIDIA publishes the model weights, training datasets, and training techniques so the developer community can use these different parts of Nemotron to train their own models.

Yes. NVIDIA built the Llama Nemotron models on top of the Llama model family using NVIDIA’s open datasets and advanced techniques, such as Neural Architecture Search (NAS). The Llama Nemotron models inherit the parent Llama model license.

NVIDIA provides a variety of tools, such as NVIDIA Dynamo, TensorRT-LLM, and NIM, to run Nemotron models at scale in production. You can also use popular open-source libraries, such as SGLang and vLLM.

Next Steps

Ready to Get Started?

Use the right tools and technologies to take NVIDIA Nemotron models from development to production.

Get in Touch

Talk to an NVIDIA product specialist about moving from pilot to production with the security, API stability, and support that comes with NVIDIA AI Enterprise.

Stay Up to Date on NVIDIA Agentic AI News

Get the latest agentic AI news, technologies, breakthroughs, and more sent straight to your inbox.