About Fine-Tuning — NVIDIA NeMo Microservices

About Fine-Tuning

Learn how to fine-tune a model by sending API requests to the NVIDIA NeMo Customizer microservice. You can deploy the fine-tuned models you create using NVIDIA NIM microservices.

Fine-Tuning Workflow

At a high level, the fine-tuning workflow consists of the following steps:

1. Choose a model, or create a new customization target and customization config, to train a custom model.
2. Format a compatible dataset.
3. Create a customization job.
4. Monitor the job until it completes.
5. If you trained with all_weights, deploy the model using the Deployment Management Service.
6. Evaluate the output model.
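The monitoring and deployment steps above can be sketched as a small polling loop. This is a minimal sketch, not the Customizer client: `fetch_status` is a hypothetical callable that you would implement against the job status endpoint, and the status strings in `TERMINAL_STATUSES` are assumptions rather than values taken from the API specification.

```python
import time
from typing import Callable

# Assumed terminal states; the real service may report different strings.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[], str],
    poll_interval_s: float = 30.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns a terminal status, then return it."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_interval_s)  # wait before asking again
    raise TimeoutError(f"job not finished after {max_polls} polls")


def needs_deployment(finetuning_type: str, final_status: str) -> bool:
    """Mirror the workflow rule: a finished all_weights job is deployed separately."""
    return final_status == "completed" and finetuning_type == "all_weights"
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays, for example `wait_for_job(my_fetch, sleep=lambda s: None)` in a dry run.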

Model Catalog

Explore the model families and sizes supported by the NVIDIA NeMo Customizer microservice.

Llama Models

View the available Llama models in the model catalog.

Llama Nemotron Models

View the available Llama Nemotron models from NVIDIA, including Nano and Super variants for efficient and advanced instruction tuning.

Phi Models

View the available Phi models from Microsoft, designed for strong reasoning capabilities with efficient deployment.

GPT-OSS Models

View the available GPT-OSS models supported for Full SFT customization.

Embedding Models

View the available embedding models for question-answering and retrieval tasks.

For hardware compatibility, configurations defined for A100 GPUs also work with B200 GPUs. Refer to configuration management for details.

Task Guides

Perform common fine-tuning tasks.

Manage Customization Targets

Create, list, view, and delete customization targets.

Manage Customization Configs

View available customization configurations to use when creating a customization job.

Manage Customization Jobs

Create, list, view, and cancel customization jobs.

Tutorials

Follow these tutorials to learn how to accomplish common fine-tuning tasks.

Format Training Datasets

Learn how to format datasets for different model types.

datasets chat-models completion-models
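As a rough illustration of how the two dataset shapes differ, the sketch below writes records as JSON Lines. The field names (`prompt`/`completion` for completion models, `messages` with `role`/`content` for chat models) are assumptions based on common conventions; the tutorial is the authority on the exact schema each model type expects.

```python
import json
from typing import Iterable


def to_completion_record(prompt: str, completion: str) -> dict:
    """One training example for a completion model (assumed field names)."""
    return {"prompt": prompt, "completion": completion}


def to_chat_record(turns: Iterable[tuple[str, str]]) -> dict:
    """One training example for a chat model from (role, content) pairs (assumed schema)."""
    return {"messages": [{"role": role, "content": content} for role, content in turns]}


def to_jsonl(records: Iterable[dict]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)
```

For example, `to_jsonl([to_chat_record([("user", "Hi"), ("assistant", "Hello")])])` yields a single line that can be appended to a training file.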

Start a LoRA Customization Job

Learn how to start a LoRA customization job using a custom dataset.

nemo-customizer
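To give a sense of what a LoRA job request might contain, here is a hedged sketch that only builds a request body. Every field name (`config`, `dataset`, `hyperparameters`, `finetuning_type`, `adapter_dim`, and so on) and every default value is an assumption made for illustration; check them against the Customizer API specification and the hyperparameter reference before use.

```python
import json


def build_lora_job_request(
    config: str,
    dataset_name: str,
    dataset_namespace: str = "default",
    epochs: int = 3,
    batch_size: int = 16,
    learning_rate: float = 1e-4,
    adapter_dim: int = 16,
) -> dict:
    """Build a JSON-serializable body for a LoRA customization job (assumed schema)."""
    if epochs < 1 or batch_size < 1 or adapter_dim < 1:
        raise ValueError("epochs, batch_size, and adapter_dim must be positive")
    if learning_rate <= 0:
        raise ValueError("learning_rate must be positive")
    return {
        "config": config,  # name of a customization config (hypothetical value)
        "dataset": {"name": dataset_name, "namespace": dataset_namespace},
        "hyperparameters": {
            "training_type": "sft",     # assumed field and value
            "finetuning_type": "lora",  # assumed field and value
            "epochs": epochs,
            "batch_size": batch_size,
            "learning_rate": learning_rate,
            "lora": {"adapter_dim": adapter_dim},
        },
    }


# Example: serialize the body before sending it with an HTTP client of your choice.
body = json.dumps(build_lora_job_request("example-config", "my-dataset"), indent=2)
```

Keeping the payload builder separate from the HTTP call makes it straightforward to validate hyperparameters locally before a job is submitted.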

Start a Full SFT Customization Job

Learn how to start a Full SFT customization job using a custom dataset.

nemo-customizer

Start a Knowledge Distillation Job

Learn how to start a Knowledge Distillation (KD) job using a teacher and student model.

nemo-customizer distillation

Check Customization Job Metrics

Learn how to check job metrics using MLflow or Weights & Biases.

nemo-customizer mlflow wandb

Optimize Tokens per GPU

Learn how to optimize the tokens-per-GPU throughput for a LoRA customization job.

nemo-customizer wandb sequence-packing
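The intuition behind sequence packing can be shown with a toy example: short samples padded to the maximum sequence length waste most of each row, while packing several samples into one row raises the share of real tokens. The sketch below uses a generic first-fit-decreasing heuristic chosen for illustration; it is not a description of how Customizer packs sequences.

```python
def pack_first_fit(lengths: list[int], max_seq_len: int) -> list[list[int]]:
    """Group sample lengths into rows of at most max_seq_len tokens (first-fit decreasing)."""
    rows: list[list[int]] = []
    free: list[int] = []  # remaining capacity of each row
    for n in sorted(lengths, reverse=True):
        if n > max_seq_len:
            raise ValueError(f"sample of {n} tokens exceeds max_seq_len={max_seq_len}")
        for i, room in enumerate(free):
            if n <= room:  # first row with enough space
                rows[i].append(n)
                free[i] -= n
                break
        else:  # no existing row fits, so open a new one
            rows.append([n])
            free.append(max_seq_len - n)
    return rows


def utilization(num_rows: int, real_tokens: int, max_seq_len: int) -> float:
    """Fraction of token slots that hold real (non-padding) tokens."""
    return real_tokens / (num_rows * max_seq_len)


lengths = [100, 900, 300, 700, 500, 500]
packed = pack_first_fit(lengths, max_seq_len=1000)
unpacked_util = utilization(len(lengths), sum(lengths), 1000)  # one sample per row
packed_util = utilization(len(packed), sum(lengths), 1000)     # several samples per row
```

In this toy case the six samples fit into three full rows, so the same number of real tokens is processed with half as many rows.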

References

Hyperparameters

View the available hyperparameters and their valid options that you can set when creating a customization job.

Customizer API

View the OpenAPI specification for Customizer.

Troubleshoot Failed Jobs

View troubleshooting tips for failed jobs.
