Evaluation Tutorials

Use these tutorials to become familiar with NVIDIA NeMo Evaluator.

Tip

Tutorials are organized by complexity and typically build on one another.

Before You Start

Set up Evaluator before following these tutorials. Refer to the Demo Cluster Setup on minikube guide, or to the production deployment guides for the platform and for Evaluator individually.

Tip

The tutorials reference an EVALUATOR_BASE_URL whose value depends on the ingress in your cluster. If you are using the minikube demo installation, it is http://nemo.test. The demo installation’s value for NIM_PROXY_BASE_URL is http://nim.test. Otherwise, ask your cluster administrator for the ingress values.
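
One way to keep these two values in a single place is a small helper that reads them from the environment and falls back to the minikube demo values. This is a minimal sketch, not part of NeMo Evaluator: the function name resolve_base_urls and the choice to read the values from environment variables are assumptions made for illustration. Only the variable names and the two demo URLs come from the tip above.

```python
import os

# Defaults taken from the minikube demo installation described above.
DEMO_EVALUATOR_BASE_URL = "http://nemo.test"
DEMO_NIM_PROXY_BASE_URL = "http://nim.test"


def resolve_base_urls(env=None):
    """Return (evaluator_url, nim_proxy_url), preferring values set in the environment.

    `resolve_base_urls` is a hypothetical helper, not a NeMo Evaluator API.
    """
    env = os.environ if env is None else env
    # Strip trailing slashes so paths can be appended consistently later.
    evaluator = env.get("EVALUATOR_BASE_URL", DEMO_EVALUATOR_BASE_URL).rstrip("/")
    nim_proxy = env.get("NIM_PROXY_BASE_URL", DEMO_NIM_PROXY_BASE_URL).rstrip("/")
    return evaluator, nim_proxy


if __name__ == "__main__":
    evaluator_url, nim_proxy_url = resolve_base_urls()
    print(f"EVALUATOR_BASE_URL={evaluator_url}")
    print(f"NIM_PROXY_BASE_URL={nim_proxy_url}")
```

On a non-demo cluster, export EVALUATOR_BASE_URL and NIM_PROXY_BASE_URL with the ingress values from your administrator before running the script, and the demo defaults are ignored.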


Run an Academic LM Harness Eval

Learn how to run an academic LM Harness evaluation.

Run an LLM Judge Eval

Learn how to evaluate a fine-tuned model using the LLM Judge metric with a custom dataset.

Evaluate a Fine-tuned Model

Learn how to evaluate a fine-tuned model.

Customize and Evaluate Large Language Models