NVIDIA Training Certified Instructor Program (CIP) | NVIDIA

Certified Instructor Program

Learn what it takes to become an NVIDIA-Certified Instructor.

Program Overview

The NVIDIA Training Certified Instructor Program (CIP) enables candidates to become certified to teach NVIDIA workshops. The program connects qualified instructors with high-quality training and hands-on course materials, lab access, and fully configured, GPU-accelerated workstations in the cloud. Through this program, candidates can be certified to teach workshops offered by the:

  • Deep Learning Institute (DLI): Provides training designed for those who build AI applications, including developers, data scientists, and software engineers.
  • NVIDIA Academy: Provides training designed for those who build, deploy, operate, and maintain AI infrastructure, including administrators and network engineers.

For an overview of the workshops we offer by topic, browse our course catalog here.

General Qualifications

To qualify for this program, candidates must fit into one of the following categories:

  • Associated with an organization that is an NVIDIA Authorized Learning Partner or Education Services Partner
  • Currently employed by an academic institution that qualifies for the DLI University Ambassador Program (for DLI workshops only)
  • Employed by NVIDIA

In addition, candidates must have requisite knowledge of the technology covered in the workshop(s) they wish to teach.

Consideration for acceptance into the program is based on:

  • The candidate’s teaching experience and knowledge of subject
  • Workshop availability in the candidate’s country/region
  • Acknowledgment of having read and accepted the Certified Instructor Agreement

Certification Process

Instructor candidates must complete rigorous, workshop-specific evaluations covering their technical qualifications, subject matter expertise, mastery of workshop content, and classroom delivery skills, as well as training on effective use of the content delivery platform. Certified instructors are expected to remain current on workshop updates to maintain their certification.

For DLI, instructor certification candidates need to complete the student assessment, followed by an interview with a Principal Instructor.

For the Academy, candidates need to complete a recorded training review, participate in live training and one or two train-the-trainer sessions, provide a teach-back session, and pass the certification exam.

The certification process includes the following steps: 

Teaching Experience Required

All candidates must demonstrate teaching experience, such as:

  • Classroom or virtual teaching experience delivering technical content to network or system professionals 
  • Significant presentation experience in instructor-led settings, including remote delivery via platforms like Teams or WebEx 
  • The ability to facilitate hands-on labs and guide troubleshooting exercises in a virtual environment 

Candidates must be able to effectively communicate complex technical concepts, adapt to varying learner skill levels, and foster an interactive, hands-on learning environment aligned with NVIDIA’s training standards.

DLI Workshop Qualifications

View the information for each workshop to ensure you meet the qualifications to teach it.

Academy Workshop Qualifications

View the information for each workshop to ensure you meet the qualifications to teach it.

Apply

Complete and submit the certified instructor application here.

Get Certified

At the beginning of each month, accepted applicants will be placed into a cohort that most closely aligns with their interests and expertise. Each cohort then goes through these steps:

  • DLI Instructor Candidates 
    • Complete the workshop as a student, including any programming assessments.
    • Optionally, review the recording of the workshop.
    • Pass an interview with an NVIDIA Principal Instructor.
  • Academy Instructor Candidates 
    • Review the recorded training session.
    • Participate in live training and one or two “train-the-trainer” sessions.
    • Successfully deliver a teach-back session.
    • Pass the relevant certification exam.

Note that all instructors are subject to periodic performance reviews.

Start Teaching

Once you have become certified, you can start teaching.

To schedule a workshop, log into the Certified Instructor Portal and submit the Request Workshop form.

Certified Instructors participate in ongoing evaluations, continuing education, workshop content reviews, student feedback reviews, and other activities as needed.

Get Certified in Additional Workshops

Take your skills to the next level by earning certifications in additional NVIDIA workshops. Broaden your teaching portfolio, unlock new professional opportunities, and deepen your mastery of cutting-edge AI, data science, and accelerated computing technologies.

Maintain Your Certified Status

To maintain your status, certified instructors are required to:

  • Maintain positive student feedback ratings.
  • Satisfy certification requirements for the current version of each course they are certified to deliver. Recertification is generally required for major content updates and to reinstate an instructor as “active” after they have become “inactive.”
  • Renew membership annually.
  • Adhere to the guidelines detailed in the Certified Instructor Agreement.
  • Ambassadors only: deliver at least two NVIDIA workshops per year to at least 40 total students.

Additional Resources

List of All DLI Workshops

View the latest list of DLI workshops.
The list can be filtered by topic.

List of All Academy Workshops

View the latest list of all Academy workshops.

NVIDIA Certified Instructor Directory

Looking to partner with a University Ambassador for an upcoming workshop?

Get Started Today

View our learning paths and select either your first or your next workshop to get certified to teach.

Questions?

Reach out if you have questions about our certified instructor program.

Stay Informed

Get training news, announcements, and more from NVIDIA, including the latest information on new self-paced courses, instructor-led workshops, free training, discounts, and more. You can unsubscribe at any time.

Fundamentals of Accelerated Computing with Modern CUDA C++

Practical Experience: 
Candidates must demonstrate significant prior experience working with NVIDIA CUDA/GPU-accelerated applications, either in a professional or meaningful academic scenario, and should be able to discuss the following aspects of their work:

  • How meaningful acceleration was achieved on a problem that could not be addressed as successfully in a CPU-only environment
  • Details about the applied strategies the applications use in relation to a GPU architecture
  • CUDA-specific technical challenges encountered and how they were addressed


Knowledge and Expertise:
Candidates should have the following:

  • Basic understanding of computer architecture (memory hierarchies, computing cores, etc.)
  • Foundational knowledge in parallel computing
  • Awareness of race conditions and familiarity with methods to prevent them
  • An understanding of synchronization mechanisms between threads/processes
  • Medium to advanced knowledge and experience in modern C++ programming, including understanding classes, functors, and Lambda functions
  • Knowledge and experience with the C++ Standard Template Library (STL), including extensive use of iterators
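The race-condition and synchronization items above are language-agnostic; as a minimal sketch (in Python rather than CUDA C++, purely for brevity), a lock serializing a shared read-modify-write update looks like this:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # Without the lock, counter += 1 is a read-modify-write race:
        # two threads can read the same value and lose an update.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 with the lock; typically less without it
```

The same reasoning carries over to GPU programming, where per-thread updates to shared data require atomics or explicit synchronization instead of a host-side lock.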

Fundamentals of Accelerated Computing with CUDA Python

Practical Experience: 
Candidates must provide evidence of significant work with an NVIDIA CUDA-accelerated application, either in a professional or meaningful academic scenario, and be prepared to talk about this work with others. They should be able to discuss:

  • How their applications provide meaningful acceleration on a problem that could not be addressed as successfully in a CPU-only environment
  • The specifics of the optimization strategies the applications use
  • Specific CUDA-related technical challenges that arose while developing the applications


Knowledge and Expertise:
Candidates should also have the following:

  • Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations. Basic NumPy competency, including familiarity with arrays and functions
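As a rough gauge of the NumPy level assumed, candidates should be comfortable with vectorized array operations like the following (a SAXPY-style computation, chosen here only as an illustration):

```python
import numpy as np

a = 2.0
x = np.arange(4, dtype=np.float32)   # [0, 1, 2, 3]
y = np.ones(4, dtype=np.float32)

# Elementwise arithmetic with no Python-level loop -- the same access
# pattern CUDA Python workshops accelerate on the GPU.
z = a * x + y
print(z)  # [1. 3. 5. 7.]
```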

Fundamentals of Deep Learning

Practical Experience: 
Candidates must demonstrate experience working on a computer vision task—image classification, object detection, etc.—using deep learning in either a professional or academic setting. Foundational knowledge of natural language processing (NLP), reinforcement learning (RL), and neural network architectures such as RNNs/LSTMs and GANs is required. Qualifying experience includes:

  • A professional role (e.g., data engineer, data scientist) architecting computer vision projects that use deep learning
  • Academic coursework in computer vision, NLP, RL, and neural network architectures


Knowledge and Expertise:
Candidates should also have the following:

  • Familiarity with basic programming fundamentals, such as functions and variables
  • Basic Python competency

Building Transformer-Based Natural Language Processing Applications

Practical Experience: 
Candidates must demonstrate experience working on at least one natural language processing (NLP) application using a transformer-based architecture (such as BERT), either in a commercial or academic capacity, and explain their work. Qualifying experience includes:

  • A professional role (e.g., engineer, data scientist) on an NLP project that used a transformer-based architecture
  • A completed NLP project that used a transformer-based architecture
  • Academic coursework in NLP transformer-based networks


Knowledge and Expertise:
Candidates should also have the following:

  • Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, array manipulations, and class objects/methods
  • Basic pandas and NVIDIA NeMo competency
  • Experience using NVIDIA Triton Inference Server

Enhancing Data Science Outcomes with Efficient Workflow

Practical Experience: 
Candidates must demonstrate significant experience with data science in Python using distributed computing for large datasets and should be able to discuss the following about their work:

  • Specifics about all aspects of their end-to-end workflows, including explaining their decisions and speaking knowledgeably about tools and libraries used
  • The use of various data transformations applied on input data for model consumption
  • The use of various machine learning algorithms in their work, with explanations of their decisions
  • Extensive use of Python data science libraries like pandas, NumPy, scikit-learn, and xgboost
  • Previous work with or on NVIDIA RAPIDS and Dask
  • Recognition of the iterative nature of data science and appreciation of hardware acceleration for rapid experimentation


Knowledge and Expertise:
Candidates should have the following:

  • Experience with Python and common data science libraries like pandas, NumPy, scikit-learn, and xgboost
  • Proficiency with DataFrame manipulation
  • Familiarity with distributed computing using Dask
  • Familiarity with end-to-end machine learning workflow
  • Proficiency with various machine learning models, specifically those of tree-based variant
  • Proficiency with model performance metrics, such as accuracy and inference performance
  • Familiarity with model tuning and its benefits
  • Knowledge of NVIDIA’s RAPIDS, NVTabular, and Triton Inference Server

Fundamentals of Accelerated Data Science

Practical Experience: 
Candidates must demonstrate significant experience with data science in Python and should be able to discuss the following about their previous work:

  • Specifics about all aspects of their end-to-end workflows, explaining their decisions, and speaking knowledgeably about tools and libraries used
  • The use of many DS/ML algorithms in their work, explaining their decisions
  • Extensive use of Python DS libraries like pandas, NumPy, scikit-learn, and NetworkX
  • Encouraged: previous work with Dask, Polars, and/or RAPIDS


Knowledge and Expertise:
Candidates should have the following:

  • Classroom teaching experience
  • Significant presentation experience

Applications of AI for Anomaly Detection

Practical Experience: 
Candidates must demonstrate significant experience in data science, machine learning, deep learning, and the telecommunications industry. They should have worked on at least one significant AI application, either in a commercial or academic capacity, and be able to explain their work. Qualifying experience includes:

  • A role as a major contributor to a project that used deep learning
  • A role as a major contributor to a project that used other machine learning techniques
  • A role as a major contributor to a project that required data science


Knowledge and Expertise:
Candidates should also have the following:

  • Professional data science experience using Python
  • A working understanding of NVIDIA RAPIDS
  • Significant experience in machine and deep learning, specifically XGBoost, autoencoder, and GAN models
  • Exposure to the telecommunications industry and cybersecurity, specifically networking and the threat of network intrusion

Applications of AI for Predictive Maintenance

Practical Experience: 
Candidates must demonstrate experience working on at least one deep learning application, either in a commercial or academic capacity, and explain their work. Qualifying experience includes:

  • Work or research experience applying deep learning to time-series data, including variations of autoencoder models, recurrent models (LSTMs), and GANs
  • Familiarity with measures of model accuracy, preferably in the context of industrial applications
  • Familiarity with machine learning techniques; a thorough understanding of the XGBoost algorithm is crucial to successful course delivery
  • Experience with at least one deep learning library; Keras and TensorFlow are preferred


Knowledge and Expertise:
Candidates should also be familiar with: 

  • Deep learning concepts—at a minimum, knowledge of artificial neural networks
  • Python, and common Python libraries used in deep learning (e.g., numpy, pandas, sklearn)
  • TensorFlow and Keras

Building Conversational AI Applications

Practical Experience: 
Candidates must demonstrate experience working on at least one conversational AI application using automatic speech recognition (ASR), natural language understanding (NLU), and/or text to speech (TTS), either in a commercial or academic capacity, and explain their work. Qualifying experience includes:

  • A professional role (e.g., engineer, data scientist) on a conversational AI project that used an ASR model to transcribe spoken language and process it
  • A completed conversational AI project for a virtual assistant application
  • Academic coursework in conversational AI using neural networks


Knowledge and Expertise:
Candidates should have the following:

  • Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, array manipulations, and class objects/methods
  • Experience using NVIDIA TAO Toolkit and NVIDIA Riva
  • Basic Linux command-line experience
  • Experience using Docker
  • Experience using Helm Charts and Kubernetes

Computer Vision for Industrial Inspection

Practical Experience: 
Candidates must demonstrate experience working on at least one deep learning application, either in a commercial or academic capacity, and explain their work. Qualifying experience includes:

  • Using deep learning techniques to tackle classification problems, preferably in the context of industrial applications
  • A professional role on a computer vision project that used deep learning techniques
  • Significant coursework in deep learning for computer vision that covers the various stages of the development workflow


Knowledge and Expertise:
Candidates should have:

  • Knowledge of Python and common Python libraries used in deep learning (e.g., numpy and pandas)
  • Familiarity with end-to-end machine learning workflow
  • Familiarity with manipulating data using pandas DataFrame
  • Familiarity with deep learning concepts, including knowledge of convolutional neural networks
  • Familiarity with at least one deep learning framework (Keras and TensorFlow are preferred)
  • Familiarity with metrics, such as accuracy and inference performance
  • Familiarity with command-line interface and basic Linux commands
  • Familiarity with transfer learning and fine-tuning models
  • Knowledge of NVIDIA’s DALI, TAO Toolkit, TensorRT, and Triton Inference Server

Data Parallelism: How to Train Deep Learning Models on Multiple GPUs

Practical Experience: 
Candidates must demonstrate experience working on at least one deep learning application, either in a commercial or academic capacity, and explain their work. Qualifying experience includes:

  • Deploying deep learning training workloads to multiple GPUs and, preferably, multi-node clusters
  • Using data-parallel approaches to distributed deep learning
  • Profiling and optimizing deep learning code
  • Using NVIDIA NGC containers
  • Building neural networks with PyTorch
  • Using PyTorch DDP to deploy distributed training


Knowledge and Expertise:
Candidates should have the following:

  • Strong grasp of the literature discussing implications of training deep neural networks with large batches—in particular, a good understanding of the LARS/LARC algorithm
  • Knowledge of the process used in training deep neural networks—in particular, an understanding of the stochastic gradient descent and backpropagation algorithms
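As a reference point for the SGD item above, here is a minimal sketch of stochastic gradient descent on a one-parameter least-squares problem (synthetic data; all names and constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 3.0 * X + rng.normal(scale=0.1, size=100)   # true weight is 3.0

w, lr = 0.0, 0.1
for _ in range(200):
    i = rng.integers(len(X))               # stochastic: one random sample per step
    grad = 2 * (w * X[i] - y[i]) * X[i]    # gradient of the squared error for sample i
    w -= lr * grad                         # gradient-descent update

print(w)  # converges to roughly 3.0
```

Data-parallel training distributes exactly this loop: each GPU computes gradients on its shard of the batch, and the gradients are averaged before the update, which is why large-batch effects (and methods like LARS/LARC) matter.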

Model Parallelism: Building and Deploying Large Neural Networks

Practical Experience: 
Candidates must demonstrate experience working on a model-parallelism-related task using deep learning in either a professional or academic setting. Foundational knowledge of optimization techniques, such as activation checkpointing, mixed precision training, and gradient accumulation, is required. Qualifying experience includes:

  • A professional role (e.g., data engineer, data scientist) architecting deep learning projects that use distributed systems like the cloud or multi-GPU machines
  • Academic coursework in large neural network architectures, such as GPT-3


Knowledge and Expertise:
Candidates must also demonstrate: 

  • An understanding of the Slurm, NVIDIA Triton, and DeepSpeed technologies
  • An understanding of the differences between model and data parallelism

Adding New Knowledge to LLMs

Practical Experience: 
Candidates must demonstrate experience working on at least one LLM application involving fine-tuning, either in a commercial or academic capacity, and explain their work. Qualifying experience could include:

  • A professional role (e.g., engineer, data scientist) on an LLM fine-tuning project
  • A completed project that involved fine-tuning an LLM
  • Academic coursework


Knowledge and Expertise:
Candidates should have expertise in the following areas:

  • Differentiating RAG, fine-tuning, and alignment
  • Strategies to drive the creation of diverse synthetic data sets
  • Parameter-efficient fine-tuning
  • Pruning techniques
  • Distillation techniques
  • Large language model (LLM) decoding strategies, such as top-k/p and beam search
  • LLM output evaluation techniques, such as ROUGE/BLEU, semantic similarity, and LLM-as-a-judge
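As an illustration of the decoding-strategies item, top-k sampling can be sketched in a few lines of NumPy (toy logits; real decoders operate on a model's full vocabulary):

```python
import numpy as np

def top_k_sample(logits, k, rng):
    """Keep the k highest logits, renormalize with softmax, sample one token id."""
    idx = np.argsort(logits)[-k:]                 # indices of the top-k logits
    probs = np.exp(logits[idx] - logits[idx].max())
    probs /= probs.sum()                          # softmax over the surviving logits
    return int(rng.choice(idx, p=probs))

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])
token = top_k_sample(logits, k=2, rng=rng)
print(token)  # always 0 or 1: only the two highest logits survive
```

Top-p (nucleus) sampling works the same way but truncates by cumulative probability mass rather than a fixed count.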

Building Agentic AI Applications with LLMs

Practical Experience: 
Candidates must demonstrate experience working on at least one agentic AI application, either in a commercial or academic capacity, and explain their work. Qualifying experience could include:

  • A professional role (e.g., engineer, data scientist)
  • A completed project
  • Academic coursework


Knowledge and Expertise:
Candidates should have experience with the following:

  • Modern LangChain, including LCEL, LangGraph, etc.
  • Differentiating capabilities of various agentic tools, such as LangGraph, CrewAI, and Autogen
  • Stateful large language model (LLM) systems
  • LLM tool calling
  • Strategies to prevent derailing
  • Agentic routing

Building LLM Applications with Prompt Engineering

Practical Experience: 
Candidates must demonstrate experience working on at least one large language model (LLM) application using a programmatic interface, either in a commercial or academic capacity, and explain their work. Qualifying experience could include:

  • A professional role (e.g., engineer, data scientist)
  • A completed project
  • Academic coursework


Knowledge and Expertise:
Candidates should have experience with the following:

  • Development in Python, including an understanding of Pydantic
  • Modern LangChain, including LCEL, LangGraph, etc.
  • LLM next token prediction decoding methods 
  • How LLM models are developed (pretraining, alignment, instruction-tuning, etc.)
  • LLM prompting techniques, including iterative, zero/one/few-shot, and chain-of-thought prompting
  • Agents using tools, such as ReAct

Building Multimodal Pipelines with Large Language Models

Practical Experience: 
Candidates must demonstrate experience working on at least one generative AI application incorporating inputs of multiple modalities, either in a commercial or academic capacity, and explain their work. Qualifying experience could include:

  • A professional role (e.g., engineer, data scientist)
  • A completed project
  • Academic coursework


Knowledge and Expertise:
Candidates should have experience with the following:

  • Details of implementing contrastive pretraining
  • Techniques for combining embeddings of various modalities, such as vision-language model projections
  • Tools for chunking documents, including text, titles, figures, charts, and tables
  • Graph retrieval-augmented generation (RAG) and associated technology, such as knowledge bases and cypher queries
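The "vision-language model projections" item above amounts to a learned linear map from the vision encoder's embedding space into the language model's embedding space; the sketch below uses random weights and arbitrary stand-in dimensions (512 → 768) purely to show the shape of the operation:

```python
import numpy as np

rng = np.random.default_rng(0)

img_emb = rng.normal(size=(512,))        # hypothetical vision-encoder output
W = rng.normal(size=(768, 512)) * 0.01   # projection matrix; learned in a real VLM

# Map the image embedding into the language model's embedding space so it
# can be consumed alongside text token embeddings.
txt_space = W @ img_emb
print(txt_space.shape)  # (768,)
```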

Building RAG Agents with LLMs

Practical Experience: 
Candidates must demonstrate significant experience in data science, machine learning, and deep learning. They must have worked on at least one significant AI application, either in a commercial or academic capacity, and explain their work. Qualifying experience includes:

  • Active open-source contribution or coordination efforts in the area
  • Experience orchestrating dialog management and information retrieval systems
  • Strong applied software engineering expertise, especially surrounding microservices and inference server solutions


Knowledge and Expertise:
Candidates should have the following:

  • Strong proficiency in Python, including functional programming and server deployment
  • Expertise in large language models as inference endpoints, including industry use cases
  • Strong experience with modern LangChain (including LCEL) and LangServe required; understanding of LangGraph, LlamaIndex, LangSmith, and NeMo Guardrails useful
  • Experience with microservice/server orchestration, including Docker and FastAPI
  • Experience with modern retrieval-augmented generation (RAG), including some derivative formulations and pros/cons
  • Understanding of agentic behavior, tooling, and modular agent components
  • Intuition of evaluation metrics and performance expectations

Deploying RAG Pipelines for Production at Scale

Practical Experience: 
Candidates must demonstrate significant experience in deployment of retrieval systems. They must have worked on at least one significant AI application, either in a commercial or academic capacity, and explain their work. Qualifying experience includes:

  • Active open-source contribution or coordination efforts in the area
  • Experience orchestrating dialog management and information retrieval systems
  • Strong applied software engineering expertise, especially surrounding microservices and inference server solutions


Knowledge and Expertise:
Candidates should have experience with the following:

  • Strong proficiency in Python, including functional programming and server deployment
  • Expertise in large language models as inference endpoints, including industry use cases
  • Experience with microservice/server orchestration, including Docker and FastAPI
  • Retrieval systems, including combination of embedders and rerankers
  • Intuition of evaluation metrics and performance expectations
  • Container orchestration platforms, especially Kubernetes clusters
  • Monitoring tools like Prometheus
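The two-stage retrieval item above (embedder recall followed by reranking) can be sketched with toy two-dimensional vectors standing in for real embedding-model outputs:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy document embeddings (in practice these come from an embedding model).
docs = {
    "d0": np.array([1.0, 0.0]),
    "d1": np.array([0.7, 0.7]),
    "d2": np.array([0.0, 1.0]),
}
query = np.array([0.9, 0.1])

# Stage 1: cheap embedder recall -- keep the top-2 by cosine similarity.
recalled = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)[:2]

# Stage 2: reranker rescoring. Here we reuse cosine similarity as a stand-in;
# a real reranker (e.g., a cross-encoder) scores (query, doc) pairs jointly.
ranked = sorted(recalled, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked)  # ['d0', 'd1']
```

The split matters at production scale: the embedder stage is fast enough to scan a large index, while the more expensive reranker only sees the recalled candidates.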

Generative AI with Diffusion Models

Practical Experience: 
Candidates must demonstrate thorough, up-to-date experience with deep learning, computer vision, and diffusion models. Ideal candidates should have background knowledge of surrounding material, as well as active roles that expose them to the latest trends, innovations, and emerging intuitions. Qualifying experiences include:

  • A professional role (e.g., machine learning engineer, data scientist) architecting deep learning projects that generate images
  • Active open-source contribution or coordination efforts in the area
  • Academic coursework in using AI to generate images


Knowledge and Expertise:
Candidates should have the following:

  • Proficiency in Python and PyTorch
  • Active intuitive understanding of CLIP and multimodal AEs/VAEs/GANs/Stable Diffusion
  • Intuitive understanding of audio/video/image classification/captioning/transcription
  • Foundation in statistics, including the normal distribution and random sampling

Rapid Application Development Using Large Language Models (LLMs)

Practical Experience: 
Candidates must demonstrate thorough, up-to-date experience with deep learning, large language models, and agent systems. Ideal candidates should have background knowledge of surrounding material, as well as active roles that expose them to the latest trends, innovations, and emerging intuitions. Qualifying experiences include:

  • Chat model/multimodal model architecture design experience
  • Experience with the training loop and pipeline assumptions/intuitions
  • Active open-source contribution or coordination efforts in the area
  • Experience orchestrating dialog management and information retrieval systems


Knowledge and Expertise:
Candidates should have the following:

  • Advanced proficiency with Python, sufficient for reading Hugging Face source code
  • Comfort with Hugging Face, including serialization, model release, HF Transformers, etc.
  • Experience designing systems with LLM constituent components
  • Familiarity with PyTorch, deep learning, generative AI, multimodal models, etc.
  • Understanding of experimentation/deployment with LLM systems, including hardware requirements, safety considerations, and evaluation techniques
  • Intuitive understanding of audio/video/image classification/captioning/transcription
  • Active intuitive understanding of CLIP and multimodal AEs/VAEs/GANs/Stable Diffusion
  • LangChain experience, including intuitions and details of current developments
  • Familiarity with retrieval-augmented generation (RAG), including LlamaIndex, VDB services, retriever models, and more
  • Comfort with NVIDIA value propositions surrounding LLMs, RAG, NVIDIA NeMo, etc.

NVIDIA Isaac for Accelerated Robotics

Practical Experience: 
Candidates must demonstrate experience working on at least one robotics simulation project, either in a commercial or academic capacity, and explain their work. Qualifying experience could include:

  • A professional role (e.g., engineer, data scientist)
  • A completed project
  • Academic coursework


Knowledge and Expertise:
Candidates should have experience with the following:

  • NVIDIA Isaac Sim
  • OpenUSD
  • URDF models
  • ROS2
  • Robotic navigation (e.g., SLAM)
  • Synthetic data generation

Cumulus Linux Administration

Candidates must demonstrate comprehensive, up-to-date expertise in data center networking. Ideal candidates have hands-on expertise in advanced AI networking technologies and real-time monitoring.

Practical Experience: 

  • Professional experience (e.g., network engineer, system administrator, infrastructure engineer, solutions architect, DevOps engineer, trainer) deploying, configuring, and managing Cumulus Linux-based network environments in production data centers


Knowledge and Expertise:

  • Proficiency in Linux administration (shell, config management, troubleshooting) 
  • Strong knowledge of Ethernet networking, switching, and routing 
  • NVIDIA networking hardware experience preferred 
  • Layer 2 and Layer 3 networking: VLANs, bridging, trunking, link aggregation (LAG/MLAG), SVIs, VRR, VRF, and BGP (including BGP unnumbered)
  • Network virtualization using VXLAN and EVPN, including both symmetric and asymmetric routing models 
  • Experience with network automation tools and workflows (e.g., Ansible, REST APIs, Zero Touch Provisioning)
  • Monitoring, diagnostics, and troubleshooting across multiple network layers, including hardware resource monitoring and OpenTelemetry 


Preferred: 

  • NVIDIA Cumulus Linux or AI Networking certification 
  • Active involvement in open-source networking projects or community

Spectrum-X Networking Platform Administration

Candidates must demonstrate comprehensive, up-to-date expertise in data center networking. Ideal candidates have hands-on expertise in advanced AI networking technologies and real-time monitoring. 

Practical Experience: 

  • Professional roles such as network engineer, DevOps engineer, technical instructor, or system administrator working with Spectrum-X, Cumulus Linux, and AI data center networks
  • Cumulus Linux Certified Instructor 
  • Experience with NVIDIA Air cluster simulation and deployment 
  • Real-time monitoring and troubleshooting with NVIDIA NetQ, Cumulus Linux CLI, and telemetry tools (ASIC, OTLP, DTS) 


Knowledge and Expertise:

  • Cumulus Linux: Hands-on experience with installation, configuration, upgrades, Layer 2/3 features, network virtualization (VXLAN/EVPN), automation, and troubleshooting 
  • Networking: Strong knowledge of Ethernet, switching, routing, and data center networking automation 
  • Linux: Proficiency in Linux administration and managing Linux-based network environments 


Preferred:

  • NVIDIA Spectrum-X or AI Networking certification 
  • Active involvement in open-source networking projects or community

AI Infrastructure

Candidates must demonstrate comprehensive, up-to-date expertise in deploying and managing AI data center infrastructure, including compute, networking, storage, and virtualization. Ideal candidates have hands-on experience with advanced AI infrastructure technologies and workflows. 

Practical Experience: 

  • Professional roles such as data center administrator, DevOps engineer, system administrator, or AI infrastructure engineer working with enterprise-scale AI environments
  • Direct, practical experience with:
    • Deploying and managing AI compute platforms (GPUs, CPUs, DPUs)
    • Building and maintaining InfiniBand and Ethernet networks for AI workloads
    • Storage architecture and performance optimization for AI data centers
    • Virtualization technologies (VMs, containers, GPU partitioning with vGPU/MIG)
    • Installing and managing NVIDIA software (GPU drivers, DOCA, NVIDIA NGC containers, NVIDIA AI Enterprise Suite)
    • Using management tools such as NVIDIA Base Command Manager (BCM) for AI cluster provisioning and operations


Knowledge and Expertise:

  • Networking: Strong knowledge of Ethernet and InfiniBand, switching, routing, and advanced data center networking automation 
  • Linux: Proficiency in Linux system administration (user management, configuration, troubleshooting) 
  • Storage: Understanding of file systems, storage protocols, and performance testing 
  • Virtualization: Experience with VMs, containers, and GPU virtualization 
  • AI Concepts: Familiarity with machine learning, deep learning, and common AI applications

AI Operations

Candidates must demonstrate comprehensive, up-to-date expertise in operating and managing AI data center environments, including compute, networking, storage, and virtualization. Ideal candidates have hands-on experience with advanced AI data center operations and workflows. 

Practical Experience: 

  • Professional roles such as data center administrator, DevOps engineer, system administrator, AI infrastructure engineer, or data scientist working with enterprise-scale AI environments
  • Direct, practical experience with:
    • Operating and managing AI compute platforms (GPUs, CPUs, DPUs)
    • Provisioning and managing AI workloads and virtualization in data centers
    • Building and maintaining InfiniBand and Ethernet networks for AI workloads
    • Storage architecture and performance optimization for AI data centers
    • Virtualization technologies (VMs, containers, GPU partitioning)
    • Installing and managing NVIDIA software (GPU drivers, DOCA, NGC containers, AI Enterprise Suite)
    • Using management tools such as NVIDIA DCGM, UFM, and BlueField management utilities


Knowledge and Expertise:

  • Networking: Strong knowledge of Ethernet and InfiniBand, switching, routing, and data center networking automation 
  • Linux: Proficiency in Linux system administration (user management, configuration, troubleshooting) 
  • Storage: Understanding of file systems, storage protocols, and performance testing 
  • Virtualization: Experience with VMs, containers, and GPU virtualization 
  • AI Concepts: Familiarity with machine learning, deep learning, and common AI applications