MLOps
How to operate an ML system productively?
groups/mlopsvn
About me
➢ Working profile
○ 2016-2017: Machine Learning Engineer Freelancer
○ 2017-2019: Machine Learning Researcher @ University of Aizu
○ 2019-2020: Machine Learning Engineer @ Heligate
○ 2020-2022: Senior Machine Learning Engineer @ One Mount
○ 2022-present:
■ Expert Machine Learning Engineer @ MSB
■ Admin of MLOps VN
➢ Contact: https://www.linkedin.com/in/quan-dang/
Ordinary ML workflow
Manual process
Any problems?
❖ Manual executions
❖ Disconnection between DS & Ops engineers
❖ Infrequent release iterations
❖ No CI/CD
❖ No monitoring
Manual process
A new design with
new components
➢ Source control
➢ CI/CD tool
➢ Feature Store
➢ Data/ML Pipelines
➢ Model Registry
➢ ML Metadata Store
➢ Performance Monitoring
Automation
(https://ml-ops.org/content/mlops-principles)
Source control
● Git workflow
● Code version control
● Data version control
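Data version control tools such as DVC work by content-addressing: Git tracks a small hash file while the data itself lives in a cache keyed by that hash. A minimal stdlib-only sketch of that idea (function and directory names are illustrative, not DVC's API):

```python
import hashlib
from pathlib import Path

def snapshot(data_path: str, cache_dir: str = ".cache") -> str:
    """Store a copy of the file under its content hash and return the hash.

    Mimics the content-addressed storage idea behind tools like DVC:
    the short hash, not the large file, is what gets committed to Git.
    """
    data = Path(data_path).read_bytes()
    digest = hashlib.md5(data).hexdigest()
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    (cache / digest).write_bytes(data)
    return digest
```

Re-running `snapshot` on unchanged data yields the same hash, so a dataset version is reproducible from the hash alone.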
CI/CD tool
CI/CD user workflow
(https://docs.gitlab.com/ee/ci/introduction/)
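For an ML project, the CI/CD pipeline typically adds train and deploy stages after the usual tests. A hypothetical `.gitlab-ci.yml` sketch (stage, job, and script names are all illustrative assumptions):

```yaml
# Hypothetical GitLab CI sketch for an ML repo
stages:
  - test
  - train
  - deploy

unit_tests:
  stage: test
  script: pytest tests/

train_model:
  stage: train
  script: python pipelines/train.py
  rules:
    - if: $CI_COMMIT_BRANCH == "main"

deploy_model:
  stage: deploy
  script: python pipelines/deploy.py
  when: manual          # keep a human in the loop for promotion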
Feature store
● Why?
○ Feature reuse
○ Single source for both training
and serving (consistency)
○ Monitor for drift and quality
issues
https://www.tecton.ai/blog/what-is-a-feature-store
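The consistency argument above can be made concrete with a toy in-memory feature store: both the training pipeline and the online service fetch features through the same lookup, so definitions cannot silently diverge (class and method names are illustrative, not any real feature-store API):

```python
class FeatureStore:
    """Toy in-memory feature store: one lookup path shared by
    training and serving, which is what prevents train/serve skew."""

    def __init__(self):
        self._features = {}  # (entity_id, feature_name) -> value

    def write(self, entity_id, feature_name, value):
        self._features[(entity_id, feature_name)] = value

    def get_vector(self, entity_id, feature_names):
        # Both offline training and the online service call this,
        # so a feature is computed and named exactly one way.
        return [self._features.get((entity_id, f)) for f in feature_names]
```

A real feature store adds point-in-time correctness, an offline store for backfills, and a low-latency online store, but the single-source-of-truth lookup is the core idea.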
Data pipelines
ML pipelines
ML metadata store
● Run information
○ Description
○ Start/end time, duration
○ Executor
○ Parameters
● Artifacts in each step
○ Input/output data
○ Figures
○ HTML files
● Metrics
○ ML related: MAE, R², etc.
○ Data related: completeness, KS-test, etc.
MLFlow experiment tracking
Kubeflow Pipeline Metadata
(Trevor Grant et al., Kubeflow for Machine Learning)
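One of the data-related metrics mentioned above, the two-sample Kolmogorov-Smirnov test, reduces to a simple statistic: the maximum gap between two empirical CDFs. A minimal pure-Python sketch (in practice you would use `scipy.stats.ks_2samp`, which also returns a p-value):

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the two empirical CDFs. A large value suggests the
    feature's distribution has shifted between the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    max_gap = 0.0
    for x in points:
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap
```

Logging this statistic per feature per run into the metadata store gives a cheap drift signal between training data and live traffic.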
Model registry
A UI and a set of APIs to manage models:
● Model lineage (which experiment/run produced the model)
● Model versions
● Model state transitions, e.g., from Staging to Production
● Model description/documentation
● Model validation results/metrics
Model lineage and description
Model version and state
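The registry's job can be sketched in a few lines: auto-incremented versions per model name, lineage back to the producing run, and guarded stage transitions. A toy sketch (class name, stage names, and the allowed-transition table are illustrative, loosely modeled on MLflow's registry):

```python
class ModelRegistry:
    """Toy registry: tracks versions per model name and guards
    stage transitions (None -> Staging -> Production)."""

    _allowed = {
        None: {"Staging"},
        "Staging": {"Production", None},
        "Production": {"Staging", None},
    }

    def __init__(self):
        self._models = {}  # name -> list of version records

    def register(self, name, run_id, description=""):
        # Lineage: every version remembers the run that produced it.
        versions = self._models.setdefault(name, [])
        versions.append({"version": len(versions) + 1, "run_id": run_id,
                         "description": description, "stage": None})
        return len(versions)

    def transition(self, name, version, stage):
        entry = self._models[name][version - 1]
        if stage not in self._allowed[entry["stage"]]:
            raise ValueError(f"cannot move {entry['stage']} -> {stage}")
        entry["stage"] = stage
```

Guarding transitions in code is what lets CI/CD promote a model to Production only after its validation metrics pass.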
Performance monitoring (1)
Performance level:
● Data integrity: inspect the volume, variety, veracity, and velocity of incoming & outgoing data to detect outliers and anomalies
● Model drift:
○ Features drift: inputs X (features)
○ Target drift: outputs y (labels)
○ Concept drift: relationship between X and y
● Business metrics and ROI, e.g., Click-Through Rate (CTR) and engagement metrics in a social network company
System health:
● Number of requests (throughput)
● Average response time (latency)
● Number of failed requests
● IO/Memory/CPU usage
● Disk utilization
● System uptime
API outgoing data:
● Model metadata: name, version, docker image
● Input data (features)
● Predictions
● System actions
● Explanation
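The system-health numbers above fall out of a request log directly. A minimal stdlib sketch, assuming each request is a dict with a latency and an HTTP status (field names are illustrative):

```python
from statistics import mean

def summarize(requests, window_seconds):
    """Compute basic system-health metrics from a list of request
    records observed over a fixed time window."""
    latencies = [r["latency_ms"] for r in requests]
    return {
        "throughput_rps": len(requests) / window_seconds,
        "avg_latency_ms": mean(latencies) if latencies else 0.0,
        "failed": sum(r["status"] >= 500 for r in requests),
    }
```

In production these numbers are usually exported as Prometheus metrics rather than computed in batch, but the quantities are the same.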
Performance monitoring (2)
Seldon Core API dashboard
(https://github.com/SeldonIO/seldon-core/blob/master/README.md)
Performance monitoring (3)
Payload logging with ELK stack
(https://github.com/SeldonIO/seldon-core/blob/master/README.md)
Additional: Experimentation platform
A custom platform / on premises:
● EDA: JupyterHub, Jupyter Notebook
● IDE: Code server
● Pre-built Docker images with common libraries shared among team members
● Try to build your own libraries, e.g., a custom AutoML library with data in and model out
● Git for code versioning
● Data versioning (e.g., DVC)
● Experiment tracking (e.g., MLFlow)
On cloud:
● AWS SageMaker Studio
● Google Vertex AI
● Azure Machine Learning Studio
Additional: Model serving frameworks (1)
Serving frameworks span a spectrum from general purpose, to model agnostic, to framework specific.
Additional: Model serving frameworks (2)
Frameworks compared: KServe (0.9.0), Seldon Core (1.14.0), BentoML (1.0.0), Triton Serve (2.23.0), Ray Serve (1.13.0)
Features compared:
● GPU support
● Micro (adaptive) batching
● Offline batch serving
● Autoscaling
● Scale to 0
● Canary deployments
● A/B tests/MAB deployments
● Native Kafka integration
● gRPC
● Tracing
● Prometheus metrics
A comparison of model agnostic serving frameworks
Skill set
Roles and their intersections contributing to
the MLOps paradigm
(https://arxiv.org/pdf/2205.02302.pdf)
Study materials
● Books
● Courses
○ https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops
○ https://stanford-cs329s.github.io
○ https://fullstackdeeplearning.com
○ https://github.com/DataTalksClub/mlops-zoomcamp
○ https://github.com/alexeygrigorev/mlbookcamp-code
● Blogs
○ https://madewithml.com
○ https://mlops.community/blog
To sum up