
DISTRIBUTED MACHINE LEARNING
Machine Learning
Fall 2024
Zahra Keshavarz Rezaei
What is Distributed Machine Learning?

Distributed Machine Learning refers to the process of training machine learning models using multiple machines or processors simultaneously. This allows the handling of massive datasets and complex models that a single machine cannot efficiently process.
Distributed Machine Learning vs. Federated Learning
In distributed ML, a centrally held dataset is partitioned across the machines of one cluster; in federated learning, the data stays on the participants' devices and only model updates are shared with a coordinating server.
Types of Distributed Learning:

Data Parallelism
Model Parallelism
Hybrid Parallelism
Model Parallelism
Split the model itself (e.g., layers of a neural network) across
different nodes. One node processes input layers, while
another processes hidden layers.
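A minimal sketch of this layer-wise split in PyTorch, assuming two GPUs ("cuda:0" and "cuda:1") are available; the model and layer sizes are illustrative, not taken from the slides.

import torch
import torch.nn as nn

class SplitModel(nn.Module):
    # Illustrative model-parallel split: the input block lives on cuda:0,
    # the output block on cuda:1.
    def __init__(self):
        super().__init__()
        self.input_block = nn.Sequential(nn.Linear(784, 512), nn.ReLU()).to("cuda:0")
        self.output_block = nn.Linear(512, 10).to("cuda:1")

    def forward(self, x):
        h = self.input_block(x.to("cuda:0"))
        # Activations are copied across devices at the split point.
        return self.output_block(h.to("cuda:1"))

model = SplitModel()
logits = model(torch.randn(32, 784))  # one forward pass spans both devices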
Data Parallelism
Split the training data into smaller subsets. Each worker
(node) processes its subset. Think of each node as working
on a few puzzle pieces to contribute to the entire picture.
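A minimal data-parallel sketch using PyTorch's DistributedDataParallel, assuming the script is launched with torchrun so that each process becomes one worker; the toy dataset and linear model are illustrative.

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

dist.init_process_group(backend="gloo")  # one process per worker

model = DDP(torch.nn.Linear(20, 2))      # DDP averages gradients across workers

# DistributedSampler gives each worker a disjoint subset of the training data.
dataset = TensorDataset(torch.randn(1024, 20), torch.randint(0, 2, (1024,)))
loader = DataLoader(dataset, batch_size=32, sampler=DistributedSampler(dataset))

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

for x, y in loader:
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()  # gradients are all-reduced during backward()
    optimizer.step()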
Key Algorithms in DML

Stochastic Gradient Descent (SGD)
Mini-batch SGD distributed across nodes.
Gradient Aggregation
Aggregates gradients computed by multiple nodes.
Synchronous vs. Asynchronous Training

Synchronous: all workers wait for one another and apply the aggregated update to the weights together at each step.

Asynchronous: workers update the shared weights independently, without waiting for slower nodes (at the risk of computing with stale parameters).
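A plain NumPy sketch of synchronous, distributed mini-batch SGD with gradient aggregation; the linear model, learning rate, and simulated workers are illustrative.

import numpy as np

def local_gradient(w, x_shard, y_shard):
    # Mean-squared-error gradient for a linear model on one worker's shard.
    return 2 * x_shard.T @ (x_shard @ w - y_shard) / len(y_shard)

rng = np.random.default_rng(0)
w = np.zeros(5)
# Four simulated workers, each holding its own shard of the training data.
shards = [(rng.normal(size=(64, 5)), rng.normal(size=64)) for _ in range(4)]

for step in range(100):
    # Each worker computes a gradient on its own mini-batch (in parallel on a real cluster).
    grads = [local_gradient(w, x, y) for x, y in shards]
    # Synchronous aggregation: average the gradients, then apply one shared update.
    w -= 0.01 * np.mean(grads, axis=0)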


AllReduce Algorithm

A communication pattern used to aggregate gradients across all workers.
Reduces communication overhead by combining operations (e.g., summing gradients) during data transfer.
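A minimal sketch of calling an all-reduce with torch.distributed, assuming the process group is launched with torchrun; the gradient tensor here is a stand-in for real model gradients.

import torch
import torch.distributed as dist

dist.init_process_group(backend="gloo")  # assumes launch via torchrun

local_grad = torch.randn(10)             # this rank's locally computed gradient

# After all_reduce, every rank holds the sum of all ranks' gradients.
dist.all_reduce(local_grad, op=dist.ReduceOp.SUM)

# Divide by the number of workers to get the averaged gradient for the update.
local_grad /= dist.get_world_size()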
Ring-AllReduce Algorithm
An optimized implementation of AllReduce.
Workers are organized in a ring topology.
Each worker sends gradients to one neighbor and receives from the other in a pipelined fashion.
Spreads communication evenly around the ring, avoiding the bandwidth bottleneck of aggregating everything at a single node.
Two phases: Scatter-Reduce (partial sums travel around the ring), then All-Gather (the completed sums are passed around so every worker receives them).
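A serial NumPy simulation of the two Ring-AllReduce phases described above; the worker count, gradient size, and chunking are illustrative.

import numpy as np

def ring_allreduce(chunks):
    # chunks[i][c] is chunk c of worker i's gradient; on return, every worker
    # holds the element-wise sum of all workers' gradients.
    n = len(chunks)

    # Scatter-Reduce: each worker passes one chunk to its right-hand neighbour,
    # which adds it to its own copy; after n-1 steps worker i owns the complete
    # sum for chunk (i + 1) % n.
    for step in range(n - 1):
        for i in range(n):
            c = (i - step) % n
            chunks[(i + 1) % n][c] = chunks[(i + 1) % n][c] + chunks[i][c]

    # All-Gather: the completed chunks travel around the ring once more, so
    # every worker ends up with every fully reduced chunk.
    for step in range(n - 1):
        for i in range(n):
            c = (i + 1 - step) % n
            chunks[(i + 1) % n][c] = chunks[i][c]
    return chunks

rng = np.random.default_rng(0)
grads = [rng.normal(size=8) for _ in range(4)]             # one gradient per worker
chunks = [list(np.split(g.copy(), 4)) for g in grads]      # split each into 4 chunks
result = ring_allreduce(chunks)
assert np.allclose(np.concatenate(result[0]), sum(grads))  # every worker has the sum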
Parameter Server Architecture

Dedicated nodes responsible for storing and updating the model parameters (weights, biases, etc.).
Aggregate gradients from workers and send updated parameters back.
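A toy sketch of the parameter-server pattern in plain Python/NumPy; the pull/push interface and the averaging rule are illustrative simplifications.

import numpy as np

class ParameterServer:
    # Holds the global parameters; workers pull them, compute gradients
    # locally, and push the gradients back for aggregation.
    def __init__(self, dim, lr=0.01):
        self.weights = np.zeros(dim)
        self.lr = lr

    def pull(self):
        return self.weights.copy()

    def push(self, worker_grads):
        # Aggregate (average) the workers' gradients and update the parameters.
        self.weights -= self.lr * np.mean(worker_grads, axis=0)

server = ParameterServer(dim=5)
for step in range(10):
    w = server.pull()                                       # workers fetch current weights
    grads = [np.random.normal(size=5) for _ in range(4)]    # stand-in for real worker gradients
    server.push(grads)                                      # server applies the aggregated update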
Frameworks

TensorFlow Distributed: Provides strategies like tf.distribute.Strategy for data and model parallelism.
PyTorch Distributed: Includes utilities like torch.distributed for communication and gradient sharing.
Horovod: Open-source library optimized for distributed deep learning with minimal code changes.
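A minimal sketch of the first of these, synchronous data parallelism with tf.distribute.MirroredStrategy; the Keras model and input shape are illustrative.

import tensorflow as tf

# MirroredStrategy replicates the model on every local GPU and all-reduces
# the gradients after each step.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="sgd",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# model.fit(train_dataset) then runs data-parallel training across the devices.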
Challenges in DML

Communication Overhead:
Synchronization between nodes (e.g., sharing gradients) can slow down training.
Fault Tolerance:
Node failures can disrupt training or lead to inconsistencies.
Data Imbalance:
Uneven distribution of data can lead to skewed models.
Applications

Image Recognition

Language Modeling

Finance
Conclusion
DML is essential for scaling AI to meet modern demands.
Key approaches: Data parallelism, model parallelism, and hybrid methods.
Challenges: Communication overhead, fault tolerance, and data imbalance.
THANK YOU
REFERENCES
Joost Verbraeken, Matthijs Wolting, Jonathan Katzy, Jeroen Kloppenburg, Tim Verbelen, and Jan S. Rellermeyer. 2020. A Survey on Distributed Machine Learning.
Huasha Zhao and John Canny. 2013. Sparse Allreduce: Efficient Scalable Communication for Power-Law Data.
Distributed Machine Learning and the Parameter Server, lecture notes, CS4787: Principles of Large-Scale Machine Learning Systems, Cornell University.
