Distributed Deep Learning with Docker at Salesforce
Jeff Hajewski
Software Engineer, Salesforce
github.com/j-haj
jeff-hajewski-3a1b5a29
Caveats
● These are my own views and opinions, not those of Salesforce
● This is how one team at Salesforce deploys deep learning
models
● When I use the term Docker I am referring to the
technology, not the company
● Some of these designs are simplified
● What is deep learning and why is it difficult?
● Deep learning at Salesforce
● Challenges
○ Designing for team specialization
○ Interacting with the model server
○ Testing
● Key takeaways
About this talk
The core task of deep learning is function approximation.
Neural networks can approximate virtually any function (universal approximation).
Neural networks are expensive to evaluate.
● Linear regression: ~1,000 parameters
● Deep neural network: 100M - 1B parameters (100,000 - 1M x linear reg.)
Deep Learning Review
How should we design distributed systems for deep learning, i.e., for high-latency tasks?
We use deep learning models to provide our customers
useful information about their sales process.
They send us this data as a streaming firehose.
The faster we get this data to our customers, the more
useful and actionable it is for their sales teams.
Deep Learning at Salesforce
There are three steps to this process:
1. Preprocessing - cleaning and formatting the data
2. Inference - running the data through the model
3. Postprocessing - interpreting the output from the model
Deep Learning at Salesforce
Deep Learning at Salesforce
[Diagram: “Hello! My cat is friendly.” → preprocess → [0.2, 0.71, 0.89, 0.6] → inference → [0.85, 0.15] → postprocess → “Discusses cat”]
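A minimal sketch of this flow in Python. The featurization, weights, and labels are toy stand-ins (chosen so the final probabilities match the diagram's [0.85, 0.15]), not the actual Salesforce models:

def preprocess(text: str) -> list[float]:
    # Clean and featurize the raw text (toy bag-of-words indicators).
    tokens = text.lower().replace("!", "").replace(".", "").split()
    vocab = ["cat", "dog", "friendly", "sales"]
    return [1.0 if word in tokens else 0.0 for word in vocab]

def inference(features: list[float]) -> list[float]:
    # Stand-in for the real model: a fixed linear layer.
    weights = [0.7, 0.1, 0.15, 0.05]
    score = sum(w * f for w, f in zip(weights, features))
    return [score, 1.0 - score]

def postprocess(probs: list[float]) -> str:
    # Interpret the model output as a human-readable label.
    return "Discusses cat" if probs[0] > 0.5 else "Other"

print(postprocess(inference(preprocess("Hello! My cat is friendly."))))
# -> Discusses cat (with probabilities [0.85, 0.15])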
Deep Learning at Salesforce
[The same pipeline diagram, annotated with the Data Science Team’s area of ownership]
Deep Learning at Salesforce
[The same pipeline diagram, annotated with the areas owned by the Data Science Team and the Systems Team]
Challenge 1: designing for team specialization
Requirements
1. The data science team shouldn’t need to know
about the system. They just want to define a
sequence of computation.
2. The systems engineers shouldn’t need to know
anything about the computation. They just want to
scale the system.
Designing for team specialization
Challenges
1. Some functions take longer to execute than others
(e.g., model inference)
2. The order of execution is important
Designing for team specialization
Solution: map functions to containers
Designing for team specialization
postprocess(inference(preprocess(x)))
[Diagram: three containers in sequence: preprocess → inference → postprocess]
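One way this mapping might look in Docker Compose; the service and image names are illustrative, not our actual configuration:

# docker-compose.yml (sketch): one service per pipeline stage
version: "3"
services:
  preprocess:
    image: myorg/preprocess:latest
  inference:
    image: myorg/inference:latest
  postprocess:
    image: myorg/postprocess:latest

The data science team owns what runs inside each image; the systems team can deploy, wire, and scale the services without looking inside them.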
What about throughput?
[Diagram: data arrives at 1,000 QPS, but the stages differ in capacity: preprocess handles 500 QPS, inference 300 QPS, and postprocess 1,000 QPS, so the slowest stage caps the pipeline’s max throughput]
What about throughput?
[The same diagram, scaled out: 2x the 500 QPS preprocess stage and 4x the 300 QPS inference stage, so every stage can now sustain the full 1,000 QPS max throughput]
Docker enables us to easily scale out each individual stage
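With one container per stage, scaling the bottlenecks is a single command; the replica counts below mirror the diagram and are otherwise illustrative:

docker-compose up -d --scale preprocess=2 --scale inference=4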
What about throughput?
[The same scaled-out pipeline diagram, now with Kafka topics between the stages]
Kafka gives stage-wise checkpointing
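Because each stage commits its Kafka offset only after publishing its output, a replica that dies mid-request is simply replayed by another replica from the last checkpoint. A minimal sketch of one stage, assuming the kafka-python client and illustrative topic names:

import json
from kafka import KafkaConsumer, KafkaProducer

def run_model(features):
    # Stand-in for the actual inference call.
    return [0.85, 0.15]

consumer = KafkaConsumer(
    "preprocessed",                  # this stage's input topic
    bootstrap_servers="kafka:9092",
    group_id="inference-workers",    # replicas of the stage share the work
    enable_auto_commit=False,        # commit only after output is published
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for message in consumer:
    result = run_model(message.value)
    producer.send("inferred", result)    # publish to the next stage
    producer.flush()
    consumer.commit()                    # checkpoint: at-least-once semantics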
Challenge 2: interacting with the model servers
Model servers provide a way to query the model,
typically via gRPC or HTTP.
What is the best way to deploy and interact with these
model servers?
Serving deep learning models
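The slides don't name a specific model server, but as one concrete example, TensorFlow Serving exposes an HTTP endpoint per model; the model name, host, and input below are illustrative:

import json
import urllib.request

# TensorFlow Serving's REST API: POST /v1/models/<name>:predict
payload = json.dumps({"instances": [[0.2, 0.71, 0.89, 0.6]]}).encode("utf-8")
request = urllib.request.Request(
    "http://localhost:8501/v1/models/cat_classifier:predict",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.load(response))  # e.g., {"predictions": [[0.85, 0.15]]}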
Challenge:
1. Model servers are designed as standalone processes.
2. How should we best utilize multiple GPUs?
3. What about networking?
Interacting with the model server
We want to keep deployment simple!
Solution: Deploy model server images as part of a “pod” or
“group” with a coordinator service
Interacting with the model server
[Diagram: a pod containing a JVM-based manager service coordinating several model server containers]
This solves additional challenges
1. Who owns the model server? The data science team.
2. How should we handle model versions, and where are they stored locally? In a Docker shared volume.
3. What are the addresses of the model servers? http://localhost, via Docker private networking.
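A sketch of what such a pod might look like in Compose. The image names are illustrative, network_mode: "service:..." is one way to get the localhost addressing described above, and the GPU pinning assumes the NVIDIA container runtime is configured:

# pod sketch: a manager plus model servers sharing storage and a network
version: "3"
services:
  manager:
    image: myorg/manager:latest          # the JVM coordinator service
    volumes:
      - models:/models                   # shared volume holds model versions
  model-server-0:
    image: tensorflow/serving            # one example of a model server
    network_mode: "service:manager"      # share the manager's network
                                         # namespace: reachable at localhost
    environment:
      - NVIDIA_VISIBLE_DEVICES=0         # pin this server to GPU 0
    volumes:
      - models:/models
volumes:
  models: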
Challenge 3: testing
Challenge: how should we test these systems?
1. Deep learning models are probabilistic
2. Interservice interactions can be quite complex
Testing
Solution: Docker Compose
● Makes it easy to swap out the model server with a mock
service
● Deploying the entire system locally is easy
● Integrates well with Maven and Gradle
Testing
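Compose's override mechanism makes the mock swap concrete; the file and image names here are illustrative:

# docker-compose.test.yml (sketch): replace the real model server with a
# lightweight mock that speaks the same API but returns canned predictions
version: "3"
services:
  model-server-0:
    image: myorg/mock-model-server:latest

Running docker-compose -f docker-compose.yml -f docker-compose.test.yml up then brings up the whole system locally with the mock in place, so integration tests get deterministic predictions.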
We haven’t spent a lot of time discussing the details of Docker.
That is precisely the point!
● Docker allows us to simplify many aspects of our design.
● Docker stays out of the way.
● Docker provides a simple alternative to a much more
complex solution.
Docker simplifies our lives