Experimenting and Learning
Kubernetes and Tensorflow
@Ben_Hall
Ben@BenHall.me.uk
Katacoda.com
@Ben_Hall / Blog.BenHall.me.uk
WHOAMI?
Learn via Interactive Browser-Based Labs
Katacoda.com
• In the next 25–30 minutes:
• Learning to Learn
• Creating Kubernetes Experiment Playground
• Running Tensorflow on Kubernetes
• Keeping up to date with the community
Learn By
Doing
Goals are clear
Feedback is immediate
Demo Time!
Minikube
Tensorflow Playground
Kubeadm
Tensorflow on Kubernetes
Create Kubernetes Cluster
Minikube, Kubeadm
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: inception-deployment
  labels:
    k8s-app: inception-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      k8s-app: inception-deployment
  template:
    metadata:
      labels:
        k8s-app: inception-deployment
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      containers:
      - name: inception-container
        image: katacoda/tensorflow_serving
        imagePullPolicy: Never
        command:
        - /bin/sh
        - -c
        args:
        - /serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
          --port=9000 --model_name=inception --model_base_path=/serving/inception-export
        ports:
        - containerPort: 9000
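The client Job that follows addresses the model server as inception-deployment:9000. For that DNS name to resolve inside the cluster, the Deployment needs a matching Service; the deck doesn't show one, so here is a minimal sketch assuming the labels and port used above:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: inception-deployment   # must match the --server= host the client uses
spec:
  selector:
    k8s-app: inception-deployment   # routes to the Pods labelled above
  ports:
  - port: 9000
    targetPort: 9000
```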
apiVersion: batch/v1
kind: Job
metadata:
  name: inception-client
spec:
  template:
    metadata:
      name: inception-client
    spec:
      containers:
      - name: inception-client
        image: katacoda/tensorflow_serving
        imagePullPolicy: Never
        command:
        - /bin/bash
        - -c
        args:
        - /serving/bazel-bin/tensorflow_serving/example/inception_client
          --server=inception-deployment:9000 --image=/data/cat.jpg
        volumeMounts:
        - name: inception-persistent-storage
          mountPath: /data
      volumes:
      - name: inception-persistent-storage
        hostPath:
          path: /root
      restartPolicy: Never
Kubernetes and Tensorflow at scale?
https://www.tensorflow.org/deploy/distributed
https://www.youtube.com/watch?v=yFXNASK0cPk
[Diagram: four Workers connected to two Parameter Servers]
# On ps0.example.com:
$ python trainer.py \
    --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
    --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
    --job_name=ps --task_index=0

# On ps1.example.com:
$ python trainer.py \
    --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
    --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
    --job_name=ps --task_index=1

# On worker0.example.com:
$ python trainer.py \
    --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
    --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
    --job_name=worker --task_index=0

# On worker1.example.com:
$ python trainer.py \
    --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
    --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
    --job_name=worker --task_index=1
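In the TF 1.x era this deck covers, trainer.py would typically turn the --ps_hosts/--worker_hosts flags into a cluster spec and start a tf.train.Server for its own job name and task index. A minimal sketch of the flag-parsing half, in plain Python so it runs without TensorFlow installed (the script name and helper are hypothetical, not from the deck):

```python
import argparse

def build_cluster(ps_hosts, worker_hosts):
    """Turn comma-separated host lists into the mapping that
    tf.train.ClusterSpec expects: job name -> list of host:port."""
    return {
        "ps": ps_hosts.split(","),
        "worker": worker_hosts.split(","),
    }

parser = argparse.ArgumentParser()
parser.add_argument("--ps_hosts", required=True)
parser.add_argument("--worker_hosts", required=True)
parser.add_argument("--job_name", choices=["ps", "worker"], required=True)
parser.add_argument("--task_index", type=int, default=0)

# Same flags as the ps0.example.com invocation above.
args = parser.parse_args([
    "--ps_hosts=ps0.example.com:2222,ps1.example.com:2222",
    "--worker_hosts=worker0.example.com:2222,worker1.example.com:2222",
    "--job_name=ps", "--task_index=0",
])

cluster = build_cluster(args.ps_hosts, args.worker_hosts)
# In a real trainer.py this dict would feed:
#   server = tf.train.Server(tf.train.ClusterSpec(cluster),
#                            job_name=args.job_name,
#                            task_index=args.task_index)
print(cluster[args.job_name][args.task_index])  # address this task binds to
```

Every process receives the full cluster layout; only --job_name and --task_index differ per machine, which is what makes the four invocations above almost identical.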
[Diagram: a Kubernetes namespace spanning Servers 1–3, with three Kubernetes Pods each running a containerized TF Worker and a Kubernetes Deployment running a containerized TF Parameter Server (PS); a second namespace on Servers 4–5 shows the same layout backed by shared Storage]
[Diagram: the same layout where Servers 1–3 each expose GPU1–GPU3 to the TF Worker Pods, while the namespace on Servers 4–5 retains the Parameter Server and shared Storage]
Docker Container and GPU
docker run -it \
  --device /dev/nvidia0:/dev/nvidia0 \
  --device /dev/nvidia1:/dev/nvidia1 \
  --device /dev/nvidiactl:/dev/nvidiactl \
  --device /dev/nvidia-uvm:/dev/nvidia-uvm \
  tf-cuda:v1.1beta /bin/bash
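Mapping NVIDIA devices by hand, as above, ties the container to one specific host. When letting Kubernetes schedule GPU work instead, the alpha GPU resource of that era (alpha.kubernetes.io/nvidia-gpu, since superseded by the nvidia.com/gpu device plugin resource) let the scheduler place Pods on GPU nodes. A hedged sketch, with a hypothetical Pod name:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tf-cuda-worker        # hypothetical name, not from the deck
spec:
  containers:
  - name: tf-cuda
    image: tf-cuda:v1.1beta
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1   # alpha resource in the k8s 1.6 era
```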
Summary
• Kubernetes is designed for running distributed systems at scale
• The model of Tensorflow fits cleanly into Kubernetes
• As Tensorflow usage increases, Kubernetes can scale to meet demands
www.katacoda.com
Call To Action
• Interested in sharing your Kubernetes or Tensorflow experience? Write your own scenarios and teach interactively!
• Teaching teams internally? Private Katacoda environments
Thank you!
@Ben_Hall
Ben@BenHall.me.uk
Blog.BenHall.me.uk
www.Katacoda.com
