Docker & Kubernetes
Ghulam Mustafa Raza,
Expert Cloud Native Engineer
Big Data Tech. Lab in SK telecom
• Discovery Group
• Predictive Maintenance Group
• Manufacturing Solution Group
• Groups making own solutions
• Technology and Architecture Leading Group
• Big data processing engine
• Advanced analytics algorithms
• Systematize service deployment and service operation on cluster
• Docker
• Kubernetes
Prepare for an era of cloud with Docker and Kubernetes
* technology trend in USA (2004-2017)
Major technology trends
[Figure: trend lines for ubuntu, oracle, and cloud; legend entries: Buy both SW & HW, Buy HW and DIY, Run your SW on cloud]
Cloud services: ubiquitous cloud services around us, for users
• iCloud
• Google Drive
• Dropbox
• OneDrive
Cloud technologies: enabling technologies for custom cloud services, for service providers
• Amazon Web Services
• Microsoft Azure
• Docker
• Kubernetes
Overview & Conclusion
• Docker to build portable software
• Build your software upon Docker
• Then distribute it anywhere (even on MS Azure and Amazon Web Services)
• Kubernetes to orchestrate multiple Docker instances
• Start using Docker and Kubernetes before it is too late!
• Google has been using container technologies for more than 10 years
[Figure: popularity of Hadoop, Docker, and Kubernetes, shown next to the Enterprise IT Adoption Cycle]
Docker
Motivation
Enabling technologies for Docker
How to use Docker
Docker came to save us from the dependency hell
Docker Dependency hell
Portable software
Dependency hell
[Figure: in the development environment, your program depends on program1, program2, and program3 at v2. In the production environment, your program still depends on v2 of all three, while the customer program depends on v1 of all three. Each environment has a single package manager, so the v1 and v2 requirements conflict in production.]
Few choices left to you
1. Convince your customer (a.k.a. 甲, the party with the upper hand in a contract)
2. Install all the dependencies manually (without the package manager)
3. Modify your program to make it depend on v1
Use Docker for isolating your application
[Figure: on the host operating system, the customer program depends on program1, program2, and program3 at v1, installed by the package manager in the host OS. Your program depends on program1, program2, and program3 at v2, installed by the package manager in the guest OS inside a Docker container, which runs on the Docker engine (daemon).]
Linux kernel must be ≥3.10 (such as Ubuntu 14.04 and CentOS 7)
Virtual machines and docker containers
Virtual machines
• Each virtual machine (e.g. Ubuntu, CentOS) contains its App, Libraries, package manager (apt, yum), and its own Kernel and Device drivers
• Virtual machines run on a Hypervisor on top of the Host Operating System (Kernel, Device drivers)
Docker containers
• Each container (e.g. Ubuntu-like, CentOS-like) contains only its App, Libraries, and package manager (apt, yum)
• Containers run on the Docker engine on top of the Host Operating System (Kernel, Device drivers)
• Containers share the kernel in the host
Linux namespaces – what makes isolated environments in a host OS
Six namespaces are enough to give an illusion of running inside a virtual machine
• pid: Process ID number space (starting from 1)
• net: Network devices, IPv4 and IPv6 stacks, routing tables, firewall
• ipc: Various IPC objects: POSIX message queues and System V IPC objects (mq, sem, shm)
• mnt: Mount points (directory hierarchy)
• uts: System identifiers: hostname, NIS domain name
• user: Security-related identifiers: User IDs, Group IDs
[Figure: three containers, each with its own pid, net, ipc, mnt, uts, and user namespaces, running on the Docker engine on the Host Operating System]
Analogy between program and docker
Source code Byte/machine code Process
(read only) (read only)
stack
compile execute
Program heap
data
text
Dockerfile Docker image Docker container
(read-only layers) (read-only layers + writable layer)
build run
Docker
How to define an image and run a container from it?
1) Write Dockerfile
- Specify to install python with pip on ubuntu
- Tell pip to install numpy
2) Build an image from Dockerfile
- Execute each line of Dockerfile to build an image
3) Execute a Docker container from the image
1 to N relationship between image and container
Execute five containers from an image
Q) Do five containers take up 2,445 MB (= 489 MB × 5) on the host?
A) No, thanks to image layering & sharing
An image consists of layers, each of which is a set of files
Image Dockerfile
Layer
Layer (pip install numpy)
Layer (apt-get install python-dev python-pip)
Base ubuntu image
Layer (files)
Layer (files)
Layer (files)
• Instructions (FROM, RUN, CMD, etc.) create layers
• Base images (imported by “FROM”) also consist of layers
• If a file exists in multiple layers, the copy in the uppermost layer is the one that is seen
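The "upper layer wins" rule can be illustrated with a small Python sketch. This is only a model of the lookup behaviour, not how Docker's storage driver is implemented; the layer contents and file paths are made up for illustration.

```python
from collections import ChainMap

# Hypothetical layer contents: each layer maps a file path to its content.
base_ubuntu = {"/bin/sh": "shell", "/etc/motd": "welcome to ubuntu"}
python_layer = {"/usr/bin/python": "python", "/usr/bin/pip": "pip"}
numpy_layer = {"/usr/lib/python/numpy.py": "numpy", "/etc/motd": "numpy image"}


def union_view(layers_bottom_to_top):
    """Return one merged view in which upper layers shadow lower ones."""
    # ChainMap searches its maps left to right, so the top layer goes first.
    return ChainMap(*reversed(layers_bottom_to_top))


image = union_view([base_ubuntu, python_layer, numpy_layer])
print(image["/etc/motd"])  # numpy image (the upper layer shadows the base)
print(image["/bin/sh"])    # shell (found only in the base layer)
```

The lower layers are never modified: the merged view is just a lookup order over them.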
Docker container
• A container is just a thin read/write layer
• base images are not copied to containers
• Copy-On-Write (COW)
• When a file in the base image is modified,
• copy the file to the R/W layer
• and then modify the copied file
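A minimal Python sketch of the copy-on-write idea follows. The `Container` class, its method names, and the file contents are hypothetical and only model the behaviour described above; a real storage driver works on files and blocks, not dictionaries.

```python
class Container:
    """A container modeled as shared read-only image files plus a thin R/W layer."""

    def __init__(self, image_files):
        self.image_files = image_files  # shared reference, never copied
        self.rw_layer = {}              # starts empty ("thin")

    def read(self, path):
        # The writable layer sits on top, so it wins over the image.
        if path in self.rw_layer:
            return self.rw_layer[path]
        return self.image_files[path]

    def append(self, path, text):
        # Copy-on-write: copy the file up on its first modification only.
        if path not in self.rw_layer:
            self.rw_layer[path] = self.image_files.get(path, "")
        self.rw_layer[path] += text


image_files = {"/etc/hosts": "127.0.0.1 localhost\n"}
c1 = Container(image_files)
c2 = Container(image_files)
c1.append("/etc/hosts", "10.0.0.5 db\n")

print(c1.read("/etc/hosts"))  # c1 sees its modified copy
print(c2.read("/etc/hosts"))  # c2 still sees the original image file
```

Because every container references the same image files and stores only its own changes, starting more containers adds little disk usage beyond their writable layers.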
Image sharing between containers
The ubuntu:15.04 image (~188 MB) is not copied into each container; all containers share it
Layer sharing between images
If multiple Dockerfiles
1. start from the same base image, and
2. share a leading sequence of instructions (one RUN instruction in the numpy Dockerfile and matplotlib Dockerfile example),
then the Docker engine automatically reuses the existing layers
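The reuse rule can be sketched in Python as a cache keyed by "parent layer + instruction". This is a simplified model under that assumption; the real build cache also takes other inputs into account (for example the contents of copied files), and the `build` function and the matplotlib instruction are made up for illustration.

```python
import hashlib

layer_cache = {}  # layer id -> instruction that produced the layer
executed = []     # instructions that actually had to run


def build(dockerfile_lines):
    """Return the layer ids of an image, reusing cached layers where possible."""
    parent = ""
    layer_ids = []
    for instruction in dockerfile_lines:
        # A layer is identified by its parent layer plus the instruction text.
        key = (parent + "\n" + instruction).encode()
        layer_id = hashlib.sha256(key).hexdigest()[:12]
        if layer_id not in layer_cache:
            layer_cache[layer_id] = instruction
            executed.append(instruction)
        layer_ids.append(layer_id)
        parent = layer_id
    return layer_ids


numpy_image = build([
    "FROM ubuntu",
    "RUN apt-get install python-dev python-pip",
    "RUN pip install numpy",
])
matplotlib_image = build([
    "FROM ubuntu",
    "RUN apt-get install python-dev python-pip",
    "RUN pip install matplotlib",
])

print(numpy_image[:2] == matplotlib_image[:2])  # True: first two layers are shared
print(len(executed))                            # 4: only four instructions ran
```

Sharing stops at the first instruction that differs, which is why the common instructions should come first in each Dockerfile.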
Example of stacking docker images
[Figure: image stacks for four containers]
• PdM container (cpu): PdM engine (librdkafka, avro, flask), theano-cpu (theano, keras)
• PdM container (gpu): PdM engine (librdkafka, avro, flask), theano-gpu (theano, keras), cuda
• Kafka container: kafka (with scala)
• Zookeeper container: zookeeper
• Shared image: scipy (numpy, scipy, matplotlib, ipython, jupyter, pandas, scikit-learn, h5py)
• Official images: python:2.7, openjdk:8, buildpack-deps:jessie, buildpack-deps:jessie-scm, buildpack-deps:jessie-curl, debian:jessie
Notes in the figure
• theano compiles its expression graphs into CPU/GPU instructions
• The scipy libraries have nothing to do with the GPU, so the scipy image is shared
• buildpack-deps contains essential tools to download and compile software
• jessie is the latest stable Debian release
[Figure: a Zookeeper cluster (five zk nodes), three Kafka brokers, and the PdM engine with a web server, a Kafka consumer, and a Kafka producer]
Enabling technologies for docker (wrap-up)
• Linux namespaces (covered)
• To isolate system resources
• pid, net, ipc, mnt, uts, user
• This gives a secure, isolated environment (like a VM)
• Advanced multi-layered unification filesystem (AUFS) (covered)
• Image layering & sharing
• Linux control groups (not covered)
• To track, limit, and isolate resources
• CPU, memory, network, and IO
* https://mairin.wordpress.com/2011/05/13/ideas-for-a-cgroups-ui/
Docker topics not covered here
• How to install Docker engine
• What the Dockerfile instructions other than FROM, RUN, and CMD are
• ENV / ADD / ENTRYPOINT / LABEL / EXPOSE / COPY / VOLUME / WORKDIR / ONBUILD
• How to push local Docker images to docker hub
• How to pull remote images from docker hub
• ...
Consult https://docs.docker.com/engine/getstarted/
Kubernetes
Motivation
A motivating example
Disclaimer
• The purpose of this section is to briefly explain Kubernetes without details
• For a detailed explanation with the exact Kubernetes terminology, see the following slides
• https://www.slideshare.net/ssuser6bb12d/kubernetes-introduction-71846110
What is Kubernetes for?
Container-based virtualization + Container orchestration
To satisfy common needs in production
replicating application instances
naming and discovery
load balancing
horizontal auto-scaling
co-locating helper processes
mounting storage systems
distributing secrets
application health checking
rolling updates
resource monitoring
log access and ingestion
...
from the official site : https://kubernetes.io/docs/whatisk8s/
Why Docker with Kubernetes?
• A mission of our group
• Systematize service deployment and service operation on cluster
• I believe that systematizing something means minimizing the human effort it requires
• How to minimize human effort on service deployment?
• Make software portable using a container technology
• Docker (chosen for its maturity and popularity)
• Rkt from CoreOS (alternative)
• Build images and run containers anywhere
• Your laptop, servers, on-premise clusters, even cloud
• How to minimize human effort on service operation?
• Inform a container orchestration runtime of the service specification
• Kubernetes from Google (chosen for its maturity and expressivity)
• Docker swarm from Docker (alternative)
• Define your specification, and the runtime then operates your services as you wish
Kubernetes architecture
Service specification (written in yaml)
- Execute a web-server image
- Two replicas for LB & HA
- 3GB memory each
Server
- REST API server with a K/V store
- Scheduler
- Find suitable machines for containers
- Controller manager
- Current state vs. desired state
- Make changes if states go undesirable
- Ensure a specified # of replicas running all the time
Nodes
- Each node runs a Node agent and a Docker engine
- Each container is given 3GB
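The division of labour between the scheduler and the controller manager can be sketched in Python as follows. This is a toy model under simple assumptions (memory is the only resource, and the scheduler prefers the node with the most free memory); the `Node` class and the function names are hypothetical and are not Kubernetes APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_memory_gb: int
    containers: list = field(default_factory=list)


def schedule(nodes, memory_gb):
    """Scheduler: pick the node with the most free memory that still fits."""
    candidates = [n for n in nodes if n.free_memory_gb >= memory_gb]
    return max(candidates, key=lambda n: n.free_memory_gb) if candidates else None


def reconcile(nodes, image, desired_replicas, memory_gb):
    """Controller: start containers until the current state matches the desired one."""
    current = sum(n.containers.count(image) for n in nodes)
    while current < desired_replicas:
        node = schedule(nodes, memory_gb)
        if node is None:
            break  # no machine has enough memory; try again on a later pass
        node.containers.append(image)
        node.free_memory_gb -= memory_gb
        current += 1
    return current


nodes = [Node("node1", 8), Node("node2", 8), Node("node3", 8)]
running = reconcile(nodes, "web-server", desired_replicas=2, memory_gb=3)

# Simulate a crashed container, then let the controller repair the state.
failed = next(n for n in nodes if "web-server" in n.containers)
failed.containers.remove("web-server")
failed.free_memory_gb += 3
running_after_failure = reconcile(nodes, "web-server", desired_replicas=2, memory_gb=3)
print(running, running_after_failure)  # 2 2
```

The operator only states the desired number of replicas; repeatedly running `reconcile` is what keeps that number true over time.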
Web server example
Want to launch 3 replicas for high availability and load balancing
[Figure: three webserver replicas (4bp80, 6dk12, g1sdf) on node 1, node 2, and node 3, reached through a well-known address]
How to achieve the following?
• Users must be unaware of the replicas
• Traffic is evenly distributed to replicas
It’s a piece of cake with Kubernetes!
How to replicate your service instances
Specify your Docker image and a replication factor using Deployment
Specify a common label to group containers with different names
[Figure: the Server above node 1, node 2, and node 3, each with a Node agent and a Docker engine, running webserver 4bp80, webserver 6dk12, and webserver g1sdf, all labeled app=web1]
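As a rough illustration, the following Python sketch builds such a Deployment specification as a plain dictionary and prints it as JSON, which kubectl accepts in addition to YAML. The `deployment` helper and the image name `example/webserver:1.0` are made up; the field names follow the `apps/v1` Deployment schema of current Kubernetes releases, which is newer than the API version available when these slides were written.

```python
import json


def deployment(name, image, replicas, label):
    """Build a Deployment manifest (apps/v1) as a plain dict."""
    labels = {"app": label}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,                 # the replication factor
            "selector": {"matchLabels": labels},  # which pods belong to it
            "template": {
                "metadata": {"labels": labels},   # the common label, e.g. app=web1
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# "example/webserver:1.0" is a placeholder image name.
manifest = deployment("webserver", "example/webserver:1.0", replicas=3, label="web1")
print(json.dumps(manifest, indent=2))
```

The selector and the template labels must match, since the label is what ties the three differently named containers to one Deployment.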
Define a service to do round-robin forwarding
[Figure: external traffic over the internet enters through <ingress> metatron:80 and is forwarded to <service> webserver:80; internal traffic reaches the service directly. The service sends 33% of the traffic to each of webserver 4bp80, webserver 6dk12, and webserver g1sdf (all labeled app=web1) on node 1, node 2, and node 3.]
Kubernetes runs its own DNS server for name resolution
Kubernetes manipulates iptables on each node to proxy traffic
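The combination of label selection and round-robin forwarding can be sketched in Python as below. This models only the idea; the actual proxying is done by iptables rules and may choose backends differently. The `Service` class and the hyphenated pod names are assumptions for illustration.

```python
from collections import Counter
from itertools import cycle

# Hypothetical pods; the extra "batch" pod shows that the selector filters by label.
pods = [
    {"name": "webserver-4bp80", "labels": {"app": "web1"}},
    {"name": "webserver-6dk12", "labels": {"app": "web1"}},
    {"name": "webserver-g1sdf", "labels": {"app": "web1"}},
    {"name": "batch-x1y2z", "labels": {"app": "batch"}},
]


def select(pods, selector):
    """Return the pods whose labels contain every key/value pair of the selector."""
    return [
        p for p in pods
        if all(p["labels"].get(k) == v for k, v in selector.items())
    ]


class Service:
    """One stable name in front of a changing set of labeled backends."""

    def __init__(self, pods, selector):
        self._backends = cycle(select(pods, selector))

    def route(self):
        # Round-robin: hand out the matching backends in turn.
        return next(self._backends)["name"]


service = Service(pods, {"app": "web1"})
hits = Counter(service.route() for _ in range(300))
print(hits)  # each of the three webserver pods receives 100 requests
```

Clients only ever address the service, so replicas can be added or replaced without the users noticing.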
How to guarantee a certain # of running containers during maintenance
Define disruption budget to specify requirement for the minimum available containers (Note that minAvailable=2)
[Figure: Kubernetes with node1, node2, and node3 hosting zk-0, zk-2, and zk-3, each node with its Containers and Volumes; shown once for each drain step]
Drain node1
• Operation is permitted because allowed-disruptions=1
• 3 replicas have to be running due to StatefulSet, so try scheduling zk-0 on other nodes!
• Oops! cannot schedule zk-0 on node2 and node3 due to anti-affinity! Hold on for a while
Drain node2
• Operation not permitted because allowed-disruptions=0
• Please wait until node1 is up and zk-0 is rescheduled!
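The arithmetic behind the two drain attempts can be sketched in Python. This is a simplified model in which allowed disruptions = healthy replicas minus minAvailable, and an evicted replica stays pending because anti-affinity keeps it off the remaining nodes; the function names are hypothetical, and the pod names are taken from the figure.

```python
def allowed_disruptions(healthy, min_available):
    """How many more replicas may be taken down right now."""
    return max(0, healthy - min_available)


def try_drain(node, placement, min_available):
    """Evict the pods on `node` only if the disruption budget permits it.

    `placement` maps pod name -> node name, or None while the pod is pending.
    """
    healthy = sum(1 for n in placement.values() if n is not None)
    evicting = [pod for pod, n in placement.items() if n == node]
    if len(evicting) > allowed_disruptions(healthy, min_available):
        return False  # operation not permitted
    for pod in evicting:
        placement[pod] = None  # pending: anti-affinity keeps it off other nodes
    return True


placement = {"zk-0": "node1", "zk-2": "node2", "zk-3": "node3"}

first = try_drain("node1", placement, min_available=2)   # 3 healthy, budget 1
second = try_drain("node2", placement, min_available=2)  # 2 healthy, budget 0
print(first, second)  # True False

# Once node1 is back and zk-0 is rescheduled, node2 can be drained.
placement["zk-0"] = "node1"
third = try_drain("node2", placement, min_available=2)
print(third)  # True
```

The budget turns "do not break the quorum" into a rule the cluster enforces, so maintenance has to proceed one node at a time.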
PdM Kubernetes cluster
[Figure: components of the PdM Kubernetes cluster]
• Zookeeper headless service (StatefulSet): three Pods, each running QuorumPeerMain on ports 2181, 2888, and 3888
• Kafka headless service (StatefulSet): three Pods, each running Kafka (broker) on port 9092
• PdM service: a Pod (Deployment) running the PdM engine with a Kafka consumer, a Kafka producer, and a Web server on port 8080, exposed through an Ingress rule on port 80
• Storage: volumes attached to Pods and backed by persistent storage
Overview & Conclusion
• Docker to build portable software
• Build your software upon Docker
• Then distribute it anywhere (even on MS Azure and AWS)
• Kubernetes to orchestrate multiple Docker instances
• Start using Docker and Kubernetes before it is too late!
• Google has been using container technologies for more than 10 years
[Figure: popularity of Hadoop, Docker, and Kubernetes, shown next to the Enterprise IT Adoption Cycle]
the end