Kubernetes introduction
with a running example
Dongwon Kim, PhD
SK Telecom
Why do we use Kubernetes?
Container-based virtualization + Container orchestration
Satisfying common needs in production
co-locating helper processes
mounting storage systems
distributing secrets
application health checking
replicating application instances
horizontal auto-scaling
naming and discovery
load balancing
rolling updates
resource monitoring
log access and ingestion
...
Source: the official site, https://kubernetes.io/docs/whatisk8s/
Pod – the basic unit of Kubernetes
• Components
• a group of containers
• docker, rkt (pronounced “rock-it”) from CoreOS, etc
• a group of shared storage called volumes
• ephemeral volume
• persistent volume
• host local directories
• nfs
• iscsi
• flocker
• Google Compute Engine (GCE) Persistent Disk
• Amazon Web Services (AWS) Elastic Block Store (EBS)
• Purpose
• model an application-specific logical host/VM
• Characteristics
• containers in a pod share one IP address and one port space
• containers in a pod can communicate via IPC
[Figure: a pod with address 10.244.1.10 holding three containers (ports 1234, 3456, 5678), one ephemeral volume, and one persistent volume. Containers claim their volumes, communicate via IPC, and reach one another on localhost (e.g. localhost:3456).]
A few things to consider when running Zookeeper with Kubernetes
• How to launch Zookeeper servers using a pod?
• How to give IDs to pods?
• What is the domain name of each pod?
• How to make sure a certain # of pods keep running during maintenance?
[Figure: three Zookeeper server pods (zk-1, zk-2, zk-3) and three Kafka server pods (kk)]
• Each Zookeeper server has its own myid (1, 2, or 3; server 1 is the leader in this example) and the same server list:
• server.1 = zk-1:2888:3888
• server.2 = zk-2:2888:3888
• server.3 = zk-3:2888:3888
• Each Kafka server has its own broker.id (1, 2, or 3) and the same zookeeper.connect list:
• zk-1.zk:2181
• zk-2.zk:2181
• zk-3.zk:2181
• A majority quorum of Zookeeper servers must be present
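As a rough illustration of the majority-quorum requirement, the sketch below computes how many servers must stay up in an ensemble of a given size. The function names are hypothetical and are not part of Zookeeper or Kubernetes.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that may be down while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

For the three-server ensemble used in this deck, two servers must remain available, so only one may be down at a time.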
StatefulSet – a way of launching ordered replicas of a pod
The StatefulSet creates 3 pods (zk-0, zk-1, zk-2) with ordinals suffixed to pod names, and guarantees the following:
• pods are created sequentially (pod-0, then pod-1, then pod-2)
• pods are deleted in reverse order (pod-2, then pod-1, then pod-0)
• before a scaling operation is applied to a pod (e.g. adding pod-3), all of its predecessors must be running
• before a pod is terminated, all of its successors must be shut down
[Figure: annotated StatefulSet manifest]
• Create 3 replicas of servers using the following templates
• Each pod is created and scheduled using this template
• Each pod lays its claim to storage using this template
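The naming and ordering guarantees above can be sketched as a small model. This is only an illustration of the ordering rules, not how the StatefulSet controller is implemented; `pod_names` and `scale_plan` are hypothetical helpers.

```python
from typing import List


def pod_names(name: str, replicas: int) -> List[str]:
    """Pod names are the StatefulSet name plus an ordinal suffix."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def scale_plan(name: str, current: int, desired: int) -> List[str]:
    """List the steps, in order, to go from `current` to `desired` replicas."""
    if desired >= current:
        # Scale up: the lowest new ordinal is created first.
        return [f"create {name}-{i}" for i in range(current, desired)]
    # Scale down: the highest ordinal is deleted first.
    return [f"delete {name}-{i}" for i in range(current - 1, desired - 1, -1)]


if __name__ == "__main__":
    print(pod_names("zk", 3))
    print(scale_plan("zk", 0, 3))
    print(scale_plan("zk", 3, 1))
```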
Service – to represent a group of pods with a cluster IP
Let’s say that we have 3 replicas of a pod (server-0, server-1, server-2) for load balancing
Q) How to achieve the following?
• Users must be unaware of the replicas
• Traffic is distributed over the replicas
A) Define a service with a cluster IP (10.111.67.108 in this example).
Then Kubernetes does round-robin forwarding to the pods behind it
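To make the round-robin idea concrete, here is a minimal sketch that hands out backends in turn. It only illustrates the concept; how Kubernetes actually picks a backend depends on the proxy configuration, and `round_robin` is a hypothetical helper.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Return an endless iterator that visits the backends in turn."""
    if not backends:
        raise ValueError("a service needs at least one backend")
    return cycle(backends)


if __name__ == "__main__":
    picker = round_robin(["server-0", "server-1", "server-2"])
    for _ in range(4):
        # Each request to the cluster IP goes to the next replica.
        print(next(picker))
```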
Headless service – service without a common IP
• Zookeeper clients (e.g. Kafka) need to specify the address of each Zookeeper server
• Kubernetes depends on its DNS service for headless services
• Each pod is assigned a domain name from Kubernetes
• Each pod is directly accessed with its domain name (not through a cluster IP)
• Fully Qualified Domain Name (FQDN) format
• $pod.$service.$namespace.svc.cluster.local
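The FQDN format above can be assembled mechanically. The sketch below builds per-pod names and a comma-separated connect string of the kind Kafka's zookeeper.connect expects; the helper names, the default namespace, and the zk / zk-headless names in the usage lines are assumptions for illustration.

```python
def pod_fqdn(pod: str, service: str, namespace: str = "default",
             cluster_domain: str = "cluster.local") -> str:
    """Build $pod.$service.$namespace.svc.<cluster domain>."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def zookeeper_connect(statefulset: str, service: str, replicas: int,
                      port: int = 2181, namespace: str = "default") -> str:
    """Comma-separated host:port list covering every pod of the StatefulSet."""
    hosts = [
        f"{pod_fqdn(f'{statefulset}-{i}', service, namespace)}:{port}"
        for i in range(replicas)
    ]
    return ",".join(hosts)


if __name__ == "__main__":
    print(pod_fqdn("zk-0", "zk-headless"))
    print(zookeeper_connect("zk", "zk-headless", 3))
```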
[Figure: the same Zookeeper and Kafka pods as on the earlier slide. Each Zookeeper server (myid 1 to 3) lists zk-1:2888:3888, zk-2:2888:3888, and zk-3:2888:3888 as server.1 to server.3; each Kafka server (broker.id 1 to 3) sets zookeeper.connect to zk-1.zk:2181, zk-2.zk:2181, and zk-3.zk:2181.]
Namespace in Kubernetes
Three pods (zk-0, zk-1, zk-2) are defined within the zk-headless service,
and they are given DNS entries of the following format:
pod.service.namespace.svc.cluster.local
The default namespace is used
as there’s no namespace declaration
How to address zk-1 on port 2181:
• zk-1:2181 (within service)
• zk-1.zk-headless:2181 (within same namespace, e.g. from the kafka service pods kk-0 to kk-3 in the default namespace)
• zk-1.zk-headless.default.svc.cluster.local:2181 (from other namespace, the "alien namespace" in the figure)
Pod anti-affinity
This pod should not run in an X where one or more pods that satisfy Y are
already running.
- X is a topology domain
- node (topologyKey:kubernetes.io/hostname in this example)
- rack
- cloud provider zone
- cloud provider region
- Y is a label selector
- it selects all pods belonging to a service named zk-headless
⇓ debugging hook (a pod pauses until it is set to true)
kube-scheduler is about to schedule pod2 labeled app=zk-headless,
but wants to avoid node3 because there’s pod1 labeled app=zk-headless.
Kubernetes provides pod anti-affinity for this case.
[Figure: node1, node2, and node3; pod1 (app=zk-headless) already runs on node3, and kube-scheduler is placing pod2 (app=zk-headless) on a different node]
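The anti-affinity rule described above can be sketched as a node filter. This is a simplified model that treats each node as its own topology domain (as with topologyKey kubernetes.io/hostname) and handles only a hard requirement; it is not the kube-scheduler's implementation, and all names are hypothetical.

```python
from typing import Dict, List

Labels = Dict[str, str]


def matches(selector: Labels, labels: Labels) -> bool:
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def schedulable_nodes(nodes: List[str],
                      pods_on_node: Dict[str, List[Labels]],
                      anti_affinity_selector: Labels) -> List[str]:
    """Keep only nodes that run no pod matching the anti-affinity selector."""
    return [
        node for node in nodes
        if not any(matches(anti_affinity_selector, labels)
                   for labels in pods_on_node.get(node, []))
    ]


if __name__ == "__main__":
    running = {"node3": [{"app": "zk-headless"}]}   # pod1 already on node3
    print(schedulable_nodes(["node1", "node2", "node3"], running,
                            {"app": "zk-headless"}))
```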
Files in the container image
• Dockerfile
1. Download the latest Zookeeper tarball
2. Extract and place the content under /opt/zookeeper
3. ln -s /opt/zookeeper/* /usr/bin
• zkGenConfig.sh
1. create zoo.cfg
2. configure log-related properties
3. create data directories
4. set myid extracted from domain name
• ex) zk-0.zk-headless.default.svc.cluster.local → 0 + 1 = 1
• zkOk.sh
• check readiness and liveness of a pod
⇓ it’s from Zookeeper
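The myid derivation in step 4 of zkGenConfig.sh can be sketched as follows. The actual script is a shell script that is not shown in the deck, so this Python version is only an illustration of the ordinal + 1 rule; `myid_from_hostname` is a hypothetical name.

```python
def myid_from_hostname(hostname: str) -> int:
    """Derive the Zookeeper myid from a StatefulSet pod's domain name."""
    pod_name = hostname.split(".", 1)[0]            # e.g. "zk-0"
    _, separator, ordinal = pod_name.rpartition("-")
    if not separator or not ordinal.isdigit():
        raise ValueError(f"no ordinal suffix in pod name: {pod_name!r}")
    # Ordinals start at 0, while the example maps ordinal 0 to myid 1.
    return int(ordinal) + 1


if __name__ == "__main__":
    print(myid_from_hostname("zk-0.zk-headless.default.svc.cluster.local"))
```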
Environment variables for container processes in a pod
env defines environment variables
to be used in container processes.
Two ways to assign values
1. value = constant val
2. valueFrom = val from ConfigMap
Readiness & liveness check for containers
Kubernetes provides a means of checking
readiness & liveness
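A health check for a Zookeeper container could look roughly like the sketch below. It assumes the check relies on Zookeeper's four-letter command ruok, which a healthy server answers with imok; whether zkOk.sh works exactly this way is an assumption, and the function names, host, and port defaults are hypothetical.

```python
import socket


def is_healthy_reply(reply: bytes) -> bool:
    """A healthy server answers the 'ruok' command with 'imok'."""
    return reply.strip() == b"imok"


def zk_is_ok(host: str = "127.0.0.1", port: int = 2181,
             timeout: float = 2.0) -> bool:
    """Send 'ruok' to the client port and report whether the reply is 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            reply = conn.recv(16)
    except OSError:
        # Connection refused, timeout, etc. all count as "not ready".
        return False
    return is_healthy_reply(reply)


if __name__ == "__main__":
    # Exit code 0 means healthy, which is what a probe command would check.
    raise SystemExit(0 if zk_is_ok() else 1)
```

A probe that runs a command inside the container treats exit code 0 as success, so the same check can back both the readiness and the liveness probe.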
How to guarantee a certain # of running pods during maintenance
• Users can define PodDisruptionBudget with minAvailable
• At least two pods from zk must be available at any time
• Below is an example illustrating PodDisruptionBudget
• together with StatefulSet and PodAntiAffinity
[Figure: node1, node2, and node3, each running one zk pod; zk-0 runs on node1]
1. Drain node1
• Operation is permitted because allowed-disruptions=1
2. 3 replicas have to be running due to StatefulSet, so Kubernetes tries scheduling zk-0 on other nodes
• Oops! zk-0 cannot be scheduled on node2 and node3 due to PodAntiAffinity
3. Drain node2
• Operation not permitted because allowed-disruptions=0 (note that minAvailable=2)
• Please wait until node1 is up and zk-0 is rescheduled
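The allowed-disruptions arithmetic in this walkthrough can be sketched as follows. It is a simplified model of a budget defined with minAvailable, with hypothetical function names, and it is not the Kubernetes implementation.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """How many more pods may be evicted without violating minAvailable."""
    return max(0, healthy_pods - min_available)


def can_evict(healthy_pods: int, min_available: int) -> bool:
    """An eviction (e.g. during a node drain) is permitted only if budget remains."""
    return allowed_disruptions(healthy_pods, min_available) > 0


if __name__ == "__main__":
    # Three healthy zk pods, minAvailable=2: draining node1 is permitted.
    print(allowed_disruptions(3, 2), can_evict(3, 2))
    # zk-0 is still unscheduled, so only two pods are healthy: draining node2 is refused.
    print(allowed_disruptions(2, 2), can_evict(2, 2))
```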
Scaling issue with Zookeeper
• Dynamically changing the membership of a replicated distributed system, while
preserving data consistency and system availability, is challenging
• from “Dynamic Reconfiguration of Primary/Backup Clusters” in USENIX ATC 2012
• Prior to Zookeeper 3.5.0 (we use 3.4.9, which is the latest stable version at this point)
• Configuration parameters are loaded during boot
• Configuration parameters are immutable at runtime
• Operators have to carefully restart all daemons
• Starting with Zookeeper 3.5.0,
• Full support for automated configuration changes without service interruption, while preserving data consistency
• Reconfigurable items: the set of Zookeeper servers, roles of servers, all ports, and even the quorum system
* https://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html
Scaling up/down a StatefulSet
A StatefulSet can be scaled up/down directly with kubectl
• kubectl scale statefulset $statefulSetInstanceName --replicas=5
• kubectl patch statefulset $statefulSetInstanceName -p '{"spec":{"replicas":3}}'
Topics not covered here
• Detailed architecture of Kubernetes
• https://github.com/kubernetes/community/blob/master/contributors/design-proposals/architecture.md
• ReplicaSet and Deployment (other than StatefulSet)
• https://kubernetes.io/docs/user-guide/replicasets/
• https://kubernetes.io/docs/user-guide/deployments/
• Persistent Volume and Persistent Volume Claim
• https://kubernetes.io/docs/user-guide/volumes/
• Kubernetes network (Proxy, DNS, etc)
• https://kubernetes.io/docs/admin/networking/
• https://kubernetes.io/docs/admin/dns/
The end
