The document discusses container networking and microservices architecture, highlighting the need for service discovery, load balancing, and multi-tenancy in cloud environments. It explains how containers are lightweight units of software that help reduce conflicts in development environments, while microservices allow for autonomous and scalable applications. Additionally, it elaborates on networking models such as CNM and CNI, which facilitate container communication and management, particularly in platforms like Docker and Kubernetes.
Containers
• A container image is a lightweight, stand-alone, executable unit of software
• Includes everything needed to run it: code, runtime, system tools, system libraries, settings
• Containerized software runs regardless of the environment (i.e. host OS distro)
• Containers isolate software from its surroundings
– “smooth out” differences between development and staging environments
• Help reduce conflicts between teams running different software on the same infrastructure
What Developers Want: Portable, Fast, Light
What IT Ops Needs: Network Services, Data Persistence, Rich SLAs, Consistent Management + Security & Isolation
Containers “at-a-glance”
[Diagram] Server with VMs: Physical Server → Hypervisor → Guest OS → Bins/Libraries → App (A, B) vs. Server with Containers: Physical Server → Host OS → Container Engine → Bins/Libraries → App (A, B)
Containers are isolated, but share the OS and (where appropriate) bins/libraries
Abstraction at the OS layer rather than the hardware layer
Microservices: Application Design is changing !!!
Properties of a Microservice
✓ Small code base
✓ Easy to scale, deploy and throw away
✓ Autonomous
✓ Resilient
Benefits of a Microservices Architecture
✓ A highly resilient, scalable and resource-efficient application
✓ Enables smaller development teams
✓ Teams free to use the right languages and tools for the job
✓ Rapid application development
Cloud Native Application
Applications built using the “Microservices” architecture pattern
[Diagram: example microservices — User mgmt., Payments, Inventory, Billing, Delivery, Notification — fronted by an API GW, Web UI and Mobile clients]
• Loosely coupled distributed application
Application tier is decomposed into multiple web services
• Datastore
Each microservice typically has its own datastore
• Packaging
Each microservice is typically packaged in a “Container” image
• Teams
Typically a team owns one or more Microservices
More on Microservices….
• Microservices != Containers
• The idea behind Microservices is to
separate functionality into small parts that
are created independently, by different teams,
and possibly even in very different languages
• Microservices communicate with each other
using language-agnostic APIs
(e.g. REST)
• The host for each Microservice could be a VM, but containers are seen as the ideal packaging unit to deploy a Microservice => low footprint
Image: https://upload.wikimedia.org/wikipedia/commons/9/9b/Social_Network_Analysis_Visualization.png
Challenges of running Microservices…
• Service Discovery
• Operational Overhead (100s+ of Services !!!)
• Distributed System... inherently complex
• Service Dependencies
– service fan-out
– dependency services running “hot”
• Traffic / Load each service can handle
• Service Health / Fault Tolerance
• Auto-Scale
Applications and Micro-Services
[Diagram] Users on the Internet access Service A, Service B and Service C through the external network; each service runs as multiple instances (Service A and Service B: instances #1–#3, Service C: instances #1–#2), operated by a system administrator.
Basics of Container Networking
Minimalist networking requirements:
• IP connectivity in the container’s network
• IP Address Management (IPAM) and network device creation
• External connectivity via host NAT or route advertisement
[Diagram: containers running on bare metal or a VM, on top of the host OS networking stack]
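These requirements can be met by hand with standard Linux tooling; the sketch below is illustrative only (the namespace name, veth names, 172.18.0.0/24 subnet and eth0 uplink are assumptions) and roughly what container engines automate for you.

# Minimal "container networking" by hand (illustrative; run as root)
ip netns add c1                                   # the "container": its own network namespace
ip link add veth-c1 type veth peer name ceth0     # veth pair: one end stays on the host
ip link set ceth0 netns c1                        # move the other end into the namespace
ip addr add 172.18.0.1/24 dev veth-c1
ip link set veth-c1 up
ip netns exec c1 ip addr add 172.18.0.2/24 dev ceth0   # IPAM: give the container an address
ip netns exec c1 ip link set ceth0 up
ip netns exec c1 ip route add default via 172.18.0.1
# External connectivity via host NAT (the alternative is advertising 172.18.0.0/24 upstream)
iptables -t nat -A POSTROUTING -s 172.18.0.0/24 -o eth0 -j MASQUERADE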
Docker: The Container Network Model (CNM) Interfacing
• Sandbox
– A Sandbox contains the configuration of a container's network stack. This includes management of the
container's interfaces, routing table and DNS settings. An implementation of a Sandbox could be a
Linux Network Namespace, a FreeBSD Jail or other similar concept.
• Endpoint
– An Endpoint joins a Sandbox to a Network. An implementation of an Endpoint could be a veth pair, an
Open vSwitch internal port or similar
• Network
– A Network is a group of Endpoints that are able to communicate with each other directly. An implementation of a Network could be a VXLAN segment, a Linux bridge, a VLAN, etc.
[Diagram] Three container hosts, each with a GW bridge for external network access; an App Container, a Frontend Container and a Backend Container each have their own network sandbox, with endpoints joining them to the Frontend and/or Backend networks.
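A rough illustration of the three constructs with the Docker CLI (the network and container names here are made up): the network, the container’s sandbox and the endpoint that joins them.

docker network create --driver bridge frontend     # Network (backed here by a Linux bridge)
docker run -d --name web --network frontend nginx  # the container gets a Sandbox (its own netns)
docker network inspect frontend                    # shows the container's Endpoint on this network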
Container Network Model (CNM)
• The intention is for CNM (aka libnetwork) to implement and use any kind of networking
technology to connect and discover containers
• Partitioning, Isolation, and Traffic Segmentation are achieved by dividing network addresses
• CNM does not specify one preferred methodology for any network overlay scheme
Docker networking – Using the defaults
[Diagram] A Docker host (VM) with eth0 at 192.168.178.100 on the 192.168.178.0/24 network; the Linux kernel routes between eth0 and the ‘docker0’ Linux bridge (172.17.42.1/16); containers attach to docker0 through veth interfaces (e.g. veth0f00eed, veth27e6b05) and receive addresses from 172.17.0.0/16 (e.g. 172.17.0.1, 172.17.0.2); iptables firewall/NAT rules sit on each path.
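A quick way to see these defaults on a Docker host (addresses and interface names will differ per host; ‘web’ is a made-up container name):

docker run -d --name web nginx        # lands on the default 'bridge' network (docker0)
ip addr show docker0                  # the docker0 Linux bridge and its 172.17.x.x address
docker network inspect bridge         # the 172.17.0.0/16 subnet and the container's IP
iptables -t nat -L POSTROUTING -n     # the MASQUERADE rule that NATs container traffic out of eth0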
Docker Swarm && libnetwork – Built-In Overlay model
[Diagram] Admin clients issue ‘docker network …’ commands to the Swarm Master; the master writes the available global overlay networks into the distributed key-value store node(s); each Swarm node (Docker host) writes the endpoints it sees, with all their details, into the KV store and creates the networks found there as new Linux bridges (the user-defined network plus a local docker_gwbridge per host), all running in the network of a datacenter or public cloud provider.
Each container has two interfaces:
• eth0 = plugs into the overlay
• eth1 = plugs into a local bridge (docker_gwbridge) for NAT internet / uplink access
Overlay networks are implemented with fixed / static MAC-to-VTEP mappings
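A hedged sketch of this workflow with the classic Swarm / libnetwork overlay (the KV-store address, interface name, subnet and names below are assumptions):

# On each Docker host: point the engine at the shared key-value store
dockerd --cluster-store=consul://10.0.0.10:8500 --cluster-advertise=eth0:2376 &
# Create a global overlay network (recorded in the KV store, visible on every node)
docker network create -d overlay --subnet 10.0.9.0/24 my_overlay
# Containers attached on different hosts reach each other over the VXLAN overlay;
# eth1 / docker_gwbridge is added automatically for NAT internet / uplink access
docker run -d --name svc1 --network my_overlay nginx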
Docker Networking – key points
• Docker adopts the Container Network Model (CNM), providing the following contract between networks and containers:
– All containers on the same network can communicate freely with each other
– Multiple networks are the way to segment traffic between containers and should be supported by all drivers
– Multiple endpoints per container are the way to join a container to multiple networks (see the sketch after this list)
– An endpoint is added to a network sandbox to provide it with network connectivity
• Docker Engine can create overlay networks on a single host. Docker Swarm can create overlay networks that span hosts in the cluster
• A container can be assigned an IP on an overlay network. Containers that use the same overlay network can communicate, even if they are running on different hosts
• By default, nodes in the swarm encrypt traffic between themselves and other nodes. Connections between nodes are automatically secured through TLS authentication with certificates
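A minimal sketch of the multi-endpoint contract above (network names and the nginx image are just placeholders):

docker network create frontend
docker network create backend
docker run -d --name app --network frontend nginx              # first endpoint: frontend
docker network connect backend app                             # second endpoint: same container joins backend
docker inspect -f '{{json .NetworkSettings.Networks}}' app     # one endpoint per attached network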
Kubernetes Architectural overview
[Diagram] The Kubernetes Master (components co-located or spread across machines) exposes a REST interface (pods, services, replication controllers) with authentication / authorization, and runs the scheduler / scheduling actuator, the Controller Manager (replication controller, etc.) and the distributed key-value store node(s) (etcd); admin clients (kubectl, …) talk to the master, while users access services running on the Kubernetes Nodes (Minions). Each node runs a Docker engine, the Kubelet and Kube-Proxy, plus control pods (cAdvisor, the ‘pause’ container, skyDNS) alongside application pods.
Quick Overview of Kubernetes
Kubernetes (k8s) = Open Source Container Cluster Manager
• Pods: tightly coupled group of containers
• Replication controller: ensures that a specified number of
pod "replicas" are running at any one time.
• Networking: Each pod gets its own IP address
• Service: Load balanced endpoint for a set of pods with internal and external
IP endpoints
• Service Discovery: Using env variable injection or SkyDNS with the Service
• Uses etcd as distributed key-value store
• Has its roots in ‘borg’, Google’s internal container cluster management
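A minimal sketch tying these pieces together, using the 2017-era v1 API (names, labels and the nginx image are placeholders): a replication controller keeping three pod replicas and a service load-balancing across them.

kubectl create -f - <<'EOF'
apiVersion: v1
kind: ReplicationController
metadata:
  name: web-rc
spec:
  replicas: 3                  # keep three pod "replicas" running at any one time
  selector:
    app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web                    # discoverable via env variable injection or DNS
spec:
  selector:
    app: web                   # load-balanced endpoint for all pods with this label
  ports:
  - port: 80
EOF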
Kubernetes Node (Minion) – Docker networking details
• Traffic destined to a POD is routed by the IaaS network to the Kubernetes node that ‘owns’ the subnet (e.g. ip route 10.24.1.0/24 10.240.0.3, ip route 10.24.2.0/24 10.240.0.4)
• Each POD uses one single IP from the node’s IP range
• Every container in the POD shares the same IP
[Diagram] Each node has eth0 on the node network (e.g. 10.240.0.3) with an iptables firewall and Kube-Proxy; Pods (each with a ‘pause’ container) attach to the cbr0 Linux bridge (10.24.1.1 on subnet 10.24.1.0/24) and receive addresses such as 10.24.1.2, 10.24.1.3 and 10.24.1.4.
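The per-node routes shown above could be programmed in the IaaS network (or on an upstream router) roughly like this, using the example addresses from the diagram:

# Route each node's POD subnet to the node that owns it
ip route add 10.24.1.0/24 via 10.240.0.3
ip route add 10.24.2.0/24 via 10.240.0.4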
Container Network Interface (CNI)
• Kubernetes uses the Container Network Interface (CNI) specification and plug-ins to
orchestrate networking
• In contrast to CNM, CNI allows containers to reach other containers’ IP addresses directly, without resorting to network address translation (NAT)
• Every time a POD is initialized or removed, the default CNI plug-in is called with the default
configuration
• This CNI plug-in creates a pseudo interface, attaches it to the relevant underlay network, sets
IP Address / Routes and maps it to the POD namespace
/etc/cni/net.d/10-bridge.conf
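The referenced /etc/cni/net.d/10-bridge.conf is a JSON network configuration for the standard CNI bridge plug-in; a representative example might look like the following (the network name, bridge name, subnet and cniVersion are illustrative values, not taken from the original deck):

# Representative CNI bridge plug-in configuration (illustrative values)
cat > /etc/cni/net.d/10-bridge.conf <<'EOF'
{
  "cniVersion": "0.3.1",
  "name": "mynet",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.22.0.0/16",
    "routes": [ { "dst": "0.0.0.0/0" } ]
  }
}
EOF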
Kubernetes Networking – key points
• Kubernetes adopts the Container Network Interface (CNI) model to provide a contract between networks and containers
• From a user perspective, provisioning networking for a container involves two steps:
➢ Define the network JSON
➢ Connect the container to the network
• Internally, CNI provisioning involves three steps (see the sketch after this list):
➢ The runtime creates a network namespace and gives it a name
➢ It invokes the CNI plugin specified in the “type” field of the network JSON. The type field refers to the plugin being used, so CNI invokes the corresponding binary
➢ The plugin code in turn creates a veth pair, checks the IPAM type and data in the JSON, invokes the IPAM plugin, gets an available IP, and finally assigns the IP address to the interface
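The same three steps can be exercised by hand against a CNI plug-in (the namespace name and the paths below are assumptions; parameters are passed as environment variables and the network JSON on stdin):

# 1. The runtime creates a named network namespace
ip netns add demo-ns
# 2. Invoke the plug-in named in the "type" field, passing the network JSON on stdin
CNI_COMMAND=ADD CNI_CONTAINERID=demo CNI_NETNS=/var/run/netns/demo-ns \
CNI_IFNAME=eth0 CNI_PATH=/opt/cni/bin \
/opt/cni/bin/bridge < /etc/cni/net.d/10-bridge.conf
# 3. The plug-in creates the veth pair, calls the IPAM plug-in (host-local here)
#    and prints the assigned IP address as a JSON result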
Container Networking Specifications
Container Networking Model (CNM)
• Specification proposed by Docker, adopted by projects such as libnetwork
• Plugins built by projects such as Weave, Project Calico and Kuryr
• Supports only the Docker runtime
Container Networking Interface (CNI)
• Specification proposed by CoreOS and adopted by projects such as Kubernetes, Cloud Foundry and Apache Mesos
• Plugins built by projects such as Weave, Project Calico, Contiv Networking
• Supports any container runtime
CNI and CNM commonalities…
• CNI and CNM models are both driver-based
– provide “freedom of selection” for a specific type of container networking
• Multiple network drivers can be active and used concurrently
– 1:1 mapping between network type and network driver
• Containers are allowed to join one or more networks
• The container runtime can launch the network in its own namespace
– delegating to the network driver the responsibility of connecting the container to the network
Client vs Server-side Service discovery
Client Discovery
• Client talks to the Service registry and does load balancing
• Client service needs to be Service-registry aware
e.g. Netflix OSS
Server Discovery
• Client talks to a load balancer and the load balancer talks to the Service registry
• Client service need not be Service-registry aware
e.g. Consul, AWS ELB, K8s, Docker
What should Service Discovery provide?
• Discovery
– Services need to discover each other dynamically, to get the IP address and port details needed to communicate with other services in the cluster
– The Service Registry maintains a database of services and provides an external API (HTTP/DNS). Typically implemented as a distributed key-value store
– The Registrator registers services dynamically with the Service registry by listening to service creation and deletion events
• Health check
– Monitors Service Instance health dynamically and updates the Service registry appropriately
• Load balancing
– Traffic destined to a particular service should be dynamically load balanced to “healthy”
instances providing that service
Health Check options…
• Script-based check
– A user-provided script is run periodically to verify the health of the service
• HTTP-based check
– A periodic HTTP check is made against the service IP and endpoint address
• TCP-based check
– A periodic TCP check is made against the service IP and specified port
• Container-based check
– The health-check application is available as a container. The Health Check Manager invokes the container periodically to do the health check
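With a registry such as Consul, for example, these checks are plain definitions that the local agent runs on a schedule; the file path, names, script and endpoints below are made-up illustrations, not part of the original deck:

# Illustrative Consul agent check definitions (script-, HTTP- and TCP-based)
cat > /etc/consul.d/web-checks.json <<'EOF'
{
  "checks": [
    { "name": "web-script", "script": "/usr/local/bin/check_web.sh", "interval": "30s" },
    { "name": "web-http",   "http": "http://localhost:8080/health",  "interval": "10s" },
    { "name": "web-tcp",    "tcp": "localhost:8080",                 "interval": "10s" }
  ]
}
EOF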
Internal Load Balancer – IPVS
• IPVS (IP Virtual Server) implements transport-layer load balancing inside the Linux kernel, so-called Layer-4 switching
• It’s based on Netfilter and supports TCP, SCTP & UDP, over both IPv4 and IPv6
• IPVS is dynamically configurable, supports 8+ balancing methods, and provides health checking (see the sketch below)
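A minimal ipvsadm sketch of an IPVS virtual service (the virtual IP, real-server addresses and ports are illustrative):

ipvsadm -A -t 10.0.0.100:80 -s rr                    # add a TCP virtual service, round-robin scheduler
ipvsadm -a -t 10.0.0.100:80 -r 10.244.1.5:8080 -m    # add a real server, NAT (masquerade) mode
ipvsadm -a -t 10.0.0.100:80 -r 10.244.2.7:8080 -m
ipvsadm -L -n                                        # list virtual services and their real servers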
Service Discovery
• Kubernetes provides two options for internal service discovery:
– Environment variable: When a new Pod is created, environment variables from older services
can be imported. This allows services to talk to each other. This approach enforces ordering in
service creation.
– DNS: Every service registers to the DNS service; using this, new services can find and talk to
other services. Kubernetes provides the kube-dns service for this.
• Kubernetes provides several ways to expose services to the outside:
– NodePort: In this method, Kubernetes exposes the service through special ports (30000-32767) of the node IP address (see the sketch after this list)
– LoadBalancer: In this method, Kubernetes interacts with the cloud provider to create a load balancer that redirects the traffic to the Pods. This approach is currently available with GCE
– Ingress Controller: Since Kubernetes v1.2.0 it’s possible to use a Kubernetes Ingress, which includes support for TLS and L7 HTTP-based traffic routing
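A sketch of the NodePort option above (the service name, selector, ports and nodePort value are placeholders):

kubectl create -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80            # cluster-internal service port
    targetPort: 8080    # container port on the Pods
    nodePort: 30080     # exposed on every node IP; must fall in 30000-32767
EOF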
Internal Load Balancing
• Service names get mapped to a Virtual IP and port using SkyDNS
• Kube-proxy watches Service changes and updates iptables. Virtual IP to Service IP/port remapping is achieved using iptables (see the sketch below)
• Kubernetes does not use DNS-based load balancing to avoid some of the known issues associated with it
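On a node, this remapping can be observed in the NAT table that kube-proxy maintains (the KUBE-* chain names are created by kube-proxy; the actual rules and service IPs will differ per cluster):

sudo iptables -t nat -L KUBE-SERVICES -n | head      # per-service virtual-IP match rules
sudo iptables-save -t nat | grep KUBE-SVC- | head    # per-service chains that pick a backend
sudo iptables-save -t nat | grep KUBE-SEP- | head    # per-endpoint DNAT rules to Pod IP:port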
Ingress Load Balancing with the Ingress Controller
• An Ingress is a collection of rules that allow inbound connections to reach the cluster services.
• It can be configured to give services externally reachable URLs, load-balance traffic, terminate SSL, offer name-based virtual hosting, etc.
– Users request ingress by POSTing the Ingress resource to the API server (see the sketch below)
• In order for the Ingress resource to work, the cluster must have an Ingress controller running. The Ingress controller is responsible for fulfilling the Ingress dynamically by watching the API server’s /ingresses endpoint
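A minimal Ingress resource sketch using the 2017-era extensions/v1beta1 API (the host name, TLS secret and backend service are placeholders):

kubectl create -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: web-ingress
spec:
  tls:
  - hosts: [ "myapp.example.com" ]
    secretName: myapp-tls          # SSL/TLS termination at the ingress controller
  rules:
  - host: myapp.example.com        # name-based virtual hosting
    http:
      paths:
      - path: /
        backend:
          serviceName: web         # traffic routed to this service's pods
          servicePort: 80
EOF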
Networking for Services
[Diagram] Two nodes, each with a guest vSwitch and its own POD subnet (Node 1: 10.10.10.0/24 with ProjA-1 at 10.10.10.2 and ProjB-1 at 10.10.10.3; Node 2: 10.10.20.0/24 with ProjA-2 at 10.10.20.2 and ProjB-2 at 10.10.20.3); node-specific routes send each subnet to its node (10.10.10.0/24 → 10.114.214.100, 10.10.20.0/24 → 10.114.214.101), and an edge LB resolves myapp.k8s.com → {10.10.10.2, 10.10.20.2}.
• K8s default networking configures:
– a routable IP per POD
– a subnet per node / minion
• K8s Service provides East-West load balancing
• Provides DNS-based service discovery – Service name to IP
• Network Security Policy – in beta
• Not in K8s scope:
– Edge LB – e.g. external to frontend pods
– Routing of a subnet to a k8s node (node-specific routes)
Multi-Tenancy and Application tiering (cont.)
Example of a Multi-Tenancy Model
[Diagram] Three tenants (A, B, C), each owning quota-limited projects with per-user access, running on VMs across Kubernetes, Pivotal CF and Docker: Project A (250 GB, 100 vCPU; access for paulf, jamesz and tinga), Project B (200 GB, 200 vCPU; access for kitc, mikep and mikew), Project C (250 GB, 150 vCPU; access for stegeler and francisg), Project D (300 GB, 100 vCPU; access for tinga), Project E (600 GB, 600 vCPU; access for martijnb).
…to Overlay-based Networking Model…
• The Neutron plugin talks to the SDN Controller via vendor APIs
• The SDN Controller manages vSwitches in the Hypervisors
• VMware NSX, Contrail, Nuage, Midokura, …
…to Cluster Deployment on Logical Networks…
[Diagram] A Kubernetes cluster deployed on logical networks: the Master ‘VM’ (etcd, API server, KubeDNS) and the Minion ‘VMs’ (Kube-Proxy, KubeDNS, Pods 1–6) attach to a ‘Cluster Management Nodes’ logical switch, while Pods are grouped per namespace onto their own logical switches (namespace ‘demo’, namespace ‘foo’, kube-system); a logical router connects these switches to an edge router and on to the Internet / corporate network.