Docker Networking Deep Dive
@MadhuVenugopal
online meetup 08/24/2016
Agenda
• What is libnetwork
• CNM
• 1.12 features
• Multi-host networking
• Secured control plane & data plane
• Service Discovery
• Native Load Balancing
• Routing Mesh
• Demo
Overview

What is libnetwork?
It is not just a driver interface:
• Docker networking fabric
• Defines the Container Networking Model (CNM)
• Provides built-in IP address management
• Provides native multi-host networking
• Provides native Service Discovery and Load Balancing
• Allows for extensions by the ecosystem via plugins
Design Philosophy
• Users First:
• Application Developers
• IT/Network Ops
• Plugin API Design
• Batteries Included but Swappable
Docker Networking
1.7 1.8 1.9 1.10 1.11
- Libnetwork
- CNM
- Migrated Bridge, host,
none drivers to CNM
- Multihost Networking
- Network Plugins
- IPAM Plugins
- Network UX/API
Service Discovery

(using /etc/hosts)
Distributed DNS
- Aliases
- DNS Round Robin LB
1.12
- Load Balancing
- Encrypted Control and
data plane
- Routing Mesh
- Built-in Swarm-mode
networking
Container Networking Model
• Endpoint
• Network
• Sandbox
• Drivers & Plugins
https://github.com/docker/libnetwork/blob/master/docs/design.md
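A minimal CLI sketch of how these CNM objects show up in practice (network and container names are illustrative): creating a network instantiates a Network, running a container gives it a Sandbox, and each connection adds an Endpoint that docker network inspect lists under "Containers".

$ docker network create -d bridge demo-net                      # a Network (bridge driver, built-in IPAM)
$ docker run -d --name c1 --net=demo-net busybox sleep 3600     # a Sandbox for c1 plus an Endpoint on demo-net
$ docker network connect bridge c1                              # a second Endpoint in the same Sandbox, different Network
$ docker network inspect demo-net                               # shows c1's endpoint (MAC, IPv4) under "Containers"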
Network driver overview
Use-case 1: Default Bridge Network (docker0)

[Diagram: three hosts, each with containers C1–C3 attached via their eth0 interfaces to the local docker0 bridge; each host's eth0 uplinks to the ToR switch / hypervisor switch, and iptables provides NAT / port-mapping for external access on every host.]
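On the default bridge, external access goes through the iptables NAT / port-mapping shown above; publishing a port is what installs the DNAT rule (a minimal sketch, the image and ports are illustrative):

$ docker run -d -p 8080:80 --name web nginx     # host :8080 is DNATed to the container's :80 on docker0
$ iptables -t nat -L DOCKER -n                  # shows the DNAT rule Docker installed for the published port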
Use-case 2: User-Defined Bridge Network

Host1:
$ docker network create -d bridge -o com.docker.network.bridge.name=brnet brnet
$ docker run --net=brnet -it busybox ifconfig

Host2:
$ docker network create -d bridge -o com.docker.network.bridge.name=brnet brnet
$ docker run --net=brnet -it busybox ifconfig

Host3:
$ docker network create -d bridge -o com.docker.network.bridge.name=brnet brnet
$ docker run --net=brnet -it busybox ifconfig

[Diagram: each host gets its own brnet bridge (172.18.0.1) with containers C1–C9 attached via eth0; traffic leaves through the host's eth0 to the ToR switch / hypervisor switch, with iptables NAT / port-mapping on each host.]
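Unlike the default bridge, containers on a user-defined bridge can resolve each other by name through the embedded DNS noted in the timeline above (a minimal sketch; the container names are illustrative):

$ docker run -d --net=brnet --name web nginx
$ docker run --rm --net=brnet busybox ping -c 1 web    # "web" resolves via the embedded DNS, no /etc/hosts editing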
Use-case 3: Bridge Network plumbed to underlay with built-in IPAM (no NAT / port-mapping)

Host1:
$ docker network create -d bridge --subnet=192.168.57.0/24 --ip-range=192.168.57.32/28 --gateway=192.168.57.11 --aux-address DefaultGatewayIPv4=192.168.57.1 -o com.docker.network.bridge.name=brnet brnet
$ brctl addif brnet eth2
$ docker run --net=brnet -it busybox ifconfig

Host2:
$ docker network create -d bridge --subnet=192.168.57.0/24 --ip-range=192.168.57.64/28 --gateway=192.168.57.12 --aux-address DefaultGatewayIPv4=192.168.57.1 -o com.docker.network.bridge.name=brnet brnet
$ brctl addif brnet eth2
$ docker run --net=brnet -it busybox ifconfig

Host3:
$ docker network create -d bridge --subnet=192.168.57.0/24 --ip-range=192.168.57.128/28 --gateway=192.168.57.13 --aux-address DefaultGatewayIPv4=192.168.57.1 -o com.docker.network.bridge.name=brnet brnet
$ brctl addif brnet eth2
$ docker run --net=brnet -it busybox ifconfig

[Diagram: on each host, brnet is plumbed into the underlay by adding eth2 (192.168.57.11 / .12 / .13) to the bridge; the ToR switch / hypervisor switch / VirtualBox host-only network provides the shared gateway 192.168.57.1, so containers get routable underlay addresses with no NAT / port-mapping.]
Use-case 4: Docker Overlay Network

[Diagram: three hosts, each with docker0 / docker_gwbridge and the overlay network ov-net1; containers C1–C9 attach to ov-net1 via their eth0 interfaces, overlay traffic is carried between hosts in VXLAN tunnels (VNI 100) over eth1, and docker_gwbridge plus iptables NAT / port-mapping provide external connectivity on each host.]
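A hedged sketch of the pre-swarm-mode overlay setup this use-case implies, assuming an external Consul key-value store reachable at consul-host:8500 (host, subnet and names are illustrative; 1.12 swarm mode removes the external store, as covered later):

# on every host: point the daemon at the external key-value store (normally set in the daemon's startup options)
$ dockerd --cluster-store=consul://consul-host:8500 --cluster-advertise=eth1:2376 &
# on any host: create the overlay; it becomes visible on all hosts sharing the store
$ docker network create -d overlay --subnet=10.0.1.0/24 ov-net1
$ docker run -d --net=ov-net1 --name o1 busybox sleep 3600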
Use-case 5: Plumbed to underlay VLANs with built-in IPAM
macvlan driver (& experimental ipvlan)
https://github.com/docker/docker/blob/master/experimental/vlan-networks.md
# vlan 10 (eth0.10)
$ docker network create -d macvlan --subnet=10.1.10.0/24 --gateway=10.1.10.1 -o parent=eth0.10 mcvlan10
$ docker run --net=mcvlan10 -it --rm alpine /bin/sh

# vlan 20 (eth0.20)
$ docker network create -d macvlan --subnet=10.1.20.0/24 --gateway=10.1.20.1 -o parent=eth0.20 mcvlan20
$ docker run --net=mcvlan20 -it --rm alpine /bin/sh

# vlan 30 (eth0.30)
$ docker network create -d macvlan --subnet=10.1.30.0/24 --gateway=10.1.30.1 -o parent=eth0.30 mcvlan30
$ docker run --net=mcvlan30 -it --rm alpine /bin/sh
Docker 1.12 Networking

New features in 1.12 swarm mode:
• CNM
• Routing Mesh
• Multi-host networking without external k/v store
• Service Discovery
• Secure data plane
• Secure control plane
• Load balancing
• Cluster aware
• De-centralized control plane
• Highly scalable
Swarm-mode Multi-host networking
[Diagram: on the Manager, a Network Create / Service Create flows through the Orchestrator, Allocator, Scheduler and Dispatcher; the resulting Task Create / Task Dispatch reaches Worker1 and Worker2, where the Engine's libnetwork programs the data path, and workers exchange network state via Gossip.]
• VXLAN based data path
• No external key-value store
• Central resource allocation
• Improved performance
• Highly scalable
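A minimal sketch of the 1.12 workflow these bullets describe (addresses, names and image are illustrative); note that no external key-value store is configured anywhere:

$ docker swarm init --advertise-addr 192.168.99.100                     # on the manager
$ docker swarm join --token <worker-token> 192.168.99.100:2377          # on each worker
$ docker network create -d overlay mynet                                # resources allocated centrally by the manager
$ docker service create --name web --network mynet --replicas 2 nginx   # tasks get attached to mynet wherever they land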
Secured network control plane
• Gossip based protocol
• Network scoped
• Fast convergence
• Secure by default
  • Periodic key rotations
  • Swarm native key-exchange
• Gossips control messages
  • Routing states
  • Service discovery
  • Plugin data
• Highly scalable

[Diagram: cluster-scope gossip spans all workers (W1–W5); each network additionally has its own network-scope gossip group covering only the workers that run tasks on that network.]
Secure dataplane
• Available as an option during overlay network creation
• Uses kernel IPSec modules
• On-demand tunnel setup
• Swarm native key-exchange
• Periodic key rotations

[Diagram: Worker1, Worker2 and Worker3 each carry a "secure network" and a "non-secure network"; secure-network traffic travels between workers inside IPSec tunnels, while the non-secure network sends open (unencrypted) UDP traffic.]
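Enabling the secure data plane is a single option at network-creation time (a minimal sketch; the network name is illustrative):

$ docker network create -d overlay --opt encrypted secure-network    # VXLAN traffic on this network goes through IPSec tunnels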
Service Discovery
• Provided by embedded DNS
• Highly available
• Uses the Network Control Plane to learn state
• Can be used to discover both tasks and services

[Diagram: each container's DNS resolver forwards DNS requests to the DNS server embedded in the local Engine.]
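From inside any container attached to the network, the embedded DNS answers locally at 127.0.0.11 (a minimal sketch; the service name "web" is illustrative):

# inside a container on mynet
$ cat /etc/resolv.conf        # nameserver 127.0.0.11 (the embedded DNS resolver)
$ nslookup web                # the service name resolves to its VIP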
Load balancer
• Internal & ingress load-balancing
• Supports VIP & DNS-RR
• Highly available
• Uses the Network Control Plane to learn state
• Minimal overhead

[Diagram: Client1 and Client2 send traffic to ServiceA's VIP; the VIP LB on each client's node spreads connections across Task1, Task2 and Task3 of ServiceA.]
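The mode is chosen per service at creation time (a minimal sketch; names and image are illustrative). VIP is the default; DNS-RR skips the virtual IP and returns the individual task IPs from DNS instead:

$ docker service create --name svc-vip   --network mynet --endpoint-mode vip   nginx    # default: one VIP, IPVS spreads connections
$ docker service create --name svc-dnsrr --network mynet --endpoint-mode dnsrr nginx    # DNS round-robin across task IPs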
Routing mesh
• Built-in routing mesh for edge routing
• Worker nodes themselves participate in the ingress routing mesh
• All worker nodes accept connection requests on the PublishedPort
• Port translation happens at the worker node
• The same internal load balancing mechanism is used to load balance external requests

[Diagram: an optional external load balancer spreads traffic across Worker1 and Worker2; both workers listen on 8080 on the Ingress Network, translate 8080->80, and the VIP LB on each node forwards to ServiceA's tasks wherever they run.]
Routing Mesh: Published Ports
• The operator reserves a swarm-wide ingress port (8080) for myapp
• Every node listens on 8080
• The container-aware routing mesh can transparently reroute traffic from Worker3 to a node that is running the container
• Load balancing is built into the Engine
• DNS-based service discovery

[Diagram: the user accesses myapp.com:8080; Worker1, Worker2, Worker3 and the Manager all listen on :8080, while only some of the nodes run the three frontend tasks.]

$ docker service create --replicas 3 --name frontend --network mynet --publish 8080:80/tcp frontend_image:latest
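Once the service is published, any node answers on the published port, whether or not it runs a task (a minimal sketch; the node IP is a placeholder):

$ curl http://<any-node-ip>:8080/    # routed through the ingress mesh to one of the frontend tasks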
Deep Dive
Service, Port-Publish & Network

$ docker service create --name=test --network=mynet -p 8080:80 --replicas=2 xxx

[Diagram: on Host1, the hidden ingress sandbox (ingress-sbox) attaches to docker_gwbridge and to the ingress overlay bridge, with VXLAN tunnels (VNI 100) to host2 and host3; the task's container sandbox has interfaces on the ingress network, on docker_gwbridge and on mynet (mynet-br, VXLAN tunnel with VNI 101 to host2); iptables and IPVS rules in both sandboxes handle traffic arriving on Host1:8080, and the container's DNS resolver forwards lookups to the daemon's embedded DNS server, which maps the service name to its VIP.]
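Two hedged ways to see these pieces from the CLI on a swarm node (output details vary by Docker version):

$ docker network inspect ingress                                          # the ingress overlay and its per-node peers
$ docker service inspect --format '{{json .Endpoint.VirtualIPs}}' test    # the service VIPs on ingress and mynet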
Day in the life of a packet - IPTables & IPVS
Day in the life of a packet - Routing Mesh & Ingress LB

Host1 (the node that receives the external request on the published port):
• iptables NAT table, DOCKER-INGRESS chain: DNAT Published-Port -> ingress-sbox (via docker_gwbridge)
• Inside ingress-sbox, iptables NAT table, PREROUTING: redirect to the service port
• Inside ingress-sbox, iptables MANGLE table, PREROUTING: MARK Published-Port -> <fw-mark-id>
• IPVS: match <fw-mark-id> -> masquerade, round-robin across the container IPs
• The chosen backend is reached over the ingress overlay bridge and its VXLAN tunnel

Host2 (the node running the selected task):
• The packet arrives over the ingress network and is delivered to the container sandbox backing the task/service
• Host2 has the same DOCKER-INGRESS / ingress-sbox plumbing for requests that land on it directly
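A hedged way to peek at this machinery on a swarm node (the netns path below is the usual default and ipvsadm must be installed; both are assumptions):

$ nsenter --net=/var/run/docker/netns/ingress_sbox iptables -t mangle -L PREROUTING -n   # the published-port -> fw-mark rule
$ nsenter --net=/var/run/docker/netns/ingress_sbox ipvsadm -L -n                         # the IPVS round-robin backends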
Day in the life of a packet - Internal LB

Host1 (container sandbox running service1):
• The application looks up service2 via the container's DNS resolver (embedded DNS at 127.0.0.11); the daemon's embedded DNS server answers service2 -> VIP2
• iptables MANGLE table, OUTPUT chain: MARK VIP -> <fw-mark-id>
• IPVS: match <fw-mark-id> -> masquerade, round-robin across the container IPs
• The packet leaves through eth2 on the mynet overlay bridge and its VXLAN tunnel to Host2

Host2:
• The packet arrives on mynet-overlay-bridge and is delivered to the container sandbox backing service2
Thank you!