POSTGRESQL HIGH AVAILABILITY
IN A CONTAINERIZED WORLD
Jignesh Shah
Chief Architect, Data Platform
About @jkshah
✓ appOrbit
•  Focus is on data management of applications running in Containers
✓ VMware
•  Lead and manage Postgres and Data Management teams at VMware for various products embedding PostgreSQL running in virtualized embedded instances
✓ Sun Microsystems
•  Team Member of first published SpecJAppServer 2004 benchmark with PostgreSQL
•  Performance of PostgreSQL on Solaris/Sun Servers
✓ Working with the PostgreSQL community since 2005
•  http://jkshah.blogspot.com/2005/04/profiling-postgresql-using-dtrace-on_22.html
✓ Working with Container technologies (Solaris Zones) since 2004
•  http://jkshah.blogspot.com/2004/08/db2-working-under-solaris-10-zones_30.html
Agenda
✓ Containers
✓ Enterprise Needs
✓ PostgreSQL Replication
✓ Modern Projects
✓ Blueprint of Deployments
What are Containers?
✓ OS-level virtualization where the kernel allows for multiple isolated user-space instances
[Diagram: four stacks compared — a traditional OS on a bare metal server; guest OSes on a hypervisor on bare metal; containers (C) sharing one OS on bare metal; and containers inside a guest OS on a hypervisor]
Advantages of Containers
✓ Lower footprint
✓ Very quick startup and shutdown
✓ Density
✓ Nesting
Disadvantages of Containers
✓ Same kernel version as the host
✓ Cannot run other OSes natively
✓ Security (to be improved)
✓ Not a complete solution for enterprise needs
Where to use containers?
✓ Recreate identical environments (cookie-cutter)
✓ Resource grouping of specific processes on a heavily loaded server
✓ Handling multiple versions of software applications
✓ Ephemeral application instances (Dev/Test)
✓ Production instances (growing every day)
✓ Many more
Docker – Popular Container engine
•  Installation
# sudo tee /etc/yum.repos.d/docker.repo <<-'EOF'
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/7/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF
# yum install docker-engine
# systemctl enable docker.service
# systemctl start docker.service
Docker
✓ Quick guide to using a Docker-based container
# docker run --name mycontainer -e POSTGRES_PASSWORD=mysecretpassword -d postgres
# docker exec -ti mycontainer psql -U postgres
# docker stop mycontainer
# docker rm mycontainer
# docker rmi postgres
Container Volumes
✓ Persist beyond the life of a Docker container
•  VOLUME command in a Dockerfile, or
•  -v option to docker run
•  Automatically created during docker run if not already present
•  Not part of docker push/pull operations
•  Can select a non-local directory using --volume-driver
•  Third-party components required for multi-host support (NFS, etc.)
✓ Different options using -v
•  -v /hostsrc/data:/opt/data:ro # read-only volume (default rw)
•  -v /hostsrc/data:/opt/data:Z # Z – private SELinux label, z – shared
•  -v /etc/nginx.conf:/etc/nginx.conf # mounting a single file
✓ Volumes can be shared from another container on the same host using --volumes-from
✓ Docker 1.9 and later give volumes first-class status (named volumes; see the sketch below)
PostgreSQL Container as a DB server
✓ Maybe you want a standalone database server
•  Not all database clients will be on the same host
•  Need to limit memory usage
•  Need a different layout of how files are distributed
✓ Use the -p option to make the port available even to non-container clients
✓ Use -m to limit memory usage by the DB server (by default it can see and use everything)
•  Note this does not set shared_buffers automatically with the library image (it can be overridden at run time, as shown below)
docker run --name mycontainer -m 4g -e POSTGRES_PASSWORD=mysecretpassword \
  -v /hostpath/pgdata:/var/lib/postgresql/data -p 5432:5432 -d postgres
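Since the library image passes any trailing arguments through to the postgres binary, server parameters can be overridden at run time without a custom image; the values below are illustrative:

# override server parameters via arguments to the postgres binary
docker run --name mycontainer -m 4g -e POSTGRES_PASSWORD=mysecretpassword \
  -v /hostpath/pgdata:/var/lib/postgresql/data -p 5432:5432 -d postgres \
  -c shared_buffers=1GB -c max_connections=200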
PostgreSQL in an enterprise environment
✓ However, for a real production use case we would need
•  Bigger shared memory configurations
•  A different layout of how files are distributed
•  Ability to back up the database
•  Ability to set up replication
•  etc.
✓ In short, we need a more custom image of PostgreSQL
Best Practices for custom image
✓ For production installs, customize the Docker image
•  Allocate proper memory limits - example 8GB
•  All pagecache usage shows up as docker container memory usage
•  Bump up shared_buffers and other parameters as required
•  Hint: use PostgreSQL 9.3 or later, otherwise you have to run privileged containers
•  http://jkshah.blogspot.com/2015/09/is-it-privilege-to-run-container-in.html
•  Support multiple volumes in your image (see the sketch below)
•  PITR archives
•  Full backup directory
•  PostgreSQL extensions
•  Set up replication support
•  Out-of-box replication setup
•  Monitoring tool
•  Your favorite monitoring agent
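A sketch of running such a customized image with separate volumes for data, PITR archives, and full backups (the mycustom/postgres image name and the mount points are hypothetical):

docker run --name pgprod -m 8g -e POSTGRES_PASSWORD=mysecretpassword \
  -v /hostpath/pgdata:/var/lib/postgresql/data \
  -v /hostpath/pgarchive:/var/lib/postgresql/archive \
  -v /hostpath/pgbackup:/var/lib/postgresql/backup \
  -p 5432:5432 -d mycustom/postgres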
Enterprise Needs for Databases
Planning a High Availability Strategy
✓ Requirements
•  Recovery Time Objective (RTO)
•  What does 99.99% availability really mean?
•  Recovery Point Objective (RPO)
•  Zero data loss?
•  HA vs. DR requirements
✓ Evaluating a technology
•  What's the cost of implementing the technology?
•  What's the complexity of implementing and managing the technology?
•  What's the downtime potential?
•  What's the data loss exposure?
Availability %          | Downtime / Year | Downtime / Month * | Downtime / Week
"Two Nines" - 99%       | 3.65 Days       | 7.2 Hours          | 1.69 Hours
"Three Nines" - 99.9%   | 8.76 Hours      | 43.2 Minutes       | 10.1 Minutes
"Four Nines" - 99.99%   | 52.56 Minutes   | 4.32 Minutes       | 1.01 Minutes
"Five Nines" - 99.999%  | 5.26 Minutes    | 25.9 Seconds       | 6.06 Seconds
* Using a 30-day month
Simplified View of HA PostgreSQL
✓ Easy to set up
✓ Handles infrastructure problems
✓ Exploits storage features
✓ Exploits replication features
[Diagram: applications connect via a DNS name to a PostgreSQL instance somewhere in the cloud / data center]
Causes of Downtime
✓ Planned downtime
•  Software upgrades (OS patches, database cumulative updates)
•  Hardware/BIOS upgrades
✓ Unplanned downtime
•  Datacenter failure (natural disasters, fire)
•  Server failure (failed CPU, bad network card)
•  I/O subsystem failure (disk failure, controller failure)
•  Software/data corruption (application bugs, OS binary corruption)
•  User error (shut down a database service, dropped a table)
Typical Plan of action
✓ Minimize the reasons that lead to downtime
✓ Faster recovery time (balanced checkpoints)
✓ Proxies for fast switching between production and DR copies
✓ Shared storage for HA
✓ PostgreSQL synchronous replication to go beyond
HA PostgreSQL with Shared Storage
✓ Ability to leverage hardware snapshots/restore
✓ Automated failover using OS clustering software
✓ Block-level replication for DR
✓ Distributed shared storage getting popular
[Diagram: applications reach the active node at Site 1 through a virtual IP, DNS, pgPool, or pgBouncer; both nodes attach to shared storage]
PostgreSQL Replication
✓ Single master, multi-slave
✓ Cascading slaves also possible
✓ Mechanism based on WAL (Write-Ahead Logs)
✓ Multiple modes and multiple recovery ways
•  Warm standby
•  Asynchronous hot standby (see the sketch below)
•  Synchronous hot standby
✓ Slaves can optionally serve read operations
•  Good for read scale
✓ Node failover and reconnection possible
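A minimal sketch of setting up an asynchronous hot standby with the 9.x-era recovery.conf mechanism (host name, role, and password are illustrative; wal_level, max_wal_senders, and a replication entry in pg_hba.conf are assumed to be configured on the master):

# on the master: create a replication role, then copy the data directory
psql -U postgres -c "CREATE ROLE replicator REPLICATION LOGIN PASSWORD 'secret';"
pg_basebackup -h master-host -U replicator -D /var/lib/postgresql/data -X stream

# on the slave: recovery.conf turns the copy into a streaming standby
cat > /var/lib/postgresql/data/recovery.conf <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=master-host user=replicator password=secret application_name=standby1'
EOF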
HA PostgreSQL with Sync Replication
✓ Synchronous replication within the data center (configuration sketch below)
✓ Low downtime (lower than shared-storage HA)
✓ Automated failover for hardware issues, including storage
[Diagram: applications reach the master at Site 1 through a virtual IP, DNS, pgPool, or pgBouncer; the master replicates synchronously to a standby]
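Synchronous mode is a configuration change on the master (ALTER SYSTEM requires PostgreSQL 9.4 or later); the standby name below is illustrative and must match the application_name in the standby's primary_conninfo:

# on the master: name the standby that must confirm each commit
psql -U postgres <<'EOF'
ALTER SYSTEM SET synchronous_standby_names = 'standby1';
SELECT pg_reload_conf();
EOF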
PostgreSQL Replication
✓ In-core replication does replication well
•  But no automated failover
•  "failback" (pg_rewind – thank god)
•  Load-balanced IP address
•  Get your own proxy (HAProxy? pgbouncer? pgpool?)
•  No way to preserve connections
Photo credit: dundanim / Shutterstock.com
Just PostgreSQL?
✓ Need more projects
•  pgPool-II / HAProxy / pgbouncer
•  repmgr, etc.
✓ Some customers at this time prefer Cloud DBaaS
•  Heroku
•  Amazon RDS
✓ Some end up preferring an Enterprise version of DBaaS
•  appOrbit ☺
Modern HA Projects
✓ Patroni / Governor (process-level sketch below)
•  https://github.com/zalando/patroni (Python)
•  Docker container
•  etcd
•  HAProxy
✓ Stolon
•  https://github.com/sorintlab/stolon (Golang)
•  Docker
•  etcd / Consul
•  Custom proxy
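A minimal sketch of what a Patroni-style deployment looks like at the process level (the YAML files are assumed to exist and to point both members at the same etcd endpoint and cluster scope):

# one etcd node for consensus (a real deployment would run 3+)
etcd --data-dir /tmp/etcd &

# two Patroni-managed PostgreSQL members; Patroni initializes the first
# as master and bootstraps the second as a streaming slave
patroni /etc/patroni/node1.yml &
patroni /etc/patroni/node2.yml &

# HAProxy (or the client) then routes to whichever member holds the leader key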
Governor
https://github.com/compose/governor/blob/master/postgres-ha.pdf
Stolon
https://github.com/sorintlab/stolon/blob/master/doc/architecture_small.png
Basic Container-based HA Architecture
✓ Need a distributed store for configuration status
•  Consul
•  ZooKeeper
•  etcd
✓ PostgreSQL cluster peers (self-managing); see the leader-key sketch below
•  Each determines local instance status and updates the configuration store
•  The master regularly updates its status, failing which it is considered failed
•  If the master fails, an election is held based on least lag and a new leader takes over
•  The other standbys now follow the new master
•  Potentially a third party can even re-provision the dead master as a slave
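The heartbeat and election both reduce to operations on a key with a TTL; a sketch with etcd v2 semantics (the key path and node names are illustrative):

# master heartbeat: keep refreshing a key that expires if the master stops
etcdctl set /service/pgcluster/leader node1 --ttl 30

# standbys watch the key; when it expires, each candidate tries to create it -
# `mk` succeeds only if the key is absent, so exactly one winner takes the lead
etcdctl mk /service/pgcluster/leader node2 --ttl 30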
Some New Trends in Container World
✓ Binaries and data often separated
•  One lives in the container image, the other in volumes
✓ pg_xlog no longer deployed on separate volumes
•  Underlying storage technologies lead to inconsistent point-in-time restores, leaving the DB unusable
✓ No new tablespaces
•  Hard to get easy replication setups done on the fly
•  Could lead to lost data if new tablespaces are not on volumes
✓ Replication set up with automation rather than manually by admins
Some New Trends in Container World
✓ Adoption of microservices
•  Leading to lots of smaller databases, one per microservice
✓ Faster updates
•  Schema changes sometimes need to be backward compatible
✓ Repeatable deployments
•  Need to redeploy at a moment's notice
Deployment of PostgreSQL “Cluster”
✓ Can be made self-healing
✓ Integrate with pg_rewind to reuse the old master as a slave
✓ Integrate with shared storage to leverage snapshots when creating new slaves
[Diagram: applications connect through a virtual IP to Instance 1, with Instances 2 and 3 as slaves, all backed by shared storage]
But Wait, I Have Multiple DB Servers
✓ I need my clusters to grow dynamically (read scaling)
✓ I also want things to auto-heal as much as possible
[Diagram: applications in front of several PostgreSQL clusters]
Kubernetes
✓ Production-grade container orchestrator
✓ Horizontal scaling
•  Set up rules to scale slaves
✓ ConfigMap (ConfigMaps and Secrets sketched below)
•  postgresql.conf
•  pg_hba.conf
✓ Secrets
•  Usernames, passwords
•  Certificates
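Configuration and credentials can then live outside the image; a sketch with kubectl (the object names are illustrative):

# server configuration as a ConfigMap
kubectl create configmap pg-config \
  --from-file=postgresql.conf --from-file=pg_hba.conf

# credentials as a Secret, injected into pods as env vars or files
kubectl create secret generic pg-secret \
  --from-literal=POSTGRES_PASSWORD=mysecretpassword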
Kubernetes
✓ Persistent storage features evolving
•  Plugins for storage drivers
✓ External services
•  Services are accessible from all nodes
•  Shared-storage plugins make your stateful containers HA as well
•  A powerful combination along with PostgreSQL replication
•  Can spin up fast slaves for multi-TB databases
Production Grade Orchestrator
✓ Can even add rules to spin up new slaves for read load, as sketched below
[Diagram: operations and applications in front of an orchestrated PostgreSQL cluster that adds read slaves on demand]
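A sketch of such scaling rules with kubectl (the deployment name and thresholds are illustrative):

# scale read slaves by hand...
kubectl scale deployment pg-slave --replicas=5

# ...or attach an autoscaling rule driven by CPU load
kubectl autoscale deployment pg-slave --min=2 --max=8 --cpu-percent=75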
But Wait... Need to Support Multi-Geo
✓ It could be a DR strategy
✓ It could be compliance requirements
✓ Service discovery now getting complicated
[Diagram: operations and applications spanning multiple data centers]
Consul
•  Service Discovery
•  Failure Detection
•  Multi Data Center
•  DNS Query Interface
{
  "service": {
    "name": "mypostgresql",
    "tags": ["master"],
    "address": "127.0.0.1",
    "port": 5432,
    "enableTagOverride": false
  }
}
nslookup master.mypostgresql.service.domain
nslookup mypostgresql.service.domain
Service Discovery
✓ Uniform DNS name for your database
✓ Cloud-agnostic naming
✓ Certificates created using DNS names you own
✓ No single point of failure
[Diagram: operations and applications resolving the database through service discovery]
PostgreSQL Enhancement
✓ SRV records from the nameserver
•  https://en.wikipedia.org/wiki/SRV_record
•  Carry both IP and port
✓ PostgreSQL libpq client enhancement
•  Support service discovery using SRV records
•  A service name is passed
•  libpq looks up the SRV record from the nameserver (see the dig example below)
•  Connects to the host:port provided by the SRV record
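Consul's DNS interface (default port 8600) already answers SRV queries carrying both IP and port, which is the record shape such a libpq lookup would consume:

# SRV lookup against the local Consul agent for the current master
dig @127.0.0.1 -p 8600 master.mypostgresql.service.consul SRV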
Summary
✓ PostgreSQL "cluster" deployments are the wave of change
✓ Containers are one of the technologies, but not the whole solution
Your Feedback is Important!
✓ We'd like to understand your use of Postgres for HA / DR.
✓ If interested,
•  Twitter: @jkshah
•  Email: jignesh@apporbit.com
Thanks.
Questions?
Follow me on Twitter: @jkshah
Blog: http://jkshah.blogspot.com

Full copies of your applications at the push of a button
We are HIRING !!!
