KEMBAR78
Writing infinite scalability web applications with PHP and PostgreSQL | PDF
Writing innite scalability web
applications with PHP and
PostgreSQL
PHPDay 2013 - Verona
Gabriele Bartolini
2ndQuadrant Italia
gabriele.bartolini@2ndQuadrant.it
@_GBartolini_
sabato 18 maggio 13
Gabriele Bartolini
•Co-Founder and Manager of 2ndQuadrant Italia
• Data Architect, Business critical environments (OLTP)
• Data warehousing
•Co-Founder Italian PostgreSQL Users Group
•Co-Founder PostgreSQL Europe
•PostgreSQL Contributor and Advocate
sabato 18 maggio 13
Gabriele Bartolini 1.0 (1995-2005)
•Web mining
•Web programmer
• C, C++ and PHP
• Web accessibility
• IWA
• W3C (EARL)
•Open source developer
• ht://Dig (C++)
• ht://Check (C++, MySQL,
PHP)
• ht://Miner (C++,
PostgreSQL)
sabato 18 maggio 13
PostgreSQL = Postgres
100% open source, available at www.postgresql.org
sabato 18 maggio 13
Who’s using Postgres?
... in a production environment
sabato 18 maggio 13
Who’s planning to use Postgres?
sabato 18 maggio 13
Who’s familiar with sharding?
sabato 18 maggio 13
Traditional PHP application
HTML 5
Javascript
CSS 3
PHP
Postgres
JSON
ResultSet
sabato 18 maggio 13
Postgres 9.2+ PHP Application
HTML 5
Javascript
CSS 3
PHP
Postgres
JSON
sabato 18 maggio 13
HTML 5
Javascript
CSS 3
PHP
Postgres
PL/Proxy
Postgres PostgresPostgres
“NoSQL”
......
NoSQL with Postgres
sabato 18 maggio 13
Sharding
•Horizontal partitioning
technique
•Logically and physically
split a table by distributing
data across different
databases/servers
• Reduced number of rows in
each shard
• Reduced index size
•Some DBMS have the
"DISTRIBUTED BY" clause
for "CREATE TABLE"
•Query distribution
• Parallel processing
• Shared nothing architecture
sabato 18 maggio 13
Table 1
Table 1
Shard 1
Table 1
Shard 2
Table 1
Shard 3
Table 1
Shard ...
Table 1
Shard n
Table 1
SELECT * FROM Table1;
Parallel execution on the ‘n’ shards
sabato 18 maggio 13
Part I
Postgre(s|SQL)
sabato 18 maggio 13
PostgreSQL
•Started in 1995 (Postgres started in 1986)
•Current major version is 9.2
• Supported major versions are 8.4, 9.0, 9.1 and 9.2
•The PostgreSQL License (BSD style)
• Not subject to monopoly nor acquisitions
•Platform independent (C language)
sabato 18 maggio 13
Some major features
•Client/Server
•Concurrency (MVCC)
•Referential integrity
•ACID transactions
•Views
•Functions
•Triggers
•Tablespaces
•Partitioning
•Schemas
•Point In Time Recovery
•Master/Slave Replication
sabato 18 maggio 13
Features for developers
•Standard compliant (e.g.
SQL, XML, UTF8, JSON, ...)
•Client interfaces: C/C++,
PHP, JDBC, ODBC, .Net,
Python, Perl, Ruby, ...
•Vast gamma of ORMs
•Several procedural
languages
•Versatile and extensible
• custom data types
• intelligence in the database
• extensions (from 9.1)
• PostGIS
•Database Unit testing
sabato 18 maggio 13
PostgreSQL 9.3
•Autumn 2013
•JSON operators and
functions
•Updatable views
•Materialised views
•Writable foreign tables
•Support for LATERAL
•...
•http://wiki.postgresql.org/
wiki/What
%27s_new_in_PostgreSQL_
9.3
sabato 18 maggio 13
NO ... DISTRIBUTED BY
•PostgreSQL does not support transparent distribution of
data and queries (yet)
•Sharding can be implemented through PL/Proxy using
functions
sabato 18 maggio 13
Part II
Architecture and scalability
sabato 18 maggio 13
Scalability
Property of a database to grow according to both
technical and business requirements
sabato 18 maggio 13
Server
dbdb
Single
SMP Server
Vertical scalability
sabato 18 maggio 13
Server
db
Commodity
Hardware
Server Server
db
Horizontal scalability
sabato 18 maggio 13
Disclaimer
For the sake of simplicity, every database server
described in the following architecture diagrams,
deliberately represents a single point of failure.
Due to its high availability/disaster recovery features and
open source tools, PostgreSQL is widely used in mission
critical environments.
sabato 18 maggio 13
Server
postgres
Physical/Virtual
server
Single server architecture
apache/php
Pros: cost
Cons: performance (CPU, RAM, I/O), scalability, business
continuity, security, no db connection pooling
sabato 18 maggio 13
Web server
Two-server architecture
apache/php
Pros: flexibility (hardware, OS), cost optimisation, security,
performance (CPU, RAM, I/O), db connection pooling
Cons: business continuity, scalability
DB Server
postgres
pgbouncer
Virtual, +CPU, +RAM, -I/O
Physical, =CPU, =RAM, +I/O
sabato 18 maggio 13
Benchmark
•2 x Intel Xeon E5630 CPUs
• 48 GB RAM
• Storage:
• 2 SAS 7.2k rpm, RAID 1
• 2 SAS 15k rpm, RAID 1
• 6 SAS 15k rpm, RAID 1+0
•850 MB/s seq read
•13k UPDATE/sec 32 clients
• 1.1k/sec for “large” dbs
•PostgreSQL 9.2/CentOS 6
•Rental costs:
• 1.8k € setup + 350 €/month
sabato 18 maggio 13
Web server
Multiple web servers architecture
apache/php
DB Server
postgres
pgbouncer
pgbouncer
Web server
apache/php
pgbouncer
HA Proxy
HTTP Balancing
Pros: scalability and
continuity (Web)
Cons: scalability
and continuity (DB)
sabato 18 maggio 13
Web server
Read only database servers
apache/php
DB Master
postgres
pgbouncer
pgbouncer
Web server
apache/php
pgbouncer
HA Proxy
HTTP Balancing
DB Hot Standby
postgres
pgbouncer
Streaming
Replication
RO
RORW RW
sabato 18 maggio 13
Web server
Database sharding architecture
DB Access Server
postgres
pgbouncer
Web server
HA Proxy
Web layer
pl/proxy
Database Access Layer
DB Shard ...
postgres
pgbouncer
DB Shard n
postgres
pgbouncer
DB Shard 1
postgres
pgbouncer
Database
Data Layer
“NoSQL”“NoSQL”
“Infinite” horizontal scalability
Shared
nothing
architecture
sabato 18 maggio 13
Architecture recap
•Do not get carried away by current trends and buzzwords
•Make your decisions about architecture based on:
• business continuity requirements
• results of both database and application benchmarks
•Always start with a simple architecture (two-server)
•Enhance gradually up to sharding for infinite scalability
sabato 18 maggio 13
Part III
(Open Source) Technologies
sabato 18 maggio 13
PgBouncer
•Lightweight connection
pooler for PostgreSQL
•Written in C by Skype
•Recycling connections on:
• sessions
• transactions
• statements (for PL/Proxy)
•“Virtual databases”
• Real databases can be on
different backend servers
•Online reconfiguration,
restart and upgrade
• Online statistics
• Supports PAUSE/RESUME
sabato 18 maggio 13
Conguration example
[databases]
db1_ro = host=pg2 dbname=db1 port=5432
* = host=pg1 port=6432
[pgbouncer]
listen_addr = *
listen_port = 6432
pool_mode = session
...
sabato 18 maggio 13
Statement
Transaction
Session/Transaction/Statement
Connection/Session
Transaction
INSERT
UPDATE
Transaction
INSERT
SELECT
UPDATE
BEGIN
COMMIT
sabato 18 maggio 13
hstore
•Data type in PostgreSQL
•Key/value pairs as a single
value
•Useful for storing:
• unstructured data
• many attributes for a
record
•Keys and values are stored
as text
•Set of operators
• ->, @>, ?, ...
•Indexes
•Integrates with JSON in 9.3
sabato 18 maggio 13
JSON Support
•Introduced in PostgreSQL 9.2
•Data type
•Generators (9.3)
•Operators (9.3)
•Integration with hstore (9.3)
sabato 18 maggio 13
CREATE TABLE pictures (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL
REFERENCES users(id),
title TEXT,
metadata JSON,
thumbnail BYTEA,
picture BYTEA,
created_at TIMESTAMP DEFAULT now(),
updated_at TIMESTAMP DEFAULT now()
);
sabato 18 maggio 13
INSERT INTO pictures
(user_id, title, metadata)
VALUES (
100,
'A picture of my Stratocaster',
'{"latitude":43.883,"longitude":
11.1,"width":2240,"height":1680}'
);
Storing a JSON value
sabato 18 maggio 13
WITH m AS (
SELECT 43.883 AS latitude,
11.1 AS longitude,
2240 AS width, 1680 AS height
)
INSERT INTO pictures
(user_id, title, metadata)
SELECT 100,
'Another picture of my Stratocaster',
to_json(m) FROM m;
Storing a JSON value #2
sabato 18 maggio 13
WITH p AS (
SELECT title, metadata, created_at
FROM pictures WHERE id = 1
)
SELECT to_json(p) AS json FROM p;
{"title":"A picture of my Stratocaster","metadata":
{"latitude":43.883,"longitude":11.1,"width":
2240,"height":1680},"created_at":"2013-05-11
13:11:37.788636"}
9.3 Returning a JSON object
sabato 18 maggio 13
WITH p AS (
SELECT id, title, metadata, created_at
FROM pictures WHERE user_id = 100
)
SELECT json_agg(p) AS json FROM p;
[{"id":1,"title":"A picture of my Stratocaster","metadata":
{"latitude":43.883,"longitude":11.1,"width":2240,"height":
1680},"created_at":"2013-05-11 13:15:21.078747"}, +
{"id":2,"title":"Another picture of my Stratocaster","metadata":
{"latitude":43.883,"longitude":11.1,"width":2240,"height":
1680},"created_at":"2013-05-11 13:15:21.078747"}]
9.3 Returning an array of JSONs
sabato 18 maggio 13
9.3 A taste of JSON operators
SELECT id, (metadata->'longitude',
metadata->'latitude') AS coords
FROM pictures
WHERE user_id = 100;
id | coords
----+---------------
1 | (11.1,43.883)
2 | (11.1,43.883)
...
sabato 18 maggio 13
PL/Javascript
•Procedural Language
•Allows to implement functions inside PostgreSQL using the
Javascript Language
•Available as an extension
•Powered by V8 JavaScript Engine
sabato 18 maggio 13
CREATE FUNCTION plv8_test(keys TEXT[],
vals TEXT[]) RETURNS TEXT
LANGUAGE plv8 IMMUTABLE STRICT AS
$BODY$
var o = {};
for(var i=0; i<keys.length; i++) {
o[keys[i]] = vals[i];
}
return JSON.stringify(o);
$BODY$;
SELECT plv8_test(ARRAY['name', 'age'],
ARRAY['Angus', '58']);
sabato 18 maggio 13
PL/Proxy
•Procedural language for
dening proxy functions
•Written in C by Skype
•The body specifies:
• remote function (target)
• destination (shard)
•Execution can happen on
one, some or all shards
• Execution is parallel if
shards > 1
•Queries are run in auto-
commit mode
• PgBouncer (statement)
sabato 18 maggio 13
The need for an API
•PL/Proxy forces to work
with functions
•One database cluster with:
• access/proxy layer
• data layer (shards)
•Database cluster ~
Singleton class
•Business logic in the
database
•Programmers need to
dene an API with access/
authorisation rules
• Proxy functions (PL/Proxy)
• Remote functions (any)
•Supports SQL/MED
sabato 18 maggio 13
Database sharding architecture
pg-access-1
postgres
pl/proxy
pg-data-2
postgres
pg-data-1
postgres
Database
Access
Layer
Database
Data
Layer
s00 s01 s02 s03
- Denition of the cluster
- Data/Queries distributed by user (hash)
- Functions using the ‘plproxy’ language
- User data tables
- Functions (any language)
sabato 18 maggio 13
Dene the cluster with SQL/MED
CREATE SERVER mycluster
FOREIGN DATA WRAPPER plproxy
OPTIONS (
connection_lifetime '1800',
s0 'dbname=shard00 host=pg-data-1',
s1 'dbname=shard01 host=pg-data-1',
s2 'dbname=shard02 host=pg-data-2',
s3 'dbname=shard03 host=pg-data-2'
);
sabato 18 maggio 13
Proxy function for GET
CREATE OR REPLACE FUNCTION
get_pictures(i_uid INTEGER)
RETURNS json
LANGUAGE plproxy
AS
$BODY$
CLUSTER 'mycluster';
RUN ON hashint4(i_uid);
$BODY$;
sabato 18 maggio 13
Data layer function for GET
CREATE OR REPLACE FUNCTION
get_pictures(i_uid INTEGER)
RETURNS json
LANGUAGE SQL AS
$BODY$
WITH p AS (
SELECT id, title, metadata, created_at
FROM pictures WHERE user_id = $1
) SELECT json_agg(p) AS json FROM p;
$BODY$;
sabato 18 maggio 13
Proxy function for INSERT
CREATE OR REPLACE FUNCTION
insert_picture(i_uid INTEGER, i_title
TEXT, i_metadata JSON, i_thumbnail BYTEA,
i_picture BYTEA)
RETURNS INTEGER LANGUAGE plproxy
AS $BODY$
CLUSTER 'mycluster';
RUN ON hashint4(i_uid);
$BODY$;
sabato 18 maggio 13
Data function for INSERT
CREATE OR REPLACE FUNCTION
insert_picture(i_uid INTEGER, i_title
TEXT, i_metadata JSON, i_thumbnail BYTEA,
i_picture BYTEA)
RETURNS INTEGER LANGUAGE SQL
AS $BODY$
INSERT INTO pictures (user_id, title,
metadata, thumbnail, picture) VALUES ($1,
$2, $3, $4, $5) RETURNING id;
$BODY$;
sabato 18 maggio 13
Data representation in PHP
// Obviously all of this will come from a form
$user_id = 100;
$title = 'A picture of my Stratocaster';
$metadata = array (
'latitude' => 43.883,
'longitude' => 11.1,
'width' => 2240,
'height' => 1680, // ... EXIF?
);
$thumbnail = ... // Binary stream
$picture = ... // Binary stream
sabato 18 maggio 13
INSERT using PDO
// $access_layer_conninfo = 'pgsql:...'
$connection = new PDO($access_layer_conninfo);
$stmt = $connection->prepare('SELECT
insert_picture(?, ?, ?, ?, ?)'); // NOSQL!
$stmt->bindParam(1, $user_id, PDO::PARAM_INT);
$stmt->bindParam(2, $title, PDO::PARAM_STR);
$stmt->bindParam(3, json_encode($metadata),
PDO::PARAM_STR);
$stmt->bindParam(4, $thumbnail, PDO::PARAM_LOB);
$stmt->bindParam(5, $picture, PDO::PARAM_LOB);
$stmt->execute(); // TODO: Check and get the ID
sabato 18 maggio 13
Memcached
•Distributed system for caching of objects in memory
• key/value pairs
• volatile data
•pgmemcache extension
•Integration of database and applications (read scaling)
sabato 18 maggio 13
PostgreSQL
Memcached
m[k] = v
PHP
trigger
+
pgmemcache
sabato 18 maggio 13
Part IV
Testing and deployment
sabato 18 maggio 13
Test-Driven DB development
•Database development
technique
•Write tests before actual
features
•pgTAP is an extension for
TDDD in PostgreSQL
•Allows to check for objects,
privileges, functions,
triggers, data, queries, etc.
•Transactional
•Provides with statistics
• via SQL or pg_prove
•Supports “Test Anything
Protocol” (TAP)
• Continuous integration
sabato 18 maggio 13
BEGIN;
SELECT plan(3);
-- insert test 1 here
-- insert test 2 here
-- insert test 3 here
SELECT * FROM finish();
ROLLBACK;
pgTAP group test example
sabato 18 maggio 13
SELECT results_eq(
'SELECT * FROM active_users()',
'SELECT * FROM users WHERE active',
'active_users() should return active
users'
);
pgTAP example with ‘results_eq’
sabato 18 maggio 13
Deployment with EXTENSION
•Extensibility
•Deployment is a crucial part
of any ICT organisation
•PostgreSQL 9.1 introduces
EXTENSION for packaging
in-database applications
•Fully transactional (atomic)
•Allows for upgrades and
downgrades
•Simplifies logical dump/
restore of objects
•DevOps with EXTENSION:
• Devs write extensions
• Ops deploy them
sabato 18 maggio 13
Extensions in detail
•Extensions require developer to write:
• Control file (metadata, settings, dependencies, ...)
• Installation scripts (SQL statements)
• Update scripts (SQL statements, optional)
• Source code / binaries (optional)
sabato 18 maggio 13
Extension management
-- INSTALLATION
CREATE EXTENSION pictures_app;
-- REMOVAL
DROP EXTENSION pictures_app;
-- UPDATE (fully atomic)
ALTER EXTENSION pictures_app UPDATE;
sabato 18 maggio 13
Part V
Conclusions
sabato 18 maggio 13
Conclusions
•Postgres is much more than a “traditional” database
•PHP integrates perfectly with it, in multiple flavours (native
interface, PDO, Doctrine, etc.)
•Start exploiting “vertical scalability”
•Always perform benchmark!
•Evaluate costs and complexity of “horizontal scalability”
•Postgres is “business friendly” in terms of licensing
sabato 18 maggio 13
Feel free to approach me about
•Open Source Community
•License
•Total Cost of Ownership
•Development life cycle
•Quality of code and
documentation
•Business Continuity
•Security and data protection
•PostgreSQL Eco-system
•Performance and scalability
•Software development
sabato 18 maggio 13
Community
•Next events:
• PGDay.IT (Autumn 2013)
• PGConf.EU (Dublin, Oct 29
- Nov 1, 2013)
•Italian PostgreSQL Users
Group (ITPUG)
• www.itpug.org
•PostgreSQL Europe
• www.postgresql.eu
sabato 18 maggio 13
References
•PostgreSQL: www.postgresql.org
•PgBouncer: http://wiki.postgresql.org/wiki/PgBouncer
•PL/Proxy: http://wiki.postgresql.org/wiki/PL/Proxy
•PL/V8: http://pgxn.org/dist/plv8/
•pgTAP: www.pgtap.org
sabato 18 maggio 13
Questions?
E-mail: gabriele.bartolini@2ndQuadrant.it
Twitter: @_GBartolini_
sabato 18 maggio 13
Thank you!
Copyright (c) 2013 2ndQuadrant Italia
License: Creative Commons BY-NC-SA 3.0
http://creativecommons.org/licenses/by-nc-sa/3.0/it/deed.en
sabato 18 maggio 13

Writing infinite scalability web applications with PHP and PostgreSQL

  • 1.
    Writing innite scalabilityweb applications with PHP and PostgreSQL PHPDay 2013 - Verona Gabriele Bartolini 2ndQuadrant Italia gabriele.bartolini@2ndQuadrant.it @_GBartolini_ sabato 18 maggio 13
  • 2.
    Gabriele Bartolini •Co-Founder andManager of 2ndQuadrant Italia • Data Architect, Business critical environments (OLTP) • Data warehousing •Co-Founder Italian PostgreSQL Users Group •Co-Founder PostgreSQL Europe •PostgreSQL Contributor and Advocate sabato 18 maggio 13
  • 3.
    Gabriele Bartolini 1.0(1995-2005) •Web mining •Web programmer • C, C++ and PHP • Web accessibility • IWA • W3C (EARL) •Open source developer • ht://Dig (C++) • ht://Check (C++, MySQL, PHP) • ht://Miner (C++, PostgreSQL) sabato 18 maggio 13
  • 4.
    PostgreSQL = Postgres 100%open source, available at www.postgresql.org sabato 18 maggio 13
  • 5.
    Who’s using Postgres? ...in a production environment sabato 18 maggio 13
  • 6.
    Who’s planning touse Postgres? sabato 18 maggio 13
  • 7.
    Who’s familiar withsharding? sabato 18 maggio 13
  • 8.
    Traditional PHP application HTML5 Javascript CSS 3 PHP Postgres JSON ResultSet sabato 18 maggio 13
  • 9.
    Postgres 9.2+ PHPApplication HTML 5 Javascript CSS 3 PHP Postgres JSON sabato 18 maggio 13
  • 10.
    HTML 5 Javascript CSS 3 PHP Postgres PL/Proxy PostgresPostgresPostgres “NoSQL” ...... NoSQL with Postgres sabato 18 maggio 13
  • 11.
    Sharding •Horizontal partitioning technique •Logically andphysically split a table by distributing data across different databases/servers • Reduced number of rows in each shard • Reduced index size •Some DBMS have the "DISTRIBUTED BY" clause for "CREATE TABLE" •Query distribution • Parallel processing • Shared nothing architecture sabato 18 maggio 13
  • 12.
    Table 1 Table 1 Shard1 Table 1 Shard 2 Table 1 Shard 3 Table 1 Shard ... Table 1 Shard n Table 1 SELECT * FROM Table1; Parallel execution on the ‘n’ shards sabato 18 maggio 13
  • 13.
  • 14.
    PostgreSQL •Started in 1995(Postgres started in 1986) •Current major version is 9.2 • Supported major versions are 8.4, 9.0, 9.1 and 9.2 •The PostgreSQL License (BSD style) • Not subject to monopoly nor acquisitions •Platform independent (C language) sabato 18 maggio 13
  • 15.
    Some major features •Client/Server •Concurrency(MVCC) •Referential integrity •ACID transactions •Views •Functions •Triggers •Tablespaces •Partitioning •Schemas •Point In Time Recovery •Master/Slave Replication sabato 18 maggio 13
  • 16.
    Features for developers •Standardcompliant (e.g. SQL, XML, UTF8, JSON, ...) •Client interfaces: C/C++, PHP, JDBC, ODBC, .Net, Python, Perl, Ruby, ... •Vast gamma of ORMs •Several procedural languages •Versatile and extensible • custom data types • intelligence in the database • extensions (from 9.1) • PostGIS •Database Unit testing sabato 18 maggio 13
  • 17.
    PostgreSQL 9.3 •Autumn 2013 •JSONoperators and functions •Updatable views •Materialised views •Writable foreign tables •Support for LATERAL •... •http://wiki.postgresql.org/ wiki/What %27s_new_in_PostgreSQL_ 9.3 sabato 18 maggio 13
  • 18.
    NO ... DISTRIBUTEDBY •PostgreSQL does not support transparent distribution of data and queries (yet) •Sharding can be implemented through PL/Proxy using functions sabato 18 maggio 13
  • 19.
    Part II Architecture andscalability sabato 18 maggio 13
  • 20.
    Scalability Property of adatabase to grow according to both technical and business requirements sabato 18 maggio 13
  • 21.
  • 22.
  • 23.
    Disclaimer For the sakeof simplicity, every database server described in the following architecture diagrams, deliberately represents a single point of failure. Due to its high availability/disaster recovery features and open source tools, PostgreSQL is widely used in mission critical environments. sabato 18 maggio 13
  • 24.
    Server postgres Physical/Virtual server Single server architecture apache/php Pros:cost Cons: performance (CPU, RAM, I/O), scalability, business continuity, security, no db connection pooling sabato 18 maggio 13
  • 25.
    Web server Two-server architecture apache/php Pros:flexibility (hardware, OS), cost optimisation, security, performance (CPU, RAM, I/O), db connection pooling Cons: business continuity, scalability DB Server postgres pgbouncer Virtual, +CPU, +RAM, -I/O Physical, =CPU, =RAM, +I/O sabato 18 maggio 13
  • 26.
    Benchmark •2 x IntelXeon E5630 CPUs • 48 GB RAM • Storage: • 2 SAS 7.2k rpm, RAID 1 • 2 SAS 15k rpm, RAID 1 • 6 SAS 15k rpm, RAID 1+0 •850 MB/s seq read •13k UPDATE/sec 32 clients • 1.1k/sec for “large” dbs •PostgreSQL 9.2/CentOS 6 •Rental costs: • 1.8k € setup + 350 €/month sabato 18 maggio 13
  • 27.
    Web server Multiple webservers architecture apache/php DB Server postgres pgbouncer pgbouncer Web server apache/php pgbouncer HA Proxy HTTP Balancing Pros: scalability and continuity (Web) Cons: scalability and continuity (DB) sabato 18 maggio 13
  • 28.
    Web server Read onlydatabase servers apache/php DB Master postgres pgbouncer pgbouncer Web server apache/php pgbouncer HA Proxy HTTP Balancing DB Hot Standby postgres pgbouncer Streaming Replication RO RORW RW sabato 18 maggio 13
  • 29.
    Web server Database shardingarchitecture DB Access Server postgres pgbouncer Web server HA Proxy Web layer pl/proxy Database Access Layer DB Shard ... postgres pgbouncer DB Shard n postgres pgbouncer DB Shard 1 postgres pgbouncer Database Data Layer “NoSQL”“NoSQL” “Infinite” horizontal scalability Shared nothing architecture sabato 18 maggio 13
  • 30.
    Architecture recap •Do notget carried away by current trends and buzzwords •Make your decisions about architecture based on: • business continuity requirements • results of both database and application benchmarks •Always start with a simple architecture (two-server) •Enhance gradually up to sharding for infinite scalability sabato 18 maggio 13
  • 31.
    Part III (Open Source)Technologies sabato 18 maggio 13
  • 32.
    PgBouncer •Lightweight connection pooler forPostgreSQL •Written in C by Skype •Recycling connections on: • sessions • transactions • statements (for PL/Proxy) •“Virtual databases” • Real databases can be on different backend servers •Online reconfiguration, restart and upgrade • Online statistics • Supports PAUSE/RESUME sabato 18 maggio 13
  • 33.
    Conguration example [databases] db1_ro =host=pg2 dbname=db1 port=5432 * = host=pg1 port=6432 [pgbouncer] listen_addr = * listen_port = 6432 pool_mode = session ... sabato 18 maggio 13
  • 34.
  • 35.
    hstore •Data type inPostgreSQL •Key/value pairs as a single value •Useful for storing: • unstructured data • many attributes for a record •Keys and values are stored as text •Set of operators • ->, @>, ?, ... •Indexes •Integrates with JSON in 9.3 sabato 18 maggio 13
  • 36.
    JSON Support •Introduced inPostgreSQL 9.2 •Data type •Generators (9.3) •Operators (9.3) •Integration with hstore (9.3) sabato 18 maggio 13
  • 37.
    CREATE TABLE pictures( id SERIAL PRIMARY KEY, user_id INTEGER NOT NULL REFERENCES users(id), title TEXT, metadata JSON, thumbnail BYTEA, picture BYTEA, created_at TIMESTAMP DEFAULT now(), updated_at TIMESTAMP DEFAULT now() ); sabato 18 maggio 13
  • 38.
    INSERT INTO pictures (user_id,title, metadata) VALUES ( 100, 'A picture of my Stratocaster', '{"latitude":43.883,"longitude": 11.1,"width":2240,"height":1680}' ); Storing a JSON value sabato 18 maggio 13
  • 39.
    WITH m AS( SELECT 43.883 AS latitude, 11.1 AS longitude, 2240 AS width, 1680 AS height ) INSERT INTO pictures (user_id, title, metadata) SELECT 100, 'Another picture of my Stratocaster', to_json(m) FROM m; Storing a JSON value #2 sabato 18 maggio 13
  • 40.
    WITH p AS( SELECT title, metadata, created_at FROM pictures WHERE id = 1 ) SELECT to_json(p) AS json FROM p; {"title":"A picture of my Stratocaster","metadata": {"latitude":43.883,"longitude":11.1,"width": 2240,"height":1680},"created_at":"2013-05-11 13:11:37.788636"} 9.3 Returning a JSON object sabato 18 maggio 13
  • 41.
    WITH p AS( SELECT id, title, metadata, created_at FROM pictures WHERE user_id = 100 ) SELECT json_agg(p) AS json FROM p; [{"id":1,"title":"A picture of my Stratocaster","metadata": {"latitude":43.883,"longitude":11.1,"width":2240,"height": 1680},"created_at":"2013-05-11 13:15:21.078747"}, + {"id":2,"title":"Another picture of my Stratocaster","metadata": {"latitude":43.883,"longitude":11.1,"width":2240,"height": 1680},"created_at":"2013-05-11 13:15:21.078747"}] 9.3 Returning an array of JSONs sabato 18 maggio 13
  • 42.
    9.3 A tasteof JSON operators SELECT id, (metadata->'longitude', metadata->'latitude') AS coords FROM pictures WHERE user_id = 100; id | coords ----+--------------- 1 | (11.1,43.883) 2 | (11.1,43.883) ... sabato 18 maggio 13
  • 43.
    PL/Javascript •Procedural Language •Allows toimplement functions inside PostgreSQL using the Javascript Language •Available as an extension •Powered by V8 JavaScript Engine sabato 18 maggio 13
  • 44.
    CREATE FUNCTION plv8_test(keysTEXT[], vals TEXT[]) RETURNS TEXT LANGUAGE plv8 IMMUTABLE STRICT AS $BODY$ var o = {}; for(var i=0; i<keys.length; i++) { o[keys[i]] = vals[i]; } return JSON.stringify(o); $BODY$; SELECT plv8_test(ARRAY['name', 'age'], ARRAY['Angus', '58']); sabato 18 maggio 13
  • 45.
    PL/Proxy •Procedural language for definingproxy functions •Written in C by Skype •The body specifies: • remote function (target) • destination (shard) •Execution can happen on one, some or all shards • Execution is parallel if shards > 1 •Queries are run in auto- commit mode • PgBouncer (statement) sabato 18 maggio 13
  • 46.
    The need foran API •PL/Proxy forces to work with functions •One database cluster with: • access/proxy layer • data layer (shards) •Database cluster ~ Singleton class •Business logic in the database •Programmers need to define an API with access/ authorisation rules • Proxy functions (PL/Proxy) • Remote functions (any) •Supports SQL/MED sabato 18 maggio 13
  • 47.
    Database sharding architecture pg-access-1 postgres pl/proxy pg-data-2 postgres pg-data-1 postgres Database Access Layer Database Data Layer s00s01 s02 s03 - Definition of the cluster - Data/Queries distributed by user (hash) - Functions using the ‘plproxy’ language - User data tables - Functions (any language) sabato 18 maggio 13
  • 48.
    Dene the clusterwith SQL/MED CREATE SERVER mycluster FOREIGN DATA WRAPPER plproxy OPTIONS ( connection_lifetime '1800', s0 'dbname=shard00 host=pg-data-1', s1 'dbname=shard01 host=pg-data-1', s2 'dbname=shard02 host=pg-data-2', s3 'dbname=shard03 host=pg-data-2' ); sabato 18 maggio 13
  • 49.
    Proxy function forGET CREATE OR REPLACE FUNCTION get_pictures(i_uid INTEGER) RETURNS json LANGUAGE plproxy AS $BODY$ CLUSTER 'mycluster'; RUN ON hashint4(i_uid); $BODY$; sabato 18 maggio 13
  • 50.
    Data layer functionfor GET CREATE OR REPLACE FUNCTION get_pictures(i_uid INTEGER) RETURNS json LANGUAGE SQL AS $BODY$ WITH p AS ( SELECT id, title, metadata, created_at FROM pictures WHERE user_id = $1 ) SELECT json_agg(p) AS json FROM p; $BODY$; sabato 18 maggio 13
  • 51.
    Proxy function forINSERT CREATE OR REPLACE FUNCTION insert_picture(i_uid INTEGER, i_title TEXT, i_metadata JSON, i_thumbnail BYTEA, i_picture BYTEA) RETURNS INTEGER LANGUAGE plproxy AS $BODY$ CLUSTER 'mycluster'; RUN ON hashint4(i_uid); $BODY$; sabato 18 maggio 13
  • 52.
    Data function forINSERT CREATE OR REPLACE FUNCTION insert_picture(i_uid INTEGER, i_title TEXT, i_metadata JSON, i_thumbnail BYTEA, i_picture BYTEA) RETURNS INTEGER LANGUAGE SQL AS $BODY$ INSERT INTO pictures (user_id, title, metadata, thumbnail, picture) VALUES ($1, $2, $3, $4, $5) RETURNING id; $BODY$; sabato 18 maggio 13
  • 53.
    Data representation inPHP // Obviously all of this will come from a form $user_id = 100; $title = 'A picture of my Stratocaster'; $metadata = array ( 'latitude' => 43.883, 'longitude' => 11.1, 'width' => 2240, 'height' => 1680, // ... EXIF? ); $thumbnail = ... // Binary stream $picture = ... // Binary stream sabato 18 maggio 13
  • 54.
    INSERT using PDO //$access_layer_conninfo = 'pgsql:...' $connection = new PDO($access_layer_conninfo); $stmt = $connection->prepare('SELECT insert_picture(?, ?, ?, ?, ?)'); // NOSQL! $stmt->bindParam(1, $user_id, PDO::PARAM_INT); $stmt->bindParam(2, $title, PDO::PARAM_STR); $stmt->bindParam(3, json_encode($metadata), PDO::PARAM_STR); $stmt->bindParam(4, $thumbnail, PDO::PARAM_LOB); $stmt->bindParam(5, $picture, PDO::PARAM_LOB); $stmt->execute(); // TODO: Check and get the ID sabato 18 maggio 13
  • 55.
    Memcached •Distributed system forcaching of objects in memory • key/value pairs • volatile data •pgmemcache extension •Integration of database and applications (read scaling) sabato 18 maggio 13
  • 56.
  • 57.
    Part IV Testing anddeployment sabato 18 maggio 13
  • 58.
    Test-Driven DB development •Databasedevelopment technique •Write tests before actual features •pgTAP is an extension for TDDD in PostgreSQL •Allows to check for objects, privileges, functions, triggers, data, queries, etc. •Transactional •Provides with statistics • via SQL or pg_prove •Supports “Test Anything Protocol” (TAP) • Continuous integration sabato 18 maggio 13
  • 59.
    BEGIN; SELECT plan(3); -- inserttest 1 here -- insert test 2 here -- insert test 3 here SELECT * FROM finish(); ROLLBACK; pgTAP group test example sabato 18 maggio 13
  • 60.
    SELECT results_eq( 'SELECT *FROM active_users()', 'SELECT * FROM users WHERE active', 'active_users() should return active users' ); pgTAP example with ‘results_eq’ sabato 18 maggio 13
  • 61.
    Deployment with EXTENSION •Extensibility •Deploymentis a crucial part of any ICT organisation •PostgreSQL 9.1 introduces EXTENSION for packaging in-database applications •Fully transactional (atomic) •Allows for upgrades and downgrades •Simplifies logical dump/ restore of objects •DevOps with EXTENSION: • Devs write extensions • Ops deploy them sabato 18 maggio 13
  • 62.
    Extensions in detail •Extensionsrequire developer to write: • Control file (metadata, settings, dependencies, ...) • Installation scripts (SQL statements) • Update scripts (SQL statements, optional) • Source code / binaries (optional) sabato 18 maggio 13
  • 63.
    Extension management -- INSTALLATION CREATEEXTENSION pictures_app; -- REMOVAL DROP EXTENSION pictures_app; -- UPDATE (fully atomic) ALTER EXTENSION pictures_app UPDATE; sabato 18 maggio 13
  • 64.
  • 65.
    Conclusions •Postgres is muchmore than a “traditional” database •PHP integrates perfectly with it, in multiple flavours (native interface, PDO, Doctrine, etc.) •Start exploiting “vertical scalability” •Always perform benchmark! •Evaluate costs and complexity of “horizontal scalability” •Postgres is “business friendly” in terms of licensing sabato 18 maggio 13
  • 66.
    Feel free toapproach me about •Open Source Community •License •Total Cost of Ownership •Development life cycle •Quality of code and documentation •Business Continuity •Security and data protection •PostgreSQL Eco-system •Performance and scalability •Software development sabato 18 maggio 13
  • 67.
    Community •Next events: • PGDay.IT(Autumn 2013) • PGConf.EU (Dublin, Oct 29 - Nov 1, 2013) •Italian PostgreSQL Users Group (ITPUG) • www.itpug.org •PostgreSQL Europe • www.postgresql.eu sabato 18 maggio 13
  • 68.
    References •PostgreSQL: www.postgresql.org •PgBouncer: http://wiki.postgresql.org/wiki/PgBouncer •PL/Proxy:http://wiki.postgresql.org/wiki/PL/Proxy •PL/V8: http://pgxn.org/dist/plv8/ •pgTAP: www.pgtap.org sabato 18 maggio 13
  • 69.
  • 70.
    Thank you! Copyright (c)2013 2ndQuadrant Italia License: Creative Commons BY-NC-SA 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/it/deed.en sabato 18 maggio 13