Tutorial 05: Introduction to PostGIS
1. What is a spatial database:
A spatial database has column data types specifically designed to store
objects in space—these data types can be added to database tables. The
information stored is usually geographic in nature, such as a point location
or the boundary of a lake. The spatial database also provides special
functions and indexes for querying and manipulating that data, which can
be called from a query language such as Structured Query Language
(SQL).
A spatial database is often used as just a storage container for spatial data,
but it can do much more than that. Although a spatial database need not be
relational in nature, most are. A spatial database gives you a storage tool,
an analysis tool, and an organizing tool all in one.
Presenting data visually is not a spatial database’s only goal. Spatial
database help us to answer different questions related to geographic
phenomena. Without spatial support, these questions would require you to
collect or derive additional values for each data point.
2. Introduction to PostGIS:
PostGIS is a free and open source library that spatially enables the free and
open source PostgreSQL object-relational database management system
(ORDBMS). We want you to choose PostgreSQL as your relational
database and PostGIS as your spatial database extender for PostgreSQL.
PostGIS started as a project of Refractions Research and has since been
adopted and improved on by government, public organizations, and private
companies.
Proj4—Provides projection support
Geometry Engine Open Source (GEOS)—Advanced geometry
processing support
Geospatial Data Abstraction Library (GDAL)—Provides many
advanced raster processing features
Computational Geometry Algorithms Library (CGAL/SFCGAL)
Enables advanced 3D analysis.
Most of these projects, including PostGIS, now fall under the umbrella of
the Open Source Geospatial Foundation (OSGeo).
Mohammed Mahmoud
1
The foundation of PostGIS is the PostgreSQL ORDBMS, which provides
transactional support, gist index support for spatial objects, and a query
planner out of the box. It’s a great testament to the power and flexibility of
PostgreSQL that Refractions chose to build on top of PostgreSQL rather
than on any other open source database. PostgreSQL supports many of the
newer ANSI SQL features. PostGIS supports Open Geospatial Consortium
(OGC) standards and the new SQL Multimedia Spec (SQL/MM) spatial
standard.
PostGIS provides many spatial operators, spatial functions, spatial data
types, and spatial indexing enhancements to PostgreSQL. If you add to the
mix the complementary features that PostgreSQL and other related projects
provide, then you have a jam-packed powerhouse at your disposal that’s
well suited for sophisticated GIS analysis and is a valuable tool for learning
GIS.
2.1. PostGIS Alternatives:
Oracle Spatial: Oracle was the one that started it all. In Oracle 7, joint
development efforts with Canadian scientists gave birth to SDO (Spatial
Data Option). In later releases, Oracle redubbed this lovechild Oracle
Spatial.
MICROSOFT SQL SERVER: Microsoft introduced spatial support in
their SQL Server 2008 offering with its built-in Geometry and Geodetic
Geography types and companion spatial functions. To Microsoft’s
credit, you’ll get the same feature set with their express, standard,
enterprise, and data center offerings. You may just be limited regarding
database size, how many processors you can use, and what query plan
features you’re allowed.
SPATIALITE: Our favorite kid on the block is SpatiaLite, which is an
add-on to the open source SQLite portable database. SpatiaLite is
especially interesting because it can be used as a low-end companion to
PostGIS and other high-end, spatially enabled databases. It can run on
an Android smartphone, and it can be used to create master/slave
applications to provide basic lightweight spatial support for portable
devices.
MYSQL: MySQL has had elementary spatial support since version 4,
but as a database MySQL is handicapped by its lack of a powerful SQL
engine. Its primary audience is still developers who are looking for a
database that will store something, rather than do something.
Mohammed Mahmoud
2
ARCGIS BY ESRI: We must give a nod to Esri, which has long
packaged its spatial database engine (SDE) with its ArcGIS for Server
product. The SDE engine is integrated into the ArcGIS line of products
and is often used to spatially enable or augment legacy or weak database
products, such as Microsoft SQL Server 2005 and Oracle Locator.
2.2. Downloading PostGIS:
PostGIS extension could be downloaded from PostGIS website or using
PostgreSQL Stack builder. For windows user go to the following link
(http://download.osgeo.org/postgis/windows/), and then download
PostGIS installer according to your PostgreSQL version.
For example we are using PostgreSQL 9.4, so we will open pg94 folder.
Mohammed Mahmoud
3
Then download the installer according to your operating system type x32
or x64. To download PostGIS using Stack Builder from start menu open
Application Stack Builder.
Then select your PostgreSQL database server and click Next.
Mohammed Mahmoud
4
From categories list expand Spatial Extensions and the Select the installer
according to your operating system type x32 or x64.
Specify the download directory and then click Next. Wait until download
complete and then start the installation.
Mohammed Mahmoud
5
Click Next to start installation or select skip installation to install PostGIS
later.
Click I Agree.
Mohammed Mahmoud
6
Let Create Spatial database unchecked, we will create it later, and click
Next.
Select the in whicj to install PostGIS it is necessary to be PostgreSQL
database server directory. Then click Next.
Mohammed Mahmoud
7
Do you want to register the GDAL_DATA prompt is new for PostGIS 2.0.
This is because in order to do operations that require raster transformations
or other rater warping / clipping etc, PostGIS uses GDAL epsg files. The
windows build, makes a local copy of these in the PostgreSQL install\gdal-
data folder and saying yes will automatically add an GDAL_DATA
environment variable putting this path in for you. If you use GDAL already
(or you are running both PostGIS 32-bit and 64-bit, chances are you already
have this environment variable set and you may not want to overwrite it.
PostGIS will work happily with an existing one, but just remember if you
uninstall a PostGIS 2.0 or your GDAL, these functions may stop working
and you'll need to repoint the environment variable.
Enable Raster drivers. In 2.0.6, 2.1.3 all raster drivers were disabled by
default. Saying yes to this prompt allows the most common drivers (that
are considered safe). E.g. they don't call out to web services etc. If you are
not content with the list shown there, you may want to explicitly enable
additional drivers using the new PostGIS 2.2. GUC raster features PostGIS
GDAL Enabled Raster Drivers GUC, that can either be set using ALTER
SYSTEM (for 9.4+), or ALTER DATABASE for specific databases.
Mohammed Mahmoud
8
Enable Out of database rasters. In 2.0.6, 2.1.3 out of database rasters were
disabled by default. If you need them, say yes to this prompt. Again if you
want each database to have different settings, you can opt for the new
PostGIS 2.2 GUC route. Enable Out of Database rasters GUC.
Once the installation completed click on Close button.
2.3. Create a Spatial Database using SQL:
Once you’ve successfully installed the binaries, you can create a database
with a command such as this:
CREATE DATABASE postgis_db;
You next enable PostGIS in your database by connecting to the database
and running the single-line SQL command in the following listing.
\connect postgis_db;
CREATE EXTENSION postgis;
Mohammed Mahmoud
9
After a PostGIS install, do a quick verification of the version to make sure
the installation succeeded. Execute the following query:
SELECT postgis_full_version();
Unlike conventional character-based databases, spatial databases must be
experienced visually. When you view a bitmap file, you’d much rather see
the rendered bitmap than the bits themselves. Similarly, you’d much rather
see your spatial objects rendered rather than their textual representations.
Many visualization tools are available for free download, with OpenJump
and QGIS being two of the more popular ones. We encourage you to install
multiple viewing tools for comparison.
2.4. Spatial data types:
There are four key spatial types offered by PostGIS: geometry, geography,
raster, and topology. PostGIS has always supported the geometry type from
its inception. It introduced support for geography in PostGIS 1.5. PostGIS
2.0 raised the bar more by incorporating raster and network topology
support. Although PostGIS 2.1 introduces many more functions, perhaps
the most important feature it provides is faster speed, particularly for raster
and geography operations:
Geometry—The planar type. This was the very first model and it’s still
the most popular type that PostGIS supports. It’s the foundation of the
other types. It uses the Cartesian math you learned about in high school
geometry.
Geography—The spheroidal geodetic type. Lines and polygons are
drawn on the earth’s curved surface, so they’re curved rather than
straight lines.
Raster—The multiband cell type. Rasters model space as a grid of
rectangular cells, each containing a numeric array of values.
Topology—The relational model type. Topology models the world as a
network of connected nodes, edges, and faces.
These four types can coexist in the same database and even as separate
columns in the same table.
Geometry Types:
Mohammed Mahmoud
10
In two dimensions, you can represent all geographical entities with three
building blocks: points, linestrings, and polygons The geometry type treats
the world as a flat Cartesian grid. The geometry model is intuitively
appealing and computationally speedy, but it suffers from one major
shortcoming—the flat earth.
Geography Type:
The curvature of the earth comes into play when you’re modeling anything
that extends beyond the visual horizon. Although geometry works for
architectural floor plans, city blocks, and runway diagrams, it comes up
short when you model shipping lanes, airways, or continents, or whenever
you consider two locations that are far apart. You can still perform distance
computations without abandoning the Cartesian underpinnings by
sprinkling a few sines and cosines into your formulas, but the minute you
need to compute areas, the math becomes intractable.
A better solution is to use a new family of data types based on geodetic
coordinates—geography. This new family shields the PostGIS user from
the complexity of the math. As a trade-off, geography offers fewer
functions, and it trails geometry in speed. You’ll find the same point,
linestring, and polygon data types in geography; just keep in mind that the
linestrings and polygons conform to the curves of the earth.
Raster Type:
Geometry and geography are vector-based data types. Loosely speaking,
anything you can sketch with an ultra-fine pen without running short on
ink lends itself to vector representation. Vectors are well suited to modeling
designed or constructed features, but suppose you snap a colored photo of
the coral-rich Tasmania sea.
Raster data is consists of microscopic rectangles and assign a color value
to each rectangle (a mosaic of pixels). Each pixel stores three different
color values: the intensity of red, green, and blue (hence the term RGB). In
raster-speak, each color is called a band. The scale of each pixel’s edges
corresponds to the underlying spatial reference system—if you’re
measuring in meters, the pixel represents one square meter of area. Raster
data almost always originates from instrumental data collection and often
serves as the raw material for generating vector data. As such, you’ll
encounter plenty more sources of raster data than vector data. Rasters
appear in the following applications:
Land coverage or land use.
Mohammed Mahmoud
11
Temperature and elevation variations. This is a single-band raster where
each square holds a measured temperature or elevation value.
Color aerial and satellite photos. These have four bands—one for each
of the colors of the RGBA color space.
Topology type:
Topology models take on this jigsaw perspective of the world. Topology
recognizes the inherent interconnection of geographic features and exploits
it to help you better manage data. Topology isn’t concerned with the exact
shape and location of geographic features, but with how they’re connected
to each other. Topology is useful in the following applications:
Parcel (land lot) data, where you want to ensure that the change of one
parcel boundary adjusts all other parcels that share that boundary
change as well.
Road management, water boundaries, and jurisdiction divisions. U.S.
Census MAF/Topologically Integrated Geographic Encoding and
Referencing system (TIGER) data is a perfect example
(http://www.census.gov/geo/www/tiger/).
Architecture.
Spatial Data Type:
Do keep in the back of your mind that PostgreSQL has its own built-in
geometric types. These are point, polygon, lseg, box, circle, and path.
PostgreSQL geometry types are incompatible with the PostGIS geometry
type and have little or no third-party visualization support. These geometry
types have existed since the dawn of PostgreSQL and don’t follow the
SQL/MM standards, nor do they support spatial coordinate systems. Using
them as mathematical tools is fine, but stay away from them for GIS use or
if you need visualization support.
LAYERS: Geometry, geography, and raster columns in a spatial table are
often referred to as layers or feature classes when displayed in mapping
applications. Before you begin, you’ll need to create a schema to house the
data for this tutorial:
CREATE SCHEMA vector;
The geometry and geography data types have hierarchical structures. You
can declare a column to be a geometry or geography type, but you can and
Mohammed Mahmoud
12
should be more specific where possible and use subtypes. Subtypes are not
data types, but for geography and geometry are type modifiers. Some
examples of geometry subtypes are POINTZ, POINT, LINESTRING,
LINESTRINGM, POLYGON, POLYGONZ, POLYHERALSURFACE,
POLYHEDRALSURFACEZ, TIN, and TINZ. A typical type declaration in
PostGIS is geometry(POINT,4326), where geometry is the data type,
POINT is the subtype type modifier, and 4326 is the SRID type modifier.
One attribute that all data types have is a spatial reference identifier
(SRID). In order to use a particular SRID, it must be present in the
spatial_ref_sys table.
At the dawn of PostGIS, geometry was the only data type available. The
geometry data type was so named because its basis is analytical geometry
(Cartesian geometry).
Point:
The following statement is used to create a new table to store points using
GEOMETRY datatype and POINT subtype.
CREATE TABLE vector.my_points (
id INTEGER PRIMARY KEY,
name VARCHAR(50),
geom GEOMETRY(POINT, 4326)
);
The following SQL statement is used to insert point into my_points table.
INSERT INTO vector.my_points VALUES (
1, 'Point One',
ST_GeomFromText('POINT(1 1)', 4326)
);
Linestring:
Connected straight lines between two or more distinct points form
linestrings. Individual lines between points are called segments. Segments
aren’t data types or subtypes in PostGIS, but it is possible for a linestring
to have just one segment. The following statement is used to create a new
table to store line features:
CREATE TABLE vector.my_lines (
id INTEGER PRIMARY KEY,
name VARCHAR(50),
Mohammed Mahmoud
13
geom GEOMETRY(LINESTRING, 4326)
);
The following SQL statement is used to insert a line into my_lines table.
INSERT INTO vector.my_lines VALUES (
1, 'Line One',
ST_GeomFromText('LINESTRING(0 0, 1 1, 1 -1)', 4326)
);
Polygons:
Closed linestrings are the building blocks of polygons. Let’s start by
creating a triangle. Any closed linestring with three distinct, non-collinear
points will build a triangle. By definition, a polygon contains all the
enclosed area, and its boundary is the linestring that forms it. All points
enclosed by the linestring and the points on the linestring itself form the
polygon. The closed linestring outlining the boundary of the polygon is
called the ring of the polygon when used in this context; more specifically,
it’s the exterior ring. The following statement is used to create a new table
to store polygon features:
CREATE TABLE vector.my_polygons (
id INTEGER PRIMARY KEY,
name VARCHAR(50),
geom GEOMETRY(POLYGON, 4326)
);
The following SQL statement is used to insert a polygon into my_polygons
table.
INSERT INTO vector.my_polygons VALUES (
1, 'Polygon One',
ST_GeomFromText('POLYGON((0 0, 1 1, 1 -1, 0 0))', 4326)
);
The following SQL statement is used to insert a polygon with hole into
my_polygons table.
INSERT INTO vector.my_polygons VALUES (
2, 'Polygon One',
ST_GeomFromText('POLYGON(
(32 15, 32 18, 35 18, 35 15, 32 15),
(33 16, 33 17, 34 17, 34 16, 33 16))', 4326)
Mohammed Mahmoud
14
);
Multi-geometries:
In PostGIS and other spatial databases, each of the single geometry
subtypes we’ve just covered has a collection counterpart: multipoints,
multilinestrings, and multipolygons. In addition, PostGIS has a data type
called geometrycollection. This data type can contain any kind of geometry
as long as all geometries in the set have the same spatial reference system
and the same coordinate dimensions.
Multipoint:
CREATE TABLE vector.multi_points (
id INTEGER PRIMARY KEY,
name VARCHAR(50),
geom GEOMETRY(MULTIPOINT, 4326)
);
INSERT INTO vector.multi_points VALUES (
1, 'Multi Point 1',
ST_GeomFromText('MULTIPOINT(32 15, 32 17, 32 19)', 4326)
);
Multilinestring:
CREATE TABLE vector.multi_lines (
id INTEGER PRIMARY KEY,
name VARCHAR(50),
geom GEOMETRY(MULTILINESTRING, 4326)
);
INSERT INTO vector.multi_lines VALUES (
1, 'Multi Line 1',
ST_GeomFromText('MULTILINESTRING((0 0,0 1,1 1),(-1 1,-1 -1))',
4326)
);
Multipolygon:
Mohammed Mahmoud
15
CREATE TABLE vector.multi_polygon (
id INTEGER PRIMARY KEY,
name VARCHAR(50),
geom GEOMETRY(MULTIPOLYGON, 4326)
);
INSERT INTO vector.multi_polygon VALUES (
1, 'Multi Polygon 1',
ST_GeomFromText('MULTIPOLYGON(
((2.25 0,1.25 1,1.25 -1,2.25 0)),((1 -1,1 1,0 0,1 -1))
)', 4326)
);
Mohammed Mahmoud
16