SWAE3001 GIS for Environmentalists
LECTURE 2
Spatial Data and Spatial Data Models
Learning Outcomes
After learning the material covered in this lecture, you should
be able to:
• Explain the difference between data and information
• Describe the main characteristics of spatial data
• Provide a definition of ‘spatial data model’
• Distinguish between rasters and vectors
• Describe a spatial data structure
Spatial and non-spatial data
Maps as Numbers
• GIS requires that both data and maps be represented as
numbers.
• The GIS places data into the computer’s memory in a
physical data structure (i.e. files and directories).
• Files can be written in binary or as ASCII text.
• Binary is faster to read and smaller; ASCII can be read and
edited by humans but uses more space.
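The trade-off between ASCII and binary storage can be sketched with a small example. The 3 x 3 grid of elevation values below is made up for illustration, and the 4-byte float layout is an assumption rather than the format of any particular GIS.

```python
import struct

# Hypothetical 3 x 3 grid of elevation values (metres), invented for this sketch.
grid = [
    [101.5, 102.0, 103.25],
    [100.0, 101.75, 102.5],
    [99.5, 100.25, 101.0],
]

# ASCII encoding: one row per line, values separated by spaces.
# A person can open and edit this in any text editor.
ascii_bytes = "\n".join(" ".join(str(v) for v in row) for row in grid).encode("ascii")

# Binary encoding: every value packed as a 4-byte little-endian float.
# Compact and quick to read, but not human-readable.
flat = [v for row in grid for v in row]
binary_bytes = struct.pack("<9f", *flat)

# Unpack the binary form again to confirm the values survive the round trip.
decoded = list(struct.unpack("<9f", binary_bytes))

print(ascii_bytes.decode("ascii"))
print(len(ascii_bytes), "bytes as ASCII text")
print(len(binary_bytes), "bytes as binary")
```

For this sample the binary form is the smaller of the two. With very short values (for example single digits) the ASCII form could be smaller, so the size advantage depends on the data.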
Distinction between Data and Information
• There is a clear distinction between data and information
• Data are observations we make from monitoring the real
world
• Data are collected as facts or evidence that may be
processed to give them meaning and hence turn them into
information
• Information = data with meaning and context added
Types of GIS data
Two broad types of data can be identified:
• Field-based data - A collection of spatial distributions,
referred to as continuous data. Examples include altitude,
rainfall, temperature, etc.
• Object-based data - Composed of identifiable entities,
referred to as discrete data. Examples include roads,
rivers, land parcels, etc.
Representing Geographic Data: Continuous vs. Discrete
David Sandwell and Walter H.F. Smith
Discrete (objects): Objects with well-defined boundaries in
otherwise empty space
Continuous (field): Finite number of variables, each one defined
at every possible position
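A minimal sketch of the two views, assuming a made-up temperature surface and two hypothetical objects. The function name, coefficients, and feature names are illustrative only.

```python
# Continuous (field) view: a value is defined at every possible position.
# This invented function stands in for a temperature surface.
def temperature_at(x, y):
    return 20.0 - 0.01 * x + 0.005 * y

# Discrete (object) view: identifiable entities with their own geometry
# and attributes, sitting in otherwise empty space. All values are hypothetical.
objects = [
    {"kind": "road", "name": "Main Road", "coords": [(0, 0), (100, 0)]},
    {"kind": "river", "name": "Happy River", "coords": [(0, 50), (60, 80)]},
]

# A field can be queried anywhere, including between earlier sample positions.
print(temperature_at(10, 20))
print(temperature_at(10.5, 20.25))

# Objects are enumerated; positions not covered by an object hold nothing.
print([o["name"] for o in objects])
```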
GIS data models
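The two types of data, field-based and object-based, are
implemented in two geographic data models:
• Raster - typically used for field-based (continuous) data
• Vector - typically used for object-based (discrete) data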
Data as Features
Figure 2.7 Real-world objects commonly stored as a point
Figure 2.8 Real-world objects stored as lines
Figure 2.9 Real-world objects commonly represented as an area
A simple spatial entity model
Figure 3.2 A simple spatial entity model for Happy Valley
Figure 3.9 Raster and vector spatial data
Figure 3.9 Raster and vector spatial data (Continued)
Figure 3.10 Effect of changing resolution in the raster (left) and vector worlds (right)
Raster model
Raster
A raster data model uses a grid
• One grid cell is one unit or holds one attribute.
• Every cell has a value, even if it is “missing.”
• A cell can hold a number or an index value standing for an
attribute.
• A cell has a resolution, given as the cell size in ground units.
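The raster properties listed above can be sketched as a small data structure. The class name `RasterGrid`, the `-9999` missing-value marker, and the legend are assumptions made for this illustration, not the layout of any specific GIS format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RasterGrid:
    x_min: float            # ground x coordinate of the grid's left edge
    y_max: float            # ground y coordinate of the grid's top edge
    cell_size: float        # resolution: cell width and height in ground units
    nodata: int             # value stored in cells whose attribute is missing
    cells: List[List[int]]  # rows listed from top (north) to bottom (south)

    def value_at(self, x: float, y: float) -> Optional[int]:
        """Return the cell value at ground position (x, y).

        Returns None when the position is outside the grid or the cell
        holds the missing-value marker.
        """
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if not (0 <= row < len(self.cells) and 0 <= col < len(self.cells[0])):
            return None
        value = self.cells[row][col]
        return None if value == self.nodata else value


# Index values standing for attributes (hypothetical legend).
LEGEND = {1: "forest", 2: "water", 3: "urban"}

grid = RasterGrid(
    x_min=0.0, y_max=30.0, cell_size=10.0, nodata=-9999,
    cells=[
        [1, 1, 2],
        [1, 3, 2],
        [-9999, 3, 3],   # every cell has a value, even the missing one
    ],
)

print(LEGEND[grid.value_at(25.0, 25.0)])  # top-right cell
print(grid.value_at(5.0, 5.0))            # bottom-left cell is missing
```

Note that cell positions are not stored: they follow from the grid origin, the cell size, and the row and column of each cell.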
Generic structure for a grid
Thematic Layers
Figure 3.11 A simple raster data structure
Figure 3.12 Feature coding of cells in the raster world
The mixed pixel problem
Raster Encoding
Raster Data Formats
• Most raster formats are digital image formats.
• Most GISs accept TIF, GIF, JPEG or encapsulated
PostScript, which are not georeferenced.
• DEMs are true raster data formats.
Images
• Images can be either simple (one layer) or composite (a collection
with multiple layers)
– Simple: a black and white image
– Composite: multi-spectral satellite images where each layer
stores the amount of reflectance from a different wavelength of
the electromagnetic spectrum.
• By assigning different colors to each layer, analysts can evaluate
factors such as land cover type and vegetation density
Vector
• A vector data model uses points stored by their real (earth)
coordinates.
• Lines and areas are built from sequences of points in order.
• Lines have a direction to the ordering of the points.
• Polygons can be built from points or lines.
• Vectors can store information about topology.
Vector Structure
• Primitives: points and lines linking points.
• Points: A single set of coordinates (X and Y) in a
coordinate space.
• Lines (Arcs): Set of linked points (nodes).
• Polygon: Set of closed lines.
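The three vector primitives can be sketched as plain coordinate lists. The coordinates and the names `office`, `road`, and `parcel` are invented for illustration; the area calculation uses the standard shoelace formula.

```python
import math

# A point is a single (x, y) coordinate pair.
office = (2.0, 3.0)

# A line is an ordered sequence of points; the ordering gives it a direction.
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

# A polygon is a closed line: the last point repeats the first.
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def line_length(points):
    """Sum the straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(ring):
    """Area of a closed ring using the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2.0


print(line_length(road))     # length in ground units
print(polygon_area(parcel))  # area in square ground units
```

Reversing the order of the points in `road` leaves its length unchanged but flips its direction, which matters once topology is stored.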
VECTOR TOPOLOGY
Topology describes the spatial relationships between
geographic features. It is not to be confused with
topography, the form of the land.
The Components of Topology
Topology has three fundamental components:
a. Connectivity:
Arcs are connected to others (at nodes). This identifies
possible routes and networks, such as rivers and roads, via
the lists of arcs and nodes in the database.
b. Containment:
An enclosed polygon has a measurable area; lists of arcs
define boundaries and closed areas.
c. Contiguity:
The adjacency of polygons can be determined by shared
arcs.
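The three components can be sketched with a tiny arc/node table. The layout below is an assumption for illustration: two square polygons, A and B, sit side by side and share arc a3; each arc records its from-node, its to-node, and the polygon on its left and right (None means outside the map).

```python
# Hypothetical arc table for two adjacent polygons A (west) and B (east).
ARCS = {
    "a1": {"from": "n2", "to": "n1", "left": "A", "right": None},  # outer edge of A
    "a2": {"from": "n1", "to": "n2", "left": "B", "right": None},  # outer edge of B
    "a3": {"from": "n1", "to": "n2", "left": "A", "right": "B"},   # shared edge
}


def arcs_at_node(node):
    """Connectivity: the arcs that meet at a node."""
    return sorted(
        name for name, arc in ARCS.items() if node in (arc["from"], arc["to"])
    )


def boundary_of(polygon):
    """Containment: the arcs that enclose a polygon."""
    return sorted(
        name for name, arc in ARCS.items() if polygon in (arc["left"], arc["right"])
    )


def neighbours_of(polygon):
    """Contiguity: polygons on the other side of this polygon's arcs."""
    result = set()
    for arc in ARCS.values():
        sides = {arc["left"], arc["right"]}
        if polygon in sides:
            result |= sides - {polygon, None}
    return sorted(result)


print(arcs_at_node("n1"))
print(boundary_of("A"))
print(neighbours_of("A"))
```

None of these queries needs coordinates: the relationships are read directly from the lists of arcs and nodes.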
Arc/node map data structure with files
Vector Structure (cont.)
• Vector GIS are designed around point, line, and polygonal
objects and their related attribute data.
• This is commonly known as the georelational model
Advantages and disadvantages by method
Raster - Advantages:
• Simple data structure
• Compatible with remotely sensed or scanned data
• Simple spatial analysis procedures
• Accommodates both discrete and continuous data
Raster - Disadvantages:
• Requires greater storage space on computer
• Depending on pixel size, graphical output may be less pleasing
• Projection transformations are more difficult
• More difficult to represent topological relationships
Vector - Advantages:
• Requires less disk storage space
• Topological relationships are readily maintained
• Graphical output more closely resembles hand-drawn maps
Vector - Disadvantages:
• More complex data structure
• Not as compatible with remotely sensed data
• Software and hardware are often more expensive
• Some spatial analysis procedures may be more difficult
• Overlaying multiple vector maps is often time consuming
Common Vector Data Formats
• Shapefiles
• Coverages
• Geodatabases
• CAD files
• Event tables
• Triangulated Irregular Networks (TINs)
Shapefiles
• Simple vector file structure for storing the location and
attribute information of points, lines, and polygons.
• The name "shapefile" is somewhat misleading, because each
shapefile consists of at least three files: shapefile.shp,
shapefile.shx, and shapefile.dbf.
– For example, a shapefile containing locations of parks would
include the files parks.shp, parks.shx, and parks.dbf.
• <Shapefile>.shp and <Shapefile>.shx store information about feature
geometry.
• <Shapefile>.dbf is the shapefile's feature attribute table stored in dBASE
format.
The ArcView Shapefile Model
Shapefiles (cont.)
• A shapefile can contain only one feature class.
Therefore, a park point feature class (representing
the park office address) must be stored in a different
shapefile than a park polygon feature class
(representing the park's boundary).
• May also include the shapefile's metadata file
(shapefile.shp.xml) and the shapefile's projection
file (shapefile.prj)
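Because one shapefile is really several files sharing a base name, it can help to check that the set is complete before copying or sharing it. The sketch below uses only the Python standard library; the helper names `shapefile_parts` and `missing_parts` are hypothetical, and the file lists are the ones given in this lecture.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")   # geometry, geometry index, attribute table
OPTIONAL = (".prj", ".shp.xml")       # projection file, metadata file


def shapefile_parts(shp_path):
    """List the file names that make up one shapefile, given its .shp path."""
    base = Path(shp_path).with_suffix("").name   # "parks.shp" -> "parks"
    required = [base + ext for ext in REQUIRED]
    optional = [base + ext for ext in OPTIONAL]
    return required, optional


def missing_parts(shp_path):
    """Return the required component files not found next to the .shp file."""
    folder = Path(shp_path).parent
    required, _ = shapefile_parts(shp_path)
    return [name for name in required if not (folder / name).exists()]


required, optional = shapefile_parts("parks.shp")
print(required)
print(optional)
```

Calling `missing_parts` on a real path returns an empty list only when the .shp, .shx, and .dbf files all sit in the same folder.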
Conversions
• Rasterization
– Vector to raster conversion
– Smooth lines become jagged
– Areas smaller than the pixel size disappear
– Mixed pixel problem
• Vectorization
– Raster to vector conversion
– Continuous versus discrete rasters
– Where to draw the lines becomes an issue
• Tolerance
– Lines typically are smoothed
• Transferring and exchanging data
– May create errors
– Especially between different formats such as vector and
raster
– Vector to raster is easier than raster to vector
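A minimal sketch of rasterization, assuming one common rule: a cell is coded 1 when its centre falls inside the polygon. Real GIS packages offer other rules, and the grid size, cell size, and shapes below are invented for illustration.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a closed list of (x, y) vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):                     # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                          # crossing lies to the right
                inside = not inside
    return inside


def rasterize(ring, n_rows, n_cols, cell_size):
    """Code each cell 1 if its centre is inside the polygon, else 0.

    The grid's lower-left corner is at (0, 0); row 0 is the top row.
    """
    grid = []
    for row in range(n_rows):
        y = (n_rows - row - 0.5) * cell_size         # y of this row's cell centres
        grid.append([
            1 if point_in_polygon((col + 0.5) * cell_size, y, ring) else 0
            for col in range(n_cols)
        ])
    return grid


# A triangle with a smooth diagonal edge, and a pond smaller than one cell.
triangle = [(0.0, 0.0), (38.0, 0.0), (0.0, 38.0), (0.0, 0.0)]
small_pond = [(16.0, 16.0), (24.0, 16.0), (24.0, 24.0), (16.0, 24.0), (16.0, 16.0)]

triangle_grid = rasterize(triangle, 4, 4, 10.0)
pond_grid = rasterize(small_pond, 4, 4, 10.0)

for row in triangle_grid:
    print(row)
print(sum(map(sum, pond_grid)), "cells coded for the pond")
```

The triangle's diagonal edge comes out as a staircase, and the pond, which covers no cell centre, is lost entirely. With a different grid alignment the same pond could capture one centre and appear as a full cell instead.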