GIS
Data Acquisition and Editing
Introduction
Data acquisition is the process of getting data into the computer.
Spatial data can be obtained from different sources and in different formats, and can be input into a GIS by different methods.
The whole process of data encoding and editing is called the data stream.
Source data may be analogue or digital.
Data Collection Workflow
Planning includes establishing user requirements, garnering
resources, and developing a project plan.
Preparation involves obtaining data, redrafting poor-quality
map sources, editing scanned map images, removing noise,
and setting up appropriate GIS hardware and software
systems to accept data.
Digitizing and transfer are the stages where the majority of
the effort will be expended.
Editing and improvement covers many techniques designed
to validate data, as well as correct errors and improve
quality.
Evaluation is the process of identifying project successes
and failures.
Stages in Data Collection Projects
Planning
Preparation
Digitizing / Transfer
Editing / Improvement
Evaluation
Important Guidelines for Data Capture
Scale or resolution: How much detail do you need for your study?
Measurement level: Do you need ordinal data, or are categories
enough?
Accuracy: How well can your measurement tool capture your data?
Sampling method: Do you collect all the data in all the places you
need?
Timeliness: Do you work with time-sensitive data that change quickly
and need to be collected right away?
Data type: Are the data appropriate for your application, both in
subject matter and in format? (Do you need field data or satellite data,
for example, or do you need soil data rather than temperature data?)
Data classification system: Do you use the same data classes as
other layers in your database (for example, land-use classes from 1955
versus 2005)?
Completeness: Have you collected all the data that you need to
answer your question?
Types of Data Acquisition
Primary
Sensory Data: those data most commonly associated with
distant sensing devices, such as the Global Positioning System
(GPS), Total Station and various forms of imagery, including both
aerial photography and digital satellite data
Statistical Data: include field data and census data, both of
which usually rely on some form of direct contact by a person to
collect
Secondary
Data collected for other specific purposes can be converted for use
in GIS, e.g. by keyboard entry, scanning, and digitizing
Primary Data Capture
Primary data are captured specifically for GIS use
Raster - remote sensing (Primary Capture)
usually involves actual sensor collection
e.g. SPOT or IKONOS satellites and aerial photography
Passive and active sensors
Resolution is a key consideration:
Spatial
Spectral
Temporal
Primary Data Capture in Raster Data
Disadvantages are:
Resolution is often too coarse (especially with satellite-mounted sensors).
Most optical sensors are restricted by cloud cover (radar sensors are not).
Vector Primary Data Capture
Surveying
Locations of objects are determined by angle and distance
measurements from known locations
Uses expensive field equipment and crews
Most accurate method for large scale, small areas
GPS
Collection of satellites used to fix locations on Earth's
surface
Differential GPS used to improve accuracy
Total Station
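The surveying principle above, fixing an object by angle and distance from a known location, can be sketched in a few lines. This is a minimal illustration, not a surveying package: the function name `point_from_bearing`, the station coordinates, and the convention that bearings run clockwise from grid north are all assumptions made for the example.

```python
import math

def point_from_bearing(x0, y0, bearing_deg, distance):
    """Return (x, y) reached from a known station (x0, y0).

    Assumes bearing_deg is measured clockwise from grid north
    and that the distance is already reduced to the map plane.
    """
    theta = math.radians(bearing_deg)
    x = x0 + distance * math.sin(theta)  # easting offset
    y = y0 + distance * math.cos(theta)  # northing offset
    return x, y

# Hypothetical station at (1000 m, 2000 m); the target is observed
# at a bearing of 90 degrees (due east) and 50 m away.
east, north = point_from_bearing(1000.0, 2000.0, 90.0, 50.0)
print(round(east, 3), round(north, 3))
```

A total station records the angle and distance electronically, so the same calculation is applied to each observation.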
Primary Data Capture: Field Data
Assembling field data can involve:
conducting house-to-house surveys
collecting traffic data along roads
recording the air temperature and other atmospheric data
gathering soils, vegetation, insects, or any number of other environmental samples.
Primary Data Capture: Field Data Sampling
It is physically impossible to collect
temperature/elevation data everywhere
In each case, one is forced to collect data from
a sample of the total
For GIS, sampling of geographic space is
required
Primary Data Capture: Field Data Sampling
Methods
Clustered: Sampling focuses on distinct areas
that have a lot of features from which one can
sample.
Systematic: Use a specific, often regular, pattern
to sample. For example, one sample at every
meter along a line.
Random: The sampling has no pattern at all
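The three sampling methods above can be compared with a small sketch that places sample positions along a 10 m transect. The function names, the transect length, and the cluster centres are hypothetical values chosen for illustration only.

```python
import random

def systematic_sample(length_m, spacing_m):
    """Sample positions at a regular spacing along a line."""
    n = int(length_m // spacing_m)
    return [i * spacing_m for i in range(n + 1)]

def random_sample(length_m, n, rng):
    """Sample n positions with no pattern at all."""
    return [rng.uniform(0, length_m) for _ in range(n)]

def clustered_sample(centres, n_per_cluster, spread_m, rng):
    """Sample a few positions close to each feature-rich area."""
    return [c + rng.uniform(-spread_m, spread_m)
            for c in centres for _ in range(n_per_cluster)]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
systematic = systematic_sample(10, 1)             # one sample every metre
rand = random_sample(10, 5, rng)                  # five unpatterned samples
clustered = clustered_sample([2.0, 8.0], 3, 0.5, rng)  # around two areas
print(systematic)
print(rand)
print(clustered)
```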
Primary Data Capture: Field Data Sampling
Methods (Cont..)
Stratification
divide data into groups, or strata
Primary Data Capture: Field Data Sampling
Methods (Cont..)
Stratification (Example)
To stratify your sample of who watches certain television programs in your city, you could divide the city into sub-areas, or neighbourhoods. Then you pick a certain number of people (for example, 25) in each neighbourhood to sample randomly, systematically (for example, every fifth house), or in clusters (such as where housing density is highest).
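The stratification example can be sketched as follows, using random selection within each stratum. The neighbourhood names, household IDs, and the helper `stratified_sample` are invented for illustration; the sketch also assumes that a stratum smaller than the requested sample is simply taken whole.

```python
import random

def stratified_sample(households, per_stratum, rng):
    """Pick per_stratum households at random within each neighbourhood.

    households maps a neighbourhood name to a list of household IDs.
    """
    sample = {}
    for neighbourhood, ids in households.items():
        k = min(per_stratum, len(ids))  # small strata are taken whole
        sample[neighbourhood] = rng.sample(ids, k)
    return sample

# Hypothetical city with three neighbourhoods of different sizes.
city = {
    "North": list(range(0, 200)),
    "Centre": list(range(200, 260)),
    "South": list(range(260, 280)),
}
picked = stratified_sample(city, 25, random.Random(1))
print({name: len(ids) for name, ids in picked.items()})
```

Because every neighbourhood contributes its own sample, small neighbourhoods are not swamped by large ones, which is the point of stratifying.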
Secondary Geographic Data Capture
(SGDC)
Data collected for other specific purposes can be converted for use in GIS
Raster conversion
Scanning of maps, aerial photographs, documents, etc.
Important scanning parameters are spatial and
spectral (bit depth) resolution
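To see why spatial and spectral (bit depth) resolution matter, the sketch below estimates the uncompressed size of a scan and the ground distance covered by one pixel. The sheet size, the 300 dpi setting, and the 1:50,000 scale are assumed values, and the function names are hypothetical.

```python
def scan_size_bytes(width_in, height_in, dpi, bit_depth):
    """Uncompressed size of a scanned image in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bit_depth // 8

def ground_pixel_size_m(dpi, scale_denominator):
    """Ground distance covered by one scanned pixel, in metres."""
    pixel_on_paper_m = 0.0254 / dpi  # one inch is 0.0254 m
    return pixel_on_paper_m * scale_denominator

# Hypothetical 24 x 36 inch map sheet at a scale of 1:50,000.
size_8bit = scan_size_bytes(24, 36, 300, 8)    # greyscale
size_24bit = scan_size_bytes(24, 36, 300, 24)  # full colour
print(size_8bit, size_24bit)
print(round(ground_pixel_size_m(300, 50_000), 2), "m per pixel")
```

Tripling the bit depth triples the file size, while doubling the dpi would quadruple it, so both parameters are usually chosen no higher than the source map justifies.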
SGDC: Vector Secondary Data Capture
Collection of vector objects from maps,
photographs, plans, etc.
Digitizing
Manual (table)
Heads-up and vectorization
Photogrammetry: the science and technology of making measurements from photographs, etc.
COGO: Coordinate Geometry
SGDC: 1- Keyboard Entry
Keycoding is the entry of data into a file at a computer terminal.
This technique is used for attribute data that are only available on paper.
It may be appropriate for tabular data, or for small numbers of coordinate pairs read from a paper map source or pocket GPS.
Text scanners and OCR software can be
used to read data automatically.
SGDC: 2- Manual Digitizing
Manual digitizing is the most common method of encoding spatial features from paper maps.
It is also used for map encoding where topology is required and for digitizing features of interest from hard-copy aerial photographs.
Manual digitizing requires a digitizing table that is linked to a computer workstation.
Two modes of digitizing: point mode and stream mode
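The difference between the two digitizing modes can be illustrated with a simulated cursor trace. The sketch assumes the usual meaning of the terms: in point mode a vertex is recorded only when the operator presses the button, and in stream mode vertices are recorded automatically, here whenever the cursor has moved a minimum distance. The function names and the trace are hypothetical.

```python
import math

def stream_mode(cursor_path, min_distance):
    """Record a vertex whenever the cursor has moved min_distance."""
    recorded = [cursor_path[0]]
    for x, y in cursor_path[1:]:
        last_x, last_y = recorded[-1]
        if math.hypot(x - last_x, y - last_y) >= min_distance:
            recorded.append((x, y))
    return recorded

def point_mode(cursor_path, clicked_indices):
    """Record only the positions where the operator pressed the button."""
    return [cursor_path[i] for i in clicked_indices]

# Hypothetical cursor trace along a straight 10-unit line.
path = [(i * 0.5, 0.0) for i in range(21)]
streamed = stream_mode(path, 2.0)   # automatic, every 2 units
clicked = point_mode(path, [0, 20])  # operator clicks both ends only
print(len(streamed), len(clicked))
```

For a straight line the two clicked vertices carry the same information as the streamed ones, which is why point mode suits straight features and stream mode suits sinuous ones.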
Digitizing
SGDC: Digitizing Cont.
Manual digitizing of paper maps is one of the main sources of positional error in GIS.
The accuracy of encoding depends on factors such as the scale and resolution of the source map and the quality of the equipment and software being used.
Errors can be introduced by incorrect registration of the map document or by hand wobble.
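The influence of source map scale on positional accuracy follows from simple arithmetic: a slip of the cursor on the paper is multiplied by the scale denominator on the ground. The 0.5 mm error below is an assumed figure for illustration, not a measured one, and `ground_error_m` is a hypothetical helper.

```python
def ground_error_m(map_error_mm, scale_denominator):
    """Ground distance represented by an error measured on the map."""
    return map_error_mm / 1000.0 * scale_denominator

# The same assumed 0.5 mm digitizing error at three map scales.
for scale in (10_000, 50_000, 250_000):
    print(f"1:{scale:,}  ->  {ground_error_m(0.5, scale):.1f} m")
```

The smaller the map scale, the larger the ground error produced by the same hand wobble or registration offset.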
SGDC: Digitizing Cont.
Manual digitizing can also be used to digitize low volumes of data on demand from scanned and geocorrected digital map images.
Many GIS packages provide facilities for on-screen digitizing using raster backdrop images as a guide.
SGDC: 2 (a)- Heads-Up Digitizing
Heads-up (on-screen) digitizing creates vectors from raster layers by digitizing vector objects manually straight off a computer screen, using a mouse or digitizing cursor and GIS software.
SGDC: 3- Automatic Digitizing
Manual digitizing is a time-consuming, tedious process.
When a large number of complex maps need to be digitized, a more expensive alternative is used: automatic digitizing.
Two methods:
Scanning
Automatic line following
Scanning
Scanning is the most commonly used method and is appropriate when raster data are required.
A scanner is a piece of hardware for converting an analogue source document into a digital raster format.
Three types:
Flat bed scanners
Rotating drum scanners
Large-format feed scanners
Scanning (Flat Bed)
Scanning (Rotating Drum)
Scanning (Large-format feed scanners)
KartoScan FB VLS
Practical problems: scanning
The possibility of optical distortion when using flat-bed scanners.
The automatic scanning of unwanted information.
The selection of an appropriate scanning tolerance to ensure important data are encoded and background data are ignored.
The format of the files produced and the input of data to GIS software.
The amount of editing required to produce data suitable for analysis.
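The effect of the scanning tolerance can be sketched by treating it as a grey-level threshold, which is one simple interpretation rather than the behaviour of any particular scanner. The grid values and the `binarize` helper are invented for illustration.

```python
def binarize(grid, tolerance):
    """Keep cells darker than the tolerance (1), ignore the rest (0).

    grid holds grey levels from 0 (black ink) to 255 (white paper).
    """
    return [[1 if value < tolerance else 0 for value in row]
            for row in grid]

# Hypothetical 8-bit scan: a dark line (30), a faint background
# tint (180), and blank paper (250).
scan = [
    [250, 180, 30, 250],
    [250, 180, 30, 250],
    [250, 30, 180, 250],
]
lines_only = binarize(scan, 100)    # background tint is ignored
too_tolerant = binarize(scan, 200)  # background tint is encoded too
print(lines_only)
print(too_tolerant)
```

A tolerance set too high captures unwanted background, and one set too low drops faint line work, so both cases add to the editing required afterwards.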
Automatic line follower
Appropriate where digital versions of clear, distinctive lines on a map are required.
It mimics manual digitizing and uses a laser and light-sensitive device to follow the lines on the map.
It is a vector device and produces output as (x, y) coordinates.
Difficulties arise when digitizing dashed or contour lines.
SGDC: 4- Electronic Data Transfer
It is appropriate when the data are already available in digital form.
In most cases the data must be transformed or converted to a format compatible with the GIS software.
Most GIS software will allow data conversion.
Obtaining data from other sources requires
users to address a range of important
questions.
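As an illustration of converting data that are already digital into a format a GIS can read, the sketch below turns delimited point records into a GeoJSON FeatureCollection using only the Python standard library. The column names `lon` and `lat`, the sample records, and the function name are assumptions; GeoJSON is used here simply as one widely supported target format.

```python
import csv
import io
import json

def csv_points_to_geojson(csv_text, x_field="lon", y_field="lat"):
    """Convert delimited point records to a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        x = float(row.pop(x_field))
        y = float(row.pop(y_field))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [x, y]},
            "properties": row,  # remaining columns become attributes
        })
    return {"type": "FeatureCollection", "features": features}

# Hypothetical export from a GPS data logger.
logger_export = "id,lon,lat,label\n1,36.82,-1.29,well\n2,36.90,-1.31,school\n"
collection = csv_points_to_geojson(logger_export)
print(json.dumps(collection, indent=2))
```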
Cont.
Spatial data may be collected in digital form and transferred from devices such as GPS, total stations, and data loggers.
Data may be purchased from a supplier or
obtained from an agency.
Remotely sensed data are normally provided
in electronic form.
Data Editing
After data encoding, data may include errors derived from the original source data or errors introduced during the encoding process.
It is better to intercept errors before they contaminate the GIS DB.
Data editing or cleaning can be done through four processes:
1- Detecting & Correcting errors
Errors in input data may derive from three main sources:
Errors in the source data
Errors introduced during encoding (inputting)
Errors propagated during data transfer & conversion
Errors in attribute data are relatively easy to spot and may be identified by manual comparison with the original data.
Errors in spatial data are often more difficult
to identify and correct.
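Some spatial errors can be flagged automatically. The sketch below checks for two common cases, line ends that meet no other line (possible undershoots or overshoots) and polygon boundaries that do not close. These checks are chosen for illustration and are not a complete list; the sketch assumes lines are already split at every junction, and both function names are hypothetical.

```python
from collections import Counter

def find_dangles(lines, precision=3):
    """Return endpoints that are used by only one line end.

    Assumes lines are split at every junction, so an endpoint
    seen only once suggests an undershoot or overshoot.
    """
    ends = Counter()
    for line in lines:
        for x, y in (line[0], line[-1]):
            ends[(round(x, precision), round(y, precision))] += 1
    return [node for node, count in ends.items() if count == 1]

def is_closed(ring):
    """A polygon boundary must end where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]

# Hypothetical square boundary whose last line stops short of (0, 0).
lines = [
    [(0, 0), (10, 0)],
    [(10, 0), (10, 10)],
    [(10, 10), (0, 10)],
    [(0, 10), (0, 0.2)],  # undershoot
]
dangles = find_dangles(lines)
print(dangles)
print(is_closed([(0, 0), (1, 0), (1, 1), (0, 0)]))
```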
Figure 5.11
Examples of spatial error in vector data
Figure 5.12
Examples of original data problems and the corrected data after processing
Source: Laser-Scan. Copyright 2005 LS 2003 Ltd. All rights reserved
Cont.
Most GIS packages provide a suite of editing tools.
Corrections can be done on-screen or
automatically.
Errors are also present in raster data.
2- Re-projection, transformation & Generalization
Once spatial and attribute data have been encoded and edited, it may be necessary to process the data geometrically in order to provide a common framework of reference.
The projection system
Different sources (co-ordinate system)
Different origins
Different units of measurement
Different orientation
Scale & Resolution
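Differences in origin, units of measurement, and orientation can be removed with a simple transformation that scales, rotates, and shifts coordinates. The sketch below is a minimal illustration under assumed values (a layer digitized in feet with a local origin); it does not perform a map re-projection, which needs projection formulas or a dedicated library. The function name and the offsets are hypothetical.

```python
import math

def affine_transform(points, scale=1.0, rotation_deg=0.0, dx=0.0, dy=0.0):
    """Scale, rotate (counter-clockwise), and then shift each point."""
    a = math.radians(rotation_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = []
    for x, y in points:
        xs, ys = x * scale, y * scale
        out.append((xs * cos_a - ys * sin_a + dx,
                    xs * sin_a + ys * cos_a + dy))
    return out

# Hypothetical layer in feet with a local origin, converted to
# metres and shifted onto a shared grid origin.
FEET_TO_M = 0.3048
local = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
projected = affine_transform(local, scale=FEET_TO_M,
                             dx=500000.0, dy=4000000.0)
print(projected)
```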
Figure 5.15
Topological mismatch between data in different projections
Source: Courtesy of Peter H. Dana
3- Edge matching & rubber sheeting
When a map extends across two or more map sheets, differences or mismatches between adjacent map sheets may need to be resolved.
The process involves three basic steps:
First, mismatches at sheet boundaries must be resolved.
Second, topology must be rebuilt, as new lines & polygons have been created from the segments that lie across map sheets.
Third, redundant map sheet boundary lines are deleted or dissolved.
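The first step, resolving mismatches at the sheet boundary, can be sketched by pairing line ends from the two sheets and moving each pair to its midpoint. This is one simple strategy under an assumed tolerance, not the method of any particular GIS package; rebuilding topology and dissolving the boundary are not shown. The function name and coordinates are hypothetical.

```python
import math

def edge_match(left_ends, right_ends, tolerance):
    """Pair line ends across a sheet edge and move both to their midpoint.

    Returns (matched, unmatched): the agreed join positions, and the
    left-sheet ends with no partner within the tolerance.
    """
    matched, unmatched = [], []
    remaining = list(right_ends)
    for lx, ly in left_ends:
        # nearest line end still available on the other sheet
        best = min(remaining,
                   key=lambda p: math.hypot(p[0] - lx, p[1] - ly),
                   default=None)
        if best is not None and math.hypot(best[0] - lx, best[1] - ly) <= tolerance:
            remaining.remove(best)
            matched.append(((lx + best[0]) / 2, (ly + best[1]) / 2))
        else:
            unmatched.append((lx, ly))
    return matched, unmatched

# Hypothetical line ends along the shared edge x = 100.
left = [(100.0, 20.0), (100.0, 55.0), (100.0, 80.0)]
right = [(100.0, 20.4), (100.0, 54.8)]
matched, unmatched = edge_match(left, right, tolerance=1.0)
print(matched, unmatched)
```

Ends left unmatched are the ones an operator would need to inspect on screen.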
Figure 5.17
Edge matching
Cont.
Rubber sheeting involves stretching the map in various directions as if it were drawn on a rubber sheet.
Objects on the map that are accurately placed are tacked down and kept still.
Others that are in the wrong location or have the wrong shape are stretched to fit with the control points.
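One simple way to imitate rubber sheeting is to shift every point by a distance-weighted average of the displacements at the control points, so that tacked-down points stay still and nearby features follow the corrected ones. Inverse distance weighting is an assumption made for this sketch; GIS packages may use other interpolation schemes. The function name and control links are hypothetical.

```python
import math

def rubber_sheet(point, control_pairs, power=2):
    """Shift a point by the distance-weighted mean of control displacements.

    control_pairs is a list of ((from_x, from_y), (to_x, to_y)) links.
    """
    x, y = point
    weight_sum = shift_x = shift_y = 0.0
    for (fx, fy), (tx, ty) in control_pairs:
        d = math.hypot(x - fx, y - fy)
        if d == 0.0:
            return (tx, ty)  # the point is itself a control point
        w = 1.0 / d ** power
        weight_sum += w
        shift_x += w * (tx - fx)
        shift_y += w * (ty - fy)
    return (x + shift_x / weight_sum, y + shift_y / weight_sum)

# Two accurately placed points are tacked down (no displacement);
# a third is known to belong one unit further east.
links = [((0, 0), (0, 0)), ((10, 0), (10, 0)), ((5, 5), (6, 5))]
print(rubber_sheet((5, 5), links))  # moved control point
print(rubber_sheet((0, 0), links))  # tacked-down point
print(rubber_sheet((5, 4), links))  # nearby feature, partly stretched
```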
Figure 5.18
Rubber sheeting
Geocoding address data
It is the process of converting an address into a point location.
The address itself, a postcode, or another non-geographic descriptor is used to determine the geographical co-ordinates of a location.
Geocoding can be affected by the quality of data.
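A minimal sketch of postcode geocoding is a lookup against a reference table, with some cleaning of the keyed-in value first. The postcodes, coordinates, and function names below are invented for illustration; real geocoders rely on large reference datasets and fuzzier matching.

```python
def normalise(postcode):
    """Upper-case and strip spaces so keyed variants compare equal."""
    return "".join(postcode.split()).upper()

def geocode(postcode, gazetteer):
    """Return (x, y) for a postcode, or None when it cannot be matched."""
    return gazetteer.get(normalise(postcode))

# Hypothetical gazetteer; the codes and coordinates are invented.
gazetteer = {
    "AB12CD": (432100.0, 567800.0),
    "EF34GH": (433250.0, 566900.0),
}
print(geocode("ab1 2cd", gazetteer))   # matched despite case and spacing
print(geocode("ZZ99 9ZZ", gazetteer))  # unmatched, needs manual checking
```

Records that fail to match illustrate how the quality of the address data limits the result.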
4- Updating & maintaining spatial DBs
The world is a very dynamic place and things change.
Using old and out-of-date map information would cost time and money.
Keeping DBs up to date avoids problems and is a key aspect of ongoing data editing and maintenance.
Towards an integrated DB
Each thematic layer in the DB must be encoded, corrected and transformed to create a GIS ready for analysis.