GIS Unit 2

TIN (Triangular Irregular Network) data models have several advantages over raster data models including more detailed representation of terrain, efficient storage of data in flat areas using fewer triangles, and ability to describe surfaces at varying levels of detail. TIN data models also have disadvantages such as taking more time to process, requiring corrections along triangle edges, and difficulties in spatial analysis involving other data layers. Vector data models provide more accurate representations and allow zooming without loss of detail but have more complex data structures and processing requirements than raster models. Raster data is simple to understand and process but loses detail, has topology and accuracy issues, and requires large storage.

Uploaded by

33 ABHISHEK WAGH

Unit II

Advantages of TIN
1) Terrain parameters such as slope and aspect are calculated for each triangle and stored as
attributes of the facet.
2) Efficient, since few triangles are required in flat areas.
3) Well suited to certain analyses such as slope, aspect, and volume.
4) Ability to describe the surface at different levels of resolution.
5) Non-redundant data.
6) A TIN gives a more detailed representation than a raster where the density of data points
is high. When many observations fall in a small area, the size and shape of the triangles
adapt to them, whereas the cell size of a raster does not change.
7) Variability in the observations, that is, many changes within a small area (surface
roughness), can be represented well with a TIN.
8) Break point features are more accurate: ridges and valleys, where elevation changes
sharply, can be represented much better with the TIN data model than with a raster.
9) The position of input features remains unchanged.
10) Preserves the precision of the input data.
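As a rough illustration of item 1, the slope and aspect of a single TIN facet can be derived from the normal of the plane through its three vertices. The sketch below is only one way to do it; the function name `facet_slope_aspect` and the convention that aspect is a compass bearing of the downslope direction (0 = north, 90 = east) are assumptions made for this example, not something fixed by the notes.

```python
import math

def facet_slope_aspect(p1, p2, p3):
    """Return (slope_deg, aspect_deg) of a triangle given three (x, y, z) vertices.

    Aspect is the downslope compass bearing (0 = north, 90 = east, assumed
    convention); it is None for a horizontal facet.
    """
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # Normal vector of the facet: cross product of two edge vectors.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz == 0:
        raise ValueError("vertices are collinear in plan view")
    if nz < 0:  # make the normal point upward
        nx, ny, nz = -nx, -ny, -nz
    horizontal = math.hypot(nx, ny)
    slope = math.degrees(math.atan2(horizontal, nz))
    if horizontal == 0:
        return slope, None
    # The horizontal part of the upward normal points downslope.
    aspect = math.degrees(math.atan2(nx, ny)) % 360.0
    return slope, aspect

# A facet on the plane z = x rises toward the east, so it faces west.
print(facet_slope_aspect((0, 0, 0), (1, 0, 1), (0, 1, 0)))
```

Storing the two returned values as attributes of the facet is what makes later slope and aspect queries on a TIN cheap.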
Disadvantages of TIN
1) Generating a TIN file takes more processing time, because many quantities have to be
calculated and organized.
2) Errors along the edges often need correction.
3) Analysis involving comparison with other layers is difficult.
4) A raster can easily be clipped to extract a subset, whereas clipping or subsetting a TIN
is not as readily achieved.
5) More expensive to build and process.
6) Less efficient to process than raster data.
Vector Data
Advantages of vector data model
1) In comparison with the raster data model, vector data models tend to be better representations
of reality due to the accuracy and precision of points, lines, and polygons.
The geometry of the vector model inherits the accuracy of the original data, as collected by field
surveyors, GPS, photogrammetry, etc, since the structure of the model is based on storing the
actual co-ordinates describing the location of different objects. This means that measurements
done in vector database are as exact as the original data.
2) Vector data tend to be more compact in data structure, so file sizes are typically much smaller
than their raster counterparts
3) Vector data also provides an increased ability to alter the scale of observation and analysis. As
each coordinate pair associated with a point, line, and polygon represents an infinitesimally exact
location, zooming deep into a vector image does not change the view of a vector graphic in the
way that it does a raster graphic.
4) Topology is inherent in the vector model. This topological information results in simplified
spatial analysis (e.g., error detection, network analysis, proximity analysis, and spatial
transformation) when using a vector model.
5) Sometimes fast: many operations are easy to perform on vector model data, e.g. network
analysis (tracing lines and measuring distances along networks).
6) Vector data structures demand much less computer storage space than raster data structures.
7) Often topological: most GIS software handles complete topological data structures, which
speeds up data retrieval and gives information about contiguity and connectivity.
8) The data are lightweight and easy to manage.

Disadvantages of vector model

1) The data structure tends to be much more complex than the simple raster data model. As the
location of each vertex must be stored explicitly in the model, there are no shortcuts for
storing data like there are for raster models (e.g., the run-length and quad-tree encoding
methodologies).
2) The implementation of spatial analysis can also be relatively complicated due to minor
differences in accuracy and precision between the input datasets. Similarly, the algorithms
for manipulating and analyzing vector data are complex and can lead to intensive processing
requirements, particularly when dealing with large datasets.
3) The location of each vertex needs to be stored explicitly
4) For effective analysis, vector data must be converted into a topological structure. This is
often processing intensive and usually requires extensive data cleaning.
5) Topology is static: any updating or editing of the vector data requires re-building of the
topology.
6) Algorithms for manipulation and analysis functions are complex and may be processing
intensive. Often, this inherently limits the functionality for large data sets, e.g. a large
number of features.
7) Continuous data, such as elevation data, is not effectively represented in vector form. Usually
substantial data generalization or interpolation is required for these data layers.
8) Not compatible with remote sensing image data.
9) Complex data structures.
10) Overlaying a combination of several vector polygon maps creates difficulties.
11) Simulation is difficult because each unit has a different topological form.
12) Display and plotting can be expensive, particularly for high-quality color.
13) Spatial analysis and filtering within polygons are impossible.
14) The vector data model can be slow to process complex datasets, especially on low-end
computers.

Raster Data
Advantages of Raster Data:
• It is a very simple data structure: each grid location represented in the raster image
correlates to a single value.
• Continuous features are best represented using raster.
• Overlay analysis is easy to perform with the raster model.
• A powerful format for advanced spatial and statistical analysis.
• The ability to perform fast overlays with complex datasets.
• Raster graphics are inexpensive and ubiquitous. Nearly everyone currently owns some sort
of raster image generator, namely a digital camera, and few cellular phones are sold today
that don’t include such functionality. Similarly, a plethora of satellites are constantly
beaming up-to-the-minute raster graphics to scientific facilities across the globe.
• Easy interpretation and maintenance of the graphics, relative to the vector counterpart.
• The ability to compress the datasets using either lossy or lossless compression.
• Easy to understand. Conceptually, the raster data model is easy to understand. It
arranges data into columns and rows. Each pixel represents a piece of territory.
• Processing speed. Raster’s simple data structure and its uncomplicated math produce
quick results. For example, to calculate a polygon’s area, the computer takes the area
contained within a single cell (which remains consistent throughout the layer) and
multiplies it by the number of cells making up the polygon. Likewise, many analysis
processes, like overlay and buffering, are faster than in vector systems that must
use geometric equations.
• Easy mathematical operations.
• Inexpensive.
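The area calculation described under "Processing speed" can be sketched in a few lines. This is a minimal illustration, assuming the raster is held as a list of rows and that the polygon is identified by one cell value; the grid, the zone value and the 30 m cell size are hypothetical.

```python
def zone_area(grid, zone_value, cell_size):
    """Area of a raster zone = area of one cell x number of cells in the zone."""
    cell_area = cell_size * cell_size
    cell_count = sum(row.count(zone_value) for row in grid)
    return cell_count * cell_area

# Hypothetical 4 x 4 raster; the polygon of interest is coded as 1.
grid = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
print(zone_area(grid, zone_value=1, cell_size=30))  # 5 cells of 900 square metres each
```

No geometry is involved: the work is a count and one multiplication, which is why the notes describe raster math as uncomplicated.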

Disadvantages of Raster Data:

• Topology is not present and has to be represented explicitly.
• Huge storage requirement even for storing simple data.
• For storing multiple attributes at a given cell, a multi-band data set is required.
• Small features or details are often not observed in the data set, depending on spatial
resolution (pixel size).
• Raster files are typically very large. Storage requirement is high.
• Output images are less "pretty" than their vector counterparts. This is particularly
noticeable when the raster images are enlarged or zoomed. Depending on how far one
zooms into a raster image, the details and coherence of that image will quickly be lost
amid a pixelated sea of seemingly randomly colored grid cells.
• The raster data model is not suitable for some types of spatial analyses. For
example, difficulties arise when attempting to overlay and analyze multiple raster
graphics produced at differing scales and pixel resolutions. Combining information from
a raster image with 10 m spatial resolution with a raster image with 1 km spatial
resolution will most likely produce nonsensical output information, as the scales of
analysis are far too disparate to result in meaningful and/or interpretable conclusions. In
addition, some network and spatial analyses (e.g., determining directionality or
geocoding) can be problematic to perform on raster data.
• There can be spatial inaccuracies due to the limits imposed by the raster dataset cell
dimensions.
• The use of large cells to reduce data volumes means that phenomenologically
recognizable structures can be lost and there can be a serious loss of information.
• Raster datasets are potentially very large. Resolution increases as the size of the cell
decreases; however, the cost normally also increases in both disk space and processing
time. For a given area, changing cells to one-half the current size requires as much as
four times the storage space, depending on the type of data and storage techniques used.
• There is also a loss of precision that accompanies restructuring data to a regularly spaced
raster-cell boundary.
• Projection transformations are time consuming.
• Accuracy. Sometimes accuracy is a problem due to the pixel resolution. Imagine if you
had a raster layer with a 30 by 30 meter resolution, and you wanted to locate traffic stop
signs in that layer. The entire 30 by 30 meter pixel would represent the single stop
sign. If you converted this raster layer to vector, it might place the stop sign at what was
the pixel’s center. Sometimes problems of accuracy (and appearance) can be resolved by
selecting a smaller pixel resolution, but this has database consequences.
• Scaling up is not possible without losing quality.
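The "four times the storage" figure for halved cells follows directly from counting cells. The sketch below assumes an uncompressed raster with a fixed number of bytes per cell; the function name and the 9 km extent are illustrative only.

```python
import math

def uncompressed_bytes(extent_x, extent_y, cell_size, bytes_per_cell=1, bands=1):
    """Uncompressed raster size: rows x columns x bands x bytes per cell."""
    cols = math.ceil(extent_x / cell_size)
    rows = math.ceil(extent_y / cell_size)
    return rows * cols * bands * bytes_per_cell

# Hypothetical 9 km x 9 km area stored as a single 1-byte band.
coarse = uncompressed_bytes(9000, 9000, cell_size=30)  # 300 x 300 cells
fine = uncompressed_bytes(9000, 9000, cell_size=15)    # 600 x 600 cells
print(coarse, fine, fine / coarse)
```

Halving the cell size doubles both the number of rows and the number of columns, so the cell count grows by a factor of four; adding bands multiplies the total again.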

Raster Data Compression (Raster Data Encoding)


• Data compression refers to the reduction of data volume.
• A variety of techniques are available for image compression. Compression techniques
can be lossless or lossy.
• Lossy compression eliminates data that is not noticeable. With lossy compression, a file
cannot be restored or rebuilt in its original form: the quality of the data is compromised
in exchange for a larger reduction in the size of the data.
• Lossless compression does not eliminate any data. With lossless compression, a file can
be restored in its original form, so the quality of the data is not compromised.
• The wavelet transform, a more recent technology for image compression, treats an image as
a wave and progressively decomposes the wave into simpler wavelets.
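The difference between the two families can be seen with a toy experiment. The sketch below uses Python's standard `zlib` module for the lossless step; the "lossy" step (dropping the low four bits of every pixel) is a deliberately crude stand-in chosen for this illustration and is not how JPEG or wavelet codecs work.

```python
import zlib

# Hypothetical 8-bit pixel values, repeated to imitate a larger image.
pixels = bytes([10, 10, 11, 12, 200, 201, 203, 203] * 64)

# Lossless: the decompressed data is identical to the original.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)

# Toy lossy step: throw away the low 4 bits, then compress.
quantized = bytes(p & 0xF0 for p in pixels)
packed_lossy = zlib.compress(quantized)
restored_lossy = zlib.decompress(packed_lossy)

print(restored == pixels)        # original recovered exactly
print(restored_lossy == pixels)  # discarded detail cannot be recovered
```

Once the low bits are discarded, no decompression step can bring them back, which is the sense in which lossy compression "does not restore the original form".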
One of the main disadvantages of the raster data model is data storage. A large amount of space
is required for storing data, depending on the resolution of the raster. For finer resolutions and
multi-band data sets, the storage requirement grows rapidly: halving the cell size roughly
quadruples the number of cells, and every additional band adds another full layer of values. This
problem can be addressed by using the various compression models developed for the raster data
structure. All these models use a file structure for storing the data; however, to optimize
storage, different models use different types of data encoding methods.
Following are the most important and widely used raster data encoding methods.

1) Exhaustive enumeration / Cell-by-cell raster encoding


By convention, raster data is normally stored row by row from the top left corner. Every pixel is
given a single value, hence there is no compression when many like values are encountered. This
minimally intensive method encodes a raster by creating records for each cell value by row and
column. It can be thought of as a large spreadsheet in which each cell of the
spreadsheet represents a pixel in the raster image. It is ideal for storing cell values that change
continuously, e.g. a DEM.
2) Run Length Encoding:

This method encodes cell values in runs of similarly valued pixels and can result in a highly
compressed image file. If a raster contains groups of cells with identical values, run length
encoding can compress storage. Instead of storing each cell, each component stores a value and a
count of cells with that value. If there is only one cell the storage doubles, but for three or more
cells there is a reduction. The longer and more frequent the consecutive values are, the greater
the compression that will be achieved. The run-length encoding method is useful in situations
where large groups of neighboring pixels have similar values (e.g., discrete datasets such as land
use/land cover or habitat suitability) and is less useful where neighboring pixel values vary
widely (e.g., continuous datasets such as elevation or sea-surface temperatures). This image
encoding method reduces data volumes because each line is recorded more efficiently.
In this method, each row of the image is scanned for groups of identical pixels, and each group
is stored as a single value with its count rather than as individual cell values. Run length
encoding works on a row-by-row basis.
Take this line of data: AAAAAABBBBCCCCCCCCC
It can be rendered as: 6A4B9C
Example 1:
In example 1, 0 is used to represent gray and 1 is used to represent red. The first row
is blank and is stored as (0,8), meaning there are 8 cells and they are all zeros. In the second
row, there are 4 consecutive zeros, stored as (0,4), followed by three consecutive cells with
the value 1, stored as (1,3). This continues until the bottom-right cell is reached.
Example 2

In example 2, each color is assigned a label, e.g. the label x is used for yellow.
The first row contains 4 yellow cells, encoded as 4x, then 2 green cells, encoded as 2w, and so on.
This continues until the bottom-right cell is reached.
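Run-length encoding of a single row can be sketched with `itertools.groupby` from the Python standard library. This is a minimal illustration of the idea, storing each run as a (value, count) pair as in example 1; the function names are made up for this sketch, and the final cell of the second sample row is an assumption because the figure is not reproduced here.

```python
from itertools import groupby

def rle_encode_row(row):
    """Encode one raster row as a list of (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]

def rle_decode_row(pairs):
    """Rebuild the original row from (value, run_length) pairs."""
    row = []
    for value, count in pairs:
        row.extend([value] * count)
    return row

print(rle_encode_row("A" * 6 + "B" * 4 + "C" * 9))  # the 6A4B9C line from the text
print(rle_encode_row([0] * 8))                      # a blank row: one run of 8 zeros
print(rle_encode_row([0, 0, 0, 0, 1, 1, 1, 0]))     # 4 zeros, 3 ones, then an assumed 0
```

Decoding reproduces the row exactly, so run-length encoding is a lossless method; a row with no repeated neighbours produces one pair per cell and therefore grows instead of shrinking.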

3) Block Encoding

• This method is a generalization of run-length encoding to two dimensions. Instead of
sequences of 0s or 1s, square blocks are counted. For each square, the position, the size, and
the contents of the pixels are stored.
• A file structure is used for storing the image data.
• Data reduction happens in two dimensions (along rows and columns) at the same time.
• The block coding raster storage technique groups cells into square blocks to reduce
redundancy.
• The block coding raster image compression method subdivides an entire raster image
into hierarchical blocks. It is an extension of the run length encoding technique to
two dimensions.
• In the example: instead of storing 64 grid cells, all it takes is 7 blocks. Using
block coding, it requires one 3×3 block, two 2×2 blocks and four 1×1 cell blocks to encode
this raster image.
• The image size and the number of different features in the image are stored in the first row of the file.
• After that, the block size (4×4, 3×3, 2×2, 1×1, etc.), the number of blocks and the starting pixel
position of each block are stored in the subsequent lines.
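One simple way to produce such square blocks is a greedy scan: visit cells row by row and, at each cell not yet covered, grow the largest square of identical values that fits. The sketch below is an assumption about how this could be done, not the specific procedure behind the example in these notes, and it stores each block as (row, column, size, value) rather than in the file layout described above.

```python
def _fits(grid, covered, r, c, size, value):
    """True if a size x size square at (r, c) is in bounds, uncovered and uniform."""
    if r + size > len(grid) or c + size > len(grid[0]):
        return False
    return all(grid[i][j] == value and not covered[i][j]
               for i in range(r, r + size) for j in range(c, c + size))

def block_encode(grid, background=0):
    """Greedily cover all non-background cells with square blocks."""
    rows, cols = len(grid), len(grid[0])
    covered = [[False] * cols for _ in range(rows)]
    blocks = []
    for r in range(rows):
        for c in range(cols):
            if covered[r][c] or grid[r][c] == background:
                continue
            value, size = grid[r][c], 1
            while _fits(grid, covered, r, c, size + 1, value):
                size += 1
            for i in range(r, r + size):
                for j in range(c, c + size):
                    covered[i][j] = True
            blocks.append((r, c, size, value))
    return blocks

def block_decode(blocks, rows, cols, background=0):
    """Paint the blocks back onto a background-filled grid."""
    grid = [[background] * cols for _ in range(rows)]
    for r, c, size, value in blocks:
        for i in range(r, r + size):
            for j in range(c, c + size):
                grid[i][j] = value
    return grid

# Hypothetical 4 x 4 raster with one feature coded as 1.
grid = [[1, 1, 1, 0],
        [1, 1, 1, 0],
        [1, 1, 1, 0],
        [0, 0, 1, 1]]
print(block_encode(grid))  # one 3x3 block and two 1x1 blocks
```

A greedy scan does not always give the smallest possible number of blocks, but it always reproduces the raster exactly when decoded.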

4) Chain Encoding

• A file structure is used for storing the image data.
• Chain coding defines the outer boundary using relative positions from a start point. The
sequence of the exterior is stored, with the endpoint finishing at the start point. During
encoding, the direction is stored as an integer (for example, the value 0 is north and 1 is
east); the example here uses cardinal directions for simplicity.
• Here, the boundary of a region (a group of identical pixels) is stored using a chain of pixels.
• The chain is recorded using several pairs of values, where the first number
in the pair indicates the direction of movement of the chain and the second number indicates
the number of pixels in the chain along that direction.
• Chain code is also called Freeman chain code or boundary chain code. It can
compress raster data effectively, and it is very convenient for estimating area, length, and
the convexity or concavity of turns, so it is well suited to storing graphic data. The
disadvantage is that it is difficult to modify and edit the boundaries, for example by merging
or inserting them. A local modification changes the overall structure, which is inefficient.
Moreover, because chain code stores the boundary of each area as a unit, the boundaries of
adjacent areas are stored repeatedly, resulting in redundancy.
• In the example, we start at position (5,2). From here we define the border using cardinal
directions and number of movements. We move east 3 positions until we hit the edge. At
this location, we move south 4 positions. This process continues until the end point hits the
start point.
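A chain of (direction, count) pairs can be traced back into boundary positions with a few lines of code. The sketch below makes several assumptions that the notes do not state: positions are (x, y) with x increasing eastward and y increasing southward, the codes 2 and 3 stand for south and west, and the last two moves of the sample chain (west 3, north 4) are invented to close the boundary.

```python
# Direction code -> (dx, dy); 0 = north and 1 = east as in the text,
# 2 = south and 3 = west are assumed here.
STEP = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

def trace_chain(start, chain):
    """Return every position visited when following (direction, count) pairs."""
    x, y = start
    path = [(x, y)]
    for direction, count in chain:
        dx, dy = STEP[direction]
        for _ in range(count):
            x, y = x + dx, y + dy
            path.append((x, y))
    return path

def enclosed_area(path):
    """Area inside a closed path, using the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(path, path[1:]))
    return abs(twice_area) / 2

# Start at (5, 2): east 3, south 4, then (assumed) west 3 and north 4.
path = trace_chain((5, 2), [(1, 3), (2, 4), (3, 3), (0, 4)])
print(path[0] == path[-1])   # the boundary closes on its start point
print(enclosed_area(path))   # area of the 3 x 4 region
```

The area computation illustrates why the notes call chain codes convenient for estimating area and length: both follow from the boundary alone, without touching the interior cells.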

5) Quad Tree Encoding

• This technique relies on the principle that a single large pixel can stand in for any
number of small pixels that share its value.
• This principle is a great advantage for storage efficiency.
• Quad tree storage is a technique in which the image is divided into 4 quadrants.
• Each quadrant is divided into 4 sub-quadrants if it contains mixed pixels (not all pixels are
the same).
• If a quadrant contains only identical pixels, no sub-divisions are made for that quadrant.
• A tree-like structure is formed to store the details of the divisions.
• All points where a quadrant is divided are called nodes.
• The points where no subdivision happens are called leaves.
• This method divides a raster into a hierarchy of quadrants that are subdivided based on
similarly valued pixels. The division of the raster stops when a quadrant is made entirely
from cells of the same value. A quadrant that cannot be subdivided is called a "leaf node."
Example 1

Example 2

In example 2, the image is divided into 4 quadrants of equal size. Quadrant 1 (top-left) contains
a single value, 1, so no subdivision happens and 1 becomes a leaf node. Quadrant 2 (top-right)
also contains a single value, so no subdivision happens and it becomes a leaf node. Quadrant 3
(bottom-left) contains all different values, so it is subdivided again into equal-sized
quadrants until each holds a single value; 7, 10, 31 and 11 all become leaf nodes.
Quadrant 4 (bottom-right) contains a single value, 8, so no subdivision happens and 8
becomes a leaf node.
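The subdivision rule lends itself to a short recursive sketch. The version below assumes a square raster whose side is a power of two, represents a leaf as the bare cell value and an internal node as a list of four children in the order top-left, top-right, bottom-left, bottom-right; the grid is hypothetical and only loosely modelled on example 2.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf value, or [NW, NE, SW, SE] if the quadrant is mixed."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(grid[r][c] == first
                  for r in range(row, row + size)
                  for c in range(col, col + size))
    if uniform:
        return first  # leaf: no further subdivision
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),                # top-left
        build_quadtree(grid, row, col + half, half),         # top-right
        build_quadtree(grid, row + half, col, half),         # bottom-left
        build_quadtree(grid, row + half, col + half, half),  # bottom-right
    ]

def count_leaves(node):
    """Number of leaf nodes in the tree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1

# Hypothetical 4 x 4 raster: three uniform quadrants and one mixed quadrant.
grid = [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [7, 10, 8, 8],
        [31, 11, 8, 8]]
tree = build_quadtree(grid)
print(tree)                # [1, 2, [7, 10, 31, 11], 8]
print(count_leaves(tree))  # 7 leaves instead of 16 cells
```

The saving comes entirely from uniform quadrants; a raster in which every cell differs would produce one leaf per cell plus the overhead of the internal nodes.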
Example 3

Raster data file formats


A multitude of raster file format types are available for use in GIS. The selection of raster
formats has dramatically increased with the widespread availability of imagery from digital
cameras, video recorders, satellites, and so forth. Raster imagery is typically 8-bit (256 colors) or
24-bit (16 million colors). Due to ongoing technological advancements, raster image file sizes
have been getting larger and larger. To deal with this potential constraint, two types of file
compression are commonly used: lossless and lossy. Lossless compression reduces file size
without decreasing image quality. Lossy compression attempts to exploit limitations of the
human eye by removing information from the image that cannot be sensed. It
permanently eliminates certain redundant information, which leads to a greater reduction in
file size than lossless compression achieves.
Raster formats

• ADRG – ARC Digitized Raster Graphics
• DRG – Digital raster graphic
• ECRG – Enhanced Compressed ARC Raster Graphics
• ECW – Enhanced Compression Wavelet (from ERDAS)
• IMG – image file format used by ERDAS
• JPEG2000 – Open-standard raster format
• MrSID – Multi-Resolution Seamless Image Database
1) American Standard Code for Information Interchange ASCII Grid

ASCII represents characters with numeric codes (0 to 255 in extended ASCII), and an ASCII grid
stores raster cell values, including floats, as plain text. The files also contain header
information with a set of keywords. It is an older format in which you can "see" the raster data,
but it is very slow to read. ASCII text files store GIS data in a delimited format, which could
be comma-, space- or tab-delimited. To go from non-spatial to spatial data, you
can run a conversion tool such as ASCII to raster.
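Reading such a file amounts to separating the keyword header from the rows of values. The sketch below assumes the widely used Esri ASCII grid layout (`ncols`, `nrows`, `xllcorner`, `yllcorner`, `cellsize`, `NODATA_value`, then space-delimited rows); the notes themselves do not list the keywords, and the sample values are invented.

```python
def parse_ascii_grid(text):
    """Split an ASCII grid into a header dict and a list of value rows."""
    header, rows = {}, []
    for line in text.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0][0].isalpha():
            # Header line: keyword followed by a number.
            header[parts[0].lower()] = float(parts[1])
        else:
            rows.append([float(value) for value in parts])
    return header, rows

# Hypothetical 3-column, 2-row grid.
sample = """ncols 3
nrows 2
xllcorner 100.0
yllcorner 200.0
cellsize 10.0
NODATA_value -9999
1 2 3
4 -9999 6
"""
header, rows = parse_ascii_grid(sample)
print(header["ncols"], header["nrows"], rows)
```

Because every value is written out as text, the file is easy to inspect in an editor but much larger and slower to parse than a binary raster.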

File extension- *.asc


2) Multiresolution Seamless Image Database (MrSID)
An example of a raster file format with explicit georeferencing information is the
proprietary MrSID (Multiresolution Seamless Image Database) format. This compression
format, which supports both lossless and lossy modes, was developed by LizardTech, Inc., for
use with large aerial photographs or satellite images, whereby portions of a compressed image
can be viewed quickly without having to decompress the entire file.
The MrSID format is frequently used for visualizing orthophotos. Its proprietary compression
technique is designed especially for maintaining the quality of large images, and allows for a
high compression ratio and fast access to large amounts of data at any scale.

MrSIDs have impressive compression ratios. Color images can be compressed at a ratio of over
20:1. LizardTech’s GeoExpress is the software package capable of reading and writing MrSID
format.
Single file—extension *.sid
World file—extension *.sdw

3) Enhanced Compression Wavelet (ECW)


Like MrSID, the proprietary ECW (Enhanced Compression Wavelet) format also includes
georeferencing information within the file structure. It is a wavelet-based, lossy compression,
similar to JPEG 2000. This lossy compression format was developed by Earth Resource
Mapping (ERMapper) and supports up to 255 layers of image information. Due to the potentially
huge file sizes associated with an image that supports so many layers, ECW files represent an
excellent option for performing rapid analysis on large images while using a relatively small
amount of the computer’s RAM (Random Access Memory), thus accelerating computation
speed.
ECW is a compressed image format typically for aerial and satellite imagery. This GIS file type
is known for its high compression ratios while still maintaining quality contrast in images. ECW
format was developed by ER Mapper, but it’s now owned by Hexagon Geospatial.

File extension- *.ecw

4) ERDAS Imagine (IMG)

ERDAS Imagine (IMG) is a proprietary file format developed by Hexagon Geospatial. IMG
files are commonly used for raster data to store single and multiple bands of satellite data. IMG
files use a hierarchical format (HFA) that can optionally store basic information about the file.
For example, this can include file information, ground control points and sensor type.
Each raster layer in an IMG file contains information about its data values. For example,
this includes projection, statistics, attributes, pyramids and whether it is a continuous or
discrete type of raster.

File extension- *.img

5) Digital raster graphic (DRG)

Digital Raster Graphic is a raster file format. A DRG is the digital image created by scanning
a paper USGS topographic map for use on a computer. The DRGs created
by USGS are typically scanned at 250 dpi and saved as TIFF files on the
server. The raster image usually includes the original border information, referred to
as the "map collar". The raster map file is projected in UTM and georeferenced to the
surface of the earth. File extension is based on product.
6) ARC Digitized Raster Graphic (ADRG)
ARC Digitized Raster Graphics is a standard National Imagery and Mapping Agency
(NIMA) digital product. ADRG is designed to support applications that require a raster map
background display.
ADRGs are digitized maps and charts. The intended exchange medium for
ADRG is a compact disk (CD-ROM). The charts are transformed into a specific
georegistration framework and accompanied by ASCII-encoded support files. ADRG is
geographically referenced using the equal arc-second raster chart/map (ARC) system, in
which the globe is divided into 18 latitudinal bands, or zones. The data consist of raster
images and other graphics generated by scanning source documents. Data file extension: *.img
or *.ovr
7) Enhanced Compressed ARC Raster Graphics (ECRG)
Enhanced Compressed ARC Raster Graphics (ECRG) is a raster product distributed by the NGA.
Like CADRG, ECRG is geographically referenced using the ARC system, in which the
globe is divided into 18 latitudinal bands, or zones. The data consist of raster images and
other graphics generated by scanning source documents. CADRG achieves a nominal
compression ratio of 55:1, while ECRG uses JPEG 2000 compression with a compression ratio
of 20:1. File extension is based on product.

Native JPEG, TIFF, and PNG files do not have georeferenced information associated with them
and therefore cannot be used directly in geospatial mapping efforts. In order to employ these files in
a GIS, a world file must first be created. A world file is a separate, plaintext data file that
specifies the locations and transformations that allow the image to be projected into a standard
coordinate system. The filename of the world file is based on the name of the raster file, with
a w typically added to the file extension. The world file extension for a JPEG is
JGW; for a TIFF it is TFW; and for a PNG it is PGW.
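How a world file places an image can be shown with a small sketch. It assumes the common six-line layout (x pixel size, two rotation terms, negative y pixel size, then the x and y map coordinates of the centre of the upper-left pixel) and the naming habit of joining the first and last letters of the image extension with a `w`; the coordinates in the sample are invented, and the function names are hypothetical.

```python
def parse_world_file(text):
    """Read the six world-file numbers in file order: A, D, B, E, C, F."""
    a, d, b, e, c, f = (float(token) for token in text.split())
    return a, d, b, e, c, f

def pixel_to_map(params, col, row):
    """Map coordinates of the centre of pixel (col, row) via the affine transform."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

def world_file_extension(image_extension):
    """First and last letter of the image extension plus 'w' (assumed convention)."""
    ext = image_extension.lower().lstrip(".")
    return ext[0] + ext[-1] + "w"

# Hypothetical world file for a north-up image with 30 m pixels.
params = parse_world_file("30.0\n0.0\n0.0\n-30.0\n500015.0\n4100985.0\n")
print(pixel_to_map(params, 0, 0))    # upper-left pixel centre
print(pixel_to_map(params, 10, 20))  # 10 columns east, 20 rows south
print(world_file_extension("tif"), world_file_extension("png"))
```

The y pixel size is negative because image rows count downward while map northings increase upward.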

8) Portable Network Graphics (PNG)

PNG provides well-compressed, lossless storage for raster files and supports background
transparency. PNG files are commonly 24-bit images. PNG images are
a good option for professionals who work on the web: the format is designed for efficient
viewing in web browsers such as Internet Explorer, Mozilla Firefox, Netscape, and Safari.
The PNG image format supports palette-based images, grayscale images, and full color RGB images.
Its file extension is .png. If you need an image with a transparent background, PNG is the
format to use; it is mostly used this way for logos, banners, and similar graphics.
It supports a large range of bit depths, from monochrome to 64-bit color. Its features include
indexed color images of up to 256 colors and effective 100 percent lossless images of up to
16 bits per pixel.

9) Graphic Interchange Format GIF


GIF has been quite popular on the web for lightweight animations. It is generally used for small
images, and GIFs are currently also used instead of emoticons in various chat apps. Scanned
images containing text, and general images with text, also render clearly on the web as GIFs.

GIF, a bitmap image format, became popular because of its wide support and portability.
GIF supports up to 8 bits per pixel for each image and for animations. It also allows a separate
palette of 256 colors for each frame. The Lempel-Ziv-Welch lossless data compression technique is
used to reduce file size without compromising visual quality. Such a GIF image can be
downloaded easily even with very slow modems, and this fact has made the image format
popular. It is well suited to images with sharp edges and relatively few gradations of
color.
File extension- *.gif
World file extension- *.gfw
10) Tagged Image File Format (TIFF)
This format is associated with scanners: it saves scanned images and reads them. TIFF
can use run length and other image compression schemes. It is not limited to 256 colors like
a GIF. It is in widespread use in the desktop publishing world and serves as an interface to
several scanners and graphic arts packages. TIFF supports black-and-white, grayscale, pseudo
color, and true color images, all of which can be stored in a compressed or uncompressed format.
Big TIFF is supported (up to 32-bit).
This format is popular among graphic artists, photographers, and print media. It is used in
scanning, word processing, faxing, optical character recognition, desktop publishing, image
manipulation, and page-layout applications. TIFF is flexible, adaptable, and capable of storing
image data in a lossless format.

11) JPEG Joint Photographic Experts Group

JPEG is the most common image format on the World Wide Web, and most digital cameras
produce JPEG images directly. The degree of compression of a JPEG image can be adjusted;
typically, it achieves 10:1 compression with little perceptible loss of image quality. A JPEG
image is easily edited or compressed, but once you edit and re-save a high-quality JPEG image,
you cannot regain that quality by reversing the edit.
The compression benefit has made JPEG more popular than other image formats. Such
reduced image data is useful for responsive presentation on the web,
though it is not suitable for drawings and iconic graphics that need to be enlarged. For medical
and scientific imaging data, this format is not suitable, because JPEG is a lossy
compression method: an image cannot regain its original quality after it has been edited and
saved several times.
One drawback of using JPEG is that it does not support transparency.

Single file extension- *.jpg, *.jpeg, *.jpc, or *.jpe


World file extension- *.jgw

12) JPEG 2000


This format offers both lossy and lossless storage. Its compression methods improve quality and
compression ratios, and the format includes features that are missing in JPEG. It is
not as common as ordinary JPEG, but it is used in professional movie editing, especially for
individual movie frames. It is a newer JPEG format that is not as well supported, and it is an
open-standard raster format.
The compression technique is used especially for maintaining the quality of large imagery, and
allows for a high compression ratio and fast access to large amounts of data at any scale.
JPEG 2000 files typically have a JP2 file extension. They use wavelet compression, with
an option for lossy or lossless compression. JPEG 2000 GIS formats require a
world file, which gives the raster its geolocation. They are a good choice for background
imagery because of the lossy compression. JPEG 2000 can achieve a compression ratio of 20:1,
which is similar to the MrSID format.
Single file extension- *.jp2, *.j2c, *.j2k, or *.jpx

Non-spatial data attributes and their types


Attributes or characteristics attached to spatial data are referred to as non-spatial data. The
spatial data we see in the form of a colourful map on a computer screen is a presentation of
information that is stored in the form of attribute tables.
Non-spatial data consists of the characteristics of spatial features that are independent of all
geometric considerations. The non-spatial data of a town comprise the name of the town, its
population, settlement type, means of transportation and communication, administrative set-up,
educational institutions, occupations and facilities. It is important to note that none of these
data depend on the location of the town. Hence, non-spatial data is independent of location
information.
Non-spatial data are stored in GIS as tables, known as non-spatial (attribute) tables. A
non-spatial table consists of rows and columns, in which each row represents a spatial
feature and each column represents a characteristic or attribute. The intersection of a row and a
column gives the value of a specific characteristic for a particular feature. A row is also known
as a record or a tuple, and a column is known as a field, item or attribute. Each row relates to a
single object of a geospatial data model. Each object has multiple attributes that describe
it; the table holding them is usually called the attribute table.
Non-spatial data is generally one-dimensional and independent. Because it is stored in the
form of a table in GIS, it is also called tabular data.
Each geographic feature has one or more attributes that identify what the feature is, describe it,
or represent some magnitude associated with the feature. For a groundwater well, for example,
the attributes might be who owns the well, the depth of the well, the water levels during the
pre-monsoon, monsoon and post-monsoon periods, and the water quality.
For example, in a demographic analysis of villages, each point (representing a village) must
have a unique village ID along with other demographic information such as total population,
number of males and females, number of children etc.
As another example, in a GIS analysis of roads, each road must have a unique Road ID. Other
attributes may include road length, road width, current traffic volume, number of stations etc.
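The attribute-table idea can be sketched in a few lines of Python. This is only an illustration: the field names and values are hypothetical, and a real GIS would store the table in its own format. Each dictionary is one record (row), each key is a field (column), and the helper returns the value at the intersection of a record and a field.

```python
# Hypothetical attribute table: one record (row) per village feature.
# Field names and values are made up for illustration only.
villages = [
    {"village_id": 1, "name": "A", "population": 1200, "males": 610, "females": 590},
    {"village_id": 2, "name": "B", "population": 850, "males": 420, "females": 430},
]


def get_value(table, key_field, key, field):
    """Return the value where the record identified by `key` meets `field`."""
    for record in table:
        if record[key_field] == key:
            return record[field]
    raise KeyError(key)


# Look up one attribute of one feature through its unique ID.
population_of_2 = get_value(villages, "village_id", 2, "population")
print(population_of_2)
```

The unique ID plays the role described in the examples above: it is the only thing needed to pick out one feature's record.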
Different types of attributes
Six kinds of basic attribute data are used in GIS:
Nominal/Categories, Ordinal/Ranks, Interval, Ratio, Cyclic/Directional, Counts and amounts
1) Nominal/ Categories
This is the simplest type of attribute. It is described by name and is used to identify or
distinguish one entity from another, with no specific order. Examples are place names, house
names, and categories of land use such as soil or forest. Nominal attributes can be numbers,
letters or even colors. Nominal is a qualitative, non-numerical and non-ranking scale that
classifies features on intrinsic characteristics.
Nominal attribute data provides descriptive information about an object, such as its color, its
name (for instance a city name) or its type. This descriptive information does not imply any
order, size, or any other quantitative information. You cannot state that one attribute is greater
than or less than another, and you cannot multiply attributes together; for instance, it does not
make sense to multiply the color blue by the color red. The only comparison you can make with
nominal attributes is to check whether two attributes are equal or not equal.
In GIS you may arrange names in alphabetical order, in order of seniority or in some other
order, but from a GIS point of view nominal attributes do not have any order. If you do not
define the type of data that will go in each column, it is normally treated as nominal data by
default. Arithmetic operations cannot be performed on nominal data, but it is still widely used
because it is easy to understand and easy to identify.
Place names are proper nouns; ordering them has no meaning, because two places that are
adjacent alphabetically will not necessarily be geographically adjacent. Likewise, arithmetic
on driver's license numbers is meaningless: the value obtained by adding two license numbers
does not carry any meaning. Nominal attributes simply describe or identify certain properties
or features on the map, or distinguish one entity from another, without any specific order.
Categories are groups of similar things. They help to organize and make sense of your data. All
features with the same value for a category are alike, and different from features with other
values. Each value serves only to distinguish a particular instance of a class from other
members of the same class.
2) Ordinal/Ranks
It includes a list of discrete classes but with an inherent or natural order or sequence.
Example - order of streams (first order, second order, third order). Here, a third-order stream
implies that first-order and second-order streams also exist. That means there is an inherent
order or natural sequence in the data.
- Level of education (primary, secondary, undergraduate, postgraduate, doctoral, post-doctoral)
- Agricultural land can be rated by classes of soil quality, with class 1 being best, class 2 not
so good etc.
- Residential land can be denoted as low density, medium density and high density
In ordinal data there is an order, like steps in a staircase, and the data must follow that order.
Ranks put features in order from higher to lower or, depending on requirements, from lower to
higher. Ranks are used when direct measures are difficult or when the quantity represents a
combination of factors. Ranks are relative, so you only know where a feature falls in the
order. You do not know how much higher or lower one value is than another.
Ordinal attribute data imply a ranking or order based on their values. These values can be
descriptive text or numbers. For example, an object can be described as having a
high/medium/low ranking, or a ranking of 100/50/1. In either case, ordinal attributes specify
rank only, not scale. For instance, we can state that high is ordered above medium and medium
is ordered above low, but we cannot say that high is twice as high as medium, or that medium
is twice as high as low.
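A small Python sketch of the ordinal idea follows. The parcel names and the integer codes are assumptions chosen for illustration; the codes encode order only, so sorting and comparison are valid while differences and ratios of the codes carry no meaning.

```python
from enum import IntEnum


class Density(IntEnum):
    # Integer codes are arbitrary; only their order is meaningful.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical residential parcels with an ordinal density attribute.
parcels = {"P1": Density.HIGH, "P2": Density.LOW, "P3": Density.MEDIUM}

# Valid on ordinal data: ranking from higher to lower.
ranked = sorted(parcels, key=parcels.get, reverse=True)

# Valid on ordinal data: order comparison.
is_higher = Density.HIGH > Density.MEDIUM

# Not meaningful: Density.HIGH / Density.LOW == 3.0 does NOT mean
# "three times as dense"; the codes could equally be 100/50/1.
print(ranked, is_higher)
```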
3) Interval
An interval attribute has a natural sequence, and the distance between values has meaning.
For example, the Celsius temperature scale is interval because it makes sense to say that the
difference between 30 and 20 degrees is the same as the difference between 20 and 10 degrees:
10 degrees in both cases.
Interval attributes imply a rank order and magnitude or scale. Interval attributes use numbers;
however, those numbers do not have a natural zero, and use an arbitrary zero point instead. For
instance, on the Fahrenheit scale, 0°F is not a natural zero point for temperature; it is a
human-defined zero point. Therefore, while we can say that 50°F is 10°F more than 40°F, we
cannot say that 50°F is twice as hot as 25°F, because 0°F is a human-created zero and not a
natural phenomenon. With an interval attribute, addition and subtraction make sense but
multiplication does not, since values are relative to that arbitrary zero.
Interval is an ordinal scale with ranking based on numerical values that are recorded with
reference to an arbitrary datum. Temperature readings in degrees centigrade are measured with
reference to an arbitrary zero (i.e. a temperature of zero degrees does not mean no temperature).
4) Ratio
A ratio attribute has the same characteristics as an interval attribute, plus a true starting
point. That means it has an inherent order, the difference between two values has meaning, and
it has a zero or starting point. For example, the scale bar on a map starts at 0 and has some
value at its right-hand end. It has an order and a sequence; differences have meaning (2 minus 1
is 1 kilometer, just as 3 minus 2 is 1 kilometer); and in addition it has a starting point at 0.
A ratio attribute implies both rank order and magnitude about a natural zero. Ratio data,
unlike interval data, support addition, subtraction, multiplication and division, because there
is an absolute natural zero. Ratio is an interval scale with ranking based on
numerical values that are measured with reference to an absolute datum. For example, rainfall
data are recorded in mm with reference to an absolute zero (i.e. zero mm of rainfall means no
rainfall). If we are measuring speed in miles per hour, then a car not moving at all is moving at
zero miles per hour. In terms of temperature, the Kelvin scale uses a natural zero, absolute
zero, the point at which molecular motion is at its minimum.
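The difference between interval and ratio scales can be checked with a short worked example. The sketch below uses the standard Fahrenheit-to-Kelvin conversion; the variable names are illustrative. Differences in °F are meaningful, but the ratio 50/25 is an artifact of the arbitrary zero, and it changes once the same temperatures are expressed in Kelvin.

```python
def fahrenheit_to_kelvin(f):
    """Convert degrees Fahrenheit to kelvin (a ratio scale with a natural zero)."""
    return (f - 32.0) * 5.0 / 9.0 + 273.15


# Interval scale: differences are meaningful.
difference = 50.0 - 40.0

# Interval scale: this ratio is numerically 2.0 but physically meaningless.
naive_ratio = 50.0 / 25.0

# Ratio scale: the same two temperatures compared in kelvin.
kelvin_ratio = fahrenheit_to_kelvin(50.0) / fahrenheit_to_kelvin(25.0)

print(difference, naive_ratio, round(kelvin_ratio, 3))
```

In kelvin the ratio is only slightly above 1, so 50°F is nowhere near "twice as hot" as 25°F.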
5) Cyclic/ Directional
In GIS, it is necessary to deal with data that can be directional or cyclic, such as flow
direction on a map, compass direction or longitude. This is important in GIS when longitude
and similar data are to be treated in a cyclic manner, in the same way as time. Directional data
has numeric values. Declaring a column or field in the database as directional specifies that
the data stored in it is directional.
Normal arithmetic operations cannot be performed on this data. For example, on a compass the
number following 359 is 0, not 360. The arithmetic average of the two directions 359 and 1 is
(359 + 1) / 2 = 180, which is exactly the opposite direction. So if simple arithmetic is applied
to directional data, you may be led towards the south instead of the north, in completely the
opposite direction.
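One common way around this problem, shown here as a sketch rather than as the method any particular GIS package uses, is to average the directions as unit vectors and convert the result back to an angle. The function name is an assumption made for this example.

```python
import math


def circular_mean_degrees(angles):
    """Mean direction in degrees (modulo 360), computed by averaging unit vectors."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# Naive arithmetic mean of 359 and 1 degrees points south.
naive = (359 + 1) / 2

# The circular mean points north (0 degrees, up to floating-point rounding).
circular = circular_mean_degrees([359, 1])

print(naive, circular)
```

Because of floating-point rounding, the circular result may print as a value extremely close to 0 or to 360; both denote north.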
6) Counts and amounts
Counts and amounts show a total number in a particular field as an attribute. A count is the
actual number of features on the map. An amount can be any measurable quantity associated with
a feature, for example the number of students in a class. Using a count or amount we can see
the actual value of each feature as well as its magnitude compared to other features.
In modern GIS software such as ArcGIS, data measured on ratio or interval scales are of type
number, and data measured on ordinal or nominal scales are of type string.
Spatial Data
Spatial data is the geographical representation of features. In other words, spatial data is
what we actually see in the form of maps (containing real-world features) on a computer screen.
Spatial data is data that relates to space.
It includes location, shape, size and orientation information of features or objects. For
example, consider a particular square: its center (the intersection of its diagonals) specifies
its location, its shape is a square, the length of one of its sides specifies its size, and the
angle its diagonals make with, e.g., the x-axis specifies its orientation. Spatial data also
includes spatial relationships, for example, the arrangement of three stumps on a cricket ground.
Spatial data is generally multidimensional and autocorrelated.

In general, spatial data can be of two types:

1. Vector data: This data is represented as discrete points, lines and polygons
2. Raster data: This data is represented as a matrix of square cells.
Spatial Database system and their types.
A database is a collection of related information that permits the entry, storage, input, output
and organization of data. A database management system (DBMS) serves as an interface
between users and their database. A spatial database includes location: it stores geometry as
points, lines and polygons. GIS combines spatial data from many sources and is used by many
different people; databases connect those users to the GIS data.
A spatial database system is a database system with additional capabilities for handling spatial
data. It stores, retrieves, manipulates, queries and analyses geometric (spatial) data.
Spatial data is associated with geographic locations such as cities, towns etc. A
spatial database is used to store and query data related to objects in space, and to share
spatial data with GIS as well as other applications.
A spatial database offers spatial data types (SDTs) in its data model and query language. It
supports spatial data types in its implementation, providing spatial indexing and efficient
algorithms for spatial join. Spatial data types, e.g. point, line, polygon, partitions (maps)
and graphs (networks), provide a fundamental abstraction for modeling the structure of
geometric entities in space as well as their relationships, properties and operations. Some
spatial databases handle more complex structures such as 3D objects, topological coverages,
linear networks and TINs. Geometry or feature data types are used to store spatial data in the
database.
A spatial database stores objects that have spatial characteristics that describe them and that
have spatial relationships among them. The spatial relationships among the objects are
important, and they are often needed when querying the database.
Queries posed on spatial data are called spatial queries, where the predicates for selection
deal with spatial parameters. For example, a query such as "List all the customers located
within twenty miles of company headquarters" requires the processing of spatial data
types. Effectively, each customer is associated with a <latitude, longitude> position. A
traditional B+ tree index based on customers’ zip codes or other non-spatial attributes cannot be
used to process this query, since traditional indexes are not capable of ordering
multidimensional coordinate data. A spatial database system can work with an underlying DBMS.
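The "within twenty miles" query can be sketched in plain Python. This is only an illustration under stated assumptions: the coordinates and customer names are hypothetical, the Earth is treated as a sphere with a mean radius of 3958.8 miles, and the search is a simple linear scan, whereas a spatial database would normally answer such a query with a spatial index.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius; spherical-Earth approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two <latitude, longitude> points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical data: headquarters and customer positions as (latitude, longitude).
headquarters = (40.0, -75.0)
customers = {
    "near": (40.1, -75.0),  # about 0.1 degree of latitude away
    "far": (41.0, -75.0),   # about 1 degree of latitude away
}

# Spatial predicate: distance to headquarters is at most twenty miles.
within_20 = [
    name
    for name, (lat, lon) in customers.items()
    if haversine_miles(headquarters[0], headquarters[1], lat, lon) <= 20.0
]
print(within_20)
```

The selection predicate depends on both coordinates at once, which is why an index that orders a single attribute such as a zip code does not help.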

Example

A road map is a visualization of geographic information. A road map is a 2-dimensional object
which contains points, lines, and polygons that can represent cities, roads, and political
boundaries such as states or provinces.

Three main types of DBMS are available for GIS to store spatial data: relational (RDBMS),
object (ODBMS), and object-relational (ORDBMS).
Relational (RDBMS)
A relational database comprises a set of tables, each a two-dimensional list (or array) of records
containing attributes about the objects under study. Data is organized in tabular files that
can be related to each other by a common field (item) called a key. An RDBMS can recombine
data items from different files and supports multiuser access. An RDBMS
requires few assumptions about how data is related or how it will be extracted from the database.
When an RDBMS stores the physical location and shape of geometric objects inside tables, it
becomes a spatial database. The Spatial Data Option is designed to make the storage,
retrieval, and manipulation of spatial data easier and more natural to users, while benefiting
from all the power of the RDBMS.
ArcSDE is ESRI’s technology for accessing and managing geospatial data within relational
databases. ArcSDE technology serves as the gateway between GIS clients and the RDBMS. It
enables you to store, access, and manage spatial data within an RDBMS package such as
DB2, Informix, Oracle, PostgreSQL, SQL Server or SQL Server Express. ArcSDE technology
supports multiuser editing environments, scalability, reliability, security, backup, and
integrity (all the benefits of relational databases), as well as connections from different
types of clients with different user levels.
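The relational idea described in this section, tables related to each other through a common key field, can be sketched with Python's built-in sqlite3 module. The table names, column names and values are hypothetical, and SQLite is used here only because it ships with Python; it is not one of the RDBMS packages listed above.

```python
import sqlite3

# In-memory database with two hypothetical tables sharing the key road_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE roads (road_id INTEGER PRIMARY KEY, name TEXT, length_km REAL);
    CREATE TABLE traffic (road_id INTEGER, vehicles_per_day INTEGER);
    INSERT INTO roads VALUES (1, 'Road A', 12.5), (2, 'Road B', 3.2);
    INSERT INTO traffic VALUES (1, 8000), (2, 1500);
    """
)

# Recombine data items from both tables through the common key field.
rows = conn.execute(
    """
    SELECT r.name, r.length_km, t.vehicles_per_day
    FROM roads AS r
    JOIN traffic AS t ON r.road_id = t.road_id
    ORDER BY r.road_id
    """
).fetchall()
conn.close()

print(rows)
```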
Object database management systems (ODBMS)
In an object database management system, data is stored in the form of objects, which are
instances of classes. These classes and objects together make up an object-oriented data model.
Locations are stored as objects. Objects can be as simple as polygons and lines, or more
complex, representing for example cities.

The object-based spatial model treats the world as a surface littered with recognizable objects
(e.g. cities, rivers), which exist independent of their locations, while a field-based data model
sees the world as a continuous surface over which features (e.g. elevation) vary. Using an
object-based spatial database, it is easier to store additional attributes with the objects, such
as direction, speed, etc. Using these attributes can make it easier to answer queries like "find
all tanks whose speed is 10 km/h and that are oriented to the north" or "find all enemy tanks in
a certain region". Storing attributes with objects can provide better result presentation and
improved manipulation capabilities in a more efficient way. In a field-based data model, this
information is usually stored in different layers, and it is harder to extract and combine
information from various layers.
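The two tank queries can be sketched with ordinary Python objects. This is a toy illustration, not an ODBMS: the class, its attribute names, the coordinates and the rectangular region are all assumptions made for the example. The point is that location and descriptive attributes live together on each object, so both kinds of query read the same records.

```python
from dataclasses import dataclass


@dataclass
class Tank:
    # Hypothetical object: location and attributes are stored together.
    name: str
    x: float
    y: float
    speed_kmh: float
    heading: str
    side: str


tanks = [
    Tank("T1", 2.0, 3.0, 10.0, "N", "enemy"),
    Tank("T2", 8.0, 1.0, 10.0, "E", "friendly"),
    Tank("T3", 4.0, 4.0, 25.0, "N", "enemy"),
]

# Attribute query: tanks moving at 10 km/h and oriented to the north.
north_at_10 = [t.name for t in tanks if t.speed_kmh == 10.0 and t.heading == "N"]


def in_region(tank, xmin, ymin, xmax, ymax):
    """True if the tank lies inside the axis-aligned rectangle."""
    return xmin <= tank.x <= xmax and ymin <= tank.y <= ymax


# Combined attribute and spatial query: enemy tanks inside a region.
enemy_in_region = [
    t.name for t in tanks if t.side == "enemy" and in_region(t, 0.0, 0.0, 5.0, 5.0)
]

print(north_at_10, enemy_in_region)
```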

Object database management systems (ODBMS) were initially designed to address weaknesses
of RDBMS, including the inability to store complete objects directly in the database (both object
state and behavior) and poor performance for many types of geographic query.
In an ODBMS, each surface feature can be abstracted as a class of objects with public
properties, such as point, line, area and so on. A specific surface feature is an instance of
that class. It has its own attributes, and the various objects are managed hierarchically. An
ODBMS is good at describing complex data types, and it adds database functionality to object
programming languages. Its shortcomings are a lack of standards, development tools and defense
mechanisms, and its model is complex.
Hybrid object-relational DBMS (ORDBMS)
An object-relational database (ORD), or object-relational database management
system (ORDBMS), is a database management system (DBMS) similar to a relational database,
but with an object-oriented database model: objects, classes and inheritance are directly
supported in database schemas and in the query language. It supports complex data handling
together with the full functionality of a DBMS.
Hybrid object-relational DBMS (ORDBMS) can be thought of as an RDBMS engine with an
extensibility framework for handling objects. The ideal geographic ORDBMS is one that has
been extended to support geographic object types and functions through the addition of a
geographic query parser, a geographic query optimizer, a geographic query language,
multidimensional indexing services, and storage management for large files, long transaction
services and replication services.
In essence, an ORDBMS inherits features from both the SQL-based relational world and the object
world. It adds flexibility in the data server. It supports complex "user-defined" application
objects and logic. It uses abstract data types, which can hide any complex internal structure
and properties, to express spatial objects, and it allows the operations of a type to be defined
within user-defined data types.
An ORDBMS includes topologies and methods for analyzing spatial relationships. It has
multi-dimensional, hierarchical indexes for searching spatial data. It provides facilities to
store both spatial and non-spatial data in the same database.
Commercial DBMS vendors have released spatial database extensions to their standard
ORDBMS products.

Spatial Database Examples


- Proprietary Esri file geodatabases store vectors, rasters, tables, topology and
relationships. Schemas can be set up for data integrity. File geodatabases offer structural,
performance and data management advantages.
- Open source PostGIS adds spatial objects to the cross-platform PostgreSQL database.
The three features that PostGIS delivers to the PostgreSQL DBMS are spatial types, indexes
and functions. With support for different geometry types, the PostGIS spatial database
allows querying and managing information about locations and mapping.
- Other database examples include SQL Server (where geometry is just another data type,
like char and int) and Microsoft Access (known as a personal geodatabase in ArcGIS).

Spatial databases provide a mechanism for multiple users to simultaneously access shared spatial
data – similar to a DBMS.
