Data Compression

Data compression is a technique that reduces the size of files by encoding information using fewer bits, which can save resources like storage space and transmission time. It can be classified into lossless and lossy compression, with lossless preserving original data and lossy sacrificing some quality for smaller file sizes. Common examples include ZIP for lossless compression and JPEG for lossy compression, each serving different needs in data storage and transmission.

Uploaded by

umarani7815


DATA COMPRESSION
• Data compression is a technique that is used to encode
real-world information using fewer bits than a
straightforward encoding requires.
• Data compression is often very useful because it
reduces the consumption of resources such as hard disk
space or the time it takes to send files across the
Internet.
• If a digital image can be compressed from 10 MB in size
to 1 MB in size, for example, the compressed image can
be sent across the Internet roughly ten times more quickly
than the uncompressed image over the same connection.
• If a photo album that consumes 200 GB of image data
can be compressed to 20 GB, then you may not need to
buy a new hard drive to store your photo album!
• There is a cost to using data compression,
however, since a great deal of computational
work is required for a computer to take the
uncompressed file and find a way to eliminate
some of the bits.
• Also, when a file is stored in compressed form,
the computer must take time to uncompress
the file for further processing.
• This computational work takes time to
perform, and sometimes the cost of
compression outweighs the benefits.
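The trade-off between saved space and extra computation can be observed with a minimal sketch that uses Python's standard `zlib` module. The sample data and the helper name `measure` are assumptions made for illustration; real sizes and timings depend on the data and on the machine.

```python
import time
import zlib

def measure(data: bytes, level: int = 6) -> dict:
    """Compress and decompress data, reporting sizes and elapsed time."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    restored = zlib.decompress(packed)
    decompress_seconds = time.perf_counter() - start

    return {
        "original_bytes": len(data),
        "compressed_bytes": len(packed),
        "compress_seconds": compress_seconds,
        "decompress_seconds": decompress_seconds,
        "round_trip_ok": restored == data,
    }

# Hypothetical sample: highly repetitive text, which compresses well.
sample = b"the quick brown fox jumps over the lazy dog. " * 2000
report = measure(sample)
print(report)
```

Whether the time spent is worth the bytes saved depends on how often the file is stored, sent, and read back.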
American Football: An Analogy
• Teams will often try to gain an advantage by
executing plays more quickly than their
opponent can respond.
• The problem is that a single play may take
many words to fully convey since every play
must describe things like the blocking scheme,
the snap count, the position of the running
backs, and the direction and nature of the play
itself.
• “The snap count is 3. We will use a
zone-blocking scheme to the right.
The fullback must line up to the right, the
halfback will motion to the left, and the
quarterback will hand off to the right.”
• Calling such a play by fully describing it with
full sentences is simply too time consuming
for effective game play.
• A simple play-calling scheme might use four digits
to communicate a single play.
• The first digit might indicate the snap count, the
second the blocking scheme, the third will define
the alignment and motion of the backs, and the
fourth will describe the nature of the play itself.
• The play 3518, for example, might be understood
as 3-snap count; 5-zone blocking to the right;
1-fullback aligns right, halfback motions left;
8-quarterback hands off to the right.
• This compression technique takes work by the
players to memorize and learn what the numbers
actually mean, but this cost occurs before the
game is played.
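The four-digit play code can be sketched as a set of lookup tables that both sides memorize in advance. The table names and the function `decode_play` are hypothetical, and only the entries mentioned in the example play 3518 are filled in.

```python
# Hypothetical codebooks: one table per digit position.
SNAP_COUNT = {"3": "The snap count is 3."}
BLOCKING = {"5": "We will use a zone-blocking scheme to the right."}
BACKS = {"1": "The fullback lines up to the right and the halfback motions to the left."}
PLAY = {"8": "The quarterback hands off to the right."}

def decode_play(code: str) -> str:
    """Expand a four-digit play code into its full description."""
    if len(code) != 4:
        raise ValueError("a play code must have exactly four digits")
    tables = (SNAP_COUNT, BLOCKING, BACKS, PLAY)
    # Each digit is looked up in the table for its position.
    return " ".join(table[digit] for table, digit in zip(tables, code))

full_description = decode_play("3518")
print(full_description)
print(len("3518"), "characters sent instead of", len(full_description))
```

The short code only works because the sender and the receiver share the same tables, which is the up-front cost the players pay before the game.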
• The obvious and central idea in data
compression is that unnecessary bits should
be eliminated from the encoding scheme.
• While this is an obvious idea, identification
and elimination of unnecessary bits is often a
very difficult task and largely depends on the
type of information being compressed: text,
images, or audio.
Run-Length Encoding
• Run-length encoding is one of the simplest types
of compression techniques that can be used on
images and even text.
• We will describe how run-length encoding can be
used to encode multiple pixels as a single value
rather than recording each pixel individually.
• Consider, for example, a binary image in which
each pixel is encoded as a single bit that is either 0
(denoting black) or 1 (denoting white). One row of
the image may contain the 32-pixel sequence
11111111110001111111111111111111
• This row contains three runs: 10 white pixels
followed by 3 black followed by 19 white. This
information can be encoded as the 3-byte
sequence {10, 3, 19} by assuming that every row
begins with a white run.
• Note that the raw representation uses 32 bits
of memory while the run-length encoding uses
only 24 bits of memory if we assume that one 8-bit
byte is used to store each run.
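The row above can be run-length encoded with a few lines of Python. This is a minimal sketch: the function name `rle_encode_row` and the `first` parameter are assumptions, and a leading run of length 0 is emitted when a row does not start with the assumed color.

```python
from itertools import groupby

def rle_encode_row(bits: str, first: str = "1") -> list:
    """Return the run lengths of a row of '0'/'1' pixels.

    Runs alternate in color, and the first run is assumed to be `first`.
    """
    runs = []
    expected = first
    for value, group in groupby(bits):
        if value != expected:
            # Only possible for the first run: the row starts with the other color.
            runs.append(0)
        runs.append(len(list(group)))
        expected = "0" if value == "1" else "1"
    return runs

row = "11111111110001111111111111111111"
runs = rle_encode_row(row)
raw_bits = len(row)          # one bit per pixel
rle_bits = len(runs) * 8     # one 8-bit byte per run
print(runs, raw_bits, rle_bits)
```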
How Run-Length Encoding Can Be Used to Compress a Binary Image
• The heart image has 18 columns and 14 rows.
• Raw encoding, where a single bit is used to encode each
pixel, requires 18 × 14 or 252 bits.
• Run-length encoding, where we encode each row by
recording the lengths of each run in the row, uses
significantly fewer bits.
• The first row, for example, is encoded by the five
numbers {3, 3, 5, 4, 3} since the first row starts with
three white pixels followed by three black pixels and so
on.
• If we count the total number of runs in the entire image,
we find that there are a total of 43 runs.
• Since the longest run is 18, we can encode each run
using five bits, which can represent any value from 0 to 31.
• Therefore, the total number of bits required to
run-length encode this image is given as 43 runs times 5
bits per run for a total of 43 × 5 or 215 bits.
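The bit counts for the heart image can be checked with a short calculation. The figures (18 columns, 14 rows, 43 runs, longest run 18) come from the text; the helper name `bits_per_run` is an assumption.

```python
def bits_per_run(longest_run: int) -> int:
    """Smallest fixed width that can store every value from 0 to longest_run."""
    return max(1, longest_run.bit_length())

columns, rows = 18, 14
run_count, longest_run = 43, 18

raw_bits = columns * rows                         # one bit per pixel
width = bits_per_run(longest_run)                 # bits needed per run
rle_bits = run_count * width                      # total for all runs

print(raw_bits, width, rle_bits, raw_bits - rle_bits)
```

Note that this count covers the run data only; any header information is extra.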
• You may notice that some rows, rows 3 to 7 for
example, begin with runs of length zero.
• This is because when we decompress the run
lengths we must assume that we are starting with
either white or black runs.
• In this example, the assumption is that each row
starts with a run of white pixels.
• Since there are no white pixels at the beginning of
these rows, we encode the runs as being length
zero.
• When decompressing the run-length data, we
must also know how many columns and rows are
in the image.
• These two pieces of information must be stored in
the image header.
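Decompression can be sketched as follows, assuming (as in the text) that every row starts with a white run and that the number of columns is known from the header. The function name `rle_decode_row` is an assumption, and the second example row is hypothetical because the full heart image is not reproduced here.

```python
def rle_decode_row(runs, columns: int, first: str = "1") -> str:
    """Expand alternating run lengths back into a row of '0'/'1' pixels."""
    pixels = []
    color = first
    for length in runs:
        pixels.append(color * length)         # a run of length 0 adds nothing
        color = "0" if color == "1" else "1"  # colors alternate between runs
    row = "".join(pixels)
    if len(row) != columns:
        raise ValueError("run lengths do not match the column count in the header")
    return row

# First row of the heart image, as given in the text.
first_row = rle_decode_row([3, 3, 5, 4, 3], columns=18)

# Hypothetical row that starts with black pixels: the leading white run has length 0.
black_start = rle_decode_row([0, 4, 14], columns=18)

print(first_row)
print(black_start)
```

The column count acts as a consistency check: without it, the decoder could not tell where one row ends and the next begins.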
• Compression schemes can be classified as either
lossless or lossy.
• A lossless compression scheme encodes an exact
representation of the original data, whereas a lossy
compression scheme encodes an approximate
representation.
• The central idea behind lossy compression is to
identify pieces of information that are of little
importance and simply discard those pieces of
information in order to use fewer bits.
• Such compression schemes are known as lossy
because real information is actually lost in the
process.
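The difference between the two classes can be illustrated with a minimal sketch. The lossless path uses Python's standard `zlib` module; the lossy path uses simple quantization (keeping only a coarse version of each value), which is an illustrative stand-in and not the method of any particular file format. The sample values and `STEP` are assumptions.

```python
import zlib

# Hypothetical 8-bit samples.
samples = bytes([200, 201, 199, 202, 50, 51, 49, 52])

# Lossless: the round trip returns exactly the original bytes.
restored = zlib.decompress(zlib.compress(samples))

# Lossy: store only which block of 16 values each sample falls in,
# then reconstruct each sample as the middle of its block.
STEP = 16
quantized = bytes(v // STEP for v in samples)
approximate = bytes(q * STEP + STEP // 2 for q in quantized)

max_error = max(abs(a - b) for a, b in zip(samples, approximate))
print(restored == samples, approximate == samples, max_error)
```

The quantized values need only 4 bits each instead of 8, but the discarded detail cannot be recovered.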
• There are hundreds of image file types in common
use.
• Among the most popular are the PNG, JPEG, and
GIF formats that are most often used to display
images on the Internet.
• Image file types typically use some compression
scheme to reduce the overall size of the image file.
• The PNG image format uses lossless compression
to reduce the overall file size, whereas JPEG
typically uses lossy compression.
• GIF compresses its pixel data losslessly, but it is
limited to a palette of at most 256 colors, so
converting a full-color image to GIF usually
discards color information.
• While the JPEG standard does include a lossless
mode, almost all of the JPEG images on the
Internet have been compressed using a lossy
technique.
• The digital image consists of 513 columns and 668
rows and is stored in uncompressed form using 24
bits per pixel.
• The uncompressed image therefore requires 513 ×
668 × 24 or 8,224,416 bits.
• JPEG compression reduces the memory
requirement to 55,384 bits but significantly
reduces the image quality.
• PNG compression reduces the memory
requirements to 5,668,304 bits without any
reduction in quality.
• JPEG can be controlled to provide much higher
quality results but with a corresponding loss of
compression effectiveness.
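The compression ratios implied by these figures can be worked out directly. The sizes are taken from the text; the variable names are assumptions.

```python
columns, rows, bits_per_pixel = 513, 668, 24

raw_bits = columns * rows * bits_per_pixel   # uncompressed size
jpeg_bits = 55_384                           # lossy, low quality
png_bits = 5_668_304                         # lossless

jpeg_ratio = raw_bits / jpeg_bits
png_ratio = raw_bits / png_bits

print(raw_bits)
print(f"JPEG ratio: {jpeg_ratio:.1f} to 1")
print(f"PNG ratio:  {png_ratio:.2f} to 1")
```

For this image, the lossy JPEG is roughly 148 times smaller than the raw data, while the lossless PNG is roughly 1.45 times smaller.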
• Sound information can also be compressed by
recognizing that certain sound waves are less
important than others when listening to music
or voice recordings.
• The MP3 standard, for example, will discard
higher frequency sound waves (or portions
thereof) in order to encode the sound using
fewer bits.
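The idea of discarding high-frequency content can be illustrated with a small sketch based on a plain discrete Fourier transform. This is not how MP3 itself works (the real standard relies on a model of human hearing and a different transform); the signal, the `CUTOFF` value, and the function names are all assumptions chosen for illustration.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inverse_dft(spectrum):
    """Naive inverse transform, returning real-valued samples."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

N = 64
CUTOFF = 10  # hypothetical: keep frequency bins 0..10 and their mirror images

# Hypothetical signal: a strong low tone plus a faint high tone.
signal = [
    math.sin(2 * math.pi * 3 * t / N) + 0.05 * math.sin(2 * math.pi * 20 * t / N)
    for t in range(N)
]

spectrum = dft(signal)
# Zero every bin above the cutoff; those coefficients no longer need to be stored.
kept = [c if (k <= CUTOFF or k >= N - CUTOFF) else 0 for k, c in enumerate(spectrum)]
approximate = inverse_dft(kept)

max_error = max(abs(a - b) for a, b in zip(signal, approximate))
print(max_error)
```

The reconstructed signal keeps the strong low tone and loses only the faint high tone, so the error stays small relative to the signal.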
Lossless Compression
• Lossless compression reduces the file size without losing any data.
When the file is uncompressed, it is identical to the original. This
method is commonly used for text files, executable programs, and
data files where accuracy is crucial.
• Examples:
• ZIP compression: Files are compressed into a ZIP archive, reducing
their size without losing any data. When you extract the files, they
are identical to the originals.
• PNG image format: PNG (Portable Network Graphics) uses lossless
compression for images. It reduces the file size without sacrificing
image quality, making it suitable for storing images that require high
precision.
• FLAC audio format: FLAC (Free Lossless Audio Codec) compresses
audio files without losing any audio quality. It's often used for
storing music files where audio fidelity is important.
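The ZIP round trip described above can be sketched with Python's standard `zipfile` module, using an in-memory archive. The file name `notes.txt` and the sample contents are hypothetical.

```python
import io
import zipfile

# Hypothetical file contents: repetitive text compresses well.
original = b"Lossless compression preserves every byte.\n" * 500

# Write the data into an in-memory ZIP archive using DEFLATE compression.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("notes.txt", original)

archive_bytes = buffer.getvalue()

# Extract the file again and compare it with the original.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    extracted = archive.read("notes.txt")

print(len(original), len(archive_bytes), extracted == original)
```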
Lossy Compression
• Lossy compression reduces the file size by removing some data
permanently. While this results in a smaller file size, there is a loss of
quality. Lossy compression is often used for multimedia files where some
loss of quality is acceptable in exchange for a significant reduction in size.
• Examples:
• JPEG image format: JPEG (Joint Photographic Experts Group) is a
commonly used image format that utilizes lossy compression. By adjusting
the compression level, you can control the trade-off between file size and
image quality. Higher compression levels result in smaller file sizes but
lower image quality.
• MP3 audio format: MP3 (MPEG-1 Audio Layer III) is a popular audio format
that uses lossy compression to reduce the size of audio files. When you
compress an audio file to MP3 format, some audio data is discarded,
resulting in a smaller file size. The degree of compression can be adjusted
to balance between file size and audio quality.
• MPEG video formats: Formats such as MPEG-2 and MPEG-4 use lossy
compression for video files. By removing certain elements of the video
that are less noticeable to the human eye, these formats achieve
significant reductions in file size while maintaining acceptable visual
quality.
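The adjustable trade-off between size and quality can be sketched with a simple quantizer followed by `zlib`. This is an illustrative stand-in, not the algorithm used by JPEG, MP3, or MPEG video; the signal, the step sizes, and the function name `lossy_round_trip` are assumptions.

```python
import math
import zlib

# Hypothetical 8-bit signal: a smooth wave sampled 4096 times.
signal = bytes(int(127.5 + 127.5 * math.sin(i / 20)) for i in range(4096))

def lossy_round_trip(data: bytes, step: int):
    """Quantize with the given step, compress, then reconstruct.

    Returns (compressed size in bytes, largest reconstruction error).
    """
    quantized = bytes(v // step for v in data)
    packed = zlib.compress(quantized)
    restored = bytes(q * step + step // 2 for q in zlib.decompress(packed))
    max_error = max(abs(a - b) for a, b in zip(data, restored))
    return len(packed), max_error

# A larger step plays the role of a higher compression level.
results = {step: lossy_round_trip(signal, step) for step in (1, 4, 16, 32)}
for step, (size, error) in results.items():
    print(f"step={step:2d}  compressed bytes={size:5d}  max error={error}")
```

With a step of 1 nothing is discarded and the error is zero; larger steps discard more detail in exchange for a smaller compressed size.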
