
Data Representation

The document explains how computers represent text, sound, and images using binary codes and character sets like ASCII and Unicode. It details the processes of sound sampling, image resolution, and data storage measurements, including methods of data compression such as lossy and lossless compression. Additionally, it provides calculations for file sizes of sound and images, emphasizing the importance of compression for efficient storage and transfer.

Uploaded by

flocciey
Copyright
© All Rights Reserved

• Understand how and why a computer represents text.

Computers represent text using character sets, which are sets of characters and symbols
that are assigned a unique binary code that can be understood by the computer. This is
important because it enables computers to store and display text in a standardized way.

The American Standard Code for Information Interchange (ASCII) is a widely used
character set that uses 7 bits to represent 128 characters. Unicode is a universal
character set that includes characters from different scripts and languages. Early
versions of Unicode used 16 bits, giving 65,536 possible codes; the current standard has
room for over a million code points, which are stored using encodings such as UTF-8 and
UTF-16.
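
As a small illustration of characters being stored as binary codes, the sketch below prints
the 7-bit ASCII codes for a short word and the bytes used for a character outside ASCII. The
function name to_binary_codes is made up for this example, and UTF-8 is used here as one
common way of storing Unicode characters.

```python
def to_binary_codes(text: str, bits: int = 7) -> list[str]:
    """Return each character's code as a fixed-width binary string."""
    return [format(ord(ch), f"0{bits}b") for ch in text]


# 'H' has code 72 and 'i' has code 105, so both fit in 7 bits.
ascii_codes = to_binary_codes("Hi", bits=7)
print(ascii_codes)

# 'é' is not in ASCII; in UTF-8 it is stored as two bytes.
utf8_bytes = list("é".encode("utf-8"))
print(utf8_bytes)
```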

• Understand how and why a computer represents sound, including the sample rate
and sample resolution.

WHAT IS SOUND?
Sound is a vibration that travels as a wave, creating changes in pressure. Each wave
has a frequency; one oscillation can be seen in the diagram below.

Frequency is measured in hertz (Hz), the number of complete oscillations per second. A
wave that completes 18 oscillations in one second has a frequency of 18 hertz, while a
wave that takes 18 seconds to complete one oscillation has a frequency of 1/18 hertz.

Sound travels as an analogue wave, which must be converted into a digital representation
before a computer can use it. When converting the analogue wave into digital format there
are two main factors to consider:
• How often we take a sample of the original sound wave (times per second). This is
known as the sample rate.
• How many distinct amplitude levels a sample can take. The number of bits used to
record each sample is called the bit depth or sample resolution.
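
The two factors above can be sketched in code. This is only an illustration, not how a real
analogue-to-digital converter is built: the function sample_wave and its parameters are
invented for the example, and a pure sine wave stands in for the analogue sound.

```python
import math


def sample_wave(frequency_hz: float, sample_rate: int, bit_depth: int,
                duration_s: float) -> list[int]:
    """Sample a sine wave and store each sample as a whole-number level."""
    levels = 2 ** bit_depth            # how many amplitude levels are available
    total_samples = int(sample_rate * duration_s)
    samples = []
    for i in range(total_samples):
        t = i / sample_rate            # the time at which this sample is taken
        amplitude = math.sin(2 * math.pi * frequency_hz * t)   # between -1 and 1
        # Map -1..1 onto the nearest available level, 0..levels-1.
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples


# 8 samples per second at a 3-bit depth gives 8 samples, each between 0 and 7.
samples = sample_wave(frequency_hz=1, sample_rate=8, bit_depth=3, duration_s=1)
print(samples)
```

Raising sample_rate gives more samples per second, and raising bit_depth gives more levels
to choose from, so the stored values follow the original wave more closely.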

CALCULATING SOUND FILE SIZE


Samples per second x bits per sample x time (in seconds)
or
Sample Rate X Bit Depth X Time
or for example
10,000 (samples per second) X 16 (bit depth) X 60 (seconds) = 9,600,000 bits
The result for the example above is shown in bits, but it is more appropriate to show
the result in bytes, kilobytes or megabytes. To convert, first divide by 8 to get the
number of bytes, then by 1024 for kilobytes (KB) and by 1024 again for megabytes (MB).

9,600,000 / 8 = 1,200,000 bytes
1,200,000 / 1024 = 1171.875 kilobytes
1171.875 / 1024 = 1.14 megabytes

1.14 MB is a much more appropriate way to express the file size in the example above.
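
The same calculation can be written as a short script. The function name
sound_file_size_bits is made up for this sketch; the numbers are those from the example
above.

```python
def sound_file_size_bits(sample_rate: int, bit_depth: int, seconds: int) -> int:
    """Sample rate x bit depth x time, giving the size in bits."""
    return sample_rate * bit_depth * seconds


bits = sound_file_size_bits(sample_rate=10_000, bit_depth=16, seconds=60)
size_bytes = bits / 8             # 8 bits in a byte
kilobytes = size_bytes / 1024     # 1024 bytes per kilobyte, as in the text
megabytes = kilobytes / 1024      # 1024 kilobytes per megabyte

print(bits, size_bytes, kilobytes, round(megabytes, 2))
```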

Mono:
Mono audio uses a single channel, meaning all the sound information is combined and
played through one speaker, or both speakers simultaneously with the same signal.
Stereo:
Stereo audio uses two channels, a left and a right, to create a more realistic and
immersive soundscape. The separation of the audio signals into two channels allows for
the creation of a "soundstage," where sounds appear to originate from different
locations.
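
The notes do not give a formula for more than one channel. Assuming each channel stores its
own full set of samples, the file size is simply multiplied by the number of channels. The
sketch below uses that assumption; the function name and the channels parameter are
illustrative.

```python
def sound_size_bits(sample_rate: int, bit_depth: int, seconds: int,
                    channels: int = 1) -> int:
    """Size in bits, assuming every channel stores its own samples."""
    return sample_rate * bit_depth * seconds * channels


mono_bits = sound_size_bits(10_000, 16, 60, channels=1)
stereo_bits = sound_size_bits(10_000, 16, 60, channels=2)

print(mono_bits, stereo_bits)
```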

• Understand how and why a computer represents an image, including resolution and
colour depth.

WHAT IS IMAGE REPRESENTATION?


Just as numbers and letters are represented in binary, so are images. To a computer
an image is nothing more than a series of ones and zeros which allow the display of
pixels in various colours. Colours are commonly written in hexadecimal (HEX) using the
RGB (Red, Green, Blue) system. For example, the chart below shows how pure blue is
represented using HEX. The HEX code for pure blue is #0000FF, meaning it has zero red,
zero green and the maximum amount of blue.

Most software that allows you to manipulate images or colours will allow you to select
colours by their HEX or RGB values.
Remember: HEX is used for our benefit to make colours easy for us to see and work with,
but the computer still deals with colour in binary format.
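
To see how a HEX colour code relates to the binary the computer actually stores, the sketch
below splits a code into its red, green and blue parts. The function name hex_to_rgb is
invented for this example.

```python
def hex_to_rgb(hex_code: str) -> tuple[int, int, int]:
    """Split a HEX colour such as '#0000FF' into red, green and blue values."""
    digits = hex_code.lstrip("#").replace(" ", "")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


rgb = hex_to_rgb("#0000FF")
binary = [format(value, "08b") for value in rgb]   # 8 bits per colour component

print(rgb)
print(binary)
```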

Image quality
The quality of an image depends predominantly on two factors: the number of pixels per
inch (PPI) and the colour depth. The colour depth is the number of bits used to
represent each pixel. If 4 bits were used, only 2^4 (16) different shades or colours
could be represented. HEX colour codes allow 256 possible values (0 to 255) for each of
Red, Green and Blue, which equates to 8 bits for each of the 3 colours, meaning a 24
bit colour depth per pixel (2^24 = 16,777,216), giving over 16 million individual colours.
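
The link between colour depth and the number of available colours is a power of 2, which a
few lines of code can confirm. The function name colours_available is made up for this
sketch.

```python
def colours_available(bits_per_pixel: int) -> int:
    """Each extra bit doubles the number of values a pixel can take."""
    return 2 ** bits_per_pixel


for depth in (1, 4, 8, 24):
    print(depth, "bits:", colours_available(depth), "colours")
```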

IMAGE FILE TYPES


There are many different types of image file and each format has its own attributes, as
follows:

JPEG or JPG: Joint Photographic Experts Group. A JPEG image has been compressed from
its original size. The compression type is lossy, meaning it does lose some of its quality
during compression. JPGs are good for photographs that require good quality images but
a small data storage size, such as a photograph being used on a website.

Some other popular image file types include:


PNG: PORTABLE NETWORK GRAPHICS - Supports good quality graphics with 24-bit colour,
uses lossless compression and is designed to work well online through web browsers.

GIF: GRAPHICS INTERCHANGE FORMAT - Uses lossless compression but is limited to a palette
of 256 colours. It can retain features such as background transparency and is
compatible with many platforms.

BMP: BITMAP - Normally not compressed and therefore often maintains good quality but
can take up more file space.

• Understand how data storage is measured.

DATA STORAGE MEASUREMENTS


Data storage is measured in units of bytes, with larger units being used to represent
larger storage capacities. Modern computer storage devices such as hard disk drives,
solid state drives, and USB flash drives have storage capacities that range from a few
gigabytes (GB) to multiple terabytes (TB).
Storage capacity can be expressed in units such as megabytes (MB), gigabytes (GB),
terabytes (TB), and so on. Strictly, these units are based on powers of 10. Because
computers work in powers of 2, a binary prefix system was introduced, which uses
prefixes such as kibi-, mebi-, gibi- and tebi- to represent binary multiples of bytes.
For example, a kibibyte (KiB) represents 1024 bytes, while a kilobyte (kB) represents
1000 bytes.
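
The gap between the decimal and binary units grows as the units get larger. As a rough
illustration, the sketch below works out how many tebibytes are in a drive sold as 1
terabyte. The dictionary names are invented for the example.

```python
# Decimal units are powers of 10, binary units are powers of 2.
DECIMAL = {"kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}
BINARY = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

advertised_bytes = 1 * DECIMAL["TB"]           # a drive sold as "1 TB"
in_tib = advertised_bytes / BINARY["TiB"]      # the same drive measured in TiB

print(BINARY["KiB"] - DECIMAL["kB"], "bytes difference between 1 KiB and 1 kB")
print(round(in_tib, 3), "TiB in 1 TB")
```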

The following data storage notations are listed in order of size and an example use for
each is given:
• Bit: 1 bit represents a single binary digit, which can be either a 0 or a 1. Bits are
used to represent the smallest unit of digital data and are often used in
communication protocols and encryption algorithms.
• Nibble: A nibble represents 4 bits of data, or half a byte. Nibbles are not
commonly used in modern computing but were used in older systems to
represent hexadecimal values.
• Byte: A byte represents 8 bits of data and is the basic unit of storage in most
computer systems. Bytes are used to represent individual characters, such as
letters and numbers, as well as larger data types such as images and audio.
• Kibibyte (KiB): 1 KiB represents 1024 bytes of data. KiB is used to represent small
to medium-sized files, such as text documents or low-resolution images.
• Mebibyte (MiB): 1 MiB represents 1024 KiB, or 1,048,576 bytes of data. MiB is used
to represent larger files, such as high-resolution images or short audio clips.
• Gibibyte (GiB): 1 GiB represents 1024 MiB, or 1,073,741,824 bytes of data. GiB is
used to represent even larger files, such as videos or large software applications.
• Tebibyte (TiB): 1 TiB represents 1024 GiB, or 1,099,511,627,776 bytes of data. TiB is
used to represent very large files, such as high-resolution videos or large
databases.
• Pebibyte (PiB): 1 PiB represents 1024 TiB, or 1,125,899,906,842,624 bytes of data.
PiB is used to represent data storage at the petabyte level, such as in large data
centers or cloud storage services.
• Exbibyte (EiB): 1 EiB represents 1024 PiB, or 1,152,921,504,606,846,976 bytes of
data. EiB is used to represent data storage at the exabyte level, such as in
scientific research or big data applications.
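
The list above can be turned into a small helper that picks a suitable binary unit for any
number of bytes. This is a sketch only; the name format_size and the choice of two decimal
places are assumptions made for the example.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def format_size(num_bytes: int) -> str:
    """Divide by 1024 until the value is below 1024, then attach the unit."""
    size = float(num_bytes)
    for unit in UNITS:
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024


print(format_size(512))
print(format_size(1024))
print(format_size(1_073_741_824))
```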
CALCULATING IMAGE FILE SIZE
IMAGE 1 : 8 x 8 pixels IMAGE 2 : 16 x 16 pixels

To calculate the file size of the two images above, let's assume a colour depth of 8 bits
for both images, even though only two colours have been used.
Image 1 calculation:
First, calculate the total number of pixels used in the image.
8 pixels wide by 8 pixels tall: 8 x 8 = 64 (a total of 64 pixels represent the entire
image).
Second, multiply the total number of pixels by the colour depth (8 bits represent the
content of each pixel).
64 pixels in total multiplied by 8 (colour depth): 64 x 8 = 512
Answer: Image 1 has a file size of 512 bits

Image 2 calculation:
16 pixels by 16 pixels: 16 x 16 = 256
256 pixels x 8 bit colour depth: 256 x 8 = 2048
Answer: Image 2 has a file size of 2048 bits

In practice images are much larger and the pixel density is much greater than in the two
examples above.
IMAGE 3: Dimensions 1600 pixels x 1200 pixels : Bit Depth 24
Follow the same steps to calculate the file size of image 3, this time converting the
file size from bits to megabytes.
Step 1: Multiply the number of horizontal pixels by the number of vertical pixels: 1600 x
1200 = 1,920,000 pixels
Step 2: Multiply the total pixels used by the colour/bit depth: 1,920,000 x 24 =
46,080,000 bits
Step 3: Divide the answer from step 2 by 8 to give the number of bytes used: 46,080,000 /
8 = 5,760,000 bytes
Step 4: Divide the number of bytes used by 1024 to give the value in kilobytes:
5,760,000 / 1024 = 5625 kilobytes
Step 5: Divide the number of kilobytes used by 1024 to give the value in megabytes: 5625
/ 1024 = 5.49 megabytes (to 2 decimal places)
The original image 3 has a file size of about 5.49 MB. This value excludes any metadata
and is the file size before the image is compressed for use over the internet.
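
The steps for all three images can be checked with a short script. The function name
image_file_size_bits is made up for this sketch, and the sizes exclude metadata, as in the
worked examples.

```python
def image_file_size_bits(width: int, height: int, colour_depth: int) -> int:
    """Total pixels multiplied by the number of bits used for each pixel."""
    return width * height * colour_depth


image1_bits = image_file_size_bits(8, 8, 8)
image2_bits = image_file_size_bits(16, 16, 8)
image3_bits = image_file_size_bits(1600, 1200, 24)

# Convert image 3 from bits to bytes, kilobytes and megabytes.
image3_bytes = image3_bits / 8
image3_kb = image3_bytes / 1024
image3_mb = image3_kb / 1024

print(image1_bits, image2_bits, image3_bits)
print(image3_bytes, image3_kb, round(image3_mb, 2))
```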

• Understand the purpose of and need for data compression.

WHAT IS DATA COMPRESSION

Many files such as images, videos and even text documents can take up large amounts of
memory, meaning an increased need for storage space and slower transfer speeds due to
file size. This has led to the need for data compression. Data compression is the process
of making files require less memory to store. When people talk about 'file size' they are
usually referring to the memory required to store the file and not the physical size of the
document or file.

There are two main methods of file compression, lossy and lossless. Each type of file
compression has its benefits and disadvantages.
Some reasons for compression are:

✓ Compression makes the file size smaller so less space is needed to store the file
✓ Compression makes the file size smaller so files transfer faster over a network such as
the internet
✓ Compression makes the file size smaller which helps with file streaming

• Understand how files are compressed using lossy and lossless compression
methods.

LOSSY COMPRESSION
The key element of Lossy compression is that the file will lose quality when it is
compressed. The loss of quality is not important for many files and in many cases we do
not even notice the reduction in quality. Some key points on Lossy compression are:

✓ Lossy compression reduces the file size by removing some of the data, because of this
an exact match of the original data cannot be recreated. Quality is lost.

✓ Lossy compression uses an algorithm that looks to remove detail that is barely
noticeable, for example if pixels next to each other in an image are almost the same
colour then the Lossy algorithm will give them the same value to reduce the bytes
needed to store the detail.

✓ Lossy compression is often used on sound and image files such as MP3s and JPGs

✓ Lossy compression is often not a good option for files such as text documents

✓ Lossy compression can make file sizes smaller than is possible with Lossless
compression
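
The idea of giving almost identical neighbouring pixels the same value can be shown with a
very simple sketch. This is not how JPEG or MP3 actually work; it only illustrates that
detail is thrown away and cannot be recovered. The function name quantise, the step size of
16 and the sample pixel values are all invented for the example.

```python
def quantise(pixels: list[int], step: int = 16) -> list[int]:
    """Round each 8-bit grey value down to the nearest multiple of step."""
    return [(p // step) * step for p in pixels]


row = [200, 201, 203, 202, 90, 91, 89, 92]   # eight slightly different greys
lossy_row = quantise(row)

print(lossy_row)
print(len(set(row)), "distinct values before,", len(set(lossy_row)), "after")
```

After this step the row contains only two distinct values, so it needs less data to store,
but the original eight values cannot be rebuilt from it.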
LOSSLESS COMPRESSION
The key element to lossless compression is that no quality is lost during the process of
compression. Lossless compression is used when it is important to maintain the original
quality. Some key points of Lossless compression are:

✓ Lossless compression will not remove any quality from the file; the version restored
from the compressed file will be identical to the original.
✓ Lossless compression uses an algorithm that looks for repeated data. The repeats are
grouped and categorised, and a token is stored for each place a group is used in the
reconstruction

✓ Lossless compression is often used on files such as text files and images such as
DOCXs, GIFs and PNGs

✓ Lossless compression is often not a good option for audio files and high colour images

✓ Lossless compression is more limited than Lossy compression with how small the file
size can be made
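
A quick way to see that nothing is lost is to compress some data, decompress it and compare
the result with the original. The sketch below uses Python's standard zlib module as one
example of a lossless compressor; the sample data is made up.

```python
import zlib

original = b"AAAABBBBCCCC" * 100          # highly repetitive data, 1200 bytes

compressed = zlib.compress(original)      # lossless compression
restored = zlib.decompress(compressed)    # rebuild the data from the compressed form

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical to original:", restored == original)
```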
COMPRESSION METHODS
Run Length Encoding (RLE)
Run Length Encoding is a method of compression that looks for runs of repeated values
and encodes each run as a single value together with the length of the run.

Take the top row of the image: it has 8 white pixels. The second row has 1 white pixel,
2 red pixels, 2 white pixels and so on. An uncompressed representation of the image
stores each pixel individually. If this is an 8 bit image, the top row has 8 pixels with
8 bits used to represent the colour of each pixel, so it takes 8 pixels x 8 bits = 64
bits to represent the 8 white pixels. With run length encoding we can simply encode this
row as 8 white pixels, written 8W. Using 8 bits to represent the length of the run and 8
bits to represent the colour, the top row is compressed from 64 bits to just 16 bits.
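
A minimal sketch of run length encoding is shown below. The function names are invented for
the example, and only the pixels described in the text are used: the full top row and the
first five pixels of the second row.

```python
def rle_encode(pixels: list[str]) -> list[tuple[int, str]]:
    """Turn a row of pixels into (run length, colour) pairs."""
    runs: list[list] = []
    for pixel in pixels:
        if runs and runs[-1][1] == pixel:
            runs[-1][0] += 1              # same colour as before: extend the run
        else:
            runs.append([1, pixel])       # new colour: start a new run
    return [(count, colour) for count, colour in runs]


def rle_decode(runs: list[tuple[int, str]]) -> list[str]:
    """Rebuild the original row from the (run length, colour) pairs."""
    pixels = []
    for count, colour in runs:
        pixels.extend([colour] * count)
    return pixels


top_row = ["W"] * 8                        # 8 white pixels
second_row_start = ["W", "R", "R", "W", "W"]

encoded_top = rle_encode(top_row)
print(encoded_top)
print(rle_encode(second_row_start))

# 8 bits per pixel uncompressed; 8 bits for the length plus 8 for the colour per run.
print(len(top_row) * 8, "bits uncompressed,", len(encoded_top) * 16, "bits encoded")
```

Decoding the pairs gives back exactly the original row, which is why run length encoding
counts as a lossless method.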
END OF CHAPTER QUESTIONS
1:What is a character set?
A) A set of fonts used by a particular software program
B) A set of colours used in graphic design
C) A set of characters and symbols assigned a unique binary code that can be understood
by a computer
D) A set of keyboard shortcuts for commonly used commands

2: What is the American Standard Code for Information Interchange (ASCII)?


A) A set of computer hardware standards
B) A type of computer virus
C) A character set used for representing text
D) A programming language used for web development

3: Why was Unicode developed?


A) To provide a standardized way of representing text across different languages and
scripts
B) To provide a more efficient way of compressing digital images
C) To provide a faster way of processing mathematical equations
D) To provide a more secure way of encrypting data

4: What is the difference between ASCII and Unicode?


A) ASCII is a universal character set, while Unicode is only used for representing English
text
B) ASCII uses 8 bits to represent characters, while Unicode uses 16 bits
C) ASCII is a newer character set than Unicode
D) ASCII can only represent characters from the English language, while Unicode can
represent characters from any language or script

5: Why is the use of character sets important in computing?


A) It enables computers to store and display text in a standardized way, regardless of the
language or alphabet used
B) It makes it easier for programmers to write code
C) It helps to prevent computer viruses and malware
D) It allows computers to perform mathematical calculations more efficiently
1: Fill in the blanks
A microphone captures sound in an analogue format, before the computer can represent
this a .............................. is needed to convert the sound to a .................... format.
2: Describe what is meant by the term 'sampling'.
3: Explain why the sample rate affects the quality of the sound.
4: Describe what is meant by the term 'bit depth'.
5: Describe what is meant by the term 'channel'.
SOUND FILE SIZE
1: Calculate the file size for the following sound sample, give the size in MB
• Mono
• Sample rate: 42000
• Bit Depth: 16bit
• Time: 60 seconds
2: Calculate the file size for the following sound sample, give the size in MB
• Stereo
• Sample rate: 20000
• Bit Depth: 16bit
• Time: 120 seconds
3: What happens to the file size as more channels are represented?

4: Calculate the file size for the following sound sample, give the size in MB
• 4 channel
• Sample rate: 44100
• Bit Depth: 16
• Time: 180

1: Explain what is meant by the term ‘pixel’?


2: What is meant by the term ‘megapixel’?
3: What is meant by the term 'colour depth'?
4: Suggest two things that happen when you increase the colour depth.
5: What is meant by the term ‘image resolution’?
6: Suggest two things that happen when you increase the image resolution.
7: What is a Vector Graphic?
8: What is metadata?
9: Give an example of information that could be stored in metadata.

1: What is the size of a byte in bits?


a) 4 bits
b) 8 bits
c) 16 bits
d) 32 bits

2: Which of the following is a larger unit of storage than a gigabyte?


a) Megabyte
b) Terabyte
c) Kilobyte
d) Petabyte

3: How many bytes are in a mebibyte (MiB)?


a) 1024 bytes
b) 1000 bytes
c) 1,048,576 bytes
d) 1,000,000 bytes

4: What is the next largest storage unit after a tebibyte (TiB)?


a) Zebibyte (ZiB)
b) Pebibyte (PiB)
c) Exbibyte (EiB)
d) Yobibyte (YiB)

5: How many bits are in a nibble?


a) 1 bit
b) 2 bits
c) 4 bits
d) 8 bits

1: An image has the following properties:


• Dimensions 1024 pixels x 575 pixels
• Bit Depth 24
Calculate the image size, show your workings at each stage and display the result in MB.

2: An image has the following properties:


• Dimensions 12 inch x 11 inch
• Bit Depth 16
• 72 pixels per inch
Calculate the image size, show your workings at each stage and display the result in MB.

3: Besides the size and bit depth of an image, what else could have an impact on the total
file size?
4: Give two situations where reducing the file size of an image is needed, and explain the
impact of doing this.
1: What is the main difference between lossy and lossless compression?
a) Lossy compression retains all original data
b) Lossless compression reduces file size by permanently discarding data
c) Lossy compression reduces file size by discarding some data that can be reconstructed
d) Lossless compression makes no changes to the original data

2: Which of the following is an example of lossless compression?


a) JPEG
b) MP3
c) ZIP
d) MPEG

3: Which of the following compression methods is best suited for compressing text
documents?
a) Lossy compression
b) Lossless compression
c) Run-length encoding
d) Huffman coding

4: What is run-length encoding?


a) A type of lossy compression that discards some data
b) A type of lossless compression that compresses repeating sequences of data
c) A type of compression that only works on images
d) A type of encryption that scrambles data to prevent unauthorized access

5: Which of the following is an example of lossy compression?


a) PNG
b) TIFF
c) WAV
d) MP4

6: How does lossy compression achieve smaller file sizes?


a) By discarding some data that cannot be reconstructed
b) By compressing repeating sequences of data
c) By encrypting the data
d) By using an algorithm to reorder the data

7: Which of the following is an advantage of lossless compression over lossy
compression?
a) Smaller file sizes
b) Retains all original data
c) Less processing power required
d) Better suited for compressing multimedia files

8: Which compression method is most commonly used for compressing digital images?
a) Lossy compression
b) Lossless compression
c) Run-length encoding
d) Huffman coding
