
Chapter 1 Data Representation

This document provides an overview of key concepts in data representation, including: 1) All data needs to be converted to binary to be processed by computers, using number systems like binary, denary, and hexadecimal. 2) Converting between number systems, performing binary operations like addition and shifting, and representing negative numbers using two's complement. 3) Representing text using character sets like ASCII and Unicode, and representing sound by sampling waves and converting them to binary data.

Uploaded by

Muhammad Aoun
Data Representation
Syllabus
1.1 Number Systems
1 Understand how and why computers use binary to represent all forms of data
• Any form of data needs to be converted to binary to be processed by a computer
• Data is processed using logic gates and stored in registers
2 (a) Understand the denary, binary and hexadecimal number systems
• Denary is a base 10 system
• Binary is a base 2 system
• Hexadecimal is a base 16 system
(b) Convert between
(i) Positive denary and positive binary
(ii) Positive denary and positive hexadecimal
(iii) Positive hexadecimal and positive binary
• Values used will be integers only
• Conversions in both directions, e.g., denary to binary or binary to denary
• Maximum binary number length of 16-bit
3 Understand how and why hexadecimal is used as a beneficial method of data representation
• Areas within computer science that hexadecimal is used should be identified
• Hexadecimal is easier for humans to understand than binary, as it is a shorter representation of the binary
4 (a) Add two positive 8-bit binary integers
(b) Understand the concept of overflow and why it occurs in binary addition
• An overflow error will occur if the value is greater than 255 in an 8-bit register
• A computer or a device has a predefined limit that it can represent or store, for example 16-bit
• An overflow error occurs when a value outside this limit should be returned
5 Perform a logical binary shift on a positive 8-bit binary integer and understand the effect this has on the positive binary integer
• Perform logical left shifts
• Perform logical right shifts
• Perform multiple shifts
• Bits shifted from the end of the register are lost and zeros are shifted in at the opposite end of the register
• The positive binary integer is multiplied or divided according to the shift performed
• The most significant bit(s) or least significant bit(s) are lost
6 Use two's complement to represent positive and negative 8-bit binary integers
• Convert a positive binary or denary integer to a two's complement 8-bit integer and vice versa
• Convert a negative binary or denary integer to a two's complement 8-bit integer and vice versa
1.2 Text, Sound & Images
1 Understand how and why a computer represents text and the use of character sets, including American standard code for information interchange (ASCII) and Unicode
• Text is converted to binary to be processed by a computer
• Unicode allows for a greater range of characters and symbols than ASCII, including different languages and emojis
• Unicode requires more bits per character than ASCII
2 Understand how and why a computer represents sound, including the effects of the sample rate and sample resolution
• A sound wave is sampled for sound to be converted to binary, which is processed by a computer
• The sample rate is the number of samples taken in a second
• The sample resolution is the number of bits per sample
• The accuracy of the recording and the file size increases as the sample rate and resolution increase
3 Understand how and why a computer represents an image, including the effects of the resolution and color depth
• An image is a series of pixels that are converted to binary, which is processed by a computer
• The resolution is the number of pixels in the image
• The color depth is the number of bits used to represent each color
• The file size and quality of the image increases as the resolution and color depth increase
1.3 Data Storage & Compression
1 Understand how data storage is measured
• Including:
  - bit
  - nibble
  - byte
  - kibibyte (KiB)
  - mebibyte (MiB)
  - gibibyte (GiB)
  - tebibyte (TiB)
  - pebibyte (PiB)
  - exbibyte (EiB)
• The amount of the previous denomination present in the data storage size, e.g.,
  - 8 bits in a byte
  - 1024 mebibytes in a gibibyte
2 Calculate the file size of an image file and a sound file, using information given
• Answers must be given in the units specified in the question
• Information given may include:
  - Image resolution and color depth
  - Sound sample rate, resolution and length of track
3 Understand the purpose of and need for data compression
• Compression exists to reduce the size of the file
• The impact of this is, e.g.,
  - Less bandwidth required
  - Less storage space required
  - Shorter transmission time
4 Understand how files are compressed using lossy and lossless compression methods
• Lossless compression reduces the file size without permanent loss of data, e.g. run length encoding (RLE)
• Lossy compression reduces the file size by permanently removing data, e.g., reducing resolution or color depth, reducing sample rate or resolution

1.1 Number Systems


Binary Representation
The binary number system is the fundamental building block of every computer. It was chosen because it contains only 1s and 0s. Data in a computer can be represented in binary because a computer contains many millions of tiny "switches", each of which must be in either the ON or the OFF position. A switch in the ON position is represented by 1, and a switch in the OFF position by 0.
Binary System
The binary number system has only two digits, 0 and 1. It is also known as the base 2 number system, which is why each place value is a power of 2.
2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128   64    32    16    8     4     2     1
1     0     1     1     1     0     0     1

Denary Number System
The denary number system uses the digits 0 – 9 and their combinations. It is also known as the base 10 number system. It counts in multiples of 10, such as 10, 100, 1000 and so on.
10^4    10^3   10^2   10^1   10^0
10000   1000   100    10     1
3       5      7      4      2

Binary To Denary Conversion

To convert binary to denary, multiply each binary digit by its place value (a power of 2) and add the results.

For Example: (10110111)_2 → (?)_10

2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128   64    32    16    8     4     2     1
1     0     1     1     0     1     1     1

1 × 2^7 + 0 × 2^6 + 1 × 2^5 + 1 × 2^4 + 0 × 2^3 + 1 × 2^2 + 1 × 2^1 + 1 × 2^0

1 × 128 + 0 × 64 + 1 × 32 + 1 × 16 + 0 × 8 + 1 × 4 + 1 × 2 + 1 × 1
128 + 0 + 32 + 16 + 0 + 4 + 2 + 1 = 183

(10110111)_2 → (183)_10
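The place-value method above can be sketched in a few lines of Python. The function name `binary_to_denary` is a hypothetical helper chosen for this illustration, not part of any library.

```python
def binary_to_denary(bits: str) -> int:
    """Add up the place value (a power of 2) of every bit that is 1."""
    total = 0
    # reversed() lets position 0 be the right-most bit (place value 2^0)
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(binary_to_denary("10110111"))  # 183
```

Python's built-in `int("10110111", 2)` gives the same answer and can be used to check work.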
Denary to Binary Conversion
There are two methods to convert a denary number to a binary number.
Method 1
In this method, repeatedly subtract the largest possible power of 2 until the value 0 is reached. Write 1 under each power of 2 that was subtracted and 0 under the others. This gives the following 8-bit binary number:

For Example: (152)_10 → (?)_2

152 – 128 = 24; 24 – 16 = 8; 8 – 8 = 0
2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128   64    32    16    8     4     2     1
1     0     0     1     1     0     0     0

(152)_10 → (10011000)_2
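Method 1 can be sketched as a loop over the place values from largest to smallest. The name `denary_to_binary_subtract` and the `width` parameter are assumptions made for this example.

```python
def denary_to_binary_subtract(value: int, width: int = 8) -> str:
    """Subtract the largest possible powers of 2, recording 1 or 0 for each."""
    if not 0 <= value < 2 ** width:
        raise ValueError("value does not fit in the given number of bits")
    bits = ""
    for position in range(width - 1, -1, -1):
        place_value = 2 ** position
        if value >= place_value:
            bits += "1"           # this power of 2 is subtracted
            value -= place_value
        else:
            bits += "0"           # this power of 2 is too large
    return bits


print(denary_to_binary_subtract(152))  # 10011000
```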

Method 2
In this method we successively divide the number by 2, writing down the remainder each time, until the number can no longer be divided. Read the final result and then the remainders from bottom to top.

For Example: (173)_10 → (?)_2

Divide by   Number   Remainder
2           173      1
2           86       0
2           43       1
2           21       1
2           10       0
2           5        1
2           2        0
            1
(173)_10 → (10101101)_2
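The repeated-division method can be sketched as follows. The function name is hypothetical. Unlike the table, this sketch keeps dividing until the quotient is 0, so the final 1 is collected as a remainder too; the bits come out the same.

```python
def denary_to_binary_divide(value: int) -> str:
    """Divide by 2 repeatedly; the remainders, read last to first, are the bits."""
    if value == 0:
        return "0"
    remainders = []
    while value > 0:
        value, remainder = divmod(value, 2)
        remainders.append(str(remainder))
    # The first remainder is the least significant bit, so reverse the list
    return "".join(reversed(remainders))


print(denary_to_binary_divide(173))  # 10101101
```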

Hexadecimal System
The hexadecimal number system is based on 16 different digits (0 – 9, A – F) and is also known as base 16. The letters A to F represent the values 10 to 15, i.e. A = 10; B = 11; C = 12; D = 13; E = 14; F = 15.
16^4    16^3   16^2   16^1   16^0
65536   4096   256    16     1
3       2      A      C      5

Since 16 = 2^4, 1 hex digit is equal to 4 binary digits.

Binary   Denary   Hexadecimal
0000     0        0
0001     1        1
0010     2        2
0011     3        3
0100     4        4
0101     5        5
0110     6        6
0111     7        7
1000     8        8
1001     9        9
1010     10       A
1011     11       B
1100     12       C
1101     13       D
1110     14       E
1111     15       F

Binary to Hexadecimal
To convert a binary number to hexadecimal, split the bits into groups of 4, starting from the right, and convert each group into one hexadecimal digit. If the number of bits is not a multiple of 4, add 0s to the left.

For Example: (101111100010)_2 → (?)_16

1 0 1 1          | 1 1 1 0          | 0 0 1 0
8 4 2 1          | 8 4 2 1          | 8 4 2 1
8 + 2 + 1 = 11   | 8 + 4 + 2 = 14   | 2
B                | E                | 2

(101111100010)_2 → (BE2)_16
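The grouping step can be sketched in Python as below. `HEX_DIGITS` and `binary_to_hex` are names made up for this illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"


def binary_to_hex(bits: str) -> str:
    """Pad to a multiple of 4 bits, then convert each group of 4 to one hex digit."""
    padding = (-len(bits)) % 4          # number of 0s needed on the left
    bits = "0" * padding + bits
    digits = []
    for start in range(0, len(bits), 4):
        group = bits[start:start + 4]
        digits.append(HEX_DIGITS[int(group, 2)])
    return "".join(digits)


print(binary_to_hex("101111100010"))  # BE2
print(binary_to_hex("111110"))        # 3E (padded to 00111110 first)
```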
Hexadecimal to Binary
To convert hexadecimal to binary, take each hexadecimal digit and write down the 4-bit code corresponding to the digit.

For Example: (D9F)_16 → (?)_2

D         | 9         | F
8 4 2 1   | 8 4 2 1   | 8 4 2 1
1 1 0 1   | 1 0 0 1   | 1 1 1 1

(D9F)_16 → (110110011111)_2
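As a minimal sketch of this digit-by-digit expansion (the function name is an assumption for the example), each hex digit is converted to its value and then formatted as exactly 4 bits:

```python
def hex_to_binary(hex_digits: str) -> str:
    """Replace every hex digit with its 4-bit binary code."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)


print(hex_to_binary("D9F"))  # 110110011111
```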
Hexadecimal to Denary
To convert hexadecimal to denary, multiply each hexadecimal digit by its place value (a power of 16) and add the results.

For Example: (8BE)_16 → (?)_10

8                | B                | E
16^2             | 16^1             | 16^0
256              | 16               | 1
8 × 256 = 2048   | 11 × 16 = 176    | 14 × 1 = 14

2048 + 176 + 14 = 2238

(8BE)_16 → (2238)_10
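The same place-value idea, now with powers of 16, can be sketched as follows. `hex_to_denary` is a hypothetical name.

```python
def hex_to_denary(hex_digits: str) -> int:
    """Multiply each hex digit by its power of 16 and add the results."""
    total = 0
    for position, digit in enumerate(reversed(hex_digits)):
        total += int(digit, 16) * 16 ** position
    return total


print(hex_to_denary("8BE"))  # 2238
```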
Denary to Hexadecimal
To convert a denary number to hexadecimal, divide successively by 16, writing down the remainder each time, until the result is less than 16. Read the final result and then the remainders from bottom to top to get the hexadecimal number.

For Example: (2334)_10 → (?)_16

Divide by   Number   Remainder
16          2334     14 = E
16          145      1
            9
(2334)_10 → (91E)_16
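A short sketch of the successive-division method, using 2334 (the value divided in the table above) as the input. The helper name and the `HEX_DIGITS` lookup string are assumptions for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"


def denary_to_hex(value: int) -> str:
    """Divide by 16 repeatedly; the remainders, read last to first, are the digits."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        value, remainder = divmod(value, 16)
        digits.append(HEX_DIGITS[remainder])   # e.g. remainder 14 becomes E
    return "".join(reversed(digits))


print(denary_to_hex(2334))  # 91E
```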

Uses of Hexadecimal System
Hexadecimal is easier to use than binary because 1 hex digit = 4 binary digits.
Internet Protocol (IP) Address
Each device connected to a network is given a unique address known as the Internet Protocol (IP) address. An IPv4 address is a 32-bit number written in denary or hexadecimal form, e.g. 109.108.158.1 (or 6D.6C.9E.01 in hex). IPv4 is being replaced by IPv6. An IPv6 address is a 128-bit number broken down into 16-bit chunks, each represented by a hexadecimal number. For example: FDEC:BA98:7654:3210:ADFC:BDFF:2990:FFFF
Error Codes
Hex is often used in error messages on your computer. The hex number refers to the memory
location (In computing, this is an address in the primary memory where data values are
stored) of the error. This helps programmers to find and then fix problems.
Media Access Control (MAC) Address
All network adapters and network devices have a Media Access Control (MAC) address. This is also known as the 'physical address' and is a unique address determined during the manufacture of each device. This address is given as a set of 6 pairs of hexadecimal digits. An example of a MAC address would be: A0-1D-48-FE-5E-F5. You can determine the physical address of the network adapters in a computer running the Windows operating system by typing the following command into a command prompt: ipconfig /all.
Hyper Text Markup Language (HTML) Color Code
HTML is a markup language used to define the structure and presentation of text, such as its size, color, bold, italic etc. It uses <tags>.
Hexadecimal is often used in HTML to represent the colors of text on the computer screen. All colors can be made up of different combinations of the three primary colors (red, green and blue). The intensity of each color is given by a hexadecimal value, so different hexadecimal values represent different colors using the format #RRGGBB. The # symbol indicates that the number is written in hex. Two hex digits represent each of the three primary colors, e.g. #FA6200.
There are 256 possible values for red, 256 for green and 256 for blue, giving a total of 256 × 256 × 256 (i.e. 16 777 216) possible colors.
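To illustrate the #RRGGBB format, the sketch below splits a color code into its red, green and blue intensities. `parse_html_color` is a hypothetical helper written for this example, not an HTML or browser API.

```python
def parse_html_color(code: str) -> tuple[int, int, int]:
    """Split '#RRGGBB' into three denary intensities, each from 0 to 255."""
    if len(code) != 7 or not code.startswith("#"):
        raise ValueError("expected a color code in the form #RRGGBB")
    red = int(code[1:3], 16)
    green = int(code[3:5], 16)
    blue = int(code[5:7], 16)
    return red, green, blue


print(parse_html_color("#FA6200"))  # (250, 98, 0)
print(256 ** 3)                     # 16777216 possible colors
```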

Addition of Binary Digits
Two Bit Addition
1st Bit 2nd Bit Bit Addition Sum Carry
0 0 0+0 0 0
0 1 0+1 1 0
1 0 1+0 1 0
1 1 1+1 0 1

Three Bit Addition


1st Bit 2nd Bit 3rd Bit Bit Addition Sum Carry
0 0 0 0+0+0 0 0
0 0 1 0+0+1 1 0
0 1 0 0+1+0 1 0
0 1 1 0+1+1 0 1
1 0 0 1+0+0 1 0
1 0 1 1+0+1 0 1
1 1 0 1+1+0 0 1
1 1 1 1+1+1 1 1
Example: Add (11001110)_2 + (00100011)_2

1st no    1 1 0 0 1 1 1 0
2nd no    0 0 1 0 0 0 1 1
Sum       1 1 1 1 0 0 0 1
Carry     0 0 0 0 1 1 1 0

Each carry is written under the column that produced it and is added into the column to its left. In denary this is 206 + 35 = 241.

Overflow
If the addition generates an extra bit (a 9th bit), the sum has exceeded the largest value that 8 bits can hold (255). This is known as an overflow error: the number is too large to be stored in the computer using 8 bits.
Example: Add 111 + 223, i.e. (01101111)_2 + (11011111)_2
1st no      0 1 1 0 1 1 1 1
2nd no      1 1 0 1 1 1 1 1
Sum       1 0 1 0 0 1 1 1 0
Carry       1 1 1 1 1 1 1 1
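Column-by-column addition with a carry, and the overflow check, can be sketched as below. The function name `add_8bit` and its return shape (the 8-bit result plus an overflow flag) are choices made for this illustration.

```python
def add_8bit(a: str, b: str) -> tuple[str, bool]:
    """Add two 8-bit binary strings from right to left, carrying as needed.

    Returns the 8-bit sum and True if a 9th bit (overflow) was generated.
    """
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry
        result.append(str(column % 2))   # sum bit for this column
        carry = column // 2              # carry into the next column
    return "".join(reversed(result)), carry == 1


print(add_8bit("11001110", "00100011"))  # ('11110001', False)  206 + 35 = 241
print(add_8bit("01101111", "11011111"))  # ('01001110', True)   111 + 223 = 334
```

In the second call the true sum, 334, needs 9 bits, so only the lower 8 bits remain in the result and the flag reports the overflow.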

Logical Binary Shifts
Computers can carry out a logical shift on a sequence of binary digits. A logical shift means moving the binary number to the left or to the right. Each shift left is equivalent to multiplying the binary number by 2 and each shift right is equivalent to dividing the binary number by 2. Any place left empty is filled with 0. A number can only be shifted so far, however: bits shifted out of the register are lost, so after enough shifts the result is in error, and eventually the register contains only 0s.
Example: (00011111)_2 = (31)_10
Most Significant Bit is the left-most bit.
128 64 32 16 8 4 2 1
0 0 0 1 1 1 1 1

One place shift to the left (the empty place left after the shift is filled with 0)

128 64 32 16 8 4 2 1
0 0 1 1 1 1 1 0

(00111110)_2 = (62)_10 = 31 × 2^1

Two places shift to the left

128 64 32 16 8 4 2 1
0 1 1 1 1 1 0 0

(01111100)_2 = (124)_10 = 31 × 2^2

Three places shift to the left

128 64 32 16 8 4 2 1
1 1 1 1 1 0 0 0

(11111000)_2 = (248)_10 = 31 × 2^3

Four places shift to the left

128 64 32 16 8 4 2 1
1 1 1 1 0 0 0 0

(11110000)_2 = (240)_10 ≠ 31 × 2^4
The left-most 1-bit has been lost, which means the value has exceeded the 8-bit limit and the result is in error.
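The shifts above can be reproduced with simple string operations on a fixed-width register. The names `shift_left` and `shift_right` are hypothetical helpers for this sketch.

```python
def shift_left(bits: str, places: int = 1) -> str:
    """Logical left shift: zeros enter on the right, left-most bits are lost."""
    width = len(bits)
    return (bits + "0" * places)[-width:]


def shift_right(bits: str, places: int = 1) -> str:
    """Logical right shift: zeros enter on the left, right-most bits are lost."""
    width = len(bits)
    return ("0" * places + bits)[:width]


start = "00011111"  # 31
for places in range(1, 5):
    shifted = shift_left(start, places)
    print(places, shifted, int(shifted, 2), 31 * 2 ** places)
# The fourth line shows 240 against 496: a 1-bit was lost from the register.

print(shift_right(start), int(shift_right(start), 2))  # 00001111 15
```

The right shift gives 15 rather than 15.5 because the least significant bit is lost, so a right shift divides by 2 and discards any remainder.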

Two's Complement
To represent negative numbers, the most significant bit is given a negative place value.
Example: (11000110)_2

-128 64 32 16 8 4 2 1
1 1 0 0 0 1 1 0

-128 + 64 + 4 + 2 = (-58)_10

To find the two's complement (the negative) of a number, invert all the bits and then add 1 to the LSB (Least Significant Bit).

Example: (01111110)_2 = (126)_10

Step 1: Invert all the bits

-128 64 32 16 8 4 2 1
0 1 1 1 1 1 1 0
1 0 0 0 0 0 0 1

Step 2: Add 1 to LSB

-128 64 32 16 8 4 2 1
1 0 0 0 0 0 0 1
+ 1
1 0 0 0 0 0 1 0

(10000010)_2 = (-126)_10
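Both steps, reading a two's complement value and negating by invert-and-add-one, can be sketched as follows. The function names are assumptions made for this example.

```python
def twos_complement_value(bits: str) -> int:
    """Read a two's complement number: the left-most bit has a negative place value."""
    width = len(bits)
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 2 ** width   # turns +128 into -128 for an 8-bit number
    return value


def negate(bits: str) -> str:
    """Invert every bit, then add 1 to the least significant bit."""
    width = len(bits)
    inverted = "".join("1" if bit == "0" else "0" for bit in bits)
    return format((int(inverted, 2) + 1) % 2 ** width, f"0{width}b")


print(twos_complement_value("11000110"))  # -58
print(negate("01111110"))                 # 10000010
print(twos_complement_value("10000010"))  # -126
```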
1.2 Text, Sound & Images


Character Sets
Every word is made up of symbols or characters. When you press a key on a keyboard, a
number is generated that represents the symbol for that key. This is called a character code.
A complete collection of characters is a character set.
Text and numbers can be encoded in a computer as patterns of binary digits. Hexadecimal
is a shortcut for representing binary. ASCII and Unicode are important character sets that
are used as standard.
ASCII
The ASCII character set is a 7-bit set of codes that allows 128 different characters. That is
enough for every upper-case letter, lower-case letter, digit and punctuation mark on most
keyboards. ASCII is only used for the English language.
Extended ASCII uses 8-bit codes (0 to 255 in denary or 0 to FF in hexadecimal). This gives another 128 codes to allow for characters in non-English alphabets and for some graphical characters to be included.

Unicode
Unicode can represent all languages of the world, and is supported by many operating systems, search engines and internet browsers used globally. There is overlap with standard ASCII code, since the first 128 (English) characters are the same, but Unicode can support far more characters in total (well over 100,000). It uses up to four bytes per character.
The Unicode Consortium was set up in 1991. Version 1.0 was published with five goals; these were to:
• Create a universal standard that covered all languages and all writing systems
• Produce a more efficient coding system than ASCII
• Adopt uniform encoding where each character is encoded as 16-bit or 32-bit code
• Create unambiguous encoding where each 16-bit and 32-bit value always represents the same character
• Reserve part of the code for private use to enable a user to assign codes for their own characters and symbols (useful for Chinese and Japanese character sets, for example).

Representation of Sound
Sound waves are vibrations in the air. The human ear senses these vibrations and interprets them as sound. Each sound wave has a frequency, wavelength and amplitude. The amplitude specifies the loudness of the sound.
Sound waves vary continuously. This means that sound is analogue. Computers cannot work with analogue data, so sound waves need to be sampled in order to be stored in a computer. Sampling means measuring the amplitude of the sound wave. This is done using an analogue to digital converter (ADC).
To convert the analogue data to digital, the sound waves are sampled at regular time intervals. The amplitude of the sound cannot be measured precisely, so approximate values are stored.

Sampling Resolution: Number of bits per sample. It is also known as bit depth.
Sampling Rate
It is the number of sound samples taken per second. Measured in Hertz (Hz). 1 Hz means one sample per second.
The higher the sampling rate, the greater the file size.
Pros of a higher sampling rate and resolution
1. Larger dynamic range
2. Better sound quality
3. Less sound distortion
Cons of a higher sampling rate and resolution
1. Produces larger file size
2. Takes longer to transmit/download music files
3. Requires greater processing power
CDs have a 16-bit sampling resolution and a 44.1kHz sample rate - that is 44100 samples
every second. This gives high-quality sound reproduction.

Representation of Bitmap Images


Bitmap images are made up of pixels (picture elements); an image is a two-dimensional matrix of pixels. Pixels can take different shapes.
Each pixel can be represented as a binary number, and so a bitmap image is stored in a computer as a series of binary numbers, so that:
• A black and white image only requires 1 bit per pixel – this means that each pixel can be one of two colors, represented by 1 or 0.
• If each pixel is represented by 2 bits, then each pixel can be one of four colors (2^2 = 4), representing the FOUR combinations of 0 & 1 i.e. 00, 01, 10, 11.
• If each pixel is represented by 3 bits, then each pixel can be one of eight colors (2^3 = 8), representing the EIGHT combinations of 0 & 1 i.e. 000, 001, 010, 011, 100, 101, 110, 111.
Color Depth
Each color is represented by a number of bits. An 8-bit color depth means that each pixel can be one of 256 colors (because 2^8 = 256). Modern computers have a 24-bit color depth, which means over 16 million different colors can be represented.
Image Resolution refers to the number of pixels that make up an image; for example, an image contains 4096 × 3072 pixels (12,582,912 pixels in total).

A drawback of using high resolution images is the increase in file size. As the number of pixels used to represent the image is increased, the size of the file will also increase. This affects the time taken to download an image from the internet or to transfer images from device to device. A certain amount of reduction in the resolution of an image is possible before the loss of quality becomes noticeable.

1.3 Data Storage & Compression


Measurement of Data Storage
Bit (Binary Digit): Basic unit of all computing memory storage terms; it is either 1 or 0.
Byte: Smallest addressable unit of memory (1 byte = 8 bits).
Name of Memory Size   Equivalent Denary Value
Kilobyte (KB)         10^3 Bytes
Megabyte (MB)         10^6 Bytes
Gigabyte (GB)         10^9 Bytes
Terabyte (TB)         10^12 Bytes
Petabyte (PB)         10^15 Bytes
Exabyte (EB)          10^18 Bytes

The above system of numbering is still used for some storage devices, but it is technically inaccurate for memory. It is based on the SI (base 10) system of units. A 1 TB hard disk drive would allow the storage of 1 × 10^12 bytes according to this system.
Memory size is measured in terms of powers of 2, so another system has been adopted by the IEC (International Electrotechnical Commission) that is based on the binary system.
Name of Memory Size   Number of Bytes   Equivalent Denary Value
Kibibyte (KiB)        2^10 Bytes        1024 Bytes
Mebibyte (MiB)        2^20 Bytes        1048576 Bytes
Gibibyte (GiB)        2^30 Bytes        1073741824 Bytes
Tebibyte (TiB)        2^40 Bytes        1099511627776 Bytes
Pebibyte (PiB)        2^50 Bytes        1125899906842624 Bytes
Exbibyte (EiB)        2^60 Bytes        1152921504606846976 Bytes

This system is more accurate. Internal memories (such as RAM and ROM) should be measured using the IEC system. A 64 GiB RAM stores 64 × 2^30 bytes of data (68 719 476 736 bytes).
Calculation of File Size
The file size required to hold a bitmap image or a sound sample can be calculated as follows.
File Size of Image (in bits) = Image Resolution (in pixels) × Color Depth (in bits)
Size of Mono Sound File (in bits) = Sample Rate (in Hz) × Sample Resolution (in bits) × Length of Sample (in seconds)
For a stereo sound file, multiply the result by two. Divide a result in bits by 8 to give the size in bytes.
Example
A photograph is 1024 × 1080 pixels and uses a color depth of 32 bits. How many photographs of this size would fit onto a memory stick of 64 GB?
Multiply the number of pixels in the vertical and horizontal directions to find the total number of pixels = 1024 × 1080 = 1 105 920 pixels.
Multiply the number of pixels by the color depth, then divide by 8 to give the number of bytes = 1 105 920 × 32 = 35 389 440 bits; 35 389 440 / 8 = 4 423 680 bytes.
64 GB = 64 × 1024 × 1024 × 1024 = 68 719 476 736 bytes.
Divide the memory stick size by the file size = 68 719 476 736 / 4 423 680 = 15 534 photos (rounded down).
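The file size formulas and the photograph example can be checked with a short sketch. The function names and the `channels` parameter are assumptions for this illustration; as in the example, 64 GB is taken to mean 64 × 1024 × 1024 × 1024 bytes.

```python
def image_file_size_bytes(width: int, height: int, color_depth_bits: int) -> int:
    """Image size = number of pixels x color depth, converted from bits to bytes."""
    return width * height * color_depth_bits // 8


def sound_file_size_bytes(sample_rate_hz: int, resolution_bits: int,
                          seconds: int, channels: int = 1) -> int:
    """Sound size = sample rate x resolution x length; use channels=2 for stereo."""
    return sample_rate_hz * resolution_bits * seconds * channels // 8


photo = image_file_size_bytes(1024, 1080, 32)
stick = 64 * 1024 ** 3
print(photo)            # 4423680 bytes per photograph
print(stick // photo)   # 15534 whole photographs fit

# One second of CD-quality stereo sound (44.1 kHz, 16-bit)
print(sound_file_size_bytes(44100, 16, 1, channels=2))  # 176400 bytes
```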
Data Compression
Sound and image files can sometimes be very large. It is necessary to reduce (or compress) the size of a file for the following reasons:
• To save storage space on devices such as the hard disk drive/solid state drive
• To reduce the time taken to stream a music or video file
• To reduce the time taken to upload, download or transfer a file across a network
• The download/upload process uses up network bandwidth - this is the maximum rate of transfer of data across a network, measured in bits per second. This occurs whenever a file is downloaded, for example, from a server. Compressed files contain fewer bits of data than uncompressed files and therefore use less bandwidth, which results in a faster data transfer rate.
• Reduced file size also reduces costs. For example, when using cloud storage, the cost is based on the size of the files stored. An Internet Service Provider (ISP) may charge a user based on the amount of data downloaded.

Lossy & Lossless File Compression


Two types of File Compression: Lossy & Lossless

Lossy File Compression

This removes unnecessary data from the file. The original file cannot be retrieved after compression (some detail is lost). The algorithm has to decide which parts of the data to keep and which to remove.
For example, applying a lossy file compression algorithm to:
• An image, it may reduce the resolution and/or the bit/color depth.
• A sound file, it may reduce the sampling rate and/or the resolution.
Lossy files are smaller than lossless files, which benefits storage and data transfer.
Some lossy file compression formats are: MP3, MP4, JPEG
MP3
When internet file-sharing boomed in popularity with Napster and the iPod, MP3 cornered the market for one reason: it had a small footprint. Without broadband connections, it was impractical at the time to share files larger than the typical MP3 size of 2 – 3 megabytes. That preference has stuck for some time now, even though MP3 does not have nearly the same quality as WAV or AIFF files. Despite the growing base of people using higher quality formats, there are still those who prefer MP3. So, if you have a slower internet connection or limited hard drive space, MP3 could be your file format of choice. If you're worried about quality loss, don't fret too much about it. While there is a noticeable drop in sound quality, MP3 files fall squarely under the "good enough" umbrella.
MP4
This format allows the storage of multimedia files rather than just sound - music, videos, photos and animation can all be stored in the MP4 format. As with MP3, this is a lossy file compression format, but it still retains an acceptable quality of sound and video. Movies, for example, could be streamed over the internet using the MP4 format without losing any real discernible quality.
JPEG
When a camera takes a photograph, it produces a raw bitmap file which can be very large in size. These files are temporary in nature. JPEG is a lossy file compression algorithm used for bitmap images. As with MP3, once the image is subjected to the JPEG compression algorithm, a new file is formed and the original file can no longer be reconstructed.
JPEG relies on the following:
• Human eyes don't detect differences in color shades quite as well as they detect differences in image brightness (the eye is less sensitive to color variations than it is to variations in brightness)
• By separating pixel color from brightness, images can be split into 8x8 pixel blocks, for example, which then allows certain 'information' to be discarded from the image without causing any real noticeable deterioration in quality.

Lossless File Compression


In this type of compression, the original file can be retrieved from the compressed one. This is beneficial for crucial data, files, applications etc. None of the detail is lost.

Run Length Encoding (RLE)


Run Length Encoding (RLE) can be used for lossless compression of a number of different file
formats:
RLE on Text Data
Consider the following text string: 'aaabbbbccccddddd'. Assuming each character requires 1 byte, this string needs 16 bytes. If we assume ASCII code is being used, then the string can be coded as follows:

a a a   | b b b b | c c c c | d d d d d
03 97   | 04 98   | 04 99   | 05 100

a = 97; b = 98; c = 99; d = 100

In each pair, the first number is the number of times the character occurs and the second number is its ASCII code. The coded version needs only 8 bytes.
One problem can be encountered with a string such as 'cdcdcdcdcd', where RLE compression isn't very effective. To cope with this, we use a flag. A flag preceding data indicates that what follows is the number of repeating units (for example, 255 05 97, where 255 is the flag and the other two numbers indicate that there are five items with ASCII code 97). Where a flag is not used, each byte is taken at its face value as a run of 1 (for example, 99 on its own means one character with ASCII code 99, which without the flag would be coded as 01 99).

aaaaaaaa | bbbbbbbbbb | c     | d      | c     | d      | c     | d      | eeeeeeee
08 97    | 10 98      | 01 99 | 01 100 | 01 99 | 01 100 | 01 99 | 01 100 | 08 101

The original string contains 32 characters and would occupy 32 bytes of storage.
The coded version contains 18 values and would require 18 bytes of storage.
Introducing a flag (255 in this case) produces:

255 08 97 255 10 98 99 100 99 100 99 100 255 08 101

This has 15 values and would, therefore, require 15 bytes of storage. This is a reduction in file size of about 53% when compared to the original string.
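Both encodings of the example strings can be reproduced with a short sketch. The function names and the `FLAG` constant are assumptions for this illustration. The sketch ignores two cases a real format would have to handle: runs longer than 255 and data that itself contains the flag value. It also only flags runs of three or more, since flagging a run of two would take more space than writing the two bytes out.

```python
from itertools import groupby

FLAG = 255  # flag value used in the example above


def rle_simple(text: str) -> list[int]:
    """Encode every run as a (count, ASCII code) pair."""
    values = []
    for char, run in groupby(text):
        values += [len(list(run)), ord(char)]
    return values


def rle_with_flag(text: str) -> list[int]:
    """Encode long runs as FLAG, count, code; write short runs at face value."""
    values = []
    for char, run in groupby(text):
        count = len(list(run))
        if count > 2:
            values += [FLAG, count, ord(char)]
        else:
            values += [ord(char)] * count
    return values


print(rle_simple("aaabbbbccccddddd"))  # [3, 97, 4, 98, 4, 99, 5, 100]

text = "a" * 8 + "b" * 10 + "cdcdcd" + "e" * 8
print(len(text), len(rle_simple(text)), len(rle_with_flag(text)))  # 32 18 15
```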
Example 1: Black & White Images

Example 2: Colored Images

The RLE code has 92 values, which means the compressed file will be 92 bytes in size. This gives a file reduction of about 52%. It should be noted that file reductions in reality will not be as large as this, due to other data which needs to be stored with the compressed file (e.g. a file header).