UNIT – III
Data Representation: Data types, Complements, Fixed Point Representation, Floating Point
Representation.
Computer Arithmetic: Addition and subtraction, multiplication Algorithms, Division Algorithms,
Floating– point Arithmetic operations. Decimal Arithmetic unit, Decimal Arithmetic operations.
Chapter 1: Data Representation: Data types, Complements, Fixed Point Representation, Floating Point
Representation.
1. Data Types
The data types found in the registers of digital computers may be classified as being one
of the following categories:
(1) Numbers used in arithmetic computations,
(2) Letters of the alphabet used in data processing, and
(3) Other discrete symbols used for specific purposes.
All types of data, except binary numbers, are represented in computer registers in
binary-coded form. This is because registers are made up of flip-flops and flip-flops are two-
state devices that can store only 1’s and 0’s.
Number Systems:
A number system can have different base values. The base (radix) is the number of distinct
digits used in that number system.
Binary -- (base-2) -- the digits for binary are: 0 and 1
Octal -- (base-8) -- the digits for octal are: 0, 1, 2, 3, 4, 5, 6 and 7
Decimal -- (base 10) -- the digits used are: 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9
Hexadecimal -- (base 16) -- the digits are: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
Number Conversions:
Conversion of decimal number to any number system:
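Step 1: Convert the integer part by successive division by the radix of the target number
system. The remainders, read from last to first, give the digits.
Step 2: Convert the fractional part by successive multiplication by the radix of the target
number system. The integer carries, read from first to last, give the digits.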
Conversion of decimal to binary number system:
The radix of the target number system is 2.
Convert (87)10 to ( )2
(87)10 = (1010111)2
Convert the decimal number (14.625)10 to binary
Integer part: (14)10 = (1110)2
Fractional part:
1st Multiplication Iteration
Multiply 0.625 by 2
0.625 x 2 = 1.25 (Product) Fractional part = 0.25 Carry = 1 (MSB)
2nd Multiplication Iteration
Multiply 0.25 by 2
0.25 x 2 = 0.50 (Product) Fractional part = 0.50 Carry = 0
3rd Multiplication Iteration
Multiply 0.50 by 2
0.50 x 2 = 1.00 (Product) Fractional part = 0.00 Carry = 1 (LSB)
(0.625)10 = (0.101)2
The binary equivalent of (14.625)10 is (1110.101)2
Conversion of decimal to octal number system:
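The radix of the target number system is 8.
Convert the decimal number (264)10 to octal
264/8 = 33 Remainder 0 (LSB)
33/8 = 4 Remainder 1
4/8 = 0 Remainder 4 (MSB)
The octal equivalent of (264)10 is (410)8
Convert the decimal number (105.589)10 to octal
Integer part: (105)10 = (151)8
Fractional part: (0.589)10 = (0.4554)8, to four octal places
The octal equivalent of (105.589)10 is approximately (151.4554)8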
Conversion of decimal to Hexadecimal number system:
The radix of the target number system is 16.
Convert the decimal number (1693)10 to hexadecimal
1693/16 = 105 Remainder 13 = D (LSB)
105/16 = 6 Remainder 9
6/16 = 0 Remainder 6 (MSB)
(1693)10 = (69D)16
Convert the decimal number (1693.0628)10 to hexadecimal ( ? )16
Integer part:
1693/16 = 105 Remainder 13 = D (LSB)
105/16 = 6 Remainder 9
6/16 = 0 Remainder 6 (MSB)
(1693)10 = (69D)16
Fractional part:
Multiply 0.0628 by 16
0.0628 x 16 = 1.0048 (Product) Fractional part = 0.0048 Carry = 1 (MSB)
Multiply 0.0048 by 16
0.0048 x 16 = 0.0768 (Product) Fractional part = 0.0768 Carry = 0
Multiply 0.0768 by 16
0.0768 x 16 = 1.2288 (Product) Fractional part = 0.2288 Carry = 1
Multiply 0.2288 by 16
0.2288 x 16 = 3.6608 (Product) Fractional part = 0.6608 Carry = 3 (LSB)
(0.0628)10 = (0.1013)16, to four hexadecimal places
(1693.0628)10 = (69D.1013)16 (approximately)
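The two steps above (successive division for the integer part, successive multiplication for the fractional part) can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function names int_to_base and frac_to_base are hypothetical, and the fractional part is passed as a decimal string so that exact rational arithmetic can be used.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"

def int_to_base(n, radix):
    """Successive division: remainders come out LSB first."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, rem = divmod(n, radix)
        out.append(DIGITS[rem])
    return "".join(reversed(out))  # read remainders from last to first

def frac_to_base(frac_text, radix, places):
    """Successive multiplication: integer carries come out MSB first."""
    frac = Fraction(frac_text)     # exact value of e.g. "0.625"
    out = []
    for _ in range(places):
        frac *= radix
        carry = frac.numerator // frac.denominator  # integer part of product
        out.append(DIGITS[carry])
        frac -= carry              # keep only the fractional part
    return "".join(out)

print(int_to_base(14, 2) + "." + frac_to_base("0.625", 2, 3))
print(int_to_base(1693, 16) + "." + frac_to_base("0.0628", 16, 4))
```

The number of fractional places must be chosen by the caller, because many decimal fractions (such as 0.589) never terminate in another radix.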
Conversion of any number system to decimal number system:
In general, a number can be represented as
N = A_{n-1} r^{n-1} + A_{n-2} r^{n-2} + ... + A_1 r^1 + A_0 r^0 + A_{-1} r^{-1} + A_{-2} r^{-2} + ... + A_{-m} r^{-m}
where N = the number in decimal
A = a digit
r = the radix of the number system
n = the number of digits in the integer portion of the number
m = the number of digits in the fractional portion of the number
Conversion of binary to decimal number system:
Convert (101.101)2 = ( ? )10
101.101
= 1 x 2^2 + 0 x 2^1 + 1 x 2^0 + 1 x 2^{-1} + 0 x 2^{-2} + 1 x 2^{-3}
= 1 x 4 + 0 x 2 + 1 x 1 + 1 x (1/2) + 0 x (1/4) + 1 x (1/8)
= 4 + 0 + 1 + (1/2) + 0 + (1/8)
= 5 + 0.5 + 0.125
= 5.625
Therefore (101.101)2 = (5.625)10
Conversion of octal to decimal number system:
Convert (123)8 = ( ? )10
(123)8 = 1 x 8^2 + 2 x 8^1 + 3 x 8^0 = 64 + 16 + 3 = 83
The decimal equivalent of (123)8 is (83)10
Convert (21.21)8 = ( ? )10
21.21
= 2 x 8^1 + 1 x 8^0 + 2 x 8^{-1} + 1 x 8^{-2}
= 2 x 8 + 1 x 1 + 2 x (1/8) + 1 x (1/64)
= 16 + 1 + 0.25 + 0.015625
= 17 + 0.265625
= 17.265625
Therefore (21.21)8 = (17.265625)10
Conversion of hexadecimal to decimal number system:
Convert (EF.B1)16 = ( ? )10
EF.B1
= E x 16^1 + F x 16^0 + B x 16^{-1} + 1 x 16^{-2}
= 14 x 16 + 15 x 1 + 11 x (1/16) + 1 x (1/256)
= 224 + 15 + 0.6875 + 0.00390625
= 239 + 0.69140625
= 239.69140625
Therefore (EF.B1)16 = (239.69140625)10
Convert (0.9D9)16 = ( ? )10
0.9D9
= 0 x 16^0 + 9 x 16^{-1} + D x 16^{-2} + 9 x 16^{-3}
= 0 x 1 + 9 x (1/16) + 13 x (1/256) + 9 x (1/4096)
= 0 + 0.5625 + 0.05078125 + 0.002197265625
= 0.615478515625
Therefore (0.9D9)16 = (0.6154785)10, to seven decimal places
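The positional expansion used in the examples above can be sketched as a short Python function. This is an illustrative sketch only: the name to_decimal is hypothetical, and exact fractions are used so that the results are not affected by rounding.

```python
from fractions import Fraction

def to_decimal(text, radix):
    """Sum digit x radix^position over the integer and fractional parts."""
    int_part, _, frac_part = text.partition(".")
    value = Fraction(0)
    # Integer digits carry weights r^0, r^1, ... counted from the right.
    for i, ch in enumerate(reversed(int_part)):
        value += int(ch, radix) * Fraction(radix) ** i
    # Fraction digits carry weights r^-1, r^-2, ... counted from the left.
    for j, ch in enumerate(frac_part, start=1):
        value += int(ch, radix) * Fraction(radix) ** -j
    return value

print(float(to_decimal("101.101", 2)))
print(float(to_decimal("EF.B1", 16)))
```

Because int(ch, radix) rejects digits that are not valid in the given radix, a string such as "128" raises ValueError when the radix is 8.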
Conversion of binary to octal number system:
Convert (101101001)2 to ( )8
Divide the binary number into groups of three digits, starting from the LSB:
101|101|001
Writing the equivalent octal digit of each group, we get
5|5|1
So the equivalent octal number is (551)8
Convert (11001100.101)2 to ( )8
Group the integer part from the binary point leftward and the fractional part from the
binary point rightward, padding with 0's where a group is incomplete:
011|001|100 . 101
3|1|4 . 5
So the equivalent octal number is (314.5)8
Conversion of binary to hexadecimal number system:
Convert (111100010)2 to ( )16
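Divide the binary number into groups of four digits, starting from the LSB:
0001|1110|0010
Writing the equivalent hexadecimal digit of each group, we get
1|E|2
So the equivalent hexadecimal number is (1E2)16
Convert (11000011001.101)2 to ( )16
Group the integer part from the binary point leftward and the fractional part from the
binary point rightward, padding with 0's where a group is incomplete:
0110|0001|1001 . 1010
6|1|9 . A
So the equivalent hexadecimal number is (619.A)16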
Conversion of octal number system to hexadecimal number system:
Convert (25)8 to ( )16
First convert octal to binary
The binary equivalent of 25 is 010101
Divide the binary into group of four digits from LSB
0001|0101
15
So the equivalent Hexadecimal number is (15)16
Conversion of hexadecimal number system to octal number system:
Convert (1A.2B)16 to ( )8
First convert hexadecimal to binary
The binary equivalent of 1A.2B is 00011010.00101011
Divide the binary number into groups of three digits, working outward from the binary point
011|010 . 001|010|110
3|2 . 1|2|6
So the equivalent octal number is (32.126)8
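The grouping method used in the last four conversions can be sketched in Python. This is a minimal illustration: the helper names expand and regroup are hypothetical, with expand writing each octal or hexadecimal digit as a fixed-width binary group and regroup collecting bits into groups of three (octal) or four (hexadecimal) on either side of the binary point.

```python
DIGITS = "0123456789ABCDEF"

def expand(number, group):
    """Replace each digit by its `group`-bit binary equivalent."""
    return "".join(ch if ch == "." else format(int(ch, 16), f"0{group}b")
                   for ch in number)

def regroup(bits, group):
    """Collect bits into groups of `group`, outward from the binary point."""
    int_bits, _, frac_bits = bits.partition(".")
    # Pad the integer part on the left and the fraction on the right.
    int_bits = int_bits.zfill(-(-len(int_bits) // group) * group)
    frac_bits = frac_bits.ljust(-(-len(frac_bits) // group) * group, "0")

    def digits(s):
        return "".join(DIGITS[int(s[i:i + group], 2)]
                       for i in range(0, len(s), group))

    result = digits(int_bits).lstrip("0") or "0"  # drop padding zeros
    if frac_bits:
        result += "." + digits(frac_bits)
    return result

print(regroup("11001100.101", 3))          # binary to octal
print(regroup(expand("1A.2B", 4), 3))      # hexadecimal to octal via binary
```

Going through binary in both directions avoids any arithmetic: only table lookups and regrouping are needed, which is why the method works for radices that are powers of 2.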
2. Complements
In digital computers, complements are used to simplify the subtraction operation and for
logical manipulation. There are two types of complements for each radix-r system: the radix
complement and the diminished radix complement. The first is referred to as the r’s complement
and the second as the (r-1)’s complement.
r’s Complement
Given a positive number N in base r with an integer part of n digits, the r’s
complement of N is defined as r^n - N if N ≠ 0 and 0 if N = 0.
(r-1)’s Complement
Given a positive number N in base r with an integer part of n digits, the (r-1)’s
complement of N is defined as (r^n - 1) - N.
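The two definitions can be checked numerically with a short Python sketch. The function names are hypothetical, and the sketch assumes N is a non-negative integer with at most n digits in base r.

```python
def radix_complement(n_value, radix, digits):
    """r's complement: r^n - N, defined as 0 when N is 0."""
    return 0 if n_value == 0 else radix ** digits - n_value

def diminished_radix_complement(n_value, radix, digits):
    """(r-1)'s complement: (r^n - 1) - N."""
    return (radix ** digits - 1) - n_value

# Decimal example with n = 3 digits: 10's and 9's complement of 456.
print(radix_complement(456, 10, 3), diminished_radix_complement(456, 10, 3))

# Binary example with n = 4 digits: 2's and 1's complement of 1110.
print(format(radix_complement(0b1110, 2, 4), "04b"),
      format(diminished_radix_complement(0b1110, 2, 4), "04b"))
```

For N not equal to 0 the r's complement is always one more than the (r-1)'s complement, which follows directly from the two definitions.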
Subtraction with r’s Complement
The direct method of subtraction uses the borrow concept. When subtraction is implemented
by means of digital components, this method is found to be less efficient than a method
based on complements and addition, so the following procedure is used instead.
The subtraction of two positive numbers (M-N), both of base r, may be done as follows.
(1) Add the minuend M to the r’s complement of the subtrahend N.
(2) Inspect the result obtained in step 1 for an end carry.
If an end carry occurs, discard it.
If an end carry does not occur, take the r’s complement of the number obtained in
step 1 and place a negative sign in front.
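The procedure can be sketched in Python as follows. This is an illustration under stated assumptions: the name subtract_r_complement is hypothetical, M and N are non-negative integers of at most `digits` digits, and the complement is computed as r^n - N even for N = 0 so that the end carry appears the way it would in an n-digit adder.

```python
def subtract_r_complement(m, n, radix, digits):
    """Compute M - N using the r's complement of N and an end-carry test."""
    modulus = radix ** digits
    total = m + (modulus - n)          # step 1: M + (r's complement of N)
    if total >= modulus:               # step 2: an end carry occurred
        return total - modulus         # discard the carry
    return -(modulus - total)          # no carry: complement and negate

print(subtract_r_complement(72532, 3250, 10, 5))     # M > N
print(subtract_r_complement(3250, 72532, 10, 5))     # M < N
print(subtract_r_complement(0b1010100, 0b1000100, 2, 7))
```

An end carry appears exactly when M >= N, so the carry test also acts as the magnitude comparison that the direct method would need.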
Subtraction with (r-1)’s Complement
The procedure for subtraction with the (r-1)’s complement is the same as that with the r’s
complement, except for the end-around carry.
The subtraction of M-N, both positive numbers in base r, may be calculated in the
following manner.
1. Add the minuend M to the (r-1)’s complement of the subtrahend N.
2. Inspect the result obtained in step 1 for an end carry.
(a) If an end carry occurs, add 1 to the least significant digit (end-around carry).
(b) If an end carry does not occur, take the (r-1)’s complement of the number
obtained in step 1 and place a negative sign in front.
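A Python sketch of the same procedure, with the end-around carry made explicit, is given below. The name subtract_dim_complement is hypothetical, and M and N are assumed to be non-negative integers of at most `digits` digits.

```python
def subtract_dim_complement(m, n, radix, digits):
    """Compute M - N using the (r-1)'s complement of N."""
    top = radix ** digits - 1          # largest n-digit value, e.g. 99999
    total = m + (top - n)              # step 1: M + ((r-1)'s complement of N)
    if total > top:                    # step 2: an end carry occurred
        return (total - (top + 1)) + 1 # drop the carry, add it back at the LSD
    return -(top - total)              # no carry: complement and negate

print(subtract_dim_complement(72532, 3250, 10, 5))
print(subtract_dim_complement(3250, 72532, 10, 5))
```

When M equals N there is no end carry and the sum is all (r-1) digits, which is the "negative zero" of this system. The sketch returns it as the ordinary integer 0.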
The complements most commonly used are of four types:
1’s complement
2’s complement
9’s complement
10’s complement
1’s complement and 2’s complement:
The 1’s complement of a binary number is the number that results when we change all
ones to zeros and the zeros to ones.
The 2’s complement is the binary number that results when we add 1 to the 1’s
complement.
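These two definitions translate directly into operations on bit strings. The following Python sketch uses hypothetical function names and keeps the result at the same width as the input, so any carry out of the leftmost bit is dropped.

```python
def ones_complement(bits):
    """Change every 1 to 0 and every 0 to 1."""
    return "".join("1" if b == "0" else "0" for b in bits)

def twos_complement(bits):
    """Add 1 to the 1's complement, keeping the original width."""
    width = len(bits)
    value = (int(ones_complement(bits), 2) + 1) % (1 << width)
    return format(value, f"0{width}b")

print(ones_complement("00001110"))
print(twos_complement("00001110"))
```

Taking the 2's complement twice returns the original bit string, which is why the same operation converts in both directions between a number and its negative.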
Problems related to 1’s complement and 2’s complement:
1’s complement subtraction
Subtraction of binary numbers can be accomplished by the 1’s complement method, which
allows subtraction to be performed using only addition. For the subtraction of two numbers
there are two cases:
1. Subtraction of smaller number from larger number and
2. Subtraction of larger number from smaller number.
1’s complement Subtraction of smaller number from larger number:
Method:
1. Determine the 1’s complement of the smaller number.
2. Add the 1’s complement to the larger number.
3. Remove the carry and add it to the least significant bit of the result.
This is called the end-around carry.
1’s complement Subtraction of larger number from smaller number
Method:
1. Determine the 1’s complement of the larger number.
2. Add the 1’s complement to the smaller number.
3. Answer is in 1’s complement form. To get the answer in true form take the 1’s
complement and assign negative sign to the answer.
Advantages of 1’s complement subtraction :
1. The 1’s complement subtraction can be accomplished with a binary adder.
Therefore, this method is useful in arithmetic logic circuits.
2. The 1’s complement of a number is easily obtained by inverting each bit in the
number.
2’s complement Subtraction:
Like 1’s complement subtraction, in 2’s complement subtraction, the subtraction is
accomplished by only addition.
2’s complement Subtraction of smaller number from larger number:
Method
1. Determine the 2’s complement of the smaller number.
2. Add the 2’s complement to the larger number.
3. Discard the carry.
2’s complement Subtraction of larger number from smaller number:
Method:
1. Determine the 2’s complement of the larger number.
2. Add the 2’s complement to the smaller number.
3. Answer is in 2’s complement form. To get the answer in true form take the 2’s
complement and assign negative sign to the answer.
9's complement and 10's complement:
Before studying the 9’s complement and the 10’s complement, we should know why they are
used. Addition of signed BCD numbers can be performed by using the 9’s and 10’s
complements. The complements make arithmetic operations in digital systems easier. The
topics and related problems covered here are
1. 9s complement
2. 10s complement
3. 9s complement subtraction
4. 10s complement subtraction
First, consider the 9’s complement. To obtain the 9’s complement of a number, subtract the
number from (10^n - 1), where n is the number of digits in the number. More simply,
subtract each digit of the given decimal number from 9.
Table 1 lists the 9’s complement of each decimal digit.
Decimal digit    9’s complement
0 9
1 8
2 7
3 6
4 5
5 4
6 3
7 2
8 1
9 0
The 10’s complement is easy to find once the 9’s complement of the number is known: add 1
to the 9’s complement of the number to obtain its 10’s complement. To find the 10’s
complement directly, use the formula (10^n - number), where n is the number of digits in
the number. An example is given below to illustrate the concept.
For the decimal number 456, find the 9’s complement and the 10’s complement.
9’s complement is 999 - 456 = 543
10’s complement is 543 + 1 = 544 (equivalently, 1000 - 456 = 544)
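The digit-by-digit rule (subtract each digit from 9, then add 1 for the 10's complement) can be sketched in Python as follows. The function names are hypothetical, and numbers are handled as digit strings so that leading zeros and the digit count n are preserved.

```python
def nines_complement(digits):
    """Subtract every decimal digit from 9."""
    return "".join(str(9 - int(d)) for d in digits)

def tens_complement(digits):
    """Add 1 to the 9's complement, keeping the same number of digits."""
    n = len(digits)
    value = (int(nines_complement(digits)) + 1) % 10 ** n
    return str(value).zfill(n)

print(nines_complement("456"))
print(tens_complement("456"))
```

No borrow is ever needed when forming the 9's complement, because each digit is at most 9. That is what makes it cheaper to implement than a direct subtraction.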
In 9’s complement subtraction, when the 9’s complement of the smaller number is added to
the larger number, a carry is generated. This carry must be added to the result (this is
called an end-around carry). When the larger number is subtracted from the smaller number,
there is no carry, and the result is negative and in 9’s complement form. This is explained with
following examples.
Subtraction using 9’s complements:
Steps for 9’s complement BCD subtraction:
1. Find the 9’s complement of the number to be subtracted (the negative operand).
2. Add the two numbers using BCD addition.
3. If a carry is generated, add the carry to the result (end-around carry); otherwise, take
the 9’s complement of the result and attach a negative sign.
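The three steps can be sketched in Python with a digit-wise BCD adder. This is an illustration under stated assumptions rather than a description of any particular hardware unit: the names bcd_add and bcd_subtract_nines are hypothetical, operands are non-negative integers of at most `width` digits, and the adder uses the usual add-6 correction whenever a digit sum exceeds 9.

```python
def bcd_add(a, b):
    """Add two equal-length digit lists (most significant digit first)."""
    result, carry = [], 0
    for x, y in zip(reversed(a), reversed(b)):
        s = x + y + carry          # 4-bit binary sum of one digit pair
        if s > 9:
            s += 6                 # correction: skip the six unused codes
        carry = s >> 4             # carry into the next decimal digit
        result.append(s & 0xF)     # corrected BCD digit
    return result[::-1], carry

def bcd_subtract_nines(m, n, width):
    m_digits = [int(c) for c in str(m).zfill(width)]
    n_comp = [9 - int(c) for c in str(n).zfill(width)]   # step 1
    total, carry = bcd_add(m_digits, n_comp)             # step 2
    if carry:                                            # step 3: end-around carry
        total, _ = bcd_add(total, [0] * (width - 1) + [1])
        return int("".join(map(str, total)))
    # No carry: the result is negative and in 9's complement form.
    return -int("".join(str(9 - d) for d in total))

print(bcd_subtract_nines(872, 359, 3))
print(bcd_subtract_nines(359, 872, 3))
```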
Subtraction using 10’s complements:
The 10’s complement of a decimal number is equal to its 9’s complement plus 1. The 10’s
complement can be used to perform subtraction by adding the minuend to the 10’s
complement of the subtrahend and dropping the carry. This is explained with following
examples.
Steps for 10’s complement BCD subtraction:
1. Find the 10’s complement of the number to be subtracted.
2. Add the two numbers using BCD addition.
3. If a carry is generated, discard it. If a carry is not generated, take the 10’s complement
of the result and attach a negative sign.
3. Fixed-point Representation
Positive integers, including zero, can be represented as unsigned numbers. However, to
represent negative integers, we need a notation for negative values. In ordinary arithmetic, a
negative number is indicated by a minus sign and a positive number by a plus sign. Because of
hardware limitations, computers must represent everything with 1’s and 0’s, including the sign of
a number. As a consequence, it is customary to represent the sign with a bit placed in the
leftmost position of the number. The convention is to make the sign bit equal to 0 for positive
and to 1 for negative.
In addition to the sign, a number may have a binary (or decimal) point. The position of
the binary point is needed to represent fractions, integers, or mixed integer-fraction numbers.
The representation of the binary point in a register is complicated by the fact that it is
characterized by a position in the register. There are two ways of specifying the position of the
binary point in a register: by giving it a fixed position or by employing a floating-point
representation. The fixed-point method assumes that the binary point is always fixed in one
position. The two positions most widely used are (1) a binary point in the extreme left of the
register to make the stored number a fraction, and (2) a binary point in the extreme right of the
register to make the stored number an integer. In either case, the binary point is not actually
present, but its presence is assumed from the fact that the number stored in the register is
treated as a fraction or as an integer.
Integer Representation
When an integer binary number is positive, the sign is represented by 0 and the
magnitude by a positive binary number. When the number is negative, the sign is represented
by 1 but the rest of the number may be represented in one of three possible ways:
1. Signed-magnitude representation
2. Signed-1’s complement representation
3. Signed 2’s complement representation
As an example, consider the signed number 14 stored in an 8-bit register. +14 is
represented by a sign bit of 0 in the leftmost position followed by the binary equivalent of
14: 00001110. Note that each of the eight bits of the register must have a value and therefore
0’s must be inserted in the most significant positions following the sign bit. Although there is
only one way to represent +14, there are three different ways to represent -14 with eight bits.
In signed-magnitude representation 1 0001110
In signed-1’s complement representation 1 1110001
In signed-2’s complement representation 1 1110010
The signed-magnitude representation of -14 is obtained from +14 by complementing
only the sign bit. The signed-1’s complement representation of -14 is obtained by
complementing all the bits of +14, including the sign bit. The signed-2’s complement
representation is obtained by taking the 2’s complement of the positive number, including its
sign bit.
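The three representations can be generated with a short Python sketch. The name signed_forms is hypothetical, and the sketch assumes the magnitude fits in the bits to the right of the sign bit.

```python
def signed_forms(value, width=8):
    """Return (signed-magnitude, signed-1's, signed-2's) bit strings."""
    mask = (1 << width) - 1
    magnitude = abs(value)
    positive = format(magnitude, f"0{width}b")
    if value >= 0:
        return positive, positive, positive   # all three agree for +N
    sign_mag = format((1 << (width - 1)) | magnitude, f"0{width}b")
    ones = format(~magnitude & mask, f"0{width}b")   # complement all bits
    twos = format(-magnitude & mask, f"0{width}b")   # 1's complement plus 1
    return sign_mag, ones, twos

print(signed_forms(+14))
print(signed_forms(-14))
```

Only the negative case differs between the three systems, which matches the observation that +14 has a single representation.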
The signed-magnitude system is used in ordinary arithmetic but is awkward when
employed in computer arithmetic. Therefore, the signed complement is normally used. The 1’s
complement imposes difficulties because it has two representations of 0 (+0 and -0). The 1’s
complement is useful as a logical operation since the change of 1 to 0 or 0 to 1 is equivalent to a
logical complement operation. The signed binary arithmetic deals exclusively with the signed-2’s
complement representation of negative numbers.
Arithmetic Addition
The addition of two numbers in the signed-magnitude system follows the rules of
ordinary arithmetic. If the signs are the same, we add the two magnitudes and give the sum the
common sign. If the signs are different, we subtract the smaller magnitude from the larger and
give the result the sign of the larger magnitude. For example, (+25) + (-37) = - (37 - 25) = -12 and
is done by subtracting the smaller magnitude 25 from the larger magnitude 37 and using the
sign of 37 for the sign of the result. This is a process that requires the comparison of the signs
and the magnitudes and then performing either addition or subtraction.
The rule for adding numbers in the signed-2’s complement system does not require a
comparison or subtraction, only addition and complementation. The procedure is very simple
and can be stated as follows: Add the two numbers, including their sign bits, and discard any
carry out of the sign (leftmost) bit position.
Numerical examples for addition are shown below. Note that negative numbers must
initially be in 2’s complement and that if the sum obtained after the addition is negative, it is in
2’s complement form.
In each of the four cases, the operation performed is always addition, including the sign
bits. Any carry out of the sign bit position is discarded, and negative results are automatically in
2’s complement form.
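The addition rule can be illustrated with a small Python sketch. The operands 6 and 13 and the function name add_twos are chosen here for illustration only. The four calls cover the four sign combinations, and each result is the 8-bit sum after the carry out of the sign bit has been discarded.

```python
def add_twos(a, b, width=8):
    """Add two signed integers as width-bit 2's complement patterns."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask)    # plain binary addition, sign bits included
    return format(total & mask, f"0{width}b")  # discard carry out of sign bit

print(add_twos(+6, +13))   # 00010011 = +19
print(add_twos(-6, +13))   # 00000111 = +7
print(add_twos(+6, -13))   # 11111001 = -7 in 2's complement
print(add_twos(-6, -13))   # 11101101 = -19 in 2's complement
```

In Python, `a & mask` yields the 2's complement bit pattern of a negative integer, so no separate complement step is needed in the sketch.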
The complement form of representing negative numbers is unfamiliar to people used to
the signed-magnitude system. To determine the value of a negative number when in signed-2’s
complement, it is necessary to convert it to a positive number to place it in a more familiar form.
For example, the signed binary number 11111001 is negative because the leftmost bit is 1. Its
2’s complement is 00000111, which is the binary equivalent of +7. We therefore recognize the
original negative number to be equal to -7.
Arithmetic Subtraction
Subtraction of two signed binary numbers when negative numbers are in 2’s
complement form is very simple and can be stated as follows: Take the 2’s complement of the
subtrahend (including the sign bit) and add it to the minuend (including the sign bit). A carry out
of the sign bit position is discarded. This procedure stems from the fact that a subtraction
operation can be changed to an addition operation if the sign of the subtrahend is changed. This
is demonstrated by the following relationship:
(±A) - (+B) = (±A) + (-B)
(±A) - (-B) = (±A) + (+B)
But changing a positive number to a negative number is easily done by taking its 2’s
complement. The reverse is also true because the complement of a negative number in
complement form produces the equivalent positive number. Consider the subtraction of (-6) -
(-13) = +7. In binary with eight bits this is written as 11111010 - 11110011. The subtraction is
changed to addition by taking the 2’s complement of the subtrahend (-13) to give (+13). In
binary this is 11111010 + 00001101 = 100000111. Removing the end carry, we obtain the
correct answer 00000111 (+7).
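The subtraction rule can be sketched in Python on bit strings. The name sub_twos is hypothetical, and both operands are assumed to have the same width.

```python
def sub_twos(minuend_bits, subtrahend_bits):
    """Subtract by adding the 2's complement of the subtrahend."""
    width = len(minuend_bits)
    mask = (1 << width) - 1
    # 2's complement of the subtrahend, sign bit included.
    negated = (~int(subtrahend_bits, 2) + 1) & mask
    total = int(minuend_bits, 2) + negated
    return format(total & mask, f"0{width}b")   # drop the end carry

# (-6) - (-13): 11111010 - 11110011
print(sub_twos("11111010", "11110011"))   # 00000111, i.e. +7
```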
It is worth noting that binary numbers in the signed-2’s complement system are added
and subtracted by the same basic addition and subtraction rules as unsigned numbers.
Therefore, computers need only one common hardware circuit to handle both types of
arithmetic. The user or programmer must interpret the results of such addition or subtraction
differently depending on whether it is assumed that the numbers are signed or unsigned.
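The point that one adder serves both interpretations can be illustrated in Python. The helper names below are hypothetical. The same 8-bit sum is produced once and then read in two ways.

```python
def add_bits(a_bits, b_bits):
    """Width-limited binary addition, as a single hardware adder would do."""
    width = len(a_bits)
    total = (int(a_bits, 2) + int(b_bits, 2)) % (1 << width)
    return format(total, f"0{width}b")

def as_unsigned(bits):
    return int(bits, 2)

def as_signed(bits):
    """Read the pattern as a signed-2's complement number."""
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value

result = add_bits("11111001", "00000010")
print(result, as_unsigned(result), as_signed(result))
```

Read as unsigned, the operation is 249 + 2 = 251. Read as signed, the same bits give (-7) + 2 = -5. The adder itself is unaware of the difference.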
4. Floating-point Representation
The floating-point representation of a number has two parts. The first part represents a
signed, fixed-point number called the mantissa. The second part designates the position of the
decimal (or binary) point and is called the exponent. The fixed-point mantissa may be a fraction
or an integer. For example, the decimal number +6132.789 is represented in floating-point with
a fraction and an exponent as follows:
Fraction: +0.6132789    Exponent: +04
The value of the exponent indicates that the actual position of the decimal point is four
positions to the right of the indicated decimal point in the fraction. This representation is
equivalent to the scientific notation +0.6132789 x 10^{+4}.
Floating-point is always interpreted to represent a number in the following form:
m x r^e
Only the mantissa m and the exponent e are physically represented in the register
(including their signs). The radix r and the radix-point position of the mantissa are always
assumed. The circuits that manipulate the floating-point numbers in registers conform with
these two assumptions in order to provide the correct computational results.
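The interpretation of a stored mantissa and exponent can be sketched in Python. The function name float_value is hypothetical, and the radix is passed explicitly here only because the sketch has no hardware to assume it. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def float_value(mantissa, exponent, radix):
    """Value represented by mantissa m and exponent e: m x r^e."""
    return mantissa * Fraction(radix) ** exponent

# Decimal: fraction +0.6132789 with exponent +4 and r = 10.
decimal_example = float_value(Fraction("0.6132789"), 4, 10)
print(float(decimal_example))   # 6132.789

# Binary: fraction 0.1001110 (base 2) with exponent +4 and r = 2.
binary_example = float_value(Fraction(0b1001110, 2 ** 7), 4, 2)
print(float(binary_example))    # 9.75, which is 1001.11 in binary
```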
A floating-point binary number is represented in a similar manner except that it uses
base 2 for the exponent. For example, the binary number +1001.11 is represented with an 8-bit
fraction and 6-bit exponent as follows:
Fraction: 01001110    Exponent: 000100
The fraction has a 0 in the leftmost position to denote positive. The binary point of the
fraction follows the sign bit but is not shown in the register. The exponent has the equivalent
binary number +4. The floating-point number is equivalent to
m x 2^e = +(.1001110)2 x 2^{+4}
A floating-point number is said to be normalized if the most significant digit of the
mantissa is nonzero. For example, the decimal number 350 is normalized but 00035 is not.
Regardless of where the position of the radix point is assumed to be in the mantissa, the
number is normalized only if its leftmost digit is nonzero. For example, the 8-bit binary number
00011010 is not normalized because of the three leading 0’s. The number can be normalized by
shifting it three positions to the left and discarding the leading 0’s to obtain 11010000. The
three shifts multiply the number by 2^3 = 8. To keep the same value for the floating-point
number, 3 must be subtracted from the exponent. Normalized numbers provide the maximum
possible precision for the floating-point number. A zero cannot be normalized because it does
not have a nonzero digit. It is usually represented in floating-point by all 0’s in the mantissa and
exponent.
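Normalization as described above can be sketched in Python. The name normalize is hypothetical. The mantissa is given as a bit string without its sign bit, and the exponent as an ordinary integer.

```python
def normalize(mantissa_bits, exponent):
    """Shift left until the leading bit is 1, adjusting the exponent."""
    if "1" not in mantissa_bits:
        # Zero cannot be normalized: all 0's in mantissa and exponent.
        return mantissa_bits, 0
    shift = mantissa_bits.index("1")              # number of leading 0's
    shifted = mantissa_bits[shift:] + "0" * shift # left shift, fill with 0's
    return shifted, exponent - shift              # keep the value unchanged

print(normalize("00011010", 7))
```

Each left shift multiplies the mantissa by 2, so subtracting the shift count from the exponent leaves the represented value the same.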
Two main standard forms of floating-point numbers come from two standards organizations:
ANSI (American National Standards Institute) and IEEE (Institute of Electrical and
Electronics Engineers). The ANSI 32-bit floating-point numbers in byte
format with examples are given below: