Chapter 2
Data Encoding Techniques
Encoding is the process of converting data, or
a given sequence of characters, symbols, or alphabets,
into a specified format for reliable
transmission of data.
Decoding is the reverse process of encoding: it
extracts the information from the converted
format.
Data Encoding
Encoding is the process of using various patterns of
voltage or current levels to represent 1s and 0s of the
digital signals on the transmission link.
The common types of line encoding are Unipolar, Polar,
Bipolar, and Manchester.
Encoding Techniques
The data encoding technique is divided into the following
types, depending upon the type of data conversion.
Analog data to analog signals − Modulation techniques
such as Amplitude Modulation (AM), Frequency Modulation (FM),
and Phase Modulation (PM) of analog signals fall under this
category. [Assignment for G1]
Analog data to digital signals − This process is called
digitization and is done by Pulse Code Modulation (PCM), so it
is a form of digital modulation. Sampling and quantization are
the key steps. Delta Modulation is a simpler alternative to PCM
that encodes only the change between successive samples.
[Assignment for G2]
Digital data to analog signals − Modulation techniques
such as Amplitude Shift Keying (ASK), Frequency Shift Keying
(FSK), and Phase Shift Keying (PSK) fall under this category.
[Assignment for G3]
Digital data to digital signals − These are discussed in this
section. There are several ways to map digital data to digital
signals, including NRZ encoding, biphase encoding, and block
encoding. [Assignment for G4 on block encoding]
2.1 DIGITAL-TO-DIGITAL CONVERSION
In this section, we see how digital data can be
represented by digital signals. The conversion uses
a technique known as line coding.
Topics discussed in this section:
Line Coding
Line Coding Schemes
Line Coding
Line coding converts a string of 1’s and 0’s
(digital data) into a sequence of
signals that denote the 1’s and 0’s.
For example, a high voltage level
(+V) could represent a “1” and a
low voltage level (0 or -V) could
represent a “0”.
Figure 2.1 Line coding and decoding
Mapping Data symbols
onto Signal levels
Relationship between data
rate and signal rate
The data rate is the number of bits
sent per second (bps). It is often referred to
as the bit rate.
The signal rate is the number of signal
elements sent per second and is
measured in baud. It is also referred to
as the modulation rate or baud rate.
The goal is to increase the data rate while
reducing the baud rate.
Figure 2.2 Signal element versus data element
Data rate and Baud rate
The baud or signal rate can be expressed
as:
S = c × N × (1/r) baud, where
N is the data rate in bps
c is the case factor (worst, best, or average case)
r is the number of data elements carried by
each signal element
Note: c = 1/2 for the average case, since the worst case is 1
and the best case is 0.
Example 2.1
A signal is carrying data in which one data element is
encoded as one signal element ( r = 1). If the bit rate is
100 kbps, what is the average value of the baud rate if c is
between 0 and 1?
Solution
We assume that the average value of c is 1/2. The baud
rate is then
S = c × N × (1/r) = 1/2 × 100,000 × 1/1 = 50,000 baud = 50 kbaud
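The relation S = c × N × (1/r) can be checked numerically. The following is a minimal sketch; the function name `signal_rate` and its parameter names are chosen here for illustration and do not come from the slides.

```python
def signal_rate(c: float, data_rate_bps: float, r: float) -> float:
    """Return the signal rate S = c * N * (1/r) in baud.

    c             -- case factor (1/2 for the average case)
    data_rate_bps -- data rate N in bits per second
    r             -- data elements carried per signal element
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return c * data_rate_bps / r


# Example 2.1: r = 1, N = 100 kbps, average case c = 1/2
print(signal_rate(0.5, 100_000, 1))  # 50000.0 baud, i.e. 50 kbaud
```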
Considerations for choosing a good signal
element referred to as line encoding
Baseline wandering - a receiver will evaluate
the average power of the received signal
(called the baseline) and use that to determine
the value of the incoming data elements.
If the incoming signal does not vary over a long period
of time, the baseline will drift and thus cause errors in
detection of incoming data elements.
A good line encoding scheme will prevent long runs of
fixed amplitude.
Line encoding characteristics
DC components - when the voltage
level remains constant for long
periods of time, there is an increase in
the low frequencies of the signal. Most
channels are bandpass and may not
support the low frequencies.
This will require the removal of the dc
component of a transmitted signal.
Self synchronization - the clocks at
the sender and the receiver must
have the same bit interval.
If the receiver clock is faster or slower
it will misinterpret the incoming bit
stream.
Figure 2.3 Effect of lack of synchronization
Example 2.2
In a digital transmission, the receiver clock is 0.1 percent
faster than the sender clock. How many extra bits per
second does the receiver receive if the data rate is
1 kbps? How many if the data rate is 1 Mbps?
Solution
At 1 kbps, the receiver receives 1001 bps instead of 1000
bps.
At 1 Mbps, the receiver receives 1,001,000 bps instead of
1,000,000 bps.
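The arithmetic in this example can be sketched as a short calculation. The helper name `extra_bits_per_second` is hypothetical, and the result is rounded to a whole number of bits.

```python
def extra_bits_per_second(data_rate_bps: int, drift_percent: float) -> int:
    """Extra bits per second seen by a receiver whose clock runs
    drift_percent faster than the sender's clock."""
    return round(data_rate_bps * drift_percent / 100)


for rate in (1_000, 1_000_000):
    extra = extra_bits_per_second(rate, 0.1)
    print(rate, "bps sent ->", rate + extra, "bps received")
```

At 1 kbps the receiver sees 1 extra bit per second; at 1 Mbps it sees 1000 extra bits per second.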
Error detection - errors occur during
transmission due to line impairments.
Some codes are constructed such that when
an error occurs it can be detected. For
example: a particular signal transition is not
part of the code. When it occurs, the receiver
will know that a symbol error has occurred.
It is good practice to add extra bits to the transmitted data for error
detection (and possibly correction).
Noise and interference immunity -
there are line encoding techniques
that make the transmitted signal
resistant to noise and interference.
This means that the signal is much harder
to corrupt, which is a stronger property
than error detection.
Encoding/decoding complexity - a more complex
scheme costs more to implement.
Complexity - the more robust and
resilient the code, the more
complex it is to implement and the
price is often paid in baud rate or
required bandwidth.
Figure 2.4 Line coding schemes
Unipolar
All signal levels are on one side of the time
axis - either above or below
NRZ - Non Return to Zero scheme is an
example of this code.
The signal level does not return to zero during
a symbol transmission.
Scheme is prone to baseline wandering
and DC components. It has no
synchronization or any error detection. It is
simple but costly in power consumption.
Figure 2.5 Unipolar NRZ scheme
Polar - NRZ
The voltages are on both sides of the time
axis.
Polar NRZ scheme can be implemented
with two voltages. E.g. +V for 1 and -V for
0.
There are two versions:
NRZ-Level (NRZ-L) - positive voltage for one
symbol and negative for the other.
NRZ-Inversion (NRZ-I) - the change or lack of
change in polarity determines the value of a
symbol. E.g. a “1” symbol inverts the polarity and a
“0” does not.
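The two polar NRZ variants can be sketched as follows. This is an illustrative sketch only: signal levels are modelled as +1 and -1 for +V and -V, NRZ-L maps 1 to +V and 0 to -V as in the example above, and the line level before the first bit of NRZ-I (`initial_level`) is an assumption.

```python
def nrz_l(bits: str) -> list[int]:
    """NRZ-L: the level itself encodes the bit (+1 for '1', -1 for '0')."""
    return [1 if b == "1" else -1 for b in bits]


def nrz_i(bits: str, initial_level: int = 1) -> list[int]:
    """NRZ-I: a '1' inverts the current level, a '0' keeps it."""
    level = initial_level  # assumed line level before the first bit
    signal = []
    for b in bits:
        if b == "1":
            level = -level
        signal.append(level)
    return signal


print(nrz_l("01001110"))  # [-1, 1, -1, -1, 1, 1, 1, -1]
print(nrz_i("01001110"))  # [1, -1, -1, -1, 1, -1, 1, 1]
```

A long run of 0s produces a constant level in both schemes, and a long run of 1s does so in NRZ-L only, which is why the DC and baseline problems are worse for NRZ-L.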
Figure 2.6 Polar NRZ-L and NRZ-I schemes
Note
In NRZ-L the level of the voltage
determines the value of the bit.
In NRZ-I the inversion
or the lack of inversion
determines the value of the bit.
Note
NRZ-L and NRZ-I both have an average
signal rate of N/2 Bd.
Note
NRZ-L and NRZ-I both have a DC
component problem and baseline
wandering; it is worse for NRZ-L. Neither
has self synchronization or error
detection. Both are relatively simple to
implement.
Example 2.3
A system is using NRZ-I to transfer 1-Mbps data. What
are the average signal rate and minimum bandwidth?
Solution
The average signal rate is S = c × N × (1/r) = 1/2 × N × 1 =
500 kbaud. The minimum bandwidth for this average
baud rate is Bmin = S = 500 kHz.
Note: c = 1/2 for the average case, since the worst case is 1
and the best case is 0.
Polar - RZ
The Return to Zero (RZ) scheme uses
three voltage values. +, 0, -.
Each symbol has a transition in the
middle. Either from high to zero or from
low to zero.
This scheme has more signal transitions
(two per symbol) and therefore requires a
wider bandwidth.
No DC components or baseline wandering.
Self synchronization - transition indicates
symbol value.
More complex, as it uses three voltage levels.
It has no error detection capability.
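The return-to-zero behaviour can be sketched by emitting two half-bit samples per symbol. The mapping of "1" to the positive level and "0" to the negative level is an assumption made here; the slides only state that the signal goes from high to zero or from low to zero.

```python
def polar_rz(bits: str) -> list[int]:
    """Polar RZ: each bit is two half-intervals, a +1 or -1 level
    followed by a return to 0 in the middle of the bit."""
    signal = []
    for b in bits:
        signal += [1 if b == "1" else -1, 0]
    return signal


print(polar_rz("0100"))  # [-1, 0, 1, 0, -1, 0, -1, 0]
```

Every bit produces two signal elements, which is the reason for the wider bandwidth noted above.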
Figure 2.7 Polar RZ scheme
Polar - Biphase: Manchester and Differential Manchester
Manchester coding combines the NRZ-L and RZ schemes.
Every symbol has a level transition in the middle: from high
to low or low to high. It uses only two voltage levels.
Differential Manchester coding combines the
NRZ-I and RZ schemes.
Every symbol has a level transition in the middle, but
the level at the beginning of the symbol is determined
by the symbol value: one symbol causes a level
change and the other does not. That is, if there is a transition at
the beginning of the bit interval, the bit is 0; if there is no
transition at the beginning of the bit interval, the bit is 1.
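Both biphase schemes can be sketched with two half-bit samples per symbol. Two points here are assumptions rather than statements from the slides: the Manchester polarity (a "1" is low-to-high, a "0" is high-to-low) and the line level before the first bit (`initial_level`). The differential Manchester rule follows the text: a transition at the start of the bit means 0, no transition means 1.

```python
def manchester(bits: str) -> list[int]:
    """Manchester: mid-bit transition, low-to-high for '1' and
    high-to-low for '0' (assumed polarity convention)."""
    signal = []
    for b in bits:
        signal += [-1, 1] if b == "1" else [1, -1]
    return signal


def differential_manchester(bits: str, initial_level: int = 1) -> list[int]:
    """Differential Manchester: always a mid-bit transition; a '0'
    additionally inverts the level at the start of the bit."""
    level = initial_level  # assumed level at the end of the previous bit
    signal = []
    for b in bits:
        if b == "0":
            level = -level      # transition at the start of the bit
        signal.append(level)    # first half of the bit
        level = -level          # mandatory transition in the middle
        signal.append(level)    # second half of the bit
    return signal


print(manchester("10"))               # [-1, 1, 1, -1]
print(differential_manchester("01"))  # [-1, 1, 1, -1]
```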
Figure 2.8 Polar biphase: Manchester and differential Manchester schemes
Note
In Manchester and differential
Manchester encoding, the transition
at the middle of the bit is used for
synchronization.
Note
The minimum bandwidth of Manchester
and differential Manchester is 2 times
that of NRZ. There is no DC component
and no baseline wandering. Neither of
these codes has error detection.
Bipolar - AMI and
Pseudoternary
The code uses 3 voltage levels (+, 0, -) to
represent the symbols (note: there are no
transitions to zero within a symbol, as there are in RZ).
The voltage level for one symbol is 0,
and the other alternates between + and -.
Bipolar Alternate Mark Inversion (AMI) -
the “0” symbol is represented by zero
voltage and the “1” symbol alternates
between +V and -V.
Pseudoternary is the reverse of AMI: the “1” symbol is
represented by zero voltage and the “0” symbol alternates
between +V and -V.
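The two bipolar schemes can be sketched with a single helper. Levels are modelled as +1, 0 and -1, and the assumption is made here that the first non-zero pulse is positive; the `mark` parameter is a name invented for this sketch.

```python
def ami(bits: str, mark: str = "1") -> list[int]:
    """Bipolar AMI: the mark symbol alternates between +1 and -1,
    the other symbol is sent as 0 volts."""
    level = -1  # so that the first mark is sent as +1 (assumption)
    signal = []
    for b in bits:
        if b == mark:
            level = -level
            signal.append(level)
        else:
            signal.append(0)
    return signal


def pseudoternary(bits: str) -> list[int]:
    """Pseudoternary: same as AMI with the roles of '0' and '1' swapped."""
    return ami(bits, mark="0")


print(ami("010010"))            # [0, 1, 0, 0, -1, 0]
print(pseudoternary("010010"))  # [1, 0, -1, 1, 0, -1]
```

Because successive pulses alternate in sign, the levels cancel over time, which is why these codes have no DC component.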
Figure 2.9 Bipolar schemes: AMI and pseudoternary
Bipolar characteristics
It is a better alternative to NRZ.
Has no DC component or baseline
wandering.
Has no self synchronization,
because long runs of “0”s result
in no signal transitions.
No error detection.
Assignment for G5
Discuss in detail the Pros and cons of all
types and variants of line coding
schemes
Unipolar (NRZ)
Polar (NRZ,RZ and biphase [Manchester and
differential Manchester])
Bipolar (AMI and Pseudo ternary)
with regard to the Factors to
consider in digital signaling.
2.2 Digital Data Communication Techniques
DATA TRANSMISSION MODES
Transmission of digital data through a transmission medium
can be performed either in serial or in parallel mode.
In the serial mode, one bit is sent per clock tick, whereas in
parallel mode multiple bits are sent per clock tick.
There are two subclasses of transmission for both the serial
and parallel modes, as shown in Fig
Asynchronous and Synchronous Transmission
The receiver samples the medium at the center of each bit
time.
The transmitter’s and receiver’s clocks may not be precisely
aligned.
In Synchronous Transmission,
data is sent in the form of blocks or frames, and the transmission is full-duplex.
Synchronization between sender and receiver is compulsory. There are
no gaps between data units, so it is more efficient and more reliable than
asynchronous transmission for transferring a large amount of data.
Examples: chat rooms, telephone conversations, video conferencing
In Asynchronous Transmission,
data is sent one byte or character at a time.
The transmission is half-duplex.
Start bits and stop bits are added to the data.
It does not require clock synchronization between sender and receiver.
Examples: email, forums, and letters
Asynchronous and Synchronous Transmission (2)
timing problems require a mechanism to
synchronize the transmitter and receiver
receiver samples stream at bit intervals
if clocks are not precisely aligned, drift will cause the
receiver to sample at the wrong time after sufficient bits
are sent
two solutions to synchronizing clocks:
Asynchronous Transmission
data are transmitted one character at a time
each character is 5 to 8 bits in length
receiver has the opportunity to resynchronize at
the beginning of each new character
simple and cheap
requires overhead of 2 or 3 bits per character
(~20%)
the larger the block of bits, the greater the
cumulative timing error
good for data with large gaps (keyboard)
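The per-character overhead of asynchronous transmission can be illustrated with a small framing sketch. The start bit value 0 and stop bit value 1 are the usual convention but are not stated on the slides, so treat them as assumptions; the function names are hypothetical.

```python
def frame_character(data_bits: str, stop_bits: int = 1) -> str:
    """Wrap one character in a start bit (0) and one or more stop bits (1)."""
    return "0" + data_bits + "1" * stop_bits


def overhead_fraction(data_len: int, stop_bits: int = 1,
                      parity: bool = False) -> float:
    """Fraction of transmitted bits that carry no user data."""
    extra = 1 + stop_bits + (1 if parity else 0)
    return extra / (data_len + extra)


print(frame_character("1000001"))  # 010000011
print(overhead_fraction(8))        # 0.2, i.e. 20% for 8 data bits
```

The receiver resynchronizes on each start bit, so clock drift only has to stay small over one framed character.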
Asynchronous Transmission
Types of errors, error detection and error
correction
Errors can be caused by signal attenuation or noise. Errors
can be classified as:
Single-bit Error
Only one bit of the received frame is corrupted, and
the corrupted bit can be located anywhere in the frame.
Multiple-bit Error
More than one bit of the received frame is corrupted.
Burst Error
More than one consecutive bit is corrupted in the received frame.
Error Detection
Vertical Redundancy Check (VRC): adds a parity bit to every
data unit so that the total number of 1s becomes even (for even
parity checking) or odd (for odd parity checking).
Even Parity Check: data sent from the sender undergoes
a parity check:
1 is added as a parity bit to the data block if the data block has an odd
number of 1's.
0 is added as a parity bit to the data block if the data block has
an even number of 1's.
This procedure makes the total number of 1's even. It is
commonly known as even parity checking.
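The even parity rule can be sketched in a few lines. Bit strings are used for readability, and the function names are illustrative, not part of the slides.

```python
def even_parity_bit(data: str) -> str:
    """Return '1' if data has an odd number of 1s, otherwise '0'."""
    return "1" if data.count("1") % 2 else "0"


def add_even_parity(data: str) -> str:
    """Append the parity bit so the total number of 1s is even."""
    return data + even_parity_bit(data)


def check_even_parity(unit: str) -> bool:
    """Receiver side: accept only if the number of 1s is even."""
    return unit.count("1") % 2 == 0


sent = add_even_parity("1011")     # three 1s, so the parity bit is 1
print(sent)                        # 10111
print(check_even_parity(sent))     # True
print(check_even_parity("10110"))  # False: one bit was flipped
print(check_even_parity("01111"))  # True: two flipped bits go unnoticed
```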
Disadvantage of the even parity check:
It detects only errors that change an odd
number of bits, such as single-bit errors; an error
that flips an even number of bits goes undetected.
Longitudinal Redundancy Check (LRC)
is also known as 2-D parity check. A block of bits is divided into
a table or matrix of rows and columns.
In order to detect an error, a redundant row of parity bits is added to the
block, and the whole block is transmitted to the receiver. The receiver uses
this redundant row to detect errors. After checking the data for
errors, the receiver accepts the data and discards the redundant row
of bits.
Example:
If a block of 32 bits is to be transmitted, it is divided into a matrix of
four rows and eight columns, as shown in the following
figure:
In this matrix of bits, a parity bit (odd or even) is calculated for each column. This
means that 32 data bits plus 8 redundant bits are transmitted to the receiver. When the
data reaches the destination, the receiver uses the LRC to detect errors in the data.
Example: Suppose the 32-bit data plus LRC being transmitted is hit by a
burst error of length 5 and some bits are corrupted, as shown in the following
figure:
The LRC that the destination recomputes from the corrupted data does not match the LRC it received.
The destination therefore knows that the data is erroneous and discards it.
Disadvantage:
The main problem with LRC is that it cannot detect an error
if two bits in one data unit are damaged and two bits in exactly
the same positions in another data unit are also damaged.
Example: data 110011 010101 is changed to
010010 110100.
Figure: Two bits at the same bit positions damaged in 2 data units
In this example, the 1st and 6th bits of the first data unit are changed,
and the 1st and 6th bits of the second data unit are also changed.
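The column-parity computation and the blind spot described above can be sketched as follows. Even parity is assumed, and the four 8-bit rows are illustrative data, since the original figure is not reproduced here.

```python
def lrc(rows: list[str]) -> str:
    """Return the even-parity bit of every column of equal-length rows."""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    return "".join(
        "1" if sum(int(row[i]) for row in rows) % 2 else "0"
        for i in range(width)
    )


# 32 data bits arranged as four rows of eight columns (illustrative values)
block = ["11100111", "11011101", "00111001", "10101001"]
print(lrc(block))  # 10101010, the redundant row sent with the block

# The undetected case from the text: bits 1 and 6 flipped in both units
original = ["110011", "010101"]
damaged = ["010010", "110100"]
print(lrc(original) == lrc(damaged))  # True, so the error is missed
```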
Checksum
Checksum is an error detection method. The sender divides
the data into segments of equal size, adds the segments using 1's
complement arithmetic, and transmits the complemented sum with
the data. The receiver repeats the process, and a result of all
zeros indicates that the data is correct.
In a checksum scheme, the data is first divided into k segments
of m bits each.
At the sender's side, all segments are added
using 1's complement arithmetic to find the sum. The sum is then
complemented to obtain the checksum.
The checksum segment is transmitted along with the data
segments.
At the receiver's side, all received segments are added
using 1's complement arithmetic to find the sum, and the sum
is then complemented.
The received data is accepted only if the result is
0. If the result is not 0, the data is discarded.
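The sender and receiver steps above can be sketched with m-bit integer segments. The segment values in the example are illustrative and do not come from the slides; the function names are likewise hypothetical.

```python
def ones_complement_sum(segments: list[int], m: int) -> int:
    """Add m-bit segments, wrapping any carry back into the low bits."""
    mask = (1 << m) - 1
    total = 0
    for seg in segments:
        total += seg
        total = (total & mask) + (total >> m)  # end-around carry
    return total


def make_checksum(segments: list[int], m: int) -> int:
    """Sender: complement of the 1's complement sum."""
    return ~ones_complement_sum(segments, m) & ((1 << m) - 1)


def verify(received: list[int], m: int) -> bool:
    """Receiver: sum data and checksum, complement, accept if zero."""
    return (~ones_complement_sum(received, m) & ((1 << m) - 1)) == 0


data = [0b10011001, 0b11100010, 0b00100100, 0b10000100]  # k = 4, m = 8
checksum = make_checksum(data, 8)
print(format(checksum, "08b"))       # 11011010
print(verify(data + [checksum], 8))  # True
```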
Checksum: Example
Disadvantages: A checksum does not
detect an error if one sub-unit
of the data has one or more
corrupted bits and the corresponding bits
of opposite value are also
corrupted in another sub-unit. The error is
not detected because
the sum of the columns is not
affected by the corrupted bits.
Cyclic Redundancy Check-CRC
The checksum scheme uses addition, but CRC uses
binary division. A bit sequence, commonly known as the cyclic
redundancy check bits, is appended to the end of the data unit so that
the resulting data unit is exactly divisible by a second,
predetermined binary number.
At the receiver's side, the incoming data unit is divided
by the same number. The data unit is accepted as
correct only if the remainder of this division
is zero. A nonzero remainder shows that the data is not correct, so it
must be discarded.
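The division step can be sketched with modulo-2 (XOR) long division on bit strings. The data word 100100 and divisor 1101 are illustrative values chosen for this sketch, not taken from the slides, and the function names are hypothetical.

```python
def mod2_remainder(bits: str, divisor: str) -> str:
    """Remainder of modulo-2 division of bits by divisor (both bit strings)."""
    if len(divisor) < 2 or divisor[0] != "1":
        raise ValueError("divisor must start with 1 and have at least 2 bits")
    work = [int(b) for b in bits]
    div = [int(b) for b in divisor]
    for i in range(len(work) - len(div) + 1):
        if work[i]:  # subtract (XOR) the divisor wherever the lead bit is 1
            for j, d in enumerate(div):
                work[i + j] ^= d
    return "".join(str(b) for b in work[-(len(div) - 1):])


def crc_encode(data: str, divisor: str) -> str:
    """Sender: append the remainder of data followed by len(divisor)-1 zeros."""
    padded = data + "0" * (len(divisor) - 1)
    return data + mod2_remainder(padded, divisor)


def crc_check(codeword: str, divisor: str) -> bool:
    """Receiver: accept only if the remainder is all zeros."""
    return "1" not in mod2_remainder(codeword, divisor)


codeword = crc_encode("100100", "1101")
print(codeword)                        # 100100001
print(crc_check(codeword, "1101"))     # True
print(crc_check("100101001", "1101"))  # False: one bit was corrupted
```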
Disadvantages: Cyclic Redundancy Check may lead to overflow
of data.
Cyclic Redundancy Check-CRC-Examples
Error Correction
Error Correction codes are used to detect and correct the errors
when data is transmitted from the sender to the receiver.
Error Correction can be handled in two ways:
Backward error correction: Once the error is discovered, the
receiver requests the sender to retransmit the entire data unit.
Forward error correction: In this case, the receiver uses the
error-correcting code which automatically corrects the errors.
A single additional bit can detect the error, but cannot correct
it.
To correct an error, one has to know the exact position of
the error. For example, to correct a single-bit
error in a 7-bit data unit, the error correction code must
determine which one of the seven bits is in error.
To achieve this, we have to add some additional redundant
bits.
Suppose r is the number of redundant bits and d is the total number of
data bits. The number of redundant bits r can be calculated by
using the formula:
2^r >= d + r + 1
The value of r is the smallest value that satisfies this relation. For example, if
the value of d is 4, then the smallest value that satisfies the
relation is 3.
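The smallest r satisfying 2^r >= d + r + 1 can be found by simple search. This is a minimal sketch; the function name `redundant_bits` is chosen here for illustration.

```python
def redundant_bits(d: int) -> int:
    """Smallest r such that 2**r >= d + r + 1."""
    r = 0
    while 2 ** r < d + r + 1:
        r += 1
    return r


for d in (4, 7, 11):
    print(d, "data bits need", redundant_bits(d), "redundant bits")
```

For d = 4 this gives r = 3, so the transmitted unit has 7 bits.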
Hamming code is used to determine the position of the bit that is in error.
It can be applied to data units of any length and uses the relationship
between data bits and redundant bits.
Hamming Code
Parity bits: The bit which is appended to the original data of binary
bits so that the total number of 1s is even or odd.
Even parity: if the total number of 1s is even, then the value of the
parity bit is 0. If the total number of 1s occurrences is odd, then the
value of the parity bit is 1.
Odd Parity: if the total number of 1s is even, then the value of parity
bit is 1. If the total number of 1s is odd, then the value of parity bit is 0.
Algorithm of Hamming code:
The 'd' information bits are combined with the 'r' redundant bits to form a unit of d+r bits.
The location of each of the (d+r) digits is assigned a decimal value.
The 'r' bits are placed in positions 1, 2, ..., 2^(k-1).
At the receiving end, the parity bits are recalculated. The decimal value
of the parity bits determines the position of an error.
Relationship b/w Error position & binary number.
Let's understand the concept of Hamming code through an example:
Suppose the original data to be sent is 1010.
Total number of data bits 'd' = 4. Number of redundant bits r:
2^r >= d + r + 1, so 2^r >= 4 + r + 1. Therefore, r = 3 is the smallest value that satisfies
the above relation. Total number of bits = d + r = 4 + 3 = 7.
Determining the position of the redundant bits
The number of redundant bits is 3. The three bits are represented
by r1, r2, r4. The positions of the redundant bits
correspond to powers of 2. Therefore, their
positions are 2^0, 2^1, 2^2, that is, 1, 2, 4.
The position of r1 = 1
The position of r2 = 2
The position of r4 = 4
Representation of Data on the addition of parity bits:
Determining the Parity bits: Determining the r1 bit
The r1 bit is calculated by performing a parity check on the bit
positions whose binary representation includes 1 in the first
position.
We observe from the above figure that the bit positions that include 1
in the first position are 1, 3, 5, 7. Now, we perform the even-parity
check at these bit positions. The total number of 1s at these bit positions
corresponding to r1 is even; therefore, the value of the r1 bit is 0.
Determining r2 bit
The r2 bit is calculated by performing a parity check on the bit positions
whose binary representation includes 1 in the second position.
We observe from the above figure that the bit positions that include 1
in the second position are 2, 3, 6, 7. Now, we perform the even-parity
check at these bit positions. The total number of 1s at these bit positions
corresponding to r2 is odd; therefore, the value of the r2 bit is 1.
Determining r4 bit
The r4 bit is calculated by performing a parity check on the bit
positions whose binary representation includes 1 in the third
position.
We observe from the above figure that the bit positions that
include 1 in the third position are 4, 5, 6, 7. Now, we perform the
even-parity check at these bit positions. The total number of 1s at
these bit positions corresponding to r4 is even; therefore, the
value of the r4 bit is 0.
Data transferred is given below:
Suppose the 4th bit is changed from 0 to 1 at the receiving end;
the parity bits are then recalculated.
R1 bit: The bit positions of the r1 bit are 1,3,5,7
We observe from the above figure that the bits at these positions read
1100. Performing the even-parity check, the total number of 1s
is even. Therefore, the value of r1 is 0.
R2 bit: The bit positions of the r2 bit are 2,3,6,7.
We observe from the above figure that the bits at these positions read
1001. Performing the even-parity check, the total number of 1s
is even. Therefore, the value of r2 is 0.
R4 bit: The bit positions of the r4 bit are 4,5,6,7.
We observe from the above figure that the bits at these
positions read 1011. Performing the even-parity check,
the total number of 1s is odd.
Therefore, the value of r4 is 1.
The binary representation of the redundant bits, i.e., r4r2r1, is 100, and its
corresponding decimal value is 4. Therefore, the error is in the 4th bit
position.
The bit value at the 4th position must be changed from 1 to 0 to
correct the error.
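The worked example can be reproduced with a short sketch of a 7-bit Hamming code using even parity. One assumption is made because the figures are not reproduced here: positions are numbered 1 to 7 from the right, and the data bits 1010 occupy positions 7, 6, 5 and 3. This placement is consistent with the parity values r1 = 0, r2 = 1, r4 = 0 derived in the text. The function names are hypothetical.

```python
def hamming74_encode(data: str) -> str:
    """Encode 4 data bits given in the order of positions 7, 6, 5, 3.
    The result is written with position 7 first and position 1 last."""
    pos = [0] * 8  # index 0 unused; pos[i] is the bit at position i
    pos[7], pos[6], pos[5], pos[3] = (int(b) for b in data)
    pos[1] = pos[3] ^ pos[5] ^ pos[7]  # r1 covers positions 1, 3, 5, 7
    pos[2] = pos[3] ^ pos[6] ^ pos[7]  # r2 covers positions 2, 3, 6, 7
    pos[4] = pos[5] ^ pos[6] ^ pos[7]  # r4 covers positions 4, 5, 6, 7
    return "".join(str(pos[i]) for i in range(7, 0, -1))


def hamming74_correct(codeword: str) -> tuple[str, int]:
    """Return (corrected codeword, error position); position 0 means no error."""
    pos = [0] + [int(b) for b in reversed(codeword)]
    c1 = pos[1] ^ pos[3] ^ pos[5] ^ pos[7]
    c2 = pos[2] ^ pos[3] ^ pos[6] ^ pos[7]
    c4 = pos[4] ^ pos[5] ^ pos[6] ^ pos[7]
    error_pos = 4 * c4 + 2 * c2 + c1  # r4 r2 r1 read as a binary number
    if error_pos:
        pos[error_pos] ^= 1           # flip the faulty bit back
    return "".join(str(pos[i]) for i in range(7, 0, -1)), error_pos


sent = hamming74_encode("1010")
print(sent)                          # 1010010
print(hamming74_correct("1011010"))  # ('1010010', 4): bit 4 was flipped
```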
Assignment
Assignment for G6
Implement VRC and LRC using C with appropriate menu
driven program
Assignment for G7
Implement checksum and CRC using C with appropriate
menu driven program
Assignment for G8
Briefly discuss Encapsulation, Decapsulation and the
roles of each of the OSI model layers. Also, list
networking devices, equipment and common protocols
in each layer.
Exam:Dec 10