Computer Communications
Lecture 4
Fundamental Limits and Basic Information Theory
CS3 Computer Communications, Copyright University of Edinburgh 2004
Supplementary Reading
Andrew Tanenbaum, Computer Networks (4/e), Pearson Education, 2003
Section 2.1
Alberto Leon-Garcia and Indra Widjaja, Communication Networks:
Fundamental Concepts and Key Architectures, McGraw-Hill, 2003
Sections 3.1-3.4
William Stallings, Data and Computer Communications (7/e), Pearson
Education, 2004
Chapter 3
Handout on Discrete Information Theory
Also available online at
http://www.inf.ed.ac.uk/teaching/courses/com/handouts/extra/theory.pdf
Gordon Brebner, Computers in Communication, McGraw-Hill, 1997
Chapter 2, pages 46-48
Free PDF download available online at http://www.dcs.ed.ac.uk/home/gordon/c-inc/book.pdf
C. E. Shannon, A Mathematical Theory of Communication, The Bell System
Technical Journal, Vol. 27, pp. 379-423, 623-656, July, October, 1948.
Local copy available at
http://www.inf.ed.ac.uk/teaching/courses/com/handouts/extra/shannon-1948.pdf
Based on Slides from Prof. Nigel Topham
Digital versus Analog
Digital: discrete, taking one of a small number of possible values
Analog: continuous over time
Compared to analog transmission, digital transmission is more efficient, robust and flexible.
In the computer communications world, digital transmission is sometimes called baseband and analog transmission broadband.
But now broadband often just means high speed, and another term, narrowband, is often used just to mean low speed.
The Theoretical Basis for Data Communication
Fourier Analysis
Any periodic signal with period T can be approximated by the sum of a series (called a Fourier series) of sine and cosine waves at integer multiples of the fundamental frequency f = 1/T (the standard form of this series is shown after this list)
Bandwidth-Limited Signals
Maximum Data Rate of a Channel
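For reference, the standard form of the Fourier series mentioned above (the expansion itself does not appear on the original slide) is:
g(t) = (1/2) c + sum over n = 1..infinity of [ a_n sin(2 pi n f t) + b_n cos(2 pi n f t) ]
where f = 1/T is the fundamental frequency, a_n and b_n are the sine and cosine amplitudes of the n-th harmonic, and c is the constant (DC) component.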
Bandwidth-Limited Signals
[Figure: (a) A binary signal and its root-mean-square Fourier amplitudes. (b)-(c) Successive approximations to the original signal.]
Bandwidth-Limited Signals (2)
[Figure: (d)-(e) Successive approximations to the original signal.]
Bandwidth-Limited Signals (3)
[Figure: Relation between data rate and harmonics.]
Information Theory
How to determine the capacity of a noisy channel
How to predict the theoretical limits on compression
How to measure the quantity of information in a message
Nyquist's Intersymbol Interference Theorem
Discovered in the 1920s by Harry Nyquist
[Photo: Harry Nyquist (1889-1976).]
Sampling at twice the bandwidth is sufficient to reconstruct the original signal
For binary signalling over a channel of bandwidth B, the maximum data rate is 2B
If the transmission system uses K distinct amplitude values to represent encoded values, then the maximum data rate is:
D = 2 B log2 K
Images courtesy of http://www.geocities.com/bioelectrochemistry/nyquist.htm
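A minimal sketch of the Nyquist limit above, assuming nothing beyond the formula D = 2 B log2 K (the bandwidth and level counts in the example are illustrative, not from the lecture):

```python
import math

def nyquist_max_rate(bandwidth_hz: float, levels: int) -> float:
    """Maximum data rate in bits/second of a noiseless channel of the
    given bandwidth when K = `levels` distinct signal levels are used."""
    return 2 * bandwidth_hz * math.log2(levels)

# Binary signalling (K = 2) over a 3 kHz channel: 2 * 3000 * 1 = 6000 bps
print(nyquist_max_rate(3000, 2))   # 6000.0
# Eight levels carry 3 bits per symbol over the same channel: 18000 bps
print(nyquist_max_rate(3000, 8))   # 18000.0
```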
Shannon's Theorem
Extended Nyquist's work in 1948 to include the effect of noise
The theorem states:
C = B log2 (1 + S/N)
Where
C = channel capacity (bits / second)
B = hardware bandwidth (Hz)
S = average signal power
N = average noise power
S/N is often simply represented as the signal-to-noise ratio (SNR)
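A minimal sketch of this formula in code (the bandwidth and SNR in the example are illustrative, not from the lecture):

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Channel capacity in bits/second, with S/N given as a linear power ratio."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Illustrative: a 1 MHz channel with S/N = 63 (about 18 dB) tops out at 6 Mbps
print(shannon_capacity(1e6, 63))   # 6000000.0
```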
SNR backgrounder
Signal-to-Noise Ratio = signal power / noise power
Normally quoted in decibels (dB), a logarithmic scale:
dB = 10 * log10 (signal power / noise power)
E.g. signal power = 10000, noise power = 1:
SNR = 10 * log10 (10000) = 40 dB
This is slightly better than the normal SNR for POTS
Halving the signal power gives a -3 dB ratio w.r.t. the original power, since 10 * log10 (1/2) = -3.01 (i.e. -3 dB = half power)
[Figure: dB as a function of the S/N ratio, plotted for ratios up to 10000.]
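A minimal sketch of the dB conversion above:

```python
import math

def snr_to_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio expressed in decibels."""
    return 10 * math.log10(signal_power / noise_power)

print(snr_to_db(10000, 1))          # 40.0  -> the example above
print(round(snr_to_db(1, 2), 2))    # -3.01 -> halving the power costs ~3 dB
```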
Shannon capacity of a telephone channel
The basic bandwidth of a telephone channel is 3400 Hz
Hence B = 3.4 kHz
Consider a channel with SNR = 10,000 (i.e. 40 dB)
C = 3400 log2 (1 + 10000) = 45.2 kbps
In principle, modems are therefore limited to 45.2 kbps
So how can modems claim 56 kbps?
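A minimal sketch checking this figure, assuming only the values given above:

```python
import math

bandwidth_hz = 3400
snr_db = 40
snr_linear = 10 ** (snr_db / 10)          # 40 dB -> a power ratio of 10000

capacity_bps = bandwidth_hz * math.log2(1 + snr_linear)
print(round(capacity_bps / 1000, 1))      # 45.2 (kbps)
```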
Discrete information theory
A brief look at this topic is useful
A more detailed printed note has been handed out
Yields some general results about the nature of information
And also about the transformations that can be applied to it
Founded by Claude Shannon in 1948
The notion of a code is central
A transformation is a mapping from strings in one alphabet to strings in another (possibly the same) alphabet. Done before communication it is an encoding; done after, it is a decoding. The set of strings used is called a code.
Discrete information theory
Three particular types of transformation:
Compression: to reduce the quantity of data
Redundancy: to introduce error-resilience over error-prone channels
Encryption: for secure information transfer
The theory gives absolute limits on the extent to which such transformations can be done
Transformations themselves are covered in later lectures
Information Theory: An Introduction
Suppose a transmitter could transmit one of 3 symbols: A, B or C
The receiver has uncertainty about what it will receive next
Uncertainty depends on the number of symbols
It also depends on the probability of each symbol appearing
We could say uncertainty = 3 symbols
But consider what happens if A, B or C is followed by X or Y
Now there are six possible sequences: AX, AY, BX, BY, CX, CY
Uncertainty has grown to six (combinations of) symbols
If uncertainty = information, then it appears to grow multiplicatively
This is a counter-intuitive measure of information
Shannon's key concept:
Uncertainty is proportional to the log of the number of possible states
In our example above, uncertainty = log(3) + log(2) = log(6)
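A minimal sketch of this additivity (the symbol names are simply those from the example above):

```python
import math

first, second = ["A", "B", "C"], ["X", "Y"]
sequences = [a + b for a in first for b in second]
print(len(sequences))                       # 6 possible sequences

# Log-based uncertainty adds where the number of states multiplies
print(math.log2(3) + math.log2(2))          # ~2.585 bits
print(math.log2(len(first) * len(second)))  # ~2.585 bits (the same)
```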
Shannon's Measure of Uncertainty
Uncertainty at the receiver = log2(M)
Where M = number of distinct symbols
This assumes each symbol is transmitted with equal probability
What if the symbols are not equally probable?
Let P = 1/M be the probability of each symbol being transmitted
log2(M) = -log2(M^-1) = -log2(1/M) = -log2(P)
Now let each Pi be distinct, with the sum of all Pi = 1, for symbols i = 1..M
Let the surprisal of each symbol be defined by ui = -log2(Pi)
Arrival of an improbable symbol is very surprising: as Pi → 0, ui → ∞
Arrival of a certain symbol is totally unsurprising: if Pi = 1 then ui = 0
Uncertainty is defined as the average surprisal per symbol over an infinitely long sequence of symbols, and it is always the receiver that is uncertain.
Example
A transmitter has three green symbols and one red symbol to send to a receiver
(Base of logs: base 2 gives bits, base e gives nats or nits, base 10 gives digits)
Hence, the symbol probabilities are: PG = 0.75, PR = 0.25
If a green symbol is received:
Surprisal = -log2 PG = -log2 (0.75) = 0.415 bits of information received
If a red symbol is received:
Surprisal = -log2 PR = -log2 (0.25) = 2.0 bits of information received
The average surprisal from receiving these four symbols is:
((3 * 0.415) + (1 * 2.0)) / 4 = PG uG + PR uR
Before any symbol is received, uncertainty = -(0.75 log2 (0.75) + 0.25 log2 (0.25)) = 0.811 bits/symbol
In general, therefore: receiver uncertainty H = -Σ Pi log2 Pi
Quantity of information transmitted = quantity of uncertainty removed at the receiver
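A minimal sketch of the surprisal and entropy calculation for this green/red example:

```python
import math

probabilities = {"green": 0.75, "red": 0.25}

# Surprisal of each symbol: u_i = -log2(P_i)
surprisal = {sym: -math.log2(p) for sym, p in probabilities.items()}
print(round(surprisal["green"], 3))   # 0.415 bits
print(round(surprisal["red"], 3))     # 2.0 bits

# Entropy = average surprisal: H = -sum(P_i * log2(P_i))
entropy = sum(p * surprisal[sym] for sym, p in probabilities.items())
print(round(entropy, 3))              # 0.811 bits per symbol
```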
Shannons Uncertainty Function
H(x) is shown here as a function of the probability P of one of two symbols being transmitted; the probability of the other is 1 - P
[Figure: Shannon's uncertainty function H(x) plotted against the probability of one symbol, rising from 0 at P = 0 to a maximum of 1 bit at P = 0.5 and falling back to 0 at P = 1.]
Equivalent to the entropy of a Bernoulli trial as a function of the probability of success
Maximum entropy when the symbols are equally likely
Minimum entropy when one symbol has probability close to 1 (or 0)
The entropy rate of a data source is the average number of bits needed to encode it
English text has an entropy of 1.1-1.6 bits per character, depending on the text
Hbefore - Hafter = R (the reduction in uncertainty)
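A minimal sketch of the binary entropy function plotted above:

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-symbol source where one symbol has probability p."""
    if p in (0.0, 1.0):
        return 0.0                          # a certain outcome carries no uncertainty
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))                  # 1.0   -> maximum: symbols equally likely
print(round(binary_entropy(0.75), 3))       # 0.811 -> the green/red example
print(binary_entropy(1.0))                  # 0.0   -> minimum: no uncertainty at all
```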
Entropy - results
For compression: Source Coding Theorem
If the entropy of a data type is e, then e transmitted bits, but no fewer, are enough to communicate each value.
This limits compression in error-free transmission.
Example: the data type is the alphabet [A..Z]
Assume probabilities as for English text, e.g. E is 0.1305, Z is 0.0009
The entropy of individual letters treated as individual symbols is approximately 4.15 bits
This is the number of bits required to communicate one symbol of this type
In practice a code of length 4.15 bits is not possible
But 100 symbols on average could be coded in 415 bits
The source coding theorem relies on encoding a large block of values
So the entropy value is the average compression per individual encoded value
Note: the best compression of The Three Musketeers, Anne of Green Gables and The 1995 CIA World Fact Book in November 2001 yielded an entropy measurement of 1.30 bits/byte for English text. This used higher-order models in which multi-letter sequences can be coded as independent symbols, and hence achieves better than 4.15 bits/character.
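A minimal sketch of the block-coding idea: the entropy of the symbol distribution gives the average number of bits per symbol needed over a large block. The four-symbol distribution below is purely illustrative (the full English letter table, which gives roughly 4.15 bits per letter, is not reproduced here):

```python
import math

def entropy_bits(probabilities):
    """Entropy in bits per symbol of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Illustrative (hypothetical) four-symbol source, not real English letter data
probs = [0.5, 0.25, 0.125, 0.125]
h = entropy_bits(probs)
print(h)                          # 1.75 bits per symbol
print(math.ceil(100 * h))         # 175 bits suffice on average for a block of 100 symbols
```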
Entropy - results
For error correction: Channel Encoding Theorem
Based on Shannon's concept of channel capacity (C)
C = B log2 (1 + SNR) bits / second
B = channel bandwidth
SNR = signal-to-noise ratio
Assumes line noise has a Gaussian distribution [1]
For any channel, this is the maximum reduction in entropy achievable by communication
Theorem:
If the channel capacity is c, then each value transmitted can communicate c bits of information, and no more
This limits redundancy for error-free communication
[1] See Section 3.4.2, Leon-Garcia & Widjaja, Communication Networks: Fundamental Concepts and Key Architectures, McGraw-Hill, 2003