UNIT V Finite Word Length Effects Lecture Notes Modified
Syllabus:
Quantization noise – Derivation for quantization noise power – Fixed point and binary floating point
number representation – Comparison – Overflow error – Truncation error – Coefficient quantization
error – Limit cycle oscillation – Signal scaling – Analytical model of sample and hold operations –
Application of DSP – Model of speech waveform – Vocoder.
Minimal Coverage:
Quantization noise – Derivation for quantization noise power
Fixed point and binary floating point number representation – Comparison –
Overflow error – Truncation error –
Coefficient quantization error –
Limit cycle oscillation – Signal scaling.
TEXT BOOK
1. John G. Proakis and Dimitris G. Manolakis, "Digital Signal Processing: Principles, Algorithms and
Applications", 3rd Edition, PHI/Pearson Education, 2000.
REFERENCES
1. Sanjit K. Mitra, "Digital Signal Processing: A Computer-Based Approach", 2nd Edition, Tata
McGraw-Hill, 2001. (Covers much of this unit)
2. Alan V. Oppenheim, Ronald W. Schafer and John R. Buck, "Discrete-Time Signal Processing", 2nd
Edition, PHI/Pearson Education, 2000.
3. Johnny R. Johnson, "Introduction to Digital Signal Processing", Prentice Hall of India/Pearson
Education, 2002.
4. http://www-inst.eecs.berkeley.edu/~cs61c/sp06/handout/fixedpt.html
5. Alan V. Oppenheim and Ronald W. Schafer, "Digital Signal Processing". (Many good ideas in this
unit are taken from this book; note that it is a different book from the one listed in Reference 2.)
A positive fraction is represented as X = 0.b_1 b_2 … b_B = Σ_{i=1}^{B} b_i · 2^(−i).
Negative fractions (fixed point numbers) can be represented in three ways:
Sign Magnitude Representation: X_SM = 1.b_1 b_2 … b_B (the leading 1 denotes a negative fraction; the magnitude bits are unchanged)
1's Complement Representation: X_1C = 1.b_1' b_2' … b_B' (every magnitude bit b_i is replaced by its complement b_i')
2's Complement Representation: X_2C = 1.b_1' b_2' … b_B' + 0.00…01 (the 1's complement plus 1 at the least significant bit)
The representation of a positive fraction is the same in all three forms.
Example:
1. Express the fractions 7/8 and −7/8 in sign magnitude, one's complement and two's complement format.
Solution:
7/8 = 0.875 = 0.1110(2)
0.875 × 2 = 1.75 → 1
0.75 × 2 = 1.5 → 1
0.5 × 2 = 1.0 → 1
0 × 2 = 0 → 0
Sign Magnitude:  7/8 = 0.111,  −7/8 = 1.111
1's Complement:  7/8 = 0.111,  −7/8 = 1.000
2's Complement:  7/8 = 0.111,  −7/8 = 1.001
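The conversions above can be reproduced with a short Python sketch (not from the textbook; the function names to_fixed_bits and represent are my own, and B = 3 fraction bits is assumed as in the example):

def to_fixed_bits(frac, B=3):
    """Return the B fractional bits of a non-negative fraction 0 <= frac < 1."""
    bits = []
    for _ in range(B):
        frac *= 2
        bit = int(frac)          # 1 if the doubled value reached 1.0, otherwise 0
        bits.append(str(bit))
        frac -= bit
    return "".join(bits)

def represent(x, B=3):
    """Sign-magnitude, 1's complement and 2's complement strings for |x| < 1."""
    mag = to_fixed_bits(abs(x), B)
    if x >= 0:
        return ("0." + mag,) * 3                      # all three forms agree for positives
    sm = "1." + mag                                   # sign bit 1, magnitude unchanged
    flipped = "".join("1" if b == "0" else "0" for b in mag)
    ones = "1." + flipped                             # complement every magnitude bit
    twos_val = (int(flipped, 2) + 1) % (1 << B)       # 1's complement plus 1 at the LSB
    twos = "1." + format(twos_val, "0{}b".format(B))
    return sm, ones, twos

print(represent(7 / 8))      # ('0.111', '0.111', '0.111')
print(represent(-7 / 8))     # ('1.111', '1.000', '1.001')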
Note:
When a fractional number is typed into a computer there is no explicit decimal or binary point stored; the machine simply interprets the bits with the point assumed at a fixed position. For example, 10.011 is stored as the bit pattern 10011 with the binary point implied after the second bit.
Range of positive fraction: 0 ≤ X ≤ 1 − 2^(−B)
Range of 2's complement representation: −1 ≤ X ≤ 1 − 2^(−(B−1))
(Since the 2's complement form represents both positive and negative fractions)
Dynamic range = X_MAX − X_MIN
Resolution = Dynamic range / (2^B − 1) → the spacing between consecutive numbers in the series
Example:
If B = 2, then considering 2's complement representation,
Range: −1 ≤ X ≤ 0.5
Dynamic range = 0.5 − (−1) = 1.5
Resolution = 1.5 / (2^2 − 1) = 1.5 / 3 = 0.5
The four codes are (11) (10) (01) (00), i.e. −0.5, −1, 0.5 and 0.
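As an illustrative check of these formulas (not part of the original notes; the variable names are mine), the following Python sketch enumerates all B-bit two's complement fraction codes and recomputes the dynamic range and resolution:

def twos_complement_values(B):
    """Values of all B-bit two's complement fractions (1 sign bit + B-1 fraction bits)."""
    values = []
    for code in range(1 << B):
        bits = format(code, "0{}b".format(B))
        sign = int(bits[0])                               # the sign bit carries weight -1
        frac = sum(int(b) * 2 ** -(i + 1) for i, b in enumerate(bits[1:]))
        values.append(frac - sign)
    return sorted(values)

B = 2
vals = twos_complement_values(B)                          # [-1.0, -0.5, 0.0, 0.5]
dynamic_range = max(vals) - min(vals)                     # 1.5
resolution = dynamic_range / (2 ** B - 1)                 # 0.5, the spacing between codes
print(vals, dynamic_range, resolution)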
Binary Floating Point Representation:
X = M · 2^E
M: mantissa, E: exponent
Example: 0.125 → 0.001 × 2^0 → 0.100 × 2^(−2) → 0.100 × 2^(110) (the exponent −2 is stored in 3-bit sign-magnitude form as 110)
Floating Point Arithmetic:
A big number is written as a mantissa times a positive power of two (e.g. 101 = 0.101 × 2^(+3)), and a small number as a mantissa times a negative power of two (e.g. 0.001 = 0.100 × 2^(−2)).
Floating Point Multiplication:
Multiply the mantissas.
Add the exponents.
Correct the binary point so that the mantissa starts with a leading 1 (normalization).
Example:
X1 = 5 = 101 = 0.101 × 2^3 = 0.101 × 2^(011)
X2 = 3/8 = 0.375 = 0.011 = 0.110 × 2^(−1) = 0.110 × 2^(101) (the exponent −1 stored in sign-magnitude form is 101)
X1 × X2 = (M1 × M2) × 2^(E1 + E2) = (0.101 × 0.110) × 2^(011 − 001) = 0.011110 × 2^(010)
= 0.011110 × 2^(010) = 0.111100 × 2^(001) after normalization
(Check: 0.111100 × 2^1 = 1.1110(2) = 1.875 = 5 × 0.375)
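A minimal Python sketch of the multiply-the-mantissas / add-the-exponents rule (my own illustration; it uses a Python float for the mantissa and a plain integer for the exponent instead of the bit-level sign-magnitude coding above):

def fp_multiply(m1, e1, m2, e2):
    """Multiply two numbers given as (mantissa, exponent) pairs, value = m * 2**e."""
    m, e = m1 * m2, e1 + e2           # multiply the mantissas, add the exponents
    while 0 < abs(m) < 0.5:           # normalize so that 0.5 <= |m| < 1
        m, e = m * 2, e - 1
    return m, e

# X1 = 5 = 0.625 * 2**3 (0.101 x 2^011), X2 = 0.375 = 0.75 * 2**-1 (0.110 x 2^101)
m, e = fp_multiply(0.625, 3, 0.75, -1)
print(m, e, m * 2 ** e)               # 0.9375 1 1.875  (0.9375 = 0.1111 in binary)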
Floating Point Addition:
If the exponents of the two numbers are the same, add the mantissas exactly as in fixed point addition.
If the exponents are different, shift the mantissa of the smaller number (increasing its exponent) until the exponents are equal. See below:
Taking the previous example:
X1 + X2 = 0.101 × 2^(011) + 0.110 × 2^(101) = 0.101 × 2^(011) + 0.110 × 2^(−001)
Shifting the smaller number to equate the exponents:
0.110 × 2^(−1) → 0.0110 × 2^(0) → 0.00110 × 2^(1) → 0.000110 × 2^(2) → 0.0000110 × 2^(3)
X1 + X2 = 0.101 × 2^(011) + 0.0000110 × 2^(011)
Now that the exponents are the same, the mantissas can be added:
X1 + X2 = 0.1010110 × 2^(011)
(Check: 0.1010110 × 2^3 = 101.0110(2) = 5.375 = 5 + 0.375)
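A similar sketch for floating point addition (again an illustration, not from the text): the mantissa of the smaller number is shifted right, with its exponent increased, until both exponents match, and then the mantissas are added:

def fp_add(m1, e1, m2, e2):
    """Add two numbers given as (mantissa, exponent) pairs, value = m * 2**e."""
    if e1 < e2:                        # make (m1, e1) the operand with the larger exponent
        m1, e1, m2, e2 = m2, e2, m1, e1
    while e2 < e1:                     # shift the smaller number right to equate exponents
        m2, e2 = m2 / 2, e2 + 1
    return m1 + m2, e1                 # exponents equal: add mantissas like fixed point

m, e = fp_add(0.625, 3, 0.75, -1)      # 5 + 0.375
print(m, e, m * 2 ** e)                # 0.671875 3 5.375  (0.671875 = 0.1010110 in binary)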
Comparison of Floating Point and Fixed Point Representation:
1. Fixed point – Advantage: faster computation; fixed range and precision.
   Floating point – Disadvantage: relatively slower computation; variable range and precision.
2. Fixed point – Disadvantage: poor dynamic range; uniform resolution throughout the number range.
   Floating point – Advantage: larger dynamic range; fine resolution for small numbers but coarser resolution for large numbers.
3. Fixed point – Example: 0.375 → 0.011, −0.375 → 1.101 (2's complement).
   Floating point – Example: 0.375 → 0.011 × 2^0 = 0.11 × 2^(−1) = 0.11 × 2^(101); −0.375 → 1.011 × 2^0 = 1.11 × 2^(−1) = 1.11 × 2^(101).
When two positive numbers are added, the sign bit of the result may become one, making the result look like a negative number. This is actually due to overflow. To handle overflow, when the result exceeds the representable range it should be replaced by the maximum representable number (saturation).
Overflow also occurs when two numbers (fixed point or floating point) are multiplied: the product of two b-bit numbers needs 2b bits and therefore cannot be stored in a b-bit memory register. In that case the long result must be shortened to b bits, either by truncation or by rounding (this shortening is also called quantization).
For fractional numbers, the 2b-bit product can be quantized back to b bits without much loss of accuracy.
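The following Python sketch (illustrative only; truncate and round_q are hypothetical helper names) quantizes a double-length fractional product back to b bits by truncation and by rounding:

import math

def truncate(x, b):
    """Keep only b fractional bits by discarding the rest (truncation)."""
    return math.floor(x * 2 ** b) / 2 ** b

def round_q(x, b):
    """Round to the nearest b-bit fraction (rounding)."""
    return round(x * 2 ** b) / 2 ** b

b = 3
x1, x2 = 0.625, 0.375                # each exactly representable with b = 3 fraction bits
product = x1 * x2                    # 0.234375 needs 2b = 6 fraction bits
print(truncate(product, b))          # 0.125  (error -0.109375, within (-2**-b, 0])
print(round_q(product, b))           # 0.25   (error +0.015625, within +/- 2**-b / 2)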
Fig: 5.1 Quantization errors (scan from A V O)
Range of Truncation Error in Floating Point: (Hint: multiply the fixed point error range by 2)
Positive number: −2·2^(−b) ≤ e ≤ 0
Sign-magnitude represented negative number: 0 ≤ e ≤ 2·2^(−b)
1's complement represented negative number: 0 ≤ e ≤ 2·2^(−b)
2's complement represented negative number: −2·2^(−b) ≤ e ≤ 0
Range of Rounding Error in Floating Point:
−2·(2^(−b)/2) ≤ e ≤ 2·(2^(−b)/2), i.e. −2^(−b) ≤ e ≤ 2^(−b)
Fig: 5.2 a) Sampler (nonlinear model) and b) Additive noise model for quantization (statistical model)
Fig: 5.3 Probability density functions for a) rounding and b) truncation (assumed)
(Statistical Characterization of Quantization errors)
Mean or average noise: m_e = E(e) = ∫_{−∞}^{∞} e · p_e(e) de
σ_e² = noise variance = noise power
σ_e² = E((e − m_e)²) = E(e²) = ∫_{−∞}^{∞} e² · p_e(e) de (the second equality holds because m_e = 0 for rounding)
Mean and Variance of Rounding Noise:
For rounding, the error is uniformly distributed over (−Δ/2, +Δ/2), so p_e(e) = 1/Δ, where Δ = 2^(−b) is the quantization step.
m_e = E(e) = ∫_{−Δ/2}^{+Δ/2} e · (1/Δ) de = (1/Δ) [e²/2]_{−Δ/2}^{+Δ/2} = (1/2Δ)(Δ²/4 − Δ²/4) = 0
Noise Power: σ_e² = ∫_{−Δ/2}^{+Δ/2} e² · (1/Δ) de = (1/Δ) [e³/3]_{−Δ/2}^{+Δ/2} = (1/3Δ)(Δ³/8 + Δ³/8) = (1/3Δ)(2Δ³/8) = Δ²/12 = 2^(−2b)/12
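The Δ²/12 result can be checked numerically. This sketch (my own, assuming a b-bit rounding quantizer and uniformly distributed inputs) measures the average error power and compares it with 2^(−2b)/12:

import random

def quantize_round(x, b):
    """Round x to the nearest multiple of the quantization step 2**-b."""
    step = 2 ** -b
    return round(x / step) * step

b = 8
random.seed(0)
errors = [quantize_round(x, b) - x for x in (random.uniform(-1, 1) for _ in range(100000))]
measured = sum(e * e for e in errors) / len(errors)     # average error power
theory = 2 ** (-2 * b) / 12                             # Delta^2 / 12 with Delta = 2**-b
print(measured, theory)                                 # both approximately 1.27e-06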
Mean and Variance of Truncation Noise:
Proceeding similarly (the truncation error is uniformly distributed over one quantization interval of width Δ, but with a nonzero mean),
m_e = −Δ/2 = −2^(−b)/2 ;  σ_e² = 2^(−2b)/12
Output Quantization Noise Power:
Whenever quantized signals are processed by a digital system (like a filter), the input error (or noise)
manifests itself as an error (or noise) in the resulting output.
Actual input to the system: x(n) + e(n) (statistical modeling)
Actual output of the system: y(n) + f(n)
Ignoring the input signal x(n) and considering only e(n) as the input, the output noise power can be calculated.
[Block diagram: x(n) → System → y(n), and e(n) → System → f(n)]
Mean of the output = E[y(n) + f(n)] = E{ Σ_{k=−∞}^{+∞} h(k) x(n−k) + Σ_{k=−∞}^{+∞} h(k) e(n−k) }
Ignoring the unquantized input x(n), since we need to find only the effect of quantization noise on the output:
m_f = E[f(n)] = E[ Σ_{k=−∞}^{+∞} h(k) e(n−k) ] = Σ_{k=−∞}^{+∞} h(k) E[e(n−k)] = Σ_{k=−∞}^{+∞} h(k) m_e
m_f = m_e Σ_{n=−∞}^{∞} h(n) = m_e Σ_{n=−∞}^{∞} h(n) e^(j·0·n) = m_e H(e^(j0))
(Since H(e^(jω)) = Σ_{n=−∞}^{∞} h(n) e^(−jωn))
Similarly,
Output Noise Power: σ_f² = σ_e² Σ_{n=−∞}^{∞} |h(n)|² = (σ_e²/2π) ∫_{−π}^{π} |H(e^(jω))|² dω = (σ_e²/2πj) ∮ H(z) H(z^(−1)) z^(−1) dz
using Parseval's relation: Σ_{n=−∞}^{∞} |h(n)|² = (1/2πj) ∮ H(z) H(z^(−1)) z^(−1) dz
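The time-domain form of this result, σ_f² = σ_e² Σ|h(n)|², is easy to evaluate numerically. A small sketch (the impulse response values below are arbitrary examples, not from the notes):

def output_noise_power(h, sigma_e2):
    """sigma_f^2 = sigma_e^2 * sum of |h(n)|^2 over the impulse response."""
    return sigma_e2 * sum(abs(hn) ** 2 for hn in h)

b = 8
sigma_e2 = 2 ** (-2 * b) / 12            # quantization noise power at the input
h = [0.5, 0.3, 0.2]                      # example impulse response (arbitrary values)
print(output_noise_power(h, sigma_e2))   # 0.38 * sigma_e2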
[Block diagram: x(n) → Filter (quantized coefficients) → y(n)]
Fig: 5.4 b Limit cycles in a filter with quantized coefficients
For the same filter, implemented with finite register length arithmetic (quantization), the output may decay to a non-zero amplitude and then keep oscillating. This effect is often referred to as zero-input limit cycle behaviour. (It is caused by the nonlinear quantizers in the feedback loop of the filter.)
Example:
[Block diagram: first-order filter with the quantizer Q[·] in the feedback path]
y(n) = x(n) + Q[α y(n−1)]
With α = 0.5 and an input sample of 7/8:
y(0) = 7/8 + Q(0.5 × 0) = 0.875 = 0.111(2)
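Continuing this example, the sketch below (my own simulation; it assumes the input is a single impulse of amplitude 7/8 and that products are rounded to b = 3 bits) shows the output getting stuck at 0.125 = 2^(−3) instead of decaying to zero, which is the zero-input limit cycle:

import math

def Q(x, b=3):
    """Round x to the nearest multiple of 2**-b, rounding halves upward (for x >= 0)."""
    step = 2 ** -b
    return math.floor(x / step + 0.5) * step

alpha, y_prev = 0.5, 0.0
x = [7 / 8] + [0.0] * 7                  # assumed input: a single impulse of amplitude 7/8
for n, xn in enumerate(x):
    y = xn + Q(alpha * y_prev)           # y(n) = x(n) + Q[alpha * y(n-1)]
    print(n, y)
    y_prev = y
# Output: 0.875, 0.5, 0.25, 0.125, 0.125, 0.125, ...  (stuck at 0.125: the limit cycle)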
[Block diagram: first-order IIR filter with the roundoff noise e(n) added at the adder input]
Fig: 5.6 Statistical model for fixed point roundoff noise in a first-order IIR filter
Assumptions about the noise:
e(n) is a white noise sequence
e(n) has a uniform distribution over one quantization interval
Roundoff noise error range: −(1/2)·2^(−b) ≤ e(n) ≤ (1/2)·2^(−b)
m_e = 0
Input noise power: σ_e² = 2^(−2b)/12
Output noise power: σ_f² = σ_e² Σ_{n=−∞}^{∞} |h(n)|²
For this first-order filter, h(n) = α^n u(n), so
σ_f² = σ_e² Σ_{n=−∞}^{∞} α^(2n) (u(n))² = σ_e² Σ_{n=0}^{∞} α^(2n)
Since Σ_{n=0}^{∞} b^n = 1/(1−b),
σ_f² = σ_e² · 1/(1−α²) = (2^(−2b)/12) · 1/(1−α²)
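As a quick numerical check of this closed form (assuming α = 0.5 and b = 8 purely for illustration), the truncated sum Σ α^(2n) should agree with 1/(1 − α²):

b, alpha = 8, 0.5                                       # assumed word length and coefficient
sigma_e2 = 2 ** (-2 * b) / 12
gain_sum = sum(alpha ** (2 * n) for n in range(100))    # truncated sum of alpha^(2n)
gain_closed = 1 / (1 - alpha ** 2)                      # closed form 1/(1 - alpha^2) = 4/3
print(sigma_e2 * gain_sum, sigma_e2 * gain_closed)      # both about 1.7e-06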
Overflow Oscillations:
Sometimes, due to overflow, the output reaches the maximum representable value, and the limit cycle then oscillates with this maximum amplitude. This is called an overflow oscillation.
Signal Scaling:
If x_max denotes the maximum of the absolute value of the input, then
|y_k(n)| ≤ Σ_{r=−∞}^{+∞} |h_k(r)| · x_max, i.e. |y_k(n)| ≤ x_max Σ_{r=−∞}^{+∞} |h_k(r)|
To guarantee that no overflow occurs (|y_k(n)| < 1), the input must satisfy
x_max < 1 / Σ_{r=−∞}^{+∞} |h_k(r)|
For an FIR filter, this reduces to:
x_max < 1 / Σ_{r=0}^{M−1} |h_k(r)|
In most cases, scaling of the input (replacing x by a·x, where a < 1) chosen using the above equation is required to guarantee that no overflow occurs.
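A sketch of how such a scale factor might be chosen for an FIR filter (the coefficients below are hypothetical, chosen only for illustration):

h = [0.4, 0.9, -0.6, 0.3]                # hypothetical FIR impulse response
x_max = 1.0                              # largest absolute value of the input samples

gain = sum(abs(hr) for hr in h)          # sum over r of |h(r)| = 2.2
a = 0.99 / (gain * x_max)                # scale factor kept strictly below the bound
print(a)                                 # ~0.45; the scaled input a*x(n) cannot cause overflow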
When M such roundoff noise sources add directly at the filter output (as in a direct-form FIR filter with M multipliers), the total output noise power is:
σ_f² = M · σ_e² = M · 2^(−2b)/12
Applications of DSP:
Noise Cancellation
Echo Cancellation and generation
Speech signal processing
Image Processing