Random signals
Jan Černocký
UPGM
FIT VUT Brno, cernocky@fit.vutbr.cz
Random signals
Deterministic signals (those that can be described by an equation) have one substantial drawback: they carry very little information (for example, a cosine is fully described by its amplitude, frequency and phase shift).
Real-world signals are hard to describe deterministically (for example, a physical model of speech production is very complex and still only a simplification).
In signal theory, we therefore treat such useful signals as random signals (processes), for example speech, the flow of letters through Czech Post offices, or the CZK/EUR exchange rate.
According to the character of the time axis, random signals are divided into continuous-time random signals (defined for all t) and discrete-time random signals (defined only for discrete n).
These signals cannot be specified exactly at all time points (if they could, they would be deterministic). Instead, we look for characteristic properties of random signals, such as the mean value, the probability density function, etc.
Definition of random process
Continuous time: the system $\{\xi_t\}$ of random variables defined for all $t \in \mathbb{R}$ is called a random process and is denoted $\xi(t)$.
Discrete time: the system $\{\xi_n\}$ of random variables defined for all $n \in \mathbb{N}$ is called a random process and is denoted $\xi[n]$.
Set of realizations of random process
The set of realizations contains infinitely many possible runs of the random process, called its realizations. We will limit ourselves to a finite number $\Omega$ of them and denote each realization $\xi_\omega(t)$ or $\xi_\omega[n]$, with $\omega = 1, \ldots, \Omega$. Parameters estimated over this set are called ensemble estimates.
Example: the random signal is a recording of water flowing through the water pipe in my flat. $\Omega = 1068$ realizations, each 20 ms long, were recorded. To demonstrate continuous-time random signals, we will treat this set as $\xi_\omega(t)$; for discrete-time random signals, as $\xi_\omega[n]$.
[Figure: realizations $\xi_\omega(t)$ for $\omega = 1, 200, 500, 1000$, plotted against time $t$ (0 to about 0.018 s); amplitudes between about $-0.2$ and $0.2$.]
[Figure: realizations $\xi_\omega[n]$ for $\omega = 1, 200, 500, 1000$, plotted against sample index $n$ (0 to about 300); amplitudes between about $-0.2$ and $0.2$.]
Distribution function
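The distribution function is defined for a single random variable; the random process at a given time $t$ or $n$ is such a random variable. Definition:
$$F(x, t) = P\{\xi(t) < x\}, \qquad F(x, n) = P\{\xi[n] < x\},$$
where $P\{\xi(t) < x\}$ or $P\{\xi[n] < x\}$ is the probability that the random variable at the given time is smaller than $x$. Note that $x$ is not random; it is an auxiliary variable.
Ensemble estimate of the distribution function: we fix a time $t$ or $n$ and take all $\Omega$ realizations. For a given $x$ we estimate:
$$\hat{F}(x, t) = \frac{1}{\Omega} \sum_{\omega=1}^{\Omega} \begin{cases} 1 & \text{if } \xi_\omega(t) < x \\ 0 & \text{otherwise} \end{cases}$$
$$\hat{F}(x, n) = \frac{1}{\Omega} \sum_{\omega=1}^{\Omega} \begin{cases} 1 & \text{if } \xi_\omega[n] < x \\ 0 & \text{otherwise} \end{cases}$$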
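The ensemble estimate of the distribution function amounts to counting, at one fixed time, how many realizations fall below $x$. A minimal NumPy sketch follows. The function name `ensemble_cdf`, the array layout (one realization per row) and the synthetic Gaussian ensemble standing in for the water recordings are all assumptions made for illustration.

```python
import numpy as np

def ensemble_cdf(xi, n, x):
    """Estimate F(x, n): the fraction of realizations with xi[omega, n] < x."""
    samples = xi[:, n]             # one value per realization at time index n
    return np.mean(samples < x)    # (1 / Omega) * sum of the 0/1 indicator

# Hypothetical stand-in for the recorded ensemble (NOT the real data):
# Omega = 1068 realizations of 320 samples of Gaussian noise.
rng = np.random.default_rng(0)
xi = 0.13 * rng.standard_normal((1068, 320))

for x in (-0.3, 0.0, 0.3):
    print(x, ensemble_cdf(xi, 50, x))
```

Evaluating the function on a grid of $x$ values gives a curve that rises from 0 to 1, like the estimates plotted for the water signal.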
[Figure: estimated distribution functions $F(x, t)$ for $t$ = 0.1 ms, 3.1 ms, 6.3 ms and 9.4 ms, plotted against $x$.]
[Figure: estimated distribution functions $F(x, n)$ for $n$ = 1, 50, 100 and 150, plotted against $x$.]
Probability density function
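The probability density function (PDF) is again defined for a single random variable (the random process at a given time $t$ or $n$ is such a random variable). Definition:
$$p(x, t) = \frac{\partial F(x, t)}{\partial x}, \qquad p(x, n) = \frac{\partial F(x, n)}{\partial x}$$
Ensemble estimate of the probability density function: it can be obtained by numerical differentiation of the estimated $F(x, t)$ or $F(x, n)$, or it can be estimated with a histogram:
Choose a time $t$ or $n$.
Choose $L$ values of $x$ from $x_{min}$ to $x_{max}$ with a regular step $\Delta = \frac{x_{max} - x_{min}}{L - 1}$:
$$x_1 = x_{min},\; x_2 = x_{min} + \Delta,\; x_3 = x_{min} + 2\Delta,\; \ldots,\; x_{L-1} = x_{min} + (L-2)\Delta,\; x_L = x_{min} + (L-1)\Delta = x_{max}$$
This gives $L$ bins ("cages") of width $\Delta$; for $x_i$, the bin $ch_i$ extends from $x_i - \frac{\Delta}{2}$ to $x_i + \frac{\Delta}{2}$. The left edge of the left-most bin (1) is stretched to $-\infty$, the right edge of the right-most bin ($L$) to $+\infty$.
The estimate for $x_i$ is computed as:
$$\hat{p}(x_i, t) = \frac{1}{\Omega \Delta} \sum_{\omega=1}^{\Omega} \begin{cases} 1 & \text{if } \xi_\omega(t) \in ch_i \\ 0 & \text{otherwise} \end{cases}$$
$$\hat{p}(x_i, n) = \frac{1}{\Omega \Delta} \sum_{\omega=1}^{\Omega} \begin{cases} 1 & \text{if } \xi_\omega[n] \in ch_i \\ 0 & \text{otherwise} \end{cases}$$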
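The histogram recipe can be sketched in a few lines of NumPy. The function name `ensemble_pdf`, the row-per-realization layout and the synthetic Gaussian ensemble are assumptions for illustration only. Each sample is assigned to its nearest bin center, and clipping the index plays the role of stretching the two outer bins to infinity.

```python
import numpy as np

def ensemble_pdf(xi, n, x_min, x_max, L):
    """Histogram estimate of p(x_i, n) on L bin centers from x_min to x_max."""
    delta = (x_max - x_min) / (L - 1)            # regular step between centers
    centers = x_min + delta * np.arange(L)       # x_1 ... x_L
    samples = xi[:, n]                           # one value per realization
    # Nearest-center index; clipping sends outliers to the first or last bin.
    idx = np.clip(np.rint((samples - x_min) / delta).astype(int), 0, L - 1)
    counts = np.bincount(idx, minlength=L)
    return centers, counts / (samples.size * delta)   # divide by Omega * Delta

# Hypothetical stand-in for the recorded ensemble (NOT the real data).
rng = np.random.default_rng(0)
xi = 0.13 * rng.standard_normal((1068, 320))
centers, p_hat = ensemble_pdf(xi, 50, -0.5, 0.5, 21)
delta = centers[1] - centers[0]
print(np.sum(p_hat) * delta)   # area under the estimate, close to 1
```

Dividing the counts by $\Omega \Delta$ rather than by $\Omega$ alone is what makes the result a density: the bin heights times the bin width sum to one.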
[Figure: histogram estimates of $p(x, t)$ for $t$ = 0.1 ms, 3.1 ms, 6.3 ms and 9.4 ms, plotted against $x$.]
[Figure: histogram estimates of $p(x, n)$ for $n$ = 1, 50, 100 and 150, plotted against $x$.]
F(x, t), p(x, t) and probabilities
Our task is to determine the probability that the value of the random process at time $t$ or $n$ lies in the interval $[a, b]$. It can be obtained in two ways (shown only for continuous time):
From the distribution function, keeping in mind its definition $F(x, t) = P\{\xi(t) < x\}$:
$$P\{a < \xi(t) < b\} = F(b, t) - F(a, t)$$
From the probability density function, since densities can be integrated:
$$P\{a < \xi(t) < b\} = \int_a^b p(x, t)\,dx$$
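Both routes to the interval probability can be compared numerically. The sketch below uses hypothetical Gaussian samples in place of the values $\xi_\omega(t)$ at one fixed time; the names `F`, `prob_from_F` and `prob_from_p` and the choice of 30 bins are illustrative assumptions.

```python
import numpy as np

# Hypothetical values of xi_omega(t) at one fixed time t (NOT the real data).
rng = np.random.default_rng(1)
samples = 0.13 * rng.standard_normal(1068)
a, b = -0.1, 0.2

def F(x):
    """Ensemble estimate of the distribution function at the fixed time."""
    return np.mean(samples < x)

# Way 1: difference of the estimated distribution function.
prob_from_F = F(b) - F(a)

# Way 2: integrate a histogram estimate of p(x, t) over [a, b].
edges = np.linspace(a, b, 31)                  # 30 bins covering [a, b]
counts, _ = np.histogram(samples, bins=edges)
delta = edges[1] - edges[0]
p_hat = counts / (samples.size * delta)        # density estimate per bin
prob_from_p = np.sum(p_hat * delta)            # rectangle-rule integral

print(prob_from_F, prob_from_p)
```

Both numbers are the fraction of samples falling between $a$ and $b$, so they agree up to rounding.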
Properties of the distribution function and the probability density function:
No value of the random process can be smaller than $-\infty$, therefore $F(-\infty, t) = P\{\xi(t) < -\infty\} = 0$.
All values of the random process are smaller than $+\infty$, therefore $F(+\infty, t) = P\{\xi(t) < +\infty\} = 1$.
The probability density function is the derivative of the distribution function; the inverse relation is given by an integral:
$$F(x, t) = \int_{-\infty}^{x} p(g, t)\,dg$$
As $F(+\infty, t) = 1$, $p(x, t)$ must satisfy:
$$\int_{-\infty}^{+\infty} p(x, t)\,dx = 1$$
The value of the probability density function for a given $x$ is not a probability!
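To see why a density value is not a probability, consider a hypothetical random variable distributed uniformly on $[0, 0.5]$ (an example chosen here for illustration, not taken from the recordings). Its density equals 2 on that interval, a value no probability can take, yet it still integrates to 1. A short sketch:

```python
import numpy as np

x = np.linspace(0.0, 0.5, 501)     # grid covering the support [0, 0.5]
p = np.full_like(x, 2.0)           # uniform density: 1 / (0.5 - 0.0) = 2
dx = x[1] - x[0]

area = np.sum(p[:-1]) * dx         # rectangle-rule integral over the support
i_a, i_b = 100, 200                # grid indices of x = 0.1 and x = 0.2
prob = np.sum(p[i_a:i_b]) * dx     # probability of falling in [0.1, 0.2]

print(p.max(), area, prob)
```

Only the integral of the density over an interval is a probability; here the interval of width 0.1 gives 0.2.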
Illustration of F(x, t), p(x, t): a beer keg
A beer keg is being drunk from $x$ = 18.00 to 22.00. We define the function $p(x)$ as the instantaneous beer consumption (the drinking function) and $F(x)$ as the amount of beer drunk so far (note that $x$ is time in this example).
The behavior is similar to that of the PDF and the distribution function; just replace 1 with 1 keg:
$F(x)$ is zero at time $-\infty$ (in fact it stays zero until 18.00), as no beer has been drunk by then.
$F(x)$ is 1 keg at time $+\infty$ (actually already at 22.00), as all the beer has been drunk and there will be no more.
Amount of beer drunk by time $x$: $F(x) = \int_{-\infty}^{x} p(g)\,dg$.
Total amount of beer drunk: $F(+\infty) = \int_{-\infty}^{+\infty} p(x)\,dx = 1$ keg.
The amount of beer drunk from time $x_1$ to $x_2$ can be computed either as the difference of two points on $F(x)$ or by integrating $p(x)$.
A single value of $p(x)$ (for example $p(19.00)$) can NOT be called an amount of beer drunk: no one can drink anything in an infinitely short time.
Moments
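In contrast to functions, moments are single values characterizing the random process at time $t$ or $n$.
The mean value, or expectation, is the 1st moment:
$$a(t) = E\{\xi(t)\} = \int_{-\infty}^{+\infty} x\,p(x, t)\,dx$$
$$a[n] = E\{\xi[n]\} = \int_{-\infty}^{+\infty} x\,p(x, n)\,dx$$
Ensemble estimate of the mean value: for each time $t$ or $n$, the estimate is simply the average of the samples over all realizations:
$$\hat{a}(t) = \frac{1}{\Omega} \sum_{\omega=1}^{\Omega} \xi_\omega(t)$$
$$\hat{a}[n] = \frac{1}{\Omega} \sum_{\omega=1}^{\Omega} \xi_\omega[n]$$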
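With the realizations stored as rows of an array, the ensemble estimate of the mean is an average down each column. The sketch below assumes that layout; the function name `ensemble_mean` and the synthetic zero-mean ensemble are illustrative, not the recorded data.

```python
import numpy as np

def ensemble_mean(xi):
    """Return a_hat[n]: the average over all realizations for each time index."""
    return xi.mean(axis=0)          # axis 0 runs over omega = 1 ... Omega

# Hypothetical stand-in for the recorded ensemble (NOT the real data).
rng = np.random.default_rng(2)
xi = 0.13 * rng.standard_normal((1068, 320))

a_hat = ensemble_mean(xi)
print(a_hat.shape, np.abs(a_hat).max())
```

The result has one value per time index, so it can be plotted against $n$ in the same way as the estimate for the water signal.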
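For our signals:
[Figure: ensemble estimate of the mean value $\hat{a}(t)$ plotted against $t$ (0 to about 0.018 s); vertical scale $\times 10^{-3}$.]
[Figure: ensemble estimate of the mean value $\hat{a}[n]$ plotted against $n$ (0 to about 300); vertical scale $\times 10^{-3}$.]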
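Variance (dispersion) and standard deviation:
$$D(t) = E\{[\xi(t) - a(t)]^2\} = \int_{-\infty}^{+\infty} [x - a(t)]^2\,p(x, t)\,dx$$
$$D[n] = E\{[\xi[n] - a[n]]^2\} = \int_{-\infty}^{+\infty} [x - a[n]]^2\,p(x, n)\,dx$$
The standard deviation (std) is the square root of the variance:
$$\sigma(t) = \sqrt{D(t)}, \qquad \sigma[n] = \sqrt{D[n]}$$
Ensemble estimates of the variance and std, for each $t$ or $n$:
$$\hat{D}(t) = \frac{1}{\Omega} \sum_{\omega=1}^{\Omega} [\xi_\omega(t) - \hat{a}(t)]^2, \qquad \hat{\sigma}(t) = \sqrt{\hat{D}(t)}$$
$$\hat{D}[n] = \frac{1}{\Omega} \sum_{\omega=1}^{\Omega} [\xi_\omega[n] - \hat{a}[n]]^2, \qquad \hat{\sigma}[n] = \sqrt{\hat{D}[n]}$$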
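The ensemble variance and std can be sketched the same way as the mean, again assuming one realization per row. The function name `ensemble_variance` and the synthetic data are illustrative assumptions. Note that the sum is divided by $\Omega$, not by $\Omega - 1$, which is also what NumPy's `np.var` does by default.

```python
import numpy as np

def ensemble_variance(xi):
    """Return D_hat[n]: mean squared deviation from a_hat[n] over realizations."""
    a_hat = xi.mean(axis=0)                      # ensemble mean per time index
    return np.mean((xi - a_hat) ** 2, axis=0)    # divides by Omega

# Hypothetical stand-in for the recorded ensemble (NOT the real data).
rng = np.random.default_rng(2)
xi = 0.13 * rng.standard_normal((1068, 320))

D_hat = ensemble_variance(xi)
sigma_hat = np.sqrt(D_hat)                       # standard deviation per time index
print(sigma_hat.min(), sigma_hat.max())
```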
For our signals:
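[Figure: ensemble estimate of the standard deviation $\hat{\sigma}(t)$ plotted against $t$ (0 to about 0.018 s); values between about 0.125 and 0.14.]
[Figure: ensemble estimate of the standard deviation $\hat{\sigma}[n]$ plotted against $n$ (0 to about 300); values between about 0.125 and 0.14.]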
Correlation function
The correlation function quantifies the similarity between the values of the random process at times $t_1$ (or $n_1$) and $t_2$ (or $n_2$):
$$R(t_1, t_2) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x_1 x_2\,p(x_1, x_2, t_1, t_2)\,dx_1\,dx_2,$$
$$R(n_1, n_2) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x_1 x_2\,p(x_1, x_2, n_1, n_2)\,dx_1\,dx_2,$$
where $p(x_1, x_2, t_1, t_2)$, resp. $p(x_1, x_2, n_1, n_2)$, is the two-dimensional probability density function between times $t_1$ and $t_2$, resp. $n_1$ and $n_2$. In theory, it can be computed from the two-dimensional distribution function:
$$F(x_1, x_2, t_1, t_2) = P\{\xi(t_1) < x_1 \text{ and } \xi(t_2) < x_2\},$$
$$F(x_1, x_2, n_1, n_2) = P\{\xi[n_1] < x_1 \text{ and } \xi[n_2] < x_2\}$$
by differentiating with respect to $x_1$ and $x_2$:
$$p(x_1, x_2, t_1, t_2) = \frac{\partial^2 F(x_1, x_2, t_1, t_2)}{\partial x_1\,\partial x_2}$$
$$p(x_1, x_2, n_1, n_2) = \frac{\partial^2 F(x_1, x_2, n_1, n_2)}{\partial x_1\,\partial x_2}$$
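We will rather be interested in the ensemble estimate using a 2D histogram:
As for the 1D histogram, define bins ("cages"), now in the form of squares: $ch_{ij}$ extends in the first dimension from $x_i - \frac{\Delta}{2}$ to $x_i + \frac{\Delta}{2}$ and in the second dimension from $x_j - \frac{\Delta}{2}$ to $x_j + \frac{\Delta}{2}$.
The value of the 2D histogram for the bin $ch_{ij}$ with values $x_i$ and $x_j$ is:
$$\hat{p}(x_i, x_j, t_1, t_2) = \frac{1}{\Omega \Delta^2} \sum_{\omega=1}^{\Omega} \begin{cases} 1 & \text{if } \xi_\omega(t_1), \xi_\omega(t_2) \in ch_{ij} \\ 0 & \text{otherwise} \end{cases}$$
$$\hat{p}(x_i, x_j, n_1, n_2) = \frac{1}{\Omega \Delta^2} \sum_{\omega=1}^{\Omega} \begin{cases} 1 & \text{if } \xi_\omega[n_1], \xi_\omega[n_2] \in ch_{ij} \\ 0 & \text{otherwise} \end{cases}$$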
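The 2D histogram and the numerical integration that turns it into a correlation value can be sketched as follows. The function names `ensemble_pdf_2d` and `correlation_from_pdf`, the nearest-center binning and the synthetic two-column ensemble are assumptions made for illustration. Since $R(n_1, n_2)$ is the expectation of the product $\xi[n_1]\,\xi[n_2]$, the plain ensemble average of products is printed as a cross-check.

```python
import numpy as np

def ensemble_pdf_2d(xi, n1, n2, x_min, x_max, L):
    """2D histogram estimate of p(x_i, x_j, n1, n2) on an L x L grid of bins."""
    delta = (x_max - x_min) / (L - 1)
    centers = x_min + delta * np.arange(L)
    # Nearest-center bin index in each dimension; outliers go to the edge bins.
    i = np.clip(np.rint((xi[:, n1] - x_min) / delta).astype(int), 0, L - 1)
    j = np.clip(np.rint((xi[:, n2] - x_min) / delta).astype(int), 0, L - 1)
    counts = np.zeros((L, L))
    np.add.at(counts, (i, j), 1)                 # one count per realization
    return centers, counts / (xi.shape[0] * delta ** 2)

def correlation_from_pdf(centers, p2d):
    """Numerically integrate x1 * x2 * p(x1, x2) over the grid."""
    delta = centers[1] - centers[0]
    x1, x2 = np.meshgrid(centers, centers, indexing="ij")
    return np.sum(x1 * x2 * p2d) * delta ** 2

# Hypothetical ensemble with two correlated time instants (NOT the real data).
rng = np.random.default_rng(3)
base = 0.13 * rng.standard_normal(5000)
xi = np.column_stack([base, 0.8 * base + 0.05 * rng.standard_normal(5000)])

centers, p2d = ensemble_pdf_2d(xi, 0, 1, -0.5, 0.5, 51)
R01 = correlation_from_pdf(centers, p2d)
print(R01, np.mean(xi[:, 0] * xi[:, 1]))         # histogram route vs direct average
```

The two printed values differ only by the quantization of the samples to bin centers, which shrinks as $\Delta$ gets smaller.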
For our signals (discrete time only): $n_1 = 0$, $n_2 = 0, 1, 5, 11$.
[Figure: four 2D histogram estimates of $p(x_1, x_2, n_1, n_2)$, one for each pair $(n_1, n_2)$; both axes $x_1$ and $x_2$ run from $-0.4$ to $0.4$.]
For $n_1 = 0$ and $n_2 = 0, 1, 5, 11$, the following values were obtained by numerical integration:
$R(0, 0) = 0.0188$: the process is very similar to itself (the same point).
$R(0, 1) = 0.0151$: shifted by one sample, the process is still very similar to its value at $n_1 = 0$.
$R(0, 5) = 0.0030$: the values at $n_1 = 0$ and $n_2 = 5$ are hardly similar at all.
$R(0, 11) = -0.0133$: at times $n_1 = 0$ and $n_2 = 11$ the process is similar to itself, but with inverted sign! It is probable that if the value $\xi[n_1]$ is positive, $\xi[n_2]$ will be negative, and vice versa.
Correlation function for $n_1 = 0$, $n_2 = n_1 + k$, $k = 0 \ldots 40$, compared with $n_1 = 100$, $n_2 = n_1 + k$, $k = 0 \ldots 40$:
[Figure: two panels, $R(0, k)$ and $R(100, 100 + k)$, plotted against $k$ from 0 to 40; values between about $-0.015$ and $0.02$.]
STATIONARITY OF RANDOM PROCESS
Simply said, the behavior of a stationary random process does not change with time. Its functions and moments do not depend on time $t$ or $n$. The correlation function does not depend on the precise values of $t_1$, $t_2$ or $n_1$, $n_2$, but only on their difference: $\tau = t_2 - t_1$, $k = n_2 - n_1$. For a stationary continuous-time signal:
$$F(x, t) \to F(x), \quad p(x, t) \to p(x), \quad a(t) \to a, \quad D(t) \to D, \quad \sigma(t) \to \sigma$$
$$p(x_1, x_2, t_1, t_2) \to p(x_1, x_2, \tau), \quad R(t_1, t_2) \to R(\tau)$$
Similarly for discrete time:
$$F(x, n) \to F(x), \quad p(x, n) \to p(x), \quad a[n] \to a, \quad D[n] \to D, \quad \sigma[n] \to \sigma$$
$$p(x_1, x_2, n_1, n_2) \to p(x_1, x_2, k), \quad R(n_1, n_2) \to R(k)$$
The water-flow example signal is evidently stationary, since:
The mean value was similar at all times. With more realizations, it would be even more similar.
The same holds for the standard deviation.
The correlation functions for $n_1 = 0$, $n_2 = n_1 + k$ and for $n_1 = 100$, $n_2 = n_1 + k$ were similar.
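The same kind of check can be sketched on synthetic data. Everything in the block is an assumption for illustration: the function name `ensemble_correlation`, the first-order autoregressive process used as a stationary stand-in for the recordings, and the shortcut of estimating $R(n_1, n_2)$ directly as the ensemble average of products $\xi_\omega[n_1]\,\xi_\omega[n_2]$ instead of integrating a 2D histogram.

```python
import numpy as np

def ensemble_correlation(xi, n1, k_max):
    """R(n1, n1 + k) for k = 0 ... k_max as the ensemble average of products."""
    return np.array([np.mean(xi[:, n1] * xi[:, n1 + k]) for k in range(k_max + 1)])

# Hypothetical stationary process: x[n] = c * x[n-1] + white noise.
rng = np.random.default_rng(4)
omega_count, n_samples, c = 1068, 320, 0.8
xi = np.empty((omega_count, n_samples))
# Start with the stationary variance 1 / (1 - c^2), so there is no transient.
xi[:, 0] = rng.standard_normal(omega_count) / np.sqrt(1 - c ** 2)
for n in range(1, n_samples):
    xi[:, n] = c * xi[:, n - 1] + rng.standard_normal(omega_count)

R_0 = ensemble_correlation(xi, 0, 40)       # n1 = 0
R_100 = ensemble_correlation(xi, 100, 40)   # n1 = 100
print(np.max(np.abs(R_0 - R_100)))          # small compared with R_0[0]
```

For a stationary process the two curves differ only by estimation noise, which decreases as the number of realizations grows.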
Stationary vs. non-stationary signal:
[Figure: two panels showing realizations plotted against $t$, one for a stationary and one for a non-stationary signal.]
ERGODICITY OF RANDOM PROCESS
For an ergodic random process, all parameters can be estimated from a single realization.
Example of a stationary and ergodic random process and of a stationary but non-ergodic random process:
[Figure: two panels showing realizations plotted against $t$ (amplitudes from $-2$ to $2$), one for each case.]
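All ensemble estimates can be replaced by temporal estimates. We have at our disposal an interval of length $T$ (continuous time) or $N$ samples (discrete time). The only realization we have is denoted simply $x(t)$, resp. $x[n]$.
Histograms can again be used to estimate the distribution function and the PDF.
Mean value, variance, std:
$$\hat{a} = \frac{1}{T} \int_0^T x(t)\,dt, \qquad \hat{a} = \frac{1}{N} \sum_{n=0}^{N-1} x[n]$$
$$\hat{D} = \frac{1}{T} \int_0^T [x(t) - \hat{a}]^2\,dt, \qquad \hat{D} = \frac{1}{N} \sum_{n=0}^{N-1} [x[n] - \hat{a}]^2$$
$$\hat{\sigma} = \sqrt{\hat{D}}$$
Correlation function:
$$\hat{R}(\tau) = \frac{1}{T} \int_0^T x(t)\,x(t + \tau)\,dt, \qquad \hat{R}(k) = \frac{1}{N} \sum_{n=0}^{N-1} x[n]\,x[n + k]$$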
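The discrete-time temporal estimates can be sketched for a single realization as follows. The function name `temporal_estimates` is hypothetical. One assumption has to be made explicit: the sum for $\hat{R}(k)$ reaches samples $x[n + k]$ beyond the end of the record, and the sketch treats them as zero, so only the $N - k$ available products are summed while the division is still by $N$.

```python
import numpy as np

def temporal_estimates(x, k_max):
    """Mean, variance, std and correlation R_hat[0..k_max] from one realization."""
    N = x.size
    a_hat = np.sum(x) / N
    D_hat = np.sum((x - a_hat) ** 2) / N
    sigma_hat = np.sqrt(D_hat)
    # Assumption: samples past the end of the record count as zero.
    R_hat = np.array([np.sum(x[:N - k] * x[k:]) / N for k in range(k_max + 1)])
    return a_hat, D_hat, sigma_hat, R_hat

# Tiny worked example that can be checked by hand.
x = np.array([1.0, 2.0, 3.0, 4.0])
a_hat, D_hat, sigma_hat, R_hat = temporal_estimates(x, 1)
print(a_hat, D_hat, R_hat)   # 2.5, 1.25, [7.5, 5.0]
```

For an ergodic process, these values approach the ensemble estimates as the record gets longer.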