Cambridge Books Online
http://ebooks.cambridge.org/
Mathematical Finance
Chapter
      Let us consider a situation where we select randomly and independently n samples from
      a population whose distribution has mean μ_X and variance σ_X^2. The set of such samples,
      denoted as (x_1, x_2, ..., x_n), is referred to as a random sample of size n. Random
      samples are an important foundation of statistical theory, because a majority of the results
      known in mathematical statistics rely on assumptions that are consistent with a random
      sample. Let us start with a simple question: How can we estimate the population mean
      μ_X and population variance σ_X^2 from the random sample (x_1, x_2, ..., x_n)?
         The sample mean (also called the empirical average) x̄ is defined as
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i. \tag{6.1} \]
      The observed value x̄ is an instance of the random variable
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i. \tag{6.2} \]
      In fact the term “sample mean” is often used in statistical theory to describe the variable
      X̄, but the quantity we can actually observe is its instance x̄.
                Downloaded from Cambridge Books Online by IP 109.171.137.60 on Sun Aug 21 20:02:24 BST 2016.
                                      http://dx.doi.org/10.1017/CBO9780511977770.007
                                 Cambridge Books Online © Cambridge University Press, 2016
6.1 Sample mean and sample variance                                                                                         139
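The expectation of the sample mean variable is
\[ E[\bar{X}] = \frac{1}{n} \sum_{i=1}^{n} E[X_i]. \tag{6.3} \]
Since E[X_i] = μ_X for every i, we have
\[ E[\bar{X}] = \mu_X, \tag{6.4} \]
and the deviation of the sample mean variable from its expectation can be written as
\[ \bar{X} - E[\bar{X}] = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu_X) = \frac{1}{n} \sum_{i=1}^{n} Y_i, \tag{6.6} \]
where
\[ Y_i \triangleq X_i - \mu_X, \quad i = 1, 2, \ldots, n. \]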
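Therefore,
\[ \mathrm{Var}[\bar{X}] = E\left[\left(\frac{1}{n} \sum_{i=1}^{n} Y_i\right)^{2}\right] = \frac{1}{n^2} \sum_{i=1}^{n} E[Y_i^2] + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} E[Y_i Y_j]. \tag{6.7} \]
Since the random variables {Y_i; 1 ≤ i ≤ n} are statistically independent with zero mean
and variance σ_X^2, we have
\[ \mathrm{Var}[\bar{X}] = \frac{\sigma_X^2}{n}. \tag{6.8} \]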
Thus, the variance of the sample mean variable is the population variance divided by
the sample size.
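
The result (6.8) can be checked exactly on a small example. The following sketch is an illustration only: it assumes a hypothetical four-value population sampled with replacement (so that the draws are independent), enumerates every equally likely ordered sample of size n, and compares the variance of the resulting sample means with σ_X^2/n. The population values and all function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    """Exact arithmetic mean of a finite collection."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def variance(values):
    """Mean squared deviation from the mean (divisor N, not N - 1)."""
    values = list(values)
    mu = mean(values)
    return mean((v - mu) ** 2 for v in values)


def sample_mean_moments(population, n):
    """Exact mean and variance of the sample mean over all equally likely
    ordered samples of size n drawn with replacement."""
    means = [mean(sample) for sample in product(population, repeat=n)]
    return mean(means), variance(means)


population = [1, 2, 4, 9]   # hypothetical population values (assumption)
n = 3                       # sample size

mu_x = mean(population)
sigma2_x = variance(population)
e_xbar, var_xbar = sample_mean_moments(population, n)

print("E[X-bar]   =", e_xbar, "  population mean         =", mu_x)
print("Var[X-bar] =", var_xbar, "  population variance / n =", sigma2_x / n)
```

Exact rational arithmetic is used so that the two sides of (6.4) and (6.8) can be compared for equality rather than to within rounding error.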
    The deviations of the individual observations from the sample mean provide
information about the dispersion of the x_i about x̄. We define the sample variance
s_x^2 by
\[ s_x^2 \triangleq \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2. \tag{6.9} \]
140   Fundamentals of statistical data analysis
      The corresponding random variable is
\[ S_X^2 \triangleq \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, \tag{6.10} \]
      which is also commonly called the sample variance. We find, after some rearrangement
      (Problem 6.1),
\[ S_X^2 = \frac{1}{n} \sum_{i=1}^{n} Y_i^2 - \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} Y_i Y_j. \tag{6.11} \]
      Since E[Y_i Y_j] = 0 for i ≠ j, taking expectations gives
\[ E[S_X^2] = \frac{1}{n} \sum_{i=1}^{n} E[Y_i^2] = \sigma_X^2. \tag{6.12} \]
      The reason for using n − 1 rather than n as the divisor in (6.9) is to make E[S_X^2] equal
      to σ_X^2; that is, to make s_x^2 an unbiased estimate of σ_X^2. The positive square root of the
      sample variance, s_x, is called the sample standard deviation.
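
The effect of the divisor can be illustrated numerically. The sketch below is a hedged illustration, not the text's derivation: it assumes a hypothetical integer-valued population sampled with replacement, averages the sample variance over every equally likely ordered sample, and does the same for the variant with divisor n. The names `sample_variance`, `expectation`, and `divisor_offset` are introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product


def sample_variance(xs, divisor_offset=1):
    """Sum of squared deviations from the sample mean, divided by
    n - divisor_offset; the default offset of 1 gives (6.9)."""
    n = len(xs)
    xbar = Fraction(sum(xs), n)
    return sum((x - xbar) ** 2 for x in xs) / (n - divisor_offset)


def expectation(statistic, population, n):
    """Average of a statistic over all equally likely ordered samples
    of size n drawn with replacement from the population."""
    samples = list(product(population, repeat=n))
    return sum(statistic(s) for s in samples) / len(samples)


population = [1, 2, 4, 9]   # hypothetical integer population (assumption)
n = 3

mu_x = Fraction(sum(population), len(population))
sigma2_x = sum((x - mu_x) ** 2 for x in population) / len(population)

e_unbiased = expectation(sample_variance, population, n)
e_biased = expectation(lambda s: sample_variance(s, 0), population, n)

print("population variance     :", sigma2_x)
print("E[S^2] with divisor n-1 :", e_unbiased)
print("E[S^2] with divisor n   :", e_biased)
```

With divisor n the expectation falls short of σ_X^2 by the factor (n − 1)/n, which is the bias that the divisor n − 1 removes.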
      When the observed data take on discrete values, we can just count the number of
      occurrences of the individual values. Suppose that the sample size n is given and k (≤ n)
      distinct values exist. Let n_j be the number of times that the jth value is observed,
      1 ≤ j ≤ k. Then the fraction
\[ f_j = \frac{n_j}{n}, \quad j = 1, 2, \ldots, k, \tag{6.13} \]
      is the relative frequency of the jth value. For grouped data with class boundaries
      c_0 < c_1 < ... < c_k, the class lengths
\[ \Delta_j \triangleq c_j - c_{j-1}, \quad j = 1, 2, \ldots, k, \]
      need not be equal. Let n j denote the number of observations that fall in the jth class
      interval. Then the relative frequency of the jth class takes the same form as (6.13).
      The grouped distribution may be represented graphically as the following “staircase
      function” in an (x, h)-coordinate system:
\[ h(x) = \frac{f_j}{\Delta_j} = \frac{n_j}{n \Delta_j}, \quad \text{for } x \in (c_{j-1}, c_j], \; j = 1, 2, \ldots, k. \tag{6.14} \]
      6.3 Graphical presentations                                                                                 141
      Such a diagram is called a histogram and can be regarded as an estimate of the PDF
      of the population. If the class lengths Δ_j are all the same, the shape of the histogram
      remains unchanged whether we use the relative frequency of the classes { f j } or the
      frequency counts of the classes {n j } as the ordinate. Such diagrams are also called
      histograms.
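
The staircase function (6.14) can be sketched in a few lines of code. The example below is illustrative only: the data values and class boundaries are hypothetical, and the function name `histogram` is an assumption of this sketch. Observations outside (c_0, c_k] are not assigned to any class but still count toward n.

```python
from bisect import bisect_left


def histogram(data, boundaries):
    """Return (counts n_j, relative frequencies f_j, heights h_j = f_j / Delta_j)
    for the class intervals (c_{j-1}, c_j], j = 1, ..., k, where
    boundaries = [c_0, c_1, ..., c_k] is strictly increasing."""
    k = len(boundaries) - 1
    counts = [0] * k
    for x in data:
        j = bisect_left(boundaries, x)   # smallest j with x <= c_j
        if 1 <= j <= k:                  # i.e. c_{j-1} < x <= c_j
            counts[j - 1] += 1
    n = len(data)
    freqs = [n_j / n for n_j in counts]
    heights = [f_j / (boundaries[j + 1] - boundaries[j])
               for j, f_j in enumerate(freqs)]
    return counts, freqs, heights


# Hypothetical observations and unequal class intervals (0,2], (2,4], (4,8].
data = [0.5, 1.2, 1.9, 2.0, 2.5, 3.7, 4.0, 7.5]
boundaries = [0, 2, 4, 8]

counts, freqs, heights = histogram(data, boundaries)
area = sum(h * (boundaries[j + 1] - boundaries[j]) for j, h in enumerate(heights))

print("n_j :", counts)
print("f_j :", freqs)
print("h_j :", heights)
print("area under the staircase:", area)
```

Dividing f_j by the class length makes the total area under the staircase equal to one when all observations fall inside the classes, which is what allows h(x) to be read as an estimate of the PDF even when the class lengths differ.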
         The choice of the class intervals in the histogram representation is by no means trivial.
      Certainly, we should choose them in such a way that the characteristic features of the
      distribution are emphasized and chance variations are obscured. If the class lengths
      are too small, chance variations dominate because each interval includes only a small
      number of observations. On the other hand, if the class lengths are too large, a great deal
      of information concerning the characteristics of the distribution will be lost.
         Let {x_k : 1 ≤ k ≤ n} denote n observations in the order observed and let {x_(i) : 1 ≤
      i ≤ n} denote the same observations ranked in order of magnitude. The frequency H(x)
      of observations that are smaller than or equal to x is called the cumulative relative
      frequency, and is given by
\[ H(x) = \begin{cases} 0, & \text{for } x < x_{(1)}, \\ \dfrac{i}{n}, & \text{for } x_{(i)} \le x < x_{(i+1)}, \; i = 1, 2, \ldots, n-1, \\ 1, & \text{for } x \ge x_{(n)}, \end{cases} \tag{6.15} \]
      which is the empirical analog of the CDF F_X(x). If we use the unit step function
\[ u(x) = \begin{cases} 1, & \text{for } x \ge 0, \\ 0, & \text{for } x < 0, \end{cases} \tag{6.16} \]
      the above H(x) can be more concisely written as
\[ H(x) = \frac{1}{n} \sum_{i=1}^{n} u(x - x_{(i)}) = \frac{1}{n} \sum_{k=1}^{n} u(x - x_k), \quad -\infty < x < \infty. \tag{6.17} \]
      Interestingly enough, use of the unit step function makes it unnecessary to obtain the
      rank-ordered data in order to find H (x). The graphical plot of H (x) is a nondecreasing
      step curve, which increases from zero to one in “jumps” of 1/n at points x = x(1) ,
      x(2) , . . ., x(n) . If several observations take on the same value, the jump is a multiple
      of 1/n.
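
The equivalence of the rank-based definition (6.15) and the unit-step form (6.17) can be sketched as follows. The observations are hypothetical and include a tie; the function names are assumptions of this illustration.

```python
from bisect import bisect_right


def unit_step(x):
    """Unit step function (6.16)."""
    return 1 if x >= 0 else 0


def ecdf_unit_step(data, x):
    """H(x) from (6.17); the data may be in any order."""
    return sum(unit_step(x - xk) for xk in data) / len(data)


def ecdf_ranked(data, x):
    """H(x) from (6.15), using the rank-ordered observations."""
    ranked = sorted(data)
    n = len(ranked)
    if x < ranked[0]:
        return 0.0
    if x >= ranked[-1]:
        return 1.0
    i = bisect_right(ranked, x)   # largest rank i with x_(i) <= x
    return i / n


# Hypothetical observations in the order observed; 1.4 occurs twice.
data = [3.1, 1.4, 2.7, 1.4, 5.0]

for x in [1.0, 1.4, 2.0, 2.7, 4.0, 5.0, 6.0]:
    print(x, ecdf_unit_step(data, x), ecdf_ranked(data, x))
```

At the tied value 1.4 the curve jumps by 2/n, in line with the remark that repeated observations produce a jump that is a multiple of 1/n.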
         When grouped data are presented as a cumulative relative frequency distribution, it
      is usually called the cumulative histogram. The cumulative histogram is far less sensi-
      tive to variations in class lengths than the histogram. This is because the accumulation is
      essentially equivalent to integration along the x-axis, which filters out the chance vari-
      ations contained in the histogram. The cumulative relative frequency distribution or the
      cumulative histogram is, therefore, quite helpful in portraying the gross features of data.
      Reducing primary data to the sample mean, sample variance, and histogram can reveal
      a great amount of information concerning the nature of the population distribution. But
      sometimes important features of the underlying distribution are obscured or hidden by
          the data reduction procedures. In this section we will discuss some graphical methods
          that are often valuable in an exploratory analysis of measurement data. They are: (a) his-
          tograms on the probability or log-normal probability papers; (b) the survivor functions
          on log-linear and log-log papers; and (c) the dot diagram and correlation coefficient.
\[ P = F(x) \tag{6.18} \]
          provides the dependence of the cumulative distribution on the variable x. The inverse
          function
\[ x_P = F^{-1}(P) \tag{6.19} \]
          gives the value of the variable x that corresponds to the given cumulative probability
          P. The value x_P is called the P-fractile. Some authors use the terms percentile or
          quantile instead of the term fractile.
             The distribution function of the standard normal distribution N(0, 1) is often
          denoted by Φ(·), as defined in (4.46):
\[ \Phi(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} \exp\left(-\frac{t^2}{2}\right) dt. \tag{6.20} \]
          Then the fractile, u_P, of the distribution N(0, 1) is derived as
\[ u_P = \Phi^{-1}(P). \tag{6.21} \]
             Suppose that for a given cumulative relative frequency H(x) we wish to test whether
          this empirical distribution resembles a normal distribution; that is, to test whether
\[ H(x) \cong \Phi\left(\frac{x - \mu}{\sigma}\right) \tag{6.22} \]
          holds for some parameters μ and σ, where the symbol ≅ means “to have the distribution
          of.” Testing this relation is equivalent to testing the relation
\[ u_{H(x)} \approx \frac{x - \mu}{\sigma}. \tag{6.23} \]
1 This term should not be confused with a similar term “fractal diagram” known in fractal geometry.
          According to the definition of H(x), the plot of u_{H(x)} versus x forms a step (or staircase)
          curve:
\[ u_{H(x)} = \begin{cases} -\infty, & \text{for } x < x_{(1)}, \\ u_{i/n}, & \text{for } x_{(i)} \le x < x_{(i+1)}, \; i = 1, 2, \ldots, n-1, \\ \infty, & \text{for } x \ge x_{(n)}. \end{cases} \tag{6.24} \]
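
The finite levels of the step curve (6.24) can be computed with the standard library, as in the hedged sketch below. It uses `statistics.NormalDist().inv_cdf` (Python 3.8 or later) for the standard normal fractile u_P; the data values and the function name `fractile_points` are hypothetical.

```python
from statistics import NormalDist


def fractile_points(data):
    """Return the points (x_(i), u_{i/n}), i = 1, ..., n - 1, i.e. the left
    end and the level of each finite step of (6.24). The levels for
    x < x_(1) and x >= x_(n) are -infinity and +infinity and are omitted."""
    ranked = sorted(data)
    n = len(ranked)
    std_normal = NormalDist()   # N(0, 1)
    return [(ranked[i - 1], std_normal.inv_cdf(i / n)) for i in range(1, n)]


# Hypothetical observations (assumption made for this illustration).
data = [0.3, -1.2, 0.8, 1.9, -0.4]

for x_i, u_i in fractile_points(data):
    print(f"x = {x_i:5.2f}   u = {u_i:7.4f}")
```

If the data were drawn from a normal distribution with parameters μ and σ, these points would be expected to scatter around the straight line u = (x − μ)/σ of (6.23).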
\[ u = u_{H(x)} \tag{6.25} \]
[Figure 6.1 plot omitted. Both panels: horizontal axis x from −3 to 3; left vertical axis u_P from −3 to 3; right vertical axis P × 100 from 0.5 to 99.]
Figure 6.1   The fractile diagram of normal variates: (a) step curve; (b) dot diagram.
[Figure 6.2 plot omitted. Both panels: horizontal axis x on a logarithmic scale from 10^0 to 10^2; left vertical axis u_P from −3 to 3; right vertical axis P × 100 from 0.5 to 99.]
Figure 6.2   The fractile diagram of log-normal variates: (a) step curve; (b) dot diagram.
             The function
\[ S_X(t) = 1 - F_X(t) \tag{6.27} \]
             is often called the survivor function, or the survival function in reliability theory. It
             is equivalent to the complementary distribution function F_X^c(t) defined earlier.
                The natural logarithm of (6.27) is known as the log-survivor function or the
             log-survival function (Cox and Lewis [71]):
\[ \log S_X(t) = \log[1 - F_X(t)]. \]
             The log-survivor function will show the details of the tail end of the distribution more
             effectively than the distribution itself.
            If, for instance, F_X(x) is an exponential distribution with mean 1/α, then its
          log-survivor function is a straight line: log S_X(x) = log e^{−αx} = −αx.
            If FX (x) is a mixed exponential distribution (or hyperexponential distribution)
then its log-survivor function has two asymptotic straight lines, since
          where H(t) represents the cumulative relative frequency (ungrouped data) or the
          cumulative histogram (grouped data). In the ungrouped case we find from (6.15) that
\[ \log\left(1 - \frac{i}{n}\right), \quad 1 \le i \le n, \tag{6.32} \]
          should be plotted against x_(i), where the subscript (i) represents the rank as in (6.15).
          In order to avoid difficulties at i = n, we may sometimes modify (6.32) into
\[ \log\left(1 - \frac{i}{n+1}\right), \quad 1 \le i \le n. \tag{6.33} \]
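
The plotting positions of (6.33) are easy to generate. The sketch below is an illustration under stated assumptions: it draws a hypothetical exponential sample with a fixed seed (rate α = 2, chosen arbitrarily), builds the points (x_(i), log(1 − i/(n + 1))), and fits a least-squares slope through the origin as a rough check against the straight line −αx. The function name and the slope check are not from the text.

```python
import math
import random


def log_survivor_points(data):
    """Return the points (x_(i), log(1 - i/(n + 1))), 1 <= i <= n, of (6.33)."""
    ranked = sorted(data)
    n = len(ranked)
    return [(ranked[i - 1], math.log(1.0 - i / (n + 1)))
            for i in range(1, n + 1)]


rng = random.Random(0)     # fixed seed so that the run is repeatable
alpha = 2.0                # hypothetical rate; the mean is 1/alpha
sample = [rng.expovariate(alpha) for _ in range(1000)]
points = log_survivor_points(sample)

# For exponential data the points should lie near the line y = -alpha * x.
# A least-squares slope through the origin gives a rough, informal check.
slope = sum(x * y for x, y in points) / sum(x * x for x, _ in points)
print("fitted slope:", slope, "  theoretical slope:", -alpha)
```

Using n + 1 in the denominator keeps the last ordinate finite; with (6.32) the point for i = n would require log 0.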
            As an example, Figure 6.3 plots the log-survivor function using a sample of size 1000
          drawn from the above hyperexponential distribution with parameters π_1 = 0.0526,
          π_2 = 1 − π_1, α_1 = 0.1, and α_2 = 2.0.
          Out of the 1000 samples taken, 18 sample points that exceed x = 10 fall outside the
          scale of the figure; hence they are not shown. The asymptotes of (6.30) can be easily
          recognized from this log-survivor function.
             Characteristically, the log-survivor function of the mixed exponential distribution
          (6.29) is convex with a linear tail. Observations of (or departures from) such char-
          acteristic shapes are used to postulate a functional form for a distribution. See Gaver
          et al. [115] and Lewis and Shedler [225].
[Figure 6.3 plot omitted. Horizontal axis: t from 0 to 10; vertical axis: log[1 − H(t)] from −10 to −1.]
Figure 6.3   The log-survivor function of a mixed-exponential (or hyperexponential) distribution with π_1 = 0.0526,
             π_2 = 1 − π_1, α_1 = 0.1, and α_2 = 2.0.
[Figure 6.4 plot omitted. Horizontal axis: t from 1 to 10; vertical axis: log[1 − H(t)] from −10 to −1.]
Figure 6.4   The log-survivor function of a mixed Pareto distribution, β_1 = β_2 = 1, π_2 = 1 − π_1,
             α_1 = 1.5, α_2 = 5, and π_1 = 0.2.
\[ S_X(t) = \pi_1 \frac{\beta_1^{\alpha_1}}{t^{\alpha_1}} + (1 - \pi_1) \frac{\beta_2^{\alpha_2}}{t^{\alpha_2}}, \quad 0 < \max\{\beta_1, \beta_2\} \le t. \tag{6.35} \]
             As an example, Figure 6.4 plots the log-survivor function of 500 samples drawn from
             the mixed Pareto distribution with β1 = β2 = 1, α1 = 1.5, α2 = 5, and π1 = 0.2.
\[ h_X(t) = \frac{f_X(t)}{S_X(t)} = \frac{f_X(t)}{1 - F_X(t)} \tag{6.36} \]
        is called the hazard function or the failure rate, because h X (t) dt represents the prob-
        ability that the life will end in the interval (t, t + dt], given that X has survived up to
        age t; i.e., X ≥ t. If X represents the service time of a customer, as in queueing theory,
        h X (t) is called the completion rate function.
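           The hazard functions of the exponential, Weibull, Pareto, and log-normal
        distributions are given as follows:
\[ h_X(t) = \begin{cases}
\lambda, & t \ge 0, \text{ for exponential}, \\[4pt]
\dfrac{\alpha}{\beta} \left(\dfrac{t}{\beta}\right)^{\alpha - 1}, & t \ge 0, \text{ for Weibull}, \\[4pt]
\dfrac{\alpha}{t}, & t \ge \beta, \text{ for Pareto}, \\[4pt]
\dfrac{t^{-1} \exp\left[-\dfrac{(\log t - \mu_Y)^2}{2\sigma_Y^2}\right]}{\displaystyle\int_{\log t}^{\infty} \exp\left[-\dfrac{(u - \mu_Y)^2}{2\sigma_Y^2}\right] du}, & t > 0, \text{ for log-normal},
\end{cases} \tag{6.37} \]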
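
The first three closed forms in (6.37) can be checked against the definition (6.36) numerically. The sketch below is an illustration under assumptions: it takes the survivor functions to be S_X(t) = e^{−λt} (exponential), S_X(t) = exp[−(t/β)^α] (Weibull), and S_X(t) = (β/t)^α for t ≥ β (Pareto), obtains f_X(t) = −dS_X(t)/dt by a central difference, and compares f_X(t)/S_X(t) with the closed forms. The parameter values and function names are hypothetical.

```python
import math


def hazard_from_survivor(survivor, t, h=1e-5):
    """Approximate h_X(t) = f_X(t) / S_X(t) of (6.36), with the density
    f_X(t) = -dS_X(t)/dt estimated by a central difference of step h."""
    density = -(survivor(t + h) - survivor(t - h)) / (2.0 * h)
    return density / survivor(t)


lam = 0.5                   # exponential rate (hypothetical value)
a_w, b_w = 1.5, 1.0         # Weibull alpha, beta (hypothetical values)
a_p, b_p = 3.0, 1.0         # Pareto alpha, beta (hypothetical values)

# Assumed survivor functions S_X(t) for each family.
survivors = {
    "exponential": lambda t: math.exp(-lam * t),
    "Weibull": lambda t: math.exp(-((t / b_w) ** a_w)),
    "Pareto": lambda t: (b_p / t) ** a_p,          # valid for t >= beta
}

# Closed-form hazard functions as given in (6.37).
closed_forms = {
    "exponential": lambda t: lam,
    "Weibull": lambda t: (a_w / b_w) * (t / b_w) ** (a_w - 1.0),
    "Pareto": lambda t: a_p / t,
}

for name in survivors:
    for t in (1.5, 2.0, 4.0):
        numeric = hazard_from_survivor(survivors[name], t)
        print(f"{name:12s} t = {t:3.1f}  numeric = {numeric:.6f}  "
              f"closed form = {closed_forms[name](t):.6f}")
```

The exponential hazard is constant, the Weibull hazard with α > 1 increases with t, and the Pareto hazard decreases as 1/t, which is one way to see why these families are used to model different ageing behaviour.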
\[ R = X - t \tag{6.40} \]
[Figure 6.5 plot omitted. Horizontal axis: t from 0 to 4; vertical axis: R_X(t) from 0 to 4. Legend: Pareto: α = 3, β = 1; Weibull: α = 1.5, β = 1.]
Figure 6.5   The mean residual life curves of a Pareto distribution with α = 3.0 and β = 1.0, and a Weibull
             distribution with α = 1.5 and β = 1.0.
             the residual life conditioned on X > t. Then the mean residual life function is
             given by
\[ R_X(t) = E[R \mid X > t] = \frac{\int_{t}^{\infty} S_X(u)\, du}{S_X(t)}. \tag{6.41} \]
             as expected. Figure 6.5 shows mean residual life curves of a Pareto distribution and a
             Weibull distribution.
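
The mean residual life (6.41) can be evaluated numerically for the two distributions of Figure 6.5. The sketch below is a hedged illustration: it assumes the survivor functions S_X(t) = (β/t)^α for the Pareto case and S_X(t) = exp[−(t/β)^α] for the Weibull case, and integrates with a simple midpoint rule after a change of variable. The function names, the substitution, and the step count are choices made for this sketch.

```python
import math


def mean_residual_life(survivor, t, steps=20000):
    """Evaluate (6.41) numerically with the midpoint rule.

    The substitution u = t + s / (1 - s), 0 < s < 1, maps the infinite
    range of integration onto the unit interval; du = ds / (1 - s)**2.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        u = t + s / (1.0 - s)
        total += survivor(u) / (1.0 - s) ** 2
    return total * h / survivor(t)


def pareto_survivor(alpha, beta):
    # Assumed form S_X(t) = (beta / t)**alpha, intended for t >= beta.
    return lambda t: (beta / t) ** alpha


def weibull_survivor(alpha, beta):
    # Assumed form S_X(t) = exp(-(t / beta)**alpha), t >= 0.
    return lambda t: math.exp(-((t / beta) ** alpha))


def exponential_survivor(lam):
    return lambda t: math.exp(-lam * t)


pareto = pareto_survivor(3.0, 1.0)      # parameters of Figure 6.5
weibull = weibull_survivor(1.5, 1.0)    # parameters of Figure 6.5

for t in (1.0, 2.0, 3.0):
    print(f"t = {t:3.1f}  Pareto R_X(t) = {mean_residual_life(pareto, t):.4f}"
          f"  Weibull R_X(t) = {mean_residual_life(weibull, t):.4f}")
```

Under the assumed Pareto form the integral in (6.41) can also be done by hand, giving R_X(t) = t/(α − 1) for t ≥ β, so the Pareto curve rises linearly, whereas the Weibull curve with α > 1 falls as t grows.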
\[ (x_i, y_i), \quad 1 \le i \le n, \tag{6.43} \]
             is to plot the points (xi , yi ) one by one as coordinates. Such a diagram is called a dot
             or scatter diagram. The density of dots in a given region is proportional to the relative
             frequency of the pairs (X, Y ) in the region.
             Example 6.1: Scatter diagram of Internet distances [134, 364]. The approximate geo-
             graphic distance between a pair of Internet hosts can be inferred by sending probe
             packets between the two hosts and measuring the round-trip delays experienced by the
             probes. The relationship between geographic distance g and round-trip delay d from a
             given Internet host to other Internet hosts can be characterized by a scatter diagram
             consisting of points (g, d). Owing to the inherent randomness in round-trip delays over the
             Internet, delay measurements taken between a given pair of hosts separated by a fixed
             geographic distance g at different times yield different delays d.
                The scatter diagram in Figure 6.6 was obtained by sending probe packets from a
             host at Stanford University to 79 other hosts on the Internet across the USA [364]. The
             line labeled baseline provides a lower bound on d as a function of g based on the
             observation that the packet propagation speed over the Internet is at most the speed
             of light through an optical fiber. If the refractive index of the fiber is denoted by η,
             the propagation speed of the optical signal is v = c/η, where c is the speed of light in
             vacuo. Typically, the value of η is slightly less than 1.5, so we make the approxima-
             tion v ≈ 2c/3. If the round-trip delay between a pair of hosts is measured to be d, the
             corresponding (one-way) geographical distance is upper bounded by ĝ = vd/2 ≈ cd/3.
             When the unit of time is milliseconds and the unit of geographical distance is kilometers,
             c ≈ 300 km/ms, so d and ĝ can be related approximately by
                 d \approx \frac{1}{100}\,\hat{g},           (6.44)
             which is the equation of baseline in Figure 6.6.
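             The arithmetic behind (6.44) can be written out as a short Python sketch. The function
             names and the default value η = 1.5 are assumptions chosen for illustration; the text
             states only that η is typically slightly less than 1.5.

```python
C_KM_PER_MS = 300.0  # speed of light in vacuo, approximately 300 km/ms


def distance_upper_bound_km(delay_ms, eta=1.5):
    """Upper bound g_hat = v * d / 2 on one-way distance, with v = c / eta."""
    v = C_KM_PER_MS / eta  # propagation speed in the fiber, km/ms
    return v * delay_ms / 2.0


def baseline_delay_ms(distance_km, eta=1.5):
    """Smallest round-trip delay consistent with a given distance: d = 2 g / v."""
    v = C_KM_PER_MS / eta
    return 2.0 * distance_km / v


bound_for_10_ms = distance_upper_bound_km(10.0)   # 100 km per ms of delay
delay_for_4500_km = baseline_delay_ms(4500.0)     # g / 100 in ms
```

             With η = 1.5 each millisecond of round-trip delay corresponds to at most 100 km of
             one-way distance, which is the slope 1/100 in (6.44).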
               Since packets generally traverse multiple hops between two hosts and experience
             queueing and processing delays at each hop, the measured round-trip delay will
[Figure 6.6: scatter plot of delay d (ms, 0 to 100) against geographical distance g (km, 0 to 4500), with two lines labeled baseline and bestline.]
Figure 6.6   Scatter diagram of delay measurements from Internet host at Stanford University to 79 other
             hosts across the USA [364].
typically be much larger than the delay predicted by the equation of the baseline
in (6.44). Gueye et al. [134] propose a tighter linear bound determined by solving a
linear programming problem that minimizes the slope and y-intercept of the line subject
to the constraints imposed by the set of scatter points. This deterministic bound corre-
sponds to the line labeled bestline in Figure 6.6. An alternative approach that retains
more of the statistical information captured by the scatter points is discussed in [364].
  The most frequently used measure of statistical association between a pair of vari-
ables is the correlation coefficient. For a given pair of random variables X and Y , the
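covariance of X and Y, written Cov[X, Y] or σ_XY, is defined as
    \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)],
and the correlation coefficient ρ_XY is the covariance normalized by the two standard deviations:
    \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}.           (6.46)
The correlation coefficient always satisfies
    -1 \le \rho_{XY} \le 1.           (6.47)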
  We say that X and Y are properly linearly dependent if there exist nonzero
constants a and b such that aX − bY is a constant c; that is,
    P[aX - bY = c] = 1.           (6.48)
Therefore,
    \mathrm{Var}[aX - bY - c] = 0,           (6.49)
which implies
    \rho_{XY} = +1 \ \text{or} \ -1.           (6.50)
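      The sample covariance of the pairs (x_i, y_i), 1 ≤ i ≤ n, is defined as
          s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
                 = \frac{1}{n-1} \sum_{i=1}^{n} x_i y_i - \frac{n \bar{x} \bar{y}}{n-1},           (6.52)
      where x̄ and ȳ are the sample means of {x_i} and {y_i} respectively. The sample
      correlation coefficient is defined accordingly:
          r_{xy} = \frac{s_{xy}}{s_x s_y},           (6.53)
      where s_x^2 and s_y^2 are the sample variances of {x_i} and {y_i} respectively.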
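      The two forms of (6.52) and the definition (6.53) can be illustrated with a short Python
      sketch. The function names and the data are hypothetical, and the sample variance is
      assumed here to use the same divisor n − 1 as the sample covariance, so that s_x^2 is the
      sample covariance of {x_i} with itself.

```python
import math


def sample_mean(values):
    return sum(values) / len(values)


def sample_covariance(xs, ys):
    """First form of (6.52): centered products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = sample_mean(xs), sample_mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_covariance_alt(xs, ys):
    """Second form of (6.52): raw products minus n * x_bar * y_bar."""
    n = len(xs)
    x_bar, y_bar = sample_mean(xs), sample_mean(ys)
    return (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / (n - 1)


def sample_correlation(xs, ys):
    """r_xy of (6.53), with s_x and s_y taken as roots of sample variances."""
    s_x = math.sqrt(sample_covariance(xs, xs))
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


xs = [1.0, 2.0, 4.0, 7.0]
ys_up = [2.0 * x + 3.0 for x in xs]      # exact increasing linear relation
ys_down = [-0.5 * x + 1.0 for x in xs]   # exact decreasing linear relation

r_up = sample_correlation(xs, ys_up)
r_down = sample_correlation(xs, ys_down)
```

      For data lying exactly on a straight line, r_up and r_down equal +1 and −1 up to rounding
      error, the sample counterpart of (6.50).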
      Most textbooks on probability theory and mathematical statistics do not seem to deal
      with graphical presentations of real data. We consider this an unfortunate state
      of affairs. Various types of graphical presentations of collected data should be explored
6.6 Problems
x̄ = x̄_n and s^2 = s_n^2.
      (a) Derive the following recursive formula for the sample mean:
              \bar{x}_i = \bar{x}_{i-1} + \frac{x_i - \bar{x}_{i-1}}{i},  i \ge 1,
          with the initial value \bar{x}_0 = 0.
      (b) Similarly, show the recursive formula for the sample variance:
              s_i^2 = \left(\frac{i-2}{i-1}\right) s_{i-1}^2 + \frac{(x_i - \bar{x}_{i-1})^2}{i},  i > 1,
          with the initial values s_0^2 = s_1^2 = 0.
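      The two recursions of this problem translate directly into a one-pass computation. The
      Python sketch below is an illustration, not a derivation: the function name and the data
      are hypothetical, and the result is compared with the batch estimates from the standard
      library's statistics module, whose variance function uses the divisor n − 1.

```python
import statistics


def running_mean_and_variance(samples):
    """Return (x_bar_n, s_n^2) using the recursions of parts (a) and (b)."""
    mean = 0.0       # initial value x_bar_0 = 0
    variance = 0.0   # initial values s_0^2 = s_1^2 = 0
    for i, x in enumerate(samples, start=1):
        deviation = x - mean  # x_i - x_bar_{i-1}, computed from the old mean
        if i > 1:
            variance = (i - 2) / (i - 1) * variance + deviation ** 2 / i
        mean += deviation / i
    return mean, variance


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, variance = running_mean_and_variance(data)

batch_mean = statistics.mean(data)
batch_variance = statistics.variance(data)
```

      Both recursions use the mean from the previous step, so the variance must be updated
      before the mean is overwritten.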
      6.4 Expectation and variance of the cumulative histogram. Find expressions for
      the expectation and variance of H_j (the cumulative histogram in the jth interval) in
      terms of the underlying distribution function F_X(x). Explain why the shape of the
      cumulative histogram is rather insensitive to the choice of class lengths {Δ_j}.
      (a) constant a;
      (b) uniformly distributed in [a, b].
      6.7 Hazard function and distribution functions. Show that the distribution function
      F_X(x) is given in terms of the corresponding hazard function h_X(x) as follows:
          F_X(x) = 1 - e^{-\int_0^x h_X(t)\,dt},  x \ge 0,           (6.54)
and hence
    f_X(x) = h_X(x)\, e^{-\int_0^x h_X(t)\,dt}.           (6.55)
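Relations (6.54) and (6.55) can be checked numerically for a particular hazard function. The Python sketch below is a sanity check, not a proof. The function names, the midpoint integration rule, and the hazard h(t) = α t^(α−1) with α = 1.5 are assumptions chosen for illustration; this hazard integrates to x^α over [0, x], so the exact values are available in closed form.

```python
import math


def distribution_from_hazard(hazard, x, steps=20_000):
    """Evaluate (6.54): F_X(x) = 1 - exp(-integral of h_X over [0, x]).

    The integral is approximated by the midpoint rule, which never
    evaluates the hazard at the endpoint 0.
    """
    h = x / steps
    cumulative_hazard = sum(hazard((k + 0.5) * h) for k in range(steps)) * h
    return 1.0 - math.exp(-cumulative_hazard)


def density_from_hazard(hazard, x, steps=20_000):
    """Evaluate (6.55): f_X(x) = h_X(x) * exp(-integral) = h_X(x) * (1 - F_X(x))."""
    return hazard(x) * (1.0 - distribution_from_hazard(hazard, x, steps))


ALPHA = 1.5  # shape parameter of the illustrative hazard


def power_hazard(t):
    """Illustrative hazard h(t) = ALPHA * t**(ALPHA - 1)."""
    return ALPHA * t ** (ALPHA - 1.0)


F_numeric = distribution_from_hazard(power_hazard, 2.0)
F_exact = 1.0 - math.exp(-(2.0 ** ALPHA))

f_numeric = density_from_hazard(power_hazard, 1.0)
f_exact = ALPHA * math.exp(-1.0)
```

The numerical and closed-form values agree to within the integration error of the midpoint rule.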
6.11∗ Mean residual life function and the hazard function. Show that the mean
residual life function R_X(t) is a monotone-decreasing function if and only if the hazard
function h_X(t) is monotone increasing.
Hint: Consider the conditional survivor function of R = X − t, given that X is greater
than t, defined by
    S_X(r|t) = P[R > r \mid X > t] = \frac{S_X(t + r)}{S_X(t)},  r \ge 0,
and find its relations with the hazard function h_X(t) and the mean residual life function.
6.12∗ Conditional survivor and mean residual life functions for standard Weibull
distribution.
(a) Find the conditional survivor function S_X(r|t) (see Problem 6.11) of the standard
    Weibull distribution.
(b) Find the mean residual life function R_X(t) for the standard Weibull distribution.
      6.14 Mean residual life functions – continued. Find an expression for the mean
      residual life function R_X(t) for each of the following distributions:
      (a) Pareto distribution with parameters α > 1 and β > 0.
      (b) Two-parameter Weibull distribution with parameters α and β.
      6.15∗ Covariance between two RVs. Suppose that RVs X and Y are functionally
      related according to
Y = cos X.
      6.17 Correlation coefficient – continued. Show that if ρ = ±1, then (6.51) holds.
      6.18∗ Sample covariance. Show that the sample covariance s_XY defined by (6.52) is
      an unbiased estimate of the covariance σ_XY.
      6.19 Recursive formula for sample covariance. Generalize the recursive computa-
      tion formula of Problem 6.2 to the sample covariance.