UKFS Vol
Junye Li∗
ESSEC Business School, Paris-Singapore, 188064, Singapore.
Abstract
A smoothing algorithm based on the unscented transformation is proposed for the nonlinear Gaussian
system. The algorithm first implements a forward unscented Kalman filter and then invokes a
separate backward smoothing pass, making Gaussian approximations only in the state space and not
in the observation space. The method is applied to volatility extraction in a diffusion option pricing
model. Both the simulation study and the empirical applications with the Heston stochastic volatility
model indicate that in order to accurately capture the volatility dynamics, both stock prices and
1. Introduction
In the state-space model framework, Bayesian optimal smoothing, also known as belief inference,
refers to statistical methodology for inferring state estimates from all available information,
that is, from past, current, and future observations. Optimal smoothing is closely related to
optimal filtering, which bases inference only on past and current information. Smoothing and
filtering have been successfully applied in many fields, but their applications to financial
problems have not attracted enough attention.
Volatility estimation is at the core of financial econometrics. Volatility is pervasive in
financial markets. For example, it is used in option pricing, in portfolio allocation to
control and manage risks, and in computation of risk-adjusted returns for comparisons of relative
∗
Tel.: +33 1 3443 3097; Fax: +33 1 3443 3212.
E-mail address: li@essec.edu
in empirical studies. There are two main modeling approaches for volatility. One is the class of
ARCH/GARCH models (Engle, 1982; Bollerslev, 1986), where conditional volatility is a deterministic
function of past volatility and return innovations. The other is the class of stochastic volatility
models (Shephard, 2005), which assume that volatility is unobservable and is driven by a different
random process. In the past thirty years, the diffusion process has become a common tool for
modeling the dynamics of financial data. The diffusion stochastic volatility models (Hull and White, 1987;
Heston, 1993) may be the most popular ones both in academia and in practice because of their
flexibility in pricing derivatives and risk management. These models have the following general form:
$$ \frac{dS_t}{S_t} = \mu_t\,dt + \sqrt{V_t}\,dW_t, \tag{1} $$
$$ dV_t = \alpha(t, S_t, V_t)\,dt + \beta(t, S_t, V_t)\,dZ_t, \tag{2} $$
where equations (1) and (2) define the stock price process and the volatility process, respectively,
Wt and Zt are two standard Brownian motions, possibly mutually correlated, and α(·) and β(·) are
However, the statistical analysis of stochastic volatility models faces two main problems. The
first is that the variances of the relevant stochastic variables are state-dependent, and the second is
that when derivatives are taken into account, the pricing formula is nonlinear. These problems make
the existing linear techniques (i.e., the standard Kalman filter and smoother) rarely applicable. Even
though nonlinear Kalman filters, such as the extended Kalman filter and the unscented Kalman filter,
can be used, they do not take into account all available information. Furthermore, when the system
becomes highly nonlinear and high-dimensional, the extended Kalman filter may perform poorly
Recently, some work has been done to address the above issues in nonlinear Gaussian models.
Pedersen, Thygesen, and Madsen (2011) introduce a finite-element method-based smoothing
approach. McCausland, Miller, and Pelletier (2011) present a Gaussian simulation smoothing
algorithm. This paper presents a smoothing algorithm for a nonlinear Gaussian system based on
Rauch et al. (1965), Anderson and Moore (1979), and Särkkä (2008) using the unscented
transformation approach, recently developed in the field of engineering (Julier and Uhlmann, 1997). The
that undergoes a nonlinear transformation using the so-called sigma points to cover and propagate
information on data. It can reach at least second-order approximation accuracy. The unscented
Kalman filter (Julier and Uhlmann, 1997, 2004; Wan and van der Merwe, 2001) is a straightforward
application of the unscented transformation. The unscented transformation can also be applied to
Bayesian optimal smoothing. Wan and van der Merwe (2001) present an unscented smoother based
on the two-filter smoother (Fraser and Potter, 1969). However, the two-filter smoother requires strong,
unrealistic assumptions and does not in general yield the right estimate (Klaas et al., 2006). The
algorithm presented here follows the forward-backward smoother approach. It first implements a
forward unscented Kalman filter and then invokes a separate backward smoothing pass, making
Gaussian approximations only in the state space and not in the observation space, to obtain the smoothing
A simulation study is implemented using the Heston stochastic volatility model (Heston, 1993).
I find that neither the unscented Kalman filter nor the smoother can capture the volatility dynamics if
we only use the stock price data. The filtered and smoothed volatility clearly deviate from the true
path. However, when we take into account both stock prices and options, the precision of volatility
filtering/smoothing improves dramatically. The unscented Kalman smoother performs nearly
the same as the unscented Kalman filter when we use the stock price data alone, whereas it performs
much better than the filter whenever both stock prices and options are used.
I apply the above algorithms to real data on the S&P 500 index and index options. Again, I find
that the filtered and smoothed volatility from jointly using stock prices and options is
considerably better than that obtained from using stock prices alone. The option pricing performance
shows that the smoothed volatility generates smaller pricing errors than the filtered one.
The rest of the paper is organized as follows. Section 2 discusses Bayesian optimal filtering
and smoothing and introduces the unscented Kalman filter and smoother for a nonlinear Gaussian
system. Section 3 implements a simulation study using the Heston stochastic volatility model.
Section 4 presents empirical results using S&P 500 index and index options. Finally, Section 5
2. Nonlinear Gaussian systems and state extraction
In this section, I first discuss Bayesian optimal filtering and smoothing for a general state-space
model in subsection 2.1 and then introduce the unscented Kalman filter and smoother in subsections
2.3 and 2.4 based on the unscented transformation of subsection 2.2 for a nonlinear Gaussian system.
$$ y_t = H(x_t, \Theta, w_t), \tag{3} $$
$$ x_t = F(x_{t-1}, \Theta, v_t), \tag{4} $$
where the observation yt is assumed to be conditionally independent given the state xt with the
distribution p(yt |xt ), the state xt is modeled as a Markov process with the initial distribution p(x0 )
and the transition law p(xt |xt−1 ), wt and vt are mutually independent observation noise and state
noise with mean zero and variances $R^w_t$ and $R^v_t$, respectively, and Θ is a set of static parameters.
Filtering is the process of estimating the system's current state from past and current observations,
that is, of finding the filtering distribution p(xt |y1:t ), where y1:t represents the information set up to
time t, for t = 1, 2, . . . , T . Bayesian optimal filtering can be implemented by the following two steps,
Thus, calculation and/or approximation of the prior p(xt |y1:t−1 ), of the likelihood p(yt |xt ), and of
the evidence p(yt |y1:t−1 ) is the essence of Bayesian filtering and inference.
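For reference, the two steps are commonly written as a prediction (Chapman-Kolmogorov) step followed by a Bayes update. The block below gives this standard textbook form in the notation of (3) and (4); it is a sketch and its layout may differ from the paper's own numbered equations.

```latex
% Prediction: propagate the previous filtering density through the transition law.
p(x_t \mid y_{1:t-1}) = \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}

% Update: combine the prior with the likelihood and normalise by the evidence.
p(x_t \mid y_{1:t}) = \frac{p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})},
\qquad
p(y_t \mid y_{1:t-1}) = \int p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})\, dx_t
```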
In contrast, Bayesian optimal smoothing finds the smoothing distribution p(xt |y1:T ) using
all available information, that is, past, current, and future observations.
There are different smoothing methods, one of which is the two-filter smoother:
where in (7), the first term is our familiar Bayesian filtering procedure, and the second term is called
the backward information filter. However, as shown in Klaas et al. (2006), the computation of this
second term does not in general lead to the right result because it is not a probability density of the
state and thus its integral might not be finite. In practice, it needs strong unrealistic assumptions.
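$$
\begin{aligned}
p(x_t \mid y_{1:T}) &= \int p(x_t, x_{t+1} \mid y_{1:T})\, dx_{t+1} \\
&= \int p(x_t \mid x_{t+1}, y_{1:t})\, p(x_{t+1} \mid y_{1:T})\, dx_{t+1} \\
&= p(x_t \mid y_{1:t}) \int \frac{p(x_{t+1} \mid y_{1:T})\, p(x_{t+1} \mid x_t)}{\int p(x_{t+1} \mid x_t)\, p(x_t \mid y_{1:t})\, dx_t}\, dx_{t+1},
\end{aligned} \tag{8}
$$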
where the first term is the filtering density. This is the so-called forward-backward smoother,
which can be implemented as follows: first, find the joint distribution of xt and xt+1 conditional on
Due to the Markov property of the state-space model, we have p(xt |xt+1 , y1:T ) = p(xt |xt+1 , y1:t ).
Finally, the smoothing density at time t can be found by:
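$$
\begin{aligned}
p(x_t \mid y_{1:T}) &= \int p(x_t, x_{t+1} \mid y_{1:T})\, dx_{t+1} \\
&= \int p(x_t \mid x_{t+1}, y_{1:T})\, p(x_{t+1} \mid y_{1:T})\, dx_{t+1},
\end{aligned} \tag{11}
$$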
where the second term in the integral is the smoothing density at time t + 1.
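The forward-backward recursion can be illustrated on a toy model with a finite state space, where every integral in (8) becomes a sum. The sketch below is not the paper's algorithm: the function names `forward_filter` and `backward_smooth`, the two-state transition matrix, and the likelihood values are all made up for illustration.

```python
def forward_filter(prior, trans, lik):
    """Forward pass on a finite state space.

    prior[i]    : p(x_0 = i)
    trans[i][j] : p(x_{t+1} = j | x_t = i)
    lik[t][j]   : p(y_t | x_t = j), evaluated at the observed y_t
    Returns the predictive p(x_t | y_{1:t-1}) and filtering p(x_t | y_{1:t}).
    """
    n = len(prior)
    pred, filt = [], []
    current = list(prior)
    for lik_t in lik:
        # Prediction: sum over the previous state (the integral becomes a sum).
        p_pred = [sum(current[i] * trans[i][j] for i in range(n)) for j in range(n)]
        # Update: multiply by the likelihood and normalise by the evidence.
        unnorm = [lik_t[j] * p_pred[j] for j in range(n)]
        evidence = sum(unnorm)
        current = [u / evidence for u in unnorm]
        pred.append(p_pred)
        filt.append(current)
    return pred, filt


def backward_smooth(pred, filt, trans):
    """Backward pass: p(x_t | y_{1:T}) from the filtering and predictive laws."""
    n = len(filt[0])
    smooth = [None] * len(filt)
    smooth[-1] = list(filt[-1])  # at t = T smoothing equals filtering
    for t in range(len(filt) - 2, -1, -1):
        # Discrete analogue of (8): filtering density times a backward correction.
        smooth[t] = [
            filt[t][i] * sum(trans[i][j] * smooth[t + 1][j] / pred[t + 1][j]
                             for j in range(n))
            for i in range(n)
        ]
    return smooth


# Toy two-state example (all numbers are illustrative).
trans = [[0.9, 0.1], [0.2, 0.8]]
lik = [[0.7, 0.2], [0.6, 0.3], [0.1, 0.8]]
pred, filt = forward_filter([0.5, 0.5], trans, lik)
smooth = backward_smooth(pred, filt, trans)
```

Each smoothed distribution sums to one, and the last one coincides with the last filtering distribution, which is the starting point of the backward pass.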
If functions H(·) and F (·) are linear and if Gaussian distributions are assumed for xt , wt and vt ,
the well-known Kalman filter/smoother can be applied, and the optimal solutions are obtainable.
However, when H(·) and F (·) become nonlinear, integrals in filtering and smoothing cannot be
solved analytically, and numerical approximations are required. In what follows, I will develop a
smoothing algorithm for a nonlinear Gaussian system using the scaled unscented transformation,
The scaled unscented transformation (SUT) is a method for calculating statistics of a random
variable that undergoes a nonlinear transformation (Julier and Uhlmann, 1997). For a nonlinear
function:
$$ y = f(x), \tag{12} $$
assume that the mean and covariance of x (with dimension L) are x̄ and Px . The basic idea of SUT
is that the mean and covariance of y can be computed by forming a set of 2L + 1 sigma points χ:
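$$ \chi_0 = \bar{x}, \tag{13} $$
$$ \chi_i = \bar{x} + \left(\sqrt{(L+\lambda)P_x}\right)_i, \quad i = 1, \ldots, L, \tag{14} $$
$$ \chi_i = \bar{x} - \left(\sqrt{(L+\lambda)P_x}\right)_{i-L}, \quad i = L+1, \ldots, 2L, \tag{15} $$
where $\lambda = \alpha^2(L + \kappa) - L$ is a scaling parameter, α determines the spread of the sigma
points around x̄ and is usually set to be a small positive value, and κ is a second scaling parameter
with value set to 0 or 3 − L. These sigma points are propagated through the nonlinear function f :
$$ \mathcal{Y}_i = f(\chi_i), \quad i = 0, \ldots, 2L. \tag{16} $$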
The mean and covariance of y are then approximated with a weighted sample mean and covariance
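$$ \bar{y} = \sum_{i=0}^{2L} w_i^{(m)} \mathcal{Y}_i, \qquad P_y = \sum_{i=0}^{2L} w_i^{(c)} (\mathcal{Y}_i - \bar{y})(\mathcal{Y}_i - \bar{y})^\top, \tag{17} $$
where the weights are given by
$$ w_0^{(m)} = \frac{\lambda}{L+\lambda}, \qquad w_0^{(c)} = \frac{\lambda}{L+\lambda} + (1 - \alpha^2 + \beta), \tag{18} $$
$$ w_i^{(m)} = w_i^{(c)} = \frac{1}{2(L+\lambda)}, \quad i = 1, 2, \ldots, 2L, \tag{19} $$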
and superscripts (m) and (c) indicate that weights are for construction of the posterior mean and
covariance, respectively, and β is a covariance correction parameter and is used to incorporate prior
knowledge of x.
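The construction in (13) to (19) can be sketched in a few lines of plain Python. This is an illustrative implementation, not the paper's code: the helper names `cholesky`, `sigma_points` and `unscented_transform` are hypothetical, the matrix square root is taken to be the lower Cholesky factor (one common choice, assumed here), and f is restricted to scalar outputs to keep the example short.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sigma_points(mean, cov, alpha=1e-3, beta=2.0, kappa=0.0):
    """Return the 2L + 1 sigma points and the mean/covariance weights."""
    L = len(mean)
    lam = alpha ** 2 * (L + kappa) - L
    root = cholesky([[(L + lam) * c for c in row] for row in cov])
    points = [list(mean)]
    for sign in (1.0, -1.0):
        for i in range(L):
            # Add or subtract the i-th column of the matrix square root.
            points.append([mean[r] + sign * root[r][i] for r in range(L)])
    wm = [lam / (L + lam)] + [0.5 / (L + lam)] * (2 * L)
    wc = list(wm)
    wc[0] += 1.0 - alpha ** 2 + beta
    return points, wm, wc


def unscented_transform(f, mean, cov, **params):
    """Approximate the mean and variance of the scalar y = f(x)."""
    points, wm, wc = sigma_points(mean, cov, **params)
    ys = [f(p) for p in points]
    y_mean = sum(w * y for w, y in zip(wm, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(wc, ys))
    return y_mean, y_var


# Illustrative check on a linear map, where the exact answer is known.
y_mean, y_var = unscented_transform(lambda x: 2 * x[0] - x[1] + 1,
                                    [1.0, 2.0], [[0.04, 0.01], [0.01, 0.09]])
```

For this linear map the exact mean is 1 and the exact variance is 4(0.04) + 0.09 − 4(0.01) = 0.21, which the transform reproduces up to rounding.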
The scaled unscented transformation approximates the posterior mean and covariance with accuracy
up to third order for Gaussian inputs. For non-Gaussian inputs, the accuracy is at least
second order, with the accuracy of third- and higher-order terms determined by the parameters
α and β. Typical values for κ, α and β are 0, $10^{-3}$ and 2, respectively. These values should suffice
The unscented Kalman filter (UKF) is a straightforward application of the scaled unscented
transformation. It was proposed by Julier and Uhlmann (1997, 2004). This nonlinear filter does not
explicitly approximate or linearize the nonlinear observation and state models. It uses the true
nonlinear models and updates state variables through a set of deterministic sigma points generated
To implement the unscented Kalman filter, we first concatenate the state xt−1 , the observation
$$ x^e_{t-1} = \begin{bmatrix} x_{t-1}^\top & w_{t-1}^\top & v_{t-1}^\top \end{bmatrix}^\top, \tag{20} $$
observation noise, and the state noise, respectively, and whose mean and covariance are:
$$ \hat{x}^e_{t-1|t-1} = E\left[x^e_{t-1}\right], \qquad P^e_{t-1} = \begin{bmatrix} P^x_{t-1|t-1} & 0 & 0 \\ 0 & R^w_{t-1} & 0 \\ 0 & 0 & R^v_{t-1} \end{bmatrix}. \tag{21} $$
We then use the scaled unscented transformation to form a set of 2L + 1 sigma points:
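$$ \chi^e_{t-1} = \begin{bmatrix} \hat{x}^e_{t-1|t-1} & \hat{x}^e_{t-1|t-1} + \sqrt{(L+\lambda)P^e_{t-1}} & \hat{x}^e_{t-1|t-1} - \sqrt{(L+\lambda)P^e_{t-1}} \end{bmatrix}, \tag{22} $$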
and the corresponding weights are given in (18) and (19). With these sigma points, we implement
the non-linear Kalman filter as follows, for the time prediction step:
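$$
\begin{aligned}
&\chi^x_{t|t-1} = F(\chi^x_{t-1}, \chi^v_{t-1}), \qquad \hat{x}_{t|t-1} = \sum_{i=0}^{2L} w_i^{(m)} \chi^x_{i,t|t-1}, \\
&P^x_{t|t-1} = \sum_{i=0}^{2L} w_i^{(c)} (\chi^x_{i,t|t-1} - \hat{x}_{t|t-1})(\chi^x_{i,t|t-1} - \hat{x}_{t|t-1})^\top, \\
&\mathcal{Y}_{t|t-1} = H(\chi^x_{t|t-1}, \chi^w_{t|t-1}), \qquad \hat{y}_{t|t-1} = \sum_{i=0}^{2L} w_i^{(m)} \mathcal{Y}_{i,t|t-1}, \\
&P^y_{t|t-1} = \sum_{i=0}^{2L} w_i^{(c)} (\mathcal{Y}_{i,t|t-1} - \hat{y}_{t|t-1})(\mathcal{Y}_{i,t|t-1} - \hat{y}_{t|t-1})^\top, \\
&P^{xy}_{t|t-1} = \sum_{i=0}^{2L} w_i^{(c)} (\chi^x_{i,t|t-1} - \hat{x}_{t|t-1})(\mathcal{Y}_{i,t|t-1} - \hat{y}_{t|t-1})^\top,
\end{aligned}
$$
and for the measurement update step:
$$
\begin{aligned}
&\hat{x}_{t|t} = \hat{x}_{t|t-1} + P^{xy}_{t|t-1} (P^y_{t|t-1})^{-1} (y_t - \hat{y}_{t|t-1}), \\
&P^x_{t|t} = P^x_{t|t-1} - \left[P^{xy}_{t|t-1} (P^y_{t|t-1})^{-1}\right] P^y_{t|t-1} \left[P^{xy}_{t|t-1} (P^y_{t|t-1})^{-1}\right]^\top,
\end{aligned}
$$
for t = 1, 2, . . . , T .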
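One filter step can be sketched as follows for the simplest case of a scalar state and a scalar observation. Under that assumption the augmented covariance in (21) is diagonal, so its square root is taken elementwise and no matrix factorisation is needed. The function name `ukf_step`, its argument order, and the linear test model are hypothetical and chosen only for illustration.

```python
import math


def ukf_step(x, P, y, F, H, Rv, Rw, alpha=1e-3, beta=2.0, kappa=0.0):
    """One UKF prediction/update step for a scalar state and observation."""
    # Augmented vector (state, observation noise w, state noise v).
    mean = [x, 0.0, 0.0]
    var = [P, Rw, Rv]  # diagonal of the augmented covariance
    L = len(mean)
    lam = alpha ** 2 * (L + kappa) - L

    # 2L + 1 sigma points: the mean plus symmetric steps along each axis.
    points = [list(mean)]
    for sign in (1.0, -1.0):
        for i in range(L):
            p = list(mean)
            p[i] += sign * math.sqrt((L + lam) * var[i])
            points.append(p)
    wm = [lam / (L + lam)] + [0.5 / (L + lam)] * (2 * L)
    wc = list(wm)
    wc[0] += 1.0 - alpha ** 2 + beta

    # Time prediction: push state and state-noise components through F.
    xs = [F(p[0], p[2]) for p in points]
    x_pred = sum(w * v for w, v in zip(wm, xs))
    P_pred = sum(w * (v - x_pred) ** 2 for w, v in zip(wc, xs))

    # Predicted observation with its variance and cross-covariance.
    ys = [H(xi, p[1]) for xi, p in zip(xs, points)]
    y_pred = sum(w * v for w, v in zip(wm, ys))
    P_y = sum(w * (v - y_pred) ** 2 for w, v in zip(wc, ys))
    P_xy = sum(w * (xi - x_pred) * (yi - y_pred)
               for w, xi, yi in zip(wc, xs, ys))

    # Measurement update.
    gain = P_xy / P_y
    return x_pred + gain * (y - y_pred), P_pred - gain * P_y * gain


a = 0.9  # illustrative linear model: x_t = a x_{t-1} + v_t, y_t = x_t + w_t
x_new, P_new = ukf_step(1.0, 0.5, 1.3, lambda x, v: a * x + v,
                        lambda x, w: x + w, Rv=0.1, Rw=0.2)
```

With a linear model such as the one above, the step reduces to the ordinary Kalman filter update, which gives a convenient sanity check.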
2.4. The unscented Kalman smoother
The Bayesian optimal forward-backward smoother presented in subsection 2.1 can also be
approximated using the scaled unscented transformation. At each time t, the filtering density
p(xt |y1:t ) and the predictive density p(xt+1 |y1:t ) can be found by UKF and SUT, respectively, and
they are assumed to be normal. Using the basic property of the Gaussian distribution and the
Markov property of the system leads to the density p(xt |xt+1 , y1:t ) = p(xt |xt+1 , y1:T ), which is also
normal. Assuming that, at time t, the smoothing density of time t + 1 is known and normal,
$x_{t+1} \mid y_{1:T} \sim N(\hat{x}^s_{t+1}, P^{xs}_{t+1})$, we can then find the joint normal distribution p(xt , xt+1 |y1:T ), from which
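A single-step smoothing recursion can be performed as follows: first, concatenate the state xt and
the state noise vt at time t and form sigma points for the augmented random variable $\tilde{x}_t = (x_t^\top, v_t^\top)^\top$:
$$ \tilde{\chi}_t = \begin{bmatrix} \hat{\tilde{x}}_{t|t} & \hat{\tilde{x}}_{t|t} + \sqrt{(L+\lambda)\tilde{P}_t} & \hat{\tilde{x}}_{t|t} - \sqrt{(L+\lambda)\tilde{P}_t} \end{bmatrix}, \tag{23} $$
where
$$ \hat{\tilde{x}}_{t|t} = \begin{bmatrix} \hat{x}_{t|t} \\ 0 \end{bmatrix}, \qquad \tilde{P}_t = \begin{bmatrix} P^x_{t|t} & 0 \\ 0 & R^v_t \end{bmatrix}. $$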
Second, propagate the sigma points through the nonlinear state function and compute the
predicted values:
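$$
\begin{aligned}
&\tilde{\chi}^x_{t+1|t} = F(\tilde{\chi}^x_t, \tilde{\chi}^v_t), \qquad \hat{\tilde{x}}_{t+1|t} = \sum_{i=0}^{2L} w_i^{(m)} \tilde{\chi}^x_{i,t+1|t}, \\
&\tilde{P}^x_{t+1|t} = \sum_{i=0}^{2L} w_i^{(c)} (\tilde{\chi}^x_{i,t+1|t} - \hat{\tilde{x}}_{t+1|t})(\tilde{\chi}^x_{i,t+1|t} - \hat{\tilde{x}}_{t+1|t})^\top, \\
&\tilde{C}_{t+1} = \sum_{i=0}^{2L} w_i^{(c)} (\tilde{\chi}^x_{i,t+1|t} - \hat{\tilde{x}}_{t+1|t})(\tilde{\chi}^x_{i,t} - \hat{x}_{t|t})^\top,
\end{aligned}
$$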
where C̃t+1 is the unscented transformation-based Gaussian approximation to the covariance be-
Finally, compute the smoothed mean and covariance of the state:
$$ P^{xs}_t = P^x_{t|t} + \tilde{C}_{t+1} (\tilde{P}^x_{t+1|t})^{-1} \left(P^{xs}_{t+1} - \tilde{P}^x_{t+1|t}\right) \left(\tilde{C}_{t+1} (\tilde{P}^x_{t+1|t})^{-1}\right)^\top. $$
Because the smoothing density is the same as the filtering density at final time T , the above
smoothing recursion starts from the last step with x̂sT = x̂T |T and PxsT = PTx|T and proceeds
backward to the initial time, for t = T, T − 1, . . . , 1. This smoothing algorithm originates from
the linear technique in the sense of Anderson and Moore (1979, p.189) and Hamilton (1994, p.394).
When the system is linear, the above algorithm is exactly the same as the linear smoother.
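The backward pass can be sketched for a scalar state, where the augmented covariance is diagonal and all matrix products reduce to scalar arithmetic. This is an illustration under stated assumptions rather than the paper's implementation: the name `uks_backward` is hypothetical, the filtered means and variances fed to it are made-up numbers, and the smoothed-mean update uses the standard Rauch-Tung-Striebel form, which is assumed here.

```python
import math


def uks_backward(xf, Pf, F, Rv, alpha=1e-3, beta=2.0, kappa=0.0):
    """Backward unscented smoothing pass for a scalar state.

    xf, Pf : filtered means and variances for t = 0, ..., T-1
    F      : state function F(x, v); Rv is the state-noise variance
    """
    L = 2  # augmented vector (state, state noise)
    lam = alpha ** 2 * (L + kappa) - L
    wm = [lam / (L + lam)] + [0.5 / (L + lam)] * (2 * L)
    wc = list(wm)
    wc[0] += 1.0 - alpha ** 2 + beta

    xs, Ps = list(xf), list(Pf)  # at the final time smoothing equals filtering
    for t in range(len(xf) - 2, -1, -1):
        mean, var = [xf[t], 0.0], [Pf[t], Rv]
        points = [list(mean)]
        for sign in (1.0, -1.0):
            for i in range(L):
                p = list(mean)
                p[i] += sign * math.sqrt((L + lam) * var[i])
                points.append(p)
        # Propagate through the state function and form predicted moments.
        nxt = [F(p[0], p[1]) for p in points]
        m_pred = sum(w * v for w, v in zip(wm, nxt))
        P_pred = sum(w * (v - m_pred) ** 2 for w, v in zip(wc, nxt))
        # Cross-covariance between x_{t+1} and x_t.
        C = sum(w * (v - m_pred) * (p[0] - xf[t])
                for w, v, p in zip(wc, nxt, points))
        gain = C / P_pred
        xs[t] = xf[t] + gain * (xs[t + 1] - m_pred)  # assumed RTS mean update
        Ps[t] = Pf[t] + gain * (Ps[t + 1] - P_pred) * gain
    return xs, Ps


a, Rv = 0.9, 0.1  # illustrative linear state equation x_{t+1} = a x_t + v
xf, Pf = [1.0, 0.8, 1.1], [0.5, 0.4, 0.3]
xs, Ps = uks_backward(xf, Pf, lambda x, v: a * x + v, Rv)
```

For the linear state equation used in the example, the recursion coincides with the linear smoother, consistent with the remark above.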
3. Simulation study
Subsection 3.1 briefly introduces the Heston stochastic volatility model. Subsection 3.2 constructs
the state-space representation. Subsection 3.3 conducts simulations and discusses their implications.
The Heston stochastic volatility model (Heston, 1993) is one of the most popular models both
in academia and in practice. Under a given probability space (Ω, F, P ) and the complete filtration
$$ \frac{dS_t}{S_t} = (r + \pi_W V_t)\,dt + \sqrt{V_t}\,dW_t, \tag{24} $$
$$ dV_t = \kappa(\theta - V_t)\,dt + \sigma\sqrt{V_t}\,dZ_t, \tag{25} $$
where r is a constant risk-free interest rate, πW is the market price of diffusion risk, κ is the
volatility mean reversion parameter, θ is the long-run volatility parameter, and σ captures volatility
of volatility. Wt and Zt are two correlated Brownian motions with a correlation parameter ρ ∈
Assume that there exists an equivalent martingale measure Q, under which the risk-neutral
model is defined as:
$$ \frac{dS_t}{S_t} = r\,dt + \sqrt{V_t}\,dW_t^Q, \tag{26} $$
$$ dV_t = \left[\kappa(\theta - V_t) - \pi_V V_t\right]dt + \sigma\sqrt{V_t}\,dZ_t^Q, \tag{27} $$
where WtQ and ZtQ are two Brownian motions under the risk-neutral measure and are correlated
with the same parameter ρ, and πV is the market price of volatility risk.
For this risk-neutral model, the conditional characteristic function of log returns Rt = ln(St /St−τ )
can be derived using approaches of Duffie, Pan, and Singleton (2000) and Carr and Wu (2004). It
where
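$$
\begin{aligned}
A(u, \tau) &= \frac{\kappa\theta}{\sigma^2}\left[2\log\left(1 - \frac{(\gamma - \kappa^*)(1 - e^{-\gamma\tau})}{2\gamma}\right) + (\gamma - \kappa^*)\tau\right], \\
B(u, \tau) &= \frac{2\varphi(u)(1 - e^{-\gamma\tau})}{2\gamma - (\gamma - \kappa^*)(1 - e^{-\gamma\tau})}, \\
\gamma &= \sqrt{(\kappa^*)^2 + 2\sigma^2\varphi(u)}, \\
\varphi(u) &= \frac{1}{2}(iu + u^2), \\
\kappa^* &= (\kappa + \pi_V) - iu\rho\sigma.
\end{aligned}
$$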
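The coefficient functions above can be evaluated directly with complex arithmetic. The sketch below only computes A(u, τ) and B(u, τ) as defined here; it does not assemble the characteristic function itself or price options. The function name `heston_cf_coefficients` and the parameter values in the example call are illustrative assumptions.

```python
import cmath


def heston_cf_coefficients(u, tau, kappa, theta, sigma, rho, pi_v=0.0):
    """Evaluate the coefficient functions A(u, tau) and B(u, tau)."""
    phi = 0.5 * (1j * u + u * u)
    kappa_star = (kappa + pi_v) - 1j * u * rho * sigma
    gamma = cmath.sqrt(kappa_star ** 2 + 2.0 * sigma ** 2 * phi)
    decay = 1.0 - cmath.exp(-gamma * tau)
    A = (kappa * theta / sigma ** 2) * (
        2.0 * cmath.log(1.0 - (gamma - kappa_star) * decay / (2.0 * gamma))
        + (gamma - kappa_star) * tau
    )
    B = 2.0 * phi * decay / (2.0 * gamma - (gamma - kappa_star) * decay)
    return A, B


# Illustrative call: one-month horizon, parameter values chosen for the example.
A1, B1 = heston_cf_coefficients(1.0, 1.0 / 12.0, 3.0, 0.04, 0.30, -0.60)
# At u = 0 both coefficients vanish, since phi(0) = 0 and gamma = kappa*.
A0, B0 = heston_cf_coefficients(0.0, 1.0 / 12.0, 3.0, 0.04, 0.30, -0.60)
```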
With this conditional characteristic function, we can price European-type options with the fast
Fourier transform method (Carr and Madan, 1997). In this paper, I use a more efficient algorithm,
state-space representation. I first de-correlate the stock price and volatility for applicability of
Bayesian filtering and smoothing methods. Taking into account [dWt , dZt ] = ρdt, I rewrite the
stock price process (24) and the volatility process (25) as follows:
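$$ d\ln S_t = \left(r + \pi_W V_t - \frac{1}{2}V_t\right)dt + \rho\sqrt{V_t}\,dZ_t + \sqrt{1-\rho^2}\,\sqrt{V_t}\,dW_t^*, \tag{29} $$
$$ \sqrt{V_t}\,dZ_t = \frac{1}{\sigma}\left[dV_t - \kappa(\theta - V_t)\,dt\right], \tag{30} $$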
where Wt∗ and Zt are independent. Note that Brownian motion Wt∗ in (29) is different from that
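$$ d\ln S_t = r\,dt + \left(\pi_W - \frac{1}{2}\right)V_t\,dt + \frac{\rho}{\sigma}\left[dV_t - \kappa(\theta - V_t)\,dt\right] + \sqrt{1-\rho^2}\,\sqrt{V_t}\,dW_t^*, \tag{31} $$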
which can be regarded as one of the observation equations. The volatility process (25) is regarded as a
state equation. Now we can see that the noise in (31) is independent of that in (25).
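The step from (29) and (30) to (31) is a direct substitution. Written out as a sketch, with $W_t^*$ denoting the Brownian motion that is independent of $Z_t$:

```latex
\begin{aligned}
d\ln S_t
&= \left(r + \pi_W V_t - \tfrac{1}{2}V_t\right)dt
   + \rho\,\underbrace{\sqrt{V_t}\,dZ_t}_{\text{replace using (30)}}
   + \sqrt{1-\rho^2}\,\sqrt{V_t}\,dW_t^* \\
&= r\,dt + \left(\pi_W - \tfrac{1}{2}\right)V_t\,dt
   + \frac{\rho}{\sigma}\left[dV_t - \kappa(\theta - V_t)\,dt\right]
   + \sqrt{1-\rho^2}\,\sqrt{V_t}\,dW_t^* .
\end{aligned}
```

The only remaining noise term is driven by $W_t^*$, which is why the observation noise no longer correlates with the state noise $Z_t$.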
Taking into consideration options and discretizing the model with a time interval τ , we then
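Measurement:
$$
\begin{aligned}
\ln S_t = {} & \ln S_{t-\tau} + \left(r - \frac{\rho}{\sigma}\kappa\theta\right)\tau + \frac{\rho}{\sigma}V_t \\
& + \left[\frac{\rho}{\sigma}(\kappa\tau - 1) + \left(\pi_W - \frac{1}{2}\right)\tau\right]V_{t-\tau} + \sqrt{1-\rho^2}\,\sqrt{\tau V_{t-\tau}}\,w_t,
\end{aligned} \tag{32}
$$
$$ y_t^O = f(S_t, V_t, \Theta) + \varepsilon_t^O, \tag{33} $$
State:
$$
\begin{pmatrix} V_t \\ V_{t-\tau} \end{pmatrix}
= \begin{pmatrix} \kappa\theta\tau \\ 0 \end{pmatrix}
+ \begin{pmatrix} 1 - \kappa\tau & 0 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} V_{t-\tau} \\ V_{t-2\tau} \end{pmatrix}
+ \begin{pmatrix} \sigma\sqrt{\tau V_{t-\tau}} \\ 0 \end{pmatrix} z_t, \tag{34}
$$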
where both Vt and Vt−τ are regarded as states, options ytO are assumed to be observed with i.i.d
option price computed from the model. In this state-space model, the option pricing measurement
[Figure 1. Upper panel titles: Returns (left), BS Imp. Vol. (right). Horizontal axes: observations 0 to 500.]
Note: Upper panels present the simulated stock returns (left) and the BS implied volatility (right) of the simulated
at-the-money short maturity call options. The true parameters are Θ = (3, 0.04, 0.3, −0.6). The middle panels plot
the filtered volatility (left) and the smoothed volatility (right) when using stock prices alone. The lower panels show
the filtered volatility (left) and the smoothed volatility (right) when jointly using stock prices and options. The dots
represent the true values and the lines the estimated ones.
For demonstration, I assume zero risk premia (πW = 0, and πV = 0). Under this setting, we have
parameters Θ = {κ, θ, σ, ρ} and the state xt = Vt . In simulations, 500 daily observations of stock
prices and options are simulated with the initial values S0 = 100, V0 = 0.04 and the true parameters
Θ∗ = {3.00, 0.04, 0.30, −0.60}, which are the typical values obtained from the empirical studies.
The simulated options are those of at-the-money short maturity calls with strike equal to the stock
price and with maturity equal to one month. I assume that the risk-free interest rate is known and
equal to 5% and that option prices are contaminated by the measurement noise $\varepsilon_t^O \sim N(0, \sigma_O^2)$,
where σO is set to 10%. The upper panels of Figure 1 present a sequence of simulated stock returns
and a series of the Black-Scholes implied volatility of the simulated at-the-money short maturity
call options.
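A simulation of this kind can be sketched with the discretized equations (32) and (34) under zero risk premia. The code below is only an illustration: the function name `simulate_heston`, the daily step τ = 1/252, the small positive floor on the variance, and the random seed are assumptions that are not stated in the text, and option prices are not simulated.

```python
import math
import random


def simulate_heston(n=500, s0=100.0, v0=0.04, kappa=3.0, theta=0.04,
                    sigma=0.30, rho=-0.60, r=0.05, tau=1.0 / 252.0, seed=0):
    """Simulate log prices and variances with zero risk premia."""
    rng = random.Random(seed)
    log_s, v = [math.log(s0)], [v0]
    for _ in range(n):
        v_prev = v[-1]
        z = rng.gauss(0.0, 1.0)  # state noise
        w = rng.gauss(0.0, 1.0)  # observation noise, independent of z
        # State equation (34): Euler step for the variance.
        v_new = v_prev + kappa * (theta - v_prev) * tau \
            + sigma * math.sqrt(tau * v_prev) * z
        v_new = max(v_new, 1e-8)  # assumed floor to keep the variance positive
        # Measurement equation (32) with pi_W = 0.
        drift = (r - rho / sigma * kappa * theta) * tau \
            + rho / sigma * v_new \
            + (rho / sigma * (kappa * tau - 1.0) - 0.5 * tau) * v_prev
        log_s.append(log_s[-1] + drift
                     + math.sqrt((1.0 - rho ** 2) * tau * v_prev) * w)
        v.append(v_new)
    return log_s, v


log_s, v = simulate_heston()
returns = [b - a for a, b in zip(log_s[:-1], log_s[1:])]
```

Changing `kappa` or `sigma` in the call gives paths for other mean-reversion speeds or volatility-of-volatility levels.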
I first implement the unscented Kalman filter and smoother using the stock price data alone to
see whether stock prices contain enough information to effectively capture the volatility dynamics.
[Figure 2. Panel titles: Returns (κ = 0.5) and Returns (κ = 10) (upper), Filtering (middle), Smoothing (lower). Horizontal axes: observations 0 to 500.]
Note: Figure presents the filtered and smoothed volatility for a persistent volatility process (left panels) and
a non-persistent volatility process (right panels). The former is simulated using the true parameters Θ =
(0.5, 0.04, 0.30, −0.60), and the latter is simulated using the true parameters Θ = (10, 0.04, 0.30, −0.60). Upper
panels are the simulated stock returns, middle panels the filtered volatility and lower panels the smoothed volatility.
The dots represent the true values and the lines the estimated ones.
The middle panels of Figure 1 plot the filtered volatility (left) and the smoothed volatility (right).
We find that both the filtered and smoothed volatility estimates clearly deviate from the true path.
I then combine the data on stock prices and options and see whether the latter are helpful
for volatility extraction. The lower panels of Figure 1 present the filtered volatility (left) and the
smoothed volatility (right). The main observations are that the precision of volatility extraction
has dramatically improved both in filtering and in smoothing and that the smoothed volatility has
smaller variation than the filtered volatility. Why can we achieve so much improvement? Eco-
nomically, options contain different information from stock prices. The traded options encode the
assessment of market participants of the volatility risk and therefore reflect the expectation of fu-
ture market movements. They are forward-looking and implicitly embody all available information.
However, stock prices mainly contain the historical information. Statistically, the volatility factor
enters into the option pricing formula in a non-linear form, which can further help us pin down
volatility in estimation.
[Figure 3. Panel titles: Returns (σ = 0.1) and Returns (σ = 0.8) (upper), Filtering (middle), Smoothing (lower). Horizontal axes: observations 0 to 500.]
Note: Figure presents the filtered and smoothed volatility for a low volatility of volatility process (left panels)
and a high volatility of volatility process (right panels). The former is simulated using the true parameters
Θ = (3, 0.04, 0.10, −0.60), and the latter is simulated using the true parameters Θ = (3, 0.04, 0.80, −0.60). Upper
panels are the simulated stock returns, middle panels the filtered volatility and lower panels the smoothed volatility.
The dots represent the true values and the lines the estimated ones.
The parameter κ controls how fast volatility reverts to its long-run mean θ. A small κ
implies that the volatility process is very persistent, whereas a large κ indicates that the effect of a
volatility shock on returns dissipates very quickly. I investigate these cases to see whether the
unscented Kalman filter/smoother can capture different degrees of volatility persistence. Figure 2
presents the true and filtered/smoothed volatility for a persistent process (κ = 0.5, left panels) and
a non-persistent one (κ = 10, right panels) using both stock prices and options. We find that for
these different volatility processes, the unscented Kalman filter/smoother can efficiently capture the
true values. We again observe that the smoothed volatility is less noisy than the filtered one.
volatility can have a big change within a small time interval even under a persistent case. I also
study cases with different values of σ. Figure 3 presents results for a low volatility of volatility
(σ = 0.1, left panels) process and a high volatility of volatility process (σ = 0.8, right panels). We
find that for the low volatility of volatility case, there is no difficulty in extracting the true volatility
Table 1: Monte Carlo Study
Note: Table presents means of root mean square errors across 200 Monte Carlo simulations for different volatility
processes. The persistent volatility process is simulated using the true parameters Θ = (0.5, 0.04, 0.30, −0.60), and
the non-persistent process is simulated using the same parameters as the persistent process except κ, which has a
value of 10. The low volatility of volatility process is simulated using the true parameters Θ = (3, 0.04, 0.1, −0.60),
and the high volatility of volatility process is simulated using the same parameters as the low volatility of volatility
process except σ, which has a value of 0.8. EKF/EKS stand for the extended Kalman filter/smoother, UKF/UKS
for the unscented Kalman filter/smoother, and PF/PS for the bootstrap particle filter/smoother. The last column
presents the average relative computational time where the computational time of the UKS is normalized to 1.
path using the unscented Kalman filter and/or the unscented Kalman smoother. However, we find
that for the high volatility of volatility case, the unscented Kalman filter has difficulty filtering
the small volatility values around data point 420. The unscented Kalman smoother can mostly
correct this problem since the smoother uses all available information, whereas the filter only uses
(UKF/UKS) for volatility extraction, I implement 200 Monte Carlo simulations for different scenar-
ios discussed before. For the purpose of comparison, I also present the performance of the extended
Kalman filter/smoother (EKF/EKS; Cox, 1964) and the bootstrap particle filter/smoother (PF/PS;
Klaas et al., 2006). In EKF/EKS, I use the central-difference method to approximate derivatives,
and in PF/PS, I use 5000 particles when using stock prices alone and 1000 particles when using both
stock prices and options. Table 1 presents means of root mean square errors (RMSE) of volatil-
ity estimates and the relative computational time of each algorithm. The following findings are
obtained for the Heston stochastic volatility model. First, the unscented Kalman filter/smoother
perform nearly the same as the extended Kalman filter/smoother whether we use stock prices
alone or jointly use stock prices and options. Second, the unscented/extended Kalman smoothers
outperform the unscented/extended Kalman filters. The improvement of the smoothers upon the
filters is much more dramatic when jointly using stock prices and options. Third, compared with
the particle filter/smoother, UKF/UKS (EKF/EKS) underperform PF/PS when we use
stock prices alone. However, when we take into account both stock prices and options, the particle
filter performs almost the same as UKF/EKF but underperforms UKS/EKS. Fourth, computationally,
EKF/EKS is slightly faster than UKF/UKS, whereas PF/PS is highly computationally
demanding. In particular, when we use both stock prices and options, the particle smoother is too
computationally demanding to be practically feasible. Finally, EKF/EKS perform
as well as UKF/UKS because, in the Heston stochastic volatility model, the state system is linear
and one-dimensional, and options are priced using an exponentially affine characteristic function.
4. Empirical applications
This section investigates the unscented Kalman smoother with the real data on S&P 500 index
and index options. Subsection 4.1 describes the data, subsection 4.2 presents the filtered and
smoothed volatility, and subsection 4.3 discusses the option pricing implications.
4.1. Data
The data used are the S&P 500 index and index options traded on the Chicago Board Options
Exchange (CBOE) during the period from January 1996 to September 2008. They are sampled at weekly
frequency, using prices from Wednesdays. If Wednesday is a holiday, we select Thursday
options. There are 664 weeks in total. The data are obtained from OptionMetrics.
For the purpose of volatility extraction, I construct two sets of options. One is the at-the-money
short maturity (ATM-SM) calls with maturity greater than 15 days and less than 50 days and with
moneyness (S/K) closest to 1, and the other is the out-of-the-money short maturity (OTM-SM) calls
with maturity greater than 15 days and less than 50 days and with moneyness (S/K) closest to
0.95. We thus have three sets of observations: one set of stock prices and two sets of option prices.
Table 2 presents summary statistics of index returns (panel A) and the constructed options (panel
B).
Table 2: Descriptive Statistics of the Data and Parameter Estimates
B. Constructed Calls
Mean Mn. Std Mn. Mean Mt. Std Mt. Mean IV. Std IV.
C. Parameter Estimates
κ θ σ ρ πV μ
Note: Table presents the descriptive statistics of weekly data on the S&P 500 index and index options from January
1996 to September 2008 and the parameter estimates of the Heston stochastic volatility model. In panel A, the mean
and standard deviation are annualized. In panel B, Mn stands for moneyness, Mt for maturity (in days), and IV
for the Black-Scholes implied volatility. In panel C, parameters are estimated using an EM algorithm based on the
unscented Kalman smoother.
When investigating option pricing implications with the filtered/smoothed volatility, I apply
the following filters to the dataset. First, I only consider call options. Second, in order to ensure
that options are liquid enough, I select call options with maturity less than 1.5 years and with
moneyness greater than 0.90 and less than 1.03. Furthermore, I rule out options with zero trading
volume or with open interest less than 100 contracts. Lastly, I exclude call options with maturity
less than 10 days or best bid prices less than 3/8 dollar to mitigate market microstructure problems.
As a result, there are 28,557 call options in total and 43 options each day on average.
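The filtering rules above can be expressed as a single predicate. The sketch below is illustrative only: the function name `keep_option` and the dictionary keys are hypothetical and do not correspond to actual OptionMetrics field names, and maturity is assumed to be measured in calendar days with a 365-day year.

```python
def keep_option(opt):
    """Return True if an option record passes all of the stated filters."""
    moneyness = opt["spot"] / opt["strike"]  # S/K
    return (
        opt["is_call"]                                   # calls only
        and 10 <= opt["days_to_maturity"] < 1.5 * 365    # 10 days to 1.5 years
        and 0.90 < moneyness < 1.03                      # moneyness window
        and opt["volume"] > 0                            # non-zero trading volume
        and opt["open_interest"] >= 100                  # at least 100 contracts
        and opt["best_bid"] >= 3.0 / 8.0                 # bid of at least 3/8 dollar
    )


# Illustrative record (all values are made up).
example = {"is_call": True, "spot": 1000.0, "strike": 1020.0,
           "days_to_maturity": 45, "volume": 250, "open_interest": 1200,
           "best_bid": 12.5}
kept = keep_option(example)
```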
The Heston stochastic volatility model discussed in the previous section is again taken as an
example. I first obtain parameter estimates using a likelihood-based method. Panel C of Table 2
presents parameter estimates. The objective mean-reverting parameter is about 2.4, the long-run
mean of the volatility process is around 20%, and the volatility of volatility is 0.33. All of them are
highly significant. The negative estimate of the volatility risk premium (not significant) indicates
that the risk-neutral volatility process is much more persistent than the objective one. With these
[Figure 4. Upper panel titles: Returns (left), BS Imp. Vol. (right). Horizontal axes: 1996 to 2008.]
Note: Figure presents the filtered and smoothed volatility for the Heston stochastic volatility model using real data
on the S&P 500 index and index options. The upper panels are the index returns (left) and the average BS implied
volatility (right). The middle left panel plots the filtered volatility when using stock prices alone (dashed line) and
when jointly using stock prices and options (solid line). The middle right panel plots the smoothed volatility when
using stock prices alone (dashed line) and when jointly using stock prices and options (solid line). The lower left panel
shows the difference between the filtered volatility and the smoothed one when using stock prices alone. The lower right
panel shows the difference between the filtered volatility and the smoothed one when jointly using stock prices and options.
Figure 4 presents the filtered/smoothed volatility using the S&P 500 index and/or index options.
The top panels are index returns and the average Black-Scholes implied volatility, and the middle
panels present the filtered (left panel) and smoothed (right panel) volatility using stock prices
(dashed line) and jointly using stock prices and options (solid line). Compared with the
evolution of index returns and the Black-Scholes implied volatility, we can clearly see that the
filtered and smoothed results from jointly using stock prices and options are reasonably much
The lower panels plot differences between the filtered and smoothed volatility obtained from
using stock prices alone (left panel) and from jointly using stock prices and options (right panel).
We note that the difference between the filtered and smoothed volatility can be large when
only stock prices are used, whereas adding option data makes this difference much smaller.
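The gap between filtered and smoothed estimates can be illustrated on a toy linear Gaussian model. The sketch below is not the unscented smoother of this paper: it assumes a scalar AR(1) state observed with additive noise, and it uses a standard Kalman filter followed by a Rauch-Tung-Striebel backward pass. The function names, the parameters a, q, r, m0, p0 and the observations are all hypothetical.

```python
def kalman_filter(ys, a, q, r, m0, p0):
    """Forward pass for x_t = a*x_{t-1} + w_t, y_t = x_t + v_t (assumed model).

    Returns filtered means/variances and the one-step predictions,
    which the backward pass needs.
    """
    m, p = m0, p0
    filt_m, filt_p, pred_m, pred_p = [], [], [], []
    for y in ys:
        mp = a * m                # predicted mean
        pp = a * a * p + q        # predicted variance
        k = pp / (pp + r)         # Kalman gain
        m = mp + k * (y - mp)     # filtered mean
        p = (1.0 - k) * pp        # filtered variance
        pred_m.append(mp)
        pred_p.append(pp)
        filt_m.append(m)
        filt_p.append(p)
    return filt_m, filt_p, pred_m, pred_p


def rts_smoother(filt_m, filt_p, pred_m, pred_p, a):
    """Backward pass: revise each filtered estimate using later observations."""
    sm, sp = list(filt_m), list(filt_p)
    for t in range(len(filt_m) - 2, -1, -1):
        g = filt_p[t] * a / pred_p[t + 1]   # smoother gain
        sm[t] = filt_m[t] + g * (sm[t + 1] - pred_m[t + 1])
        sp[t] = filt_p[t] + g * g * (sp[t + 1] - pred_p[t + 1])
    return sm, sp


# Made-up observations, for illustration only.
ys = [0.04, 0.05, 0.09, 0.06, 0.05]
fm, fp, pm, pp = kalman_filter(ys, a=0.95, q=1e-4, r=4e-4, m0=0.04, p0=1e-3)
sm, sp = rts_smoother(fm, fp, pm, pp, a=0.95)
```

In this toy setting the smoothed variance is never larger than the filtered one, and the two estimates coincide at the last date, where no future observations are available.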
We also notice that even though the smoothed volatility has less variation than the filtered one,
they look very similar to each other, especially those obtained from jointly using both datasets.
This is because the volatility process is modeled as a diffusion process, is persistent and cannot
Table 3: Ratios of Absolute Option Pricing Errors
                       Moneyness (S/K)
Maturity   Ratio   0.90-0.94   0.94-0.97   0.97-1.00   1.00-1.03
Note: The table presents ratios of absolute option pricing errors. VFJ/VFS (VSJ/VSS) represents the ratio of the option pricing
error using the filtered (smoothed) volatility obtained from jointly using stock prices and options to that using the
filtered (smoothed) volatility obtained from stock prices alone. VSJ/VFJ represents the ratio of the option pricing error
using the smoothed volatility to that using the filtered volatility when jointly using stocks and options.
We now investigate option pricing implications. I divide the constructed call options into 12
groups by maturity (10-60 days, 60-180 days and longer than 180 days) and by moneyness
(0.90-0.94, 0.94-0.97, 0.97-1.00 and 1.00-1.03). I then compute the absolute option pricing error
using the filtered and smoothed volatility obtained in the previous subsection. The absolute
option pricing error is defined as
\[
\mathrm{Aerr} = \frac{1}{N} \sum_{t=1}^{T} \sum_{i=1}^{n_t} \left| P_{ti}^{im} - P_{ti} \right|, \tag{35}
\]
where $N$ is the total number of options we consider, $T$ the number of weeks, $n_t$ the number of
options at date $t$, and $P_{ti}^{im}$ and $P_{ti}$ are the model-implied and the market-observed option prices of
Table 3 presents ratios of the absolute option pricing errors. VFJ/VFS (VSJ/VSS) represents
the ratio of the option pricing error using the filtered (smoothed) volatility obtained from jointly
using stock prices and options to that obtained from using stock prices alone. VSJ/VFJ represents
the ratio of the option pricing error using the smoothed volatility to that using the filtered
volatility when jointly using stock prices and options. All of the VFJ/VFS and VSJ/VSS ratios are
well below one (from 0.407 to 0.592 and from 0.406 to 0.565, respectively), indicating that the
volatility obtained from jointly using stock prices and options is much more accurate than that
obtained from using stock prices alone. In most groups, the VSJ/VFJ ratios are smaller than one,
implying that volatility obtained from the smoother is better than that obtained from the filter.
Three VSJ/VFJ ratios are larger than one. This may be due to model misspecification, as the
Heston model is known to work poorly for short-maturity and out-of-the-money options.
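The computation behind the ratios can be sketched as follows. This is a minimal illustration of the absolute pricing error in equation (35), evaluated within each maturity/moneyness group, and of the ratio of two such errors. The dictionary keys, the half-open bucket boundaries and the toy prices are assumptions made for the example, not the data or code used in the paper.

```python
from collections import defaultdict

# Bucket edges follow the grouping described in the text; treating them as
# half-open intervals [low, high) is an assumption.
MATURITY_BUCKETS = [(10, 60), (60, 180), (180, float("inf"))]
MONEYNESS_BUCKETS = [(0.90, 0.94), (0.94, 0.97), (0.97, 1.00), (1.00, 1.03)]


def bucket_of(value, buckets):
    """Return the first (low, high) bucket with low <= value < high, else None."""
    for low, high in buckets:
        if low <= value < high:
            return (low, high)
    return None


def absolute_errors(options, price_key):
    """Average |model price - market price| within each group.

    `options` is a list of dicts with hypothetical keys 'days', 'moneyness',
    'market', plus one model-price key per volatility estimate.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for opt in options:
        group = (bucket_of(opt["days"], MATURITY_BUCKETS),
                 bucket_of(opt["moneyness"], MONEYNESS_BUCKETS))
        if None in group:
            continue  # option falls outside every group
        total[group] += abs(opt[price_key] - opt["market"])
        count[group] += 1
    return {g: total[g] / count[g] for g in total}


def error_ratios(options, numerator_key, denominator_key):
    """Group-wise ratio of two absolute pricing errors, e.g. VFJ/VFS."""
    num = absolute_errors(options, numerator_key)
    den = absolute_errors(options, denominator_key)
    return {g: num[g] / den[g] for g in num if den.get(g, 0.0) > 0.0}


# Toy prices, invented for illustration only.
toy = [
    {"days": 30, "moneyness": 0.95, "market": 10.0, "VFJ": 10.5, "VFS": 11.0},
    {"days": 45, "moneyness": 0.96, "market": 8.0, "VFJ": 7.5, "VFS": 9.0},
]
ratios = error_ratios(toy, "VFJ", "VFS")
```

A ratio below one for a group means that the numerator's volatility estimate prices the options in that group more accurately than the denominator's.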
5. Conclusion
This paper proposes a smoothing algorithm based on the unscented transformation and shows
how it can be used for volatility extraction in diffusion models. The algorithm first implements a
forward unscented Kalman filter and then invokes a separate backward smoothing pass to obtain
the smoothing solution for nonlinear systems. A simulation study and empirical applications with
the Heston stochastic volatility model indicate that in order to accurately capture the volatility
dynamics, both stock prices and options are necessary. The paper also finds that volatility obtained
from the smoother can in general result in smaller option pricing errors than that obtained from
the filter.
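The building block of the forward pass is the unscented transformation. As a minimal sketch under stated assumptions (a scalar state, the standard scaled sigma-point weights, and illustrative defaults alpha=1, beta=0, kappa=2), the following propagates a Gaussian mean and variance through a function. The function name is hypothetical, and the sketch covers neither the full filter nor the backward smoothing pass of this paper.

```python
import math


def unscented_transform_1d(mean, var, f, alpha=1.0, beta=0.0, kappa=2.0):
    """Approximate the mean and variance of f(x) for x ~ N(mean, var).

    Scalar case (n = 1) of the scaled unscented transformation; the
    default tuning values are assumptions chosen for illustration.
    """
    n = 1
    lam = alpha ** 2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * var)

    # Three sigma points: the mean and one point on each side of it.
    points = [mean, mean + spread, mean - spread]

    # Weights for the mean and for the variance.
    w0_mean = lam / (n + lam)
    w0_var = w0_mean + (1.0 - alpha ** 2 + beta)
    wi = 1.0 / (2.0 * (n + lam))
    w_mean = [w0_mean, wi, wi]
    w_var = [w0_var, wi, wi]

    # Propagate the sigma points and recombine them.
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(w_mean, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(w_var, ys))
    return y_mean, y_var


# For a linear map the transformation is exact: mean 2m + 1, variance 4v.
lin_mean, lin_var = unscented_transform_1d(0.04, 1e-4, lambda x: 2.0 * x + 1.0)

# For x**2 the mean m**2 + v is also recovered with these weights.
sq_mean, sq_var = unscented_transform_1d(0.04, 1e-4, lambda x: x * x)
```

The sigma points are chosen deterministically, so no random sampling is involved, in contrast with the particle methods discussed next.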
The rapid development of sequential Monte Carlo methods (particle filters/smoothers) offers
another way to statistically analyze stochastic volatility models. Particle methods are very
general and can be applied to any nonlinear and/or non-Gaussian model. However, in practice,
the high computational cost of option pricing makes it infeasible to use a large number of particles,
and this may make particle methods less efficient. The paper provides a practically efficient method
References
[1] Anderson, B.D.O., Moore, J.B., 1979. Optimal Filtering. New Jersey: Prentice-Hall.
[3] Carr, P., Madan, D.B., 1999. Option valuation using the fast Fourier transform. Journal of
[4] Carr, P., Wu, L., 2004. Time-changed Lévy processes and option pricing. Journal of Financial
[5] Chourdakis, K., 2005. Option pricing using the fractional FFT. Journal of Computational
Finance 8, 1-18.
[6] Cox, H., 1964. On the estimation of state variables and parameters for noisy dynamic systems.
[7] Duffie, D., Pan, J., Singleton, K., 2000. Transform analysis and asset pricing for affine jump-
[8] Engle, R., 1982. Autoregressive conditional heteroscedasticity with estimates of the variance
[9] Fraser, D., Potter, J., 1969. The optimum linear smoother as a combination of two optimum
[10] Hamilton, J.D., 1994. Time Series Analysis. New Jersey: Princeton University Press.
[11] Heston, S.L., 1993. A closed-form solution for options with stochastic volatility with applica-
[12] Hull, J., White, A., 1987. The pricing of options on assets with stochastic volatilities. Journal
[13] Julier, S.J., Uhlmann, J.K., 1997. A new extension of the Kalman filter to nonlinear systems.
[14] Julier, S.J., Uhlmann, J.K., 2004. Unscented filtering and nonlinear estimation. Proceedings
[15] Klaas, M., Briers, M., de Freitas, N., Doucet, A., Maskell, S., Lang, D., 2006. Fast particle
[16] McCausland, W.J., Miller, S., Pelletier, D., 2011. Simulation smoothing for state-space models:
A computational efficiency analysis. Computational Statistics and Data Analysis 55, 199-212.
[17] Pedersen, M.W., Thygesen, U.H., Madsen, H., 2011. Nonlinear tracking in a diffusion process
with a Bayesian filter and the finite element method. Computational Statistics and Data
[18] Rauch, H.E., Tung, F., Striebel, C.T., 1965. Maximum likelihood estimates of linear dynamic
[19] Särkkä, S., 2008. Unscented Rauch-Tung-Striebel smoother. IEEE Transactions on Automatic
[20] Shephard, N., 2005. Stochastic Volatility: Selected Readings. Oxford University Press: Oxford.
[21] Wan, E., van der Merwe, R., 2001. The unscented Kalman filter. In: Haykin, S. (ed), Kalman
Filtering and Neural Networks. John Wiley & Sons, New York.