Business Analytics and Forecasting
DS 580
Farideh Dehkordi-Vakil
Introduction
Recall that extrapolative methods of forecasting
focus on a single time series to identify past
patterns in the historical data.
These patterns are then extrapolated to map out
the likely future path of the series.
Introduction
Note that the past and present values are
already observed, whereas the future values
are unknown and are treated as random
variables.
We do not know their values but we can
describe them in terms of a set of possible
values and the associated probabilities.
Introduction
This figure shows a
time series observed
for time period 1-12
and we would like to
make a forecast for
period 13-20.
Note the increase in
uncertainty as the
forecast horizon
increases.
Introduction
It is important to know both the forecast
origin and how many periods ahead the
forecast is being made.
Extrapolation of the Mean Value
Averaging methods
If a time series is generated by a constant process
subject to random error, then the mean of the past values is
a useful statistic and can be used as a forecast for the
next period.
Averaging methods are suitable for stationary time
series data, where the series is in equilibrium around a
constant value (the underlying mean) with a constant
variance over time.
Averaging Methods
The Mean
Uses the average of all the historical data as the
forecast:
$$F_{t+1} = \frac{1}{t} \sum_{i=1}^{t} y_i$$
When new data become available, the forecast for
time t+2 is the new mean, which includes the previously
observed data plus the new observation:
$$F_{t+2} = \frac{1}{t+1} \sum_{i=1}^{t+1} y_i$$
This method is appropriate when there is no noticeable
trend or seasonality.
Averaging Methods
How would you describe this weekly
sales series?
Suppose we are at week 26 and want to
forecast sales for the next few weeks. Should
we use the average of all 26 weeks available?
Moving Average Method
The moving average for time period t is the mean
of the k most recent observations.
A moving average of order k, MA(k), is the mean
of k consecutive observations.
$$F_{t+1} = \hat{y}_{t+1} = \frac{y_t + y_{t-1} + y_{t-2} + \cdots + y_{t-k+1}}{k}$$
$$F_{t+1} = \frac{1}{k} \sum_{i=t-k+1}^{t} y_i$$
Here k is the number of terms in the moving average.
Moving Average Method
Some care should be taken in choosing the span k
for a moving average forecast model.
As a general rule, large spans smooth the time
series more than small spans because each calculation
averages over many ups and downs.
The smaller the number k, the more weight is
given to recent periods.
The greater the number k, the less weight is given
to more recent periods.
Moving Averages
A large k is desirable when there are wide,
infrequent fluctuations in the series.
A small k is most desirable when there are
sudden shifts in the level of series.
For seasonal data, the length of the season
is often used for the value of k.
Moving Average Method
For monthly data, a 12-month moving average,
MA(12), eliminates or averages out the seasonal effect.
Moving average method
Assigns equal weight to each observation used in the
calculation.
As more information becomes available, the new data point
is included in the calculation and the oldest data
point is discarded.
The moving average model does not handle trend or
seasonality very well, although it can do better than the
total mean.
Moving Averages
The following figure shows that MA(3) adapts more quickly to
movements in the series, while MA(7) produces a greater degree of
smoothing.
Example: Weekly Department Store Sales
The weekly sales
figures (in millions of
dollars) presented in
the following table are
used by a major
department store to
determine the need for
temporary sales
personnel.
Period (t)   Sales (y)
1            5.3
2            4.4
3            5.4
4            5.8
5            5.6
6            4.8
7            5.6
8            5.6
9            5.4
10           6.5
11           5.1
12           5.8
13           5.0
14           6.2
15           5.6
16           6.7
17           5.2
18           5.5
19           5.8
20           5.1
21           5.8
22           6.7
23           5.2
24           6.0
25           5.8
Example: Weekly Department Store Sales
[Figure: Weekly Sales. Sales (y) plotted against Weeks.]
Example: Weekly Department Store Sales
Use a three-week moving average (k = 3) of
the department store sales to forecast
weeks 24 and 26.
$$\hat{y}_{24} = \frac{y_{23} + y_{22} + y_{21}}{3} = \frac{5.2 + 6.7 + 5.8}{3} = 5.9$$
The forecast error is
$$e_{24} = y_{24} - \hat{y}_{24} = 6 - 5.9 = 0.1$$
Example: Weekly Department Store Sales
The forecast for week 26 is
$$\hat{y}_{26} = \frac{y_{25} + y_{24} + y_{23}}{3} = \frac{5.8 + 6 + 5.2}{3} \approx 5.7$$
Example: Weekly Department Store Sales
RMSE = 0.63
[Figure: Weekly Sales Forecasts. Sales (y) and the forecast plotted against Weeks.]
Period (t)   Sales (y)   Forecast
1            5.3
2            4.4
3            5.4
4            5.8         5.033333
5            5.6         5.2
6            4.8         5.6
7            5.6         5.4
8            5.6         5.333333
9            5.4         5.333333
10           6.5         5.533333
11           5.1         5.833333
12           5.8         5.666667
13           5.0         5.8
14           6.2         5.3
15           5.6         5.666667
16           6.7         5.6
17           5.2         6.166667
18           5.5         5.833333
19           5.8         5.8
20           5.1         5.5
21           5.8         5.466667
22           6.7         5.566667
23           5.2         5.866667
24           6.0         5.9
25           5.8         5.966667
26                       5.666667
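The MA(3) forecasts and the RMSE of 0.63 can be checked with a short Python sketch. The function names `moving_average_forecasts` and `rmse` are illustrative choices, not part of the lecture, and the RMSE is assumed to be taken over periods 4 to 25, where both an observation and a forecast exist.

```python
import math

# Weekly department store sales (millions of dollars), periods 1..25.
sales = [5.3, 4.4, 5.4, 5.8, 5.6, 4.8, 5.6, 5.6, 5.4, 6.5, 5.1, 5.8, 5.0,
         6.2, 5.6, 6.7, 5.2, 5.5, 5.8, 5.1, 5.8, 6.7, 5.2, 6.0, 5.8]


def moving_average_forecasts(y, k):
    """Return {period: forecast}; the forecast for period t+1 is the mean of
    the k most recent observations y_t, ..., y_{t-k+1} (1-based periods)."""
    forecasts = {}
    for t in range(k, len(y) + 1):          # t = last observed period
        window = y[t - k:t]                 # the k most recent values
        forecasts[t + 1] = sum(window) / k  # forecast for the next period
    return forecasts


def rmse(y, forecasts):
    """RMSE over periods that have both an observation and a forecast."""
    errors = [y[t - 1] - f for t, f in forecasts.items() if t <= len(y)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


ma3 = moving_average_forecasts(sales, k=3)
print(round(ma3[24], 2))            # forecast for week 24: 5.9
print(round(ma3[26], 2))            # forecast for week 26: 5.67
print(round(rmse(sales, ma3), 2))   # 0.63
```

Changing `k` in the call gives the forecasts for other spans, which makes it easy to compare, say, MA(3) with MA(7) on the same series.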
Exponential Smoothing Methods
This method provides an exponentially
weighted moving average of all previously
observed values.
Appropriate for data with no predictable
upward or downward trend.
The aim is to estimate the current level and
use it as a forecast of future values.
Simple Exponential Smoothing Method
Formally, the exponential smoothing equation is
$$F_{t+1} = \alpha y_t + (1 - \alpha) F_t$$
Ft+1 = forecast for the next period.
α = smoothing constant.
yt = observed value of the series in period t.
Ft = old forecast for period t.
The forecast Ft+1 is obtained by weighting the most recent
observation yt with a weight α and weighting the most
recent forecast Ft with a weight of 1 - α.
Simple Exponential Smoothing Method
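The implication of exponential smoothing
can be seen more clearly if the previous equation
is expanded by replacing Ft with its
components as follows:
$$F_{t+1} = \alpha y_t + (1 - \alpha) F_t$$
$$= \alpha y_t + (1 - \alpha)\,[\alpha y_{t-1} + (1 - \alpha) F_{t-1}]$$
$$= \alpha y_t + \alpha (1 - \alpha) y_{t-1} + (1 - \alpha)^2 F_{t-1}$$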
Simple Exponential Smoothing Method
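If this substitution process is repeated by
replacing Ft-1 by its components, Ft-2 by its
components, and so on, the result is:
$$F_{t+1} = \alpha y_t + \alpha (1 - \alpha) y_{t-1} + \alpha (1 - \alpha)^2 y_{t-2} + \alpha (1 - \alpha)^3 y_{t-3} + \cdots + \alpha (1 - \alpha)^{t-1} y_1 + (1 - \alpha)^t F_1$$
The last term carries the starting value F1 and shrinks toward zero as t grows.
Therefore, Ft+1 is a weighted moving
average of all past observations.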
Simple Exponential Smoothing Method
The following table shows the weights assigned to
past observations for α = 0.2, 0.4, 0.6, 0.8, 0.9.
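The weights follow directly from the expansion: the observation j periods back receives weight α(1 - α)^j. A minimal Python sketch that tabulates them is given here; the function name `ses_weights` and the choice of five lags are assumptions made for illustration.

```python
def ses_weights(alpha, n_lags):
    """Weights that simple exponential smoothing puts on y_{t-j},
    j = 0..n_lags-1, namely alpha * (1 - alpha) ** j."""
    return [alpha * (1 - alpha) ** j for j in range(n_lags)]


alphas = [0.2, 0.4, 0.6, 0.8, 0.9]
n_lags = 5  # number of past observations shown; arbitrary for this sketch

print("lag    " + "  ".join(f"a={a:<5}" for a in alphas))
for j in range(n_lags):
    row = "  ".join(f"{ses_weights(a, n_lags)[j]:.5f}" for a in alphas)
    print(f"{j:<6} {row}")
```

The weights decline geometrically with the lag, and the larger α is, the faster they die out.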
Simple Exponential Smoothing Method
The exponential smoothing equation
rewritten in the following form elucidates the
role of the weighting factor α.
$$F_{t+1} = F_t + \alpha (y_t - F_t)$$
The exponential smoothing forecast is the old
forecast plus an adjustment for the error that
occurred in the last forecast.
[Figure: Effect of Different Weights. Weight plotted against Lag.]
Simple Exponential Smoothing Method
Choosing the smoothing constant α in the
exponential smoothing model is similar to
choosing the span k in the moving average model.
Both are related to the smoothness of the model.
Smaller values of α correspond to greater smoothing of
the ups and downs in the time series.
Larger values of α put most of the weight on the most
recent observed values, so the forecasts tend to follow
the ups and downs of the series more closely.
Simple Exponential Smoothing Method
The value of the smoothing constant α must be
between 0 and 1.
α cannot be equal to 0 or 1.
If stable predictions with smoothed random
variation are desired, then a small value of α is
desirable.
If a rapid response to a real change in the pattern
of observations is desired, a large value of α is
appropriate.
Simple Exponential Smoothing Method
To estimate α, forecasts are computed for α
equal to 0.1, 0.2, 0.3, ..., 0.9 and the sum of
squared forecast errors is computed for each.
The value of α with the smallest RMSE is
chosen for use in producing future
forecasts.
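The grid search described here can be sketched in a few lines of Python. The function names, the demo series, and the choice to start from F1 = y1 and to skip period 1 when computing the RMSE are all assumptions made for illustration.

```python
import math


def ses_forecasts(y, alpha, f1=None):
    """One-step-ahead SES forecasts F_1..F_{n+1}; F_1 defaults to y_1."""
    f = [y[0] if f1 is None else f1]
    for obs in y:
        f.append(f[-1] + alpha * (obs - f[-1]))   # F_{t+1} = F_t + alpha (y_t - F_t)
    return f


def ses_rmse(y, alpha):
    """In-sample RMSE of the one-step forecasts, skipping period 1
    (its forecast is just the starting value)."""
    f = ses_forecasts(y, alpha)
    errors = [y[t] - f[t] for t in range(1, len(y))]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def best_alpha(y, grid=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    """Return the grid value of alpha with the smallest RMSE."""
    return min(grid, key=lambda a: ses_rmse(y, a))


# Hypothetical illustrative series (not from the lecture data).
demo = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]
alpha_star = best_alpha(demo)
print(alpha_star, round(ses_rmse(demo, alpha_star), 3))
```

Because RMSE is a monotone function of the sum of squared errors over a fixed set of periods, minimizing either criterion selects the same α.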
Simple Exponential Smoothing Method
To start the algorithm, we need F1 because
$$F_2 = \alpha y_1 + (1 - \alpha) F_1$$
Since F1 is not known, we can
Set the first estimate equal to the first observation.
Use the average of a number of initial observations:
the first three or four, up to 12, or even the mean of the whole
sample can be used.
When either the sample size or α is large, the choice of starting
value is relatively unimportant.
Example: University of Michigan Index
of Consumer Sentiment
University of Michigan Index of Consumer Sentiment for
January 1995 to December 1996.
We want to forecast the University of Michigan Index of Consumer
Sentiment using the Simple Exponential Smoothing Method.
Date      Observed
Jan-95    97.6
Feb-95    95.1
Mar-95    90.3
Apr-95    92.5
May-95    89.8
Jun-95    92.7
Jul-95    94.4
Aug-95    96.2
Sep-95    88.9
Oct-95    90.2
Nov-95    88.2
Dec-95    91.0
Jan-96    89.3
Feb-96    88.5
Mar-96    93.7
Apr-96    92.7
May-96    89.4
Jun-96    92.4
Jul-96    94.7
Aug-96    95.3
Sep-96    94.7
Oct-96    96.5
Nov-96    99.2
Dec-96    96.9
Jan-97
Example: University of Michigan Index
of Consumer Sentiment
Since no forecast is available for the first
period, we will set the first estimate equal to
the first observation.
We try α = 0.3 and α = 0.6.
[Figure: University of Michigan Index of Consumer Sentiment. Consumer Sentiment Index plotted against Date.]
Example:University of Michigan Index
of Consumer Sentiment
Note that the first forecast is
the first observed value.
The forecasts for Feb. 95 (t
= 2) and Mar. 95 (t = 3)
are evaluated as follows:
$$\hat{y}_{t+1} = \hat{y}_t + \alpha (y_t - \hat{y}_t)$$
$$\hat{y}_2 = \hat{y}_1 + 0.6 (y_1 - \hat{y}_1) = 97.6 + 0.6 (97.6 - 97.6) = 97.6$$
$$\hat{y}_3 = \hat{y}_2 + 0.6 (y_2 - \hat{y}_2) = 97.6 + 0.6 (95.1 - 97.6) = 96.1$$
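The same recursion can be run in Python for both smoothing constants. This is a minimal sketch: the function name `simple_exponential_smoothing` is an illustrative choice, and only the first four observations of the index are used to keep the example short.

```python
def simple_exponential_smoothing(y, alpha):
    """Return one-step-ahead forecasts [F_1, ..., F_{n+1}], with F_1 = y_1."""
    forecasts = [y[0]]                       # first estimate = first observation
    for obs in y:
        prev = forecasts[-1]
        forecasts.append(prev + alpha * (obs - prev))
    return forecasts


# Observed index values for Jan-95 to Apr-95.
sentiment = [97.6, 95.1, 90.3, 92.5]

for alpha in (0.3, 0.6):
    f = simple_exponential_smoothing(sentiment, alpha)
    # f[1] is the Feb-95 forecast, f[2] Mar-95, f[3] Apr-95, f[4] May-95
    print(alpha, [round(v, 2) for v in f[1:]])
```

With α = 0.6 the Mar-95 entry reproduces the hand calculation of 96.1; with α = 0.3 it is 96.85, since only 30% of the February error is passed on.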
Date      Consumer Sentiment   Alpha = 0.3   Alpha = 0.6
Jan-95    97.6                 #N/A          #N/A
Feb-95    95.1                 97.60         97.60
Mar-95    90.3                 96.85         96.10
Apr-95    92.5                 94.89         92.62
May-95    89.8                 94.17         92.55
Jun-95    92.7                 92.86         90.90
Jul-95    94.4                 92.81         91.98
Aug-95    96.2                 93.29         93.43
Sep-95    88.9                 94.16         95.09
Oct-95    90.2                 92.58         91.38
Nov-95    88.2                 91.87         90.67
Dec-95    91.0                 90.77         89.19
Jan-96    89.3                 90.84         90.28
Feb-96    88.5                 90.38         89.69
Mar-96    93.7                 89.81         88.98
Apr-96    92.7                 90.98         91.81
May-96    89.4                 91.50         92.34
Jun-96    92.4                 90.87         90.58
Jul-96    94.7                 91.33         91.67
Aug-96    95.3                 92.34         93.49
Sep-96    94.7                 93.23         94.58
Oct-96    96.5                 93.67         94.65
Nov-96    99.2                 94.52         95.76
Dec-96    96.9                 95.92         97.82
Jan-97    97.4                 96.22         97.27
Feb-97    99.7                 96.57         97.35
Mar-97    100.0                97.51         98.76
Apr-97    101.4                98.26         99.50
May-97    103.2                99.20         100.64
Jun-97    104.5                100.40        102.18
Jul-97    107.1                101.63        103.57
Aug-97    104.4                103.27        105.69
Sep-97    106.0                103.61        104.92
Oct-97    105.6                104.33        105.57
Nov-97    107.2                104.71        105.59
Dec-97    102.1                105.46        106.55
Example:University of Michigan Index
of Consumer Sentiment
RMSE = 2.66 for α = 0.6
RMSE = 2.96 for α = 0.3
[Figure: University of Michigan Index of Consumer Sentiment. Consumer Sentiment, SES (Alpha = 0.3), and SES (Alpha = 0.6) plotted against Months; vertical axis: Sentiment Index.]
Evaluating Forecasts
All quantitative forecasting models are developed on the
basis of historical data.
When error measures such as RMSE are applied to the historical data, they are
often considered measures of how well various models fit
the data (how well they work in the sample).
To determine how accurate the models are in actual
forecasting (out of sample), a hold-out period is often used for
evaluation, and a measure of forecast accuracy based on the
hold-out forecast errors (such as RMSE) is computed.
Evaluating Forecasts
To evaluate the relative performance of alternative
methods:
The data series is partitioned into two parts.
The first part, called the estimation sample or in-sample, is used
to estimate the starting value and the smoothing parameter.
This sample typically contains the first 75-80 percent of the
observations.
The second part, called the hold-out sample, validation sample,
or out-of-sample, is used to assess forecasting performance. This
sample contains the last 20-25 percent of the observations.
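The partition can be sketched in Python as follows. The function names, the 80 percent split, the demo series, and the fixed α are assumptions made for illustration; in practice α and the starting value would be chosen on the estimation sample first.

```python
import math


def split_series(y, estimation_share=0.8):
    """Split a series into an estimation sample (first part) and a
    hold-out sample (last part)."""
    cut = int(round(len(y) * estimation_share))
    return y[:cut], y[cut:]


def ses_holdout_rmse(y, alpha, estimation_share=0.8):
    """Run SES through the series but score one-step errors only on the
    hold-out part. alpha is taken as given in this sketch."""
    estimation, holdout = split_series(y, estimation_share)
    forecast = estimation[0]                      # starting value F_1 = y_1
    for obs in estimation:
        forecast += alpha * (obs - forecast)      # ends as forecast for first hold-out period
    squared_errors = []
    for obs in holdout:
        squared_errors.append((obs - forecast) ** 2)
        forecast += alpha * (obs - forecast)      # update the level; alpha stays fixed
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical series of 10 observations: 8 for estimation, 2 held out.
series = [20.0, 21.0, 19.0, 22.0, 21.0, 23.0, 22.0, 24.0, 23.0, 25.0]
estimation, holdout = split_series(series)
print(len(estimation), len(holdout))
print(round(ses_holdout_rmse(series, alpha=0.3), 3))
```

Note that the hold-out observations never influence the choice of α here; they are used only to update the level as each new value arrives, which mimics real one-step-ahead forecasting.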
General Comments
On average, SES tends to outperform MA.
SES corresponds to an intuitively appealing
underlying statistical model of the data (we shall
see this in chapter 5).
Direct use of moving-average-based procedures
is not recommended for forecasting.
Moving averages are useful in the area of seasonal
adjustment (we will see this in chapter 4).
General Comments
For evaluation or fitting purposes, we could
minimize RMSE or minimize MAPE or
MAE. They generally produce similar
results.
Out-of-sample error measures tend to be
somewhat higher than those calculated for the
estimation sample.
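For reference, a minimal Python sketch of three common error measures: RMSE, MAE, and the mean absolute percentage error (MAPE). The function names are illustrative, and the sample values are weeks 23 to 25 of the department store example with their three-week moving average forecasts.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error (requires nonzero actual values)."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Weeks 23-25 of the department store example with their MA(3) forecasts.
actual = [5.2, 6.0, 5.8]
forecast = [5.866667, 5.9, 5.966667]
print(round(rmse(actual, forecast), 3),
      round(mae(actual, forecast), 3),
      round(mape(actual, forecast), 2))
```

RMSE penalizes large errors more heavily than MAE, while MAPE is scale-free, which helps when comparing series measured in different units.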
Linear Exponential Smoothing
When a time series has a long-term trend (e.g.
increases in GDP or sales) the forecasting
method must accommodate such features. There
are two main approaches:
Convert the series to rates of change (growth rates,
either absolute or percentage) then predict the rate of
change, OR
Develop forecasting methods that account for trends
Linear Exponential Smoothing
[Figure: Linear trend fitted to Quarterly Sales. Netflix Sales plotted against Period.
Fitted line: Quarterly Sales = -6.157 + 4.567 Period; S = 5.47757, R-Sq = 93.7%, R-Sq(adj) = 93.3%.]
[Figure: Quadratic trend fitted to Quarterly Sales. Netflix Sales plotted against Period.
Fitted curve: Quarterly Sales = 6.914 - 0.0466 Period + 0.2883 Period**2; S = 1.98024, R-Sq = 99.2%, R-Sq(adj) = 99.1%.]
Holt's Exponential Smoothing
The previous two models assume a trend that never
changes in the future.
The linear exponential smoothing model projects
trends more locally.
Holt's two-parameter exponential smoothing
method is an extension of simple exponential
smoothing.
It adds a growth factor (or trend factor) to the
smoothing equation as a way of adjusting for
changes in the trend.
Holt's Exponential Smoothing
We start by defining the following
variables:
Lt = level of the series at time t.
Tt = trend (slope) of the series at time t.
The forecast function for one step ahead is:
$$F_{t+1} = L_t + T_t$$
The forecast m steps ahead is
$$F_{t+m} = L_t + m T_t$$
Holt's Exponential Smoothing
To update the level and the trend:
The new level is the old level (adjusted for the increase produced
by the trend) plus a partial adjustment (weight α) for the most
recent error, $e_t = y_t - (L_{t-1} + T_{t-1})$.
$$L_t = L_{t-1} + T_{t-1} + \alpha e_t$$
$$L_t = \alpha y_t + (1 - \alpha)(L_{t-1} + T_{t-1})$$
The new trend is the old trend plus a partial adjustment (weight αβ)
for the error.
$$T_t = T_{t-1} + \alpha \beta e_t$$
$$T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}$$
Forecast m steps into the future.
$$F_{t+m} = L_t + m T_t$$
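The error-correction and weighted-average forms of each update are equivalent. A short check, assuming the one-step error is defined as $e_t = y_t - (L_{t-1} + T_{t-1})$ and writing α and β for the level and trend weights of the weighted-average forms:

```latex
\begin{align*}
L_t &= \alpha y_t + (1-\alpha)(L_{t-1} + T_{t-1}) \\
    &= L_{t-1} + T_{t-1} + \alpha\,[\,y_t - (L_{t-1} + T_{t-1})\,] \\
    &= L_{t-1} + T_{t-1} + \alpha e_t \\[4pt]
T_t &= \beta (L_t - L_{t-1}) + (1-\beta) T_{t-1} \\
    &= \beta (T_{t-1} + \alpha e_t) + (1-\beta) T_{t-1} \\
    &= T_{t-1} + \alpha\beta\, e_t
\end{align*}
```

Under this convention the weight on the error in the trend update is the product αβ; some texts instead absorb the product into a single trend constant.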
Holt's Exponential Smoothing
The weights α and β can be selected
subjectively or by minimizing a measure of
forecast error such as RMSE.
Large weights result in more rapid changes
in the component.
Small weights result in less rapid changes.
Holt's Exponential Smoothing
The initialization process for Holt's linear
exponential smoothing requires two estimates:
One to get the first smoothed value for L1.
The other to get the trend b1 (the starting value of the trend Tt).
One alternative is to set $L_1 = y_1$ and
$$b_1 = y_2 - y_1$$
or
$$b_1 = \frac{y_4 - y_1}{3}$$
or
$$b_1 = 0$$
Example: Quarterly sales of saws for
Acme Tool Company
The following table shows the sales of
saws for the Acme Tool Company.
These are quarterly sales from 1994
through 2000.
Year   Quarter   t    Sales
1994   1         1    500
1994   2         2    350
1994   3         3    250
1994   4         4    400
1995   1         5    450
1995   2         6    350
1995   3         7    200
1995   4         8    300
1996   1         9    350
1996   2         10   200
1996   3         11   150
1996   4         12   400
1997   1         13   550
1997   2         14   350
1997   3         15   250
1997   4         16   550
1998   1         17   550
1998   2         18   400
1998   3         19   350
1998   4         20   600
1999   1         21   750
1999   2         22   500
1999   3         23   400
1999   4         24   650
2000   1         25   850
2000   2         26   600
2000   3         27   450
2000   4         28   700
Example: Quarterly sales of saws for
Acme Tool Company
Examination of the
plot shows:
The time series is non-stationary.
Seasonal variation
seems to exist.
Sales for the first and
fourth quarters are larger
than for the other quarters.
[Figure: Sales of saws for the Acme Tool Company: 1994-2000. Saws plotted against time (horizontal axis labeled Year, periods 0 to 30).]
Example: Quarterly sales of saws for
Acme Tool Company
The plot of the Acme data shows that there might
be a trend in the data; therefore we will try Holt's
model to produce forecasts.
We need two initial values:
The first smoothed value for L1.
The initial trend value b1.
We will use the first observation for the estimate
of the smoothed value L1, and the initial trend
value b1 = 0.
We will use α = 0.3 and β = 0.1.
Example: Quarterly sales of saws for
Acme Tool Company
Year   Quarter   t    Sales   Lt       bt       Ft+m
1994   1         1    500     500.00   0.00     500.00
1994   2         2    350     455.00   -4.50    500.00
1994   3         3    250     390.35   -10.52   450.50
1994   4         4    400     385.88   -9.91    379.84
1995   1         5    450     398.18   -7.69    375.97
1995   2         6    350     378.34   -8.90    390.49
1995   3         7    200     318.61   -13.99   369.44
1995   4         8    300     303.23   -14.13   304.62
1996   1         9    350     307.38   -12.30   289.11
1996   2         10   200     266.55   -15.15   295.08
1996   3         11   150     220.98   -18.19   251.40
1996   4         12   400     261.95   -12.28   202.79
1997   1         13   550     339.77   -3.27    249.67
1997   2         14   350     340.55   -2.86    336.50
1997   3         15   250     311.38   -5.49    337.69
1997   4         16   550     379.12   1.83     305.89
1998   1         17   550     431.67   6.90     380.95
1998   2         18   400     427.00   5.74     438.57
1998   3         19   350     407.92   3.26     432.74
1998   4         20   600     467.83   8.93     411.18
1999   1         21   750     558.73   17.12    476.75
1999   2         22   500     553.10   14.85    575.85
1999   3         23   400     517.56   9.81     567.94
1999   4         24   650     564.16   13.49    527.37
2000   1         25   850     659.35   21.66    577.65
2000   2         26   600     656.71   19.23    681.01
2000   3         27   450     608.16   12.45    675.94
2000   4         28   700     644.43   14.83    620.61
Example: Quarterly sales of saws for
Acme Tool Company
RMSE for this application
is:
α = 0.3 and β = 0.1
RMSE = 155.5
The plot also showed the
possibility of seasonal
variation that needs to be
investigated.
[Figure: Quarterly Saw Sales Forecast, Holt's Method. Sales and Ht+m plotted against Quarters.]
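The Acme calculation can be reproduced with a short Python sketch of Holt's method. The function name `holt` and its return layout are illustrative choices; the sketch assumes L1 = y1, b1 = 0, the weighted-average form of the updates, and an RMSE taken over periods 2 to 28, where a genuine one-step forecast exists.

```python
import math

# Quarterly saw sales, Acme Tool Company, 1994 Q1 to 2000 Q4 (t = 1..28).
sales = [500, 350, 250, 400, 450, 350, 200, 300, 350, 200, 150, 400, 550, 350,
         250, 550, 550, 400, 350, 600, 750, 500, 400, 650, 850, 600, 450, 700]


def holt(y, alpha, beta, level0=None, trend0=0.0):
    """Holt's linear exponential smoothing.
    Returns (levels, trends, forecasts); forecasts[i] is the one-step
    forecast of y[i] made one period earlier (forecasts[0] is the start level)."""
    level = y[0] if level0 is None else level0     # L_1 = y_1
    trend = trend0                                 # b_1 = 0 by default
    levels, trends, forecasts = [level], [trend], [level]
    for obs in y[1:]:
        forecast = level + trend                   # F_t = L_{t-1} + b_{t-1}
        new_level = alpha * obs + (1 - alpha) * forecast
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
        levels.append(level)
        trends.append(trend)
        forecasts.append(forecast)
    return levels, trends, forecasts


levels, trends, forecasts = holt(sales, alpha=0.3, beta=0.1)
errors = [y - f for y, f in zip(sales[1:], forecasts[1:])]   # periods 2..28
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(round(levels[1], 2), round(trends[1], 2), round(forecasts[2], 2))
print(round(rmse, 1))
```

The forecast for the first quarter of 2001, one step beyond the data, would be `levels[-1] + trends[-1]`, and an m-step forecast would add `m * trends[-1]` to the last level.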
Exponential smoothing with a damped Trend
One common feature of time series for
sales is a decline in sales as a product line
matures, unless the product is upgraded in
some way.
A procedure that damps down the trend
component as the forecast horizon is
extended assumes that the series will level
out over time.
Exponential smoothing with a damped Trend
This kind of life-cycle effect can be
accommodated by introducing a damping
factor φ into the updating equations for level
and trend.
$$L_t = L_{t-1} + \phi T_{t-1} + \alpha e_t$$
$$T_t = \phi T_{t-1} + \alpha \beta e_t$$
The damping factor 0 < φ < 1 will dampen the
trend term.
Exponential smoothing with a damped Trend
The forecast function for m steps ahead is
$$F_{t+m} = L_t + (\phi + \phi^2 + \cdots + \phi^m) T_t$$
This forecast levels out over time,
approaching the limiting value
$$L_t + \frac{\phi}{1 - \phi} T_t$$
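The leveling-out can be seen numerically with a minimal Python sketch. The function names and the values of the level, trend, and damping factor are hypothetical and chosen only for illustration.

```python
def damped_forecast(level, trend, phi, m):
    """m-step-ahead forecast with a damped trend:
    L_t + (phi + phi**2 + ... + phi**m) * T_t."""
    damp_sum = sum(phi ** j for j in range(1, m + 1))
    return level + damp_sum * trend


def damped_limit(level, trend, phi):
    """Limiting forecast as m grows: L_t + phi / (1 - phi) * T_t, for 0 < phi < 1."""
    return level + phi / (1 - phi) * trend


# Hypothetical values for illustration only.
level, trend, phi = 100.0, 10.0, 0.8
for m in (1, 5, 20, 100):
    print(m, round(damped_forecast(level, trend, phi, m), 3))
print("limit", round(damped_limit(level, trend, phi), 3))
```

With φ close to 1 the forecasts behave almost like Holt's undamped line over short horizons, while smaller φ flattens the forecast path sooner.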
Use of Transformations
Use of linear exponential smoothing (LES) methods requires that the series be locally linear.
In many cases this assumption is not realistic, and the forecasts either
underestimate or overestimate the actual values. This becomes more
serious as the forecasting horizon increases.
Use of Transformations
A series with a more complex nonlinear
pattern can be forecast in two ways:
Transform the series so that the trend becomes
linear
Convert the series to growth over time, forecast
growth rate, and then convert back to the
original series.
The Log Transform
When the log transform produces a linear trend, we can apply
LES and then transform back to the original series to
obtain the forecast of interest.
Typically the effect of the log transform is to
improve forecasting performance for exponential growth
curves. For example, for a series that grows 5% per period:
$$Y_t = 1.05\, Y_{t-1}$$
Log transform:
$$\ln Y_t = \ln 1.05 + \ln Y_{t-1}$$
The reverse transformation:
$$\exp(\ln Y_t) = \exp(\ln 1.05 + \ln Y_{t-1}) = 1.05\, Y_{t-1}$$
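A small Python sketch shows why the log scale helps for a series with constant percentage growth. The series is hypothetical, and the one-step extrapolation of the last log difference is a simple stand-in for the LES step, not the LES method itself.

```python
import math

# Hypothetical series growing 5% per period: Y_t = 1.05 * Y_{t-1}, starting at 100.
y = [100.0 * 1.05 ** t for t in range(6)]
log_y = [math.log(v) for v in y]

# On the log scale the period-to-period change is constant (= ln 1.05),
# so the trend is linear.
log_steps = [b - a for a, b in zip(log_y, log_y[1:])]

# Extrapolate the linear trend on the log scale one step, then transform back.
log_forecast = log_y[-1] + log_steps[-1]
forecast = math.exp(log_forecast)

print([round(s, 5) for s in log_steps])
print(round(forecast, 4), round(1.05 * y[-1], 4))
```

The back-transformed forecast agrees with 1.05 times the last observation, which is the reverse transformation written out numerically.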
Use of Growth Rate
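Define the growth rate
$$G_t = 100 \times \frac{Y_t - Y_{t-1}}{Y_{t-1}}$$
Use SES to predict the growth rate for the
next period.
The one-step forecast for the original series
is given by
$$F_{t+1} = Y_t \left(1 + \frac{G_{t+1}}{100}\right)$$
where Gt+1 here denotes the predicted growth rate for period t+1.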
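The growth-rate approach can be sketched in Python as follows. The function names, the demo series, and α = 0.5 are assumptions made for illustration; the SES start value is taken to be the first observed growth rate.

```python
def growth_rates(y):
    """Percentage growth G_t = 100 * (Y_t - Y_{t-1}) / Y_{t-1} for t = 2..n."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(y, y[1:])]


def ses_next(values, alpha):
    """SES forecast of the next value, starting from the first value."""
    forecast = values[0]
    for v in values:
        forecast += alpha * (v - forecast)
    return forecast


def growth_rate_forecast(y, alpha):
    """One-step forecast F_{t+1} = Y_t * (1 + G_hat_{t+1} / 100),
    where G_hat_{t+1} is the SES forecast of the growth rate."""
    g_hat = ses_next(growth_rates(y), alpha)
    return y[-1] * (1 + g_hat / 100.0)


# Hypothetical quarterly series; alpha = 0.5 is an arbitrary illustrative choice.
y = [100.0, 110.0, 126.5, 139.15]
print([round(g, 1) for g in growth_rates(y)])
print(round(growth_rate_forecast(y, alpha=0.5), 2))
```

For a series with constant percentage growth, every growth rate is identical, so the SES forecast returns that rate and the sales forecast is simply the last value scaled up by it.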
Growth Rate Analysis of Netflix Quarterly Sales
Year   Quarter   Quarterly Sales   Growth (percentage)   Growth forecast   Sales forecast
2000   1         5.17*
2000   2         7.15              38.1                  38.1
2000   3         10.18             42.5                  38.1              9.9
2000   4         13.39             31.5                  41.9              14.1
2001   1         17.06             27.4                  33.0              19.0
2001   2         18.36             7.6                   28.2              22.7
2001   3         18.88             2.8                   10.5              23.5
2001   4         21.62             14.5                  3.9               20.9
2002   1         30.53             41.2                  13.0              22.5
2002   2         36.36             19.1                  37.2              34.5
2002   3         40.73             12.0                  21.7              49.9
2002   4         45.19             10.9                  13.4              49.6
2003   1         55.67             23.2                  11.3              51.2
2003   2         63.19             13.5                  21.5              62.0
2003   3         72.20             14.3                  14.6              76.8
2003   4         81.19             12.4                  14.3              82.8
The Box-Cox Transformations
The logarithmic transformation is appealing because it
reflects proportional rather than absolute change.
But proportional change may project future
growth in excess of reasonable expectations.
A modified LES that allows for a damped trend was
introduced earlier.
This modification can be applied after the log
transform when appropriate.
The Box-Cox Transformations
A second possibility is to select a transformation
that is more moderate than the logarithmic one.
Box and Cox suggested using a power
transformation.
$$Z_t = Y_t^{c}, \qquad -1 \le c \le 1$$
When to Transform
Do not use complex transforms unless they
are supported by both theory and data.
Always compare the transformed method with
a benchmark by transforming the forecasts back to
the scale of the data series of interest.