Week 11 - Deep Temporal Models

The document discusses Deep Temporal Models and their applications in time series analysis, highlighting the importance of forecasting, anomaly detection, and classification. It covers various modeling techniques, including ARIMA models, and emphasizes the need to address trends, seasonality, and heteroskedasticity in time series data. The document also presents examples of time series data and outlines the objectives and properties of time series models.


Deep Temporal Models

Sourangshu Bhattacharya
Department of Computer Science and Engg.
IIT Kharagpur
https://cse.iitkgp.ac.in/~sourangshu/
Time Series Analysis
Time Series Data is Ubiquitous

● A wide range of time series data
o AIOps
o IoT
o Business data, e.g., sales volume, stock price
o Many others

[Figure: example series including stocks, sales, goods consumption, sensor readings, power demand, cloud service monitoring, DNA sequence, motion detection, ECG]
Typical Applications of Time Series

● Time Series Forecasting
● Time Series Anomaly Detection
● Time Series Search/Query
● Time Series Classification/Clustering

Forecasting Use Case: AutoScaling
● Autoscaling in cloud computing is an effective method to improve the usage of
computing resources
o It automatically allocates resources for cloud-based applications while
maintaining SLA (service level agreement)
o Horizontal scaling (add/delete instances or VMs) vs vertical scaling
(up/downgrade CPU, RAM, network, etc.)
o Time series forecasting and decision-making on resources
Introduction

Plotting a time series is an important early step in its analysis. In general, a plot can reveal:
Trend: upward or downward pattern that might be extrapolated into the future
Periodicity: repetition of behavior in a regular pattern
Seasonality: periodic behavior with a known period (hourly, monthly, every 2 months, ...)
Heteroskedasticity: changing variance
Dependence: positive (successive observations are similar) or negative (successive observations are dissimilar)
Missing data, outliers, breaks, ...
Example Time Series
Example 1: Global Warming
The data are the global mean land-ocean temperature index from 1880 to 2009.
We note an apparent upward trend in the series during the latter part of the 20th century that has been used as an argument for the global warming hypothesis (whether the overall trend is natural or whether it is caused by some human-induced interference).
Example Time Series
Example 4: Airline passengers from 1949-1961
Trend? Seasonality? Heteroskedasticity? ...
Upward trend, seasonality on a 12-month interval, increasing variability

[Figure: monthly totals of international airline passengers, 1949-1961]
Example Time Series
Example 5: Monthly Employed persons from 1980-1991
Trend? Seasonality? Heteroskedasticity? ...
Upward trend, seasonality with a structural break
Example Time Series
Example 7: Annual number of Canadian Lynx trapped near the McKenzie River
Trend? Seasonality? Heteroskedasticity? Breaks? ...
No trend; no clear seasonality, since the cycles do not correspond to a known period, but there is periodicity
Objectives of Time Series Analysis
What do we hope to achieve with time series analysis?

• Provide a model of the data (testing of scientific hypotheses, etc.)

• Predict future values (a very common goal of analysis)

• Produce a compact description of the data (a good model can be used for "data compression")
Modeling Time Series
We take the approach that the data are a realization of random variables.

However, many statistical tools are based on assuming the random variables are IID.

In time series:
R.V.s are usually not independent (affected by trend and seasonality)
Variance may change significantly
R.V.s are usually not identically distributed

The first goal in time series modeling is to reduce the analysis to a simpler case: eliminate trend, seasonality, and heteroskedasticity, then model the remainder as dependent but identically distributed.
Probabilistic Model: Stochastic Process

A complete probabilistic model/description of a time series Xt, observed as a collection of n random variables at times t1, t2, . . . , tn for any positive integer n, is provided by the joint probability distribution

F(c1, c2, ..., cn) = P(Xt1 ≤ c1, ..., Xtn ≤ cn)

This is generally difficult to write down, unless the variables are jointly normal.
Thus, we look for other statistical tools => quantifying dependencies
Properties of Time Series Model

A time series model is a discrete-time stochastic process.

A time series model for the observed data xt specifies:

The mean function µX(t) = E(Xt)

The covariance function
γX(r, s) = E((Xr − µX(r))(Xs − µX(s))) for all integers r and s

The focus will be to determine the mean function and the covariance function to define the time series model.
Some Zero-Mean Models
iid Noise
The simplest model for a time series: no trend or seasonal component, and the observations are IID with zero mean.
We can write, for any integer n and real numbers x1, x2, ..., xn,
P(X1 ≤ x1, ..., Xn ≤ xn) = P(X1 ≤ x1)...P(Xn ≤ xn)
It plays an important role as a building block for more complicated time series models.

[Figure: a simulated white noise series]
Some Zero-Mean Models
Random Walk
The random walk {St}, t = 0, 1, 2, ..., is obtained by cumulatively summing iid random variables, with S0 = 0:
St = X1 + X2 + · · · + Xt,  t = 1, 2, ...
where Xt is iid noise. It plays an important role as a building block for more complicated time series models.

[Figure: a simulated random walk]
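As a quick illustration (a sketch, not from the slides), iid Gaussian noise and the corresponding random walk can be simulated with NumPy:

import numpy as np

rng = np.random.default_rng(0)

n = 500
x = rng.normal(loc=0.0, scale=1.0, size=n)   # iid noise X_1, ..., X_n
s = np.cumsum(x)                             # random walk S_t = X_1 + ... + X_t

print(x[:5])
print(s[:5])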
Models with Trend

[Figure: population of the U.S.A. (millions), 1800-1950]

In this case a zero-mean model for the data is clearly inappropriate. The graph suggests trying a model of the form:

Xt = mt + Yt

where mt is a function known as the trend component and Yt has zero mean. Estimating mt?
Models with Seasonality

In this case a zero-mean model for the data is clearly inappropriate. The graph suggests trying a model of the form:

Xt = St + Yt

where St is a function known as the seasonal component and Yt has zero mean. Estimating St?
Time series Modeling

Plot the series => examine the main characteristics (trend, seasonality, ...)

Remove the trend and seasonal components to get stationary residuals/models

Choose a model to fit the residuals using sample statistics (sample autocorrelation function)

Forecasting will be given by forecasting the residuals to arrive at forecasts of the original series Xt
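A minimal Python sketch of this workflow (illustrative only, assuming statsmodels is available and a monthly period of 12); the synthetic series below is just a stand-in for real data:

import numpy as np
from statsmodels.tsa.seasonal import seasonal_decompose

# Toy monthly series with a linear trend and yearly seasonality (stand-in for real data)
rng = np.random.default_rng(1)
t = np.arange(144)
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=2, size=t.size)

# Classical additive decomposition: y_t = trend_t + seasonal_t + residual_t
result = seasonal_decompose(y, model="additive", period=12)
residual = result.resid   # detrended, deseasonalized remainder (NaN at the edges)

# The residual is what we would then model as a (dependent) stationary series
print(np.nanstd(residual))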
Stationarity and Autocorrelation function
Definitions
Xt is strictly stationary if {X1, . . . , Xn} and {X1+h, . . . , Xn+h} have the same joint distributions for all integers h and n > 0.
Xt is weakly stationary if
µX(t) is independent of t, and
γX(t + h, t) is independent of t for each h.

Let Xt be a stationary time series. The autocovariance function (ACVF) of Xt at lag h is
γX(h) = Cov(Xt+h, Xt)
The autocorrelation function (ACF) of Xt at lag h is
ρX(h) = γX(h) / γX(0)
The Sample Autocorrelation function
In practical problems, we do not start with a model, but with observed data (x1, x2, . . . , xn). To assess the degree of dependence in the data and to select a model for the data, one of the important tools we use is the sample autocorrelation function (sample ACF).

Definition
Let x1, x2, . . . , xn be observations of a time series. The sample mean of x1, x2, . . . , xn is
x̄ = (1/n) Σ_{t=1}^{n} x_t
The sample autocovariance function at lag h is
γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄), for −n < h < n,
and the sample autocorrelation function is
ρ̂(h) = γ̂(h) / γ̂(0)
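As an illustration (a sketch, not part of the slides), the sample ACF can be computed directly from this definition with NumPy:

import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho_hat(h) for h = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    # gamma_hat(h) = (1/n) * sum_{t=1}^{n-h} (x_{t+h} - xbar)(x_t - xbar)
    gamma = np.array([np.sum(d[h:] * d[:n - h]) / n for h in range(max_lag + 1)])
    return gamma / gamma[0]

rng = np.random.default_rng(0)
white = rng.normal(size=500)
print(sample_acf(white, 5))   # near zero for h >= 1, as expected for iid noise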
The Sample Autocorrelation function
Remarks
The sample autocorrelation function (ACF) can be computed for any data set and is not restricted to observations from a stationary time series.

For data containing a trend, |ρ̂(h)| will display slow decay as h increases.

For data containing a substantial deterministic periodic component, |ρ̂(h)| will exhibit similar behavior with the same periodicity.
The Sample Autocorrelation function
Remarks
We may recognize the sample autocorrelation function of many time series:

• White Noise => Zero
• Trend => Slow decay
• Periodic => Periodic
• Moving Average (q) => Zero for |h| > q
• AutoRegression (p) => Decays to zero exponentially


The Airlines Dataset

[Figure]

The Sample Autocorrelation function

[Figure]
Time Series Forecasting
Forecasting: Background
● Different forecasting types
○ Short-term forecasting: predict the near future
○ Long-term forecasting: predict the future over an extended period
○ Extreme value forecasting: predict the extreme values
○ Point or probabilistic forecasting: predict a point value or an interval/probability distribution
● Challenges:
○ Accuracy, robustness
● Models:
○ Traditional: Statistical (ARIMA, ETS, Prophet)
○ Ensemble: Tree, MLP
○ Deep Models: CNN, RNN, Transformers

[Figure: examples of short-term, long-term, extreme value, and probabilistic forecasting]
ARIMA Models: General framework
An ARIMA model is a numerical expression indicating how the observations of a target variable are statistically correlated with past observations of the same variable.

▪ ARIMA models are, in theory, the most general class of models for forecasting a time series which can be "stationarized" by transformations such as differencing and lagging

▪ The easiest way to think of ARIMA models is as fine-tuned versions of random-walk models: the fine-tuning consists of adding lags of the differenced series and/or lags of the forecast errors to the prediction equation, as needed to remove any remaining autocorrelation from the forecast errors

An ARIMA model, in its most complete formulation, considers:

▪ An Autoregressive (AR) component, seasonal and non-seasonal
▪ A Moving Average (MA) component, seasonal and non-seasonal
▪ The order of Integration (I) of the series

That's why we call it ARIMA (Autoregressive Integrated Moving Average)

ARIMA Models: General framework
The most common notation used for ARIMA models is:

ARIMA(p, d, q)

where:
▪ p is the number of autoregressive terms
▪ d is the number of non-seasonal differences
▪ q is the number of lagged forecast errors in the equation

In the next slides we will explain each component of ARIMA models!
ARIMA Models: Autoregressive part (AR)
In a multiple regression model, we predict the target variable Y using a linear combination of independent variables (predictors).

In an autoregression model, we forecast the variable of interest using a linear combination of past values of the variable itself.

The term autoregression indicates that it is a regression of the variable against itself.
▪ An autoregressive model of order p, denoted AR(p), can be written as

yt = c + ϕ1 yt−1 + ϕ2 yt−2 + ⋯ + ϕp yt−p + εt

where:
▪ yt = dependent variable
▪ yt−1, yt−2, ..., yt−p = independent variables (i.e. lagged values of yt used as predictors)
▪ ϕ1, ϕ2, ..., ϕp = regression coefficients
▪ εt = error term (must be white noise)
ARIMA Models: Autoregressive part (AR)
Autoregressive simulated process examples:

[Figure: AR(1) process example (ϕ1 = 0.5); AR(2) process example (ϕ1 = 0.5, ϕ2 = 0.2)]

Consider that, in the case of an AR(1) model:

▪ When ϕ1 = 0, yt is white noise
▪ When ϕ1 = 1 and c = 0, yt is a random walk
▪ In order to have a stationary series, the following condition must hold: −1 < ϕ1 < 1
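A small NumPy sketch (illustrative, not from the slides) for simulating AR(p) processes such as the AR(1) and AR(2) examples above:

import numpy as np

def simulate_ar(phi, n=300, c=0.0, sigma=1.0, seed=0):
    """Simulate y_t = c + phi_1 y_{t-1} + ... + phi_p y_{t-p} + eps_t."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    p = phi.size
    y = np.zeros(n + p)                       # p zeros used as start-up values
    eps = rng.normal(scale=sigma, size=n + p)
    for t in range(p, n + p):
        y[t] = c + phi @ y[t - p:t][::-1] + eps[t]
    return y[p:]

ar1 = simulate_ar([0.5])          # AR(1) with phi_1 = 0.5
ar2 = simulate_ar([0.5, 0.2])     # AR(2) with phi_1 = 0.5, phi_2 = 0.2
print(ar1[:5], ar2[:5])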
ARIMA Models: Moving Average part (MA)
Rather than using past values of the forecast variable in a regression, a Moving Average model uses past forecast errors in a regression-like model.

In general, a moving average process of order q, MA(q), is defined as:

yt = c + εt + θ1 εt−1 + θ2 εt−2 + ⋯ + θq εt−q

The lagged values of εt are not actually observed, so this is not a standard regression.

Moving average models should not be confused with moving average smoothing (the process used in classical decomposition in order to obtain the trend component).

A moving average model is used for forecasting future values, while moving average smoothing is used for estimating the trend-cycle of past values.
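Analogously, an MA(q) process can be simulated by filtering white noise (a NumPy sketch, for illustration only):

import numpy as np

def simulate_ma(theta, n=300, c=0.0, sigma=1.0, seed=0):
    """Simulate y_t = c + eps_t + theta_1 eps_{t-1} + ... + theta_q eps_{t-q}."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    q = theta.size
    eps = rng.normal(scale=sigma, size=n + q)
    y = np.array([c + eps[t] + theta @ eps[t - q:t][::-1] for t in range(q, n + q)])
    return y

ma1 = simulate_ma([0.7])         # MA(1) with theta_1 = 0.7
ma2 = simulate_ma([0.8, 0.5])    # MA(2) with theta_1 = 0.8, theta_2 = 0.5
print(ma1[:5], ma2[:5])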
ARIMA Models: Moving Average part (MA)
Moving Average simulated process examples:

[Figure: MA(1) process example (θ1 = 0.7); MA(2) process example (θ1 = 0.8, θ2 = 0.5)]

▪ Looking just at the time plot, it is hard to distinguish between an AR process and an MA process!
ARIMA Models: ARMA and ARIMA
If we combine autoregression and a moving average model, we obtain an ARMA(p,q) model:

yt = c + ϕ1 yt−1 + ϕ2 yt−2 + ⋯ + ϕp yt−p + θ1 εt−1 + θ2 εt−2 + ⋯ + θq εt−q + εt

(autoregressive component of order p; moving average component of order q)

To use an ARMA model, the series must be STATIONARY!

▪ If the series is NOT stationary, before estimating an ARMA model we need to apply one or more differences in order to make the series stationary: this is the integration process, called I(d), where d = number of differences needed to get stationarity

▪ If we model the integrated series using an ARMA model, we get an ARIMA(p,d,q) model, where p = order of the autoregressive part; d = order of integration; q = order of the moving average part
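As an illustration, an ARMA/ARIMA model can be fit by maximum likelihood with statsmodels (assumed installed); the simulated ARMA(2,1) series below is only a stand-in for real data:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulate an ARMA(2,1) series just for illustration
rng = np.random.default_rng(0)
n, phi, theta = 500, np.array([0.5, 0.4]), 0.8
eps = rng.normal(size=n + 2)
y = np.zeros(n + 2)
for t in range(2, n + 2):
    y[t] = phi[0] * y[t - 1] + phi[1] * y[t - 2] + eps[t] + theta * eps[t - 1]
y = y[2:]

# Fit ARIMA(p,d,q); d = 0 because the simulated series is already stationary
result = ARIMA(y, order=(2, 0, 1)).fit()
print(result.summary())
print(result.forecast(steps=10))   # point forecasts for the next 10 steps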
ARIMA Models: ARMA and ARIMA

ARIMA simulated process examples

[Figure: ARMA(2,1) process example, equal to ARIMA(2,0,1), with ϕ1 = 0.5, ϕ2 = 0.4, θ1 = 0.8; ARIMA(2,1,1) process example with ϕ1 = 0.5, ϕ2 = 0.4, θ1 = 0.8]
ARIMA Models: Model identification
General rules for model identification based on ACF and PACF plots:

The data may follow an ARIMA(p, d, 0) model if the ACF and PACF plots of the differenced data show the following patterns:
▪ the ACF is exponentially decaying or sinusoidal
▪ there is a significant spike at lag p in the PACF, but none beyond lag p

The data may follow an ARIMA(0, d, q) model if the ACF and PACF plots of the differenced data show the following patterns:
▪ the PACF is exponentially decaying or sinusoidal
▪ there is a significant spike at lag q in the ACF, but none beyond lag q

For a general ARIMA(p, d, q) model (with both p and q > 0), both the ACF and PACF plots show exponential or sinusoidal decay, and it is more difficult to read off the structure of the model.
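In practice, the ACF and PACF of the (differenced) series can be plotted with statsmodels and matplotlib (a sketch; the placeholder series below should be replaced by your own differenced data):

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(0)
y = rng.normal(size=300)           # placeholder; substitute the differenced series

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y, lags=30, ax=axes[0])   # cut-off suggests the MA order q; decay suggests AR
plot_pacf(y, lags=30, ax=axes[1])  # cut-off suggests the AR order p; decay suggests MA
plt.tight_layout()
plt.show()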
ARIMA Models: Seasonal ARIMA
A seasonal ARIMA model is formed by including additional seasonal terms in the ARIMA models we have seen so far:

ARIMA(p, d, q)(P, D, Q)s

where s = number of periods per season (i.e. the frequency of the seasonal cycle).
We use uppercase notation for the seasonal parts of the model, and lowercase notation for the non-seasonal parts.

As usual, d / D are the numbers of differences / seasonal differences necessary to make the series stationary.
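An illustrative sketch using statsmodels' SARIMAX class (assumed available); the toy monthly series merely stands in for real seasonal data:

import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Toy monthly series with trend + yearly seasonality (stand-in for real data)
rng = np.random.default_rng(0)
t = np.arange(120)
y = 0.3 * t + 5 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=1.0, size=t.size)

# ARIMA(1,1,1)(1,1,1)_12: non-seasonal and seasonal AR/I/MA terms, period s = 12
model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
result = model.fit(disp=False)
print(result.aic)
print(result.forecast(steps=12))   # forecast one full seasonal cycle ahead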
ARIMA Models: estimation and AIC
Parameter estimation
In order to estimate an ARIMA model, Maximum Likelihood Estimation (MLE) is normally used.

This technique finds the values of the parameters which maximize the probability of obtaining the data that we have observed. For given values of (p, d, q)(P, D, Q) (i.e. the model order), the algorithm will try to maximize the log likelihood when finding the parameter estimates.

ARIMA model order

A commonly used criterion to compare different ARIMA models (i.e. with different values of (p, q)(P, Q) but fixed d, D) and to determine the optimal ARIMA order is the Akaike Information Criterion (AIC):

AIC = −2 log(Likelihood) + 2k

▪ where k is the number of estimated parameters in the model
▪ AIC is a goodness-of-fit measure penalized for model complexity
▪ The best ARIMA model is the one with the lowest AIC; most automatic model selection methods (e.g. auto.arima in R) use the AIC to determine the optimal ARIMA model order
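A simple AIC-based order search might look like the following sketch (illustrative; tools such as auto.arima automate this more carefully, e.g. with unit-root tests for d):

import itertools
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=300))      # toy non-stationary series (random walk)

best = None
for p, q in itertools.product(range(3), range(3)):   # small grid over (p, q), d fixed at 1
    try:
        res = ARIMA(y, order=(p, 1, q)).fit()
        if best is None or res.aic < best[0]:
            best = (res.aic, (p, 1, q))
    except Exception:
        continue                                      # skip orders that fail to converge

print("Best order by AIC:", best[1], "AIC =", best[0])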
ARIMA Models: Hands on
ARIMA Models: Hands on
Deep Time Series Forecasting
Deep Learning: Models

● MLP (multilayer perceptron)
○ Fully connected feedforward artificial neural network
● CNN (convolutional neural network)
○ Shared-weight architecture of filters that slide along input features
● RNN (recurrent neural network)
○ Connections between nodes form a directed or undirected graph along a temporal sequence
● Transformer
○ Deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input sequence
Forecasting: Deep Ensemble (MLP based Models)
● N-BEATS
○ Doubly residual stacking with forward and backward residual links
○ Forecasts are aggregated in a hierarchical way
○ Trend and seasonal models for interpretability
○ Ensemble: e.g., 18 to 180 models
○ Fit on different metrics: sMAPE, MASE, MAPE
○ Train on input windows of different lengths
○ Train with different random initializations

Oreshkin, B. N., Carpov, D., Chapados, N., & Bengio, Y. “N-BEATS: Neural basis expansion analysis for interpretable time series forecasting.” ICLR 2020.
Forecasting: RNN based Models

● Recurrent neural networks
○ From RNN to LSTM/GRU: control information flow by gates, mitigating the vanishing gradient problem
○ DeepAR: time series probabilistic forecasting through an autoregressive recurrent network

[Figure: Vanilla RNN, LSTM, GRU, and DeepAR architectures]

Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation, 1997.
Chung, Junyoung, et al. "Empirical evaluation of gated recurrent neural networks on sequence modeling." arXiv preprint arXiv:1412.3555, 2014.
Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 2020.
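As an illustrative sketch (assuming PyTorch is installed; this is a minimal one-step-ahead LSTM forecaster, not the DeepAR model):

import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """Encode a window of past values with an LSTM and predict the next value."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                  # x: (batch, window_len, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # use the last hidden state

# Toy training loop on a sine wave split into sliding windows
t = torch.arange(0, 100, 0.1)
series = torch.sin(t)
window = 24
X = torch.stack([series[i:i + window] for i in range(len(series) - window)]).unsqueeze(-1)
y = series[window:].unsqueeze(-1)

model = LSTMForecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()
    opt.step()
print(float(loss))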
Forecasting with Transformer: Informer

● Informer
○ ProbSparse self-attention for an efficient and robust attention mechanism
○ Self-attention distilling: extract the dominating attention and reduce the network size
○ Generative-style decoder: produce long sequence forecasts with only one forward step, avoiding cumulative error spreading during inference

Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H. and Zhang, W., "Informer: Beyond efficient transformer for long sequence time-series forecasting." AAAI, 2021.
Long Sequence Time-series Forecasting

[Figure/table slides: problem setup and experimental results of Informer on long sequence time-series forecasting]

Transformer: Attention Computation

[Figure slide: the attention computation in Informer]

Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H. and Zhang, W., "Informer: Beyond efficient transformer for long sequence time-series forecasting." AAAI, 2021.
Forecasting with Transformer: Latest Methods
● Autoformer: Transformer with auto-correlation mechanism
○ Decomposition architecture to disentangle complex temporal patterns (seasonality, trend)
○ Auto-correlation instead of point-wise self-attention to utilize period-based dependencies and reduce complexity

● FEDformer: frequency enhanced decomposed Transformer
○ Efficient and robust frequency domain processing: to capture important structures in time series
○ Frequency enhanced block: substitutes self-attention
○ Frequency enhanced attention: substitutes cross-attention
○ Mixture-of-experts seasonal-trend decomposition: to better capture global properties in time series

Wu, H., Xu, J., Wang, J. and Long, M., “Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting”, NeurIPS 2021.

Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, Rong Jin, "FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting," ICML 2022.
Forecasting with Transformer: Results

[Table: empirical comparison of FEDformer on six benchmark datasets]

[Figure: linear complexity of FEDformer]


Temporal Point Process
Many discrete events in continuous time

Events are (noisy) observations of a variety of complex dynamic processes…

[Figure: examples include disease dynamics, online actions, financial trading, mobility dynamics (image credit: Qmee, 2013)]


Example I: Information propagation

[Figure: an information cascade on a social network; users (Bob, Christine, Beth, Joe, David) repost at successive times (3.00pm, 3.25pm, 3.27pm, 4.15pm); "D follows S" denotes the follower relationship. Friggeri et al., 2014]

They can have an impact in the off-line world
Example II: Knowledge creation

[Figure: timelines of knowledge-creation events, e.g. additions and refutations, questions, answers, and upvotes]
Temporal point processes

Temporal point process: a random process whose realization consists of discrete events localized in time

[Figure: discrete events on a timeline up to t = T; the history of past events; each event drawn as a Dirac delta function]
Formally:
Model the time of the next event as a random variable, characterized (given the history) by:
- the conditional density f*(t): probability of an event in [t, t+dt)
- the conditional survival function S*(t): probability of no event before t

Likelihood of a timeline:
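For concreteness, these quantities and the likelihood can be written as follows (standard temporal point process definitions, with H(t) the history of events before t and t1 < ... < tn the events observed in [0, T)):

f*(t) = f(t | H(t))                         (conditional density of the next event time)
S*(t) = P(next event occurs after t | H(t)) = 1 − F*(t)   (conditional survival function)

L(t1, ..., tn) = [ ∏_{i=1}^{n} f*(ti) ] · S*(T)

i.e. each observed event contributes its conditional density, and the final survival factor accounts for observing no further event up to T.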
Problems of density parametrization (I)

[Figure: a timeline of events up to t = T]

It is difficult for model design and interpretability:

1. Densities need to integrate to 1 (i.e., partition function)
2. Difficult to combine timelines
Intensity function

[Figure: density f*(t) = Prob. of an event in [t, t+dt); survival = Prob. of no event before t; history; timeline up to t = T]

Intensity λ*(t):
Probability of an event in [t, t+dt), given no event before t (and given the history)

Observation: it is a rate = # of events / unit of time

Advantages of intensity parametrization (I)

[Figure: a timeline of events up to t = T]

Suitable for model design and interpretable:

1. Intensities only need to be nonnegative
2. Easy to combine timelines
Relation between f*, F*, S*, λ*

The intensity λ* is the central quantity we will use!
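For reference, the standard relations among these quantities (with tn the last event before t):

λ*(t) = f*(t) / S*(t)
S*(t) = exp( − ∫_{tn}^{t} λ*(τ) dτ )
f*(t) = λ*(t) · S*(t)
F*(t) = 1 − S*(t)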
Poisson process

[Figure: a timeline of events up to t = T]

Intensity of a Poisson process: λ*(t) = λ (a constant)

Observations:
1. Intensity independent of history
2. Uniformly random occurrence
3. Time intervals follow an exponential distribution
Fitting & sampling from a Poisson

[Figure: a timeline of events up to t = T]

Fitting by maximum likelihood: maximize Σᵢ log λ − λT, which gives λ̂ = n / T (events per unit time)

Sampling using inversion sampling: inter-event times are Exponential(λ)
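A minimal NumPy sketch (for illustration) of sampling a homogeneous Poisson process by inversion and fitting λ by maximum likelihood:

import numpy as np

rng = np.random.default_rng(0)

# --- Sampling on [0, T) via inversion: inter-event times ~ Exponential(lam) ---
lam, T = 2.0, 50.0
times, t = [], 0.0
while True:
    t += rng.exponential(1.0 / lam)
    if t >= T:
        break
    times.append(t)
times = np.array(times)

# --- Fitting by maximum likelihood: lambda_hat = (# events) / T ---
lam_hat = len(times) / T
print(len(times), lam_hat)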


Inhomogeneous Poisson process

[Figure: a timeline of events up to t = T]

Intensity of an inhomogeneous Poisson process: λ*(t) = λ(t), a deterministic function of time (independent of history)

Example: [figure of a time-varying intensity]
Fitting & sampling from inhomogeneous Poisson

[Figure: a timeline of events up to t = T]

Fitting by maximum likelihood: maximize Σᵢ log λ(tᵢ) − ∫₀ᵀ λ(τ) dτ

Sampling using thinning (rejection sampling) + inverse sampling:

1. Sample a candidate from a Poisson process with constant intensity λ̄ ≥ λ(t), using inverse sampling
2. Generate u ~ Uniform(0, 1)
3. Keep the sample with probability λ(t)/λ̄, i.e. keep it if u ≤ λ(t)/λ̄
Self-exciting (or Hawkes) process

[Figure: a timeline of events up to t = T, with the history and the triggering kernel]

Intensity of a self-exciting (or Hawkes) process:
λ*(t) = µ + α Σ_{tᵢ < t} κ(t − tᵢ)
where µ ≥ 0 is the base intensity, α ≥ 0 the excitation weight, and κ a triggering kernel (e.g. exponential)

Observations:
1. Clustered (or bursty) occurrence of events
2. Intensity is stochastic and history dependent
Fitting a Hawkes process from a recorded timeline

[Figure: a timeline of events up to t = T]

Fitting by maximum likelihood: maximize Σᵢ log λ*(tᵢ) − ∫₀ᵀ λ*(τ) dτ

The maximum likelihood problem is jointly convex in µ and α.

Sampling using thinning (rejection sampling) + inverse sampling:

Key idea: the maximum of the intensity changes over time, so the thinning upper bound must be updated as events arrive.
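As an illustrative sketch (not a reference implementation), a Hawkes process with an exponential triggering kernel can be simulated by Ogata's thinning algorithm and its log-likelihood evaluated as follows:

import numpy as np

rng = np.random.default_rng(0)

def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda*(t) = mu + alpha * sum_{t_i < t} beta * exp(-beta (t - t_i))."""
    past = np.asarray(events)
    past = past[past < t]
    return mu + alpha * np.sum(beta * np.exp(-beta * (t - past)))

def simulate_hawkes(mu, alpha, beta, T):
    """Ogata's thinning: the current intensity is an upper bound until the next event."""
    events, t = [], 0.0
    while t < T:
        lam_bar = hawkes_intensity(t + 1e-10, events, mu, alpha, beta)
        t += rng.exponential(1.0 / lam_bar)
        if t >= T:
            break
        if rng.uniform() <= hawkes_intensity(t, events, mu, alpha, beta) / lam_bar:
            events.append(t)            # accept the candidate event
    return np.array(events)

def hawkes_loglik(events, mu, alpha, beta, T):
    """log L = sum_i log lambda*(t_i) - integral_0^T lambda*(tau) d tau."""
    ll = sum(np.log(hawkes_intensity(t, events, mu, alpha, beta)) for t in events)
    compensator = mu * T + alpha * np.sum(1.0 - np.exp(-beta * (T - events)))
    return ll - compensator

ev = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.0, T=100.0)
print(len(ev), hawkes_loglik(ev, 0.5, 0.8, 1.0, 100.0))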
Mutually exciting process

[Figure: two coupled timelines (Bob and Christine), each with its own history; events on one timeline raise the intensity of the other]

Clustered occurrence affected by neighbors


Temporal Point Process
Deep Temporal Point Process Prediction
Transformer Hawkes Process
Embedding Layers
Multi-head Self-attention Modules
Continuous Time Conditional Intensity
Comparison Results
Summary
● Understanding different properties of time series is important
○ Stationarity, Trend, Seasonality, etc.
● Time series forecasting:
○ Identify trend and seasonality, and "remove" them
○ Model the stationary time series: ARIMA
● Deep time-series forecasting:
○ Transformer-based models dominate – Informer
● Temporal point process models:
○ Event data can be modeled using the self-exciting Hawkes process
○ Transformer Hawkes process – deep TPP modelling
Thanks

questions?

Email: sourangshu@cse.iitkgp.ac.in
