Linear Factor Models
Sargur N. Srihari
srihari@cedar.buffalo.edu
Topics in Linear Factor Models
1. Definition of Linear Factor Models
2. Related methods
   1. Principal Components Analysis
   2. Factor Analysis
3. Linear Factor Models generalize the above
4. Independent Component Analysis (ICA)
5. Slow Feature Analysis
6. Sparse Coding
7. Manifold Interpretation of PCA
Deep Learning is about Models
• Many research frontiers of deep learning involve building a
  probabilistic model of the input, pmodel(x)
• Such a model can be used with probabilistic inference to predict
  any of the variables given any other variables
Simplest model with latent variables
• Much of deep learning aims to construct pmodel(x)
  – Useful for predicting some variables given other variables
• With latent variables, pmodel(x) = E_h pmodel(x | h)
  – Latent variables provide another means of representing the data
  – Representations based on latent variables can obtain all the
    advantages of feedforward and recurrent networks
• Linear factor models are the simplest models with latent variables
Models with Latent Variables
• Much of deep learning involves building a
probabilistic model of input pmodel(x)
– From which we can infer any other variables
• Many models also have latent variables, h
  – We can write pmodel(x) = E_h pmodel(x | h), since
      p(x) = Σ_h p(x, h) = Σ_h p(x | h) p(h) = E_h p(x | h)
– These latent variables provide another means of
representing the data
• Distributed representations based on latent
variables can have all the advantages of
representation learning with deep feed-forward
and recurrent networks
Linear factor models
• Linear factor models are the simplest probabilistic models with
  latent variables
  – They are used as building blocks for:
    • Mixture models
    • Deep probabilistic models
• They are basic approaches to building generative models, which
  deeper models then extend
• A linear factor model is defined by a stochastic linear decoder
  function that generates x by adding noise to a linear transformation
  of h, i.e.,
      x = Wh + b + noise
Linear Factor Model Definition
• A linear factor model describes a data-generating process as follows
  – First we sample the explanatory factors h from a distribution
    h ~ p(h), where p(h) is a factorial distribution
      p(h) = Π_i p(h_i)
    so that it is easy to sample from
  – Next we sample the real-valued observable variables given the
    factors:
      x = Wh + b + noise
    where the noise is typically Gaussian and diagonal (independent
    across dimensions)
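
As a concrete illustration, here is a minimal numpy sketch of this two-step
ancestral sampling process (the dimensions, the Laplace choice for the
factorial prior p(h), and the noise scale are arbitrary assumptions for
illustration):

import numpy as np

rng = np.random.default_rng(0)
n, k, N = 4, 2, 1000            # observed dim, latent dim, number of samples
W = rng.normal(size=(n, k))     # factor loading matrix
b = rng.normal(size=n)          # offset
sigma = 0.1                     # scale of the diagonal Gaussian noise

# Step 1: sample the factors from a factorial prior (independent Laplace factors here)
h = rng.laplace(size=(N, k))

# Step 2: sample the observations, x = Wh + b + noise
x = h @ W.T + b + sigma * rng.normal(size=(N, n))
print(x.shape)                  # (1000, 4)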
Graphical Representation of Linear Factor Model
    h ~ p(h)  with  p(h) = Π_i p(h_i)
    x = Wh + b + noise
(Figure: directed graphical model in which the observed x is produced
from the factors h, plus noise.)
Special cases of Linear Factor Model
    h ~ p(h)  with  p(h) = Π_i p(h_i)
    x = Wh + b + noise
• Special cases of the above equations are:
  1. Probabilistic PCA
  2. Factor analysis
  3. Other linear factor models
• They differ in the choices made for the noise distribution and for
  the prior over the latent variables h before observing x
  – Factor Analysis and Probabilistic PCA are shown next
Factor Analysis
    h ~ p(h)  with  p(h) = Π_i p(h_i)
    x = Wh + b + noise
• The prior p(h) is a unit-variance Gaussian: h ~ N(h; 0, I)
• The x_i are conditionally independent given h
  – The noise is Gaussian with diagonal covariance matrix ψ = diag(σ²),
    where σ² = [σ₁², …, σ_n²] is a vector of per-dimension variances
  – The latent variables capture the dependencies between the x_i
• It can be shown that x is multivariate Gaussian:
      x ~ N(x; b, WWᵀ + ψ)
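
A quick numpy check (a sketch with arbitrary small dimensions) that data
generated by this process does have marginal covariance WWᵀ + ψ:

import numpy as np

rng = np.random.default_rng(0)
n, k, N = 5, 2, 200_000
W = rng.normal(size=(n, k))
b = rng.normal(size=n)
psi = np.diag(rng.uniform(0.1, 0.5, size=n))    # diagonal noise covariance

h = rng.normal(size=(N, k))                     # h ~ N(0, I)
noise = rng.multivariate_normal(np.zeros(n), psi, size=N)
x = h @ W.T + b + noise                         # x = Wh + b + noise

# empirical covariance should match W W^T + psi up to sampling error
print(np.allclose(np.cov(x, rowvar=False), W @ W.T + psi, atol=0.05))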
Probabilistic PCA
    h ~ p(h)  with  p(h) = Π_i p(h_i)
    x = Wh + b + noise
• A slightly modified factor analysis model
• Assume equal conditional variances: σ² = σ₁² = … = σ_n²
  – Thus x ~ N(x; b, WWᵀ + σ²I), or equivalently x = Wh + b + σz,
    where z ~ N(z; 0, I) is Gaussian noise
  – Iterative EM can be used to estimate W and σ²
  – This takes advantage of the observation that most variations in
    the data are captured by the latent variables h, up to a small
    residual reconstruction error σ²
• Probabilistic PCA becomes PCA as σ → 0
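
A minimal numpy sketch of that EM procedure, following the standard
Tipping and Bishop updates (the function name ppca_em, the random
initialization, and the fixed iteration count are illustrative assumptions,
not from the slides):

import numpy as np

def ppca_em(X, k, n_iter=200):
    """EM for probabilistic PCA: estimates b, W (d x k) and the noise variance sigma^2."""
    N, d = X.shape
    b = X.mean(axis=0)
    Xc = X - b
    rng = np.random.default_rng(0)
    W, sigma2 = rng.normal(size=(d, k)), 1.0
    for _ in range(n_iter):
        # E-step: posterior moments of h given each x
        Minv = np.linalg.inv(W.T @ W + sigma2 * np.eye(k))
        Eh = Xc @ W @ Minv                               # E[h | x], one row per example
        sum_Ehh = N * sigma2 * Minv + Eh.T @ Eh          # sum over examples of E[h h^T | x]
        # M-step: update W and sigma^2
        W = (Xc.T @ Eh) @ np.linalg.inv(sum_Ehh)
        sigma2 = (np.sum(Xc**2) - 2 * np.sum(Eh * (Xc @ W))
                  + np.trace(sum_Ehh @ W.T @ W)) / (N * d)
    return b, W, sigma2

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))
b_hat, W_hat, s2_hat = ppca_em(X, k=2)
print(s2_hat)    # should be close to the generating noise variance 0.1**2 = 0.01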
PCA (Principal Components Analysis)
https://medium.com/@mallrishabh52/principal-components-analysis-7f6ff559cd83
PCA
Principal components capture the most variation in a dataset.
PCA deals with the curse of dimensionality by capturing the essence
of the data in a few principal components.
PC1 must convey the maximum variation among the data points and
contain minimum error. PC2 is the second line that meets PC1
perpendicularly at the center of the cloud, and describes the second
most variation in the data.
PCA Algorithm (Linear Algebra)
• Given {x(1), …, x(m)} in R^n, represent each point using R^l, l < n
  – For each point x(i), find a code vector c(i) in R^l
• Find an encoder f(x) = c and a decoder such that x ≈ g(f(x))
  – One decoding function is g(c) = Dc, where D is a matrix with l
    mutually orthogonal, unit-norm columns, chosen to minimize the
    distance between x and its reconstruction r(x) = g(f(x)) = DDᵀx:
      c* = argmin_c ||x − g(c)||₂²
      D* = argmin_D sqrt( Σ_{i,j} ( x_j(i) − r(x(i))_j )² )
• Solution for D
  – D is given by the l eigenvectors of XᵀX corresponding to the
    largest eigenvalues, where X ∈ R^{m×n} is the design matrix
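
A numpy sketch of this recipe (assuming X has already been centered; the
variable names and dimensions are illustrative):

import numpy as np

def pca_encode_decode(X, l):
    """PCA using the top-l eigenvectors of X^T X as the columns of D."""
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # eigenvalues in ascending order
    D = eigvecs[:, -l:]                          # n x l, eigenvectors of the largest eigenvalues
    C = X @ D                                    # codes: c = f(x) = D^T x
    R = C @ D.T                                  # reconstructions: r(x) = D D^T x
    return D, C, R

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X -= X.mean(axis=0)                              # center the data
D, C, R = pca_encode_decode(X, l=2)
print(np.sum((X - R) ** 2))                      # total squared reconstruction error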
Confirmatory Factor Analysis
Factor analysis is a technique for identifying which underlying factors
are measured by a (much larger) number of observed variables.
Such “underlying factors” are difficult to measure directly, e.g., IQ,
depression or extraversion.
Confirmatory factor analysis tests a researcher’s hypothesis, e.g.: if
questions 1, 2 and 3 all measure numeric IQ, then the Pearson
correlations among these items should be substantial, because
respondents with high numeric IQ will typically score high on all 3
questions.
Exploratory factor analysis starts with no clue as to which, or even
how many, factors are represented by the data.
Exploratory Factor Analysis
• Psychologist’s hypothesis: there are two kinds (k = 2) of latent
  intelligence
  – Verbal (factor F1) and mathematical (factor F2)
• Evidence for the hypothesis is sought in the examination scores (x)
  from p = 6 academic fields (e.g., astronomy) of n = 1000 students
  – Observable variables x1, …, x6 with means μ1, …, μ6:
      x_i − μ_i = l_i1 F1 + l_i2 F2 + ε_i ,  i = 1, …, 6
    where the l_ij are the loadings
    • In matrix form x − μ = LF + ε, where x is p×n, L is p×k, and F is k×n
  – The values of L, μ, and the variances of the errors ε must be
    estimated from the data x and F (the assumption about the levels
    of the factors is fixed for a given F)
  – Solution: for astronomy, average student aptitude is 10F1 + 6F2
    (a code sketch of this setup follows)
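
A sketch of this setup with scikit-learn's FactorAnalysis on synthetic data
(the loading values, noise scale, and the offset of 50 are made-up numbers
for illustration; only the 10F1 + 6F2 astronomy loading echoes the slide):

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_students = 1000
F = rng.normal(size=(n_students, 2))             # latent abilities: F1 (verbal), F2 (math)
L_true = np.array([[10, 6],                      # astronomy, as in the slide's example
                   [8, 2], [7, 3],               # two more verbal-heavy fields
                   [2, 9], [1, 8], [3, 7]])      # three math-heavy fields
scores = F @ L_true.T + 50.0 + rng.normal(scale=2.0, size=(n_students, 6))

fa = FactorAnalysis(n_components=2, random_state=0).fit(scores)
print(fa.components_)        # estimated loadings L (recovered only up to rotation and sign)
print(fa.noise_variance_)    # estimated per-subject error variances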
Factor Analysis
• A method to describe variability among observed, correlated
  variables in terms of a smaller number of latent variables called
  factors
– E.g., variations in six observed variables mainly
reflect the variations in two latent variables
• Observed variables modeled as linear
combinations of latent factors, plus error terms
– Factor analysis aims to find independent latent
variables
PCA vs. Factor Analysis
• Both are data reduction techniques
• Both involve choosing components or factors
• Fundamental difference between them:
– PCA is a linear combination of variables
– Factor Analysis is a measurement model of a latent
variable
• PCA is a more basic version of Factor Analysis
PCA vs Factor Analysis
• Principal Components Analysis
  – Creates index variables from a larger set of measured variables
  – Four measured variables y are combined into a single component c
  – Model set up as:  c = w1*y1 + w2*y2 + w3*y3 + w4*y4
• Factor Analysis
  – A model for measuring an unobservable latent variable
  – F, the latent factor, causes the responses on the four measured
    y variables; the u’s are the variance in each y that is unexplained
    by the factor
  – Model set up as regression equations:
      y1 = b1*F + u1
      y2 = b2*F + u2
      y3 = b3*F + u3
      y4 = b4*F + u4
https://www.theanalysisfactor.com/
the-fundamental-difference-between-principal-component-analysis-and-factor-analysis/
Independent Component Analysis
• An approach to modeling linear factors
• It seeks to separate an observed signal into the underlying
  independent signals that are scaled and added together to form the
  observed data
Examples of ICA
1. Extracting source from noisy signal
(Figure: a mixed signal and the true source extracted from it.)
2. Cocktail party problem: speech signals of people
talking simultaneously are separated
ICA requires independent signals
• Signals are intended to be fully independent
rather than merely decorrelated from each other
– Independence is stronger than zero covariance
• Example: zero covariance does not imply independence
  – Sample x uniformly from [−1, 1]
  – Let s be 1 with probability 0.5, otherwise s = −1
  – Let y = sx
    • Clearly x and y are not independent, since y is generated from x
    • But x and y have zero covariance
An ICA model
• The prior p(h) is fixed ahead of time
• The model deterministically generates x = Wh
  – A nonlinear change of variables is used to determine p(x)
• Learning proceeds using maximum likelihood
• By choosing an independent p(h), we can recover underlying factors
  that are as close as possible to independent
  – This approach is often used to recover low-level signals that
    have been mixed together (a sketch follows)
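
A sketch of such signal recovery using FastICA from scikit-learn (FastICA
is one particular ICA estimator, not the maximum-likelihood variant
described above; the sources are made up, and the mixing weights are
borrowed from the next slide):

import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)
s1 = np.sign(np.sin(3 * t))                    # source A: square wave
s2 = np.sin(5 * t)                             # source B: sinusoid
S = np.c_[s1, s2]

A = np.array([[1.0, -2.0],                     # mixture 1: A - 2B
              [1.7, 3.4]])                     # mixture 2: 1.7A + 3.4B
X = S @ A.T                                    # observed sensor signals

S_hat = FastICA(n_components=2, random_state=0).fit_transform(X)
for i in range(2):                             # each recovered component matches one source
    print([round(abs(np.corrcoef(S_hat[:, i], s)[0, 1]), 2) for s in (s1, s2)])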
ICA signal separation
• Each example is one moment in time
• Each xi is a sensor observation of mixed signals
• Each hi is one estimate of the original signals
(Figure: independent sources A and B; observed mixtures, top: A − 2B,
bottom: 1.7A + 3.4B; and the recovered signals.)
Choice of p(h) in ICA
• All variants of ICA require p(h) to be non-Gaussian
  – This is because if p(h) is an independent prior with Gaussian
    components, then W is not identifiable
• This is different from probabilistic PCA and factor analysis,
  where p(h) is Gaussian
• A typical choice is p(h_i) = (d/dh_i) σ(h_i), the derivative of the
  logistic sigmoid
  – Such densities have larger peaks near 0 than the Gaussian does,
    so ICA can be seen as learning sparse features
Generalization of ICA
• PCA generalizes to nonlinear autoencoders
• ICA generalizes to a nonlinear generative
model
– Use a nonlinear f to generate observed data
Slow Feature Analysis
• It is a Linear factor model
• Uses information from time signals to learn
invariant features
• Motivation: Slowness principle
– Important characteristics change slowly compared
to individual measurements that make up a scene
• Computer vision example shown next
SFA in computer vision
• Individual pixels can change very rapidly
• Ex: zebra moves from right to left
– Pixels change rapidly from black to white to black
– Feature indicating whether zebra is in image
changes slowly
• Regularize model to learn features that change
slowly with time
29
Deep Learning Srihari
Slowness Principle
• The slowness principle can be applied to any model trained with
  gradient descent
• It is introduced by adding a term to the cost function of the form
      λ Σ_t L( f(x(t)), f(x(t+1)) )
  – where f is the feature extractor to be regularized
  – λ is the strength of the slowness regularization term
  – L is a loss function measuring the distance between f(x(t)) and
    f(x(t+1))
    • A common choice for L is the mean squared difference (see the
      sketch below)
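
A minimal numpy sketch of this regularizer with the mean-squared-difference
choice of L (the function name and the toy feature trajectories are
illustrative assumptions):

import numpy as np

def slowness_penalty(features, lam=1.0):
    """lam * sum over t of ||f(x(t+1)) - f(x(t))||^2 for a sequence of feature vectors."""
    diffs = features[1:] - features[:-1]         # consecutive differences along time
    return lam * np.sum(diffs ** 2)

t = np.linspace(0, 1, 200)[:, None]
slow_features = np.sin(2 * np.pi * t)            # slowly varying feature trajectory
fast_features = np.sin(40 * np.pi * t)           # rapidly varying feature trajectory
print(slowness_penalty(slow_features) < slowness_penalty(fast_features))   # True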
Sparse Coding
• A linear factor model
• It has been studied extensively as an unsupervised feature learning
  and feature extraction mechanism
• Terminology
  – Strictly, “sparse coding” refers to inferring the values of h in
    the model
  – “Sparse modeling” refers to the process of designing and learning
    the model
  – But the term sparse coding is often used to refer to both
Sparse Coding Definition
• Sparse coding uses a linear decoder plus noise, as in other linear
  factor models:
      p(x | h) = N(x; Wh + b, (1/β) I)
• The prior p(h) is chosen to be sharply peaked near 0, e.g., a
  factorized Laplace prior
      p(h_i) = Laplace(h_i; 0, 2/λ) = (λ/4) exp(−(λ/2) |h_i|)
• Inferring the code h for a given x is then MAP inference:
      h* = argmax_h p(h | x) = argmin_h  λ ||h||₁ + β ||x − Wh||₂²
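
As an illustration of sparse coding in the inference sense, here is a small
ISTA sketch that infers h by minimizing (1/2)||x − Wh||₂² + λ||h||₁ (ISTA
itself, the random dictionary, and all the numbers below are assumptions
for illustration, not part of the slides):

import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(x, W, lam=0.1, n_iter=1000):
    """Infer a sparse code h minimizing 0.5*||x - W h||^2 + lam*||h||_1 via ISTA."""
    step = 1.0 / np.linalg.norm(W, 2) ** 2       # step size from the Lipschitz constant
    h = np.zeros(W.shape[1])
    for _ in range(n_iter):
        grad = W.T @ (W @ h - x)                 # gradient of the quadratic term
        h = soft_threshold(h - step * grad, step * lam)
    return h

rng = np.random.default_rng(0)
W = rng.normal(size=(20, 50))                    # overcomplete dictionary
h_true = np.zeros(50)
h_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
x = W @ h_true + 0.01 * rng.normal(size=20)
h_hat = sparse_code_ista(x, W)
print(np.nonzero(np.abs(h_hat) > 0.1)[0])        # should roughly recover indices [3, 17, 41]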
Manifold Interpretation of PCA
• Linear factor models including PCA and factor
analysis can be interpreted as learning a
manifold
• Probabilistic PCA learns a thin, pancake-shaped region of high
  probability
• Illustrated next
Flat Gaussian near a low-dimensional manifold
(Figure: shows the upper half of the “pancake” above the manifold
plane, which passes through its middle.)
The variance orthogonal to the manifold is small (it can be regarded
as noise), while the other variances are large (they correspond to
signal).
Generality of the Interpretation
• The manifold interpretation applies not just to PCA but to any
  linear autoencoder that learns matrices W and V with the goal of
  making the reconstruction of x lie as close to x as possible
• Let the encoder be h = f(x) = Wᵀ(x − μ)
  – The encoder computes a low-dimensional representation h
• In the autoencoder view, the decoder computes the reconstruction
  x̂ = g(h) = b + Vh
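
A small numpy sketch of such an encoder/decoder pair, taking both W and V
to be the top principal directions and b = μ (the synthetic data and the
choice of 3 components are arbitrary assumptions):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 10)) \
    + 0.01 * rng.normal(size=(500, 10))          # approximately 3-dimensional data in R^10
mu = X.mean(axis=0)

_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
D = Vt[:3].T                                     # 10 x 3: top-3 principal directions

f = lambda x: D.T @ (x - mu)                     # encoder:  h = W^T (x - mu), with W = D
g = lambda h: mu + D @ h                         # decoder:  x_hat = b + V h, with b = mu, V = D
x = X[0]
print(np.linalg.norm(x - g(f(x))))               # small reconstruction error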
Summary of Linear Factor Models
• Linear factor models are
– The simplest generative models
– Simplest models that learn a representation of data
• Analogy between linear classifiers and linear factor models:
  1. Linear classifier/regression models are extended to deep
     feedforward networks
  2. Linear factor models are extended to autoencoder networks and
     deep probabilistic models
     – These perform the same tasks but with a much more powerful and
       flexible model family
Distribution of stars: Galaxy M31 in Andromeda
M31 is 2.1 million light years away and heading on a collision course
with the Milky Way. They should collide in about 4 billion years. We
won’t feel much from the mash-up, as there is so much empty space
between the stars of both galaxies that few, if any, will notice.
The 3-dimensional data largely lies on a 2-dimensional plane.
Both PCA and Factor Analysis aim to find that plane, using different
approaches.
(Photo of M31 in the Andromeda constellation, courtesy of Michael Caliguri.)