
DEEP LEARNING-MODULE 3

Dr. Neethu Anna Sabu


Given an input value z = −1.2, compute the output of the following activation functions:
1. Sigmoid
2. Tanh
3. ReLU
4. Leaky ReLU (slope = 0.01)

For the same input z = −1.2, calculate the gradient (derivative with respect to the input) of each of these activation functions.
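One way to verify these values is in code; a minimal sketch, assuming NumPy and the standard definitions of each activation:

    import numpy as np

    z = -1.2

    sig = 1.0 / (1.0 + np.exp(-z))        # sigmoid(z)      ~ 0.2315
    d_sig = sig * (1.0 - sig)             # sigmoid'(z)     ~ 0.178

    th = np.tanh(z)                       # tanh(z)         ~ -0.8337
    d_th = 1.0 - th ** 2                  # tanh'(z)        ~ 0.305

    relu = max(0.0, z)                    # ReLU(z)          = 0.0
    d_relu = 1.0 if z > 0 else 0.0        # ReLU'(z)         = 0 for z < 0

    alpha = 0.01                          # Leaky ReLU slope
    lrelu = z if z > 0 else alpha * z     # LeakyReLU(z)     = -0.012
    d_lrelu = 1.0 if z > 0 else alpha     # LeakyReLU'(z)    = 0.01

    print(sig, d_sig, th, d_th, relu, d_relu, lrelu, d_lrelu)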
AUTOENCODERS
• Unsupervised representation learning: neural networks discover meaningful patterns or features in unlabeled data.
• Imposes a bottleneck in the network: the bottleneck is a hidden layer in the middle of the network that has fewer neurons than the input or output layers.
• This forces the network to compress the input data into a lower-dimensional representation.
• The bottleneck forces a compressed knowledge representation of the input.
• An autoencoder is a neural network that is trained to attempt to copy its input to its output.
• The network may be viewed as consisting of two parts:
  • an encoder function h = f(x), and
  • a decoder that produces a reconstruction r = g(h).

• An autoencoder is a type of neural network architecture designed to efficiently compress (encode) input data down to its essential features, then reconstruct (decode) the original input from this compressed representation.
• Autoencoders should not copy the input perfectly.
• They are restricted by design to copy only approximately.
• By doing so, they learn useful properties of the data.
• Modern autoencoders often use stochastic mappings.
• Autoencoders were traditionally used for dimensionality reduction as well as feature learning.
• An autoencoder is a mathematical model that trains on unlabeled, unclassified data; it maps the input data to a compressed feature representation and then reconstructs the input data from that representation.
ENCODER
• This part of the network compresses the input.
• It reduces the dimensionality using hidden layers.
• It learns important features and discards irrelevant details.

BOTTLENECK
• The compressed representation (latent space).
• It contains only the most important features needed to represent the input.

DECODER
• Reconstructs the original input.

Structure of an autoencoder (figure)
Components of the Autoencoder
1. Input (x)
• Raw data (e.g., an image, a signal, etc.)
2. Encoder Function (f)
• Maps the input x to a compressed representation h: h = f(x)
• This layer often includes nonlinear transformations (e.g., ReLU, sigmoid).
3. Hidden Layer / Code (h)
• The bottleneck layer: a lower-dimensional representation of the input.
• Forces the network to learn the most important features in a compact form.
4. Decoder Function (g)
• Reconstructs the input from the hidden code: r = g(h)
5. Reconstructed Output (r)
• The network's attempt to recreate the input x from the compressed code.
The autoencoder is trained to minimize the reconstruction error between the input x and the output r.

Common loss functions include:

• Mean Squared Error (MSE): L(x, r) = (1/n) Σ_i (x_i − r_i)^2
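A minimal sketch tying the pieces x, f, h, g, r, and the MSE loss together (assuming PyTorch; the layer sizes and data are illustrative placeholders, not from the slides):

    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())      # f: x -> h (32-unit bottleneck)
    decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())   # g: h -> r

    x = torch.rand(16, 784)                 # a batch of inputs (placeholder data)
    h = encoder(x)                          # h = f(x), the bottleneck code
    r = decoder(h)                          # r = g(h), the reconstruction
    loss = nn.functional.mse_loss(r, x)     # reconstruction error to minimize
    loss.backward()                         # gradients for training by backpropagation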
APPLICATIONS
1. Dimensionality Reduction: the process of converting data with a large number of dimensions into data with fewer dimensions while ensuring that it conveys similar information concisely.
2. Image Denoising: a noisy image can be given as input to the autoencoder and a de-noised image produced as output. The autoencoder tries to de-noise the image by learning the latent features of the image and using them to reconstruct an image without noise. The reconstruction error can be calculated as a measure of the distance between the pixel values of the output image and the ground-truth image.
3. Feature Extraction: once the model is fit on the training dataset, the reconstruction (decoding) part of the model can be discarded and the model up to the bottleneck can be used (only the encoding part is required). The output of the model at the bottleneck is a fixed-length vector that provides a compressed representation of the input data.
4. Data Compression: the process of reducing the number of bits needed to represent data. Compressing data can save storage capacity, speed up file transfer, and decrease costs for storage hardware and network bandwidth. Autoencoders are able to generate reduced representations of input data.
5. Removing Watermarks from Images
• Drawbacks of Autoencoders:
1. An autoencoder learns to capture as much information as possible rather than as much relevant information as possible.
2. Training an autoencoder requires a lot of data, processing time, hyperparameter tuning, and model validation before even starting to build the real model.
3. Because it is trained with back-propagation using a loss metric, crucial information may be lost during reconstruction of the input.

ASSUMPTIONS
• A high degree of correlation exists in the data.
• For uncorrelated data, the input features are independent, so compression and subsequent reconstruction would be difficult.
• There is nothing to combine or merge without losing information.
• If you reduce dimensionality, you are throwing away important parts, not just noise or overlapping data.
• Every feature contains unique, essential information.
AUTOENCODER
• The reconstruction error should be minimal.
• Number of neurons in the hidden layer < number of neurons in the input layer.
• The dimensionality of x and x^ should be the same.

Expectations
• Sensitive enough to the input for accurate reconstruction: x and x^ should be (nearly) the same.
• The AE should reconstruct the input accurately, but this could be achieved by the identity function; that is NOT OUR AIM, because then the network just memorizes the input rather than extracting patterns or learning compressed features.
• The AIM is that the input data be represented in a compressed domain.
• Insensitive enough that it does not memorize or overfit the training data, i.e., it does not learn the identity function; an overfit AE fails to reconstruct or encode new/unseen inputs accurately, having learned noise and specific details rather than the underlying structure.
• These two requirements are CONFLICTING.
They are balanced by designing a loss function that addresses both.
• The loss function of an autoencoder is therefore composed of two different parts.
• Reconstruction loss: measures how close the reconstructed output X^ is to the original input X.
• Regularization term: adds a penalty to the loss function to
  • prevent overfitting,
  • enforce constraints (e.g., sparsity, smoothness),
  • encourage robust representations.
• The first part is the loss function proper (e.g., mean squared error) measuring the difference between the input data and the output data, while the second term acts as a regularizer that prevents the autoencoder from overfitting.
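In symbols (a generic form; the exact penalty Ω and its weight depend on the autoencoder variant):

    L_total(X, X^) = ||X − X^||^2 + λ·Ω(h)

where Ω(h) is the regularization penalty on the hidden code (or on the encoder/decoder) and λ sets its relative weight against the reconstruction term.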
• PCA (Principal Component Analysis) in deep learning is a dimensionality
reduction technique used as a preprocessing step before training deep neural
networks.
• PCA transforms high-dimensional data into a lower-dimensional space by
identifying the directions (principal components) that capture the most variance in
the data.
• This compressed representation can be fed into deep learning models to reduce
complexity, speed up training, and avoid overfitting.
• PCA is like a linear autoencoder with no activation functions (only linear transformations) and just one hidden layer.
• But autoencoders are more powerful, especially when data relationships are nonlinear.
• PCA is a linear dimensionality reduction technique, not a neural network. But a linear autoencoder (which is a neural network) can behave like PCA.
Linear autoencoder
• A Linear Autoencoder is a type of autoencoder where all layers use only linear transformations (i.e., no activation functions like ReLU, Sigmoid, or Tanh are used).
• It cannot capture non-linear patterns.
• Even without non-linearities, a linear autoencoder can learn a compressed representation of the input data through linear transformations.
• It is mathematically similar to Principal Component Analysis (PCA).
• Encoder: projects the input into a lower-dimensional space (linear projection).
• Decoder: maps back to the original space (linear reconstruction).
• Loss function: Mean Squared Error (MSE) between the input X and the output X^.
• When:
  • the autoencoder has a single hidden layer,
  • uses MSE loss,
  • has no activation functions, and
  • the weights are tied (i.e., the decoder weights are the transpose of the encoder weights),
• then it performs exactly like PCA, learning the same principal subspace (lower-dimensional space).

• X: the input dataset.
• n: number of data samples (rows).
• d: number of features (input dimensions), so each data sample is a d-dimensional vector.
• h: latent representation (code) of the input.
• W1: encoder weight matrix.
• X^: reconstructed version of the input.
• W2: decoder weight matrix.
• A linear autoencoder with one hidden layer, tied weights, and MSE loss will learn the same subspace as Principal Component Analysis (PCA).
• This is why linear autoencoders are often considered a neural version of PCA.
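A minimal NumPy sketch of this correspondence (the data, n, d, and k are illustrative assumptions): the SVD gives the PCA subspace, and projecting with it is exactly the forward pass of a tied-weight linear autoencoder.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 500, 10, 3
    X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))   # correlated features
    X = X - X.mean(axis=0)                                   # center the data, as PCA does

    # PCA: the top-k right singular vectors of X are the principal directions.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W1 = Vt[:k].T                      # d x k encoder weights; decoder is tied, W2 = W1.T

    H = X @ W1                         # h = X W1, linear projection (no activation)
    X_hat = H @ W1.T                   # X^ = h W2 = h W1.T, linear reconstruction
    mse = np.mean((X - X_hat) ** 2)    # the MSE this W1 achieves is the minimum over
    print(mse)                         # all rank-k tied linear encoder/decoder pairs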
Undercomplete autoencoders

By forcing the model to pass the input through a lower-dimensional space, it must learn to compress the most important features of the data. This prevents it from simply copying the input (i.e., identity mapping). It is like PCA, but it can model non-linear structures and learns compressed, informative features.
• Autoencoders whose code dimension is less than the input dimension are called undercomplete autoencoders.
• By penalizing the network according to the reconstruction error, the model can learn and capture the most salient features.
• Special case: if the encoder and decoder are linear and the loss function is mean squared error, the model reduces to Principal Component Analysis.
• Neural networks are capable of learning non-linear functions, and an autoencoder such as this one can be thought of as a non-linear PCA.
• AE training: minimize the loss function.
• The Mean Squared Error (MSE) loss function is commonly used in autoencoders.
• There are several ways to design autoencoders to copy only approximately; the system then learns useful properties of the data.
• One way makes the dimension of h smaller than the dimension of x.
• Undercomplete: h has a smaller dimension than x.
• Overcomplete: h has a greater dimension than x.
• Principal Component Analysis (PCA): an undercomplete autoencoder with a linear decoder and MSE loss function learns the same subspace as PCA.
• Nonlinear encoder/decoder functions yield more powerful nonlinear generalizations of PCA.
AVOIDING A TRIVIAL IDENTITY
• Undercomplete autoencoders
  • h has lower dimension than x
  • f or g has low capacity (e.g., linear g)
  • Must discard some information in h
• Overcomplete autoencoders
  • h has higher dimension than x
  • Must be regularized
AUTOENCODER
• A linear transformation followed by a non-linearity.
• The hidden representation captures everything it needs in order to reconstruct x_i.
• After passing through the bottleneck layer, the reconstructed output should equal the input.
• Analogy with PCA (a linear transformation): the two are similar in that both aim to reduce the dimensionality of data and learn compressed representations.
Case 1: Undercomplete autoencoder
• Analogy with PCA: h has all the important characteristics needed to reconstruct the data, so the input can be reconstructed from the bottleneck layer.
Case 2: Overcomplete autoencoder
• Example: 10 bits of input stored in 16 bits (everything in the original is retained).
• Dimension of h > dimension of x_i.
• All the information is available in the bottleneck layer.
Example 01
• The output is binary (no real numbers). What is an appropriate output function for the decoder?
Example 02
• A linear output unit can produce any real number.
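A hedged sketch of this choice (assuming PyTorch; the sigmoid/binary-cross-entropy and linear/MSE pairings are standard options, and the sizes and data are placeholders):

    import torch
    import torch.nn as nn

    h = torch.rand(8, 32)                                     # bottleneck codes (placeholder)

    # Binary outputs: sigmoid decoder output with binary cross-entropy.
    binary_decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())
    x_binary = torch.randint(0, 2, (8, 784)).float()
    loss_binary = nn.functional.binary_cross_entropy(binary_decoder(h), x_binary)

    # Real-valued outputs: linear decoder output (any real number) with MSE.
    real_decoder = nn.Linear(32, 784)
    x_real = torch.randn(8, 784)
    loss_real = nn.functional.mse_loss(real_decoder(h), x_real)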
• Reference: https://apxml.com/courses/autoencoders-representation-learning/chapter-3-regularized-autoencoders/overfitting-in-autoencoders
Regularization in AE
• Overcomplete autoencoders must be regularized because without regularization,
they risk learning a trivial identity function, defeating the purpose of learning
meaningful representations.
• The autoencoder has more capacity than needed.
• It can easily memorize the input by learning the identity function: X^ = X.
• This leads to poor generalization and no meaningful feature learning.
Regularization adds constraints or penalties that make this easy identity mapping undesirable or infeasible.
Regularization is most often applied to the encoder, because the encoder determines the quality of the latent representation, which is critical to meaningful learning.
However, decoder regularization is also important to:
• prevent overfitting during reconstruction,
• ensure smooth mappings from code to output,
• stabilize training.
Regularized autoencoders
• A similar problem occurs if the hidden code is allowed to have dimension equal to
the input, and in the overcomplete case in which the hidden code has dimension
greater than the input.
• In these cases, even a linear encoder and a linear decoder can learn to copy the
input to the output without learning anything useful about the data distribution
• With proper regularization, autoencoders of any architecture can be trained effectively by selecting an appropriate code dimensionality and network capacity based on the complexity of the data distribution (a regularized autoencoder).
• Allow the overcomplete case, but regularize.
• To prevent the autoencoder from merely copying the input to the output, the loss function should incorporate additional terms that encourage desirable properties such as sparsity in the representation, minimal sensitivity to input changes (small derivatives), and robustness to noise or missing data.
SPARSE AUTOENCODER
• A modification of an overcomplete autoencoder.
• A sparse autoencoder is a model that has been regularized to respond to unique statistical features by enforcing sparsity on the activations within the hidden (bottleneck) layer.
• The core idea is that for any given input sample, only a small subset of neurons in the hidden layer should be significantly active (i.e., have activation values that are not zero or near zero).
• A sparse autoencoder is therefore forced to selectively activate regions of the network depending on the input data.
• This eliminates the network's capacity to memorize the features of the input data, and since some regions are activated while others are not, the network learns the useful information and features.
• Reference: https://apxml.com/courses/autoencoders-representation-learning/chapter-3-regularized-autoencoders/sparse-autoencoders-regularization
• A sparsity constraint is placed on the hidden layer, encouraging most activations to be zero.
• It can be trained using backpropagation like other autoencoders, but with an added penalty term in the loss function.
• Regularization acts on the activations of the hidden-layer nodes.
How is regularization done? Through a sparsity constraint.
Sparsity helps the autoencoder to:
• learn more interpretable features,
• avoid simply learning the identity function,
• encourage disentangled representations, similar to those found in biological systems.
It is particularly useful when the hidden layer has more neurons than the input layer (i.e., an overcomplete autoencoder), which would otherwise just memorize the input.
• Assume a non-linear activation such as sigmoid.
• Active and non-active neurons correspond to outputs of 1 and 0.
• Compute the average activation at the j-th node in the hidden layer.
• m is the number of input vectors; x_i is any input.
• Assume a sparsity parameter such as 0.02 or 0.002, or any smaller value.
The regularization term can be the KL divergence (a measure of the difference) between two distributions, one defined by ρ and one by ρ^.
W is the weight matrix; β sets the relative weight between the data-loss component and the regularization component.
KL Divergence
• Defining a target sparsity parameter, ρ, which represents the desired
average activation level for each hidden neuron (e.g., ρ=0.05 meaning
we want each neuron to be active, on average, only 5% of the time).

We compute the actual average activation of the j-th hidden neuron, ρ^_j, over a set of m training samples:

ρ^_j = (1/m) Σ_{i=1..m} a_j(x_i)

where a_j(x_i) is the activation of hidden unit j for input x_i. We then use the Kullback–Leibler (KL) divergence to measure the difference between the desired average activation ρ and the observed average activation ρ^_j:

KL(ρ || ρ^_j) = ρ log(ρ / ρ^_j) + (1 − ρ) log((1 − ρ) / (1 − ρ^_j))
• KL divergence term acts as a penalty. It is minimized (equals zero)
when ρ^j=ρ, and increases as ρ^j deviates from ρ.
• We sum this penalty over all s hidden units and add it to the reconstruction loss, weighted by another hyperparameter β: J_sparse = J_reconstruction + β Σ_{j=1..s} KL(ρ || ρ^_j).

Minimizing this total loss encourages the network to achieve accurate reconstruction while ensuring
that the average activation of each hidden neuron ρ^j stays close to the target sparsity level ρ
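A minimal NumPy sketch of this penalty (the function name and the ρ = 0.05 and β = 3.0 values are illustrative choices, not from the slides):

    import numpy as np

    def kl_sparsity_penalty(hidden_activations, rho=0.05, beta=3.0):
        """KL-divergence sparsity penalty for a sparse autoencoder.

        hidden_activations: array of shape (m, s) with sigmoid activations of
        the s hidden units over m training samples.
        rho: target average activation per hidden unit.
        beta: weight of the sparsity term in the total loss.
        """
        rho_hat = hidden_activations.mean(axis=0)        # average activation per unit, shape (s,)
        rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)       # avoid log(0)
        kl = rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
        return beta * kl.sum()

    # total_loss = reconstruction_mse + kl_sparsity_penalty(h)   # added to the reconstruction loss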
L1 Regularization for Sparsity

• Refer
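An alternative, sketched here under illustrative names and λ value, is to add an L1 penalty λ·Σ_j |h_j| on the hidden activations, which also pushes most activations toward zero:

    import numpy as np

    def l1_sparsity_penalty(hidden_activations, lam=1e-3):
        """L1 penalty on hidden activations; drives most activations toward zero."""
        return lam * np.abs(hidden_activations).sum()

    # total_loss = reconstruction_mse + l1_sparsity_penalty(h)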
DENOISING AUTOENCODER
• Autoencoders are neural networks commonly used for feature selection and extraction.
• However, when there are more nodes in the hidden layer than there are inputs, the network risks learning the so-called "identity function" (also called the "null function"), meaning that the output equals the input, rendering the autoencoder useless.
• This type of Autoencoder is an alternative to the concept of regular
Autoencoder which is prone to a high risk of overfitting.
• In the case of a Denoising Autoencoder, the data is partially corrupted
by noises added to the input vector in a stochastic manner.
• Then, the model is trained to predict the original, uncorrupted data
point as its output.
• Your friend has learned the manifold from seeing thousands of faces.
• When you hand them a smudged picture, they mentally “snap” it back
to the nearest real-face pattern they know, then redraw it cleanly.
• The denoising autoencoder does exactly this with images, except
mathematically — learning to map noisy data back to the clean data
manifold.
(Figure: noisy data vs. original data)
• The DAE does not memorize the original data; it learns a vector field that maps inputs onto a low-dimensional manifold.
• Noise is added to the input data X in a stochastic manner to get corrupted/noisy data X~ (X tilde).
• The encoder computes f(X~), the decoder computes g(f(X~)), and the reconstruction is X^ = g(f(X~)) = r(X~).
• Loss function to minimize: the squared error between the reconstruction X^ and the clean input X.
• The network learns a vector field that maps an input point to a point on the lower-dimensional manifold.
• manifold is a lower-dimensional surface embedded in a higher-dimensional space,
where the true data points lie.
• In a Denoising Autoencoder, manifold learning means learning a mapping that
takes corrupted data off the manifold and pulls it back to the high-density
data manifold, capturing the structure of the true data distribution.
• A DAE learns the manifold by training on noisy inputs x~ to reconstruct the clean data x.
• The noise forces the model to focus on the robust, underlying patterns (the
manifold) rather than memorizing noisy details.
• The encoder maps the noisy input to a point on the manifold (latent
representation), and the decoder reconstructs the clean data from this point.
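A minimal PyTorch sketch of this training setup (architecture sizes, noise level, and data are illustrative placeholders): the input is corrupted stochastically, but the loss compares the reconstruction with the clean input.

    import torch
    import torch.nn as nn

    class DenoisingAE(nn.Module):
        def __init__(self, n_in=784, n_hidden=64):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(n_in, n_hidden), nn.ReLU())
            self.decoder = nn.Sequential(nn.Linear(n_hidden, n_in), nn.Sigmoid())

        def forward(self, x_noisy):
            h = self.encoder(x_noisy)        # h = f(x~)
            return self.decoder(h)           # x^ = g(f(x~))

    model = DenoisingAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    x = torch.rand(32, 784)                  # clean batch (placeholder data)
    x_noisy = x + 0.2 * torch.randn_like(x)  # stochastic corruption x~
    opt.zero_grad()
    x_hat = model(x_noisy)
    loss = loss_fn(x_hat, x)                 # compare reconstruction with the CLEAN input
    loss.backward()
    opt.step()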
Manifold learning: the manifold interpretation of a Denoising Autoencoder
• The diagram shows how a denoising autoencoder learns to
map noisy inputs back to the clean data manifold.
• The blue curve represents the manifold where real, clean
data points (red X’s) lie.
• When noise is added to a clean sample X, it is displaced to
a corrupted version X~ (brown circle region), which lies
off the manifold.
• The DAE, through its reconstruction function r(x), learns a
vector field r(x)−x (green arrows) that points from noisy
inputs toward the nearest point on the manifold.
• This process effectively "pulls" corrupted samples back to high-density regions of the data distribution, i.e., moves them towards the manifold.
• MANIFOLD LEARNING: capturing the manifold's structure and making the learned features more robust to noise.
• Manifold learning has mostly focused on unsupervised learning procedures that attempt to capture these manifolds.
• Autoencoders exploit the idea that data concentrates around a low-
dimensional manifold or a small set of such manifolds
• Prevents memorizing: the AE learns a vector field that helps it move any point in space to the nearest point on the manifold.
• All autoencoder training procedures involve a compromise between two
forces:
Learning a representation h of a training example x such that x can
be approximately recovered from h through a decoder.
Satisfying the constraint or regularization penalty.
This can be an architectural constraint that limits the capacity of the autoencoder, or it can be a regularization term added to the reconstruction cost.
Vector Field Learned by a Denoising
Autoencoder
Example of Denoising AE
Computational graph for the cost function of a
denoising autoencoder.
• The clean data x is corrupted to x~, encoded
(by encoder f) into features h (compressed
features), and then decoded (by decoder g)
back to reconstruct x.
• The loss measures how well the
reconstructed output matches the original
clean data, encouraging the network to
learn a mapping from noisy inputs back to
the data manifold.
Merits
• Learns more robust filters.
• Prevents the network from learning a simple identity function.
• Decreases the risk of overfitting that can be problematic with a regular AE.
CONTRACTIVE AUTOENCODERS
• The contractive autoencoder (CAE) uses a regularizer to make the derivatives of f(x) as small as possible.
• The name "contractive" arises from the way the CAE warps space: the input neighborhood is contracted to a smaller output neighborhood.
• The CAE is contractive only locally.
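This penalty is the squared Frobenius norm of the Jacobian of the encoder, Ω(h) = λ·Σ_ij (∂h_j/∂x_i)^2. A minimal NumPy sketch for a single sigmoid hidden layer (the function name and λ value are illustrative):

    import numpy as np

    def contractive_penalty(h, W, lam=1e-4):
        """Frobenius-norm-of-Jacobian penalty for a single sigmoid hidden layer.

        h: (batch, s) sigmoid activations of the hidden layer, h = sigmoid(x @ W.T + b)
        W: (s, d) encoder weight matrix
        lam: penalty weight (illustrative value)
        For sigmoid units, dh_j/dx_i = h_j (1 - h_j) W_ji, so
        ||J||_F^2 = sum_j (h_j (1 - h_j))^2 * sum_i W_ji^2.
        """
        dh = (h * (1 - h)) ** 2                      # (batch, s)
        w_sq = (W ** 2).sum(axis=1)                  # (s,)
        return lam * (dh * w_sq).sum(axis=1).mean()  # average penalty over the batch

    # total_loss = reconstruction_mse + contractive_penalty(h, W)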
Regularization
• In the limit of small Gaussian input noise, the denoising reconstruction error is equivalent to a contractive penalty on the reconstruction function that maps x to r = g(f(x)).
• In other words, denoising autoencoders make the reconstruction function resist small but finite-sized perturbations of the input, while contractive autoencoders make the feature extraction function resist infinitesimal perturbations of the input.
• The goal of the CAE is to learn the manifold structure of the data.
• One practical issue with the CAE regularization criterion is that although it is cheap to compute in the case of a single-hidden-layer autoencoder, it becomes much more expensive in the case of deeper autoencoders.
• Another practical issue is that the contraction penalty can give useless results if we do not impose some sort of scale on the decoder.
APPLICATIONS
