Introduction
• A generative model learns the probability distribution
 underlying a dataset, typically through unsupervised
 learning, and such models have been highly successful.
• All types of generative models aim at learning the true
 data distribution of the training set so as to generate new
 data points with some variations.
• Deep generative models (DGMs) are neural networks with
 many hidden layers trained to approximate complicated,
 high-dimensional probability distributions using a large
 number of samples.
• These models have gained significant attention in recent years
 due to their ability to learn complex data distributions and
 generate new samples from those distributions.
• Once a DGM has been trained successfully, it can be used to
 estimate the likelihood of each observation and to create new
 samples from the underlying distribution.
• The two most popular approaches for deep generative modeling
 are:
1. Variational Autoencoders (VAE)
2. Generative Adversarial Networks (GAN).
1. Variational Autoencoders (VAE):
VAEs are probabilistic graphical models rooted in Bayesian
inference. VAEs aim to learn a low-dimensional latent
representation of training data, which can be used to
generate new data points.
VAEs combine an encoder and a decoder network.
The encoder maps input data to a latent space, and the
decoder generates samples from this latent space.
VAEs are commonly used for generative tasks and
representation learning.
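The encoder and decoder roles described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the layer sizes, the linear encoder and decoder, the random untrained weights, and all function names are hypothetical, and no training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_latent = 6, 2   # hypothetical sizes

# Hypothetical, untrained parameters of a linear encoder and decoder.
W_mu = rng.normal(scale=0.1, size=(n_features, n_latent))
W_logvar = rng.normal(scale=0.1, size=(n_features, n_latent))
W_dec = rng.normal(scale=0.1, size=(n_latent, n_features))


def encode(x):
    """Map inputs to the mean and log-variance of a Gaussian over the latent space."""
    return x @ W_mu, x @ W_logvar


def sample_latent(mu, logvar):
    """Draw z ~ N(mu, exp(logvar)) as mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def decode(z):
    """Map latent codes to Bernoulli means in data space."""
    return 1.0 / (1.0 + np.exp(-(z @ W_dec)))


x = rng.integers(0, 2, size=(4, n_features)).astype(float)
mu, logvar = encode(x)
z = sample_latent(mu, logvar)
x_reconstructed = decode(z)

# Generating new data: sample z from the prior N(0, I) and decode it.
z_new = rng.standard_normal((3, n_latent))
x_new = decode(z_new)
```

With trained weights, `x_new` would hold the parameters of new data points; here it only demonstrates the data flow from latent space to data space.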
2. Generative Adversarial Networks (GAN): GANs consist of
a generator and a discriminator.
• The generator generates data samples, while the
 discriminator evaluates whether a given sample is real or
 generated.
• The training process involves an adversarial game
 between the generator and discriminator, leading to the
 generator learning to produce realistic data samples.
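The adversarial game can be made concrete by looking at the two losses involved. The sketch below assumes the discriminator outputs a probability that a sample is real and uses the commonly used non-saturating form of the generator loss; the function names and the example scores are hypothetical.

```python
import numpy as np


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real samples should score 1, generated samples 0."""
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator scores fakes near 1."""
    return -np.mean(np.log(d_fake))


# Hypothetical discriminator outputs (probability that a sample is real).
d_real = np.array([0.9, 0.8, 0.95])
d_fake_early = np.array([0.1, 0.2, 0.05])   # generated samples are easily caught
d_fake_late = np.array([0.5, 0.45, 0.55])   # generated samples fool the discriminator about half the time

print(discriminator_loss(d_real, d_fake_early), generator_loss(d_fake_early))
print(discriminator_loss(d_real, d_fake_late), generator_loss(d_fake_late))
```

As the generated samples become harder to tell apart from real ones, the generator loss falls while the discriminator loss rises, which is the opposing pressure the adversarial game relies on.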
  Boltzmann machine
• A Boltzmann machine is designed to learn a probability
 distribution over its set of inputs.
• There are three key concepts to know about Boltzmann
 machines:
1. Stochasticity: Unlike traditional deterministic neural
networks, Boltzmann machines incorporate randomness.
• The state of each neuron (node) in the network is
 determined probabilistically based on the states of the
 neighboring neurons and a temperature parameter.
2. Energy Function: The Boltzmann machine assigns an
energy to each possible state of the system.
Lower energy states are more probable. The energy function
typically involves weights between nodes and biases.
3. Equilibrium: The machine aims to reach a thermal
equilibrium where the distribution of states follows the
Boltzmann distribution.
This distribution specifies that the probability of a system
being in a certain state decreases exponentially with the
energy of that state.
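The three concepts above can be illustrated on a tiny network. The sketch assumes the standard energy E(s) = -1/2 s^T W s - b^T s with symmetric weights and a zero diagonal; the three-unit size, the parameter values, and the function names are hypothetical.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_units = 3

# Hypothetical parameters: symmetric weights with a zero diagonal, plus biases.
W = np.array([[0.0, 1.0, -0.5],
              [1.0, 0.0, 0.3],
              [-0.5, 0.3, 0.0]])
b = np.array([0.1, -0.2, 0.0])


def energy(s):
    """E(s) = -1/2 s^T W s - b^T s (the 1/2 avoids counting each pair twice)."""
    return -0.5 * s @ W @ s - b @ s


def boltzmann_distribution(temperature=1.0):
    """Exact p(s) proportional to exp(-E(s) / T) over all 2^n binary states."""
    states = np.array(list(itertools.product([0, 1], repeat=n_units)), dtype=float)
    energies = np.array([energy(s) for s in states])
    unnormalized = np.exp(-energies / temperature)
    return states, energies, unnormalized / unnormalized.sum()


def gibbs_step(s, temperature=1.0):
    """Resample every unit once, each conditioned on the current state of the others."""
    s = s.copy()
    for i in range(n_units):
        # Energy gap between unit i being off and on, given its neighbours.
        gap = W[i] @ s + b[i]
        p_on = 1.0 / (1.0 + np.exp(-gap / temperature))
        s[i] = float(rng.random() < p_on)
    return s


states, energies, probs = boltzmann_distribution()
sample = gibbs_step(np.zeros(n_units))
```

The state with the lowest energy receives the highest probability, and repeated calls to `gibbs_step` draw states that approach this distribution at equilibrium. Exact enumeration is only feasible for very small networks.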
• A Boltzmann machine is essentially a fully connected,
 two-layer neural network.
• These two layers are referred to as the visible and hidden
 layers.
• The visible layer is analogous to the input layer in
 feedforward neural networks.
• Although a Boltzmann machine has a hidden layer, that layer
 functions more as an output layer.
• There is no further hidden layer between the input and
 output layers.
• The basic units of a Boltzmann machine are binary
 neurons that can be in one of two states: on (1) or off (0).
• There are two types of units in a Boltzmann machine:
• Visible units: Correspond to the input data.
• Hidden units: Capture dependencies and abstract features
 that are not directly observed.
• Weights: Represent connections between pairs of units.
 The weights are symmetric (i.e., the weight from unit i to
 unit j is the same as from unit j to unit i).
• Biases: Represent the threshold for each unit.
• The figure below shows the structure of a very simple
 Boltzmann machine:
• The above Boltzmann machine has three hidden neurons and
 four visible neurons.
• A Boltzmann machine is fully connected because every neuron
 has a connection to every other neuron. However, no neuron is
 connected to itself.
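The connectivity just described maps directly onto a weight matrix. The sketch below builds one for four visible and three hidden neurons, as in the example above; the random initial values and variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 3
n_units = n_visible + n_hidden

# Draw random weights, then symmetrize them so that w_ij == w_ji.
raw = rng.normal(scale=0.1, size=(n_units, n_units))
W = (raw + raw.T) / 2.0
np.fill_diagonal(W, 0.0)   # no unit is connected to itself

# Every unordered pair of distinct units shares one connection.
n_connections = n_units * (n_units - 1) // 2
print(n_connections)  # 21 for 7 units
```

The count grows quadratically with the number of units, which is one reason fully connected Boltzmann machines become expensive to train.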
• Types of Boltzmann Machines
1. Restricted Boltzmann Machines (RBMs): A simplified
version of the Boltzmann machine where the network is
restricted to a bipartite graph, meaning there are no
connections within the visible units or the hidden units.
• The RBM shown in the figure below is not fully connected.
 Every hidden neuron is connected to every visible neuron.
• There are no connections among the hidden neurons, nor are
 there connections among the visible neurons.
2. Deep Belief Networks (DBNs): Composed of multiple layers of
RBMs. These networks can learn hierarchical representations of
the data.
3. Deep Boltzmann Machines (DBMs): A Deep Boltzmann Machine
(DBM) is an advanced type of Boltzmann machine designed to
model complex, high-dimensional data.
It extends the idea of a Restricted Boltzmann Machine (RBM) by
stacking multiple layers of hidden units, creating a deep
architecture that can capture intricate patterns and dependencies
in data.
  Restricted Boltzmann machine (RBM)
• A Restricted Boltzmann Machine (RBM) is a simplified
 version of a Boltzmann machine with certain restrictions
 that make it easier to train and more practical for many
 applications.
• Structure of Restricted Boltzmann Machines
• A Restricted Boltzmann Machine (RBM) is a generative,
 stochastic, two-layer artificial neural network that can
 learn a probability distribution over its set of inputs.
• Visible Units (V): These units represent the input data.
 The number of visible units corresponds to the number of
 features in the input data.
• Hidden Units (H): These units capture the dependencies
 and patterns in the input data. The number of hidden
 units is a hyperparameter that can be tuned.
• Weights (W): Each visible unit is connected to every
 hidden unit with a symmetric weight. The weight matrix
 W defines these connections.
• Biases: There are bias terms for both visible units (𝑎) and
 hidden units (𝑏). These biases help in adjusting the
 activation thresholds of the units.
• The restriction in a Restricted Boltzmann Machine is that
 there is no intra-layer communication (nodes of the same
 layer are not connected).
• Visible units are not connected to other visible units, and
 hidden units are not connected to other hidden units.
• This restriction allows for more efficient training
 algorithms than are available for the general class of
 Boltzmann machines.
• Energy function in RBM
• The energy of a configuration (a state of visible and hidden
  units) in an RBM is defined as:
• $E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j$
• where:
• $v_i$ is the state of visible unit $i$,
• $h_j$ is the state of hidden unit $j$,
• $a_i$ is the bias of visible unit $i$,
• $b_j$ is the bias of hidden unit $j$,
• $W_{ij}$ is the weight between visible unit $i$ and hidden unit $j$.
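As a worked example of the energy function above, the following sketch evaluates it for one configuration. The parameter values and the function name `rbm_energy` are hypothetical; the formula itself is the one given above, with a as visible biases and b as hidden biases.

```python
import numpy as np


def rbm_energy(v, h, a, b, W):
    """E(v, h) = -a.v - b.h - v^T W h for binary vectors v and h."""
    return -(a @ v) - (b @ h) - (v @ W @ h)


# Hypothetical parameters for 3 visible and 2 hidden units.
a = np.array([0.1, 0.0, -0.1])          # visible biases
b = np.array([0.2, -0.3])               # hidden biases
W = np.array([[0.5, -0.2],
              [0.1, 0.4],
              [-0.3, 0.6]])             # W[i, j] links visible i to hidden j

v = np.array([1.0, 0.0, 1.0])
h = np.array([1.0, 0.0])

print(rbm_energy(v, h, a, b, W))  # approximately -0.4
```

Here the visible bias terms cancel (0.1 - 0.1), the hidden bias contributes 0.2, and the interaction term contributes 0.5 - 0.3 = 0.2, giving an energy of about -0.4.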
• Probabilistic Activation
• The states of the units are binary (0 or 1) and are
 activated probabilistically based on their energies.
• The probability that a hidden unit $h_j$ is activated (i.e., set
 to 1) given the visible units $v$ is:
• $P(h_j = 1 \mid v) = \sigma\left(b_j + \sum_i v_i W_{ij}\right)$
• Similarly, the probability that a visible unit $v_i$ is activated
 given the hidden units $h$ is:
• $P(v_i = 1 \mid h) = \sigma\left(a_i + \sum_j W_{ij} h_j\right)$
• where $\sigma(x)$ is the logistic sigmoid function:
• $\sigma(x) = \frac{1}{1 + e^{-x}}$
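The sigmoid form of these conditionals follows from the energy function, and this can be checked numerically on a small RBM. The sketch below compares the closed-form expression with a brute-force calculation that enumerates every hidden state; the sizes, random parameters, and function names are hypothetical.

```python
import itertools
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def p_h_given_v(v, b, W):
    """P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i W_ij), for all j at once."""
    return sigmoid(b + v @ W)


def p_v_given_h(h, a, W):
    """P(v_i = 1 | h) = sigmoid(a_i + sum_j W_ij h_j), for all i at once."""
    return sigmoid(a + W @ h)


def p_h_given_v_brute_force(v, a, b, W):
    """Same quantity from the energy function, by enumerating every hidden state."""
    hidden_states = np.array(list(itertools.product([0, 1], repeat=len(b))), dtype=float)
    energies = np.array([-(a @ v) - (b @ h) - (v @ W @ h) for h in hidden_states])
    weights = np.exp(-energies)
    weights /= weights.sum()            # P(h | v) for each hidden state
    return weights @ hidden_states      # marginal P(h_j = 1 | v)


rng = np.random.default_rng(0)
a = rng.normal(size=3)
b = rng.normal(size=2)
W = rng.normal(size=(3, 2))
v = np.array([1.0, 0.0, 1.0])

closed_form = p_h_given_v(v, b, W)
enumerated = p_h_given_v_brute_force(v, a, b, W)
```

The two results agree because, with no hidden-to-hidden connections, the hidden units are conditionally independent given the visible units. That independence is what lets a whole layer be sampled in one step.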
 Training RBMs
• Training an RBM involves adjusting the weights and
 biases to minimize the difference between the observed
 data distribution and the distribution modeled by the
 RBM.
• The primary algorithm used for this purpose is
 Contrastive Divergence (CD).
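A minimal sketch of one common variant, CD-1 (a single Gibbs step per update), is given below. It assumes binary units, full-batch updates, and a toy dataset of two patterns; the learning rate, the sizes, and the function name `cd1_update` are hypothetical choices rather than prescribed values.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd1_update(v0, W, a, b, lr=0.1):
    """One CD-1 step on a batch v0 of shape (batch, n_visible); updates W, a, b in place."""
    # Positive phase: hidden probabilities and a sample, driven by the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step to get a reconstruction.
    pv1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)
    # Difference between data-driven and reconstruction-driven statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    a += lr * (v0 - v1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return np.mean((v0 - pv1) ** 2)   # reconstruction error, for monitoring only


# Toy data: two repeated binary patterns over 6 visible units.
patterns = np.array([[1, 1, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]], dtype=float)
data = patterns[rng.integers(0, 2, size=200)]

n_visible, n_hidden = 6, 2
W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
a = np.zeros(n_visible)
b = np.zeros(n_hidden)

errors = [cd1_update(data, W, a, b) for _ in range(500)]
```

The reconstruction error is tracked only as a rough progress signal; it is not the quantity that contrastive divergence optimizes.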
  Restricted Boltzmann machine (RBM)
• Working of a Restricted Boltzmann Machine
• An RBM uses two sets of biases:
• The hidden biases help the RBM produce the activations
 on the forward pass, while
• the visible layer’s biases help the RBM learn the
 reconstructions on the backward pass.
• Forward pass
• The following Figure shows the working of RBM in
 forward pass.
• The forward pass is the first step in training an RBM with
 multiple inputs.
• The inputs are multiplied by the weights and then added to the
 bias.
• The result is then passed through a sigmoid activation
 function and the output determines if the hidden state
 gets activated or not.
• Weights will be a matrix with the number of input nodes
 as the number of rows and the number of hidden nodes as
 the number of columns.
• The first hidden node receives the dot product of the input
 vector with the first column of weights, to which the
 corresponding bias term is then added.
• The sigmoid function is given by:
• $\sigma(x) = \frac{1}{1 + e^{-x}}$
• So the equation that we get in this step would be:
• $h^{(1)} = \sigma\left(W^\top v^{(0)} + b\right)$
• where $h^{(1)}$ and $v^{(0)}$ are the corresponding vectors (column
  matrices) for the hidden and the visible layers, with the
  superscript as the iteration ($v^{(0)}$ means the input that we
  provide to the network), and $b$ is the hidden layer bias
  vector.
• Backward pass
• The backward pass is the reverse or the reconstruction
 phase.
• It is similar to the first pass but in the opposite direction:
• $v^{(1)} = \sigma\left(W h^{(1)} + a\right)$
• where $v^{(1)}$ and $h^{(1)}$ are the corresponding vectors (column
 matrices) for the visible and the hidden layers, with the
 superscript as the iteration, and $a$ is the visible layer bias
 vector.
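The forward and backward passes can be traced with explicit array shapes. The sketch follows the earlier convention that a holds the visible biases and b the hidden biases; the sizes, the random weights, and the input vector are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


n_visible, n_hidden = 4, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # rows: inputs, columns: hidden nodes
a = np.zeros(n_visible)   # visible biases
b = np.zeros(n_hidden)    # hidden biases

v0 = np.array([1.0, 0.0, 1.0, 1.0])   # the input presented to the network

# Forward pass: h(1) = sigmoid(W^T v(0) + b), then sample binary hidden states.
p_h1 = sigmoid(W.T @ v0 + b)
h1 = (rng.random(n_hidden) < p_h1).astype(float)

# The first hidden node uses only the first column of W.
first_hidden = sigmoid(v0 @ W[:, 0] + b[0])

# Backward pass: v(1) = sigmoid(W h(1) + a), the reconstruction of the input.
v1 = sigmoid(W @ h1 + a)

reconstruction_error = np.mean((v0 - v1) ** 2)
```

The same weight matrix is used in both directions, transposed on the forward pass, so the only parameters that differ between the passes are the two bias vectors.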
  Applications of RBM
• RBMs have been used in various applications, including:
1. Dimensionality Reduction: Learning compact
   representations of data.
2. Feature Learning: Extracting useful features from raw
   data.
3. Collaborative Filtering: Building recommendation
   systems.
4. Pre-training Deep Networks: Initializing the weights of
   deep networks in a layer-wise manner.
  Deep Belief Neural Networks
• A Restricted Boltzmann Machine (RBM) is a type of
 generative stochastic artificial neural network that can
 learn a probability distribution from its inputs.
• Deep belief networks, in particular, can be created by
 “stacking” RBMs and fine-tuning the resulting deep
 network via gradient descent and backpropagation.
• DBNs belong to the family of unsupervised learning
 algorithms and are known for their ability to learn
 hierarchical representations from data.
• DBNs differ in operation from autoencoders and RBMs. Those
 work directly with raw input data, whereas a DBN starts
 from an input layer with one neuron for each component of
 the input vector and passes through several layers before
 arriving at the final layer.
• The final outputs are produced using probabilities
 acquired from earlier layers.
• The Architecture of DBN
• The top two layers form the associative memory, and the
 bottom layer holds the visible units.
• The arrows pointing towards the layer closest to the data
 indicate the directed relationships between all lower layers.
• Directed acyclic connections in the lower layers translate
 associative memory to observable variables.
• The lowest layer of visible units receives input data as binary or
 real-valued data.
• As in an RBM, there are no intralayer connections in a DBN.
• The hidden units represent features that encapsulate the data’s
 correlations.
• A matrix of symmetric weights W connects each pair of adjacent layers.
• The “Input Layer” represents the initial layer, which has one neuron
 for each component of the input vector.
• “Hidden Layer 1” is the first Restricted Boltzmann Machine
 (RBM) layer, which learns the fundamental structure of the data.
• “Hidden Layer 2” and subsequent layers are additional RBMs
 that learn higher-level features as we move through the
 network.
• We can have multiple hidden layers depending on the
 complexity of the task.
• “Output Layer” is used for supervised learning tasks like
 classification or regression.
• The arrows indicate the flow of information from one layer to
 the next, and the connections between neurons in adjacent
 layers represent the weights that are learned during training.
• Training the RBMs:
• One of the unique aspects of DBNs is that each RBM is
 trained independently using a technique called
 contrastive divergence.
• This method allows us to approximate the gradient of the
 log-likelihood of the data with respect to the RBM’s
 parameters.
• After training, the output of one RBM becomes the input
 for the next, creating a stacked structure of RBMs.
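The stacking procedure described above can be sketched as greedy layer-wise training. This is an illustration under assumptions: it uses a simplified CD-1 update that reconstructs with probabilities instead of sampled visible states, random toy data, and hypothetical layer sizes and function names.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_rbm(data, n_hidden, epochs=50, lr=0.1):
    """Train one RBM with CD-1 and return its parameters (W, a, b)."""
    n_visible = data.shape[1]
    W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
    a, b = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        ph0 = sigmoid(data @ W + b)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        v1 = sigmoid(h0 @ W.T + a)          # reconstruction probabilities
        ph1 = sigmoid(v1 @ W + b)
        W += lr * (data.T @ ph0 - v1.T @ ph1) / len(data)
        a += lr * (data - v1).mean(axis=0)
        b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b


def pretrain_dbn(data, layer_sizes):
    """Greedy layer-wise training: each RBM is trained on the previous RBM's output."""
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        W, a, b = train_rbm(layer_input, n_hidden)
        rbms.append((W, a, b))
        layer_input = sigmoid(layer_input @ W + b)   # hidden activations feed the next RBM
    return rbms, layer_input


data = rng.integers(0, 2, size=(100, 8)).astype(float)
rbms, top_features = pretrain_dbn(data, layer_sizes=[6, 4, 2])
```

Each RBM sees only the activations of the layer below it, so the layers can be trained one after another without backpropagating through the whole stack.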
• Fine-Tuning for Supervised Learning:
• After the DBN has been assembled through the training of its
 RBMs, it can be fine-tuned for supervised learning tasks.
• This fine-tuning process entails adjusting the weights of the
 final layer using supervised learning techniques like
 backpropagation.
• DBNs have gained popularity for their performance across
 various applications.
• They have been applied to image recognition, speech
 recognition, and natural language processing, with strong
 reported results.
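The fine-tuning step described above, in which the final layer is adjusted with supervised learning, can be sketched as follows. The pretrained layer is replaced here by random stand-in weights, and the toy labels, learning rate, and variable names are all hypothetical; only the output layer is trained in this simplified version.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Stand-in for a pretrained DBN layer (random here; in practice it comes from RBM training).
W_pre = rng.normal(size=(4, 3))
b_pre = np.zeros(3)

# Toy labelled data: the label depends on the first input feature.
X = rng.integers(0, 2, size=(200, 4)).astype(float)
y = X[:, 0]

features = sigmoid(X @ W_pre + b_pre)   # output of the pretrained layer, kept fixed

# Output layer: a single logistic unit trained by gradient descent on cross-entropy.
w_out = np.zeros(3)
b_out = 0.0
lr = 0.5


def cross_entropy(p, y):
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))


initial_loss = cross_entropy(sigmoid(features @ w_out + b_out), y)
for _ in range(2000):
    p = sigmoid(features @ w_out + b_out)
    grad = p - y                              # gradient of the loss w.r.t. the pre-activation
    w_out -= lr * features.T @ grad / len(y)
    b_out -= lr * grad.mean()
final_loss = cross_entropy(sigmoid(features @ w_out + b_out), y)
```

Extending the same gradient through the earlier layers would fine-tune the whole network with backpropagation instead of the output layer alone.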
• One of the main advantages of DBNs is their ability to
 learn features from the data in an unsupervised manner.
• Further advantages:
1. DBNs can also learn a hierarchical representation of the
 data, with each layer learning increasingly sophisticated
 features from lower layers to higher layers.
2. DBNs have proven to be resistant to overfitting, owing to
 model regularisation and to needing only a small amount of
 labelled data during the fine-tuning phase.
3. DBNs can handle missing data, which matters because in
 many real-world applications some data is frequently
 corrupted or absent.