Unit 5 Notes
Av = λv
In other words, when you multiply the matrix A by the eigenvector v, you get a new vector
that is just a scaled version of v (scaled by the eigenvalue λ).
Rearranging gives the equation (A - λI)v = 0, where
A is the square matrix for which you want to find eigenvalues and eigenvectors,
λ is the eigenvalue you're trying to find,
I is the identity matrix (a diagonal matrix with 1s on the diagonal and 0s elsewhere), and
v is the eigenvector you're trying to find.
Solving this equation involves finding the values of λ that make the matrix (A - λI) singular
(i.e., its determinant is zero), and then finding the corresponding eigenvectors v.
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
NCR CAMPUS, MODINAGAR
DEPARTMENT OF MCA
UDS21401J - DEEP LEARNING FOR ENTERPRISE
UNIT-V
Eigenvalue 1 (λ₁) = 5
Eigenvalue 2 (λ₂) = 1
Eigenvector 1 (v₁) = [0.8, 0.6]
Eigenvector 2 (v₂) = [-0.6, 0.8]
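These values can be verified numerically. The matrix below is a hypothetical 2×2 symmetric matrix constructed to have exactly the eigenpairs listed above (assuming NumPy is available):

```python
import numpy as np

# Hypothetical symmetric matrix whose eigenpairs match the values above:
# eigenvalues 5 and 1, eigenvectors [0.8, 0.6] and [-0.6, 0.8].
A = np.array([[3.56, 1.92],
              [1.92, 2.44]])

eigenvalues, eigenvectors = np.linalg.eig(A)  # eigenvectors are the COLUMNS

# Verify A v = lambda v for each eigenpair
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)

print(sorted(eigenvalues.round(6).tolist()))  # [1.0, 5.0]
```

Note that np.linalg.eig may return the eigenvalues in any order and may flip the sign of an eigenvector; both (v) and (-v) are valid eigenvectors for the same λ.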
A generative adversarial network (GAN) is a deep learning architecture. It trains two neural networks
to compete against each other to generate more authentic new data from a given training dataset. For
instance, you can generate new images from an existing image database or original music from a
database of songs. A GAN is called adversarial because it trains two different networks and pits them
against each other. One network generates new data by taking an input data sample and modifying it
as much as possible. The other network tries to predict whether the generated data output belongs in
the original dataset. In other words, the predicting network determines whether the generated data is
fake or real. The system generates newer, improved versions of fake data values until the predicting
network can no longer distinguish fake from original.
Starting from random noise samples, the Generator attempts to fool the Discriminator, which
is tasked with accurately distinguishing generated data from genuine data. This competitive
interaction drives both networks to improve, and the result is realistic, high-quality samples.
GANs are proving to be highly versatile artificial intelligence tools, as evidenced by their
extensive use in image synthesis, style transfer, and text-to-image synthesis.
They have also revolutionized generative modeling.
Through adversarial training, these models engage in a competitive interplay until the generator
becomes adept at creating realistic samples, fooling the discriminator approximately half the time.
Generative Adversarial Networks (GANs) can be broken down into three parts:
Generative: To learn a generative model, which describes how data is generated in terms of a
probabilistic model.
Adversarial: The word adversarial refers to setting one thing up against another. This means that, in
the context of GANs, the generative result is compared with the actual images in the data set. A
mechanism known as a discriminator is used to apply a model that attempts to distinguish between
real and fake images.
Networks: Use deep neural networks as artificial intelligence (AI) algorithms for training purposes.
Types of GANs
Vanilla GAN: This is the simplest type of GAN. Here, the Generator and the Discriminator are
simple multi-layer perceptrons. The algorithm is straightforward: it tries to optimize the
minimax objective using stochastic gradient descent.
Conditional GAN (CGAN): CGAN can be described as a deep learning method in which some
conditional parameters are put into place. In CGAN, an additional parameter ‘y’ is added to the
Generator for generating the corresponding data. Labels are also put into the input to the
Discriminator in order for the Discriminator to help distinguish the real data from the fake generated
data.
Deep Convolutional GAN (DCGAN): DCGAN is one of the most popular and also the most
successful implementations of GAN. It is composed of ConvNets in place of multi-layer perceptrons.
The ConvNets are implemented without max pooling, which is replaced by strided convolutions.
Also, the layers are not fully connected.
Laplacian Pyramid GAN (LAPGAN): The Laplacian pyramid is a linear invertible image
representation consisting of a set of band-pass images, spaced an octave apart, plus a low-frequency
residual. This approach uses multiple Generator and Discriminator networks at different levels of
the Laplacian pyramid. It is mainly used because it produces very high-quality images. The image
is first down-sampled at each layer of the pyramid and then up-scaled again at each layer in a
backward pass, where the image acquires some noise from the Conditional GAN at these layers until
it reaches its original size.
Super Resolution GAN (SRGAN): SRGAN, as the name suggests, is a way of designing a GAN in
which a deep neural network is used along with an adversarial network in order to produce higher-
resolution images. This type of GAN is particularly useful in optimally up-scaling native low-
resolution images to enhance their details while minimizing errors.
Technically, the GAN works as follows. The full process is driven by a mathematical objective,
but the steps below are a simplified overview:
1. The generator neural network analyzes the training set and identifies data attributes
2. The discriminator neural network also analyzes the initial training data and distinguishes
between the attributes independently
3. The generator modifies some data attributes by adding noise (or random changes) to certain
attributes
4. The generator passes the modified data to the discriminator
5. The discriminator calculates the probability that the generated output belongs to the original
dataset
6. The discriminator gives some guidance to the generator to reduce the noise vector
randomization in the next cycle
The generator attempts to maximize the probability that the discriminator makes a mistake, while
the discriminator attempts to minimize its probability of error. Over the training iterations, both
the generator and discriminator evolve and confront each other continuously until they reach an
equilibrium state, in which the discriminator can no longer recognize synthesized data. At this
point, the training process is over.
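The minimax game described above can be illustrated with a small numerical sketch of the two standard loss terms. The logits below are hypothetical values, not outputs of a trained network; this is not a full training loop, only the objective each network optimizes:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical discriminator scores (logits): higher means "looks real"
real_logits = np.array([2.0, 1.5, 3.0])    # scores on real samples
fake_logits = np.array([-1.0, 0.5, -2.0])  # scores on generated samples

# Discriminator loss: classify real samples as 1 and generated samples as 0
d_loss = (-np.mean(np.log(sigmoid(real_logits)))
          - np.mean(np.log(1.0 - sigmoid(fake_logits))))

# Generator loss (common non-saturating form): make fakes look real
g_loss = -np.mean(np.log(sigmoid(fake_logits)))

print(f"d_loss={d_loss:.3f}, g_loss={g_loss:.3f}")
```

In real training, each step would compute gradients of these losses and update the discriminator and generator weights alternately until the equilibrium described above is reached.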
Chain rule: The chain rule is a fundamental concept in calculus that is used to compute the gradients
of a function composed of multiple nested functions. In the context of neural networks, the chain rule
is used to compute the gradients of the loss function with respect to the weights of the network.
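As a minimal sketch, the chain rule for a composition f(g(x)) gives the derivative f′(g(x))·g′(x), which can be checked against a numerical finite difference:

```python
import math

# Chain rule sketch: d/dx sin(x^2) = cos(x^2) * 2x
def f(x):   # outer function
    return math.sin(x)

def g(x):   # inner function
    return x * x

x = 1.3
analytic = math.cos(g(x)) * 2 * x  # f'(g(x)) * g'(x) by the chain rule

h = 1e-6
numeric = (f(g(x + h)) - f(g(x - h))) / (2 * h)  # central difference
assert abs(analytic - numeric) < 1e-5
```

Backpropagation applies exactly this rule repeatedly, layer by layer, to propagate the loss gradient back to every weight in the network.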
Gradient descent: Gradient descent is a method used to update the weights of the network using the
gradients computed in the previous step. There are several variants of gradient descent, including
stochastic gradient descent, which uses random subsets of the training data to compute the gradients
and update the weights.
Activation functions: Activation functions are used to introduce nonlinearity into the network and
make it capable of modeling complex, nonlinear relationships. Common activation functions include
sigmoid, tanh, and ReLU.
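The three activation functions named above can be sketched directly:

```python
import math

def sigmoid(x):
    """Squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real input into (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Passes positive values through; zeroes out negatives."""
    return max(0.0, x)

assert sigmoid(0.0) == 0.5
assert tanh(0.0) == 0.0
assert relu(-3.0) == 0.0 and relu(2.5) == 2.5
```

Without such nonlinearities, a stack of layers would collapse into a single linear transformation, no matter how deep the network.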
Regularization: Regularization techniques are used to prevent overfitting, which can occur when the
network becomes too complex and fits the training data too closely. Common regularization
techniques include L1 and L2 regularization, which add penalties to the loss function based on the
magnitudes of the weights.
Dropout: Dropout is a technique used to prevent overfitting by randomly dropping out (setting to
zero) some of the activations in the network during training. This helps to prevent the network from
becoming too reliant on any one set of features.
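A minimal sketch of one common variant, "inverted" dropout, which rescales the surviving activations so the expected activation is unchanged (a simplified illustration, not a framework implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each activation with probability p,
    then rescale survivors by 1/(1-p) so the expected value is unchanged."""
    if not training:
        return activations  # at inference, dropout is disabled
    mask = (rng.random(activations.shape) >= p).astype(activations.dtype)
    return activations * mask / (1.0 - p)

a = np.ones(10)
out = dropout(a, p=0.5)
# Each unit is either zeroed or scaled by 1/(1-p) = 2.0
assert set(np.unique(out)).issubset({0.0, 2.0})
```

Because the rescaling happens during training, no adjustment is needed at inference time; the layer simply passes activations through.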
Batch normalization: Batch normalization is a technique used to improve the stability and speed of
training by normalizing the activations in each batch of data before they are input to the next layer.
By using these techniques, we can train deep neural networks that are capable of modeling complex,
high-dimensional relationships between inputs and outputs. These models can be used for a wide
range of applications, including image and speech recognition, natural language processing, and
recommendation systems.
Input layer: It contains the neurons that receive input. The data is subsequently passed on to the
next layer. The input layer's total number of neurons is equal to the number of variables in the dataset.
Hidden layer: This is the intermediate layer, concealed between the input and output layers.
This layer has a large number of neurons that perform transformations on the inputs and then
communicate with the output layer.
Output layer: It is the last layer, and its form depends on the model's construction. The output
layer produces the predicted feature, corresponding to the desired outcome.
Neuron weights: Weights describe the strength of a connection between neurons. A weight can
take any real value; weights are typically initialized to small random values and then adjusted
during training by gradient descent, a method for improving performance by making minor
adjustments to weights and biases using a smooth cost function.
In its most basic form, a Feed-Forward Neural Network is a single-layer perceptron. A sequence of
inputs enters the layer and is multiplied by the weights in this model. The weighted input values are
then summed together to form a total. If the sum of the values is more than a predetermined threshold,
which is normally set at zero, the output value is usually 1, and if the sum is less than the threshold,
the output value is usually -1. The single-layer perceptron is a popular feed-forward neural
network model that is frequently used for classification, and its weights can be learned from
data using machine-learning techniques.
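The thresholding rule described above can be sketched as follows (the weights are hypothetical illustrative values):

```python
def perceptron(inputs, weights, threshold=0.0):
    """Single-layer perceptron: weighted sum, then a sign-style threshold.
    Output is 1 if the sum exceeds the threshold, otherwise -1."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else -1

# Hypothetical weights, acting as a simple linear classifier
assert perceptron([1.0, 1.0], [0.6, 0.6]) == 1    # sum 1.2 > 0
assert perceptron([1.0, -1.0], [0.6, 0.9]) == -1  # sum -0.3 < 0
```

Training a perceptron amounts to adjusting the weights so that this thresholded sum assigns the correct label to each training example.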
Batch Normalization
Batch Normalization is a technique used to improve the training of deep neural networks. Introduced
by Sergey Ioffe and Christian Szegedy in 2015, batch normalization is used to normalize the inputs of
each layer in such a way that they have a mean output activation of zero and a standard deviation of
one. This normalization process helps to combat issues that deep neural networks face, such as
internal covariate shift, which can slow down training and affect the network's ability to generalize
from the training data.
Internal covariate shift refers to the change in the distribution of a layer's inputs during training
as the parameters of earlier layers change, and it particularly affects deeper layers. This can result
in the need for lower learning rates and careful parameter initialization, making the training
process slow and less efficient.
Batch normalization works by normalizing the output of a previous activation layer by subtracting the
batch mean and dividing by the batch standard deviation. After this step, the result is then scaled and
shifted by two learnable parameters, gamma, and beta, which are unique to each layer. This process
allows the model to maintain the mean activation close to 0 and the activation standard deviation
close to 1.
1. Calculate the mean and variance of the activations for each feature in a mini-batch.
2. Normalize the activations of each feature by subtracting the mini-batch mean and dividing by
the mini-batch standard deviation.
3. Scale and shift the normalized values using the learnable parameters gamma and beta, which
allow the network to undo the normalization if that is what the learned behavior requires.
Batch normalization is typically applied before the activation function in a network layer, although
some variations may apply it after the activation function.
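The three steps above can be sketched as a forward pass over a toy mini-batch (gamma and beta are set to their identity values here, so only the normalization is visible):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Batch-normalize a mini-batch x of shape (batch, features)."""
    mean = x.mean(axis=0)                     # step 1: per-feature mean
    var = x.var(axis=0)                       # step 1: per-feature variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # step 2: normalize
    return gamma * x_hat + beta               # step 3: scale and shift

# Toy mini-batch: two features on very different scales
x = np.array([[1.0, 50.0],
              [2.0, 60.0],
              [3.0, 70.0]])
out = batch_norm_forward(x, gamma=np.ones(2), beta=np.zeros(2))

# Each feature now has (approximately) zero mean and unit variance
assert np.allclose(out.mean(axis=0), 0.0, atol=1e-7)
assert np.allclose(out.var(axis=0), 1.0, atol=1e-3)
```

During training, a real implementation would also maintain running averages of `mean` and `var` for use at inference, as discussed below.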
Improved Optimization: It allows the use of higher learning rates, speeding up the training process
by reducing the careful tuning of parameters.
Regularization: It adds a slight noise to the activations, similar to dropout. This can help to regularize
the model and reduce overfitting.
Reduced Sensitivity to Initialization: It makes the network less sensitive to the initial starting
weights.
Allows Deeper Networks: By reducing internal covariate shift, batch normalization allows for the
training of deeper networks.
While batch normalization is straightforward to apply during training, it requires special consideration
during inference. Since the mini-batch mean and variance are not available during inference, the
network uses the moving averages of these statistics that were computed during training. This ensures
that the normalization is consistent and the network's learned behavior is maintained.
The effectiveness of batch normalization can depend on the size of the mini-batch. Very small batch
sizes can lead to inaccurate estimates of the mean and variance, which can destabilize the training
process.
Sequence Data: Applying batch normalization to recurrent neural networks and other architectures
that handle sequence data can be less straightforward and may require alternative approaches.
Regularization
Regularization refers to techniques used to calibrate machine learning models in order to
minimize the adjusted loss function and prevent overfitting or underfitting. Using regularization,
we can fit our machine learning model appropriately and hence reduce its errors on unseen test data.
Regularization Techniques
There are two main types of regularization techniques: Ridge Regularization and Lasso
Regularization.
1. Ridge Regularization:
A regression model that uses the L2 regularization technique is called Ridge regression.
Ridge regression adds the “squared magnitude” of the coefficients as a penalty term to the loss
function (L): it modifies over-fitted or under-fitted models by adding a penalty equivalent to
the sum of the squares of the coefficient magnitudes. The cost function of ridge regression is:

Loss = Σᵢ (yᵢ − ŷᵢ)² + λ Σⱼ wⱼ²

In the cost function, the penalty term is controlled by lambda (λ). The higher the penalty, the
smaller the magnitudes of the coefficients become; ridge regression performs regularization by
shrinking the coefficients. It is therefore used to prevent multicollinearity, and it reduces
model complexity through coefficient shrinkage.
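As a sketch of coefficient shrinkage, ridge regression has the closed-form solution w = (XᵀX + λI)⁻¹Xᵀy. On a hypothetical toy dataset, increasing λ shrinks the fitted coefficients:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])            # hypothetical true coefficients
y = X @ true_w + rng.normal(scale=0.1, size=50)

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam*I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

w_ols = ridge_fit(X, y, lam=0.0)      # lam=0 recovers ordinary least squares
w_ridge = ridge_fit(X, y, lam=100.0)  # a heavy penalty shrinks the coefficients

assert np.linalg.norm(w_ridge) < np.linalg.norm(w_ols)  # shrinkage in action
```

Lasso (below) has no such closed form because the absolute-value penalty is not differentiable at zero; it is typically fitted with coordinate descent instead.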
2. Lasso Regression: A regression model which uses the L1 regularization technique is called
LASSO (Least Absolute Shrinkage and Selection Operator) regression. Lasso regression adds
the “absolute value of magnitude” of the coefficients as a penalty term to the loss function (L).
It modifies over-fitted or under-fitted models by adding a penalty equivalent to the sum of the
absolute values of the coefficients. Unlike ridge regression, which squares the coefficient
magnitudes, lasso uses their absolute values; as a result, lasso can shrink some coefficients
exactly to zero, performing feature selection by removing features that do not serve any purpose
in the model. The cost function for lasso regression is:

Loss = Σᵢ (yᵢ − ŷᵢ)² + λ Σⱼ |wⱼ|
4. Data augmentation: This technique enlarges the training set by applying simple
transformations to existing examples, such as rotating or adding a small amount of noise. The
idea is to artificially create more data in the hope that the augmented dataset will be a better
representation of the underlying hidden distribution. Since we are still limited to the available
dataset, this method generally does not work very well as a regularizer on its own.
5. Dropout: Dropout is used when the training model is a neural network. A neural network
consists of multiple hidden layers, where the output of one layer is used as input to the
subsequent layer. The subsequent layer transforms the input through learnable parameters
(usually by multiplying it by a matrix and adding a bias, followed by an activation function).
The input flows through the neural network layers until it reaches the final output layer,
which is used for prediction. Dropout randomly removes (sets to zero) a fraction of the units
in these layers during each training step, which prevents the network from relying too heavily
on any one unit.
6. Early stopping: It is an optimization technique used to reduce overfitting without
compromising on model accuracy. The main idea behind early stopping is to stop training
before a model starts to overfit.
Regularization works by adding a penalty term to the model’s loss function, which constrains large
parameter values. This constraint on parameter values helps prevent overfitting by reducing the
model’s complexity and promoting better generalization to new data.
Regularization helps to prevent overfitting by adding constraints that keep the model from
becoming too complicated and fitting the training data too closely. This makes the model better
at making predictions on new data.
Gradient descent: Gradient descent is a simple optimizer that computes the gradients of the loss
function with respect to the weights of the network and updates the weights in the opposite direction
of the gradients. This process is repeated iteratively until the loss is minimized. However, gradient
descent can be slow to converge, especially for large datasets and complex models.
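A minimal sketch of the update rule on a one-dimensional quadratic (the function and learning rate are illustrative choices):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Plain gradient descent: repeatedly move opposite the gradient."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # the core update rule
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2*(x - 3); minimum at x = 3
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
assert abs(x_min - 3.0) < 1e-6
```

Stochastic gradient descent follows the same rule but estimates the gradient from a random mini-batch of the training data at each step instead of the full dataset.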
AdaGrad: AdaGrad adapts the learning rate of each parameter individually, scaling updates by the
accumulated history of squared gradients so that frequently updated parameters take smaller steps
along their gradient directions. This may result in quicker convergence and improved generalization.
However, this method has significant downsides. One of the most significant concerns is that the
accumulated gradient magnitudes grow over time, resulting in a vanishingly small effective learning
rate that can inhibit further learning. Adam and RMSProp, two contemporary optimization algorithms,
combine an adaptive learning rate with other strategies to limit the growth of gradient magnitudes
over time.
Visualization of AdaGrad
The optimization process is visualized by plotting the cost function and the trajectory of the
parameters during the optimization. The cost function is plotted by evaluating the function f(x) on a
grid of points and using the contour function to plot the contours of the cost function. The trajectory
of the parameters is plotted by storing the values of x at each iteration and using the plot function to
plot the path of x. Finally, the plot is displayed using the show function.
Adadelta Algorithm
Adadelta (or “ADADELTA”) is an extension to the gradient descent optimization algorithm. Adadelta
is designed to accelerate the optimization process, e.g. decrease the number of function evaluations
required to reach the optimum, or to improve the capability of the optimization algorithm, e.g. result in a
better final result. It is best understood as an extension of the AdaGrad and RMSProp algorithms.
AdaGrad is an extension of gradient descent that calculates a step size (learning rate) for each
parameter for the objective function each time an update is made. The step size is calculated by first
summing the partial derivatives for the parameter seen so far during the search, then dividing the
initial step size hyperparameter by the square root of the sum of the squared partial derivatives. The
idea behind Adadelta is that instead of summing all the past squared gradients from time step 1 to t,
we restrict the window: for example, averaging the squared gradients over only the past 10 steps.
This can be achieved using an exponentially weighted average over the gradients.
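The contrast between AdaGrad's growing sum and the exponentially weighted average used by Adadelta/RMSProp can be sketched on a toy stream of constant gradients (all values here are illustrative):

```python
import math

grads = [1.0] * 200          # hypothetical constant gradient stream
eps, lr, rho = 1e-8, 0.1, 0.9

# AdaGrad: accumulate ALL squared gradients -> the step size keeps shrinking
acc = 0.0
adagrad_steps = []
for g in grads:
    acc += g * g
    adagrad_steps.append(lr * g / (math.sqrt(acc) + eps))

# Adadelta/RMSProp-style: exponentially weighted average of squared gradients
ewa = 0.0
ewa_steps = []
for g in grads:
    ewa = rho * ewa + (1 - rho) * g * g   # recent gradients dominate
    ewa_steps.append(lr * g / (math.sqrt(ewa) + eps))

assert adagrad_steps[-1] < adagrad_steps[0]  # AdaGrad's step keeps decaying
assert abs(ewa_steps[-1] - lr) < 1e-3        # the EWA-based step stabilizes
```

Because the exponentially weighted average forgets old gradients, the effective learning rate stabilizes instead of shrinking toward zero, which is exactly the problem with AdaGrad that Adadelta set out to fix.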
Adam optimizer
Adam optimizer is by far one of the most preferred optimizers. The idea behind Adam optimizer is to
utilize the momentum concept from “SGD with momentum” and the adaptive learning rate from
Adadelta.
RMSprop: RMSprop is an optimization algorithm that uses adaptive learning rates for each weight
based on the average of the squares of the previous gradients. Tuning the RMSprop hyperparameters
involves finding values for the learning rate and decay rate that allow the optimizer to converge
quickly and reach a good solution.
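A minimal sketch of the Adam update, combining the two moment estimates with bias correction (hyperparameter values follow the common defaults; the test function is an illustrative quadratic):

```python
import math

def adam_minimize(grad, x0, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    """Adam: momentum (first moment m) plus an adaptive per-parameter
    learning rate (second moment v), with bias correction."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first moment (momentum)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (adaptive rate)
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2*(x - 3)
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
assert abs(x_min - 3.0) < 0.1
```

The `v` accumulator here is exactly the RMSProp-style exponentially weighted average of squared gradients; Adam adds the momentum term `m` and the bias corrections on top of it.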
Data Segmentation: Data segmentation is the process of breaking down a dataset into discrete
groups according to specific standards or attributes. These subsets can be identified by several criteria,
including behavior, demographics, or certain dataset features. Enabling more focused analysis and
modeling to produce better results is the main goal of data segmentation.
Segmentation plays a critical role in machine learning by enhancing the quality of data analysis and
model performance. Here’s why segmentation is important in the context of machine learning:
Improved Model Accuracy: Segmentation allows machine learning models to focus on specific
subsets of data, which often leads to more accurate predictions or classifications. By training models
on segmented data, they can capture nuances and patterns specific to each segment, resulting in better
overall performance.
Customized Solutions: Segmentation makes it easier to create strategies and solutions that are
specific to certain dataset segments. Personalized techniques have been shown to considerably
improve outcomes in a variety of industries, including marketing, healthcare, and finance. Segmented
patient data, for instance, enables customized treatment programs and illness management techniques
in the healthcare industry.
Optimized Resource Allocation: By segmenting data, organizations can allocate resources more
efficiently. For instance, in marketing campaigns, targeting specific customer segments with tailored
messages or offers can maximize the return on investment by focusing resources where they are most
likely to yield results.
Effective Risk Management: Segmentation aids in identifying high-risk segments within a dataset,
enabling proactive risk assessment and mitigation strategies. This is particularly crucial in fields like
finance and insurance, where accurately assessing risk can prevent financial losses.
Customer Segmentation: Companies employ segmentation to put customers into groups according to
their preferences, buying habits, or demographics. This allows for more individualized advice,
focused marketing strategies, and happier customers.
Image Segmentation: Image segmentation is a technique used in computer vision to divide images
into objects or meaningful regions. This makes tasks like scene comprehension, object detection,
and image classification possible.
Text Segmentation: Text segmentation in natural language processing is the process of breaking text
up into smaller chunks, like phrases, paragraphs, or subjects. This makes information retrieval,
sentiment analysis, and document summarization easier.
Healthcare Segmentation: To determine risk factors, forecast disease outcomes, and customize
treatment regimens, healthcare practitioners divide up patient data into smaller groups. Better patient
care and medical decision-making result from this.
Financial Segmentation: To provide specialized financial goods and services, banks and other
financial organizations divide up their clientele into groups according to credit risk, income levels,
and spending patterns. This aids in risk management and profitability maximization.
Problem Statements
While machine learning is extensively used across industries to make data-driven decisions, its
implementation faces many problems that must be addressed. Here is a list of the most common
machine learning challenges organizations encounter when incorporating ML into their operations.
4. Delayed Implementation
ML models offer efficient results but consume a lot of time due to data overload, slow programs, and
excessive requirements. Additionally, they demand timely monitoring and maintenance to deliver the
best output.
Some problem statements that are often addressed by machine learning or data analytics:
Predictive Maintenance: How can machine learning algorithms be used to predict when equipment
or machinery will require maintenance, to minimize downtime and optimize operations?
Fraud Detection: How can data analytics be used to identify fraudulent transactions or behavior,
prevent financial losses, and protect customers?
Personalized Recommendations: How can machine learning be used to analyze customer data and
provide personalized recommendations, to improve customer experience and increase sales?
Image/Text/Speech Recognition: How can machine learning be used to recognize and classify
images, text, or speech, in order to enable applications such as autonomous vehicles, virtual assistants,
or medical diagnosis?
Forecasting: How can machine learning or data analytics be used to forecast future trends or events,
in order to enable better decision-making and planning?
Sentiment Analysis: How can machine learning be used to analyze customer sentiment, in order to
better understand customer needs and preferences, and improve brand reputation?
Data Engineering: Integrating ML techniques into data engineering processes enhances data
processing and analysis. Data engineering is the complex task of making raw data usable to data
scientists and groups within an organization, and it encompasses numerous specialties of data
science. Incorporating machine learning within the realm of data engineering involves a seamless
fusion of essential processes: data preprocessing and the construction of effective data pipelines.
Effectively integrating these components empowers data engineers to harness the power of machine
learning to derive insights and predictions from complex datasets, fostering a synergistic relationship
between the fields of data engineering and machine learning.
Model Selection
Model selection is an essential phase in the development of powerful and precise predictive models in
the field of machine learning. Model selection is the process of deciding which algorithm and model
architecture is best suited for a particular task or dataset. It entails contrasting various models,
assessing their efficacy, and choosing the one that most effectively addresses the issue at hand. The
choice of an appropriate machine learning model is crucial since models differ in complexity,
underlying assumptions, and capabilities. A model that performs effectively on a single dataset or
problem may not generalize as well to new, untested data. Finding the right balance between model
complexity and generalization is therefore key to model selection.

Choosing a model often entails a number of steps. The first step is to define a suitable evaluation
metric that matches the objectives of the particular situation. Depending on the nature of the
problem, this metric may be precision, recall, accuracy, F1-score, or any other relevant measure.
Candidate models are then selected in accordance with the problem at hand and the data that are
accessible. These models might be as straightforward as decision trees or linear regression, or as
sophisticated as deep neural networks, random forests, or support vector machines. During the
selection process, it is important to take into account the assumptions, constraints, and
hyperparameters that are unique to each model.

Using a suitable methodology, such as cross-validation, the candidate models are trained and
evaluated after being selected. To do this, the available data are divided into training and
validation sets, with each model fitted on the training set before being evaluated on the validation
set. The models are compared using their performance metrics, and the model with the highest
performance is chosen.

Model selection is a continuous process, though. It frequently calls for an iterative process of
testing several models and hyperparameters. This iteration improves the models and helps in
choosing the ideal mix of algorithms and hyperparameters.
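The cross-validation procedure described above can be sketched generically. The "model" here is a deliberately trivial mean predictor, just to show the fold mechanics (all names and data are illustrative):

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Split shuffled sample indices into k roughly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_val_score(fit, score, X, y, k=5):
    """Train on k-1 folds, evaluate on the held-out fold, average the scores."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])
        scores.append(score(model, X[val_idx], y[val_idx]))
    return float(np.mean(scores))

# Toy "model": always predict the training-set mean of y
X = np.arange(20.0).reshape(-1, 1)
y = 2.0 * X[:, 0]
fit = lambda X_tr, y_tr: y_tr.mean()
score = lambda m, X_va, y_va: -((y_va - m) ** 2).mean()  # negative MSE

cv = cross_val_score(fit, score, X, y, k=5)
assert cv < 0  # MSE is positive here, so negative MSE is below zero
```

In practice, the same loop is run once per candidate model and hyperparameter setting, and the candidate with the best average validation score is selected.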
Model Engineering: Machine learning (ML) model engineering is a technical process that involves
various steps, including data collection and preprocessing, model selection, training, deployment and
monitoring. It aims to develop effective and efficient ML models to solve specific problems and meet
diverse business use cases. This process requires a range of technical skills, including expertise in
programming languages, data structures, algorithms and ML frameworks.
Model Outcome: During training, the machine learning algorithm is optimized to find certain
patterns or outputs from the dataset, depending on the task. The output of this process - often a
computer program with specific rules and data structures - is called a machine learning model.
Modal analysis: It is the process of determining the inherent dynamic characteristics of a system in
the forms of natural frequencies, damping factors, and mode shapes, and using them to formulate a
mathematical model for its dynamic behavior.
Optimization: The concept of optimization is integral to machine learning. Most machine learning
models use training data to learn the relationship between input and output data. The models can then
be used to make predictions about trends or classify new input data. This training is a process of
optimization, as each iteration aims to improve the model’s accuracy and lower the margin of error.
Optimization is a theme that runs through every step of machine learning. This includes a data
scientist optimizing and refining labeled training data or the iterative training and improvement of
models. At its core, the training of a machine learning model is an optimization problem, as the model
learns to perform a function most effectively. The most important part of machine learning
optimization is the tweaking and tuning of model configurations or hyperparameters.
Hyperparameters are the elements of the model set by the data scientist or developer. They include
elements like the learning rate or the number of classification clusters, and they are a way of
refining a model to fit a specific dataset. In contrast, parameters are elements learned by the
machine learning model itself during training. Selecting the optimal hyperparameters is key to
ensuring an accurate and efficient machine-learning model.
User Interface (UI): It defines the way humans interact with the information systems. In Layman’s
terms, User Interface (UI) is a series of pages, screens, buttons, forms, and other visual elements that
are used to interact with the device. Every app and every website has a user interface. User Interface
(UI) Design is the creation of graphics, illustrations, and the use of photographic artwork and
typography to enhance the display and layout of a digital product within its various device views.