MLT Unit 1 Notes
UNIT-1
1.INTRODUCTION
Machine learning (ML) is a type of artificial intelligence (AI) that enables a system to learn from data
instead of through explicit programming. ML uses algorithms that iteratively learn from data to improve,
describe data, and predict outcomes.
2.NEEDS
Data is the lifeblood of all business. Data-driven decisions increasingly make the difference between
keeping up with competition or falling further behind. Machine learning can be the key to unlocking the
value of corporate and customer data and enacting decisions that keep a company ahead of the
competition.
Advancements in AI for applications like natural language processing (NLP) and computer vision
(CV) are helping industries like financial services, healthcare, and automotive accelerate innovation,
improve customer experience, and reduce costs. Machine learning has applications in all types of
industries, including manufacturing, retail, healthcare and life sciences, travel and hospitality, financial
services, and energy, feedstock, and utilities. Use cases include:
Manufacturing. Predictive maintenance and condition monitoring
Retail. Upselling and cross-channel marketing
Healthcare and life sciences. Disease identification and risk stratification
Travel and hospitality. Dynamic pricing
Financial services. Risk analytics and regulation
Energy. Energy demand and supply optimization
3.HISTORY
It’s all well and good to ask if androids dream of electric sheep, but science fact has evolved to a point
where it’s beginning to coincide with science fiction. No, we don’t have autonomous androids struggling
with existential crises — yet — but we are getting ever closer to what people tend to call “artificial
intelligence.”
Machine Learning is a sub-set of artificial intelligence where computer algorithms are used to
autonomously learn from data and information. In machine learning computers don’t have to be explicitly
programmed but can change and improve their algorithms by themselves.
Today, machine learning algorithms enable computers to communicate with humans, autonomously drive
cars, write and publish sport match reports, and find terrorist suspects. I firmly believe machine learning
will severely impact most industries and the jobs within them, which is why every manager should have
at least some grasp of what machine learning is and how it is evolving.
In this post I offer a quick trip through time to examine the origins of machine learning as well as the
most recent milestones.
1950 — Alan Turing creates the “Turing Test” to determine if a computer has real intelligence. To pass
the test, a computer must be able to fool a human into believing it is also human.
1952 — Arthur Samuel wrote the first computer learning program. The program was the game of
checkers, and the IBM computer improved at the game the more it played, studying which moves made
up winning strategies and incorporating those moves into its program.
1957 — Frank Rosenblatt designed the first neural network for computers (the perceptron), which
simulates the thought processes of the human brain.
1967 — The “nearest neighbor” algorithm was written, allowing computers to begin using very basic
pattern recognition. This could be used to map a route for traveling salesmen, starting at a random city
but ensuring they visit all cities during a short tour.
1979 — Students at Stanford University invent the “Stanford Cart” which can navigate obstacles in a
room on its own.
1981 — Gerald Dejong introduces the concept of Explanation Based Learning (EBL), in which a
computer analyses training data and creates a general rule it can follow by discarding unimportant data.
1985 — Terry Sejnowski invents NetTalk, which learns to pronounce words the same way a baby does.
1990s — Work on machine learning shifts from a knowledge-driven approach to a data-driven approach.
Scientists begin creating programs for computers to analyze large amounts of data and draw conclusions
— or “learn” — from the results.
2006 — Geoffrey Hinton coins the term “deep learning” to explain new algorithms that let computers
“see” and distinguish objects and text in images and videos.
2010 — The Microsoft Kinect can track 20 human features at a rate of 30 times per second, allowing
people to interact with the computer via movements and gestures.
2011 — Google Brain is developed, and its deep neural network can learn to discover and categorize
objects much the way a cat does.
2012 – Google’s X Lab develops a machine learning algorithm that is able to autonomously browse
YouTube videos to identify the videos that contain cats.
2014 – Facebook develops DeepFace, a software algorithm that is able to recognize or verify individuals
on photos to the same level as humans can.
2015 – Microsoft creates the Distributed Machine Learning Toolkit, which enables the efficient
distribution of machine learning problems across multiple computers.
2015 – Over 3,000 AI and Robotics researchers, endorsed by Stephen Hawking, Elon Musk and Steve
Wozniak (among many others), sign an open letter warning of the danger of autonomous weapons which
select and engage targets without human intervention.
2016 – Google’s artificial intelligence algorithm beats a professional player at the Chinese board game
Go, which is considered the world’s most complex board game and is many times harder than chess. The
AlphaGo algorithm developed by Google DeepMind managed to win five games out of five in the Go
competition.
So are we drawing closer to artificial intelligence? Some scientists believe that’s actually the wrong
question.
They believe a computer will never “think” in the way that a human brain does, and that comparing the
computational analysis and algorithms of a computer to the machinations of the human mind is like
comparing apples and oranges.
Regardless, computers’ ability to see, understand, and interact with the world around them is growing at
a remarkable rate. And as the quantities of data we produce continue to grow exponentially, so will our
computers’ ability to process, analyze, and learn from that data.
4.DEFINITIONS
Machine learning is a subset of artificial intelligence (AI) that focuses on the development of algorithms
and statistical models that enable computers to perform tasks without being explicitly programmed for
those tasks. In essence, machine learning algorithms learn from data, iteratively improving their
performance over time as they are exposed to more data.
A commonly cited definition of machine learning is given by Tom Mitchell, a computer scientist and
professor at Carnegie Mellon University:
" A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
In this definition:
"Experience E" refers to the data or examples provided to the algorithm.
"Tasks T" refer to the specific problems or tasks that the algorithm aims to solve or perform.
"Performance measure P" is the metric used to evaluate the algorithm's performance on the tasks T.
5.APPLICATIONS
1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used to identify
objects, persons, places, digital images, etc. A popular use case of image recognition and face detection
is automatic friend tagging suggestions:
Facebook provides a feature of automatic friend tagging suggestions. Whenever we upload a photo with our
Facebook friends, we automatically get tagging suggestions with names; the technology behind this is
machine learning's face detection and recognition algorithm.
It is based on the Facebook project named "DeepFace," which is responsible for face recognition and
person identification in pictures.
2. Speech Recognition:
When we use Google, we get an option of "Search by voice"; this comes under speech recognition, and it is a
popular application of machine learning.
Speech recognition is the process of converting voice instructions into text, and it is also known as "speech
to text" or "computer speech recognition." At present, machine learning algorithms are widely used in
various applications of speech recognition. Google Assistant, Siri, Cortana, and Alexa use speech
recognition technology to follow voice instructions.
3. Traffic prediction:
If we want to visit a new place, we take the help of Google Maps, which shows us the correct path with the
shortest route and predicts the traffic conditions.
It predicts traffic conditions, such as whether traffic is clear, slow-moving, or heavily congested, in two ways:
o Real-time location of the vehicle from the Google Maps app and sensors
o Average time taken on past days at the same time
Everyone who is using Google Maps is helping this app to become better. It takes information from the
user and sends it back to its database to improve performance.
4. Product recommendations:
Machine learning is widely used by various e-commerce and entertainment companies such
as Amazon, Netflix, etc., for product recommendations to the user. Whenever we search for some product
on Amazon, we start getting advertisements for the same product while surfing the internet in the same
browser, and this is because of machine learning.
Google understands the user's interest using various machine learning algorithms and suggests products
as per the customer's interest.
Similarly, when we use Netflix, we find recommendations for entertainment series, movies, etc., and this
is also done with the help of machine learning.
5. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars. Machine learning plays a
significant role in self-driving cars. Tesla, the most popular car manufacturing company, is working on
self-driving cars. It uses an unsupervised learning method to train the car models to detect people and
objects while driving.
6. Email Spam and Malware Filtering:
Whenever we receive a new email, it is filtered automatically as important, normal, or spam. We always
receive important mail in our inbox with the important symbol and spam emails in our spam box, and
the technology behind this is machine learning. Below are some spam filters used by Gmail:
o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
Some machine learning algorithms such as Multi-Layer Perceptron, Decision tree, and Naïve Bayes
classifier are used for email spam filtering and malware detection.
7. Virtual Personal Assistant:
We have various virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri. As the name
suggests, they help us find information using our voice instructions. These assistants can help us
in various ways just through our voice instructions, such as playing music, calling someone, opening an email,
scheduling an appointment, etc.
These assistants record our voice instructions, send them over to a server on the cloud, decode them using ML
algorithms, and act accordingly.
8. Online Fraud Detection:
Machine learning is making our online transactions safe and secure by detecting fraudulent transactions.
Whenever we perform an online transaction, there are various ways a fraudulent transaction can take
place, such as fake accounts, fake IDs, and stealing money in the middle of a transaction. To detect this,
a feed-forward neural network helps us by checking whether a transaction is genuine or fraudulent.
For each genuine transaction, the output is converted into hash values, and these values become the
input for the next round. For each genuine transaction, there is a specific pattern which changes for a
fraudulent transaction; hence, the network detects it and makes our online transactions more secure.
9. Stock Market Trading:
Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups
and downs in shares, so machine learning's long short term memory (LSTM) neural network is used for
the prediction of stock market trends.
10. Medical Diagnosis:
In medical science, machine learning is used for disease diagnosis. With this, medical technology is
growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain.
11. Automatic Language Translation:
Nowadays, if we visit a new place and are not aware of the language, it is not a problem at all; machine
learning helps us here by converting the text into languages we know. Google's GNMT (Google Neural
Machine Translation) provides this feature; it is a neural machine translation model that translates the text
into our familiar language, and this is known as automatic translation.
The technology behind the automatic translation is a sequence to sequence learning algorithm, which is
used with image recognition and translates the text from one language to another language.
6.ADVANTAGES
Automation: Machine learning algorithms can automate complex tasks and processes, reducing
the need for human intervention.
Data mining: Machine learning can assess data and find patterns in it.
Better advertising and marketing: Machine learning algorithms can predict which consumers are
the most likely to actually buy a product.
More accurate predictions: Machine learning can analyze vast amounts of data, identify patterns,
and make accurate predictions without being explicitly programmed for every possible scenario.
Data-driven decision making: Machine learning uses real-time and historical data to make
informed decisions, which significantly reduces reliance on decisions based on intuition or
assumptions.
Continuous improvement: Machine learning models can continuously learn and improve over
time.
Enhanced online shopping experience: Machine learning algorithms can analyze user behavior to
offer personalized product suggestions.
7.DISADVANTAGES
Inaccurate data: ML algorithms can be difficult to train due to data errors, unusable data formats,
and the inability to label data.
Biased data: Human biases can creep into datasets and spoil outcomes. For example, the selfie
editor Face App was initially trained to make faces “hotter” by lightening the skin tone.
Ethical concerns: ML can amplify existing biases in the training data, which can raise ethical
concerns related to discrimination, stereotyping, and data privacy.
Security risks: ML relies heavily on data, which raises privacy and security concerns.
Other disadvantages: ML requires significant computational power, and can lead to excessive use
that harms mankind.
8.CHALLENGES
Poor quality data is one of the primary obstacles to machine learning. Algorithms need access to accurate
and relevant information to learn effectively; noisy, incomplete, or poorly labelled data leads to unreliable results.
Data bias
Data bias can occur in collection, analysis, and utilization. Biased data can lead to inaccurate and
unreliable results, making it ineffective at achieving desired goals.
Overfitting
Overfitting occurs when a model is trained too closely on the training data, and as a result, it performs
poorly on new, unseen data.
Underfitting
Underfitting occurs when a machine learning algorithm is unable to capture the relationship between the
input and output variables accurately. This generates a high error rate on both the training set and unseen
data.
Data collection
Most of the time for running machine learning end-to-end is spent on preparing the data, which includes
collecting, cleaning, analyzing, visualizing, and feature engineering.
9.TYPES OF MACHINE LEARNING PROBLEMS
In this post, you will learn about the most common types of machine learning (ML) problems along with
a few examples. Without further ado, let’s look at these problem types and understand the details.
Regression
Classification
Clustering
Time-series forecasting
Anomaly detection
Ranking
Recommendation
Data generation
Optimization
Regression: When the need is to predict numerical values, such kinds of problems are called regression
problems, for example house price prediction. Common algorithms: linear regression, K-NN, random
forest, neural networks.
Classification: When the need is to predict a class label, the problem is a classification problem. When
there are only two classes it is a binary classification problem; when there are multiple classes it is a
multi-nomial classification problem. For example, classify whether a person is suffering from a disease
or otherwise, or classify whether a stock is “buy”, “sell”, or “hold”. Common algorithms: gradient
boosting classifier, neural networks.
Clustering: When there is a need to categorize the data points into similar groupings or clusters, this is
called a clustering problem. Common algorithms: K-Means, DBSCAN, hierarchical clustering, Gaussian
mixture models, BIRCH.
Recommendation: When there is a need to suggest the next items to a user, this is called a
recommendation problem; a ranking algorithm is used to recommend the next items.
Data generation: When there is a need to generate data such as images, videos, articles, posts, etc., the
problem is called a data generation problem. Common algorithms: generative adversarial networks
(GAN), hidden Markov models.
10.MATHEMATICAL FOUNDATIONS
This course is an introduction to key mathematical concepts at the heart of machine learning. The focus is
on matrix methods and statistical models and features real-world applications ranging from classification
and clustering to denoising and recommender systems.
11.LINEAR ALGEBRA
Linear algebra serves as a fundamental mathematical foundation for many aspects of machine learning,
providing the framework for representing and manipulating data, defining models, and solving
optimization problems. Here are some key concepts in linear algebra that are essential for understanding
machine learning:
1. Vectors:
Vectors represent quantities with magnitude and direction, often used to represent features
or observations in machine learning.
2. Matrix Operations:
Matrix multiplication, which involves multiplying rows and columns to produce new
matrices. This operation is fundamental in many machine learning algorithms, such as
linear regression and neural networks.
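As a minimal sketch (not from the notes), the following NumPy lines show matrix-vector multiplication as it appears in a linear model's predictions; the feature matrix X and weight vector w are purely illustrative values.

import numpy as np

# Toy design matrix: 4 observations (rows), 3 features (columns).
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [1.0, 0.0, 1.0]])

# Illustrative weight vector of a linear model, y_hat = X @ w.
w = np.array([0.5, -1.0, 2.0])

y_hat = X @ w      # each row of X is dotted with w, giving one prediction per observation
print(y_hat)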
3. Transpose and Inverse:
The transpose of a matrix flips it across its diagonal, swapping rows and columns.
The inverse of a square matrix A, denoted as A^-1, is a matrix such that A * A^-1 = I,
where I is the identity matrix. The inverse is crucial for solving systems of linear equations
and for certain optimization problems.
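A small sketch of how the transpose and the inverse appear in practice, assuming a least-squares linear regression problem on synthetic data; the normal-equations solution w = (XᵀX)⁻¹Xᵀy uses both operations.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                   # 20 observations, 3 features
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=20)    # targets with a little noise

# Normal equations: w = (X^T X)^-1 X^T y, using transpose and inverse explicitly.
w_hat = np.linalg.inv(X.T @ X) @ X.T @ y

# Solving the linear system directly is usually preferred to forming the inverse.
w_solve = np.linalg.solve(X.T @ X, X.T @ y)

print(w_hat, w_solve)    # both should be close to true_w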
5. Matrix Decompositions:
Singular Value Decomposition (SVD), which decomposes a matrix into singular vectors
and singular values. SVD is used in PCA and other dimensionality reduction techniques.
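A brief NumPy sketch of the singular value decomposition and a low-rank approximation of the kind used in PCA-style dimensionality reduction; the matrix here is random and purely illustrative.

import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 4))                        # a small data matrix

U, S, Vt = np.linalg.svd(A, full_matrices=False)   # A = U @ diag(S) @ Vt

k = 2
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]        # best rank-2 approximation of A

print(S)                             # singular values, largest first
print(np.linalg.norm(A - A_k))       # reconstruction error of the rank-2 approximation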
6. Linear Transformations:
Linear transformations map vectors to other vectors while preserving addition and scalar
multiplication properties. Many machine learning algorithms involve linear
transformations, such as feature transformations and linear classifiers.
12.ANALYTICAL GEOMETRY
Analytic geometry is used in physics and engineering, as well as in fields like aviation, rocketry, space
science, and spaceflight. It serves as the cornerstone for many contemporary geometric disciplines,
including algebraic, differential, discrete, and computational geometry.
Analytic geometry, also called coordinate geometry, is a branch of mathematics that combines algebra
and geometry. It deals with geometric figures using coordinate systems and algebraic techniques.
Instead of dealing only with geometric properties and relationships, analytic geometry introduces a
coordinate system that represents geometric figures using numerical coordinates.
Engineering Design
Analytic geometry is widely used in engineering design to model and analyze complex shapes and
structures. Engineers use coordinate systems and equations to design buildings, bridges, and
mechanical components.
Example: Engineers use analytic geometry to design the curves and surfaces of car bodies for
aerodynamics and aesthetics.
Computer Graphics
Analytic geometry forms the basis of computer graphics, allowing programmers to create and
manipulate images on screens. By representing objects as sets of coordinates and equations,
computers can render realistic 2D and 3D graphics.
Physics and Astronomy
Analytic geometry is used in physics and astronomy to study the motion of objects and understand
the behavior of celestial bodies. The equations describing the trajectories of particles and planets
can be analyzed using geometric techniques.
Example: NASA uses analytic geometry to calculate the orbits of spacecraft and predict their
trajectories.
Cartography
Analytic geometry is essential in cartography, the science of mapmaking. Cartographers use
coordinate systems and equations to represent the Earth’s surface accurately and create maps that are
used for navigation, urban planning and environmental studies.
Example: GPS systems rely on analytic geometry to determine the coordinates of the user’s location
and provide directions.
Robotics
Analytic geometry plays a crucial role in robotics, enabling engineers to program robots to perform
tasks accurately and efficiently. Robots use coordinate systems and geometric algorithms to
navigate environments and manipulate objects.
Example: Industrial robots use analytic geometry to precisely control their movements when
assembling products on assembly lines.
ORTHOGONAL PROJECTION
PROBABILITY AND STATISTICS
Probability and statistics are considered the base foundation for ML and data science, used to develop ML
algorithms and build decision-making capabilities. Probability and statistics are also primary
prerequisites for learning ML.
Machine Learning is an interdisciplinary field that uses statistics, probability, algorithms to learn from
data and provide insights which can be used to build intelligent applications. In this article, we will
discuss some of the key concepts widely used in machine learning.
Probability and statistics are related areas of mathematics which concern themselves with analyzing the
relative frequency of events.
Probability deals with predicting the likelihood of future events, while statistics involves the analysis of
the frequency of past events.
Probability
Most people have an intuitive understanding of degrees of probability, which is why we use words like
“probably” and “unlikely” in our daily conversation, but we will talk about how to make quantitative
claims about those degrees [1].
An event can be anything like tossing a coin, rolling a die, or pulling a colored ball out of a bag. In
these examples the outcome of the event is random, so the variable that represents the outcome of these
events is called a random variable.
Let us consider a basic example of tossing a coin. If the coin is fair, then it is just as likely to come up
heads as it is to come up tails. In other words, if we were to repeatedly toss the coin many times, we
would expect about half of the tosses to be heads and half to be tails. In this case, we say that
the probability of getting a head is 1/2 or 0.5.
The empirical probability of an event is given by the number of times the event occurs divided by the total
number of incidents observed. If for n trials we observe s successes, the probability of success is s/n. In
the above example, any sequence of coin tosses may have more or less than exactly 50% heads.
Theoretical probability on the other hand is given by the number of ways the particular event can occur
divided by the total number of possible outcomes. So a head can occur once and possible outcomes are
two (head, tail). The true (theoretical) probability of a head is 1/2.
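A minimal simulation (illustrative only) contrasting the empirical probability s/n with the theoretical probability 1/2 for a fair coin.

import random

random.seed(42)
n = 10_000
heads = sum(random.random() < 0.5 for _ in range(n))   # simulate n fair coin tosses

empirical = heads / n      # s successes out of n trials
theoretical = 1 / 2        # one favourable outcome out of two possible outcomes

print(empirical, theoretical)   # the empirical value approaches 0.5 as n grows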
Joint Probability
The probability of events A and B, denoted by P(A and B) or P(A ∩ B), is the probability that events A and B
both occur: P(A ∩ B) = P(A) · P(B). This only applies if A and B are independent, which means that
if A occurred, that doesn’t change the probability of B, and vice versa.
Conditional Probability
Let us consider that A and B are not independent, because if A occurred, the probability of B is higher. When
A and B are not independent, it is often useful to compute the conditional probability P(A|B), which is
the probability of A given that B occurred: P(A|B) = P(A ∩ B) / P(B).
Similarly, P(B|A) = P(A ∩ B) / P(A). We can write the joint probability of A and B as P(A ∩ B) =
P(A) · P(B|A), which means: “The chance of both things happening is the chance that the first one
happens, and then the second one given the first happened.”
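A short worked sketch of joint and conditional probability using two fair dice (my own example, not from the notes): event A is "the sum is at least 10" and event B is "the first die shows 6".

from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))    # sample space for two fair dice

def P(event):
    return Fraction(len(event), len(space))

A = {s for s in space if s[0] + s[1] >= 10}     # A: the sum is at least 10
B = {s for s in space if s[0] == 6}             # B: the first die shows 6

p_joint = P(A & B)          # P(A ∩ B) = 3/36
p_cond = p_joint / P(B)     # P(A|B) = P(A ∩ B) / P(B) = 1/2

print(p_joint, p_cond)
print(P(A) * P(B) == p_joint)   # False: A and B are not independent, so P(A ∩ B) differs from P(A) · P(B)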
Bayes’ Theorem
Bayes’ theorem is a relationship between the conditional probabilities of two events. For example, if we
want to find the probability of selling ice cream on a hot and sunny day, Bayes’ theorem gives us the
tools to use prior knowledge about the likelihood of selling ice cream on any other type of day (rainy,
windy, snowy, etc.):
P(H|E) = P(E|H) · P(H) / P(E)
where H and E are events, and P(H|E) is the conditional probability that event H occurs given that event E has
already occurred. The probability P(H) in the equation is basically frequency analysis; given
our prior data, what is the probability of the event occurring. The P(E|H) in the equation is called
the likelihood and is essentially the probability that the evidence is correct, given the information from
the frequency analysis. P(E) is the probability that the actual evidence is true.
Consider, for example, a medical test that gives:
a Sensitivity (also called the true positive rate) result for 95% of the patients with the disease, and
a Specificity (also called the true negative rate) result for 95% of the healthy patients.
If we let “+” and “−” denote a positive and negative test result, respectively, then the test accuracies are
the conditional probabilities: P(+|disease) = 0.95, P(−|healthy) = 0.95.
In Bayesian terms, we want to compute the probability of disease given a positive test, P(disease|+).
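A small sketch of this computation, assuming a hypothetical prevalence of 1% since the notes give only the sensitivity and specificity.

# Hypothetical prevalence; the notes specify only sensitivity and specificity.
p_disease = 0.01
sensitivity = 0.95             # P(+ | disease)
specificity = 0.95             # P(- | healthy)

p_healthy = 1 - p_disease
p_pos = sensitivity * p_disease + (1 - specificity) * p_healthy   # total probability of a positive test

# Bayes' theorem: P(disease | +) = P(+ | disease) * P(disease) / P(+)
p_disease_given_pos = sensitivity * p_disease / p_pos

print(round(p_disease_given_pos, 3))   # about 0.161: most positives are false positives at 1% prevalence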
Descriptive Statistics
Descriptive statistics refers to methods for summarizing and organizing the information in a data set. We
will use a table of 10 loan applicants to describe some of the statistical concepts [4].
Elements: The entities for which information is collected are called the elements. In the above table, the
elements are the 10 applicants. Elements are also called cases or subjects.
Variables: The characteristic of an element is called a variable. It can take different values for different
elements, e.g., marital status, mortgage, income, rank, year, and risk. Variables are also called attributes.
Qualitative: A qualitative variable enables the elements to be classified or categorized according to some
characteristic. The qualitative variables are marital status, mortgage, rank, and risk. Qualitative variables
are also called categorical variables.
Quantitative: A quantitative variable takes numeric values and allows arithmetic to be meaningfully
performed on it. The quantitative variables are income and year. Quantitative variables are also
called numerical variables.
Discrete Variable: A numerical variable that can take either a finite or a countable number of values is a
discrete variable, for which each value can be graphed as a separate point, with space between each
point. ‘year’ is an example of a discrete variable.
Continuous Variable: A numerical variable that can take infinitely many values is a continuous variable,
whose possible values form an interval on the number line, with no space between the points. ‘income’ is
an example of a continuous variable.
Population: A population is the set of all elements of interest for a particular problem. A parameter is a
characteristic of a population.
Sample: A sample consists of a subset of the population. A characteristic of a sample is called a statistic.
Random sample: When we take a sample for which each element has an equal chance of being selected.
Measures of Center indicate where on the number line the central part of the data is located.
Mean
The mean is the arithmetic average of a data set. To calculate the mean, add up the values and divide by
the number of values. The sample mean is the arithmetic average of a sample, and is denoted x̄ (“x-bar”).
The population mean is the arithmetic average of a population, and is denoted 𝜇 (“mu”, the Greek letter
for m).
Median
The median is the middle data value, when there is an odd number of data values and the data have been
sorted into ascending order. If there is an even number, the median is the mean of the two middle data
values. When the income data are sorted into ascending order, the two middle values are $32,100 and
$32,200, the mean of which is the median income, $32,150.
Mode
The mode is the data value that occurs with the greatest frequency. Both quantitative and categorical
variables can have modes, but only quantitative variables can have means or medians. Each income value
occurs only once, so there is no mode for income.
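A quick sketch using Python's statistics module; the income values are hypothetical (chosen so that the two middle values are $32,100 and $32,200, matching the median quoted above), and the mode is shown on a categorical variable.

import statistics

incomes = [24_000, 25_000, 28_500, 30_000, 32_100,
           32_200, 36_000, 38_000, 39_500, 41_000]   # hypothetical incomes

print(statistics.mean(incomes))     # arithmetic average
print(statistics.median(incomes))   # mean of the two middle values: 32,150
print(statistics.mode(["single", "married", "married", "other"]))   # mode of a categorical variable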
Range
The range of a variable equals the difference between the maximum and minimum values. The range of
income equals the maximum income minus the minimum income.
Range only reflects the difference between the largest and smallest observations, but it fails to reflect how
the data are centralized.
Variance
Population variance is defined as the average of the squared differences from the mean, and is denoted
𝜎² (“sigma-squared”):
𝜎² = Σ (xᵢ − 𝜇)² / N
A larger variance means the data are more spread out.
The sample variance s² is approximately the mean of the squared deviations, with N replaced by n − 1. This
difference occurs because the sample mean is used as an approximation of the true population mean.
Standard Deviation
The standard deviation or sd of a bunch of numbers tells you how much the individual numbers tend to
differ from the mean.
The sample standard deviation is the square root of the sample variance: s = √s². For example, the incomes
deviate from their mean by $7201.
The population standard deviation is the square root of the population variance: 𝜎 = √𝜎².
Three different data distributions with same mean (100) and different standard deviation (5,10,20)
The smaller the standard deviation, narrower the peak, the data points are closer to the mean. The further
the data points are from the mean, the greater the standard deviation.
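A short sketch contrasting the population formulas (divide by N) with the sample formulas (divide by n − 1), reusing the hypothetical incomes from above.

import statistics

incomes = [24_000, 25_000, 28_500, 30_000, 32_100,
           32_200, 36_000, 38_000, 39_500, 41_000]

print(statistics.pvariance(incomes), statistics.variance(incomes))   # population (divide by N) vs sample (divide by n-1) variance
print(statistics.pstdev(incomes), statistics.stdev(incomes))         # the corresponding standard deviations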
Measures of Position indicate the relative position of a particular data value in the data distribution.
Percentile
The pth percentile of a data set is the data value such that p percent of the values in the data set are at or
below this value. The 50th percentile is the median. For example, the median income is $32,150, and
50% of the data values lie at or below this value.
Percentile rank
The percentile rank of a data value equals the percentage of values in the data set that are at or below that
value. For example, the percentile rank of Applicant 1’s income of $38,000 is 90%, since that is the
percentage of incomes equal to or less than $38,000.
Interquartile Range (IQR)
The first quartile (Q1) is the 25th percentile of a data set; the second quartile (Q2) is the 50th percentile
(median); and the third quartile (Q3) is the 75th percentile.
The IQR measures the spread between the 75th and 25th percentiles using the formula: IQR = Q3 − Q1.
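A minimal NumPy sketch of quartiles and the IQR on the same hypothetical incomes.

import numpy as np

incomes = np.array([24_000, 25_000, 28_500, 30_000, 32_100,
                    32_200, 36_000, 38_000, 39_500, 41_000])

q1, q2, q3 = np.percentile(incomes, [25, 50, 75])   # first quartile, median, third quartile
iqr = q3 - q1                                       # interquartile range: Q3 - Q1

print(q1, q2, q3)
print(iqr)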
Different ways you can describe patterns found in uni-variate data include central tendency : mean, mode
and median and dispersion: range, variance, maximum, minimum, quartiles , and standard deviation.
Pie chart [left] & Bar chart [right] of Marital status from loan applicants table.
The various plots used to visualize uni-variate data typically are Bar Charts, Histograms, Pie Charts. etc.
Bi-variate analysis involves the analysis of two variables for the purpose of determining the empirical
relationship between them. The various plots used to visualize bi-variate data typically are scatter-plot,
box-plot.
Scatter Plots
A scatter plot is the simplest way to visualize the relationship between two quantitative variables, x and y. For two
continuous variables, a scatter plot is a common graph. Each (x, y) point is graphed on a Cartesian plane,
with the x axis on the horizontal and the y axis on the vertical. Scatter plots are sometimes called
correlation plots because they show how two variables are correlated.
Correlation
A correlation is a statistic intended to quantify the strength of the relationship between two variables.
The correlation coefficient r quantifies the strength and direction of the linear relationship between two
quantitative variables. The correlation coefficient is defined as:
r = Σ (xᵢ − x̄)(yᵢ − ȳ) / [(n − 1) · s_x · s_y]
where s_x and s_y are the sample standard deviations of x and y.
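A brief sketch computing r on synthetic data where y depends linearly on x; the numbers are illustrative, not from the notes.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(scale=0.5, size=100)   # y depends linearly on x, plus noise

r = np.corrcoef(x, y)[0, 1]   # Pearson correlation coefficient
print(round(r, 3))            # close to +1: a strong positive linear relationship

# A scatter plot of (x, y) would show the points clustered around a rising line:
# import matplotlib.pyplot as plt; plt.scatter(x, y); plt.show()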
Box Plots
A box plot is also called a box and whisker plot and it’s used to picture the distribution of values. When
one variable is categorical and the other continuous, a box-plot is commonly used. When you use a box
plot you divide the data values into four parts called quartiles. You start by finding the median or middle
value. The median splits the data values into halves. Finding the median of each half splits the data values
into four parts, the quartiles.
Each box on the plot shows the range of values from the median of the lower half of the values at the
bottom of the box to the median of the upper half of the values at the top of the box. A line in the middle
of the box occurs at the median of all the data values. The whiskers then point to the largest and smallest
values in the data.
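A small matplotlib sketch, assuming matplotlib is installed; the two income groups are hypothetical and simply illustrate one box (median, quartiles, whiskers) per category.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
single = rng.normal(30_000, 5_000, size=50)     # hypothetical incomes, "single" group
married = rng.normal(36_000, 7_000, size=50)    # hypothetical incomes, "married" group

plt.boxplot([single, married])                  # each box spans Q1..Q3; the inner line marks the median
plt.xticks([1, 2], ["single", "married"])
plt.ylabel("income")
plt.show()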
Bayes’ theorem is used to determine the conditional probability of an event. It is named after the English
statistician Thomas Bayes, whose formulation of the theorem was published in 1763.
Bayes’ theorem is a very important theorem in mathematics that laid the foundation of a unique
statistical inference approach called Bayesian inference.
It is used to find the probability of an event, based on prior knowledge of conditions that might be related
to that event. For example, if we want to find the probability that a white marble drawn at random came
from the first bag, given that a white marble has already been drawn, and there are three bags each
containing some white and black marbles, then we can use Bayes’ Theorem.
This article will explore the Bayes theorem. We will present the statement, proof, derivation, and formula
of the theorem, as well as illustrate its applications with various examples.
Bayes’ theorem (also known as Bayes’ rule or Bayes’ law) is used to determine the conditional
probability of event A when event B has already occurred.
The general statement of Bayes’ theorem is: “The conditional probability of an event A, given the
occurrence of another event B, is equal to the product of the probability of B given A and the probability of A,
divided by the probability of event B,” i.e.
P(A|B) = P(B|A) · P(A) / P(B)
Let E1, E2, …, En be a set of events associated with the sample space S, in which all the events E1, E2,
…, En have a non-zero probability of occurrence. All the events E1, E2, …, En form a partition of S. Let A
be an event from the space S for which we have to find the probability; then, according to Bayes’ theorem,
P(Ei|A) = P(Ei) · P(A|Ei) / Σₖ P(Ek) · P(A|Ek),  for k = 1, 2, 3, …, n
For any two events A and B, the formula for Bayes’ theorem is given by:
P(A|B) = P(B|A) · P(A) / P(B)
where,
P(A) and P(B) are the probabilities of events A and B, and P(B) is never equal to zero.
P(A|B) is the probability of event A when event B happens.
P(B|A) is the probability of event B when event A happens.
The proof of Bayes’ theorem is given as follows. According to the conditional probability formula,
P(Ei|A) = P(Ei ∩ A) / P(A) ……(i)
Using the multiplication rule of probability,
P(Ei ∩ A) = P(Ei) · P(A|Ei) ……(ii)
Using the total probability theorem,
P(A) = Σₖ P(Ek) · P(A|Ek) ……(iii)
Substituting the values of P(Ei ∩ A) and P(A) from eq. (ii) and eq. (iii) into eq. (i), we get
P(Ei|A) = P(Ei) · P(A|Ei) / Σₖ P(Ek) · P(A|Ek)
Bayes’ theorem is also known as the formula for the probability of “causes”. As we know, the Ei‘s are a
partition of the sample space S, and at any given time only one of the events Ei occurs. Thus we conclude
that the Bayes’ theorem formula gives the probability of a particular Ei, given the event A has occurred.
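A small sketch of the three-bag marble example mentioned earlier, with hypothetical bag contents since the notes do not give the numbers.

from fractions import Fraction

# Hypothetical contents (white, black); the notes do not specify them.
bags = {"bag1": (3, 1), "bag2": (2, 2), "bag3": (1, 3)}

prior = Fraction(1, 3)                                               # each bag equally likely to be picked
likelihood = {b: Fraction(w, w + k) for b, (w, k) in bags.items()}   # P(white | bag)

evidence = sum(prior * likelihood[b] for b in bags)                  # total probability of drawing white
posterior_bag1 = prior * likelihood["bag1"] / evidence               # Bayes: P(bag1 | white)

print(posterior_bag1)   # 1/2 with these particular contents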
Various terms used in the Bayes theorem are explained below in this article.
After learning about Bayes theorem in detail, let us understand some important terms related to the
concepts we covered in formula and derivation.
Hypotheses
The events E1, E2, …, En occurring in the sample space are called the hypotheses.
Prior Probability
Prior probability is the initial probability of an event occurring before any new data is taken into
account. P(Ei) is the prior probability of hypothesis Ei.
Posterior Probability
Posterior probability is the updated probability of an event after considering new information.
The probability P(Ei|A) is the posterior probability of hypothesis Ei.
Conditional Probability
The probability of an event A based on the occurrence of another event B is termed conditional
Probability. It is denoted as P(A|B) and represents the probability of A when event B has already
happened.
Joint Probability
When the probability of two or more events occurring together at the same time is measured, it is
called the joint probability. For two events A and B, the joint probability is denoted
as P(A ∩ B).
Random Variables
Real-valued variables whose possible values are determined by random experiments are called random
variables. The probability of finding such variables is the experimental probability.
Bayesian inference is very important and has found application in various activities, including medicine,
science, philosophy, engineering, sports, law, etc., and Bayesian inference is directly derived from Bayes’
theorem.
Example: Bayes’ theorem defines the accuracy of the medical test by taking into account how likely a
person is to have a disease and what is the overall accuracy of the test.
The key difference between conditional probability and Bayes’ theorem is that conditional probability,
P(A|B) = P(A ∩ B) / P(B), is computed directly from the joint probability, whereas Bayes’ theorem
expresses the same quantity in terms of the reverse conditional probability, P(A|B) = P(B|A) · P(A) / P(B).
VECTOR CALCULUS AND OPTIMIZATION
In machine learning, an objective function (also known as a loss or cost function) quantifies how well a
model’s predictions match the actual target values in the training data. The goal of optimization is then to
find the set of model parameters that minimizes this objective function, effectively making the model’s
predictions as accurate as possible.
That’s when calculus comes into play, as it provides the mathematical foundation for understanding how
functions change and how to optimize them with the help of tools like derivatives and gradients.
Functions
Roughly speaking, a derivative describes how a function’s output changes with respect to a small
change in its input. Mathematically, the derivative can be defined as follows:
f′(x) = lim(h→0) [f(x + h) − f(x)] / h
A function can have more than one variable; we call it a multivariate function. In such cases, the function
has partial derivatives, obtained by varying one variable at a time while holding the others fixed.
By collecting these partial derivatives in a vector, we obtain the gradient (see where I’m heading?):
∇f(x) = (∂f/∂x₁, ∂f/∂x₂, …, ∂f/∂xₙ)
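A minimal sketch comparing a finite-difference gradient with the analytic gradient for a simple two-variable function of my own choosing.

def f(x, y):
    return x**2 + 3*x*y + y**2        # a simple multivariate function

def numerical_gradient(x, y, h=1e-6):
    # Partial derivatives: vary one variable at a time (central differences).
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return df_dx, df_dy               # collecting the partials gives the gradient

print(numerical_gradient(1.0, 2.0))   # approximately (8, 7); analytically the gradient is (2x + 3y, 3x + 2y)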
In a similar way to how we defined the derivative of a function f on R, we can define gradients of functions
that take vectors as inputs. We can also encounter situations where we need to take gradients of matrices with respect to vectors (or
other matrices), which yields a multidimensional tensor.
Now, after all these definitions, let’s (finally) explain why this is useful in machine learning algorithms.
Backpropagation
Consider a neural network with multiple layers. The goal of training is to adjust the weights and biases of
the network in such a way that the predicted outputs are as close as possible to the true outputs. This is
achieved by minimizing a loss function that quantifies the discrepancy between predictions and actual
values.
Step 1: Forward Pass
1. Start with the input features x fed into the first layer of the network.
2. Propagate the input through each layer of the network, one by one, to compute the output
activations. At each layer, compute the weighted sum of the inputs, apply the activation function,
and pass the result to the next layer.
3. Continue this process until you reach the output layer, obtaining the predicted output y_pred.
Step 2: Compute Loss
1. Calculate the loss between the predicted output y_pred and the actual target y_true using a suitable
loss function, such as mean squared error or cross-entropy.
Step 3: Backward Pass
1. For each hidden layer l, compute the gradient of the loss L with respect to the weighted sum of
inputs z(l) before the activation function.
2. Compute the gradient of the loss with respect to the layer’s weights and biases using the computed
local gradient and the input activations from the previous layer.
Step 4: Update Parameters
After computing the gradients for all layers, we update the weights and biases using an optimization
algorithm like gradient descent:
θ ← θ − α · ∂L/∂θ
Step 5: Repeat
We continue this process for a predefined number of iterations (epochs) or until the loss converges to a
satisfactory level.
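The steps above can be made concrete with a minimal NumPy sketch of one hidden layer trained by hand-written backpropagation on toy data; the network size, data, and learning rate are all illustrative assumptions, not the notes' own example.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8 samples, 2 features, 1 continuous target (hypothetical).
X = rng.normal(size=(8, 2))
y = X[:, :1] - 2 * X[:, 1:]                 # the target the network should learn

# One hidden layer of 4 tanh units, then a linear output.
W1, b1 = 0.5 * rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = 0.5 * rng.normal(size=(4, 1)), np.zeros(1)
lr = 0.1

for epoch in range(200):
    # Step 1: forward pass through each layer.
    z1 = X @ W1 + b1
    a1 = np.tanh(z1)
    y_pred = a1 @ W2 + b2

    # Step 2: mean squared error loss.
    loss = np.mean((y_pred - y) ** 2)

    # Step 3: backward pass (chain rule) to get gradients for every layer.
    grad_out = 2 * (y_pred - y) / len(X)    # dL/dy_pred
    grad_W2 = a1.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_a1 = grad_out @ W2.T
    grad_z1 = grad_a1 * (1 - a1 ** 2)       # derivative of tanh(z) is 1 - tanh(z)^2
    grad_W1 = X.T @ grad_z1
    grad_b1 = grad_z1.sum(axis=0)

    # Step 4: gradient-descent update of weights and biases.
    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1

print(round(float(loss), 4))   # the loss should shrink toward zero over the epochs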
The Gradient Descent algorithm mentioned above provides an overview of Optimization. Training a
machine learning model involves finding a set of parameters that allows the model to make accurate
predictions or classifications based on input data. The process of training is essentially the search for
these parameter values that optimize the model’s performance.
Optimization
In machine learning, most objective functions are designed to be minimized, meaning that the optimal value
corresponds to the minimum. Generally, finding the minimum using analytical methods isn’t feasible;
therefore, we start at an initial value x₀ and then repeatedly follow the negative gradient:
xₜ₊₁ = xₜ − α · ∇f(xₜ)
where alpha is called the step size and ∇f(xₜ) represents the gradient of f evaluated at the point xₜ.
It is important to choose a good step size alpha, also called the learning rate. A step size that is too small
could lead to slow algorithmic performance, while a large one might cause gradient descent to overshoot or diverge.
Another common challenge in optimization is reaching the global minimum rather than getting stuck in a local minimum.
You can mitigate this issue by trying different step sizes, trying different initial points, incorporating momentum
into the optimization algorithm, or using Stochastic Gradient Descent (SGD).
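A tiny sketch of gradient descent on f(x) = (x − 3)², showing how the step size alpha changes the behaviour (slow convergence, good convergence, divergence); the numbers are illustrative.

def f(x):
    return (x - 3) ** 2         # minimum at x = 3

def grad(x):
    return 2 * (x - 3)

for alpha in (0.01, 0.1, 1.1):  # too small, reasonable, and too large step sizes
    x = 0.0                     # initial value x0
    for _ in range(50):
        x = x - alpha * grad(x)     # x_{t+1} = x_t - alpha * grad f(x_t)
    print(alpha, x)             # 0.01 converges slowly, 0.1 reaches about 3, 1.1 diverges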
When dealing with very large datasets, loading the entire dataset into memory for the standard gradient
descent algorithm becomes infeasible. That’s where Stochastic Gradient Descent comes into play.
Instead of working with the whole dataset, SGD splits the training data into mini-batches; for each
mini-batch it computes the gradient of the loss function with respect to the parameters using only the
mini-batch data, and then updates the parameters. Of course, this randomness from mini-batch sampling
introduces noise into the optimization process. That is why many variations of the SGD algorithm have
been developed, such as Adam (Adaptive Moment Estimation), which is commonly used for training deep
learning models.
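A compact sketch of mini-batch SGD for linear regression on synthetic data; the batch size, learning rate, and the data itself are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=1_000)

w = np.zeros(3)
lr, batch_size = 0.05, 32

for epoch in range(20):
    idx = rng.permutation(len(X))               # shuffle, then split into mini-batches
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(batch)   # gradient of the MSE on this mini-batch only
        w -= lr * grad                                 # noisy but cheap parameter update

print(np.round(w, 2))   # should be close to [1.0, -2.0, 0.5]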
Constrained Optimization
The problems we have previously considered were free of constraints, but this isn’t always the case. For
example, in various contexts, you may have limited resources to allocate or limits to respect. A general
constrained optimization problem looks like this:
minimize f(x)
subject to gᵢ(x) ≤ 0 for i = 1, …, m, and hⱼ(x) = 0 for j = 1, …, p
Lagrange multipliers are a technique used to solve this kind of problem. The method involves
introducing additional variables (Lagrange multipliers) to incorporate the constraints into the objective
function. This creates a new function, called the Lagrangian, by introducing a Lagrange multiplier λᵢ for
each inequality constraint and νⱼ for each equality constraint. The Lagrangian function is:
L(x, λ, ν) = f(x) + Σᵢ λᵢ · gᵢ(x) + Σⱼ νⱼ · hⱼ(x)
The constrained optimization problem we have previously stated is called the primal problem, to which
we associate a dual problem. The dual function is defined as:
g(λ, ν) = min over x of L(x, λ, ν)
and the dual problem is to maximize g(λ, ν) subject to λ ≥ 0.
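A minimal sketch of a constrained problem solved numerically with SciPy (assuming SciPy is available); the objective and constraint are toy choices of my own, and SciPy handles the multipliers internally.

from scipy.optimize import minimize

# Minimize f(x, y) = (x - 1)^2 + (y - 2)^2 subject to x + y <= 2.
objective = lambda v: (v[0] - 1) ** 2 + (v[1] - 2) ** 2

# SciPy expects inequality constraints written as g(v) >= 0.
constraint = {"type": "ineq", "fun": lambda v: 2 - (v[0] + v[1])}

result = minimize(objective, x0=[0.0, 0.0], constraints=[constraint])
print(result.x)   # approximately [0.5, 1.5]: the unconstrained minimum (1, 2) violates x + y <= 2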
Support Vector Machines (SVMs) utilize the concept of Lagrange multipliers to define an optimal
hyperplane that maximizes the margin between distinct classes of data points. This margin represents the
separation distance between the hyperplane and the nearest data points from each class. In scenarios
where our data doesn’t exhibit perfect linear separability, the concept of the soft margin SVM introduces
the notion of slack variables and an associated penalty term to accommodate misclassified points. These
slack variables, denoted as ξ, come into play when points either fall within the margin or are incorrectly
classified. They quantify the extent by which a data point resides on the “incorrect” side of its respective
margin hyperplane.
Here, C is a hyperparameter controlling the trade-off between maximizing the margin and minimizing
the classification error, and ξi is the slack variable associated with the i-th training example, measuring the
margin violation.
The dual problem of SVM optimization is frequently tackled due to its lower dimensionality in
comparison to the primal problem. It is important to recognize that although SVMs offer a well-structured
optimization problem, addressing it can necessitate substantial computational resources, particularly
when dealing with very large datasets.
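A short scikit-learn sketch (assuming scikit-learn is installed) of a soft-margin linear SVM on overlapping synthetic blobs, showing the effect of the hyperparameter C.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two overlapping Gaussian blobs: not perfectly linearly separable.
X = np.vstack([rng.normal(loc=-1.0, size=(50, 2)),
               rng.normal(loc=1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

for C in (0.1, 10.0):                        # C trades margin width against slack (misclassification)
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(C, clf.n_support_)                 # support vectors per class; a larger C usually tolerates fewer violations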
16.DECISION THEORY
Decision theory is a study of an agent's rational choices that supports all kinds of progress in technology
such as work on machine learning and artificial intelligence. Decision theory looks at how decisions are
made, how multiple decisions influence one another, and how decision-making parties deal with
uncertainty.
1. Decision-making Process: At its core, machine learning involves making decisions based on
data. Decision theory formalizes this process by considering the available choices (actions),
possible outcomes, and the probabilities associated with each outcome. In the context of
classification, for example, the decision-making process involves selecting a class label for a
given input based on the observed data.
2. Utility Theory: Decision theory often employs utility functions to quantify the desirability of
different outcomes. In machine learning, utility functions may represent various objectives, such
as accuracy, precision, recall, or any other performance metric relevant to the specific application.
By maximizing expected utility (or minimizing expected loss), machine learning algorithms can
make decisions that lead to the most desirable outcomes on average (a short sketch of this appears after this list).
3. Loss Functions: In supervised learning, the choice of a loss function plays a crucial role in
training machine learning models. Loss functions quantify the discrepancy between predicted
outcomes and true outcomes. Different loss functions correspond to different notions of error, and
their selection depends on the specific characteristics of the problem at hand. Decision theory
provides a principled framework for choosing appropriate loss functions based on the decision-
making objectives and the underlying uncertainty.
4. Bayesian Decision Theory: Bayesian decision theory is particularly relevant in machine learning,
especially in probabilistic modeling and Bayesian inference. It combines Bayesian probability
theory with decision theory to make optimal decisions in uncertain environments. In Bayesian
decision theory, decisions are made by considering both prior knowledge (expressed as a
probability distribution) and observed data, leading to posterior decisions that maximize expected
utility.
5. Risk Minimization: Machine learning algorithms often aim to minimize some notion of risk,
which encompasses the expected loss or error over the entire input space. Decision theory
provides a formal framework for risk minimization, allowing practitioners to design learning
algorithms that make decisions with desirable properties, such as robustness to uncertainty and
generalization to unseen data.
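As a minimal sketch of the expected-loss idea referenced above, the posterior probabilities and loss matrix below are hypothetical values chosen only for illustration.

import numpy as np

# Posterior class probabilities for one input, e.g. P(healthy), P(disease) (hypothetical).
posterior = np.array([0.7, 0.3])

# Loss matrix: loss[action, true_class]; missing a disease (row 0, column 1) is costly.
loss = np.array([[0.0, 10.0],    # action: predict "healthy"
                 [1.0,  0.0]])   # action: predict "disease"

expected_loss = loss @ posterior        # expected loss of each action under the posterior
best_action = int(np.argmin(expected_loss))

print(expected_loss)   # [3.0, 0.7]
print(best_action)     # 1: predicting "disease" minimizes expected loss even though P(disease) < 0.5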
17.INFORMATION THEORY
Machine learning aims to extract interesting signals from data and make critical predictions. On the other
hand, information theory studies encoding, decoding, transmitting, and manipulating information.
Information theory plays a crucial role in the mathematical foundation of machine learning, particularly
in understanding and quantifying various aspects of data, learning, and communication. Here's how
information theory is incorporated into machine learning:
1. Entropy: Entropy is a fundamental concept in information theory that measures the uncertainty or
randomness of a random variable. In machine learning, entropy is often used in decision trees and
other classification algorithms to quantify the impurity or disorder of a set of labels. By
minimizing entropy, these algorithms can construct decision boundaries that effectively separate
different classes (a short sketch of entropy appears after this list).
4. Mutual Information: Mutual information measures the amount of information that two random
variables share. In machine learning, mutual information is used in feature selection and feature
extraction to identify informative features that are relevant to the prediction task. By maximizing
mutual information between features and labels, machine learning algorithms can focus on the
most discriminative aspects of the data.
5. Compression and Coding: Information theory provides insights into data compression and
coding techniques, which are essential for reducing the storage and transmission costs of data. In
machine learning, compression algorithms can be used to pre-process data and reduce its
dimensionality, leading to more efficient learning algorithms and improved generalization
performance.
6. Channel Capacity: Channel capacity represents the maximum rate at which information can be
reliably transmitted over a communication channel. In machine learning, understanding channel
capacity can help in designing efficient communication protocols for distributed learning systems,
where data is transmitted between multiple nodes or devices.
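As a minimal sketch of the entropy and mutual-information ideas referenced above, using a hypothetical joint distribution of one binary feature and a binary label:

import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                          # treat 0 * log(0) as 0
    return -np.sum(p * np.log2(p))        # entropy in bits

print(entropy([0.5, 0.5]))   # 1.0 bit: perfectly mixed labels (maximum uncertainty)
print(entropy([0.9, 0.1]))   # about 0.47 bits: a skewed, "purer" label distribution

# Mutual information from a hypothetical joint distribution of a feature X and a label Y.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
px, py = joint.sum(axis=1), joint.sum(axis=0)
mi = entropy(px) + entropy(py) - entropy(joint.ravel())   # I(X;Y) = H(X) + H(Y) - H(X,Y)
print(round(mi, 3))   # about 0.278 bits: the feature carries information about the label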