IMPORTANT MACHINE LEARNING ALGORITHMS
This section gives an overall view of the important machine learning algorithms. These algorithms are discussed in detail in the subsequent chapters of this book. The algorithms presented were selected based on their popularity as well as their performance. Let us discuss some of these algorithms now.
Supervised Algorithms
Supervised algorithms include classification algorithms and regression algorithms.
Classification algorithms classify an unknown instance by assigning a label to it. Some of the important classification algorithms are listed below.
Decision Tree Algorithm
The decision tree (DT) algorithm was developed by J. Ross Quinlan; the first algorithm was called the Iterative Dichotomiser 3 (ID3) algorithm. The algorithm takes a training dataset as input and produces a decision tree. Each branch in the tree represents a possible outcome of a decision based on some condition. A decision tree is a simple representation for classifying instances.
A decision tree (DT) consists of nodes and edges. The nodes may be internal (known as non-leaf) or external (known as leaf). DT is one of the most important classification algorithms, and its reasoning is close to human thinking. Each internal node tests an attribute, and the branches leaving it represent the possible values or outcomes of that attribute. Each leaf of the decision tree is marked with a class or a probability distribution. Classification rules can be obtained by tracing a path from the root to a leaf. The variations of this tree algorithm are ID3, C4.5, and CART.
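As a concrete illustration, the following is a minimal sketch of training a decision tree with scikit-learn's DecisionTreeClassifier (a CART implementation); the iris dataset and the depth limit are assumptions made for this example.

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Load a small labelled dataset: 150 flower samples with 4 attributes each.
    data = load_iris()
    X, y = data.data, data.target

    # Grow a CART-style tree; the depth cap keeps the printed rules short.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X, y)

    # Classify an unknown instance by tracing it from the root to a leaf.
    print(tree.predict([[5.1, 3.5, 1.4, 0.2]]))

    # Each root-to-leaf path printed below is one classification rule.
    print(export_text(tree, feature_names=list(data.feature_names)))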
Advantages:
Useful for problems whose classification boundaries are linearly separable
Fast
Accurate
Understandable and easily interpretable
Rules can be generated from the decision tree
Disadvantage:
Computationally expensive when there are a lot of uncorrelated attributes
Random Forest Algorithm
This algorithm was designed by Tin Kam Ho in 1995 and later extended by Leo Breiman and Adele Cutler. The approach creates a group of decision trees by randomizing the selection of features. The outputs of the individual trees are then combined, by pooling their results or by taking a majority vote, to get the final solution.
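The sketch below, again using scikit-learn (an assumed library choice, since the text does not prescribe one), shows the randomized ensemble of trees combined by majority vote; n_jobs=-1 illustrates that the trees can be grown in parallel.

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)

    # Build 100 trees, each trained on a bootstrap sample and considering
    # a random subset of features at every split; n_jobs=-1 grows the
    # trees in parallel on all available cores.
    forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                    n_jobs=-1, random_state=0)
    forest.fit(X, y)

    # The final prediction pools the individual trees by majority vote.
    print(forest.predict([[5.1, 3.5, 1.4, 0.2]]))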
Advantages:
Provides high classification accuracy
Can be parallelized
Disadvantage:
The theoretical analysis of this algorithm is difficult
The algorithm has been used in products such as DeepSpeech.
Support Vector Machines
The Support Vector Machines (SVM) algorithm was developed by Vladimir N. Vapnik and Alexey Ya. Chervonenkis in 1963. The concept of kernels was developed by Bernhard E. Boser, Isabelle M. Guyon, and Vladimir N. Vapnik in 1992.
SVM is a binary classifier that uses the decision boundary with maximal margin to assign new examples to one of two categories. SVM is a classification method that, with kernels, is also good at classifying samples from problems that are not linearly separable.
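As a hedged sketch (the dataset and kernel choice are assumptions for illustration), the following uses scikit-learn's SVC with an RBF kernel to separate a dataset that is not linearly separable.

    from sklearn.datasets import make_moons
    from sklearn.svm import SVC

    # A toy two-class dataset whose classes cannot be split by a straight line.
    X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

    # The RBF kernel implicitly maps the samples to a higher-dimensional
    # space, where a maximal-margin separating boundary can be found.
    svm = SVC(kernel="rbf", C=1.0)
    svm.fit(X, y)

    print(svm.predict([[0.5, 0.25]]))    # assign a new example to a category
    print(svm.support_vectors_.shape)    # the samples that define the margin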
Advantages:
Good generalization
Flexibility
Robustness
Disadvantages:
Slow
High algorithmic complexity
Naïve Bayes
This is a family of algorithms that includes the Bayesian network as well. Naive Bayes treats all features as independent of each other. The algorithm uses Bayes' theorem, which was proposed by Reverend Thomas Bayes, for classification. It computes the posterior probability of each class from the prior probability of the class and the likelihood of the observed features. This algorithm is most useful when the features really are independent of each other.
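A minimal sketch with scikit-learn's GaussianNB (the library and dataset are illustrative assumptions): the classifier estimates a prior for each class and a per-feature likelihood, then combines them through Bayes' theorem.

    from sklearn.datasets import load_iris
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)

    # Fit class priors and per-feature Gaussian likelihoods from the data.
    nb = GaussianNB()
    nb.fit(X, y)

    # predict_proba returns the posterior probability of each class,
    # computed as prior x likelihood and then normalized.
    print(nb.predict_proba([[5.1, 3.5, 1.4, 0.2]]))
    print(nb.predict([[5.1, 3.5, 1.4, 0.2]]))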
Advantages:
Fast
Easily understandable
Robust
Disadvantages:
The independence assumption rarely holds exactly in practice
Feature values never seen during training receive zero probability unless smoothing is applied
Markov and Hidden Markov Models
Markov models are probabilistic sequence models in which the system makes the Markov assumption. Imagine one records a sequence of climate conditions for 20 days: the climate is rainy, sunny, rainy, sunny, sunny, and so on. This is called sequence data. The aim of this problem is to answer questions such as: will it be rainy or sunny tomorrow?
The Markov assumption means one need not consider the entire sequence data. Instead, the present day depends only on the previous data. This implies that today's climate is based on yesterday's climate.
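The climate example can be turned into a small first-order Markov chain. The sketch below (the 20-day sequence is invented for illustration) estimates transition probabilities from the data and predicts tomorrow from today alone.

    from collections import Counter

    # An invented 20-day sequence of daily climate observations.
    days = ["rainy", "sunny", "rainy", "sunny", "sunny", "sunny", "rainy",
            "rainy", "sunny", "sunny", "rainy", "sunny", "sunny", "sunny",
            "rainy", "sunny", "rainy", "rainy", "sunny", "sunny"]

    # Count transitions between consecutive days; under the Markov
    # assumption, tomorrow depends only on today.
    transitions = Counter(zip(days, days[1:]))
    totals = Counter(days[:-1])

    def p_next(today, tomorrow):
        # Estimated probability of tomorrow's climate given today's.
        return transitions[(today, tomorrow)] / totals[today]

    today = days[-1]
    print("P(rainy | %s) = %.2f" % (today, p_next(today, "rainy")))
    print("P(sunny | %s) = %.2f" % (today, p_next(today, "sunny")))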
The Markov model, in the form of the Markov chain, was proposed by Andrey Markov in 1906. Later, the Hidden Markov Model (HMM) was developed by L. E. Baum and colleagues in the 1960s. In an HMM, the states themselves are not fully observable; only outputs that depend on the states can be observed.
Markov models and HMMs can be used for prediction. Speech recognition is one application where these models are in great demand.
Artificial Neural Networks
The artificial neural network (ANN) is modelled on the human brain as several interconnected neurons. These networks receive a set of inputs, pass them through activation functions, and the resulting activations of further neurons form a model. These models can be trained and later used to classify unknown test inputs. ANNs can be used for classification, regression, and clustering.
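The sketch below uses scikit-learn's MLPClassifier as one concrete (assumed) realization: a small feed-forward network of interconnected neurons with a nonlinear activation, trained on labelled data and then applied to an unseen input.

    from sklearn.datasets import load_iris
    from sklearn.neural_network import MLPClassifier

    X, y = load_iris(return_X_y=True)

    # One hidden layer of 10 neurons; each neuron applies the ReLU
    # activation function to a weighted sum of its inputs.
    net = MLPClassifier(hidden_layer_sizes=(10,), activation="relu",
                        max_iter=2000, random_state=0)
    net.fit(X, y)    # training adjusts the connection weights

    # Apply the trained model to an unknown test input.
    print(net.predict([[5.1, 3.5, 1.4, 0.2]]))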
Advantages:
Good for classification tasks and applications such as chatbots
Simple to implement
Effective
Disadvantages:
Longer training time, and large amounts of data are required for higher quality results
Feed forward networks are classifiers. Networks like Self-Organizing Maps (SOM) can be used for clustering. These algorithms are discussed in Chapter 10 of this book.
Deep networks are an extension of neural networks. Any neural network that has more than two hidden layers is called a deep network. Deep networks are discussed in Chapter 16 of this book.
Some of the classic applications of deep neural networks are face recognition, image recognition,
recommendation systems, and driverless cars.
Regression Algorithms
Linear regression is used to model the linear relationship between dependent and independent
variables. Linear regression is used when the response involves continuous variables.
Linear regression originated from the method of least squares, which was proposed by Legendre in 1805 and by Gauss in 1809. Francis Galton coined the term 'regression'.
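A minimal least-squares sketch in NumPy (the data points are invented for illustration): the closed-form solution fits a line minimizing the sum of squared residuals between the continuous response and the independent variable.

    import numpy as np

    # Invented observations of an independent variable x and a
    # continuous response y.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Design matrix [x, 1] so the model is y = slope * x + intercept.
    A = np.column_stack([x, np.ones_like(x)])

    # Least squares: minimize ||A @ [slope, intercept] - y||^2.
    (slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
    print("y = %.2f * x + %.2f" % (slope, intercept))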
Types of regression algorithms:
Polynomial regression
Multiple regression
Logistic regression
Ridge/ Lasso/ Elastic net regression
Disadvantages:
Large datasets are required to uncover reliable relationships
The algorithm also assumes that the variables are independent of each other, which rarely holds in practice
Unsupervised Algorithms
Some of the important unsupervised algorithms are listed below.
k-means Algorithm
It is a non-hierarchical clustering algorithm used to group objects that are similar to each other; this grouping is called cluster analysis. It was developed by James MacQueen in 1967. Here, k is the number of clusters the user wants. The algorithm randomly selects k points in the dataset as initial centroids and maps every sample to the closest cluster by computing the distance between the sample and each cluster centroid. This is an iterative algorithm: the centroids are recomputed and the samples reassigned until the cluster assignments no longer change.
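A minimal sketch with scikit-learn's KMeans (the library and the toy data are assumptions): k is fixed by the user, and the algorithm alternates between assigning samples to the nearest centroid and recomputing the centroids.

    import numpy as np
    from sklearn.cluster import KMeans

    # Two invented blobs of two-dimensional points.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 0.5, (50, 2)),
                   rng.normal(5, 0.5, (50, 2))])

    # The user chooses k; here k = 2 clusters.
    km = KMeans(n_clusters=2, n_init=10, random_state=0)
    km.fit(X)    # iterates assignment and centroid update until stable

    print(km.cluster_centers_)    # final centroids
    print(km.labels_[:10])        # cluster assigned to each sample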
Advantages:
Fast
Generated clusters tend to be spherical, and hence it is easy to learn the underlying structure of the data. For irregularly shaped clusters, such learning is difficult.
Easily understandable
Easily interpretable
Disadvantages:
The algorithm is sensitive to outliers and noise.
Different initial points yield different results
Choosing the value of 'k' in advance is difficult
Principal Component Analysis
Principal component analysis (PCA) is a dimensionality reduction algorithm. Some features contribute more to classification than others. For example, a mole on a face can help face detection more than common features like the nose. In simple words, the features should be relevant.
The idea of PCA, or the KL transform, is to transform a given set of measurements into a new set of features that exhibit high information-packing properties. This leads to a reduced and compact set of features. Basically, this elimination of unnecessary features is made possible because of the information redundancies present. The resulting compact representation is of reduced dimension.
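A minimal sketch with scikit-learn's PCA (an assumed implementation choice): the four iris measurements are transformed into two new features that pack most of the information, giving the compact, reduced-dimension representation described above.

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)

    # Project the 4 original measurements onto 2 principal components.
    pca = PCA(n_components=2)
    X_reduced = pca.fit_transform(X)

    print(X_reduced.shape)                  # (150, 2): compact representation
    print(pca.explained_variance_ratio_)    # information packed per component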
Apriori
This is a class of algorithms that uses unsupervised learning. It is used for association mining: the algorithm extracts association rules from the frequent itemsets that are present in the data.
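A minimal sketch using the third-party mlxtend library (an assumption; the toy basket data is also invented): frequent itemsets are mined first, and association rules are then extracted from them.

    import pandas as pd
    from mlxtend.frequent_patterns import apriori, association_rules

    # One-hot encoded market-basket data: each row is one transaction.
    df = pd.DataFrame(
        [[True, True, False], [True, True, True],
         [False, True, True], [True, False, True]],
        columns=["bread", "milk", "butter"])

    # Keep itemsets appearing in at least 50% of the transactions.
    itemsets = apriori(df, min_support=0.5, use_colnames=True)

    # Extract association rules with at least 70% confidence.
    rules = association_rules(itemsets, metric="confidence", min_threshold=0.7)
    print(rules[["antecedents", "consequents", "support", "confidence"]])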
Semi Supervised Algorithms
There are circumstances where the dataset has a huge collection of unlabelled data and only a few labelled samples. Labelling is a costly process and is difficult for humans to perform at scale.
Pseudo-labelling is a semi-supervised algorithm that makes use of the unlabelled data by assigning each unlabelled sample a pseudo-label. The labelled and pseudo-labelled datasets can then be combined to train a classifier.
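A minimal pseudo-labelling sketch (the synthetic data, the confidence threshold, and the choice of logistic regression are all assumptions for illustration):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Synthetic data: pretend only the first 20 samples are labelled.
    X, y = make_classification(n_samples=500, random_state=0)
    X_lab, y_lab, X_unlab = X[:20], y[:20], X[20:]

    # Step 1: train an initial classifier on the few labelled samples.
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)

    # Step 2: pseudo-label the unlabelled samples the model is confident
    # about (posterior probability above an assumed 0.95 threshold).
    proba = clf.predict_proba(X_unlab)
    confident = proba.max(axis=1) > 0.95
    pseudo_y = clf.classes_[proba.argmax(axis=1)][confident]

    # Step 3: retrain on the labelled and pseudo-labelled data combined.
    X_all = np.vstack([X_lab, X_unlab[confident]])
    y_all = np.concatenate([y_lab, pseudo_y])
    clf = LogisticRegression(max_iter=1000).fit(X_all, y_all)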
Reinforcement Algorithms
Q-Learning is known as an off-policy method. What is a Q-value? A Q-value is a numerical value assigned to a state-action pair; it is the value of the action that is performed at state 's'. The Q-value captures the immediate reward together with the discounted rewards that are yet to come; their sum is known as the total return. The Q-learning algorithm constructs a table, and then updates the Q-values of the table based on the starting state, the action, the reward, and the new state. The algorithm, say in a maze, simulates many paths, estimates the value of reaching the target state, and keeps updating these estimates in the table. The next action is the one whose cell has the highest Q-value for the current state. Finally, the table guides the agent to navigate.
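A minimal tabular Q-learning sketch on an invented five-cell corridor (the environment, learning rate, and discount factor are assumptions): the table is updated from (state, action, reward, new state) tuples, and acting randomly while learning the values of the greedy policy is exactly what makes Q-learning off-policy.

    import numpy as np

    n_states, n_actions = 5, 2      # corridor cells; actions: 0=left, 1=right
    GOAL = n_states - 1             # a reward is given at the rightmost cell
    alpha, gamma = 0.5, 0.9         # learning rate and discount factor
    rng = np.random.default_rng(0)

    Q = np.zeros((n_states, n_actions))   # the Q-table: one value per (s, a)

    for _ in range(500):                  # simulate many paths to the goal
        s = 0
        while s != GOAL:
            a = rng.integers(n_actions)   # explore with a random action
            s_next = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
            r = 1.0 if s_next == GOAL else 0.0

            # Update from (state, action, reward, new state): immediate
            # reward plus the discounted best future Q-value.
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next

    # The greedy policy follows the highest Q-value in each state's row.
    print(Q.round(2))
    print(["left" if q_left > q_right else "right" for q_left, q_right in Q])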