Introduction To ML

Machine learning is a branch of artificial intelligence that uses algorithms to identify patterns in data and learn from that data in order to make predictions or decisions without being explicitly programmed. The goal of machine learning is for computers to be able to learn from examples or past experiences to improve their performance on some task. Some key aspects covered in the document include the different types of machine learning tasks like classification, clustering, and prediction, as well as the basic components of a machine learning system like the hypothesis space, search strategy, and evaluation method.

Uploaded by

Pooja Patwari

INTRODUCTION

What is machine learning?

 Goal: programs that detect patterns and regularities in the data
 Strong patterns → good predictions
 Problem 1: most patterns are not interesting
 Problem 2: patterns may be inexact (or spurious)
 Problem 3: data may be garbled or missing
Related Disciplines
 Artificial Intelligence
 Data Mining
 Probability and Statistics
 Information theory
 Numerical optimization
 Computational complexity theory
 Control theory (adaptive)
 Psychology (developmental, cognitive)
 Neurobiology
 Linguistics
 Philosophy

What is machine learning?
 A branch of artificial intelligence, concerned with the
design and development of algorithms that allow computers
to evolve behaviors based on empirical data.
 As intelligence requires knowledge, it is necessary for the
computers to acquire knowledge.
 Motivation: a flood of data, highly complex systems, and the speed of programming (supermarkets, banks, telephone switches, research, medicine, Google, etc.). Is there any alternative?
 A program is said to learn from experience E with respect to task T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
 Machine learning is programming computers to optimize a
performance criterion using example data or past experience
What is ML?
 An algorithm is a sequence of instructions that when
carried out transforms input to output.
 There are tasks for which we have no algorithm.
 Consider sorting: what if, instead of writing a sorting algorithm, we gave a program a number of examples of unsorted lists and the corresponding sorted lists, and wanted the program to learn (or come up with) an algorithm to sort?
 Can a program learn the pattern in the data?
 To be intelligent, a system that is in a changing environment
should have the ability to learn.
 If a system can learn and adapt to such changes, the system
designer need not foresee and provide solutions for all
possible situations.
LEARNING
There are two ways that a system can improve:
1. By acquiring new knowledge
 acquiring new facts
 acquiring new skills
2. By adapting its behavior
 solving problems more accurately
 solving problems more efficiently
Why do we need Machine Learning?
• Some tasks cannot be defined well, except by examples (e.g. recognition of
faces or people).

• Large amounts of data may have hidden relationships and correlations.

Only automated approaches may be able to detect these.

• The amount of knowledge about a certain problem / task may be too large
for explicit encoding by humans (e.g. in medical diagnostics)

• Environments change over time, and new knowledge is constantly being discovered. A continuous redesign of the systems “by hand” may be difficult.
Some examples of tasks that are best solved by
using a learning algorithm

 Recognizing patterns:
 Facial identities or facial expressions

 Handwritten or spoken words

 Medical images

 Generating patterns:
 Generating images or motion sequences

 Recognizing anomalies:
 Unusual sequences of credit card transactions

 Unusual patterns of sensor readings in a nuclear power plant or unusual sound in your car engine.
 Prediction:
 Future stock prices or currency exchange rates
Some web-based examples of machine learning

 The web contains a lot of data. Tasks with very big datasets
often use machine learning
 especially if the data is noisy or non-stationary.

 Spam filtering, fraud detection:

 The enemy adapts so we must adapt too.

 Recommendation systems:
 Lots of noisy data. Million dollar prize!

 Information retrieval:
 Find documents or images with similar content.

 Data Visualization:
 Display a huge database in a revealing way
Learning task
• Classification:
 Prediction of an item class.
• Forecasting:
 Prediction of a parameter value.
• Characterization:
 Find hypotheses that describe groups of items.
• Clustering:
 Partitioning of the (unassigned) data set into clusters
with common properties. (Unsupervised learning)
Dataset and pre-processing
 Complexity of datasets:
• Many instances (examples)
• Instances with multiple features (properties / characteristics)
• Dependencies between the features (correlations)
 Instance selection:
 Remove identical / inconsistent / incomplete instances (e.g.
reduction of homologous genes, removal of wrongly annotated
genes)

 Feature transformation / selection:

 Projection techniques (e.g. principal components analysis)
 Compression techniques (e.g. minimum description length)
 Feature selection techniques
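The instance-selection step above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `select_instances`, the use of `None` to mark a missing value, and the toy rows are all hypothetical, and inconsistent instances (same features, different labels) are not handled here.

```python
def select_instances(rows):
    """Drop incomplete rows (containing None) and exact duplicates, keeping order."""
    seen = set()
    kept = []
    for row in rows:
        if any(value is None for value in row):
            continue            # incomplete instance
        if row in seen:
            continue            # identical to an earlier instance
        seen.add(row)
        kept.append(row)
    return kept

# Hypothetical toy data: (feature 1, feature 2, label)
raw = [
    (5.1, 3.5, "a"),
    (5.1, 3.5, "a"),    # duplicate of the first row
    (4.9, None, "b"),   # incomplete: feature 2 is missing
    (6.2, 2.9, "b"),
]
clean = select_instances(raw)
print(clean)
```

Rows are tuples so that they can be stored in a set; with list rows, a conversion to tuple would be needed first.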
Defining the Learning Task
Improve on task, T, with respect to
performance metric, P, based on experience, E.
T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself

T: Recognizing hand-written words

P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words

T: Driving on four-lane highways using vision sensors

P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while
observing a human driver.

T: Categorize email messages as spam or legitimate.

P: Percentage of email messages correctly classified.
E: Database of emails, some with human-given labels
Designing a Learning System
 Choose the training experience
 Choose exactly what is to be learned, i.e. the
target function.
 Choose how to represent the target function.
 Choose a learning algorithm to infer the target
function from the experience.

[Figure: learning system with Environment/Experience, Learner, Knowledge, and Performance Element]
What is ML?

Can we improve investment gain with help of stock data?

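The learning Model
Understanding Hypothesis space
How many possible Boolean functions are there?

4 features give 2^4 = 16 input combinations, so there are 2^16 = 65536 functions

After 7 examples, 9 input combinations remain unseen, so we still have 2^9 = 512 possibilities

Hypothesis space: the space of all hypotheses that can be output by a learning algorithm

Version space: the part of the hypothesis space not ruled out by the training examples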
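The counts on this slide can be checked with a short calculation. The sketch below assumes the 7 training examples are distinct and noise-free; the function names are hypothetical.

```python
def count_boolean_functions(n_features):
    """Number of distinct Boolean functions over n binary features."""
    n_inputs = 2 ** n_features      # rows in the truth table
    return 2 ** n_inputs            # each row can independently map to 0 or 1

def version_space_size(n_features, n_distinct_examples):
    """Functions still consistent after labels for some distinct inputs are seen."""
    unseen_rows = 2 ** n_features - n_distinct_examples
    return 2 ** unseen_rows         # every unseen row is still free

total = count_boolean_functions(4)
remaining = version_space_size(4, 7)
print(total, remaining)
```

Each observed example fixes one row of the truth table, which halves the number of surviving functions.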
Learning as search
 Inductive learning: find a concept description that fits the data
 Example: rule sets as description language
 Enormous, but finite, search space
 Simple solution:
 enumerate the concept space
 eliminate descriptions that do not fit examples
 surviving descriptions contain target concept
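The "simple solution" above can be sketched for a very small concept space. As an assumption for illustration, the description language here is the set of all Boolean functions of two binary features (16 hypotheses), and the three training examples are made up.

```python
from itertools import product

# All inputs for two binary features: (0,0), (0,1), (1,0), (1,1)
inputs = list(product([0, 1], repeat=2))

# Enumerate the concept space: one hypothesis per possible truth table
hypotheses = [dict(zip(inputs, outputs))
              for outputs in product([0, 1], repeat=len(inputs))]

# Hypothetical training examples: (input, label)
examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1)]

# Eliminate every description that disagrees with some example
survivors = [h for h in hypotheses
             if all(h[x] == y for x, y in examples)]

print(len(hypotheses), len(survivors))
```

The two survivors differ only on the unseen input (1, 1), so they correspond to OR and XOR. Enumeration like this is only feasible for tiny spaces, which is why practical learners search the space instead.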

witten&eibe
Uses of machine Learning
 Machine Learning creates an optimized model of the
concept being learned based on data or past
experience. The model is parameterized.
 Learning is the execution of a computer program to
optimize the parameter values so that the model fits
data or past experience well.
 Uses of learning: Predictive and/or Descriptive.
 Predictive: Use the model to predict things about an
unseen example.
 Descriptive: Use the model to describe the examples
seen or experiences had. This model can be used in
some problem-solving situation.
The basic principle
 10^5 machine learning algorithms
 Hundreds of new ones every year
 Every algorithm has three components:
1. Hypothesis space: the possible outputs (ANN, SVM, decision tree, Bayes network, etc.)
2. Search strategy: the strategy for exploring the space (optimizing an objective function)
3. Evaluation: accuracy, precision and recall, squared error, likelihood, posterior probability, cost / utility, margin
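A few of the evaluation measures listed above can be computed directly. This is a minimal sketch assuming binary labels coded as 0/1 with 1 as the positive class; the label vectors and function names are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical labels and predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
mse = mean_squared_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
print(accuracy, precision, recall, mse)
```

Accuracy, precision and recall suit classification, while squared error suits real-valued predictions.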
Learning system model
[Figure: learning system model with Input Samples, Learning Method, System, Training, and Testing blocks]
Training and testing
[Figure: data acquisition and practical usage]
 Universal set (unobserved)
 Training set (observed): labels are known
 Testing set (unobserved): labels are known but not given
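The training/testing separation can be sketched as a random split of the observed data. The helper name `train_test_split`, the 25% test fraction, and the toy data are assumptions for illustration, not part of the slides.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    rng = random.Random(seed)           # fixed seed makes the split repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled data: (input, label)
data = [(x, x % 2) for x in range(20)]
train, test = train_test_split(data)
print(len(train), len(test))
```

The model is fitted on `train` only; the labels in `test` are kept back and used solely to measure performance.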
Performance
 There are several factors affecting the performance:
 Types of training provided
 The form and extent of any initial background knowledge
 The type of feedback provided
 The learning algorithms used

 Two important factors:

 Modeling
 Optimization
Algorithms
 The success of a machine learning system also depends on the algorithms.
 The algorithms control the search to find and build the knowledge structures.
 The learning algorithms should extract useful information from training examples.
Algorithms
 Supervised learning
 Prediction
 Classification (discrete labels), Regression (real values)
 Unsupervised learning
 Clustering
 Probability distribution estimation
 Finding association (in features)
 Dimension reduction [NO FEEDBACK]
 Semi-supervised learning
 Reinforcement learning [INDIRECT FEEDBACK]
 Decision making (robot, chess machine)
Types of learning task
 Supervised learning
 Learn to predict output when given an input vector
 Who provides the correct answer?
 Reinforcement learning
 Learn action to maximize payoff
 Not much information in a payoff signal
 Payoff is often delayed
 Reinforcement learning is an important area that will not be
covered in this course.
 Unsupervised learning
 Create an internal representation of the input e.g. form
clusters; extract features
 How do we know if a representation is good?
 This is the new frontier of machine learning because most big
datasets do not come with labels.
Algorithms
[Figure: supervised learning, unsupervised learning, and semi-supervised learning]
Machine learning structure
[Figure: supervised learning]
Machine learning structure
[Figure: unsupervised learning]
Semi-supervised learning (SSL)

 Traditional supervised learning is limited to using labeled data.

 SSL also uses unlabeled data to learn.

Let (x,y) be a labeled instance and (x,ø) be an unlabeled instance.

L: a set of n labeled instances.
U: a set of m unlabeled instances.
n << m
SSL tries to use L ∪ U to learn a predictive model.
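One simple way to use L ∪ U is self-training, sketched below. The slides do not name a specific SSL method, so this choice is an assumption: a 1-D nearest-centroid classifier repeatedly pseudo-labels the unlabeled point it is most confident about (the one closest to a centroid). All names and data are hypothetical.

```python
def centroids(labeled):
    """Mean x value per class label."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(cents, x):
    """Label of the nearest centroid."""
    return min(cents, key=lambda y: abs(x - cents[y]))

def self_train(L, U):
    """Grow the labeled set by pseudo-labeling one unlabeled point at a time."""
    labeled, unlabeled = list(L), list(U)
    while unlabeled:
        cents = centroids(labeled)
        # Most confident point: smallest distance to any centroid
        best = min(unlabeled,
                   key=lambda x: min(abs(x - c) for c in cents.values()))
        labeled.append((best, predict(cents, best)))
        unlabeled.remove(best)
    return centroids(labeled)

L = [(0.0, "a"), (10.0, "b")]             # n = 2 labeled instances
U = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]        # m = 6 unlabeled instances
final = self_train(L, U)
print(final, predict(final, 4.0), predict(final, 6.0))
```

Self-training can reinforce its own mistakes when early pseudo-labels are wrong, so it is a sketch of the idea rather than a recommended method.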
Learning techniques
• Linear classifier: the prediction depends on the input only through a linear function parameterized by w, where w is a d-dim vector (learned)

 Techniques:
 Perceptron
 Logistic regression
 Support vector machine (SVM)
 Adaline (adaptive linear neuron)
 Multi-layer perceptron (MLP)
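As an illustration of the first technique in the list, here is a minimal perceptron sketch. It assumes the common form sign(w·x + b) with labels in {-1, +1}; the bias term b, the function names, and the toy AND dataset are assumptions, since the slide's formula did not survive extraction.

```python
def predict(w, b, x):
    """Linear decision rule: +1 if w.x + b >= 0, else -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def train_perceptron(data, epochs=20):
    """Classic perceptron updates: adjust w and b only on mistakes."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            if predict(w, b, x) != y:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:       # every training point is classified correctly
            break
    return w, b

# Hypothetical linearly separable data: logical AND with labels -1 / +1
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = train_perceptron(data)
print(w, b, [predict(w, b, x) for x, _ in data])
```

The learned w is the d-dim vector mentioned above (here d = 2). The update rule is only guaranteed to converge when the data are linearly separable.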
Learning techniques
• Non-linear case

 Support vector machine (SVM):

 Linear to nonlinear: Feature transform and kernel function
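The "feature transform and kernel function" idea can be illustrated with two small checks. Both are sketches under stated assumptions: the degree-2 polynomial kernel and its feature map `phi` are one standard example chosen here, and the XOR rule uses hand-picked weights rather than learned ones.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def phi(x):
    """Explicit degree-2 feature map for a 2-D input."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Same inner product as dot(phi(x), phi(z)), computed without phi."""
    return dot(x, z) ** 2

def xor_rule(x):
    """A rule that is linear in the features (x1, x2, x1*x2) and computes XOR."""
    x1, x2 = x
    score = x1 + x2 - 2 * (x1 * x2) - 0.5
    return 1 if score > 0 else 0

x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))      # inner product in the transformed space
implicit = poly_kernel(x, z)        # kernel evaluated in the original space
xor_outputs = [xor_rule(p) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(explicit, implicit, xor_outputs)
```

XOR cannot be separated by a line in the original two features, but becomes linearly separable once the product feature x1*x2 is added. The kernel gives the same inner products as the transform without building the transformed vectors.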
Learning techniques
 Unsupervised learning categories and techniques
 Clustering
 K-means clustering

 Spectral clustering

 Density Estimation
 Gaussian mixture model (GMM)

 Graphical models

 Dimensionality reduction
 Principal component analysis (PCA)

 Factor analysis
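K-means, the first clustering technique listed, alternates between assigning points to the nearest center and moving each center to the mean of its points. The sketch below is a 1-D version with hypothetical data and hand-chosen initial centers; the function name is an assumption.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd-style k-means on 1-D data with given initial centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # Update step: each center moves to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]       # hypothetical data
centers, clusters = kmeans_1d(points, centers=[1.0, 9.0])
print(centers, clusters)
```

No labels are used anywhere, which is what makes this unsupervised. The result can depend on the initial centers.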
Classification
 There are three methodologies:
a) Model a classification rule directly
Examples: k-NN, linear classifier, SVM, neural nets, …
b) Model the probability of class memberships given input data
Examples: logistic regression, probabilistic neural nets (softmax),…
c) Make a probabilistic model of data within each class
Examples: naive Bayes, model-based ….
 Important ML taxonomy for learning models
probabilistic models vs non-probabilistic models
discriminative models vs generative models
 Resulting model is also called the hypothesis

Classification
[Figure: Algorithm and Model, with example animal classes: zebra, tiger, rhino, panda, lion, hippo, elephant, giraffe, penguin, snake]

Given a model space and an optimality criterion, a model satisfying this criterion is sought
Some optimizing criteria:

 Maximizing the prediction accuracy

 Minimizing the hypothesis’ size
 Maximizing the hypothesis fitness to the input data
 Maximizing the hypothesis interpretability
 Minimizing the time complexity of prediction
Classification
Learn a method for predicting the instance class from
pre-labeled (classified) instances

Many approaches:
Regression,
Decision Trees,
Bayesian,
Neural Networks,
...

Given a set of points from known classes, what is the class of a new point?
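One direct way to answer the question above is k-nearest neighbours (k-NN), listed earlier among the methods that model a classification rule directly. This is a minimal sketch; the points, class names, and k = 3 are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to the query."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical labeled points from two classes
train = [
    ((1.0, 1.0), "red"), ((1.5, 2.0), "red"), ((2.0, 1.5), "red"),
    ((6.0, 6.0), "blue"), ((6.5, 7.0), "blue"), ((7.0, 6.5), "blue"),
]
print(knn_predict(train, (2.0, 2.0)), knn_predict(train, (6.0, 7.0)))
```

k-NN stores the training set instead of fitting parameters, so prediction cost grows with the number of training points.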
Linear and Non-Linear Decision boundary
Regression
• Regression analysis is used to predict the value of one variable (the
dependent variable) on the basis of other variables (the
independent variables).
• Learn a continuous function.

• Given the following data, can we find the value of the output when x = 0.44?
• The goal is to predict, for an input x, an output f(x) that is close to the true y.
• It is generally a problem of function approximation, or interpolation: working out the value between values that we know.
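The x = 0.44 question can be sketched with a least-squares line. The slide's data table did not survive extraction, so the data below are made up (points on y = 2x + 1), and a straight line is only one possible choice for f.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data, not the values from the original slide
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1.0, 1.5, 2.0, 2.5, 3.0]
slope, intercept = fit_line(xs, ys)
prediction = slope * 0.44 + intercept     # value between the known points
print(slope, intercept, prediction)
```

Because 0.44 lies between known x values, this is interpolation; predicting far outside the observed range would be much less reliable.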
