INTRODUCTION TO MACHINE LEARNING TECHNIQUES
INTRODUCTION TO MACHINE LEARNING
3RD EDITION
ETHEM ALPAYDIN
© The MIT Press, 2014
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e
Lecture Slides for
CHAPTER 1:
INTRODUCTION
3
Big Data
• Widespread use of personal computers and wireless communication leads to “big data”
• We are both producers and consumers of data
• Data is not random; it has structure, e.g., customer behavior
• We need “big theory” to extract that structure from data for
  (a) Understanding the process
  (b) Making predictions for the future
4
Why “Learn”?
• Machine learning is programming computers to optimize a performance criterion using example data or past experience.
• There is no need to “learn” to calculate payroll
• Learning is used when:
  • Human expertise does not exist (navigating on Mars)
  • Humans are unable to explain their expertise (speech recognition)
  • The solution changes over time (routing on a computer network)
  • The solution needs to be adapted to particular cases (user biometrics)
5
What We Talk About When We Talk About “Learning”
• Learning general models from data of particular examples
• Data is cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce.
• Example in retail: From customer transactions to consumer behavior:
  People who bought “Blink” also bought “Outliers” (www.amazon.com)
• Build a model that is a good and useful approximation to the data.
6
Data Mining
• Retail: Market basket analysis, customer relationship management (CRM)
• Finance: Credit scoring, fraud detection
• Manufacturing: Control, robotics, troubleshooting
• Medicine: Medical diagnosis
• Telecommunications: Spam filters, intrusion detection
• Bioinformatics: Motifs, alignment
• Web mining: Search engines
• ...
7
What is Machine Learning?
• Optimize a performance criterion using example data or past experience.
• Role of statistics: Inference from a sample
• Role of computer science: Efficient algorithms to
  • Solve the optimization problem
  • Represent and evaluate the model for inference
8
Applications
• Association
• Supervised Learning
  • Classification
  • Regression
• Unsupervised Learning
• Reinforcement Learning
9
Learning Associations
• Basket analysis:
  P(Y | X): the probability that somebody who buys X also buys Y, where X and Y are products/services.
  Example: P(chips | beer) = 0.7
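
The conditional probability above can be estimated from transaction data by counting. The following is a minimal sketch, assuming each basket is a set of product names; the function name `conditional_probability` and the toy baskets are hypothetical and chosen only so that the estimate reproduces the 0.7 from the slide.

```python
def conditional_probability(baskets, x, y):
    """Estimate P(y | x) as (# baskets with x and y) / (# baskets with x)."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        raise ValueError(f"no basket contains {x!r}")
    return sum(1 for b in with_x if y in b) / len(with_x)


# Hypothetical toy data: 10 baskets contain beer, 7 of those also contain chips.
baskets = (
    [{"beer", "chips"}] * 7
    + [{"beer"}] * 3
    + [{"chips", "milk"}] * 5   # baskets without beer do not affect P(chips | beer)
)

p = conditional_probability(baskets, "beer", "chips")
print(p)  # 0.7
```

Note that P(chips | beer) and P(beer | chips) generally differ: in this toy data the second is 7/12, because 12 baskets contain chips.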
10
Classification
• Example: Credit scoring
• Differentiating between low-risk and high-risk customers from their income and savings
Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
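
The discriminant translates directly into a two-threshold rule. This is a sketch only: the threshold values `THETA1` and `THETA2` below are made-up numbers for illustration, whereas in practice they would be learned from labeled customer data.

```python
# Hypothetical thresholds; a learning algorithm would fit these from data.
THETA1 = 30_000   # income threshold (θ1)
THETA2 = 5_000    # savings threshold (θ2)


def classify(income, savings, theta1=THETA1, theta2=THETA2):
    """Apply the rule: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"


print(classify(50_000, 10_000))  # low-risk
print(classify(50_000, 1_000))   # high-risk (savings below θ2)
```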
11
Classification: Applications
• Also known as pattern recognition
• Face recognition: Pose, lighting, occlusion (glasses, beard), make-up, hair style
• Character recognition: Different handwriting styles
• Speech recognition: Temporal dependency
• Medical diagnosis: From symptoms to illnesses
• Biometrics: Recognition/authentication using physical and/or behavioral characteristics: Face, iris, signature, etc.
• Outlier/novelty detection
12
Face Recognition
[Figure: training examples of a person and test images, from the ORL dataset, AT&T Laboratories, Cambridge UK]
13
Regression
• Example: Price of a used car
• x: car attributes
  y: price
  y = g(x | θ)
  g(·): model, θ: parameters
For example, a linear model: y = wx + w0
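
As an illustration of fitting the linear model y = wx + w0, the sketch below uses ordinary least squares. The slide does not name a fitting criterion, so least squares is an assumption here, and the mileage and price figures are invented toy data.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w*x + w0 for a single input attribute."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx              # slope
    w0 = mean_y - w * mean_x   # intercept
    return w, w0


# Hypothetical data: mileage (in 10,000 km) and price (in $1,000).
mileage = [1.0, 2.0, 3.0, 4.0, 5.0]
price = [18.0, 16.0, 14.0, 12.0, 10.0]

w, w0 = fit_line(mileage, price)
predicted = w * 6.0 + w0   # predicted price for a car with 60,000 km
print(w, w0, predicted)
```

Here θ from the slide corresponds to the pair (w, w0); with more than one car attribute, w becomes a vector and the same criterion applies.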
14
Regression Applications
• Navigating a car: Angle of the steering wheel
• Kinematics of a robot arm: joint angles α1 and α2 as functions of the target position (x, y)
  α1 = g1(x, y)
  α2 = g2(x, y)
• Response surface design
15
Supervised Learning: Uses
• Prediction of future cases: Use the rule to predict the output for future inputs
• Knowledge extraction: The rule is easy to understand
• Compression: The rule is simpler than the data it explains
• Outlier detection: Exceptions that are not covered by the rule, e.g., fraud
16
Unsupervised Learning
• Learning “what normally happens”
• No output
• Clustering: Grouping similar instances
• Example applications
  • Customer segmentation in CRM
  • Image compression: Color quantization
  • Bioinformatics: Learning motifs
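
To make clustering and color quantization concrete, here is a minimal sketch using k-means on one-dimensional gray levels. The slide does not name a clustering algorithm, so k-means is an assumption, and the pixel values and initial centers are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on scalars: assign each value to its nearest center, then re-average."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster received no values.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers


# Hypothetical gray levels of six pixels: three dark, three bright.
gray = [10, 12, 14, 200, 202, 204]
centers = kmeans_1d(gray, centers=[0, 255])

# Quantization: replace every pixel by the center of its cluster.
quantized = [min(centers, key=lambda c: abs(v - c)) for v in gray]
print(centers)    # [12.0, 202.0]
print(quantized)  # [12.0, 12.0, 12.0, 202.0, 202.0, 202.0]
```

The compression comes from storing one cluster index per pixel plus the short list of centers, instead of a full gray level per pixel. No output labels are used at any point, which is what makes this unsupervised.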
17
Reinforcement Learning
• Learning a policy: A sequence of outputs
• No supervised output but delayed reward
• Credit assignment problem
• Game playing
• Robot in a maze
• Multiple agents, partial observability, ...
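
The "robot in a maze" setting with a delayed reward can be sketched with tabular Q-learning. The slide names no algorithm, so Q-learning is a swapped-in choice for illustration, and the five-cell corridor, the reward of 1 at the goal, and the parameter values are all hypothetical.

```python
import random

N_STATES = 5                    # cells 0..4 of a corridor; cell 4 is the goal
ACTIONS = ("left", "right")


def step(state, action):
    """Move one cell; the only reward (1.0) arrives on reaching the goal."""
    nxt = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def q_learning(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)          # explore with random moves
            nxt, reward = step(state, action)
            # Credit assignment: pass the discounted future value back one step.
            target = reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # ['right', 'right', 'right', 'right']
```

Although the reward is only seen at the last step, the learned values decrease with distance from the goal, so earlier moves receive discounted credit for the eventual reward.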
18
Resources: Datasets
• UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
• Statlib: http://lib.stat.cmu.edu/
19
Resources: Journals
• Journal of Machine Learning Research (www.jmlr.org)
• Machine Learning
• Neural Computation
• Neural Networks
• IEEE Transactions on Neural Networks and Learning Systems
• IEEE Transactions on Pattern Analysis and Machine Intelligence
• Journals on Statistics/Data Mining/Signal Processing/Natural Language Processing/Bioinformatics/...
20
Resources: Conferences
• International Conference on Machine Learning (ICML)
• European Conference on Machine Learning (ECML)
• Neural Information Processing Systems (NIPS)
• Uncertainty in Artificial Intelligence (UAI)
• Computational Learning Theory (COLT)
• International Conference on Artificial Neural Networks (ICANN)
• International Conference on AI & Statistics (AISTATS)
• International Conference on Pattern Recognition (ICPR)
• ...
