Unit 6
Introduction to Machine Learning
Machine Learning
Introduction
• Ex: Sort a list of numbers and search for a number in the list
• Ex: Segregate spam emails
• Definition of spam varies from person to person
• Model cannot be predefined
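The first example has a rule that can be written down in full, so it needs no learning. A minimal Python sketch of such a conventional program follows; the list values and the helper name `search_sorted` are made up for illustration. No comparable fixed rule exists for the spam example, which is why it calls for a learned model.

```python
from bisect import bisect_left


def search_sorted(sorted_numbers, target):
    """Binary search: return the index of target, or -1 if it is absent."""
    i = bisect_left(sorted_numbers, target)
    if i < len(sorted_numbers) and sorted_numbers[i] == target:
        return i
    return -1


numbers = [42, 7, 19, 3, 25]          # made-up input
sorted_numbers = sorted(numbers)       # explicit, fully specified rule
found_at = search_sorted(sorted_numbers, 19)
missing = search_sorted(sorted_numbers, 8)
print(sorted_numbers, found_at, missing)
```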
Machine learning equips computers to learn and interpret
without being explicitly programmed to do so
What is machine learning?
• Machine learning is a part of artificial intelligence
• Used when a system needs to take decisions or make predictions
• Helps to make smarter choices
Motivation for Machine Learning
• Conventional methods are not able to identify the process completely
• Machine learning is based on approximated patterns which help to
make the predictions
• Assumes that the future will not differ much from the past, when the sample data
was collected
• Applying machine learning to large data sets is called data mining
• A large volume of data is processed in order to construct a simple
model with high prediction accuracy
ML is used when
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech
recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)
What is machine learning?
Models can be used to make predictions, test assumptions and solve problems
Ex: Facial recognition
Ex: Deciding whether to buy a car or not
Can identify or locate things that are difficult or impossible for humans to find
Example: Machine Learning
Some more examples
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
– Future stock prices or currency exchange rates
Introduction
• “Learning is any process by which a system improves performance from
experience.”
- Herbert Simon
• Application of artificial intelligence wherein the system gets the ability to
automatically learn and improve based on experience
• Definition by Tom Mitchell (1998):
• Machine Learning is the study of algorithms that
• improve their performance P
• of some task T
• with experience E.
• A well-defined learning task is given by <P, T, E>
Defining the Learning Task
Improve on task T, with respect to performance metric P, based on
experience E
• Example
• T: Playing checkers
• E: Playing practice games against itself
• P: Percentage of games won against an arbitrary opponent
• Example
• T: Recognizing hand-written words
• E: Database of human-labelled images of handwritten words
• P: Percentage of words correctly classified
Defining the Learning Task
• Example
• T: Driving on four-lane highways using vision sensors
• E: A sequence of images and steering commands recorded while
observing a human driver
• P: Average distance travelled to do the action (apply brake/take turn)
• Example
• T: Categorize email messages as spam or legitimate.
• E: Database of emails, some with human-given labels
• P: Percentage of email messages correctly classified.
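The performance measure P in the spam example can be computed directly. The sketch below is only an illustration: the function name `percent_correct` and the five labels are hypothetical, not taken from a real data set.

```python
def percent_correct(predicted, actual):
    """Performance measure P: percentage of messages classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one labelled message")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Hypothetical labels for five emails; experience E supplies `actual`.
actual = ["spam", "legit", "spam", "legit", "legit"]
predicted = ["spam", "legit", "legit", "legit", "legit"]
p = percent_correct(predicted, actual)   # 4 of the 5 messages are correct
print(p)
```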
Machine Learning Models
• Deterministic
• Stochastic
Machine Learning Models
• Deterministic:
• Every time you run the model with the same initial conditions, you get the same results
• Output is precisely determined through known relationships among events
• Ex: a known chemical reaction
• Ex: Trains in Japan run on time
• Useful for making a choice
• Stochastic/probabilistic:
• Includes an element of randomness: each run of the model can give different results even
with the same initial conditions
• Output of the model is a most likely estimate with a confidence interval
• Ex: rand() + 2
• Ex: Trains in the USA are often late
• Output is a range of values of the variables, in the form of a probability distribution
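The difference can be seen by running both kinds of model repeatedly. This is a minimal sketch: `deterministic_model` and `stochastic_model` are made-up names, and the stochastic one mirrors the `rand() + 2` example using Python's standard `random` module.

```python
import random


def deterministic_model(x):
    """Same input always gives the same output."""
    return x + 2


def stochastic_model(rng):
    """Random term in [0, 1) plus 2, in the spirit of rand() + 2."""
    return rng.random() + 2


rng = random.Random()                      # unseeded, so runs differ
det_runs = [deterministic_model(3) for _ in range(5)]
sto_runs = [stochastic_model(rng) for _ in range(1000)]

# The stochastic output is summarised as a range and a typical value.
low, high = min(sto_runs), max(sto_runs)
mean = sum(sto_runs) / len(sto_runs)
print(det_runs, low, high, mean)
```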
Risk management uses stochastic models
• Ex: a concert promoter estimates the cost of the risk that a concert is cancelled
• Has 1000 tickets at Rs. 100/ticket
• If the concert is cancelled, Rs. 100,000 has to be refunded
• There is a 10% chance of rain
• Risk estimate: 10% of Rs. 100,000 = Rs. 10,000 at risk
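The risk estimate is an expected loss: the probability of the bad event times its cost. The short sketch below reproduces the arithmetic, assuming that rain is the only cause of cancellation and that rain always cancels the concert; the variable names are illustrative.

```python
tickets = 1000
price_rs = 100
p_rain = 0.10                                   # 10% chance of rain

refund_if_cancelled = tickets * price_rs        # Rs. 100,000
expected_loss = p_rain * refund_if_cancelled    # Rs. 10,000 at risk
print(refund_if_cancelled, expected_loss)
```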
Types of Machine Learning
• Supervised
• Unsupervised
• Semi Supervised
• Reinforcement
Types of Machine Learning
Supervised:
• Already have data and answers
• Bank: has data on persons who have not paid back a loan and others who have
• Decide whether a new person will pay back the loan or not
Example
Supervised learning
Teaches a mapping from the input to an output, where the correct output values are provided by a
supervisor
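A minimal sketch of the bank example, using a 1-nearest-neighbour rule as one possible supervised method (the slides do not name a specific algorithm). The features, the four training rows and the function name `predict` are all hypothetical.

```python
import math

# Hypothetical labelled data: (income, existing debt) in Rs. thousands -> answer.
training = [
    ((60.0, 5.0), "paid"),
    ((55.0, 10.0), "paid"),
    ((20.0, 30.0), "not paid"),
    ((25.0, 40.0), "not paid"),
]


def predict(features, examples):
    """1-nearest-neighbour: return the label of the closest known example."""
    nearest = min(examples, key=lambda ex: math.dist(ex[0], features))
    return nearest[1]


new_applicant = (58.0, 8.0)
decision = predict(new_applicant, training)
print(decision)
```

The labels in `training` play the role of the supervisor: the model only generalises from answers it was given.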
Types of Machine Learning
Unsupervised:
• Do not know the answers
• Have a lot of data
• Group data based on some similarity
• Ex: Images of houses and trees separated into two groups
Example
Unsupervised Learning
• Provided with only the data, without labels
• The goal is to find the regularities in the input
• The input space follows certain patterns
• Goal is to build a model to identify these patterns
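Grouping by similarity can be sketched with a tiny k-means loop (k = 2) on one numeric feature. This is an illustration under stated assumptions: the feature "fraction of green pixels per image" and its values are invented, and k-means is only one of several ways to find such groups.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two groups around two moving centres (k-means, k=2)."""
    centres = [min(values), max(values)]          # simple initial guess
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups, centres


# Hypothetical feature: fraction of green pixels (houses low, trees high).
green_fraction = [0.10, 0.15, 0.12, 0.80, 0.75, 0.90]
groups, centres = two_means(green_fraction)
print(groups)
```

No labels are used anywhere: the two groups emerge only from how close the values are to each other.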
Supervised or Unsupervised?
Semi-supervised Learning
Types of Machine Learning
Reinforcement:
• Do not have data prior to starting
• Get data one line at a time
• Depending on whether a choice is good or bad, apply good or bad feedback
• Ex: One minute game
Example
Reinforcement learning
• The output of the system is a sequence of actions.
• A policy is the sequence of correct actions to reach the goal.
• An action is good if it is a part of a good policy.
• Assess the goodness of policies
• Learn from past good action sequences to be able to generate a policy.
• Such learning methods are called reinforcement learning methods.
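One widely used reinforcement learning method is tabular Q-learning; it is named here only as an illustration, since the slides do not prescribe a method. The sketch assumes a made-up world of five positions on a line with the goal at the right end, and assumed values for the learning rate and discount factor.

```python
import random

N_STATES = 5                 # positions 0..4 on a line; position 4 is the goal
GOAL = N_STATES - 1
ACTIONS = (-1, +1)           # step left, step right
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor (assumed values)


def step(state, action):
    """Move along the line; reward 1 only when the goal is reached."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    """Learn action values from randomly explored action sequences."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]    # q[state][action index]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):                    # cap on episode length
            a = rng.randrange(2)                 # explore with random actions
            next_state, reward = step(state, ACTIONS[a])
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            if state == GOAL:
                break
    return q


q = train()
# Greedy policy: in each non-goal state pick the action with the higher value.
policy = [ACTIONS[q[s].index(max(q[s]))] for s in range(GOAL)]
print(policy)
```

Each action is judged by the reward that eventually follows it, so an action counts as good when it is part of a sequence that reaches the goal.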
Types of output in Machine Learning
Classification:
• Belongs to a particular group or not
Regression:
• Continuous value needs to be predicted (product prices, profit)
• Can guess the values with the first 4 known values
Pattern discovery:
• Data follows a specific pattern
• Ex: recommendation, "people who bought this also bought this"
• Decision tree is used for it (fruit or vegetable)
Types of output in Machine Learning
Classification:
• Belongs to a particular group or not
Regression:
• Can guess the values with the first 4 known values
Pattern discovery:
• Data follows a specific pattern
• Ex: people bought one item and also bought another
• Decision tree is used for it
Methods for Classification
• Simple (not complex) data
– Represent it by drawing a line or curve
• Complicated data
– No simple relationship among the data
• Large amount of data
– Decision tree is a part of random forest
Problems in Machine Learning
Regression (Example)
• Develop a system to predict the price of a used car
• Attributes (X) of the car are brand, year, engine capacity, mileage and
others
• Attributes are the inputs
• Output (Y) is the price of the car
• Since the result is a number, it is a regression problem
• Learning algorithm uses known pairs of (X, Y) for many cars
• Given a new set of attributes of a car, the machine should predict the price of
the car
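A minimal regression sketch for the used-car example, fitting a straight line by ordinary least squares. To keep it short it uses a single attribute (age of the car) in place of the full attribute set, and the five (X, Y) pairs are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical known (X, Y) pairs: age in years -> price in Rs. thousands.
ages = [1, 2, 3, 4, 5]
prices = [900, 800, 700, 600, 500]

slope, intercept = fit_line(ages, prices)
predicted_price = slope * 6 + intercept    # estimate for a 6-year-old car
print(slope, intercept, predicted_price)
```

With more attributes the same idea extends to multiple linear regression, where each attribute gets its own coefficient.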
Both classification and regression are supervised learning problems
Clustering
Based on conditional probability: P(Y|X), probability that a customer will purchase Y provided he has already purchased X
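P(Y|X) can be estimated from purchase records by counting: the number of baskets containing both X and Y divided by the number containing X. In association rule mining this ratio is called the confidence of the rule X → Y. The sketch below uses an invented purchase history and a hypothetical helper name.

```python
def conditional_probability(transactions, x, y):
    """Estimate P(y | x) as count(baskets with x and y) / count(baskets with x)."""
    with_x = [t for t in transactions if x in t]
    if not with_x:
        raise ValueError(f"no transaction contains {x!r}")
    return sum(y in t for t in with_x) / len(with_x)


# Hypothetical purchase history, one set per customer basket.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "jam"},
    {"milk"},
]
p_butter_given_bread = conditional_probability(transactions, "bread", "butter")
print(p_butter_given_bread)
```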
Machine Learning: Popular Example
- Rush hour
Steps to apply machine learning to data
Data Abstraction/ Knowledge Representation
• Raw data is given a more abstract meaning
• For knowledge representation, the computer summarizes
stored raw data using a model
• Model is an explicit description of the patterns within the
data.
Generalization
• Learning process is not complete until the learner is able to use its
abstracted knowledge for future action
• A model cannot be generated for each individual sample of data
• Model should work on data with some variations
• Ex: a face recognition model should still handle a face with glasses
Various Learning Algorithms
• Algorithms are based on learning styles
Various Learning Algorithms
• Some algorithms are based on similarity
Assessing the success of learning
• Weka Machine Learning Workbench
• Training set
• Supplied test set
• Percentage split
• Validation
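The evaluation options above differ in which rows are used for training and which for testing. The sketch below is plain Python, not Weka's API: `percentage_split` and `k_folds` are hypothetical helpers, the 66% split is an assumed value, and reading "Validation" as k-fold cross-validation is an assumption.

```python
import random


def percentage_split(rows, train_percent=66, seed=0):
    """Shuffle rows and split them into a training part and a test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


def k_folds(rows, k):
    """Yield (train, test) pairs so that every row is tested exactly once."""
    for i in range(k):
        test = rows[i::k]
        train = [r for j, r in enumerate(rows) if j % k != i]
        yield train, test


rows = list(range(100))                    # stand-in for 100 data rows
train_rows, test_rows = percentage_split(rows)
folds = list(k_folds(rows, 10))
print(len(train_rows), len(test_rows), len(folds))
```

Testing on the training set itself tends to overstate accuracy, which is why the held-out options give a better picture of how well the model generalizes.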
Assignment
1. The students are divided into groups for this assignment.
2. The domains for the teams are given in the attached sheet.
3. The students of the respective team are supposed to sit together, and discuss the domain provided to them.
4. Topics to be present in the document:
• Description of the good or service
• Need for the product
• The potential value gained by the potential customers
• The value gained by all stakeholders
• The competitive advantage that can be gained by the Company
• Profit potential and funding requirements
• Proposed management team structure
• PEAS (Performance, Environment, Actuator and Sensor) for the proposed solution
• Environment types and agent type for the concerned solution
• You can use any of the approaches studied in your syllabus (logical approach, probabilistic approach or any other) to solve
a simple test case based on the proposed solution.