Probability-based Learning
Sections 6.1, 6.2, 6.3
John D. Kelleher and Brian Mac Namee and Aoife D’Arcy
1 Big Idea
2 Fundamentals
    Bayes’ Theorem
    Bayesian Prediction
    Conditional Independence and Factorization
3 Standard Approach: The Naive Bayes’ Classifier
    A Worked Example
4 Summary
Big Idea
Figure: A game of find the lady.

Figure: A game of find the lady: (a) the cards dealt face down on a table; and (b) the initial likelihoods of the queen ending up in each position (Left, Center, Right).

Figure: A game of find the lady: (a) the cards dealt face down on a table; and (b) a revised set of likelihoods for the position of the queen based on evidence collected.

Figure: A game of find the lady: (a) the set of cards after the wind blows over the one on the right; and (b) the revised likelihoods for the position of the queen based on this new evidence.

Figure: A game of find the lady: the final positions of the cards in the game.
We can use estimates of likelihoods to determine the most likely prediction that should be made.
More importantly, we can revise these predictions whenever extra evidence becomes available from the data we collect.
Fundamentals
Table: A simple dataset for MENINGITIS diagnosis with descriptive features that describe the presence or absence of three common symptoms of the disease: HEADACHE, FEVER, and VOMITING.

ID  HEADACHE  FEVER  VOMITING  MENINGITIS
1 true true false false
2 false true false false
3 true false true false
4 true false true false
5 false true false true
6 true false true false
7 true false true false
8 true false true true
9 false true false false
10 true false true true
A probability function, P(), returns the probability of a
feature taking a specific value.
A joint probability refers to the probability of an
assignment of specific values to multiple different features.
A conditional probability refers to the probability of one feature taking a specific value given that we already know the value of a different feature.
A probability distribution is a data structure that
describes the probability of each possible value a feature
can take. The sum of a probability distribution must equal
1.0.
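To make these definitions concrete, here is a minimal sketch, in Python, of how each kind of probability can be computed from the meningitis dataset by simple counting. The `dataset` layout and the `prob`/`cond_prob` helper names are our own illustration, not from the text.

```python
# A minimal sketch (our own illustration) of the probability definitions
# above, computed from the meningitis dataset by simple counting.
dataset = [
    # (HEADACHE, FEVER, VOMITING, MENINGITIS)
    (True,  True,  False, False),   # d1
    (False, True,  False, False),   # d2
    (True,  False, True,  False),   # d3
    (True,  False, True,  False),   # d4
    (False, True,  False, True),    # d5
    (True,  False, True,  False),   # d6
    (True,  False, True,  False),   # d7
    (True,  False, True,  True),    # d8
    (False, True,  False, False),   # d9
    (True,  False, True,  True),    # d10
]

COLUMNS = {"h": 0, "f": 1, "v": 2, "m": 3}

def prob(**assignment):
    """P() for a single or joint assignment, e.g. prob(h=True, m=True)."""
    matches = [row for row in dataset
               if all(row[COLUMNS[name]] == value
                      for name, value in assignment.items())]
    return len(matches) / len(dataset)

def cond_prob(event, given):
    """P(event | given), where both arguments are feature -> value dicts."""
    return prob(**event, **given) / prob(**given)

print(prob(m=True))                         # P(m) = 0.3
print(prob(h=True, f=False, v=True))        # P(h, ¬f, v) = 0.6
print(cond_prob({"h": True}, {"m": True}))  # P(h|m) = 0.666...
```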
A joint probability distribution is a probability distribution
over more than one feature assignment and is written as a
multi-dimensional matrix in which each cell lists the
probability of a particular combination of feature values
being assigned.
The sum of all the cells in a joint probability distribution
must be 1.0.
P(H, F, V, M) =
    ⟨ P(h, f, v, m),      P(¬h, f, v, m),
      P(h, f, v, ¬m),     P(¬h, f, v, ¬m),
      P(h, f, ¬v, m),     P(¬h, f, ¬v, m),
      P(h, f, ¬v, ¬m),    P(¬h, f, ¬v, ¬m),
      P(h, ¬f, v, m),     P(¬h, ¬f, v, m),
      P(h, ¬f, v, ¬m),    P(¬h, ¬f, v, ¬m),
      P(h, ¬f, ¬v, m),    P(¬h, ¬f, ¬v, m),
      P(h, ¬f, ¬v, ¬m),   P(¬h, ¬f, ¬v, ¬m) ⟩
Given a joint probability distribution, we can compute the
probability of any event in the domain that it covers by
summing over the cells in the distribution where that event
is true.
Calculating probabilities in this way is known as summing
out.
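As a small illustration of summing out, the sketch below represents a joint distribution over two binary features as a dictionary (the cell values are invented for illustration) and sums over the cells where the event of interest holds:

```python
# Summing out, sketched with a toy joint distribution over two binary
# features H and M (the cell values are invented for illustration and
# sum to 1.0, as every joint distribution must).
joint = {
    # (H, M): probability
    (True,  True):  0.2,
    (True,  False): 0.5,
    (False, True):  0.1,
    (False, False): 0.2,
}

# P(h): sum over every cell in which H = true (i.e., sum out M)
p_h = sum(p for (h, m), p in joint.items() if h)
print(p_h)  # 0.7
```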
Bayes’ Theorem
P(X|Y) = P(Y|X)P(X) / P(Y)
Example
After a yearly checkup, a doctor informs their patient that there is both bad news and good news. The bad news is that the patient has tested positive for a serious disease, and that the test the doctor used is 99% accurate (i.e., the probability of testing positive when a patient has the disease is 0.99, as is the probability of testing negative when a patient does not have the disease). The good news, however, is that the disease is extremely rare, striking only 1 in 10,000 people.

What is the actual probability that the patient has the disease?
Why is the rarity of the disease good news given that the patient has tested positive for it?
P(d|t) = P(t|d)P(d) / P(t)

P(t) = P(t|d)P(d) + P(t|¬d)P(¬d)
     = (0.99 × 0.0001) + (0.01 × 0.9999) = 0.0101

P(d|t) = (0.99 × 0.0001) / 0.0101 = 0.0098
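The arithmetic is easy to check in a few lines of Python; this sketch simply re-runs the example’s numbers (only the variable names are ours):

```python
# Re-running the rare-disease numbers from the example above
# (only the variable names are ours).
p_t_given_d = 0.99      # P(t|d): positive test given disease
p_t_given_not_d = 0.01  # P(t|¬d): positive test given no disease
p_d = 0.0001            # P(d): prior, 1 in 10,000

# Theorem of Total Probability gives the divisor P(t)
p_t = p_t_given_d * p_d + p_t_given_not_d * (1 - p_d)

# Bayes' Theorem
p_d_given_t = p_t_given_d * p_d / p_t
print(round(p_t, 4), round(p_d_given_t, 4))  # 0.0101 0.0098
```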
Deriving Bayes’ Theorem

The product rule lets us write the joint probability two ways:

P(Y|X)P(X) = P(X|Y)P(Y)

Dividing both sides by P(Y):

P(X|Y)P(Y) / P(Y) = P(Y|X)P(X) / P(Y)

The P(Y) terms on the left cancel, giving:

⇒ P(X|Y) = P(Y|X)P(X) / P(Y)
The divisor is the prior probability of the evidence.
This division functions as a normalization constant, ensuring that:

0 ≤ P(X|Y) ≤ 1

Σ_i P(X_i|Y) = 1.0
We can calculate this divisor directly from the dataset:

P(Y) = |{rows where Y is the case}| / |{rows in the dataset}|

Or, we can use the Theorem of Total Probability to calculate this divisor:

P(Y) = Σ_i P(Y|X_i)P(X_i)    (1)
Bayesian Prediction
Generalized Bayes’ Theorem

P(t = l | q[1], . . . , q[m]) = P(q[1], . . . , q[m] | t = l) × P(t = l) / P(q[1], . . . , q[m])
Chain Rule
P(q[1], . . . , q[m]) =
    P(q[1]) × P(q[2]|q[1]) × · · · × P(q[m]|q[m − 1], . . . , q[2], q[1])

To apply the chain rule to a conditional probability we just add the conditioning term to each term in the expression:

P(q[1], . . . , q[m] | t = l) =
    P(q[1]|t = l) × P(q[2]|q[1], t = l) × · · · × P(q[m]|q[m − 1], . . . , q[2], q[1], t = l)
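The chain rule can be checked numerically. The sketch below assumes the meningitis `dataset` and the hypothetical `prob`/`cond_prob` helpers from the earlier sketch are in scope, and confirms that the chained factors multiply back to the joint probability:

```python
# Numeric check of the chain rule on the meningitis dataset, assuming
# the dataset and the prob()/cond_prob() helpers from the earlier
# sketch are in scope: P(h, ¬f, v) = P(h) × P(¬f|h) × P(v|¬f, h).
lhs = prob(h=True, f=False, v=True)
rhs = (prob(h=True)
       * cond_prob({"f": False}, {"h": True})
       * cond_prob({"v": True}, {"f": False, "h": True}))
print(lhs, rhs)  # both ≈ 0.6
```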
Returning to the meningitis dataset, we want to make a prediction for the following query instance:
HEADACHE  FEVER  VOMITING  MENINGITIS
true      false  true      ?
P(M|h, ¬f, v) = ?

In the terms of Bayes’ Theorem this problem can be stated as:

P(M|h, ¬f, v) = P(h, ¬f, v | M) × P(M) / P(h, ¬f, v)

There are two values in the domain of the MENINGITIS feature, ’true’ and ’false’, so we have to do this calculation twice.
We will do the calculation for m first.
To carry out this calculation we need to know the following probabilities: P(m), P(h, ¬f, v), and P(h, ¬f, v | m).
We can calculate the required probabilities directly from the data. For example, we can calculate P(m) and P(h, ¬f, v) as follows:

P(m) = |{d5, d8, d10}| / |{d1, d2, d3, d4, d5, d6, d7, d8, d9, d10}| = 3/10 = 0.3

P(h, ¬f, v) = |{d3, d4, d6, d7, d8, d10}| / |{d1, d2, d3, d4, d5, d6, d7, d8, d9, d10}| = 6/10 = 0.6
However, as an exercise, we will use the chain rule to calculate:

P(h, ¬f, v | m) = ?
Using the chain rule:
P(h, ¬f, v | m) = P(h | m) × P(¬f | h, m) × P(v | ¬f, h, m)

               = |{d8, d10}| / |{d5, d8, d10}| × |{d8, d10}| / |{d8, d10}| × |{d8, d10}| / |{d8, d10}|

               = 2/3 × 2/2 × 2/2 = 0.6666
So the calculation of P(m|h, ¬f, v) is:

P(m|h, ¬f, v) = ( P(h|m) × P(¬f|h, m) × P(v|¬f, h, m) ) × P(m) / P(h, ¬f, v)

             = (0.6666 × 0.3) / 0.6 = 0.3333
The corresponding calculation for P(¬m|h, ¬f, v) is:

P(¬m|h, ¬f, v) = P(h, ¬f, v | ¬m) × P(¬m) / P(h, ¬f, v)

              = ( P(h|¬m) × P(¬f|h, ¬m) × P(v|¬f, h, ¬m) ) × P(¬m) / P(h, ¬f, v)

              = (0.7143 × 0.8 × 1.0 × 0.7) / 0.6 = 0.6667
P(m|h, ¬f, v) = 0.3333
P(¬m|h, ¬f, v) = 0.6667

These calculations tell us that it is twice as probable that the patient does not have meningitis as it is that they do, even though the patient is suffering from a headache and is vomiting!
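For comparison, the sketch below computes both posteriors for this query directly from the dataset, again assuming the hypothetical helpers introduced earlier:

```python
# The exact (non-naive) posteriors for the query (h, ¬f, v), computed
# directly from the dataset with the helpers sketched earlier.
evidence = {"h": True, "f": False, "v": True}

p_m = cond_prob({"m": True}, evidence)       # P(m | h, ¬f, v)
p_not_m = cond_prob({"m": False}, evidence)  # P(¬m | h, ¬f, v)
print(round(p_m, 4), round(p_not_m, 4))      # 0.3333 0.6667
```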
The Paradox of the False Positive

The mistake of forgetting to factor in the prior gives rise to the paradox of the false positive, which states that in order to make accurate predictions about a rare event, the model has to be as accurate as the event is rare, or there is a significant chance of false positive predictions (i.e., predicting the event when it is not the case).
Bayesian MAP Prediction Model

M_MAP(q) = argmax_{l ∈ levels(t)} P(t = l | q[1], . . . , q[m])

         = argmax_{l ∈ levels(t)} ( P(q[1], . . . , q[m] | t = l) × P(t = l) ) / P(q[1], . . . , q[m])

Bayesian MAP Prediction Model (without normalization)

M_MAP(q) = argmax_{l ∈ levels(t)} P(q[1], . . . , q[m] | t = l) × P(t = l)
Now consider a second query instance:

HEADACHE  FEVER  VOMITING  MENINGITIS
true      true   false     ?
P(m | h, f, ¬v) = ?
P(¬m | h, f, ¬v) = ?
P(m | h, f, ¬v) = ( P(h|m) × P(f|h, m) × P(¬v|f, h, m) ) × P(m) / P(h, f, ¬v)

               = (0.6666 × 0 × 0 × 0.3) / 0.1 = 0
P(¬m | h, f, ¬v) = ( P(h|¬m) × P(f|h, ¬m) × P(¬v|f, h, ¬m) ) × P(¬m) / P(h, f, ¬v)

                = (0.7143 × 0.2 × 1.0 × 0.7) / 0.1 = 1.0
P(m | h, f, ¬v) = 0
P(¬m | h, f, ¬v) = 1.0
There is something odd about these results!
Curse of Dimensionality

As the number of descriptive features grows, the number of potential conditioning events grows exponentially. Consequently, the size of the dataset must grow exponentially as each new descriptive feature is added, to ensure that for any conditional probability there are enough matching instances in the training data for the resulting probability estimate to be reasonable.
The probability of a patient who has a headache and a
fever having meningitis should be greater than zero!
Our dataset is not large enough → our model is over-fitting
to the training data.
The concepts of conditional independence and
factorization can help us overcome this flaw of our current
approach.
Conditional Independence and Factorization
If knowledge of one event has no effect on the probability of another event, and vice versa, then the two events are independent of each other.

If two events X and Y are independent then:

P(X|Y) = P(X)
P(X, Y) = P(X) × P(Y)

Recall that when two events are dependent these rules are:

P(X|Y) = P(X, Y) / P(Y)
P(X, Y) = P(X|Y) × P(Y) = P(Y|X) × P(X)
Full independence between events is quite rare.
A more common phenomenon is that two or more events may be independent if we know that a third event has happened.
This is known as conditional independence.
For two events, X and Y, that are conditionally independent given knowledge of a third event, here Z, the definitions of the probability of a joint event and of conditional probability are:

P(X|Y, Z) = P(X|Z)
P(X, Y|Z) = P(X|Z) × P(Y|Z)

Compare the two cases:

X and Y are dependent:             X and Y are independent:
P(X|Y) = P(X, Y) / P(Y)            P(X|Y) = P(X)
P(X, Y) = P(X|Y) × P(Y)            P(X, Y) = P(X) × P(Y)
        = P(Y|X) × P(X)
If the event t = l causes the events q[1], . . . , q[m] to happen, then the events q[1], . . . , q[m] are conditionally independent of each other given knowledge of t = l, and the chain rule definition can be simplified as follows:

P(q[1], . . . , q[m] | t = l)
    = P(q[1] | t = l) × P(q[2] | t = l) × · · · × P(q[m] | t = l)
    = ∏_{i=1}^{m} P(q[i] | t = l)
Using this we can simplify the calculations in Bayes’
Theorem, under the assumption of conditional
independence between the descriptive features given the
level l of the target feature:
P(t = l | q[1], . . . , q[m]) = ( ∏_{i=1}^{m} P(q[i] | t = l) ) × P(t = l) / P(q[1], . . . , q[m])
Without conditional independence:

P(X, Y, Z, W) = P(X|W) × P(Y|X, W) × P(Z|Y, X, W) × P(W)

With conditional independence, the joint factorizes as:

P(X, Y, Z, W) = P(X|W)  × P(Y|W)  × P(Z|W)  × P(W)
               Factor 1  Factor 2  Factor 3  Factor 4
The joint probability distribution for the meningitis dataset.
P(H, F, V, M) =
    ⟨ P(h, f, v, m),      P(¬h, f, v, m),
      P(h, f, v, ¬m),     P(¬h, f, v, ¬m),
      P(h, f, ¬v, m),     P(¬h, f, ¬v, m),
      P(h, f, ¬v, ¬m),    P(¬h, f, ¬v, ¬m),
      P(h, ¬f, v, m),     P(¬h, ¬f, v, m),
      P(h, ¬f, v, ¬m),    P(¬h, ¬f, v, ¬m),
      P(h, ¬f, ¬v, m),    P(¬h, ¬f, ¬v, m),
      P(h, ¬f, ¬v, ¬m),   P(¬h, ¬f, ¬v, ¬m) ⟩
Assuming the descriptive features are conditionally independent of each other given MENINGITIS, we only need to store four factors:

Factor 1: < P(M) >
Factor 2: < P(h|m), P(h|¬m) >
Factor 3: < P(f|m), P(f|¬m) >
Factor 4: < P(v|m), P(v|¬m) >

P(H, F, V, M) = P(M) × P(H|M) × P(F|M) × P(V|M)
Calculate the factors from the data:

Factor 1: < P(M) >
Factor 2: < P(h|m), P(h|¬m) >
Factor 3: < P(f|m), P(f|¬m) >
Factor 4: < P(v|m), P(v|¬m) >
Factor 1: < P(m) = 0.3 >
Factor 2: < P(h|m) = 0.6666, P(h|¬m) = 0.7143 >
Factor 3: < P(f|m) = 0.3333, P(f|¬m) = 0.4286 >
Factor 4: < P(v|m) = 0.6666, P(v|¬m) = 0.5714 >
Using the factors above, calculate the probability of MENINGITIS = ’true’ for the following query:

HEADACHE  FEVER  VOMITING  MENINGITIS
true      true   false     ?
P(m|h, f, ¬v) = ( P(h|m) × P(f|m) × P(¬v|m) × P(m) ) / Σ_i ( P(h|M_i) × P(f|M_i) × P(¬v|M_i) × P(M_i) )

             = (0.6666 × 0.3333 × 0.3333 × 0.3) / ( (0.6666 × 0.3333 × 0.3333 × 0.3) + (0.7143 × 0.4286 × 0.4286 × 0.7) )

             = 0.1948
Using the factors above, calculate the probability of MENINGITIS = ’false’ for the same query.
P(¬m|h, f, ¬v) = ( P(h|¬m) × P(f|¬m) × P(¬v|¬m) × P(¬m) ) / Σ_i ( P(h|M_i) × P(f|M_i) × P(¬v|M_i) × P(M_i) )

              = (0.7143 × 0.4286 × 0.4286 × 0.7) / ( (0.6666 × 0.3333 × 0.3333 × 0.3) + (0.7143 × 0.4286 × 0.4286 × 0.7) )

              = 0.8052
P(m|h, f , ¬v ) = 0.1948
P(¬m|h, f , ¬v ) = 0.8052
As before, the MAP prediction would be MENINGITIS = ’false’.
The posterior probabilities are not as extreme!
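The factorised calculation is easy to reproduce in code. The sketch below is our own illustration, with the factor values copied from the slides:

```python
# Reproducing the factorised calculation with the factor values above.
p_m = 0.3
p_h_m, p_f_m, p_v_m = 0.6666, 0.3333, 0.6666     # P(·|m) factors
p_h_nm, p_f_nm, p_v_nm = 0.7143, 0.4286, 0.5714  # P(·|¬m) factors

# Query: h, f, ¬v. Note P(¬v|m) = 1 - P(v|m), and similarly for ¬m.
score_m = p_h_m * p_f_m * (1 - p_v_m) * p_m
score_nm = p_h_nm * p_f_nm * (1 - p_v_nm) * (1 - p_m)

norm = score_m + score_nm  # the summed-out divisor
print(round(score_m / norm, 4))   # 0.1948
print(round(score_nm / norm, 4))  # 0.8052
```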
Standard Approach: The Naive Bayes’ Classifier
Naive Bayes’ Classifier
M(q) = argmax_{l ∈ levels(t)} ( ∏_{i=1}^{m} P(q[i] | t = l) ) × P(t = l)
Naive Bayes’ is simple to train!
1. Calculate the priors for each of the target levels.
2. Calculate the conditional probabilities for each feature given each target level.
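These two steps can be sketched in a few lines of Python. The function names and the dict-based instance representation below are our own illustration, not the book’s:

```python
# A minimal Naive Bayes trainer and MAP predictor, sketching the two
# training steps above. The function and data-structure choices are
# our own illustration: each instance is a dict of feature -> value.
from collections import Counter, defaultdict

def train_naive_bayes(instances, target):
    # Step 1: priors for each target level
    level_counts = Counter(inst[target] for inst in instances)
    priors = {lvl: c / len(instances) for lvl, c in level_counts.items()}
    # Step 2: conditional probability of each feature value given each level
    cond_counts = defaultdict(Counter)
    for inst in instances:
        for feature, value in inst.items():
            if feature != target:
                cond_counts[inst[target]][(feature, value)] += 1
    conditionals = {lvl: {fv: c / level_counts[lvl] for fv, c in counts.items()}
                    for lvl, counts in cond_counts.items()}
    return priors, conditionals

def predict(priors, conditionals, query):
    # MAP prediction without normalization: argmax over target levels
    def score(level):
        s = priors[level]
        for feature, value in query.items():
            s *= conditionals[level].get((feature, value), 0.0)
        return s
    return max(priors, key=score)
```

Trained on the fraud dataset that follows, this sketch should reproduce the priors and conditional probabilities listed in the table below.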
Table: A dataset from a loan application fraud detection domain.

ID  CREDIT HISTORY  GUARANTOR/COAPPLICANT  ACCOMMODATION  FRAUD
1 current none own true
2 paid none own false
3 paid none own false
4 paid guarantor rent true
5 arrears none own false
6 arrears none own true
7 current none own false
8 arrears none own false
9 current none rent false
10 none none own true
11 current coapplicant own false
12 current none own true
13 current none rent true
14 paid none own false
15 arrears none own false
16 current none own false
17 arrears coapplicant rent false
18 arrears none free false
19 arrears none own false
20 paid none own false
Table: The probabilities needed by a Naive Bayes prediction model, calculated from the dataset. Notation key: FR = FRAUD, CH = CREDIT HISTORY, GC = GUARANTOR/COAPPLICANT, ACC = ACCOMMODATION.

P(fr) = 0.3                              P(¬fr) = 0.7
P(CH = ’none’ | fr) = 0.1666             P(CH = ’none’ | ¬fr) = 0
P(CH = ’paid’ | fr) = 0.1666             P(CH = ’paid’ | ¬fr) = 0.2857
P(CH = ’current’ | fr) = 0.5             P(CH = ’current’ | ¬fr) = 0.2857
P(CH = ’arrears’ | fr) = 0.1666          P(CH = ’arrears’ | ¬fr) = 0.4286
P(GC = ’none’ | fr) = 0.8334             P(GC = ’none’ | ¬fr) = 0.8571
P(GC = ’guarantor’ | fr) = 0.1666        P(GC = ’guarantor’ | ¬fr) = 0
P(GC = ’coapplicant’ | fr) = 0           P(GC = ’coapplicant’ | ¬fr) = 0.1429
P(ACC = ’own’ | fr) = 0.6666             P(ACC = ’own’ | ¬fr) = 0.7857
P(ACC = ’rent’ | fr) = 0.3333            P(ACC = ’rent’ | ¬fr) = 0.1429
P(ACC = ’free’ | fr) = 0                 P(ACC = ’free’ | ¬fr) = 0.0714
Now consider the following query:

CREDIT HISTORY  GUARANTOR/COAPPLICANT  ACCOMMODATION  FRAUD
paid            none                   rent           ?
A Worked Example
The relevant probabilities for this query are:

P(fr) = 0.3                            P(¬fr) = 0.7
P(CH = ’paid’ | fr) = 0.1666           P(CH = ’paid’ | ¬fr) = 0.2857
P(GC = ’none’ | fr) = 0.8334           P(GC = ’none’ | ¬fr) = 0.8571
P(ACC = ’rent’ | fr) = 0.3333          P(ACC = ’rent’ | ¬fr) = 0.1429

∏_{k=1}^{m} P(q[k] | fr) × P(fr) = 0.1666 × 0.8334 × 0.3333 × 0.3 = 0.0139

∏_{k=1}^{m} P(q[k] | ¬fr) × P(¬fr) = 0.2857 × 0.8571 × 0.1429 × 0.7 = 0.0245
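Both scores are easy to verify by multiplying out the probabilities from the table (a quick sketch; the variable names are ours):

```python
# Verifying the two scores by multiplying out the probabilities from
# the table above (variable names are ours).
score_fr = 0.1666 * 0.8334 * 0.3333 * 0.3      # Π P(q[k]|fr) × P(fr)
score_not_fr = 0.2857 * 0.8571 * 0.1429 * 0.7  # Π P(q[k]|¬fr) × P(¬fr)
print(round(score_fr, 4), round(score_not_fr, 4))  # 0.0139 0.0245
```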
Since 0.0245 > 0.0139, the MAP prediction is:
CREDIT HISTORY  GUARANTOR/COAPPLICANT  ACCOMMODATION  FRAUD
paid            none                   rent           ’false’
The model is generalizing beyond the dataset!
Notice that the query instance (CREDIT HISTORY = ’paid’, GUARANTOR/COAPPLICANT = ’none’, ACCOMMODATION = ’rent’) does not match any instance in the training dataset, yet the model was still able to make a prediction for it.
Summary
P(t|d) = P(d|t) × P(t) / P(d)    (2)
A Naive Bayes’ classifier naively assumes that each of the
descriptive features in a domain is conditionally
independent of all of the other descriptive features, given
the state of the target feature.
This assumption, although often wrong, enables the Naive
Bayes’ model to maximally factorise the representation that
it uses of the domain.
Surprisingly, given the naivety and strength of the
assumption it depends upon, a Naive Bayes’ model often
performs reasonably well.