Lecture Notes
Course: B.Tech. III Year I Semester
Subject: Artificial Intelligence
Course Objectives: To train the students to understand different types
of AI agents, various AI search algorithms, the fundamentals of knowledge
representation, the building of simple knowledge-based systems, and the
application of knowledge representation and reasoning. The study of Markov
models prepares the student to step into applied AI.
Artificial Intelligence
UNIT-3
Syllabus
• Advanced Knowledge Representation and Reasoning: Knowledge Representation Issues,
Nonmonotonic Reasoning, Other Knowledge Representation Schemes
• Reasoning Under Uncertainty: Basic probability, Acting Under Uncertainty, Bayes’ Rule,
Representing Knowledge in an Uncertain Domain, Bayesian Networks
Advanced Knowledge Representation and Reasoning:
Humans are adept at understanding, reasoning, and interpreting knowledge. A human knows
things, which is knowledge, and according to that knowledge performs various actions in the
real world. How machines do all these things is the subject of knowledge representation and
reasoning. Hence we can describe knowledge representation as follows:
o Knowledge representation and reasoning (KR, KRR) is the part of Artificial Intelligence
concerned with how AI agents think and how thinking contributes to their intelligent
behavior.
o It is responsible for representing information about the real world so that a
computer can understand it and can utilize this knowledge to solve complex
real-world problems such as diagnosing a medical condition or communicating
with humans in natural language.
o It is also a way which describes how we can represent knowledge in artificial
intelligence. Knowledge representation is not just storing data into some
database, but it also enables an intelligent machine to learn from that knowledge
and experiences so that it can behave intelligently like a human.
Issues in Knowledge Representation
• The fundamental goal of knowledge Representation is to facilitate inference
(conclusions) from knowledge.
• The issues that arise while using KR techniques are many. Some of these are
explained below
ISSUES:
Important attributes
Relationship among Attributes
Choosing the granularity of representation
Representing Set of Objects
Finding right structure as needed
1. Important Attributes
Are there any attributes of objects so basic that they occur in almost every problem domain?
There are two attributes, "instance" and "isa", that are of general significance. These
attributes are important because they support property inheritance.
2. Relationship among Attributes
Are there any important relationships that exist among object attributes?
The attributes we use to describe objects are themselves entities that we represent.
The relationships between the attributes of an object, independent of the specific
knowledge they encode, may hold properties such as:
1. Inverse: This is about a consistency check while a value is added to one
attribute. The entities are related to each other in many different ways.
2. Existence in an isa hierarchy: This is about generalization and specialization. Just as
there are classes of objects and specialized subsets of those classes, there are attributes
and specializations of attributes. For example, the attribute height is a
specialization of the general attribute physical-size, which is, in turn, a specialization
of physical-attribute. These generalization-specialization relationships are
important for attributes because they support inheritance.
3. Techniques for reasoning about values: This is about reasoning about values of
attributes not given explicitly. Several kinds of information are used in such
reasoning, for example: height must be measured in a unit of length, and the age of
a person cannot be greater than the age of the person's parents. Values are often
specified when a knowledge base is created.
4. Single-valued attributes: This is about specific attributes that are guaranteed to
take a unique value. For example, a baseball player can at any time have only a
single height and be a member of only one team. KR systems take different
approaches to providing support for single-valued attributes.
3. Choosing Granularity
At what level of detail should the knowledge be represented?
Regardless of the KR formalism, it is necessary to decide:
• At what level should the knowledge be represented, and what are the primitives?
• Should there be a small number of high-level facts or a large number of
low-level primitives?
• High-level facts may not be adequate for inference, while low-level primitives
may require a lot of storage.
Example of Granularity:
• Suppose we are interested in following facts:
John spotted Sue.
This could be represented as
Spotted(agent(John), object(Sue))
Such a representation would make it easy to answer questions such as:
• Who spotted Sue?
Suppose we want to know:
• Did John see Sue?
Given only one fact, we cannot discover that answer.
We can add other facts, such as
Spotted(x, y) -> saw(x, y)
We can now infer the answer to the question.
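As a sketch of this granularity trade-off, the low-level fact and the added rule can be encoded directly; the tuple encoding and the function name are illustrative assumptions, not part of the original notes:

```python
# Knowledge base holds only the low-level fact spotted(John, Sue).
facts = {("spotted", "John", "Sue")}

# Rule assumed from the text: spotted(x, y) -> saw(x, y)
def saw(agent, obj):
    # "Did x see y?" succeeds either from a stored saw-fact
    # or by inference from a spotted-fact via the rule.
    return ("saw", agent, obj) in facts or ("spotted", agent, obj) in facts

print(saw("John", "Sue"))  # True, inferred from spotted(John, Sue)
print(saw("Sue", "John"))  # False, no supporting fact
```

Without the rule, the question "Did John see Sue?" could not be answered from the single stored fact.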
4. Sets of Objects
How should sets of objects be represented?
There are certain properties of objects that hold for them as members of a set but not
as individuals.
Example: Consider the assertions made in the sentences:
“there are more sheep than people in Australia”, and
“English speakers can be found all over the world.”
To describe these facts, the only way is to attach the assertions to the sets representing
people, sheep, and English speakers.
The reason to represent sets of objects is: if a property is true for all or most
elements of a set, then it is more efficient to associate it once with the set rather
than to associate it explicitly with every element of the set.
5. Finding the Right Structure
Given a large amount of knowledge stored in a database, how can the relevant parts be
accessed when they are needed?
This is about access to the right structure for describing a particular situation.
It requires selecting an initial structure and then revising the choice.
While doing so, it is necessary to solve the following problems:
• How to perform an initial selection of the most appropriate structure.
• How to fill in appropriate details from the current situations.
• How to find a better structure if the one chosen initially turns out not to
be appropriate.
• What to do if none of the available structures is appropriate.
• When to create and remember a new structure.
There is no good, general purpose method for solving all these problems. Some
knowledge representation techniques solve some of these issues.
Nonmonotonic Reasoning
Monotonous information
• Conventional reasoning works with information that is
– Complete
– Consistent
– Monotonous
• When do you say information is monotonous?
If a new fact gets added to the already existing information, all the information
remains the same, consistency still holds across all the facts, and no fact has to be
retracted, then the information is said to be monotonous.
Monotonic Reasoning
• Technique for reasoning with a complete, consistent, and unchanging model of
the world
• Used in conventional reasoning systems.
• Information is complete with respect to the domain of interest
• All facts that are necessary to solve a problem are present or can be derived from
those that are
• The only way the model changes is by the addition of new facts.
• Since new facts are consistent with the old ones, nothing will ever be retracted.
• It is consistent
Methods
• Using Predicate Logic-Resolutions
• Natural deduction
• Logic Programming
• Forward and Backward reasoning
• Matching
Example:
Earth revolves around the Sun.
• It is a true fact, and it cannot be changed even if we add other sentences to the
knowledge base, such as "The moon revolves around the earth" or "The Earth is
not round."
Advantages of Monotonic Reasoning:
• In monotonic reasoning, each old proof will always remain valid.
• If we deduce some facts from available facts, they will remain valid forever.
Disadvantages of Monotonic Reasoning:
• We cannot represent the real world scenarios using Monotonic reasoning.
• Hypothetical knowledge cannot be expressed with monotonic reasoning, which
means facts must be true.
• Since we can only derive conclusions from old proofs, new knowledge
from the real world cannot be added.
Non-Monotonic Reasoning
• Deals with problems that have incomplete and uncertain models
• Used to reason when a complete, consistent, and constant model of the world is
not available.
• Inconsistency is resolved by rejecting some facts.
• It is required to observe how revision progresses downwards.
• Inconsistent beliefs are singled out.
• It illustrates problems posed by uncertain, fuzzy, and changing knowledge.
• At a given moment a statement is believed to be true, believed to be false, or not believed.
Key Issues for non-monotonic reasoning
1. How can the knowledge base be extended to allow inferences to be made on the basis
of a lack of knowledge?
– We need to make a clear distinction between what is known to be false and
what is merely not known.
– Any inference that depends on the lack of some piece of knowledge is a
non-monotonic inference.
– A non-monotonic inference may be defeated by the addition of new information
that violates the originally made assumption.
2. How can the knowledge base be updated properly when a new fact is added to the
system or when an old one is removed?
– Keep track of proofs or justifications.
– Find all justifications that depend on the absence of the new fact; those
proofs can be marked as invalid.
– A non-monotonic inference may be defeated by the addition of new
information that violates originally made assumptions.
3. How can knowledge be used to resolve conflicts when there are several inconsistent
non-monotonic inferences that could be drawn?
– Contradictions are more likely to occur than in conventional logic systems.
– Portions of the knowledge base may be locally consistent but globally inconsistent.
Methods for non-monotonic reasoning
Need
• Use non-monotonic reasoning to perform default reasoning
• To draw conclusions based on what is most likely to be true.
Approach:
Non-Monotonic logic
Default logic
Example
Let us suppose the knowledge base contains the following knowledge:
– Birds can fly
– Penguins cannot fly
– Pitty is a bird
• From the above sentences, we can conclude that Pitty can fly.
• However, if we add another sentence to the knowledge base, "Pitty is a
penguin", which concludes "Pitty cannot fly", it invalidates the above
conclusion.
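A minimal sketch of this default reasoning in code; the knowledge-base layout and function name are illustrative assumptions:

```python
# Knowledge base: birds fly by default unless known to be a penguin.
kb = {"bird": {"Pitty"}, "penguin": set()}

def can_fly(x):
    # Default rule: bird(x) and not known penguin(x)  =>  fly(x)
    return x in kb["bird"] and x not in kb["penguin"]

print(can_fly("Pitty"))     # True: the default conclusion holds

kb["penguin"].add("Pitty")  # new fact "Pitty is a penguin" arrives
print(can_fly("Pitty"))     # False: the earlier conclusion is retracted
```

This is exactly the non-monotonic behavior described above: adding a fact invalidates a previously drawn conclusion.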
Advantages of Non-monotonic reasoning:
• For real-world systems such as Robot navigation, we can use non-monotonic
reasoning.
• In Non-monotonic reasoning, we can choose probabilistic facts or can make
assumptions.
Disadvantages of Non-monotonic Reasoning:
• In non-monotonic reasoning, the old facts may be invalidated by adding new
sentences.
• It cannot be used for theorem proving
Knowledge Representation Schemes
Approaches to knowledge representation:
There are mainly four approaches to knowledge representation, which are given below:
1. Simple relational knowledge:
o It is the simplest way of storing facts; it uses the relational method, and each fact
about a set of objects is set out systematically in columns.
o This approach of knowledge representation is famous in database systems where the
relationship between different entities is represented.
o This approach has little opportunity for inference.
Example: The following is the simple relational knowledge representation
Player Weight Age
Player1 65 23
Player2 58 18
Player3 75 15
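The table above could be held as plain relational tuples; queries are simple lookups and selections, which illustrates why this approach offers little opportunity for inference (variable names are hypothetical):

```python
# The player table as relational tuples: (name, weight, age).
players = [
    ("Player1", 65, 23),
    ("Player2", 58, 18),
    ("Player3", 75, 15),
]

# A query is just a selection over the columns, e.g. players under 70 kg.
light = [name for name, weight, age in players if weight < 70]
print(light)  # ['Player1', 'Player2']
```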
2.Inheritable knowledge:
o In the inheritable knowledge approach, all data must be stored into a hierarchy of
classes.
o All classes should be arranged in a generalized form or in a hierarchical manner.
o In this approach, we apply inheritance property.
o Elements inherit values from other members of a class.
o This approach contains inheritable knowledge which shows a relation
between instance and class, and it is called instance relation.
o Every individual frame can represent the collection of attributes and its value.
o In this approach, objects and values are represented in Boxed nodes.
o We use Arrows which point from objects to their values.
Example:
3. Inferential knowledge:
The inferential knowledge approach represents knowledge in the form of formal
logic.
This approach can be used to derive more facts.
It guarantees correctness.
Example: Let's suppose there are two statements:
Marcus is a man.
All men are mortal.
These can be represented as:
man(Marcus)
∀x: man(x) → mortal(x)
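A one-step forward-chaining sketch of the Marcus example; the tuple encoding of predicates and rules is an illustrative assumption:

```python
# Ground fact: man(Marcus).
facts = {("man", "Marcus")}

# Rule forall x: man(x) -> mortal(x), stored as
# (antecedent predicate, consequent predicate).
rules = [("man", "mortal")]

# One round of forward chaining: apply every rule to every matching fact.
derived = {(head, arg) for (body, head) in rules
           for (pred, arg) in facts if pred == body}
facts |= derived

print(("mortal", "Marcus") in facts)  # True: mortal(Marcus) was derived
```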
Requirements for knowledge Representation system:
A good knowledge representation system must possess the following properties.
1. Representational Adequacy:
The KR system should have the ability to represent all kinds of required knowledge.
2. Inferential Adequacy:
The KR system should have the ability to manipulate the representational structures
to produce new knowledge corresponding to existing structures.
3. Inferential Efficiency:
The ability to direct the inferential knowledge mechanism into the most productive
directions by storing appropriate guides.
4. Acquisitional efficiency- The ability to acquire the new knowledge easily using
automatic methods.
Techniques of knowledge Representation
Four Knowledge representation Techniques are
• Logical Representation
• Semantic Network Representation
• Frame Representation
• Production Rules
1. Logical Representation
• Logical representation is a language with some concrete rules which deals with
propositions and has no ambiguity in representation.
• Logical representation means drawing a conclusion based on various
conditions.
• This representation lays down some important communication rules. It consists of
precisely defined syntax and semantics which support sound inference.
• Each sentence can be translated into logic using the syntax and semantics.
Logical representation can be categorized into mainly two logics:
1. Propositional logic
2. First-order (predicate) logic
Advantages of Logical representations:
• Logical representation enables us to do logical reasoning.
• Logical representation is the basis for programming languages.
Disadvantages
• Logical representations have some restrictions and can be challenging to work with.
• The logical representation technique may not be very natural, and inference may
not be very efficient.
2.Semantic Network Representation
• Semantic networks are an alternative to predicate logic for knowledge
representation.
• In semantic networks, we can represent our knowledge in the form of graphical
networks.
• The network consists of nodes representing objects and arcs which describe the
relationships between those objects.
• Semantic networks can categorize objects in different forms and can also link
those objects. Semantic networks are easy to understand and can be easily
extended.
• This representation consists of mainly two types of relations:
– IS-A relation (inheritance)
– Kind-of relation
• A semantic network is a graphic notation for representing knowledge in patterns
of interconnected nodes.
• Semantic networks became popular in artificial intelligence and natural language
processing because they represent knowledge and support reasoning.
Example Statements
• Jerry is a cat
• Jerry is a mammal
• Jerry is owned by priya
• Jerry is white colored
• All Mammals are animal
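The statements above can be sketched as a small semantic network: a dictionary of labelled arcs, with the IS-A and kind-of links supporting an inheritance query (the encoding is an illustrative assumption):

```python
# Labelled arcs of the network: (node, relation) -> node.
edges = {
    ("Jerry", "is-a"):     "cat",
    ("cat", "kind-of"):    "mammal",
    ("mammal", "kind-of"): "animal",
    ("Jerry", "owned-by"): "priya",
    ("Jerry", "color"):    "white",
}

def is_kind_of(node, target):
    # Follow is-a / kind-of links upward through the hierarchy.
    while node is not None:
        if node == target:
            return True
        node = edges.get((node, "is-a")) or edges.get((node, "kind-of"))
    return False

print(is_kind_of("Jerry", "animal"))  # True, via cat -> mammal -> animal
```

The query walks the hierarchy, which is why answering some questions requires traversing the whole network, a disadvantage noted below.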
Advantages
• Simple and easily understandable
• More natural than the logical representation
• Permits use of effective inference algorithms
• Simple approach to investigating the problem space
• Greater expressiveness compared to logic
• Used as a typical connection application among various fields of knowledge, for
instance, between computer science and anthropology
Disadvantages
• Semantic networks take more computational time at runtime, as we need to
traverse the complete network tree to answer some questions
• There is no standard definition for link names
• The network is not intelligent by itself; it depends on its creator
• Negation, disjunction, and general taxonomic knowledge are not easily
expressed.
3.Frame Representation
• A frame is a record-like structure which consists of a collection of attributes
and their values to describe an entity in the world.
• It consists of a collection of slots and slot values. These slots may be of any
type and size.
• Slots have names and values, which are called facets.
• Facets: the various aspects of a slot are known as facets.
• Facets are features of frames which enable us to put constraints on frames.
• Example: IF-NEEDED facets are triggered when the data of a particular slot is
needed.
Let’s take an example of a frame for a book:
• Slot (attribute)   Filler (value)
• Title              Artificial Intelligence
• Genre              Computer Science
• Author             Peter Norvig
• Edition            Third Edition
• Year               2010
• Pages              1152
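The book frame might be sketched as nested dictionaries, where each slot holds facets and a hypothetical IF-NEEDED facet computes a value on demand (the `Summary` slot and helper function are invented for illustration):

```python
# A frame is a collection of slots; each slot holds facets.
book_frame = {
    "Title":   {"value": "Artificial Intelligence"},
    "Genre":   {"value": "Computer Science"},
    "Author":  {"value": "Peter Norvig"},
    "Edition": {"value": "Third Edition"},
    "Pages":   {"value": 1152},
    # No stored value: the IF-NEEDED facet runs when the slot is read.
    "Summary": {"if-needed": lambda f: f["Title"]["value"] + ", " +
                                       f["Edition"]["value"]},
}

def get_slot(frame, slot):
    facets = frame[slot]
    if "value" in facets:
        return facets["value"]
    return facets["if-needed"](frame)  # trigger the attached procedure

print(get_slot(book_frame, "Summary"))  # Artificial Intelligence, Third Edition
```

Grouping related data this way is what makes frames convenient for adding slots, defaults, and constraints, as the advantages below note.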
Advantages
• The frame knowledge representation makes the programming easier by grouping
related data
• It is flexible and used by many AI applications
• It is very easy to add slots for new attributes and relation
• It is easy to include default data and to search for missing values
• Frame representation is easy to understand and visualize.
4. Production Rules
• A production rule system consists of (condition, action) pairs, which mean "if
condition then action". It has mainly three parts:
– The set of production rules
– Working memory
– The recognize-act cycle
• In a production rule system, the agent checks the condition, and if the condition
holds, the production rule fires and the corresponding action is carried out.
• The condition part of a rule determines which rule may be applied to a
problem, and the action part carries out the associated problem-solving step.
• This complete process is called the recognize-act cycle.
• The working memory contains the description of the current state of problem
solving, and rules can write knowledge to working memory; this knowledge may in
turn match and fire other rules.
• If a new situation (state) causes multiple production rules to become eligible to
fire together, this set of rules is called the conflict set. In this situation the agent
needs to select one rule from the set; this is called conflict resolution.
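A toy recognize-act cycle under these assumptions; the rule contents and the first-match conflict-resolution strategy are invented for illustration:

```python
# Rules as (condition set, action) pairs: "if all conditions hold, add action".
rules = [
    ({"alarm"}, "call_neighbors"),
    ({"smoke", "heat"}, "alarm"),
]
working_memory = {"smoke", "heat"}

changed = True
while changed:
    changed = False
    # Recognize: the conflict set is every rule whose conditions all hold
    # and whose action would add something new to working memory.
    conflict_set = [(c, a) for c, a in rules
                    if c <= working_memory and a not in working_memory]
    if conflict_set:
        # Conflict resolution (simplest possible): fire the first rule.
        _, action = conflict_set[0]
        working_memory.add(action)  # Act: write the result back to memory
        changed = True

print(sorted(working_memory))  # ['alarm', 'call_neighbors', 'heat', 'smoke']
```

Note how firing the second rule writes "alarm" to working memory, which then matches and fires the first rule, exactly the match-and-fire chaining described above.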
Probabilistic reasoning in Artificial intelligence
Uncertainty:
Till now, we have learned knowledge representation using first-order logic and propositional
logic with certainty, which means we were sure about the predicates. With this knowledge
representation, we might write A→B, which means if A is true then B is true, but consider a
situation where we are not sure about whether A is true or not then we cannot express this
statement, this situation is called uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we need
uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
Following are some leading causes of uncertainty to occur in the real world.
1. Information occurred from unreliable sources.
2. Experimental Errors
3. Equipment fault
4. Temperature variation
5. Climate change.
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the concept of
probability to indicate the uncertainty in knowledge. In probabilistic reasoning, we combine
probability theory with logic to handle the uncertainty.
We use probability in probabilistic reasoning because it provides a way to handle the uncertainty
that is the result of someone's laziness and ignorance.
In the real world, there are lots of scenarios, where the certainty of something is not confirmed,
such as "It will rain today," "behavior of someone for some situations," "A match between two
teams or two players." These are probable sentences for which we can assume that it will happen
but not sure about it, so here we use probabilistic reasoning.
Need of probabilistic reasoning in AI:
o When there are unpredictable outcomes.
o When specifications or possibilities of predicates becomes too large to handle.
o When an unknown error occurs during an experiment.
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:
▪ Bayes' rule
▪ Bayesian Statistics
Note: We will learn the above two rules in later chapters.
As probabilistic reasoning uses probability and related terms, so before understanding probabilistic
reasoning, let's understand some common terms:
Probability: Probability can be defined as the chance that an uncertain event will occur. It is a
numerical measure of the likelihood that an event will occur. The value of a probability always
remains between 0 and 1:
1. 0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
2. P(A) = 0 indicates total uncertainty in an event A.
3. P(A) = 1 indicates total certainty in an event A.
We can find the probability of an event not happening by using the formula below:
o P(¬A) = probability of event A not happening.
o P(¬A) + P(A) = 1, so P(¬A) = 1 − P(A).
Event: Each possible outcome of a variable is called an event.
Sample space: The collection of all possible events is called sample space.
Random variables: Random variables are used to represent the events and objects in the real world.
Prior probability: The prior probability of an event is probability computed before observing
new information.
Posterior probability: The probability that is calculated after all evidence or information has
been taken into account. It is a combination of the prior probability and new information.
Conditional probability:
Conditional probability is the probability of an event occurring given that another event has
already happened.
Let's suppose we want to calculate the probability of event A when event B has already
occurred, "the probability of A under the condition B". It can be written as:

P(A|B) = P(A⋀B) / P(B)

Where P(A⋀B) = joint probability of A and B, and P(B) = marginal probability of B.
If the probability of A is given and we need to find the probability of B, then it will be
given as:

P(B|A) = P(A⋀B) / P(A)

This can be explained using a Venn diagram: once event B has occurred, the sample space
is reduced to the set B, and we can calculate event A given event B by dividing the
probability P(A⋀B) by P(B).
Example:
In a class, 70% of the students like English and 40% of the students like both English and
mathematics. What percentage of the students who like English also like mathematics?
Solution:
Let A be the event that a student likes Mathematics and B be the event that a student likes
English. Then:
P(A|B) = P(A⋀B) / P(B) = 0.40 / 0.70 = 0.57
Hence, 57% of the students who like English also like Mathematics.
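The same computation as a quick check (variable names are illustrative):

```python
# P(English) and P(English and Math) from the example above.
p_english = 0.70
p_english_and_math = 0.40

# Conditional probability: P(Math | English) = P(A and B) / P(B).
p_math_given_english = p_english_and_math / p_english
print(round(p_math_given_english, 2))  # 0.57
```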
Bayes' theorem in Artificial Intelligence
Bayes' theorem:
Bayes' theorem is also known as Bayes' rule or Bayes' law and is the basis of Bayesian
reasoning, which determines the probability of an event with uncertain knowledge.
In probability theory, it relates the conditional probability and marginal probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian inference is an
application of Bayes' theorem, which is fundamental to Bayesian statistics.
It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).
Bayes' theorem allows updating the probability prediction of an event by observing new information of the real world
Example: If the risk of cancer depends on one's age, then by using Bayes' theorem we can
determine the probability of cancer more accurately with the help of age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B:
From the product rule we can write:
1. P(A ⋀ B) = P(A|B) P(B)
Similarly, for the probability of event B with known event A:
2. P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B) ........ (a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of
most modern AI systems for probabilistic inference.
It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is known as the posterior, which we need to calculate; it is read as the
probability of hypothesis A given that evidence B has occurred.
P(B|A) is called the likelihood: assuming the hypothesis is true, we
calculate the probability of the evidence.
P(A) is called the prior probability: the probability of the hypothesis before considering
the evidence.
P(B) is called the marginal probability: the pure probability of the evidence.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai); hence Bayes' rule can be
written as:
P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak)
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
Applying Bayes' rule:
Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and
P(A). This is very useful in cases where we have good estimates of these three
terms and want to determine the fourth one. Suppose we want to perceive the effect
of some unknown cause and want to compute that cause; then Bayes' rule becomes:
P(cause|effect) = P(effect|cause) P(cause) / P(effect)
Example-1:
Question: What is the probability that a patient has the disease meningitis given a stiff neck?
Given data:
A doctor is aware that the disease meningitis causes a patient to have a stiff neck, and it
occurs 80% of the time. He is also aware of some more facts, which are given as
follows:
o The known probability that a patient has meningitis is 1/30,000.
o The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the
patient has meningitis. We can calculate the following:
P(a|b) = 0.8, P(b) = 1/30000, P(a) = 0.02
Applying Bayes' rule:
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 = 1/750 ≈ 0.0013
Hence, we can assume that 1 patient out of 750 patients with a stiff neck has meningitis.
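The meningitis numbers plugged into Bayes' rule, as a quick check (variable names are illustrative):

```python
# P(b|a) = P(a|b) * P(b) / P(a), with the figures from the example.
p_stiff_given_men = 0.8       # P(a|b): stiff neck given meningitis
p_men = 1 / 30000             # P(b): prior probability of meningitis
p_stiff = 0.02                # P(a): probability of a stiff neck

p_men_given_stiff = p_stiff_given_men * p_men / p_stiff
print(p_men_given_stiff)              # ~0.00133
print(round(1 / p_men_given_stiff))   # 750, i.e. 1 patient in 750
```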
Example-2:
Question: From a standard deck of playing cards, a single card is drawn. The probability that
the card is a king is 4/52. Calculate the posterior probability P(King|Face), i.e., the
probability that the drawn face card is a king.
Solution:
P(King): probability that the card is a king = 4/52 = 1/13
P(Face): probability that the card is a face card = 12/52 = 3/13
P(Face|King): probability that the card is a face card given that it is a king = 1
Putting all values into Bayes' rule, we get:
P(King|Face) = P(Face|King) P(King) / P(Face) = (1 × 1/13) / (3/13) = 1/3
Application of Bayes' theorem in Artificial intelligence:
Following are some applications of Bayes' theorem:
o It is used to calculate the next step of the robot when the already executed step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
Bayesian Belief Network in artificial intelligence
A Bayesian belief network is a key computer technology for dealing with probabilistic events
and solving problems that involve uncertainty. We can define a Bayesian network as:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and
their conditional dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.
Bayesian networks are probabilistic, because these networks are built from a probability
distribution, and also use probability theory for prediction and anomaly detection.
Real-world applications are probabilistic in nature, and to represent the relationships between
multiple events, we need a Bayesian network. It can be used in various tasks including
prediction, anomaly detection, diagnostics, automated insight, reasoning, time-series
prediction, and decision making under uncertainty.
A Bayesian network can be used for building models from data and expert opinions, and it
consists of two parts:
• A directed acyclic graph
• A table of conditional probabilities
The generalized form of a Bayesian network that represents and solves decision problems
under uncertain knowledge is known as an influence diagram.
A Bayesian network graph is made up of nodes and arcs (directed links), where:
Each node corresponds to a random variable, which can be continuous or discrete.
Arcs (directed arrows) represent causal relationships or conditional probabilities between random variables.
These directed links or arrows connect pairs of nodes in the graph.
A link indicates that one node directly influences the other; if there is no directed link,
the nodes are independent of each other.
In the above diagram, A, B, C, and D are random variables represented by the nodes of
the network graph.
If we consider node B, which is connected to node A by a directed arrow, then node A is
called the parent of node B.
Node C is independent of node A.
Note: The Bayesian network graph does not contain any cyclic graph. Hence, it is known as a
directed acyclic graph or DAG.
The Bayesian network has mainly two components:
o Causal Component
o Actual numbers
Each node in the Bayesian network has a conditional probability distribution P(Xi | Parents(Xi)),
which determines the effect of the parents on that node.
Bayesian network is based on Joint probability distribution and conditional probability. So let's first understand the joint
probability distribution
Joint probability distribution:
If we have variables x1, x2, x3, ..., xn, then the probabilities of the different combinations of
x1, x2, x3, ..., xn are known as the joint probability distribution.
P[x1, x2, x3, ..., xn] can be written as follows using the chain rule:
= P[x1 | x2, x3, ..., xn] P[x2, x3, ..., xn]
= P[x1 | x2, x3, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn]
In general, for each variable Xi in a Bayesian network, we can write the equation as:
P(Xi | Xi-1, ..., X1) = P(Xi | Parents(Xi))
Explanation of Bayesian network:
Let's understand the Bayesian network through an example by creating a directed acyclic graph:
Example: Harry installed a new burglar alarm at his home to detect burglaries. The alarm
responds reliably to burglaries but also responds to minor earthquakes. Harry has two
neighbors, David and Sophia, who have taken on the responsibility of informing Harry at work
when they hear the alarm. David always calls Harry when he hears the alarm, but sometimes
he gets confused with the phone ringing and calls then too. On the other hand, Sophia likes to
listen to loud music, so sometimes she misses hearing the alarm. Here we would like to
compute probabilities in this burglary-alarm domain.
Problem:
Calculate the probability that the alarm has sounded, but neither a burglary nor an
earthquake has occurred, and both David and Sophia have called Harry.
Solution:
o The Bayesian network for the above problem is given below. The network structure
shows that burglary and earthquake are the parent nodes of the alarm, directly
affecting the probability of the alarm going off, whereas David's and Sophia's calls
depend on the alarm.
o The network represents that the neighbors do not directly perceive the burglary,
do not notice the minor earthquake, and do not confer before calling.
o The conditional distribution for each node is given as a conditional probability table, or CPT.
o Each row in a CPT must sum to 1, because the entries in the row represent an
exhaustive set of cases for the variable.
o In a CPT, a Boolean variable with k Boolean parents contains 2^k rows of probability
values. Hence, if there are two parents, the CPT will contain 4 rows.
List of all events occurring in this network:
o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)
We can write the events of the problem statement in the form of the joint probability
P[D, S, A, B, E], and rewrite it using the chain rule and the conditional independences of
the network:
P[D, S, A, B, E] = P[D | S, A, B, E] P[S, A, B, E]
= P[D | S, A, B, E] P[S | A, B, E] P[A, B, E]
= P[D | A] P[S | A, B, E] P[A, B, E]
= P[D | A] P[S | A] P[A | B, E] P[B, E]
= P[D | A] P[S | A] P[A | B, E] P[B | E] P[E]
Let's take the observed probabilities for the Burglary and Earthquake components:
P(B = True) = 0.002, which is the probability of a burglary.
P(B = False) = 0.998, which is the probability of no burglary.
P(E = True) = 0.001, which is the probability of a minor earthquake.
P(E = False) = 0.999, which is the probability that an earthquake has not occurred.
We can provide the conditional probabilities as per the tables below.
Conditional probability table for Alarm A:
The Conditional probability of Alarm A depends on Burglar and earthquake:
B E P(A= True) P(A= False)
True True 0.94 0.06
True False 0.95 0.05
False True 0.31 0.69
False False 0.001 0.999
Conditional probability table for David Calls:
The conditional probability that David calls depends on the probability of the alarm.
A P(D= True) P(D= False)
True 0.91 0.09
False 0.05 0.95
Conditional probability table for Sophia Calls:
The conditional probability that Sophia calls depends on her parent node "Alarm."
A P(S= True) P(S= False)
True 0.75 0.25
False 0.02 0.98
From the formula of joint distribution, we can write the problem statement in the form of
probability distribution:
P(S, D, A, ¬B, ¬E) = P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E).
= 0.75* 0.91* 0.001* 0.998*0.999
= 0.00068045.
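The same joint-probability computation from the CPT entries, as a quick check (variable names are illustrative):

```python
# P(S, D, A, not B, not E) = P(S|A) P(D|A) P(A|not B, not E) P(not B) P(not E)
p_s_given_a = 0.75             # Sophia calls given alarm
p_d_given_a = 0.91             # David calls given alarm
p_a_given_not_b_not_e = 0.001  # alarm given no burglary, no earthquake
p_not_b = 0.998                # no burglary
p_not_e = 0.999                # no earthquake

p = p_s_given_a * p_d_given_a * p_a_given_not_b_not_e * p_not_b * p_not_e
print(round(p, 8))  # 0.00068045
```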
Hence, a Bayesian network can answer any query about the domain by using the joint
distribution.
The semantics of Bayesian networks:
There are two ways to understand the semantics of a Bayesian network, which are given below:
1. To understand the network as a representation of the joint probability distribution.
This is helpful in understanding how to construct the network.
2. To understand the network as an encoding of a collection of conditional
independence statements.
This is helpful in designing inference procedures.
Question Bank Of Unit 3
1. Discuss in detail the issues in knowledge representation.
2. Write about the types of knowledge.
3. Discuss Bayes' rule and representing knowledge in an uncertain domain.
4. Explain Bayesian networks with an example.
5. Discuss non-monotonic reasoning.
6. Explain different techniques of knowledge representation.
7. Discuss the basics of probability theory.
8. Define the following:
a) Uncertainty b) Prior probability c) Conditional probability d) Joint probability
Assignment of Unit 3
1. Specify the probability model for the Wumpus world example.
2. Describe Bayes' theorem and prove it.