INTRODUCTION TO ARTIFICIAL
INTELLIGENCE 18CS753
Module-2:
KNOWLEDGE
REPRESENTATION ISSUES
By
Dr. Srividya R
Representations and Mapping
✔ There are a variety of ways of representing knowledge (facts) in
AI programs.
✔ Knowledge representation deals with two different kinds of
entities.
1. Facts: truths in some relevant world. These are
things we want to represent.
2. Representations of facts in some chosen form. These
are things that can be manipulated.
✔ One way to structure these entities is at two levels:
1. The knowledge level: facts are described here.
2. The symbol level: representations of objects at the knowledge level are
defined here in terms of symbols that programs can manipulate.
Representations and Mapping
✔ We will follow the model shown in the figure for the rest of the discussion.
✔ We focus on the facts, the representations, and the two-way mappings
between them (representation mappings).
✔ The forward mapping maps facts to representations (facts →
representations).
✔ The backward mapping maps representations to facts
(representations → facts).
✔ Natural-language sentences are one representation of facts that deserves
special mention.
Natural Language sentences
✔ English is considered here as a representation used for getting information
into and out of the system.
✔ Figure shows how these objects are related to each other.
Mathematical logic as representational
formalism
Consider English sentence:
Spot is a dog
• Fact represented in English is now represented in logic as:
dog(Spot)
• Suppose there is a logical representation of the fact that all dogs have tails:
∀x: dog(x) → hastail(x)
• Using deductive mechanisms, we generate new representation object:
hastail(Spot)
• Using an appropriate backward mapping function, we can generate
the sentence:
Spot has a tail
• Usually the mapping functions are not one-to-one; they are
many-to-many relations.
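The Spot example can be traced end to end in a few lines of Python. This is only a minimal sketch: the function names forward, deduce and backward, the single sentence pattern, and the RULES and TEMPLATES tables are assumptions made for illustration, not part of the original material.

```python
import re

# Assumed rule table standing in for: ∀x: dog(x) → hastail(x)
RULES = {"dog": "hastail"}

# Assumed English templates used by the backward mapping
TEMPLATES = {"dog": "{} is a dog", "hastail": "{} has a tail"}


def forward(sentence):
    """Forward mapping: English sentence -> (predicate, argument)."""
    match = re.fullmatch(r"(\w+) is a (\w+)", sentence)
    if match is None:
        raise ValueError("sentence pattern not covered by this sketch")
    return (match.group(2), match.group(1))


def deduce(fact):
    """Apply the single rule to a fact to obtain a new representation."""
    predicate, argument = fact
    return (RULES[predicate], argument)


def backward(fact):
    """Backward mapping: (predicate, argument) -> English sentence."""
    predicate, argument = fact
    return TEMPLATES[predicate].format(argument)


fact = forward("Spot is a dog")   # ("dog", "Spot")
new_fact = deduce(fact)           # ("hastail", "Spot")
print(backward(new_fact))
```

The three steps correspond to the forward mapping, the internal deduction, and the backward mapping described above.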
Mutilated Checkerboard Problem
✔ When we start from English sentences, we must first decide what facts the
sentences represent and then convert those facts into the new representation.
✔ A good representation makes the operation of a reasoning program
correct and trivial.
✔ Consider the example of the Mutilated Checkerboard Problem:
“Consider a normal checkerboard from which two squares, in opposite
corners, have been removed.
The task is to cover all the remaining squares exactly with dominoes, each
of which covers two squares.
No overlapping, either of dominoes on top of each other or of dominoes
over the boundary of the mutilated board, is allowed.
Can this task be done?”
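Figure shows three ways in which the mutilated checkerboard could be
represented.
▪ First representation (the plain board): does not directly suggest the answer to the
problem.
▪ Second representation (squares colored black and white): may suggest the answer.
▪ Third representation (number of black squares = 30, number of white squares = 32):
directly suggests the answer, since every domino must cover one black and one white square.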
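The counting argument behind the third representation can be checked with a short Python sketch. The function name color_counts and the coordinate convention (row, column, with (0, 0) as one corner) are assumptions chosen for illustration.

```python
def color_counts(removed, size=8):
    """Count black and white squares left on a size x size board."""
    squares = {(r, c) for r in range(size) for c in range(size)}
    squares -= set(removed)
    # A square is treated as "black" when row + column is even (assumed convention)
    black = sum((r + c) % 2 == 0 for r, c in squares)
    white = len(squares) - black
    return black, white


# Remove two opposite corners; both have the same color
black, white = color_counts({(0, 0), (7, 7)})
print(black, white)

# Every domino covers one black and one white square, so equal counts
# are a necessary condition for a complete covering.
print("covering possible?", black == white)
```

Once the board is reduced to two numbers, the reasoning program has almost nothing left to do, which is the point of choosing a good representation.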
Representations and Mapping
✔ Dotted lines: represent abstract reasoning that program is
intended to model.
✔ Solid lines: represent concrete reasoning that particular program
performs.
Approaches to Knowledge Representation
✔ A good system for the representation of knowledge should have four
properties:
1. Representational Adequacy: the ability to represent all kinds of
knowledge needed in that domain.
2. Inferential Adequacy: the ability to manipulate the representational
structures to derive new structures.
3. Inferential Efficiency: the ability to incorporate into the knowledge
structure additional information that focuses the inference mechanisms
in the most promising directions.
4. Acquisitional Efficiency: the ability to acquire new information easily.
• Unfortunately, no single system that optimizes all of these capabilities has
yet been found.
Simple Relational Knowledge
• The simplest way to represent facts is as a set of relations.
• Knowledge represented in this form may serve as input to more
powerful inference engines.
• Table below shows an example of such relational system.
Inheritable Knowledge
✔ Knowledge about objects, their attributes, and their values need not be
simple.
✔ We can augment the basic representation with inference mechanisms.
✔ One of the most useful forms of inference is property inheritance.
✔ Here elements of specific classes inherit attributes and values from
more general classes.
Inheritable Knowledge
✔ Figure shows baseball knowledge inserted into such a structure.
✔ Lines represent attributes.
✔ Boxed nodes represent objects and values of attributes of objects.
Inheritable Knowledge
✔ The structure shown in the figure is a slot-and-filler structure.
✔ It may also be called a semantic network or a collection of frames.
✔ Figure shows the node for baseball player displayed as a frame.
✔ Usually the term frame system implies somewhat more
structure on the attributes and inference mechanisms than the term
semantic network does.
Property Inheritance
✔ An idealized form of the property inheritance algorithm is stated below:
Algorithm: Property Inheritance
To retrieve a value V for attribute A of an instance object O:
1. Find O in the knowledge base.
2. If there is a value there for the attribute A, report that value.
3. Otherwise, see if there is a value for the attribute instance. If not, then
fail.
4. Otherwise, move to the node corresponding to that value and look for a
value for the attribute A. If one is found, report it.
5. Otherwise, do until there is no value for the isa attribute or until an
answer is found:
a. Get the value of the isa attribute and move to that node.
b. See if there is a value for the attribute A. If there is, report it.
Inheritable Knowledge
✔ This procedure is simplistic, but it describes the basic mechanism of
inheritance.
✔ We can apply this procedure to our example knowledge base to derive
answers to the following queries:
▪ team(Pee-Wee-Reese) = Brooklyn-Dodgers
This attribute has a value stored explicitly in the knowledge base.
▪ batting-average(Three-Finger-Brown) = 0.106
No explicit batting average is given, so we follow the instance attribute and extract
the value stored there.
▪ height(Pee-Wee-Reese) = 6-1
This represents a default inference. The baseball-player height overrides the
male height because it is found first on the way up the isa hierarchy.
▪ bats(Three-Finger-Brown) = Right
To get a value for the attribute bats, we have to go up the isa hierarchy to
the class Baseball-Player.
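The algorithm and the four queries above can be sketched in Python. The knowledge base below is an assumption reconstructed only from those queries: the class names Pitcher, Fielder and Adult-Male, the adult-male height 5-10, and storing bats directly on Baseball-Player are illustrative choices, since the figure itself is not reproduced here.

```python
# Assumed fragment of the baseball knowledge base (see lead-in)
KB = {
    "Adult-Male": {"height": "5-10"},
    "Baseball-Player": {"isa": "Adult-Male", "height": "6-1", "bats": "Right"},
    "Pitcher": {"isa": "Baseball-Player", "batting-average": 0.106},
    "Fielder": {"isa": "Baseball-Player"},
    "Three-Finger-Brown": {"instance": "Pitcher"},
    "Pee-Wee-Reese": {"instance": "Fielder", "team": "Brooklyn-Dodgers"},
}


def get_value(obj, attr):
    """Retrieve attr for obj, inheriting through instance and isa links."""
    node = KB.get(obj)
    if node is None:
        return None                      # step 1: object not found
    if attr in node:
        return node[attr]                # step 2: value stored explicitly
    cls = node.get("instance")           # step 3: fail if no instance link
    while cls is not None:               # steps 4 and 5: climb the hierarchy
        node = KB[cls]
        if attr in node:
            return node[attr]
        cls = node.get("isa")
    return None


print(get_value("Pee-Wee-Reese", "team"))
print(get_value("Three-Finger-Brown", "batting-average"))
print(get_value("Pee-Wee-Reese", "height"))
print(get_value("Three-Finger-Brown", "bats"))
```

Because the search stops at the first class that supplies a value, the more specific Baseball-Player height is returned before the more general one is ever reached.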
Inferential Knowledge
✔ Property inheritance is a powerful form of inference, but it is not the only useful one.
✔ Sometimes the power of traditional logic is necessary to describe the inferences that are needed.
✔ Figure shows examples of predicate logic used to represent additional
knowledge about baseball.
✔ An inference procedure such as resolution is required to use this
knowledge.
Procedural Knowledge
✔ Procedural (operational) knowledge specifies what to do when.
✔ Code (for example, in LISP) can be used to represent procedural knowledge.
✔ Representing knowledge as code is weak with respect to inferential adequacy
and acquisitional efficiency.
Procedural Knowledge
✔ Production rules can also be used to represent procedural
knowledge.
✔ Figure shows a production rule representing operational
knowledge possessed by a baseball player.
Procedural Knowledge as rules
Issues in Knowledge Representation
✔ Several issues cut across the mechanisms that have been used to represent
various kinds of real-world knowledge:
1. Are any attributes of objects so basic that they occur in almost every
problem domain?
2. Are there any important relationships that exist among attributes of
objects?
3. At what level should knowledge be represented?
4. How should sets of objects be represented?
5. Given a large amount of knowledge stored in a database, how can
relevant parts be accessed when they are needed?
Important attributes
✔ There are two attributes that are of very general significance:
1. instance
2. isa
✔ These attributes support property inheritance.
✔ They represent class membership (instance) and class inclusion (isa).
✔ In logic-based systems these relationships may be represented
implicitly by a set of predicates describing particular classes.
Relationships among Attributes
✔ What properties do attributes have?
1. Inverses.
2. Existence in an isa hierarchy.
3. Techniques for reasoning about values.
4. Single-valued attributes.
Inverses
✔ Entities in the world are related to each other in many different
ways.
✔ These relationships are described as attributes.
✔ We focus on one object and find its relationships with others.
✔ In the figure we used instance, isa and team. Each was drawn as a line
originating at the object being described and terminating at the object representing the value.
✔ We could equally have focused on the object representing the value;
the relationship between the two entities would still exist.
✔ There are two ways to represent this relationship:
First: Represent both relationships in a single representation that
ignores focus.
team(Pee-Wee-Reese, Brooklyn-Dodgers)
Second: Use attributes that focus on a single entity, but use them in
pairs, one the inverse of the other.
✔ Here we could represent the team information with two attributes:
• One associated with Pee Wee Reese:
team = Brooklyn-Dodgers
• One associated with Brooklyn Dodgers:
team-members = Pee-Wee-Reese
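The second approach only works if the two attributes are kept consistent. A minimal sketch of one way to do that follows; the INVERSE table, the assert_attr helper and the nested-dictionary store are assumptions made for illustration.

```python
from collections import defaultdict

# Assumed table pairing each attribute with its inverse
INVERSE = {"team": "team-members", "team-members": "team"}

# object -> attribute -> set of values
kb = defaultdict(lambda: defaultdict(set))


def assert_attr(obj, attr, value):
    """Store obj.attr = value and the inverse link in one step."""
    kb[obj][attr].add(value)
    kb[value][INVERSE[attr]].add(obj)


assert_attr("Pee-Wee-Reese", "team", "Brooklyn-Dodgers")

print(kb["Pee-Wee-Reese"]["team"])
print(kb["Brooklyn-Dodgers"]["team-members"])
```

Forcing every assertion through one helper means the relationship can be followed from either entity without a separate consistency check.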
An Isa Hierarchy of Attributes
✔ Just as there are classes of objects and specializations of those classes,
there are attributes and specializations of attributes.
✔ Consider the attribute height: it is a specialization of
physical-size, which is in turn a specialization of physical-attribute.
✔ These generalization-specialization relationships among attributes support inheritance.
Techniques for Reasoning about Values
✔ Sometimes values of attributes are specified explicitly when a
knowledge base is created.
✔ Often, however, the reasoning system must reason about values it has not been given
explicitly.
✔ Several kinds of information can be used in this reasoning, including:
1. Information about the type of the value. Ex- the value
of height must be a number.
2. Constraints on the value, often stated in terms of related entities.
Ex- a child's age cannot be greater than the age of either parent.
3. Rules for computing the value when it is needed (if-needed rules or
backward rules).
4. Rules that describe actions that should be taken if a value ever
becomes known (if-added rules or forward rules).
Single-Valued Attributes
✔ A specific and very useful kind of attribute is one that is guaranteed to take a unique value.
✔ Ex- a baseball player can, at any one time, have only a single height and
be a member of only one team.
✔ Knowledge-representation systems have taken several approaches to
providing support for single-valued attributes:
• Introduce an explicit notation for temporal interval, and signal a contradiction
if two different values are asserted for the same interval.
• Assume that the only temporal interval of interest is now, so that a newly
asserted value replaces the old one.
• Provide no explicit support.
Choosing the Granularity of Representation
✔ At what level of detail should the world be represented?
✔ Ex- Consider the statement:
John spotted Sue
✔ This is represented as
spotted(agent(John), object(Sue))
// represents the roles involved in the event
✔ Such a representation makes it easy to answer a question such as:
Who spotted Sue?
✔ But suppose we want to know: Did John see Sue?
The obvious answer is “yes”.
✔ With only the one fact given we cannot derive that answer. We
can add other facts, such as
spotted(x, y) → saw(x, y)
Choosing the Granularity of Representation
✔ An alternative solution to this problem is to represent explicitly the fact that
spotting is a kind of seeing (spotted → seeing).
✔ We can write it as
saw(agent(John), object(Sue), timespan(briefly))
✔ Spotting is broken down here into seeing plus a timespan (spotting =
seeing + timespan).
Choosing the Granularity of Representation
✔ High-level facts may require a lot of storage when broken down into
primitives.
✔ Ex- suppose we represent actions as combinations of a small set of primitives.
✔ The fact that John punched Mary might then be represented as the sequence:
1. Contact between fist and Mary
2. John's fist propelled towards Mary
3. John went to Mary's place
Choosing the Granularity of Representation
✔ Ex- Consider the fact:
Mary is Sue's cousin
• To represent it, the primitives must incorporate some generalizations.
• Point to be learned: even in simple domains, the correct set of primitives
is not obvious.
Choosing the Granularity of Representation
✔ In less well-structured domains, even more problems arise.
✔ Given the fact
John broke the window
• A program cannot decide whether John's actions consisted of the primitive sequence:
1. Pick up a hard object.
2. Hurl the object through the window.
• Or the single action:
1. Cause hand/foot to move fast and crash into the window.
• Or the single action:
1. Shut the window so hard that the glass breaks.
✔ Choosing the correct granularity of representation for a particular body of
knowledge is not easy.
Representing Sets of Objects
✔ It is important to be able to represent sets of objects for several reasons:
1. Some properties are true of sets but are not true of the individual
members of a set.
2. If a property is true of all elements of a set, then it is more efficient to
associate it once with the set rather than to associate it explicitly
with every element of the set.
✔ There are three ways of representing sets:
1. The simplest is just by a name.
• This makes it possible to associate predicates (which specify actions or
relations) with sets.
• It does not, by itself, provide any information about the set it
represents.
• It does not tell how to determine whether a particular object is a
member of the set or not.
Representing Sets of Objects
✔ There are two ways to state a definition of a set and its elements:
1. First: list the members (extensional definition).
Ex- the set of the Sun's planets on which people live: {Earth}
2. Second: provide a rule that, when an object is evaluated, returns
true or false depending on whether the object is in the set or not
(intensional definition).
Ex- {x: sun-planet(x) ∧ human-inhabited(x)}
• These two kinds of representation can function differently in some cases.
• One extensional definition can correspond to many intensional definitions. For {Earth}:
{x: sun-planet(x) ∧ nth-farthest-from-sun(x, 3)}
{x: sun-planet(x) ∧ nth-biggest(x, 5)}
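The difference between the two kinds of definition can be sketched in Python: an extensional definition is a stored collection, while an intensional definition is a membership test. The names PLANETS, HUMAN_INHABITED and the two predicate functions are assumptions for illustration; the planet list is in order of distance from the Sun.

```python
# Sun's planets in order of distance from the Sun
PLANETS = ["Mercury", "Venus", "Earth", "Mars",
           "Jupiter", "Saturn", "Uranus", "Neptune"]
HUMAN_INHABITED = {"Earth"}        # assumed fact base

# Extensional definition: list the members
extensional = {"Earth"}


# Intensional definitions: rules that return True or False for an object
def inhabited_planet(x):
    return x in PLANETS and x in HUMAN_INHABITED


def third_from_sun(x):
    return x in PLANETS and PLANETS.index(x) == 2


# Two different rules can pick out the same extension
print({p for p in PLANETS if inhabited_planet(p)})
print({p for p in PLANETS if third_from_sun(p)})
```

If the fact base changes, for example if another planet were added to HUMAN_INHABITED, the set described by inhabited_planet changes with it while the extensional set stays fixed.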
Representing Sets of Objects
✔ Intensional representations can be used to describe infinite sets
and sets whose elements may not all be explicitly known.
✔ If we allow intensional representations to depend on parameters
that change, then the actual set that is represented by the
description will change as a function of the value of those
parameters.
✔ Ex- Consider the statement
“The President of the US used to be a Democrat”
said when the current president is a Republican.
This sentence can mean two things:
• The current president was previously a Democrat.
• A previous president of the US was a Democrat.
Finding the Right Structures as Needed
✔ Suppose we have a script that describes the typical sequence of events in
a restaurant.
✔ This script would enable us to take a text such as
John went to Steak & Ale last night. He ordered a large rare steak, paid his
bill, and left.
✔ And answer “yes” to the question: Did John eat dinner last night?
✔ Nowhere does the text explicitly mention John eating.
✔ If the restaurant script is available in advance, we can answer the
question easily.
Finding the Right Structures as Needed
✔ In order to have access to the right structure for describing a particular
situation, it is necessary to solve all of the following problems:
• How to perform an initial selection of the most appropriate structure.
• How to fill in appropriate details from the current situation.
• How to find a better structure if the one chosen initially turns out not
to be appropriate.
• What to do if none of the available structures is appropriate.
• When to create and remember a new structure.
The Frame Problem
✔ A concern is how to represent efficiently the sequences of problem
states that arise from a search process.
One strategy: store each state description as a list of facts.
What happens if the description is very long?
Most of the facts will not change from one state to another, yet every fact is
stored again at every node, so we may run out of memory.
Frame problem: The problem of representing the facts that change as well
as those that do not is known as the frame problem.
Frame axioms: To support reasoning about what does not change, some systems make use of an explicit
set of axioms called frame axioms.
• We can write such an axiom as follows:
color(x, y, s1) ∧ move(x, s1, s2) → color(x, y, s2)
• An alternative is to make the assumption that “the only things that
change are the things that must”.
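The assumption that only the things that must change do change can be sketched as follows. Each action lists just the facts it adds and deletes; every other fact carries over automatically, so no frame axiom such as the color axiom is written. This add-list and delete-list style is borrowed here as an illustration and is not described in the slides; the function name apply_action and the example facts are assumptions.

```python
def apply_action(state, add=frozenset(), delete=frozenset()):
    """Return the successor state; unlisted facts persist unchanged."""
    return (state - set(delete)) | set(add)


# State s1: a red block on the table (illustrative facts)
s1 = frozenset({("color", "block", "red"), ("at", "block", "table")})

# move: only the location facts are mentioned
s2 = apply_action(
    s1,
    add={("at", "block", "shelf")},
    delete={("at", "block", "table")},
)

print(("color", "block", "red") in s2)   # color was never mentioned by the action
print(sorted(s2))
```

This sketch still builds a complete fact set for every state; a search program that worries about memory could instead store only the add and delete lists along each path.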
USING PREDICATE LOGIC
Representing Simple Facts In Logic
✔ Propositional logic is appealing because it is simple to deal with & a
decision procedure for it exists.
✔ We can easily represent real-world facts as logical propositions written
as well-formed formulas (wff’s) in propositional logic, as shown in fig.
It is raining
RAINING
It is sunny
SUNNY
It is windy
WINDY
If it is raining, then it is not sunny
RAINING → ¬SUNNY
Fig- Some Simple Facts in Propositional Logic
Representing Simple Facts In Logic
✔ Suppose we want to represent the obvious fact stated by the classical
sentence
Socrates is a man
✔ We could write:
SOCRATESMAN
✔ But if we also wanted to represent
Plato is a man.
✔ We would have to write something such as:
PLATOMAN
✔ Which would be a totally separate assertion. It is better to represent
these facts as:
MAN(SOCRATES)
MAN(PLATO)
✔ Here the structure of representation reflects structure of knowledge
itself.
✔ It is more difficult to represent the classic sentence
All men are mortal
✔ To capture it we need variables and quantification, which propositional logic lacks.
Representing Simple Facts In Logic
• We therefore consider first-order predicate logic as a way of representing knowledge.
• In predicate logic, real-world facts are represented as statements
written as wff's.
USE OF PREDICATE LOGIC:
Consider the following set of sentences.
1. Marcus was a man.
2. Marcus was a Pompeian.
3. All Pompeians were Romans.
4. Caesar was a ruler.
5. All Romans were either loyal to Caesar or hated him.
6. Everyone is loyal to someone.
7. People only try to assassinate rulers they are not loyal to.
8. Marcus tried to assassinate Caesar.
Representing Simple Facts In Logic
The facts described here can be represented as wff's as follows:
1. Marcus was a man.
man(Marcus)
This representation captures the critical fact of Marcus being a man.
2. Marcus was a Pompeian.
Pompeian(Marcus)
3. All Pompeians were Romans.
∀x: Pompeian(x) → Roman(x)
4. Caesar was a ruler.
ruler(Caesar)
5. All Romans were either loyal to Caesar or hated him.
∀x: Roman(x) → loyalto(x, Caesar) ∨ hate(x, Caesar)
Here the inclusive-or interpretation is used. To express exclusive or, we would
have to write:
∀x: Roman(x) → [(loyalto(x, Caesar) ∨ hate(x, Caesar)) ∧
¬(loyalto(x, Caesar) ∧ hate(x, Caesar))]
Representing Simple Facts In Logic
6. Everyone is loyal to someone.
∀x: ∃y: loyalto(x, y)
The scope of quantifiers is a major problem that arises when converting English
sentences into logical statements.
7. People only try to assassinate rulers they are not loyal to.
∀x: ∀y: person(x) ∧ ruler(y) ∧ tryassassinate(x, y) → ¬loyalto(x, y)
8. Marcus tried to assassinate Caesar.
tryassassinate(Marcus, Caesar)
Suppose these statements are used to answer the question
“Was Marcus loyal to Caesar?”
Using 7 and 8, it seems that we should be able to prove that Marcus was not loyal to Caesar.
We try to produce a formal proof by reasoning backward from the desired goal:
¬loyalto(Marcus, Caesar)
In order to prove the goal, we use the rules of inference to transform it into
another goal, and so on until no unsatisfied goals remain.
Representing Simple Facts In Logic
Here an attempt is made to produce a proof of the goal.
The attempt fails, since there is no way to satisfy the goal
person(Marcus) with the statements available.
The problem is that we have no way to conclude that Marcus was a
person.
We need to add the representation of another fact to our system, namely:
9. All men are people.
∀x: man(x) → person(x)
• Now we can satisfy the goal and produce a proof that Marcus was not loyal
to Caesar.
• Three important issues must be addressed in converting English sentences
into logical statements and then using those statements to deduce new
ones:
1. Many English sentences are ambiguous.
2. There is often a choice of how to represent knowledge. Simple
representations are desirable.
3. Even in very simple situations, a set of sentences may not contain all the
information necessary to reason about the topic.
A further problem arises when we do not know in advance which statement to deduce:
loyalto(Marcus, Caesar)
¬loyalto(Marcus, Caesar)
Representing Instance and Isa Relationships
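• Figure shows three ways of representing class membership.
1st part: contains the representations already discussed, in which class
membership is represented with unary predicates such as Roman.
2nd part: contains representations that use the instance predicate
explicitly. The predicate instance is binary: its first argument is an object and
its second argument is the class to which the object belongs.
3rd part: contains representations that use both the instance and isa predicates
explicitly.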
Representing Instance and Isa Relationships
These examples illustrate two points:
1st: is fairly specific.
Class and superclass memberships are important facts. They need not be
represented with predicates named instance and isa; they can be
represented using unary predicates corresponding to the classes.
2nd: is more general.
There are usually several different ways of representing a given fact.
Here the choice depends partly on:
• which deductions need to be supported most efficiently, and
• taste.
Many errors in the reasoning performed by knowledge-based programs
are the result of inconsistent representation decisions.
Computable Functions & Predicates
Simple facts are expressed as combinations of individual predicates,
such as:
tryassassinate(Marcus, Caesar)
This is fine if
• the number of facts is not very large, or
• the facts themselves are sufficiently unstructured.
But suppose we want to express simple facts such as the following:
gt(1,0) lt(0,1)
gt(2,1) lt(1,2)
gt(3,2) lt(2,3)
:
We do not want to write out the representation of each of these facts
individually; there are infinitely many of them.
It becomes useful to augment our representation with computable
predicates.
Computable Functions & Predicates
• It is often useful to have computable functions as well as computable predicates.
• Thus we can evaluate the truth of
gt(2+3, 1)
by first computing the value of the plus function for the arguments 2 and 3 and
then sending the arguments 5 and 1 to gt.
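Evaluating gt(2+3, 1) can be sketched in Python by calling a procedure instead of looking up a stored fact. The nested-tuple term format and the FUNCTIONS and PREDICATES tables are assumptions made for this sketch.

```python
import operator

# Assumed tables mapping symbols to the procedures that compute them
FUNCTIONS = {"+": operator.add, "-": operator.sub}
PREDICATES = {"gt": operator.gt, "lt": operator.lt}


def evaluate(term):
    """Evaluate a nested term such as ("gt", ("+", 2, 3), 1)."""
    if not isinstance(term, tuple):
        return term                            # a plain number
    symbol, *args = term
    values = [evaluate(a) for a in args]       # compute the arguments first
    if symbol in FUNCTIONS:
        return FUNCTIONS[symbol](*values)
    if symbol in PREDICATES:
        return PREDICATES[symbol](*values)
    raise ValueError(f"unknown symbol: {symbol}")


# The plus function yields 5, then 5 and 1 are sent to gt
print(evaluate(("gt", ("+", 2, 3), 1)))
```

No table of gt and lt facts is needed; the truth value is computed on demand for any pair of numbers.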
Consider the following set of facts, again involving Marcus :
1. Marcus was a man.
Man(Marcus)
2. Marcus was a Pompeian.
Pompeian(Marcus)
3. Marcus was born in 40 A.D.
Born(Marcus,40)
4. All men are mortal.
∀x: man(x) → mortal(x)
Computable Functions & Predicates
5. All Pompeians died when the volcano erupted in 79 A.D.
erupted(volcano, 79) ∧ ∀x: [Pompeian(x) → died(x, 79)]
6. No mortal lives longer than 150 years.
∀x: ∀t1: ∀t2: mortal(x) ∧ born(x, t1) ∧ gt(t2 - t1, 150) → dead(x, t2)
7. It is now 1991 A.D.
now = 1991
Suppose we want to answer the question “Is Marcus alive?”
There are two ways of deducing an answer:
1. We can show that he must be dead because he was killed by the volcano.
2. We can show that he must be dead because he would otherwise be
more than 150 years old.
However, our statements talk only about being dead; they say nothing that relates to
being alive.
Computable Functions & Predicates
8. Alive means not dead.
∀x: ∀t: [alive(x, t) → ¬dead(x, t)] ∧ [¬dead(x, t) → alive(x, t)]
9. If someone dies, then he is dead at all later times.
∀x: ∀t1: ∀t2: died(x, t1) ∧ gt(t2, t1) → dead(x, t2)
• This representation says that one is dead in all years after the one in
which one died.
• Now we attempt to answer “Is Marcus alive?” by proving
¬alive(Marcus, now)
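Both routes to ¬alive(Marcus, now) can be followed in a small Python sketch. It hard-codes statements 1 to 9 as dictionaries and functions rather than running a general prover, and it treats "not shown dead" as alive, which is an assumption of the sketch; all names below are illustrative.

```python
NOW = 1991                 # statement 7
ERUPTION_YEAR = 79         # statement 5
born = {"Marcus": 40}      # statement 3
men = {"Marcus"}           # statement 1
pompeians = {"Marcus"}     # statement 2


def gt(a, b):
    """Computable predicate gt."""
    return a > b


def dead_by_eruption(x, t):
    """Statements 5 and 9: Pompeians died in 79 and are dead at later times."""
    return x in pompeians and gt(t, ERUPTION_YEAR)


def dead_by_age(x, t):
    """Statements 4 and 6: a mortal born at t1 is dead once t - t1 > 150."""
    mortal = x in men
    return mortal and x in born and gt(t - born[x], 150)


def alive(x, t):
    """Statement 8: alive means not dead."""
    return not (dead_by_eruption(x, t) or dead_by_age(x, t))


print(dead_by_eruption("Marcus", NOW))
print(dead_by_age("Marcus", NOW))
print(alive("Marcus", NOW))
```

Either route alone is enough to conclude that Marcus is not alive now; the two functions correspond to the two proofs.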
Computable Functions & Predicates
• Two proofs are shown in the figures here.
• NIL at the end of each proof indicates that the list of conditions remaining to be proved is empty.
Computable Functions & Predicates
From these proofs two things should be clear:
1. Even very simple conclusions can require many steps to prove.
2. A variety of processes, such as matching, substitution and application
of modus ponens, are involved in the production of a proof.
NOTE:
Modus ponens: the rule of inference that from P and P → Q concludes Q.
RESOLUTION
✔ Resolution is a proof procedure.
✔ It gains its efficiency from operating on statements that have been converted to a very
convenient standard form.
✔ Resolution produces proofs by refutation.
Refutation: to prove a statement, show that its negation produces a
contradiction with the known statements.
Conversion to Clause Form
Suppose all Romans who know Marcus either:
• hate Caesar, or
• think that anyone who hates anyone is crazy.
We could represent this using the wff:
∀x: [Roman(x) ∧ know(x, Marcus)] → [hate(x, Caesar) ∨ (∀y: ∃z: hate(y, z) →
thinkcrazy(x, y))]
This formula would be easier to work with if:
1. It were flatter (that is, if there were less embedding of components).
2. The quantifiers were separated from the rest of the formula.
Conversion to Clause Form
Conjunctive Normal Form: has both of those properties.
The statement about all Romans who know Marcus would be represented in
conjunctive normal form as:
¬Roman(x) ∨ ¬know(x, Marcus) ∨ hate(x, Caesar) ∨ ¬hate(y, z) ∨ thinkcrazy(x, y)
Algorithm: Convert to Clause Form
1. Eliminate → (material implication), using the fact that a → b is equivalent to
¬a ∨ b.
Performing this transformation on the wff (well-formed formula) given above
yields:
∀x: ¬[Roman(x) ∧ know(x, Marcus)] ∨ [hate(x, Caesar) ∨ (∀y: ¬(∃z: hate(y, z))
∨ thinkcrazy(x, y))]
2. Reduce the scope of each ¬ to a single term, using the fact that ¬(¬p) = p,
de Morgan's laws, and the correspondences between quantifiers
[¬∀x: P(x) = ∃x: ¬P(x) and ¬∃x: P(x) = ∀x: ¬P(x)].
∀x: [¬Roman(x) ∨ ¬know(x, Marcus)] ∨ [hate(x, Caesar) ∨ (∀y: ∀z: ¬hate(y, z) ∨
thinkcrazy(x, y))]
3. Standardize variables so that each quantifier binds a unique variable.
This process cannot affect the truth value of the wff.
∀x: P(x) ∨ ∀x: Q(x) would be converted to ∀x: P(x) ∨ ∀y: Q(y)
4. Move all quantifiers to the left of the formula without changing their relative
order.
∀x: ∀y: ∀z: [¬Roman(x) ∨ ¬know(x, Marcus)] ∨ [hate(x, Caesar) ∨ (¬hate(y, z) ∨
thinkcrazy(x, y))]
Algorithm: Convert to Clause Form
5. Eliminate existential quantifiers.
∃y: President(y) can be transformed into the formula President(S1)
S1 is a function with no arguments that produces a value satisfying President.
∀x: ∃y: father-of(y, x)
• The value of y (the father) here depends on x (the son), so the generated
function takes x as an argument:
∀x: father-of(S2(x), x)
These generated functions are called Skolem functions.
6. Drop the prefix. The formula at this point appears as
[¬Roman(x) ∨ ¬know(x, Marcus)] ∨ [hate(x, Caesar) ∨ (¬hate(y, z) ∨
thinkcrazy(x, y))]
7. Convert the matrix into a conjunction of disjuncts. (In our example there are no ANDs, only
ORs.)
¬Roman(x) ∨ ¬know(x, Marcus) ∨ hate(x, Caesar) ∨ ¬hate(y, z) ∨
thinkcrazy(x, y)
Algorithm: Convert to Clause Form
8. Create a separate clause corresponding to each conjunct.
9. Standardize apart the variables in the set of clauses generated in step 8.
• No two clauses should refer to the same variable.
• For this, we depend on the fact that
(∀x: P(x) ∧ Q(x)) = ∀x: P(x) ∧ ∀x: Q(x)
• After applying this entire procedure to a set of wff's, we will have a set of clauses.
• These clauses are used by the resolution procedure to generate proofs.
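Steps 1 and 2 of the conversion can be sketched for the propositional case, where there are no quantifiers to handle. The nested-tuple formula format and the two function names are assumptions of this sketch; the quantifier steps, Skolemization and distribution of ∨ over ∧ are not covered.

```python
def eliminate_implications(f):
    """Step 1: rewrite a -> b as (not a) or b throughout the formula."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [eliminate_implications(a) for a in args]
    if op == "->":
        return ("or", ("not", args[0]), args[1])
    return (op, *args)


def push_negations(f):
    """Step 2: reduce the scope of each 'not' to a single term."""
    if isinstance(f, str):
        return f
    op, *args = f
    if op == "not":
        g = args[0]
        if isinstance(g, str):
            return f                                   # already a literal
        g_op, *g_args = g
        if g_op == "not":                              # not (not p) = p
            return push_negations(g_args[0])
        if g_op == "and":                              # de Morgan
            return ("or", *[push_negations(("not", a)) for a in g_args])
        if g_op == "or":                               # de Morgan
            return ("and", *[push_negations(("not", a)) for a in g_args])
    return (op, *[push_negations(a) for a in args])


formula = ("->", ("and", "P", "Q"), "R")               # (P ∧ Q) → R
print(push_negations(eliminate_implications(formula)))
```

The result is the nested form of ¬P ∨ ¬Q ∨ R; flattening nested "or" nodes would give the clause directly.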
The Basis of Resolution
Resolution procedure: It is a simple iterative process.
• At each step, two clauses, called the parent clauses, are
compared (resolved) to obtain a new clause.
• Ex:
winter ∨ summer
¬winter ∨ cold
• Precisely one of winter and ¬winter will be true at any point. If winter is
true, cold must be true; if ¬winter is true, summer must be true. So from
these two clauses we can deduce
summer ∨ cold // resolvent
• This is the deduction that the resolution procedure will make.
• The resolvent is obtained by combining all the literals of the parent clauses
except the ones that cancel.
• The two clauses winter and ¬winter will produce the empty clause.
Resolution in Propositional Logic
• In propositional logic, the procedure for producing a proof by resolution of
proposition P with respect to a set of axioms F is as follows:
Algorithm: Propositional Resolution
1. Convert all propositions of F (the set of axioms) to clause form.
2. Negate P (the proposition to be proved) and convert the result to clause form. Add it to the set
of clauses obtained in step 1.
3. Repeat until either a contradiction is found or no progress can be made:
a. Select two clauses. Call these the parent clauses.
b. Resolve them together. The resulting clause, the resolvent, will be the disjunction of all of the
literals of both parent clauses, except that if one parent contains L and the
other contains ¬L, both L and ¬L are eliminated.
c. If the resolvent is the empty clause, then a contradiction has been found; otherwise add it to the set
of clauses.
Resolution in Propositional Logic
A Few Facts in Propositional Logic
      Given Axioms        Converted to Clause Form
      P                   P                (1)
      (P ∧ Q) → R         ¬P ∨ ¬Q ∨ R      (2)
      (S ∨ T) → Q         ¬S ∨ Q           (3)
                          ¬T ∨ Q           (4)
      T                   T                (5)
• Suppose that, given these axioms, we want to prove R. We convert the axioms to clause form, then
negate R, producing ¬R, which is already in clause form.
• We then select pairs of clauses to resolve together.
• Only pairs that contain complementary literals produce a resolvent; when both
parents are single complementary literals, the resolvent is the empty clause.
• The resolution process takes a set of clauses that are assumed to be true and
generates new clauses from them.
Resolution in Propositional Logic
• In order for proposition 2 to be true, one of ¬P, ¬Q or R must be true.
• We are assuming that ¬R is true, so either ¬P or ¬Q must be true for
proposition 2 to be true.
• Proposition 1 says P is true, so ¬P cannot be true.
• ¬Q is therefore the only way for proposition 2 to be true.
• Proposition 4 can be true if either ¬T or Q is true.
• We know that ¬Q is true, so Q is false.
• The only way for proposition 4 to be true is for ¬T to be true.
• But proposition 5 says T is true.
• Thus there is no way for all these clauses to be true in a single interpretation.
• This is indicated by the empty clause.
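The refutation of ¬R from the five clauses can be reproduced with a small Python sketch of propositional resolution. Clauses are represented as frozensets of string literals with "~" marking negation; that encoding, the function names, and the exhaustive pair selection (no set-of-support or unit preference) are assumptions of the sketch.

```python
from itertools import combinations


def negate(literal):
    """Return the complementary literal: P <-> ~P."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolve(c1, c2):
    """Return every resolvent of two clauses (frozensets of literals)."""
    resolvents = []
    for literal in c1:
        if negate(literal) in c2:
            resolvents.append((c1 - {literal}) | (c2 - {negate(literal)}))
    return resolvents


def refute(clauses):
    """Return True if the empty clause can be derived from the clauses."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for resolvent in resolve(c1, c2):
                if not resolvent:          # empty clause: contradiction found
                    return True
                new.add(resolvent)
        if new <= clauses:                 # no progress can be made
            return False
        clauses |= new


axioms = {
    frozenset({"P"}),                      # 1
    frozenset({"~P", "~Q", "R"}),          # 2
    frozenset({"~S", "Q"}),                # 3
    frozenset({"~T", "Q"}),                # 4
    frozenset({"T"}),                      # 5
}

# To prove R, add its negation and look for a contradiction
print(refute(axioms | {frozenset({"~R"})}))
```

Without the negated goal the five clauses are consistent, so refute(axioms) finds no empty clause and stops once no new resolvents appear.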
The Unification Algorithm
• In propositional logic, it is easy to determine that two literals cannot
both be true at the same time.
• Simply look for L and ¬L.
• In predicate logic, the matching process is more complicated, since the arguments of the
predicates must be considered.
• So we need a matching procedure that compares two literals and discovers
whether there is a set of substitutions that makes them identical.
• The unification algorithm does this.
• For ex:
tryassassinate(Marcus, Caesar)
hate(Marcus, Caesar)
• These cannot be unified, because the predicate symbols differ.
• If the predicate symbols match, then the arguments must be checked, one pair at a time.
The Unification Algorithm
Complication:
We must find a single, consistent substitution for the entire literal.
• For example:
P(x, x) and P(y, z) // the two instances of P match
• Compare x and y and substitute y for x. This is written as
y/x
• If we then matched x and z, we would produce z/x. But both y and z cannot be
substituted for x, so that would not be a consistent substitution.
• Instead we apply y/x throughout the literals to obtain P(y, y) and P(y, z).
• Now we unify y and z by substituting z for y (z/y), which gives P(z, z) and P(z, z).
• Therefore, we can write the composition as
(z/y)(y/x)
The Unification Algorithm
The object of the unification procedure is
to discover at least one substitution that causes two literals to match.
• For Example, the literals
hate(x,y)
hate(Marcus,z)
• Could be unified with any of the following substitutions:
(Marcus/x,z/y)
(Marcus/x, y/z)
(Marcus/x, Caesar/y, Caesar/z)
(Marcus/x, Polonius/y, Polonius/z)
• Unify(L1,L2) procedure returns list of substitutions performed during
match.
Unification Algorithm: Unify(L1,L2)
Algorithm: Unify(L1,L2)
1. If L1 or L2 is a variable or a constant, then:
a. If L1 and L2 are identical, then return NIL.
b. Else if L1 is a variable, then if L1 occurs in L2 then return {FAIL}, else
return (L2/L1).
c. Else if L2 is a variable, then if L2 occurs in L1 then return {FAIL}, else
return (L1/L2).
d. Else return {FAIL}.
2. If the initial predicate symbols in L1 and L2 are not identical, then
return {FAIL}.
3. If L1 and L2 have a different number of arguments, then return {FAIL}.
4. Set SUBST to NIL. (At the end of the procedure, SUBST will contain all the
substitutions used to unify L1 and L2.)
5. For i ← 1 to the number of arguments in L1:
a. Call Unify with the ith argument of L1 and the ith argument of L2, putting
the result in S.
b. If S contains FAIL then return {FAIL}.
c. If S is not equal to NIL then:
i. Apply S to the remainder of both L1 and L2.
ii. SUBST := APPEND(S, SUBST) // add S to SUBST
6. Return SUBST.
• Unification has deep mathematical roots & is useful in many AI programs
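A compact Python sketch of Unify(L1, L2) follows. The term encoding is an assumption: variables are lowercase strings such as x, constants are capitalized strings such as Marcus, and a literal is a tuple whose first element is the predicate symbol. The sketch returns a dictionary mapping each variable to its replacement (an empty dictionary plays the role of NIL) and None plays the role of {FAIL}.

```python
def is_var(t):
    """Variables are lowercase strings (assumed convention)."""
    return isinstance(t, str) and t[:1].islower()


def substitute(t, s):
    """Apply substitution s to term t."""
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, s) for a in t[1:])
    if t in s:
        return substitute(s[t], s)
    return t


def occurs(v, t):
    """Occurs check: does variable v appear inside term t?"""
    if isinstance(t, tuple):
        return any(occurs(v, a) for a in t[1:])
    return v == t


def unify(l1, l2, s=None):
    """Return a substitution unifying l1 and l2, or None on failure."""
    s = {} if s is None else s
    l1, l2 = substitute(l1, s), substitute(l2, s)
    if not isinstance(l1, tuple) or not isinstance(l2, tuple):   # step 1
        if l1 == l2:
            return s
        if is_var(l1):
            return None if occurs(l1, l2) else {**s, l1: l2}
        if is_var(l2):
            return None if occurs(l2, l1) else {**s, l2: l1}
        return None
    if l1[0] != l2[0] or len(l1) != len(l2):                     # steps 2, 3
        return None
    for a, b in zip(l1[1:], l2[1:]):                             # step 5
        s = unify(a, b, s)
        if s is None:
            return None
    return s                                                     # step 6


print(unify(("hate", "x", "y"), ("hate", "Marcus", "z")))
print(unify(("P", "x", "x"), ("P", "y", "z")))
print(unify(("tryassassinate", "Marcus", "Caesar"), ("hate", "Marcus", "Caesar")))
```

Applying the accumulated substitution before each argument pair is compared is what keeps the result consistent across the whole literal, as in the P(x, x) and P(y, z) example.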
Resolution in Predicate Logic
• In order to use resolution for expressions in predicate logic, we use the unification algorithm
to locate pairs of literals that cancel out.
• If two instances of the same variable occur, they must be given identical
substitutions.
• We assume a set of given statements F and a statement to be proved P.
Resolution in Predicate Logic
Algorithm: Resolution
1. Convert all the statements of F (the set of axioms) to clause form.
2. Negate P (the statement to be proved) and convert the result to clause form. Add it to the set
of clauses obtained in step 1.
3. Repeat until either a contradiction is found or no progress can be made:
a. Select two clauses. Call these the parent clauses.
b. Resolve them together. The resolvent will be the disjunction of all of the
literals of both parent clauses with appropriate substitutions performed, except
that if one parent contains T1, the other contains ¬T2, and T1 and T2 are
unifiable, then neither T1 nor ¬T2 appears in the resolvent. T1 and T2 are called complementary literals.
c. If the resolvent is the empty clause, then a contradiction has been found; otherwise add it to the set
of clauses.
Resolution in Predicate Logic
There are strategies to speed up the resolution process:
1. Only resolve pairs of clauses that contain complementary literals.
2. Eliminate certain clauses as soon as they are generated so that they
cannot participate in later resolutions.
3. Whenever possible, resolve either with
✔ one of the clauses that is part of the statement we are trying to refute
or
✔ a clause generated by a resolution with such a clause. This is called the set-of-support
strategy.
4. Whenever possible, resolve with clauses that have a single literal (the
unit-preference strategy).
Resolution in Predicate Logic
Let’s see how resolution can be used to prove new things about Marcus.
Figures show the result of conversion and a resolution proof of the
statement:
hate(Marcus,Caesar)
Axioms in Clause form: // As a Result of Conversion
1. man(Marcus)
2. Pompeian(Marcus)
3. ¬Pompeian(x1) ∨ Roman(x1)
4. ruler(Caesar)
5. ¬Roman(x2) ∨ loyalto(x2, Caesar) ∨ hate(x2, Caesar)
6. loyalto(x3, f(x3)) // everyone is loyal to someone; f is the Skolem function that replaces the existentially quantified y in loyalto(x, y)
7. ¬man(x4) ∨ ¬ruler(y1) ∨ ¬tryassassinate(x4, y1) ∨ ¬loyalto(x4, y1)
8. tryassassinate(Marcus, Caesar)
Prove: hate(Marcus, Caesar)
Resolution in Predicate Logic
• Suppose our actual goal in proving the assertion
hate(Marcus, Caesar)
was to answer the question
“Did Marcus hate Caesar?”
• In that case, we might just as easily have tried to prove the statement ¬hate(Marcus, Caesar).
• To do so, we would have added
hate(Marcus, Caesar)
to the set of available clauses and begun the resolution process.
• No clause contains the literal ¬hate, and resolution can only combine literals from existing clauses, so hate(Marcus, Caesar) will not produce a contradiction with the known statements.
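Both outcomes can be checked with a small refutation procedure. The sketch below is a simplification and not the full predicate-logic algorithm: it assumes the Marcus axioms have already been instantiated with Marcus and Caesar (so no unification is needed), leaves out clause 6, and represents each ground clause as a frozenset of literal strings with "~" marking negation. The names negate, resolvents, refutes and AXIOMS are hypothetical.

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All clauses obtained by cancelling one complementary pair of literals."""
    return [frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
            for lit in c1 if negate(lit) in c2]

def refutes(clauses):
    """True if the empty clause can be derived (a contradiction exists)."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolvents(c1, c2):
                if not r:            # empty clause: contradiction found
                    return True
                new.add(r)
        if new <= clauses:           # no progress can be made
            return False
        clauses |= new

# Assumed ground instances of the axioms (x := Marcus, y := Caesar).
AXIOMS = [
    frozenset({"man(Marcus)"}),
    frozenset({"Pompeian(Marcus)"}),
    frozenset({"~Pompeian(Marcus)", "Roman(Marcus)"}),
    frozenset({"ruler(Caesar)"}),
    frozenset({"~Roman(Marcus)", "loyalto(Marcus,Caesar)", "hate(Marcus,Caesar)"}),
    frozenset({"~man(Marcus)", "~ruler(Caesar)",
               "~tryassassinate(Marcus,Caesar)", "~loyalto(Marcus,Caesar)"}),
    frozenset({"tryassassinate(Marcus,Caesar)"}),
]
```

Under these assumptions, adding the negated goal ~hate(Marcus,Caesar) lets refutes derive the empty clause, while adding hate(Marcus,Caesar) instead does not.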
The Need to Try Several Substitutions
• Resolution provides a very good way of finding a refutation proof without actually trying all the substitutions.
• However, it does not always eliminate the necessity of trying more than one substitution.
• Example: suppose we also know
hate(Marcus, Paulus)
hate(Marcus, Julian)
• If we want to prove that Marcus hates some ruler, we may have to try each of these substitutions in turn, as shown in the next figures.
Question Answering
Resolution can be used to answer fill-in-the-blank questions, such as
“When did Marcus die?” or
“Who tried to assassinate a ruler?”
Answering these involves:
1. finding a known statement that matches the terms given in the question, and
2. then responding with another piece of that statement that fills the slot.
• For example, to answer the question “When did Marcus die?”, we need a statement of the form
died(Marcus, ??)
• with ?? actually filled in by some particular year.
• Since we can prove died(Marcus, 79), we can respond with the answer 79.
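The slot-filling idea can be sketched as a simple match against known statements. This is a minimal sketch under assumptions: the statements are taken to be already stored (or already proved) ground facts written as tuples, and None marks the slot to be filled. The names FACTS and answer are hypothetical.

```python
# Assumed known statements, each written as (predicate, arg1, arg2, ...).
FACTS = [
    ("man", "Marcus"),
    ("died", "Marcus", 79),
    ("tryassassinate", "Marcus", "Caesar"),
]

def answer(query, facts):
    """Return the values that fill the None slots of the query, or None if nothing matches."""
    for fact in facts:
        same_shape = len(fact) == len(query)
        if same_shape and all(q is None or q == f for q, f in zip(query, fact)):
            return [f for q, f in zip(query, fact) if q is None]
    return None
```

With these facts, answer(("died", "Marcus", None), FACTS) fills the slot with 79.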
Natural Deduction
Issues with Resolution:
1. While converting to clause form, valuable heuristic information is lost.
2. People do not think in resolution, so it is hard for them to interact with a resolution theorem prover.
To let a machine theorem prover work in a way that is closer to human theorem proving, we look for an approach known as natural deduction.
Natural Deduction:
It describes a mix of techniques, used in combination, to solve problems that are not tractable by any one method alone.
Natural Deduction
One common technique is
• To arrange knowledge, not by predicates, but by the objects involved in the predicates.
predicate( object, class )
Another technique is
• To use a set of rewrite rules that not only describe logical implications
• but also suggest the way that those implications can be used in proofs.
INTRODUCTION TO ARTIFICIAL
INTELLIGENCE 18CS753
Module-2:
Representing Knowledge
Using Rules
By
Dr. Srividya R
Procedural Vs Declarative Knowledge
• Logical assertions can be viewed as either a declarative or a procedural representation.
• Declarative Representation: one in which knowledge is specified, but how that knowledge is to be used is not given.
• Procedural Representation: one in which the control information needed to use the knowledge is considered to be embedded in the knowledge itself.
• The real difference between the declarative and procedural views of knowledge lies in where control information resides.
• For example, consider the knowledge base
man(Marcus)
man(Caesar)
person(Cleopatra)
∀x: man(x) → person(x)
Procedural Vs Declarative Knowledge
• Suppose we want to answer the question
∃y: person(y)
• We must bind y to a particular value for which person is true. Our knowledge base justifies any of the following answers:
y = Marcus
y = Caesar
y = Cleopatra
• Only one value is needed, so the answer depends on the order in which the assertions are examined.
• For example, we might specify that assertions will be examined in the order in which they appear in the program, and that search will proceed depth-first.
• If we do that, then the assertions given above, with person(Cleopatra) listed before the rule, describe a program that will answer our question with
y = Cleopatra
Procedural Vs Declarative Knowledge
• To see the difference clearly, consider the following assertions:
man(Marcus)
man(Caesar)
∀x: man(x) → person(x)
person(Cleopatra)
• Viewed declaratively: this is the same knowledge base that we had before.
• Viewed procedurally: the answer to our question is now Marcus.
• This happens because the first statement that can achieve the person goal is the inference rule ∀x: man(x) → person(x)
• This rule sets up a subgoal to find a man.
• Again the statements are examined from the beginning.
• Now Marcus is found to satisfy the subgoal and thus also the goal.
• So Marcus is reported as the answer.
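The effect of assertion order can be reproduced with a toy interpreter. This is a minimal sketch and not how any real system is implemented. It assumes that every predicate takes one argument, that entries are examined top to bottom with depth-first search, and that the first knowledge base lists person(Cleopatra) before the rule while the second lists it after. The names first_answer, KB_A and KB_B are hypothetical.

```python
def first_answer(kb, goal):
    """Return the first value found for goal(y), scanning the entries in order."""
    for entry in kb:
        kind, head, rest = entry
        if kind == "fact" and head == goal:
            return rest                      # a stored fact answers the goal directly
        if kind == "rule" and head == goal:
            result = first_answer(kb, rest)  # the rule sets up a subgoal; rescan from the top
            if result is not None:
                return result
    return None

# ("rule", "person", "man") stands for: for all x, man(x) -> person(x)
KB_A = [("fact", "man", "Marcus"), ("fact", "man", "Caesar"),
        ("fact", "person", "Cleopatra"), ("rule", "person", "man")]
KB_B = [("fact", "man", "Marcus"), ("fact", "man", "Caesar"),
        ("rule", "person", "man"), ("fact", "person", "Cleopatra")]
```

Both lists contain the same declarative knowledge, yet first_answer(KB_A, "person") gives Cleopatra and first_answer(KB_B, "person") gives Marcus.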
Logic Programming
• Logic programming is a programming paradigm in which logical assertions are viewed as programs.
• There are several logic programming systems, such as PROLOG.
• A PROLOG program is a series of logical assertions, each of which is a Horn clause.
• HORN CLAUSE: a clause that has at most one positive literal. Ex: p, ¬p ∨ q, and p → q.
• The restriction to Horn clauses has 2 important consequences:
1. Because of the uniform representation, a simple and efficient interpreter can be written.
2. The logic of Horn clause systems is decidable.
Logic Programming
• The input to a PROLOG program is a goal to be proved.
• Backward reasoning is applied to try to prove the goal.
• The program is read top to bottom, left to right, and search is performed depth-first with backtracking.
• Figure shows an example of a simple knowledge base represented in standard logical notation and then in PROLOG.
• PROLOG programs contain 2 types of statements:
1. Facts: contain only constants and represent statements about specific objects.
2. Rules: contain variables and represent statements about classes of objects.
Logic Programming
Logic representation:
• Variables are explicitly quantified.
• There are explicit symbols for and (∧) and or (∨).
• Implications are written as p → q.
PROLOG representation:
• Quantification is provided implicitly: variables begin with uppercase letters, and constants begin with lowercase letters or numbers.
• There is an explicit symbol for and (,) but none for or; disjunction is written as a list of alternative statements.
• The same implication is written as q :- p.
Logic Programming
PROLOG programs are actually sets of Horn clauses that have been transformed as follows:
1. If the Horn clause contains no negative literals, then leave it as it is.
2. Otherwise, rewrite the Horn clause as an implication,
• combining all the negative literals into the antecedent of the implication and
• leaving the single positive literal as the consequent.
The PROLOG clause:
P(x) :- Q(x, y)
is equivalent to the logical expression
∀x: ∃y: Q(x, y) → P(x)
• The PROLOG interpreter has a fixed control strategy, so the assertions in a PROLOG program define a particular search path to the answer to a question.
• In contrast, logical assertions define only the set of answers that they justify.
Logic Programming
• Suppose the problem we are given is to find a value X that satisfies the predicate apartmentpet(X).
• We state this goal to PROLOG as:
?- apartmentpet(X)
• This is the input to the program.
• The PROLOG interpreter begins looking for:
1. A fact with the predicate apartmentpet, or
2. A rule with that predicate as its head.
• In this example there are no such facts, so the rule is used.
• The rule will succeed if both clauses on its right-hand side can be satisfied.
Logic Programming
• The interpreter tries to prove each of these in turn.
• For the first clause, pet(X), the rule about cats will fail because there are no assertions about the predicate cat in the program.
• The second rule will eventually lead to success, using the rule about dogs and poodles and the fact poodle(fluffy).
• This results in the variable X being bound to fluffy.
• Now the second clause, small(X), must be checked.
• Since X = fluffy, small(fluffy) must be proved. This can be done using the fact poodle(fluffy).
• The program halts with the result apartmentpet(fluffy).
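The search just described can be imitated with a small depth-first backward chainer. This is a sketch under assumptions: the figure with the PROLOG program is not reproduced in these notes, so the rule set below is an assumed reconstruction based on the description above, and every predicate is taken to have exactly one argument. The names RULES, FACTS and solve are hypothetical.

```python
# Assumed program: head -> list of alternative bodies (each body is a conjunction).
RULES = {
    "apartmentpet": [["pet", "small"]],
    "pet": [["cat"], ["dog"]],
    "dog": [["poodle"]],
    "small": [["poodle"]],
}
FACTS = {"poodle": ["fluffy"]}

def solve(pred, x=None):
    """Yield each constant c such that pred(c) is provable (only x itself if x is given)."""
    # Facts are tried first, in the order they are written.
    for c in FACTS.get(pred, []):
        if x is None or x == c:
            yield c
    # Then each rule for pred, top to bottom; the generator gives backtracking.
    for body in RULES.get(pred, []):
        for c in solve(body[0], x):
            # The remaining clauses are checked left to right with X bound to c.
            if all(any(True for _ in solve(p, c)) for p in body[1:]):
                yield c
```

Here next(solve("apartmentpet")) binds X to fluffy, and the cat alternative simply produces no answers.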
Logic Programming
• Logical negation cannot be represented explicitly in pure PROLOG.
• We cannot directly encode the assertion ∀x: dog(x) → ¬cat(x)
• Instead, negation is represented implicitly by the lack of an assertion. This is known as negation as failure.
• If the PROLOG program is given the goal
?- cat(fluffy)
• it would return FALSE because it is unable to prove that Fluffy is a cat.
• Negation as failure relies on the closed world assumption.
• Closed world assumption: all relevant, true assertions are contained in the knowledge base or can be derived from assertions in the knowledge base.
Logic Programming
Advantage
• The programmer needs to specify only rules and facts.
• The search engine is built directly into the language.
Disadvantage
• Search control is fixed.
• However, PROLOG does allow some control of search through a non-logical operator called cut. A cut can be inserted into a rule to specify a point that may not be backtracked over.
Forward Vs Backward Reasoning
• The objective of a search procedure is to discover a path from an initial configuration to a goal state.
• PROLOG only searches from a goal state.
• There are 2 directions in which such a search could proceed:
1. Forward: from the start states.
2. Backward: from the goal states.
• Consider the problem of solving a particular instance of the 8-puzzle. The rules to be used for solving the puzzle can be written as shown in the figure.
Forward Vs Backward Reasoning
• Using these rules we can solve the puzzle.
Forward Vs Backward Reasoning
Reason forward from initial states:
1. Build a tree of move sequences by starting with the initial configurations at the root of the tree.
2. Generate the next level by finding all rules
• whose left sides match the root node and
• using their right sides to create the new configurations.
3. Generate the next level by taking each node generated at the previous level and
• applying to it all rules whose left sides match it.
4. Continue until a configuration that matches the goal state is generated.
Forward Vs Backward Reasoning
Reason backward from goal states: (goal-directed reasoning)
1. Build a tree of move sequences by starting with the goal configurations at the root of the tree.
2. Generate the next level by finding all rules
• whose right sides match the root node and
• using their left sides to generate the nodes at this 2nd level of the tree.
3. Generate the next level by taking each node generated at the previous level and
• finding all rules whose right sides match it.
• Then use the corresponding left sides to generate new nodes.
4. Continue until a node that matches the initial state is generated.
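The two procedures differ only in which side of a rule is matched. The sketch below shows this on a toy rule set instead of the 8-puzzle. It is an illustration under assumptions: states are single letters, each rule is a (left side, right side) pair, and the tree is explored breadth-first. The names RULES and search are hypothetical.

```python
from collections import deque

# Assumed toy rules: (left side, right side) means "left can be rewritten to right".
RULES = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]

def search(start, goal, rules, forward=True):
    """Breadth-first search from the start (forward) or from the goal (backward)."""
    root, target = (start, goal) if forward else (goal, start)
    frontier = deque([[root]])
    seen = {root}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == target:
            # A backward path runs goal -> start, so reverse it for reporting.
            return path if forward else path[::-1]
        for left, right in rules:
            # Forward: match left sides, generate right sides. Backward: the reverse.
            match, produce = (left, right) if forward else (right, left)
            if match == node and produce not in seen:
                seen.add(produce)
                frontier.append(path + [produce])
    return None
```

On this rule set both directions find the same path from A to E, although they expand different nodes along the way.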
Forward Vs Backward Reasoning
• Most search techniques can be used to search either forward or backward.
• We can also search both forward and backward simultaneously until the two paths meet somewhere in between (bidirectional search).
• This is appealing if the number of nodes at each step grows exponentially with the number of steps taken.
• Figure shows why bidirectional search may be ineffective: the two searches may pass each other without meeting.
Forward Vs Backward Reasoning
• It is useful to define 2 classes of rules for forward and backward reasoning:
1. Forward rules: encode knowledge about how to respond to certain input configurations.
2. Backward rules: encode knowledge about how to achieve particular goals.
Backward-Chaining Rule Systems
• These are good for goal-directed problem solving.
• For example, a query system would probably use backward chaining to answer questions.
• In PROLOG, rules are restricted to Horn clauses.
• Rules are matched with the unification procedure, in the order in which they appear.
• Other backward-chaining systems allow for more complex rules.
Forward-Chaining Rule Systems
• These are directed by incoming data instead of by goals.
• Ex: suppose we sense searing heat near our hand; we jerk our hand away.
• This could be construed as goal-directed behavior,
• but it is modeled more naturally by the recognize-act cycle characteristic of forward-chaining rule systems.
• In forward-chaining rule systems, the left sides of rules are matched against the state description.
• Rules that match dump their right-hand side assertions into the state, and the process repeats.
Forward-Chaining Rule Systems
• Matching is typically more complex for forward-chaining systems than for backward ones.
• Consider a rule that checks for some condition in the state description and then adds an assertion.
• After the rule fires, its conditions are still valid, so it could fire again immediately.
• We therefore need some mechanism to prevent repeated firings, especially if the state remains unchanged.
• Most forward-chaining systems implement highly efficient matchers
• and also supply several mechanisms for preferring one rule over another.
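A recognize-act cycle with a simple guard against repeated firing can be sketched as follows. This is a minimal sketch under assumptions: the state is a set of strings, each rule is a pair (set of conditions, set of assertions to add), and a rule is allowed to fire at most once. Real systems use more refined mechanisms. The names RULES and forward_chain are hypothetical.

```python
# Assumed rules: (conditions that must be in the state, assertions to add).
RULES = [
    ({"searing_heat_near_hand"}, {"jerk_hand_away"}),
    ({"jerk_hand_away"}, {"hand_safe"}),
]

def forward_chain(facts, rules):
    """Fire matching rules until nothing new fires; return the final state and firing order."""
    state = set(facts)
    fired = []                       # indices of rules that have already fired
    changed = True
    while changed:
        changed = False
        for i, (conditions, additions) in enumerate(rules):
            # Recognize: the left side matches the state. Guard: never fire twice.
            if i not in fired and conditions <= state:
                state |= additions   # Act: dump the right side into the state.
                fired.append(i)
                changed = True
    return state, fired
```

Without the fired guard, the first rule would keep matching on every cycle because its condition stays in the state.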
Combining Forward and Backward Reasoning
• Sometimes certain aspects of a problem are best handled via forward chaining and other aspects by backward chaining.
• Consider a forward-chaining medical diagnosis program.
• It accepts around 20 facts about a patient’s condition.
• It then forward chains on those facts to deduce the nature or cause of the disease.
• Suppose the left side of a rule is almost satisfied, say 9 of 10 preconditions are met. Then backward chaining can be applied to satisfy the 10th precondition in a directed manner.
• If arbitrary procedures are allowed as the right side of a rule, then the rule will not be reversible.
• Some production languages allow only reversible rules.
• When irreversible rules are used, a commitment to the direction of search must be made at the time the rules are written.
Matching
• Clever search involves choosing, from among the rules that can be applied at a particular point, the ones that are most likely to lead to a solution.
• To extract from the entire collection of rules those that can be applied at a given point requires some kind of matching between the current state and the preconditions of the rules.
Indexing
• One way to select applicable rules is to do a simple search through all the rules, comparing each one’s preconditions to the current state and extracting all the ones that match.
• There are 2 problems with this simple solution:
1. To solve interesting problems, it will be necessary to use a large number of rules. Scanning through all of them at every step is inefficient.
2. It is not always immediately obvious whether a rule’s preconditions are satisfied by a particular state.
• Instead of searching through the rules, use the current state as an index into the rules and select the matching ones immediately.
• Ex: consider the legal-move generation rule for chess shown in the figure.
Indexing
• To access the appropriate rules immediately, we assign an index to each board position.
• All the rules that describe a given board position will be stored under the same key and so will be found together.
• This works only because the rule preconditions match exact board configurations, so the matching process is easy but lacks generality.
• It is often better to write rules in a more general form.
Matching
• In PROLOG and many theorem-proving systems, rules are indexed by the predicates they contain.
• In the chess example, rules can be indexed by pieces and their positions.
• Despite some limitations of this approach, indexing in some form is very important for the efficient operation of rule-based systems.
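Indexing by predicate amounts to a dictionary lookup instead of a scan over every rule. The sketch below is an illustration under assumptions: each rule is a (head predicate, list of body predicates) pair, and rules are indexed by the head predicate only, which is one possible choice and not necessarily what any particular system does. The names RULES and build_index are hypothetical.

```python
from collections import defaultdict

# Assumed rules: (head predicate, body predicates).
RULES = [
    ("pet", ["cat"]),
    ("pet", ["dog"]),
    ("dog", ["poodle"]),
    ("small", ["poodle"]),
]

def build_index(rules):
    """Group the rules by head predicate, keeping their original order."""
    index = defaultdict(list)
    for head, body in rules:
        index[head].append((head, body))
    return index

INDEX = build_index(RULES)
# Rules that can prove a pet(...) goal, found without scanning all of RULES.
pet_rules = INDEX.get("pet", [])
```

The index is built once, and every later goal retrieves its candidate rules with a single lookup.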
Matching with Variables
• The problem of selecting applicable rules is made more difficult when preconditions are not stated as exact descriptions of particular situations but describe properties that the situation must have.
• In many rule-based systems, we need to compute the whole set of rules that match the current state description.
• Backward-chaining systems usually use depth-first backtracking to select individual rules.
• Forward-chaining systems generally employ sophisticated conflict resolution strategies to choose among the applicable rules.
• It is often more efficient to consider the many-many match problem, in which many rules are matched against many elements of the state description simultaneously.
Matching
• One efficient many-many match algorithm is RETE, which gains efficiency from 3 major sources:
1. The temporal nature of data.
2. Structural similarity in rules.
3. Persistence of variable binding consistency.
Complex and Approximate Matching
• A more complex matching process is required when the preconditions of a rule specify required properties that are not stated explicitly in the description of the current state.
• An even more complex matching process is required if rules should be applied when their preconditions only approximately match the current situation.
• For some problems, almost all the action lies in the matching of rules to the problem state.
• Ex: ELIZA, an early AI program that simulated the behavior of a Rogerian therapist.
Complex and Approximate Matching
• A dialogue between ELIZA and user is shown in figure.
Complex and Approximate Matching
• ELIZA’s knowledge about English and psychology was coded in a set of simple rules.
• ELIZA matched the left sides of the rules against the user’s last sentence and used the appropriate right side to generate a response.
• Ex: the input “My brother is mean to me” might produce the response
“Who else in your family is mean to you?” or “Tell me more about your family.”
Some ELIZA-like rules
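Rules of this kind can be approximated with regular expressions. The sketch below is only an illustration: the two patterns and the fallback reply are invented for this example and are not ELIZA's actual rules. The names RULES and respond are hypothetical.

```python
import re

# Assumed ELIZA-like rules: (left side pattern, right side response template).
RULES = [
    (re.compile(r"my (brother|sister|mother|father) is (.+) to me", re.I),
     "Who else in your family is {1} to you?"),
    (re.compile(r"my (brother|sister|mother|father)\b", re.I),
     "Tell me more about your family."),
]

def respond(sentence):
    """Return the response of the first rule whose pattern matches the sentence."""
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            # {1} in a template is replaced by the second captured group.
            return template.format(*match.groups())
    return "Please go on."
```

For the input "My brother is mean to me", the first rule matches and the captured word "mean" is copied into the response.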
Conflict Resolution
• The result of the matching process is a list of rules whose antecedents have matched the current state description.
• It is sometimes useful to incorporate some of the decision making about rule order into the matching process. This phase is called conflict resolution.
• There are three basic approaches to the problem of conflict resolution in a production system:
1. Assign a preference based on the rule that matched.
2. Assign a preference based on the objects that matched.
3. Assign a preference based on the action that the matched rule would perform.
Preference based on rule
• There are 2 common ways of assigning a preference based on the rules themselves.
1. The first and simplest is to consider the rules to have been specified in a particular order.
• Priority is then given to rules in the order in which they appear.
2. The other scheme is to give priority to special-case rules over more general rules.
• The purpose of such special-case rules is to allow problems to be solved directly, without search.
• If all matching rules were considered, adding special-case rules would increase the size of the search.
• To prevent this, the matcher is built so that it rejects rules that are more general than other rules that also match.
Preference based on rule
• Now, how can the matcher decide whether one rule is more general than another?
• There are a few easy ways:
1. If the set of preconditions of one rule contains all the preconditions of another (plus some others), then the second rule is more general than the first.
2. If the preconditions of one rule are the same as those of another except that
• in the first case variables are specified and
• in the second there are constants,
then the first rule is more general than the second.
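The first generality test can be written as a subset comparison. This sketch covers only that test, not the comparison of variables with constants, and it assumes preconditions are plain strings held in frozensets. The rule contents and the names more_general, reject_general, BIRD and PENGUIN are hypothetical.

```python
def more_general(a, b):
    """True if rule a is more general than rule b: b's preconditions strictly contain a's."""
    return a["if"] < b["if"]   # proper-subset test on frozensets

def reject_general(matched):
    """Drop every matched rule that is more general than another matched rule."""
    return [r for r in matched
            if not any(more_general(r, other) for other in matched)]

# Assumed example rules.
BIRD = {"if": frozenset({"bird"}), "then": "flies"}
PENGUIN = {"if": frozenset({"bird", "penguin"}), "then": "does not fly"}
```

When both rules match, reject_general keeps only the special-case rule PENGUIN, so the general rule does not add to the search.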
Preference based on Object
• To ease the burden on the search mechanism, order the matches based on the importance of the objects that are matched.
• Consider ELIZA again.
• Its patterns looked for specific combinations of important keywords.
• Often an input sentence contained several of the keywords that ELIZA knew.
• ELIZA used the fact that some keywords are more significant than others.
• The pattern matcher returned the match involving the highest-priority keyword.
• Priority matching can also occur as a function of the position of the matchable objects in the current state description.
Preference based on States
• Suppose that there are several rules waiting to fire.
• One way of selecting among them is to fire all of them temporarily and examine the results of each.
• Then, using a heuristic function that can evaluate each of the resulting states, compare the merits of the results and select the preferred one.
• Discard the remaining ones.
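This selection scheme can be sketched in a few lines. It is a toy illustration under assumptions: a state is an integer, each rule is a function from state to state, and the heuristic prefers states closer to an assumed target of 10. The names select_by_result, heuristic and the three rule functions are hypothetical.

```python
def add_one(state):
    return state + 1

def double(state):
    return state * 2

def subtract_two(state):
    return state - 2

def heuristic(state, target=10):
    """Higher is better: the negated distance to the assumed target."""
    return -abs(target - state)

def select_by_result(state, rules, evaluate):
    """Fire every rule temporarily, score each resulting state, keep the best."""
    candidates = [(rule(state), rule) for rule in rules]
    best_state, best_rule = max(candidates, key=lambda pair: evaluate(pair[0]))
    return best_rule, best_state   # the other results are discarded
```

Starting from the state 3, the three rules give 4, 6 and 1, so double is selected.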
Control Knowledge
• Knowledge about which paths are most likely to lead quickly to a goal state is often called search control knowledge.
• It can take many forms:
1. Knowledge about which states are preferable to others.
2. Knowledge about which rule to apply in a given situation.
3. Knowledge about the order in which to pursue subgoals.
4. Knowledge about useful sequences of rules to apply.
• There are many ways of representing control knowledge:
1. Rules can be labeled and partitioned.
2. Cost and probability-of-success measures can be assigned to rules.
• Search control knowledge is sometimes called meta-knowledge (knowledge about knowledge).
• The syntax for one type of control rule is shown in the next figure.
Control Knowledge
Syntax for one type of control rule
• Many AI programs represent their control knowledge with rules. 2 such systems are SOAR and PRODIGY.
Control Knowledge-SOAR
• SOAR is a general architecture for building intelligent systems.
• It is based on a set of specific, cognitively motivated hypotheses about the structure of human problem solving.
• These hypotheses are derived from what we know about short-term memory.
• In SOAR:
1. Long-term memory is stored as a set of productions (or rules).
2. Short-term memory (working memory) is a buffer that is affected by perceptions and serves as a storage area.
3. All problem-solving activity takes place as state space traversal.
4. All intermediate and final results of problem solving are remembered for future reference.
Control Knowledge-PRODIGY
• PRODIGY is a general-purpose problem-solving system.
• It automatically constructs a set of control rules to improve search in a particular domain.
• PRODIGY can acquire control rules in a number of ways:
1. Through hand coding by programmers.
2. Through a static analysis of the domain’s operators.
3. Through looking at traces of its own problem-solving behavior.
Control Knowledge-PRODIGY
• It learns control rules from its experience and its failures.
• Search control knowledge can specify the order in which to pursue subgoals.
• Ex: building a piece of wooden furniture. The wood must be sanded, sealed and painted.
• Humans know the right order.
• An AI program may paint first and only later realize that sanding after painting removes the paint.
• Proper control knowledge can prevent this wasted computational effort.
Control Knowledge-PRODIGY
• Rules we might consider include:
1. If a problem’s subgoals include sanding and painting, then we should solve the sanding subgoal first.
2. If the subgoals include sealing and painting, then consider the type of the object.
• If the object is made of wood, then seal it before painting.
• Issues concerning control rules:
1. Utility problem: as more control knowledge is added to the system, the system is able to search more judiciously, which cuts down on the number of nodes it expands. However, the system must consider all the control rules at each step, and if there are many of them, matching them can be very time-consuming.
2. Complexity of the production system interpreter: it must know how to apply various rules and meta-rules.
• Interpreters will become more complex as we progress away from simple backward-chaining systems.
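These two control rules can be read as ordering constraints on subgoals. The sketch below is an illustration only and is not how PRODIGY represents or applies control rules: it assumes subgoals are strings and that each control rule contributes a pair (a, b) meaning "solve a before b". The name order_subgoals is hypothetical.

```python
def order_subgoals(subgoals, material):
    """Order the subgoals so that every 'a before b' constraint is respected."""
    before = [("sand", "paint")]             # control rule 1
    if material == "wood":
        before.append(("seal", "paint"))     # control rule 2 applies only to wood
    remaining, ordered = list(subgoals), []
    while remaining:
        # Pick the first subgoal that is not waiting on another pending subgoal.
        goal = next(g for g in remaining
                    if not any(b == g and a in remaining for a, b in before))
        remaining.remove(goal)
        ordered.append(goal)
    return ordered
```

Given the subgoals paint, sand and seal for a wooden object, painting is pushed to the end, so no effort is wasted on paint that would be sanded off.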
THANK YOU