MSC - AI Notes
Artificial Intelligence is composed of two words, Artificial and Intelligence. Artificial defines "man-made," and intelligence defines "thinking power"; hence AI means "a man-made
thinking power."
"It is a branch of computer science by which we can create intelligent machines which can
behave like a human, think like humans, and able to make decisions."
Artificial Intelligence exists when a machine can have human-based skills such as learning,
reasoning, and problem-solving.
With Artificial Intelligence you do not need to pre-program a machine for every task; instead,
you can create a machine with programmed algorithms which can work with its own
intelligence, and that is the power of AI.
AI is not a new idea: it is said that, as per Greek myth, there were mechanical men in early
days which could work and behave like humans.
Before learning about Artificial Intelligence, we should know why AI is important and why
we should learn it. Following are some main reasons to learn about AI:
o With the help of AI, you can create software or devices which can solve real-world
problems easily and accurately, in areas such as healthcare, marketing, and traffic
management.
o With the help of AI, you can create a personal virtual assistant, such as Cortana,
Google Assistant, or Siri.
o With the help of AI, you can build robots which can work in environments where
human survival is at risk.
o AI opens a path for other new technologies, new devices, and new opportunities.
Artificial Intelligence is not just a part of computer science; it is a vast field to which many
other disciplines contribute. To create AI, we should first understand how intelligence is
composed: intelligence is an intangible faculty of our brain which is a combination
of reasoning, learning, problem-solving, perception, language understanding, etc.
To achieve these capabilities in a machine or software, Artificial Intelligence draws on the
following disciplines:
o Mathematics
o Biology
o Psychology
o Sociology
o Computer Science
o Neuroscience
o Statistics
Every technology has some disadvantages, and the same goes for Artificial Intelligence.
Despite being such an advantageous technology, it has some disadvantages which we need to
keep in mind while creating an AI system. Following are the disadvantages of AI:
o High cost: The hardware and software requirements of AI are very costly, as AI
systems require lots of maintenance to meet current-world requirements.
o Can't think out of the box: Even though we are making smarter machines with AI,
they still cannot work outside their design: a robot will only do the work for which it
is trained or programmed.
o No feelings and emotions: AI machines can be outstanding performers, but they have
no feelings, so they cannot form any kind of emotional attachment with humans, and
may sometimes be harmful to users if proper care is not taken.
o Increased dependency on machines: As technology advances, people become more
dependent on devices and exercise their own mental capabilities less.
o No original creativity: Humans are creative and can imagine new ideas, but AI
machines cannot match this power of human intelligence and cannot be truly creative
and imaginative.
Artificial Intelligence is not a new word and not a new technology for researchers. This
technology is much older than you would imagine; there are even myths of mechanical men
in ancient Greek and Egyptian mythology. Following are some milestones in the history of
AI, which trace the journey from the birth of AI to its present-day development.
Maturation of Artificial Intelligence (1943-1952)
o Year 1943: The first work which is now recognized as AI was done by Warren
McCulloch and Walter Pitts in 1943. They proposed a model of artificial neurons.
o Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection
strength between neurons. His rule is now called Hebbian learning.
o Year 1950: Alan Turing, an English mathematician, pioneered machine learning in
1950. Turing published "Computing Machinery and Intelligence", in which he
proposed a test to check a machine's ability to exhibit intelligent behavior equivalent
to human intelligence, now called the Turing test.
o Year 1955: Allen Newell and Herbert A. Simon created the "first artificial
intelligence program", named the "Logic Theorist". This program proved 38 of 52
mathematics theorems and found new and more elegant proofs for some of them.
o Year 1956: The term "Artificial Intelligence" was first adopted by American
computer scientist John McCarthy at the Dartmouth Conference. For the first time,
AI was coined as an academic field.
At that time, high-level computer languages such as FORTRAN, LISP, and COBOL were
invented, and enthusiasm for AI was very high.
o Year 1966: Researchers emphasized developing algorithms which could solve
mathematical problems. Joseph Weizenbaum created the first chatbot in 1966, named
ELIZA.
o Year 1972: The first intelligent humanoid robot, named WABOT-1, was built in
Japan.
o The period between 1974 and 1980 was the first AI winter. An AI winter refers to a
time period in which computer scientists dealt with a severe shortage of government
funding for AI research.
o During AI winters, public interest in artificial intelligence decreased.
A boom of AI (1980-1987)
o Year 1980: After the AI winter, AI came back with "Expert Systems". Expert
systems were programs that emulate the decision-making ability of a human expert.
o In the year 1980, the first national conference of the American Association of
Artificial Intelligence was held at Stanford University.
o The duration between the years 1987 to 1993 was the second AI winter.
o Again, investors and governments stopped funding AI research due to high cost but
inefficient results, even though expert systems such as XCON had initially been very
cost effective.
o Year 1997: In 1997, IBM's Deep Blue beat world chess champion Garry Kasparov,
becoming the first computer to beat a reigning world chess champion.
o Year 2002: For the first time, AI entered the home in the form of Roomba, a robotic
vacuum cleaner.
o Year 2006: AI entered the business world by 2006. Companies like Facebook,
Twitter, and Netflix started using AI.
o Year 2011: In 2011, IBM's Watson won Jeopardy!, a quiz show in which it had to
solve complex questions as well as riddles. Watson proved that it could understand
natural language and solve tricky questions quickly.
o Year 2012: Google launched the Android app feature "Google Now", which could
provide predictive information to the user.
o Year 2014: In 2014, the chatbot "Eugene Goostman" won a competition in the
famous "Turing test."
o Year 2018: IBM's "Project Debater" debated complex topics with two master
debaters and performed extremely well.
o Google demonstrated an AI program, "Duplex", a virtual assistant which booked a
hairdresser appointment over the phone, and the person on the other end did not
notice that she was talking with a machine.
AI has now developed to a remarkable level. The concepts of deep learning, big data, and
data science are booming. Nowadays companies like Google, Facebook, IBM, and Amazon
are working with AI and creating amazing devices. The future of Artificial Intelligence is
inspiring and will bring ever higher intelligence.
Artificial Intelligence can be divided into various types; there are mainly two kinds of
categorization, one based on the capabilities and one based on the functionality of AI.
Based on capabilities, AI is of three types:
1. Narrow AI (Weak AI):
o Narrow AI is a type of AI which is able to perform a dedicated task with intelligence;
it cannot perform beyond its field or the limitations for which it is trained. Examples
are playing chess, purchase suggestions, self-driving cars, and speech recognition.
2. General AI:
o General AI is a type of intelligence which could perform any intellectual task with
efficiency like a human.
o The idea behind the general AI to make such a system which could be smarter and think
like a human by its own.
o Currently, there is no such system exist which could come under general AI and can
perform any task as perfect as a human.
o The worldwide researchers are now focused on developing machines with General AI.
o As systems with general AI are still under research, and it will take lots of efforts and
time to develop such systems.
3. Super AI:
o Super AI is a level of machine intelligence at which machines could surpass human
intelligence and perform any task better than a human. It is still a hypothetical
concept.
Based on functionality, AI is of four types:
1. Reactive Machines
o Purely reactive machines are the most basic types of Artificial Intelligence.
o Such AI systems do not store memories or past experiences for future actions.
o These machines only focus on current scenarios and react on it as per possible best
action.
o IBM's Deep Blue system is an example of reactive machines.
o Google's AlphaGo is also an example of reactive machines.
2. Limited Memory
o Limited memory machines can store past experiences or some data for a short period
of time.
o These machines can use stored data for a limited time period only.
o Self-driving cars are one of the best examples of Limited Memory systems. These cars
can store recent speed of nearby cars, the distance of other cars, speed limit, and other
information to navigate the road.
3. Theory of Mind
o Theory of Mind AI should understand the human emotions, people, beliefs, and be able
to interact socially like humans.
o Such AI machines have not been developed yet, but researchers are making lots of
efforts and improvements to develop them.
4. Self-Awareness
o Self-awareness AI is the future of Artificial Intelligence. These machines will be
super intelligent and will have their own consciousness, sentiments, and
self-awareness. Such machines do not exist yet.
Agents in Artificial Intelligence
An AI system can be defined as the study of the rational agent and its environment. The
agents sense the environment through sensors and act on their environment through
actuators. An AI agent can have mental properties such as knowledge, belief, intention, etc.
What is an Agent?
An agent can be anything that perceives its environment through sensors and acts upon that
environment through actuators. An agent runs in a cycle of perceiving, thinking,
and acting. An agent can be:
o Human-Agent: A human agent has eyes, ears, and other organs which work for sensors
and hand, legs, vocal tract work for actuators.
o Robotic Agent: A robotic agent can have cameras, infrared range finder, NLP for
sensors and various motors for actuators.
o Software Agent: Software agent can have keystrokes, file contents as sensory input
and act on those inputs and display output on the screen.
Hence the world around us is full of agents, such as thermostats, cellphones, and cameras,
and even we are agents ourselves.
Before moving forward, we should first know about sensors, effectors, and actuators.
Sensor: Sensor is a device which detects the change in the environment and sends the
information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the component of machines that converts energy into motion. The
actuators are only responsible for moving and controlling a system. An actuator can be an
electric motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs,
wheels, arms, fingers, wings, fins, and display screen.
Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors
and actuators to achieve goals. An intelligent agent may learn from the environment to
achieve its goals. A thermostat is an example of an intelligent agent.
Rational Agent:
A rational agent is an agent which has clear preference, models uncertainty, and acts in a way
to maximize its performance measure with all possible actions.
A rational agent is said to perform the right things. AI is about creating rational agents, and
game theory and decision theory are used to analyze them in various real-world scenarios.
For an AI agent, the rational action is most important because in AI reinforcement learning
algorithm, for each best possible action, agent gets the positive reward and for each wrong
action, an agent gets a negative reward.
Note: Rational agents in AI are very similar to intelligent agents.
Rationality:
The rationality of an agent is measured by its performance measure. Rationality can be
judged on the basis of the following points:
o The performance measure which defines the success criterion.
o The agent's prior knowledge of its environment.
o The best possible actions that the agent can perform.
o The sequence of percepts observed so far.
Structure of an AI Agent
The task of AI is to design an agent program which implements the agent function. The
structure of an intelligent agent is a combination of architecture and agent program. It can be
viewed as:
Agent = Architecture + Agent program
Following are the three main terms involved in the structure of an AI agent:
o Architecture: the machinery (with sensors and actuators) on which the agent executes.
o Agent function: a map from the percept sequence to an action, f : P* → A.
o Agent program: an implementation of the agent function, which runs on the physical
architecture.
PEAS Representation
PEAS is a type of model on which an AI agent works. When we define an AI agent or
rational agent, we can group its properties under the PEAS representation model. It is made
up of four terms:
o P: Performance measure
o E: Environment
o A: Actuators
o S: Sensors
Here performance measure is the objective for the success of an agent's behavior.
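For example (a standard illustration, not taken from this document's figures), a self-driving
car can be described under PEAS as follows:
o Performance measure: safety, time, lawful driving, passenger comfort.
o Environment: roads, other vehicles, pedestrians, road signs.
o Actuators: steering, accelerator, brake, signal, horn.
o Sensors: camera, GPS, speedometer, odometer, sonar.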
Types of AI Agents
Agents can be grouped into five classes based on their degree of perceived intelligence and
capability. All of these agents can improve their performance and generate better actions
over time. These are given below:
1. Simple Reflex Agent
o The simple reflex agents are the simplest agents. These agents take decisions on the
basis of the current percept and ignore the rest of the percept history.
o These agents only succeed in a fully observable environment.
o The simple reflex agent works on the condition-action rule, which maps the current
state to an action, such as a room-cleaner agent that works only if there is dirt in the
room.
o Problems with the simple reflex agent design approach:
o They have very limited intelligence.
o They have no knowledge of non-perceptual parts of the current state.
o The condition-action rules are mostly too big to generate and store.
o They are not adaptive to changes in the environment.
2. Model-Based Reflex Agent
o The model-based agent can work in a partially observable environment and track the
situation.
o A model-based agent has two important factors:
o Model: It is knowledge about "how things happen in the world," so it is called
a Model-based agent.
o Internal State: It is a representation of the current state based on percept
history.
o These agents have the model, "which is knowledge of the world" and based on the
model they perform actions.
o Updating the agent state requires information about:
a. How the world evolves
b. How the agent's action affects the world.
3. Goal-based agents
o The knowledge of the current state of the environment is not always sufficient for an
agent to decide what to do.
o The agent needs to know its goal which describes desirable situations.
o Goal-based agents expand the capabilities of the model-based agent by having the
"goal" information.
o They choose an action, so that they can achieve the goal.
o These agents may have to consider a long sequence of possible actions before deciding
whether the goal is achieved or not. Such considerations of different scenario are called
searching and planning, which makes an agent proactive.
4. Utility-based agents
o These agents are similar to the goal-based agent but provide an extra component of
utility measurement which makes them different by providing a measure of success at
a given state.
o A utility-based agent acts based not only on goals but also on the best way to achieve the goal.
o The Utility-based agent is useful when there are multiple possible alternatives, and an
agent has to choose in order to perform the best action.
o The utility function maps each state to a real number to check how efficiently each
action achieves the goals.
5. Learning Agents
o A learning agent in AI is the type of agent which can learn from its past experiences,
or it has learning capabilities.
o It starts to act with basic knowledge and then able to act and adapt automatically
through learning.
o A learning agent has mainly four conceptual components, which are:
a. Learning element: It is responsible for making improvements by learning from
environment
b. Critic: Learning element takes feedback from critic which describes that how
well the agent is doing with respect to a fixed performance standard.
c. Performance element: It is responsible for selecting external action
d. Problem generator: This component is responsible for suggesting actions that
will lead to new and informative experiences.
o Hence, learning agents are able to learn, analyze performance, and look for new ways
to improve the performance.
Agent Environment in AI
An environment is everything in the world which surrounds the agent but is not a part of the
agent itself. An environment can be described as the situation in which an agent is present.
The environment is where the agent lives and operates, and it provides the agent with
something to sense and act upon. An environment is mostly said to be non-deterministic.
Features of Environment
1. Fully Observable vs Partially Observable:
o If an agent's sensors can sense or access the complete state of an environment at each
point of time, then it is a fully observable environment; otherwise it is partially
observable.
o A fully observable environment is easy as there is no need to maintain the internal state
to keep track history of the world.
o If an agent has no sensors in an environment, then such an environment is called
unobservable.
2. Deterministic vs Stochastic:
o If an agent's current state and selected action can completely determine the next state
of the environment, then such environment is called a deterministic environment.
o A stochastic environment is random in nature and cannot be determined completely by
an agent.
o In a deterministic, fully observable environment, agent does not need to worry about
uncertainty.
3. Episodic vs Sequential:
o In an episodic environment, there is a series of one-shot actions, and only the current
percept is required for the action.
o However, in Sequential environment, an agent requires memory of past actions to
determine the next best actions.
4. Single-agent vs Multi-agent
o If only one agent is involved in an environment, and operating by itself then such an
environment is called single agent environment.
o However, if multiple agents are operating in an environment, then such an environment
is called a multi-agent environment.
o The agent design problems in the multi-agent environment are different from single
agent environment.
5. Static vs Dynamic:
o If the environment can change itself while an agent is deliberating then such
environment is called a dynamic environment else it is called a static environment.
o Static environments are easy to deal because an agent does not need to continue looking
at the world while deciding for an action.
o However for dynamic environment, agents need to keep looking at the world at each
action.
o Taxi driving is an example of a dynamic environment whereas Crossword puzzles are
an example of a static environment.
6. Discrete vs Continuous:
o If in an environment there are a finite number of percepts and actions that can be
performed within it, then such an environment is called a discrete environment else it
is called continuous environment.
o A chess game comes under a discrete environment, as there is a finite number of
moves that can be performed.
o A self-driving car is an example of a continuous environment.
7. Known vs Unknown
o Known and unknown are not actually features of an environment but of an agent's
state of knowledge for performing an action.
o In a known environment, the results for all actions are known to the agent. While in
unknown environment, agent needs to learn how it works in order to perform an action.
o It is quite possible that a known environment to be partially observable and an
Unknown environment to be fully observable.
8. Accessible vs Inaccessible
o If an agent can obtain complete and accurate information about the state of the
environment, then such an environment is called an accessible environment;
otherwise it is called inaccessible.
o An empty room whose state can be defined by its temperature is an example of an
accessible environment.
o Information about an event on earth is an example of Inaccessible environment.
Search algorithms are one of the most important areas of Artificial Intelligence. This topic will
explain all about the search algorithms in AI.
Problem-solving agents:
A search algorithm in AI is judged on the following properties:
Completeness: A search algorithm is said to be complete if it is guaranteed to return a
solution whenever at least one solution exists.
Optimality: If a solution found by an algorithm is guaranteed to be the best solution (lowest
path cost) among all other solutions, then such a solution is said to be an optimal solution.
Time Complexity: Time complexity is a measure of the time an algorithm takes to complete
its task.
Space Complexity: It is the maximum storage space required at any point during the search,
relative to the complexity of the problem.
Based on the search problems we can classify the search algorithms into uninformed
(Blind search) search and informed search (Heuristic search) algorithms.
Uninformed/Blind Search:
The uninformed search does not use any domain knowledge, such as the closeness or
location of the goal. It operates in a brute-force way, as it only includes information about
how to traverse the tree and how to identify leaf and goal nodes. Uninformed search searches
the tree without any information about the search space, like initial state operators and tests
for the goal, so it is also called blind search. It examines each node of the tree until it reaches
the goal node.
o Breadth-first search
o Uniform cost search
o Depth-first search
o Iterative deepening depth-first search
o Bidirectional Search
Informed Search
Informed search algorithms use domain knowledge and can solve much more complex
problems which could not be solved otherwise.
1. Greedy Search
2. A* Search
The uninformed search strategies, discussed in turn below, are:
1. Breadth-first Search
2. Depth-first Search
3. Depth-limited Search
4. Iterative deepening depth-first search
5. Uniform cost search
6. Bidirectional Search
1. Breadth-first Search:
o Breadth-first search is the most common search strategy for traversing a tree or
graph. This algorithm searches breadthwise in a tree or graph, so it is called
breadth-first search.
o The BFS algorithm starts searching from the root node of the tree and expands all
successor nodes at the current level before moving to nodes of the next level.
o The breadth-first search algorithm is an example of a general-graph search algorithm.
o Breadth-first search is implemented using a FIFO queue data structure.
Advantages:
o If there is more than one solution for a given problem, then BFS will provide the
minimal solution, i.e. the one requiring the least number of steps.
Disadvantages:
o It requires lots of memory since each level of the tree must be saved into memory to
expand the next level.
o BFS needs lots of time if the solution is far away from the root node.
Example:
Consider a tree with root node S and goal node K (the figure is omitted here). The BFS
algorithm traverses in layers, so it follows the path shown by the dotted arrow, and the
traversed path will be:
1. S ---> A ---> B ---> C ---> D ---> G ---> H ---> E ---> F ---> I ---> K
Time Complexity: The time complexity of the BFS algorithm is given by the number of
nodes traversed until the shallowest goal node: T(b) = 1 + b^2 + b^3 + ... + b^d = O(b^d),
where d is the depth of the shallowest solution and b is the branching factor.
Space Complexity: The space complexity of BFS is given by the memory size of the
frontier, which is O(b^d).
Completeness: BFS is complete, which means that if the shallowest goal node is at some
finite depth, then BFS will find a solution.
Optimality: BFS is optimal if the path cost is a non-decreasing function of the depth of the
node.
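As a concrete illustration, here is a minimal Python sketch of BFS with a FIFO frontier. The
adjacency list is an assumed stand-in for the omitted figure, not the exact tree from the notes:

    from collections import deque

    def bfs(graph, start, goal):
        # Frontier is a FIFO queue of paths; shallowest paths leave first.
        frontier = deque([[start]])
        explored = {start}
        while frontier:
            path = frontier.popleft()
            node = path[-1]
            if node == goal:
                return path
            for neighbor in graph.get(node, []):
                if neighbor not in explored:
                    explored.add(neighbor)
                    frontier.append(path + [neighbor])
        return None  # no solution found

    # Illustrative graph (assumed example):
    graph = {'S': ['A', 'B'], 'A': ['C', 'D'], 'B': ['G', 'H'],
             'C': ['E'], 'H': ['I'], 'E': ['F'], 'I': ['K']}
    print(bfs(graph, 'S', 'K'))  # ['S', 'B', 'H', 'I', 'K']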
2. Depth-first Search
o Depth-first search is a recursive algorithm for traversing a tree or graph data
structure.
o It is called depth-first search because it starts from the root node and follows each
path to its greatest depth before moving to the next path.
o DFS uses a stack data structure for its implementation.
o The process of the DFS algorithm is similar to that of the BFS algorithm.
Note: Backtracking is an algorithm technique for finding all possible solutions using recursion.
Advantage:
o DFS requires much less memory, as it only needs to store the stack of nodes on the
path from the root node to the current node.
o It takes less time to reach the goal node than the BFS algorithm (if it traverses along
the right path).
Disadvantage:
o There is a possibility that many states keep re-occurring, and there is no guarantee of
finding a solution.
o The DFS algorithm searches deep down and may sometimes enter an infinite loop.
Example:
Consider a search tree with root node S and goal node G (the figure is omitted here). DFS
starts searching from root node S and traverses A, then B, then D and E; after traversing E it
backtracks, as E has no other successor and the goal node has not yet been found. After
backtracking it traverses node C and then G, where it terminates, as it has found the goal
node.
Completeness: DFS search algorithm is complete within finite state space as it will expand
every node within a limited search tree.
Time Complexity: The time complexity of DFS is equivalent to the number of nodes
traversed by the algorithm: T(b) = 1 + b + b^2 + ... + b^m = O(b^m), where m is the
maximum depth of any node, which can be much larger than d (the depth of the shallowest
solution).
Space Complexity: The DFS algorithm needs to store only a single path from the root node,
hence the space complexity of DFS is equivalent to the size of the fringe set, which is O(bm).
Optimality: The DFS search algorithm is non-optimal, as it may generate a large number of
steps or a high cost to reach the goal node.
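A minimal Python sketch of iterative DFS with an explicit stack (an assumed illustration, not
the tree from the notes):

    def dfs(graph, start, goal):
        # Frontier is a LIFO stack of paths; deepest paths leave first.
        stack = [[start]]
        visited = set()
        while stack:
            path = stack.pop()
            node = path[-1]
            if node == goal:
                return path
            if node in visited:
                continue
            visited.add(node)
            # Push children in reverse so the leftmost child is expanded first.
            for neighbor in reversed(graph.get(node, [])):
                stack.append(path + [neighbor])
        return None  # no solution found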
3. Depth-limited Search
A depth-limited search algorithm is similar to depth-first search with a predetermined limit
ℓ; nodes at the depth limit are treated as if they have no successors. Depth-limited search can
terminate with two conditions of failure:
o Standard failure value: it indicates that the problem does not have any solution.
o Cutoff failure value: it indicates that there is no solution for the problem within the
given depth limit.
Advantages:
o Depth-limited search is memory efficient.
Disadvantages:
o It has the disadvantage of incompleteness, and it may not be optimal if the problem
has more than one solution.
Completeness: The DLS search algorithm is complete if the solution lies within the depth
limit.
Optimality: Depth-limited search can be viewed as a special case of DFS, and it is also not
optimal, even if ℓ > d.
4. Uniform-cost Search
Uniform-cost search is a searching algorithm used for traversing a weighted tree or graph.
This algorithm comes into play when a different cost is available for each edge. The primary
goal of uniform-cost search is to find a path to the goal node which has the lowest cumulative
cost. Uniform-cost search expands nodes according to their path costs from the root node. It
can be used to solve any graph/tree where the optimal cost is in demand. A uniform-cost
search algorithm is implemented using a priority queue, which gives maximum priority to
the lowest cumulative cost. Uniform-cost search is equivalent to the BFS algorithm if the
path cost of all edges is the same.
Advantages:
o Uniform cost search is optimal because at every state the path with the least cost is
chosen.
Disadvantages:
o It does not care about the number of steps involved in searching; it is only concerned
with path cost, due to which this algorithm may get stuck in an infinite loop.
Completeness:
Uniform-cost search is complete, such as if there is a solution, UCS will find it.
Time Complexity:
Let C* be the cost of the optimal solution and ε the minimum cost of each step toward the
goal. Then the number of steps is C*/ε + 1 (we add +1 because we start from state 0 and end
at C*/ε). Hence the worst-case time complexity of uniform-cost search is O(b^(1 + ⌊C*/ε⌋)).
Space Complexity:
The same logic applies to space, so the worst-case space complexity of uniform-cost search
is also O(b^(1 + ⌊C*/ε⌋)).
Optimal:
Uniform-cost search is always optimal as it only selects a path with the lowest path cost.
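A minimal Python sketch of uniform-cost search using a priority queue (the heapq module);
the weighted adjacency list used in a call would be an assumed example:

    import heapq

    def uniform_cost_search(graph, start, goal):
        # graph maps node -> list of (neighbor, edge_cost) pairs.
        frontier = [(0, start, [start])]   # priority queue ordered by path cost
        best_cost = {start: 0}
        while frontier:
            cost, node, path = heapq.heappop(frontier)
            if node == goal:
                return cost, path
            if cost > best_cost.get(node, float('inf')):
                continue                   # stale queue entry, skip it
            for neighbor, step in graph.get(node, []):
                new_cost = cost + step
                if new_cost < best_cost.get(neighbor, float('inf')):
                    best_cost[neighbor] = new_cost
                    heapq.heappush(frontier,
                                   (new_cost, neighbor, path + [neighbor]))
        return None  # no solution found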
5. Iterative Deepening Depth-first Search
The iterative deepening algorithm is a combination of the DFS and BFS algorithms. This
search algorithm finds the best depth limit by gradually increasing the limit until a goal is
found.
This algorithm performs depth-first search up to a certain "depth limit" and keeps increasing
the depth limit after each iteration until the goal node is found.
This search algorithm combines the benefits of breadth-first search's fast search and
depth-first search's memory efficiency.
The iterative deepening search algorithm is a useful uninformed search when the search
space is large and the depth of the goal node is unknown.
Advantages:
o It combines the benefits of the BFS and DFS search algorithms in terms of fast
search and memory efficiency.
Disadvantages:
o The main drawback of IDDFS is that it repeats all the work of the previous phase.
Example:
The following tree structure (figure omitted) illustrates iterative deepening depth-first
search. The IDDFS algorithm performs successive iterations until it finds the goal node. The
iterations performed by the algorithm are given as:
1st Iteration -----> A
2nd Iteration ----> A, B, C
3rd Iteration ----> A, B, D, E, C, F, G
4th Iteration ----> A, B, D, H, I, E, C, F, K, G
In the fourth iteration, the algorithm will find the goal node.
Completeness:
This algorithm is complete if the branching factor is finite.
Time Complexity:
Suppose b is the branching factor and d the depth of the shallowest solution; then the
worst-case time complexity is O(b^d).
Space Complexity:
The space complexity of IDDFS is O(bd).
Optimality:
The IDDFS algorithm is optimal if the path cost is a non-decreasing function of the depth of
the node.
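A minimal Python sketch of IDDFS: a recursive depth-limited search wrapped in a loop of
increasing limits (the max_depth cap is an assumed safeguard, not part of the notes):

    def depth_limited_search(graph, node, goal, limit, path):
        # Recursive DFS that stops when the depth limit is exhausted.
        if node == goal:
            return path
        if limit == 0:
            return None                  # cutoff reached
        for neighbor in graph.get(node, []):
            if neighbor not in path:     # avoid cycles on the current path
                result = depth_limited_search(graph, neighbor, goal,
                                              limit - 1, path + [neighbor])
                if result is not None:
                    return result
        return None

    def iterative_deepening(graph, start, goal, max_depth=20):
        # Run DLS with limits 0, 1, 2, ... until the goal is found.
        for limit in range(max_depth + 1):
            result = depth_limited_search(graph, start, goal, limit, [start])
            if result is not None:
                return result
        return None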
6. Bidirectional Search
The bidirectional search algorithm runs two simultaneous searches: one from the initial
state, called forward search, and the other from the goal node, called backward search.
Bidirectional search replaces one single search graph with two small subgraphs, one starting
the search from the initial vertex and the other from the goal vertex. The search stops when
these two graphs intersect each other.
Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
Advantages:
o Bidirectional search is fast.
o Bidirectional search requires less memory.
Disadvantages:
o Implementation of the bidirectional search tree is difficult.
o In bidirectional search, one should know the goal state in advance.
Example:
In the example search tree (figure omitted), the bidirectional search algorithm divides one
graph/tree into two sub-graphs: it starts traversing from node 1 in the forward direction and
from goal node 16 in the backward direction.
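A minimal Python sketch of bidirectional BFS on an undirected graph: the two frontiers are
expanded one layer at a time, alternating sides, until they intersect (the graph passed in
would be an assumed example):

    from collections import deque

    def bidirectional_search(graph, start, goal):
        # Returns True if start and goal are connected, else False.
        if start == goal:
            return True
        seen_a, seen_b = {start}, {goal}
        frontier_a, frontier_b = deque([start]), deque([goal])
        while frontier_a and frontier_b:
            # Expand one full layer of the "a" side.
            for _ in range(len(frontier_a)):
                node = frontier_a.popleft()
                for nbr in graph.get(node, []):
                    if nbr in seen_b:
                        return True      # the two searches intersect
                    if nbr not in seen_a:
                        seen_a.add(nbr)
                        frontier_a.append(nbr)
            # Swap roles so the other side expands next.
            frontier_a, frontier_b = frontier_b, frontier_a
            seen_a, seen_b = seen_b, seen_a
        return False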
Informed Search Algorithms
The informed search algorithm is more useful for a large search space. Informed search
algorithms use the idea of a heuristic, so they are also called heuristic search.
Heuristic function: A heuristic is a function used in informed search to find the most
promising path. It takes the current state of the agent as its input and produces an estimate of
how close the agent is to the goal. The heuristic method might not always give the best
solution, but it is guaranteed to find a good solution in reasonable time.
A heuristic function estimates how close a state is to the goal. It is represented by h(n), and it
estimates the cost of an optimal path between the pair of states. The value of the heuristic
function is always positive.
Admissibility of the heuristic function is given as: h(n) ≤ h*(n). Here h(n) is the heuristic
cost and h*(n) is the actual (optimal) cost, so the heuristic cost should be less than or equal
to the actual cost. For example, in route finding, the straight-line distance to the destination
is admissible because it never overestimates the true road distance.
1. Greedy Best-first Search
The greedy best-first search algorithm always selects the path which appears best at that
moment. It combines aspects of depth-first search and breadth-first search, using the
heuristic function to guide the search. With the help of best-first search, at each step we can
choose the most promising node. In the greedy best-first search algorithm, we expand the
node which appears closest to the goal node, where the closeness is estimated by the
heuristic function, i.e.
1. f(n) = h(n).
Advantages:
o Best-first search can switch between BFS and DFS, gaining the advantages of both
algorithms.
o This algorithm is more efficient than the BFS and DFS algorithms.
Disadvantages:
o It can behave like an unguided depth-first search in the worst case.
o It can get stuck in a loop, like DFS.
o This algorithm is not optimal.
Example:
Consider a search problem traversed using greedy best-first search. At each iteration, each
node is expanded using the evaluation function f(n) = h(n), whose values would be given in
a table (the figure and table are omitted here). In this search example, we use two lists, the
OPEN and CLOSED lists.
Time Complexity: The worst-case time complexity of greedy best-first search is O(b^m).
Space Complexity: The worst-case space complexity of greedy best-first search is O(b^m),
where m is the maximum depth of the search space.
Complete: Greedy best-first search is incomplete in general, even if the given state space is
finite.
Optimal: Greedy best-first search is not optimal.
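A minimal Python sketch of greedy best-first search, ordering the OPEN list purely by h(n)
(graph and h passed in would be assumed examples):

    import heapq

    def greedy_best_first(graph, h, start, goal):
        # graph: node -> list of neighbors; h: dict of heuristic estimates.
        open_list = [(h[start], start, [start])]   # ordered by h(n) only
        closed = set()
        while open_list:
            _, node, path = heapq.heappop(open_list)
            if node == goal:
                return path
            if node in closed:
                continue
            closed.add(node)
            for neighbor in graph.get(node, []):
                if neighbor not in closed:
                    heapq.heappush(open_list,
                                   (h[neighbor], neighbor, path + [neighbor]))
        return None  # no solution found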
2. A* Search
A* search is the most commonly known form of best-first search. It uses the heuristic
function h(n) and the cost g(n) to reach node n from the start state. It combines features of
UCS and greedy best-first search, by which it solves problems efficiently. The A* search
algorithm finds the shortest path through the search space using the heuristic function; it
expands a smaller search tree and provides optimal results faster. The A* algorithm is
similar to UCS, except that it uses g(n) + h(n) instead of g(n).
In the A* search algorithm, we use the search heuristic as well as the cost to reach the node.
Hence we can combine both costs as follows; this sum is called the fitness number:
f(n) = g(n) + h(n)
At each point in the search space, only the node with the lowest value of f(n) is expanded,
and the algorithm terminates when the goal node is found.
Algorithm of A* search:
Step 1: Place the starting node in the OPEN list.
Step 2: Check if the OPEN list is empty or not; if the list is empty, then return failure and
stop.
Step 3: Select the node from the OPEN list which has the smallest value of the evaluation
function (g + h); if node n is the goal node, then return success and stop, otherwise:
Step 4: Expand node n, generate all of its successors, and put n into the CLOSED list. For
each successor n', check whether n' is already in the OPEN or CLOSED list; if not, compute
its evaluation function and place it into the OPEN list.
Step 5: Else, if node n' is already in OPEN or CLOSED, attach it to the back pointer which
reflects the lowest g(n') value.
Step 6: Return to Step 2.
Advantages:
o The A* search algorithm is better than other search algorithms.
o The A* search algorithm is optimal and complete.
o This algorithm can solve very complex problems.
Disadvantages:
o It does not always produce the shortest path, as it is mostly based on heuristics and
approximation.
o A* search algorithm has some complexity issues.
o The main drawback of A* is memory requirement as it keeps all generated nodes in the
memory, so it is not practical for various large-scale problems.
Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value
of all states is given in the below table so we will calculate the f(n) of each state using the
formula f(n)= g(n) + h(n), where g(n) is the cost to reach any node from start state.
Here we will use OPEN and CLOSED list.
Solution:
Iteration 3: {(S --> A --> C --> G, 6), (S --> A --> C --> D, 11), (S --> A --> B, 7), (S --> G, 10)}
Iteration 4 gives the final result: S ---> A ---> C ---> G, which provides the optimal path
with cost 6.
Points to remember:
o A* algorithm returns the path which occurred first, and it does not search for all
remaining paths.
o The efficiency of A* algorithm depends on the quality of heuristic.
o The A* algorithm expands all nodes which satisfy the condition f(n) < C*, where C* is the cost of the optimal solution.
o Admissible: The first condition required for optimality is that h(n) should be an
admissible heuristic for A* tree search. An admissible heuristic is optimistic in
nature, i.e. it never overestimates the true cost.
o Consistency: The second required condition is consistency, needed only for A*
graph search.
If the heuristic function is admissible, then A* tree search will always find the least cost path.
Time Complexity: The time complexity of the A* search algorithm depends on the heuristic
function; in the worst case the number of nodes expanded is exponential in the depth of the
solution d, so the time complexity is O(b^d), where b is the branching factor.
Space Complexity: The space complexity of the A* search algorithm is O(b^d), since it
keeps all generated nodes in memory.
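A minimal Python sketch of A*, which is the UCS sketch above with the priority changed
from g(n) to f(n) = g(n) + h(n) (graph and h passed in would be assumed examples):

    import heapq

    def a_star(graph, h, start, goal):
        # graph: node -> list of (neighbor, edge_cost); h: heuristic estimates.
        open_list = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
        best_g = {start: 0}
        while open_list:
            f, g, node, path = heapq.heappop(open_list)
            if node == goal:
                return g, path
            if g > best_g.get(node, float('inf')):
                continue                              # stale entry, skip it
            for neighbor, cost in graph.get(node, []):
                g2 = g + cost
                if g2 < best_g.get(neighbor, float('inf')):
                    best_g[neighbor] = g2
                    heapq.heappush(open_list,
                                   (g2 + h[neighbor], g2, neighbor,
                                    path + [neighbor]))
        return None  # no solution found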
AO* Algorithm
The AO* algorithm performs best-first search on AND-OR graphs. The AO* method
divides any given difficult problem into a smaller group of problems that are then resolved
using the AND-OR graph concept. AND-OR graphs are specialized graphs used for
problems that can be divided into smaller subproblems. The AND side of the graph
represents a set of tasks that must all be completed to achieve the main goal, while the OR
side of the graph represents alternative methods for accomplishing the same main goal.
AND-OR Graph
In the figure (omitted here), buying a car is broken down into smaller problems or tasks that
can be accomplished to achieve the main goal; this is an example of a simple AND-OR
graph. One option is to steal a car, which on its own accomplishes the main goal; the other is
to use your own money to purchase a car. The AND symbol indicates the AND part of the
graph, meaning that all subproblems connected by the AND must be resolved before the
parent node or problem can be finished.
The start state and the target state are already known in the knowledge-
based search strategy known as the AO* algorithm, and the best path is identified by heuristics.
The informed search technique considerably reduces the algorithm’s time complexity. The
AO* algorithm is far more effective in searching AND-OR trees than the A* algorithm.
The evaluation function used by AO* is f(n) = g(n) + h(n), where
g(n) = the actual cost from the initial node to the current node, and
h(n) = the estimated cost from the current node to the goal state.
In the worked example (the step-by-step figures are omitted), the heuristic values are
computed level by level. f(C ⇢ H+I) is selected as the path with the lowest cost, and its
heuristic is left unchanged because it matches the actual cost. Paths H and I are solved
because their heuristics are 0, but path A ⇢ D still needs to be calculated because it contains
an AND. Finally, path f(A ⇢ C+D) gets solved and the tree becomes a solved tree.
In simple words, the main flow of this algorithm is to first find the level-1 heuristic values,
then level 2, and after that update the values while moving upward toward the root node.
Hill Climbing Algorithm
o The hill climbing algorithm is a local search algorithm which continuously moves in
the direction of increasing elevation/value to find the peak of the mountain or the best
solution to the problem. It terminates when it reaches a peak value where no neighbor
has a higher value.
o Hill climbing algorithm is a technique which is used for optimizing the mathematical
problems. One of the widely discussed examples of Hill climbing algorithm is
Traveling-salesman Problem in which we need to minimize the distance traveled by the
salesman.
o It is also called greedy local search as it only looks to its good immediate neighbor state
and not beyond that.
o A node of hill climbing algorithm has two components which are state and value.
o Hill Climbing is mostly used when a good heuristic is available.
o In this algorithm, we don't need to maintain and handle the search tree or graph as it
only keeps a single current state.
o Generate and Test variant: Hill climbing is a variant of the generate-and-test method.
The generate-and-test method produces feedback which helps to decide which
direction to move in the search space.
o Greedy approach: Hill-climbing algorithm search moves in the direction which
optimizes the cost.
o No backtracking: It does not backtrack the search space, as it does not remember the
previous states.
State-space diagram (figure omitted): on the Y-axis we take the function, which can be an
objective function or a cost function, and on the X-axis the state space. If the function on the
Y-axis is cost, then the goal of the search is to find the global minimum (or a local
minimum); if the function on the Y-axis is an objective function, then the goal of the search
is to find the global maximum (or a local maximum).
Different regions in the state space landscape:
Local Maximum: Local maximum is a state which is better than its neighbor states, but there
is also another state which is higher than it.
Global Maximum: Global maximum is the best possible state of state space landscape. It has
the highest value of objective function.
Flat local maximum: It is a flat space in the landscape where all the neighbor states of current
states have the same value.
1. Simple Hill Climbing:
Simple hill climbing is the simplest way to implement a hill climbing algorithm. It evaluates
only one neighbor node state at a time and selects the first one which improves on the
current cost, setting it as the current state. It checks only one successor state, and if that
successor is better than the current state, it moves; otherwise it stays in the same state. It
takes less time but gives a less optimal solution, and the solution is not guaranteed.
Algorithm for Simple Hill Climbing:
o Step 1: Evaluate the initial state, if it is goal state then return success and Stop.
o Step 2: Loop Until a solution is found or there is no new operator left to apply.
o Step 3: Select and apply an operator to the current state.
o Step 4: Check new state:
a. If it is goal state, then return success and quit.
b. Else if it is better than the current state then assign new state as a current state.
c. Else, if it is not better than the current state, then return to Step 2.
o Step 5: Exit.
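The steps above can be sketched in Python as follows. The toy objective, neighbor generator,
and step-limit safeguard are all assumptions for illustration:

    def simple_hill_climbing(initial, neighbors, value, max_steps=1000):
        # Move to the FIRST neighbor that improves the objective value;
        # stop when no generated neighbor is better (a local maximum).
        current = initial
        for _ in range(max_steps):
            improved = False
            for candidate in neighbors(current):
                if value(candidate) > value(current):
                    current = candidate
                    improved = True
                    break
            if not improved:
                return current           # no better neighbor: local maximum
        return current

    # Toy objective with its peak at x = 3 (purely illustrative):
    value = lambda x: -(x - 3) ** 2
    neighbors = lambda x: [x - 0.1, x + 0.1]
    print(round(simple_hill_climbing(0.0, neighbors, value), 1))  # ~3.0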
2. Steepest-Ascent Hill Climbing:
The steepest-ascent algorithm is a variation of the simple hill climbing algorithm. It
examines all the neighboring nodes of the current state and selects the neighbor node which
is closest to the goal state. This algorithm consumes more time, as it searches multiple
neighbors.
Algorithm for Steepest-Ascent Hill Climbing:
o Step 1: Evaluate the initial state, if it is goal state then return success and stop, else
make current state as initial state.
o Step 2: Loop until a solution is found or the current state does not change.
a. Let SUCC be a state such that any successor of the current state will be better
than it.
b. For each operator that applies to the current state:
a. Apply the new operator and generate a new state.
b. Evaluate the new state.
c. If it is goal state, then return it and quit, else compare it to the SUCC.
d. If it is better than SUCC, then set new state as SUCC.
e. If the SUCC is better than the current state, then set current state to
SUCC.
o Step 3: Exit.
3. Stochastic Hill Climbing:
Stochastic hill climbing does not examine all of its neighbors before moving. Instead, this
search algorithm selects one neighbor node at random and decides whether to move to it as
the current state or to examine another state.
Problems in the Hill Climbing Algorithm:
1. Local Maximum: A local maximum is a peak state in the landscape which is better than
each of its neighboring states, but there is another state present which is higher than the
local maximum.
Solution: Backtracking can be a solution to the local maximum problem: maintain a list of
promising paths so that the algorithm can backtrack through the search space and explore
other paths as well.
2. Plateau: A plateau is a flat area of the search space in which all the neighbor states of the
current state contain the same value; because of this, the algorithm cannot find any best
direction to move. A hill-climbing search might get lost in the plateau area.
Solution: The solution to the plateau is to take bigger (or very small) steps while searching.
Randomly select a state which is far away from the current state, so that the algorithm may
find a non-plateau region.
3. Ridges: A ridge is a special form of local maximum: an area which is higher than its
surrounding areas but itself has a slope, and which cannot be reached in a single move.
Solution: With the use of bidirectional search, or by moving in several different directions,
we can mitigate this problem.
Simulated Annealing:
A hill-climbing algorithm which never makes a move towards a lower value is guaranteed to
be incomplete, because it can get stuck at a local maximum. If the algorithm instead applies
a random walk by moving to random successors, it may be complete but not efficient.
Simulated annealing is an algorithm which yields both efficiency and completeness: it
occasionally accepts moves to worse states, with a probability that decreases over time
according to a "temperature" schedule, which lets it escape local maxima.
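A minimal Python sketch of simulated annealing for a maximization problem; the initial
temperature, cooling factor, and stopping threshold are assumed illustrative defaults:

    import math
    import random

    def simulated_annealing(initial, random_neighbor, value,
                            t0=1.0, cooling=0.995, t_min=1e-4):
        # Accept a worse move with probability exp(delta / T), so the
        # search can escape local maxima; T shrinks as the schedule cools.
        current = initial
        t = t0
        while t > t_min:
            candidate = random_neighbor(current)
            delta = value(candidate) - value(current)
            if delta > 0 or random.random() < math.exp(delta / t):
                current = candidate      # accept improving or, sometimes, worse moves
            t *= cooling                 # cool down
        return current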
o We have studied strategies which can reason either forward or backward, but a
mixture of the two directions is appropriate for solving complex and large problems.
Such a mixed strategy makes it possible to first solve the major part of the problem
and then go back and solve the smaller problems that arise while combining the big
parts of the problem. Such a technique is called Means-Ends Analysis.
o Means-Ends Analysis is a problem-solving technique used in Artificial Intelligence
to limit search in AI programs.
o It is a mixture of Backward and forward search technique.
o The MEA technique was first introduced in 1961 by Allen Newell, and Herbert A.
Simon in their problem-solving computer program, which was named as General
Problem Solver (GPS).
o The MEA process is centered on the evaluation of the difference between the current
state and the goal state.
The means-ends analysis process can be applied recursively to a problem. It is a strategy for
controlling search in problem-solving. The following are the main steps which describe the
working of the MEA technique for solving a problem.
a. First, evaluate the difference between Initial State and final State.
b. Select the various operators which can be applied for each difference.
c. Apply the operator at each difference, which reduces the difference between the current
state and goal state.
Operator Subgoaling
In the MEA process, we detect the differences between the current state and the goal state.
Once these differences are found, we can apply an operator to reduce them. But sometimes
an operator cannot be applied to the current state, so we create a subproblem of the current
state in which the operator can be applied. This type of backward chaining, in which
operators are selected and then subgoals are set up to establish the preconditions of the
operator, is called Operator Subgoaling.
Algorithm for Means-Ends Analysis:
Let us take the current state as CURRENT and the goal state as GOAL; then the following
are the steps for the MEA algorithm.
o Step 1: Compare CURRENT to GOAL, if there are no differences between both then
return Success and Exit.
o Step 2: Else, select the most significant difference and reduce it by doing the following
steps until the success or failure occurs.
a. Select a new operator O which is applicable for the current difference, and if
there is no such operator, then signal failure.
b. Attempt to apply operator O to CURRENT, making a description of two states:
i) O-Start, a state in which O's preconditions are satisfied.
ii) O-Result, the state that would result if O were applied in O-Start.
c. If
(FIRST-PART <-- MEA(CURRENT, O-START))
and
(LAST-PART <-- MEA(O-RESULT, GOAL))
are successful, then signal success and return the result of combining FIRST-PART, O, and LAST-PART.
The above-discussed algorithm is more suitable for a simple problem and not adequate for
solving complex problems.
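A simplified Python sketch of the MEA loop. It is a simplification of the algorithm above: it
applies each operator directly rather than recursively establishing preconditions, and the
interfaces differences(), op.reduces(), and apply_op() are hypothetical, not from the notes:

    def mea(current, goal, operators, differences, apply_op, depth=10):
        # Recursive means-ends analysis sketch over hypothetical interfaces.
        if depth == 0:
            return None                       # give up: signal failure
        diffs = differences(current, goal)
        if not diffs:
            return []                         # no difference left: success
        most_significant = diffs[0]
        for op in operators:
            if op.reduces(most_significant):  # operator applicable to the difference
                result = apply_op(current, op)
                rest = mea(result, goal, operators, differences, apply_op,
                           depth - 1)
                if rest is not None:
                    return [op] + rest        # combine operator with the remainder
        return None                           # no applicable operator: failure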
Let's take an example where we know the initial state and the goal state (shown in a figure
omitted here). In this problem, we need to reach the goal state by finding the differences
between the initial state and the goal state and applying operators.
Solution:
To solve the above problem, we will first find the differences between initial states and goal
states, and for each difference, we will generate a new state and will apply the operators. The
operators we have for this problem are:
o Move
o Delete
o Expand
1. Evaluating the initial state: In the first step, we evaluate the initial state and compare the
initial and goal states to find the differences between them.
2. Applying the Delete operator: The first difference is that the goal state lacks the dot
symbol which is present in the initial state, so we first apply the Delete operator to remove
this dot.
3. Applying the Move operator: After applying the Delete operator, a new state results,
which we again compare with the goal state. The remaining difference is that the square is
outside the circle, so we apply the Move operator.
4. Applying the Expand operator: The third step generates another new state, which we
compare with the goal state. The one remaining difference is the size of the square, so we
apply the Expand operator, which finally generates the goal state.
Adversarial Search
Adversarial search is a search in which we examine the problem that arises when we try to
plan ahead in a world where other agents are planning against us.
o In previous topics, we have studied search strategies which are only associated with
a single agent that aims to find a solution, often expressed in the form of a sequence
of actions.
o But, there might be some situations where more than one agent is searching for the
solution in the same search space, and this situation usually occurs in game playing.
o An environment with more than one agent is termed a multi-agent environment, in
which each agent is an opponent of the others and they play against each other. Each
agent needs to consider the actions of the other agents and the effect of those actions
on its own performance.
o So, Searches in which two or more players with conflicting goals are trying to
explore the same search space for the solution, are called adversarial searches,
often known as Games.
o Games are modeled as a search problem together with a heuristic evaluation
function; these are the two main factors which help to model and solve games in AI.
o Perfect information: A game with the perfect information is that in which agents can
look into the complete board. Agents have all the information about the game, and they
can see each other moves also. Examples are Chess, Checkers, Go, etc.
o Imperfect information: If in a game the agents do not have all the information about
the game and are not aware of everything that is going on, such games are called
games with imperfect information; examples are Battleship, Bridge, and poker.
o Deterministic games: Deterministic games are those games which follow a strict
pattern and set of rules for the games, and there is no randomness associated with them.
Examples are chess, Checkers, Go, tic-tac-toe, etc.
o Non-deterministic games: Non-deterministic games are those which have various
unpredictable events and a factor of chance or luck, introduced by dice or cards.
These are random, and each action's outcome is not fixed. Such games are also
called stochastic games.
Example: Backgammon, Monopoly, Poker, etc.
Zero-Sum Game
A zero-sum game is an adversarial game in which one player's gain is exactly balanced by
the other player's loss; chess and tic-tac-toe are examples.
Formalization of the problem:
A game can be defined as a type of search in AI which can be formalized with the following
elements:
o Initial state: specifies how the game is set up at the start.
o Player(s): specifies which player has the move in a state.
o Action(s): returns the set of legal moves in a state.
o Result(s, a): the transition model, which specifies the result of a move.
o Terminal-Test(s): true if the game is over, else false; states where the game ends are
called terminal states.
o Utility(s, p): the final numeric value for a game that ends in terminal state s for
player p, e.g. in chess a win is +1, a loss is -1, and a draw is 1/2.
Game tree:
A game tree is a tree where nodes of the tree are the game states and Edges of the tree are the
moves by players. Game tree involves initial state, actions function, and result Function.
The figure (omitted here) shows part of the game tree for the tic-tac-toe game. Following are
some key points of the game:
o From the initial state, MAX has 9 possible moves, as he starts first. MAX places x
and MIN places o, and both players play alternately until we reach a leaf node where
one player has three in a row or all squares are filled.
o Both players compute the minimax value of each node: the best achievable utility
against an optimal adversary.
o Suppose both players know tic-tac-toe well and play their best game. Each player
does his best to prevent the other from winning. MIN acts against MAX in the game.
o So in the game tree, we have layers of MAX and MIN, and each layer is called a ply.
MAX places x, then MIN places o to prevent MAX from winning, and this game
continues until a terminal node.
o In this either MIN wins, MAX wins, or it's a draw. This game-tree is the whole search
space of possibilities that MIN and MAX are playing tic-tac-toe and taking turns
alternately.
Mini-Max Algorithm
o The mini-max algorithm is a recursive or backtracking algorithm used in
decision-making and game theory, which provides an optimal move assuming the
opponent also plays optimally.
o It aims to find the optimal strategy for MAX to win the game.
o It follows the approach of depth-first search.
o In the game tree, the optimal leaf node could appear at any depth of the tree.
o Minimax values are propagated up the tree once the terminal nodes are discovered.
In a given game tree, the optimal strategy can be determined from the minimax value of
each node, written MINIMAX(n). MAX prefers to move to a state of maximum value and
MIN to a state of minimum value, so:

MINIMAX(n) =
    UTILITY(n),                               if n is a terminal state
    max over successors s of MINIMAX(s),      if n is a MAX node
    min over successors s of MINIMAX(s),      if n is a MIN node

Initial call:
Minimax(node, 3, true)
o The working of the minimax algorithm can be easily described using an example.
Below we have taken an example of game-tree which is representing the two-player
game.
o In this example, there are two players: one is called the Maximizer and the other the
Minimizer.
o Maximizer will try to get the Maximum possible score, and Minimizer will try to get
the minimum possible score.
o This algorithm applies DFS, so in this game-tree, we have to go all the way through the
leaves to reach the terminal nodes.
o At the terminal nodes, the terminal values are given, so we compare those values and
backtrack up the tree until we reach the initial state.
o Complete: The minimax algorithm is complete. It will definitely find a solution (if
one exists) in a finite search tree.
o Optimal: The minimax algorithm is optimal if both opponents play optimally.
o Time complexity: As it performs DFS on the game tree, the time complexity of the
minimax algorithm is O(b^m), where b is the branching factor of the game tree and
m is the maximum depth of the tree.
o Space complexity: The space complexity of the minimax algorithm is similar to
DFS, which is O(bm).
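A minimal Python sketch of minimax; the nested-list game tree, children(), and evaluate()
are assumed illustrations, not the tree from the notes:

    def minimax(node, depth, maximizing, children, evaluate):
        # children(node) -> list of successors; evaluate(node) -> utility.
        succ = children(node)
        if depth == 0 or not succ:
            return evaluate(node)        # terminal node or depth limit reached
        values = [minimax(c, depth - 1, not maximizing, children, evaluate)
                  for c in succ]
        return max(values) if maximizing else min(values)

    # Toy game tree as nested lists; leaves are utilities (assumed example):
    tree = [[3, 5], [2, 9]]
    children = lambda n: n if isinstance(n, list) else []
    evaluate = lambda n: n
    print(minimax(tree, 3, True, children, evaluate))  # max(min(3,5), min(2,9)) = 3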
The main drawback of the minimax algorithm is that it gets really slow for complex games
such as chess and Go. These games have a huge branching factor, and the player has many
choices to decide among. This limitation of the minimax algorithm can be mitigated by
alpha-beta pruning.
Alpha-Beta Pruning
Alpha-beta pruning is an optimized version of the minimax algorithm: it returns the same
move as minimax but prunes branches that cannot influence the final decision. Here alpha is
the best (highest) value the maximizer can guarantee so far, and beta is the best (lowest)
value the minimizer can guarantee so far. The main condition required for alpha-beta
pruning is: α >= β.
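A minimal Python sketch of minimax with alpha-beta pruning, reusing the same assumed
toy tree as above; the prune fires exactly when α >= β:

    import math

    def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
        # alpha: best value MAX can guarantee so far on this path;
        # beta:  best value MIN can guarantee so far on this path.
        succ = children(node)
        if depth == 0 or not succ:
            return evaluate(node)
        if maximizing:
            best = -math.inf
            for child in succ:
                best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                           False, children, evaluate))
                alpha = max(alpha, best)
                if alpha >= beta:
                    break                # prune: MIN will never allow this branch
            return best
        best = math.inf
        for child in succ:
            best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                       True, children, evaluate))
            beta = min(beta, best)
            if alpha >= beta:
                break                    # prune: MAX will never allow this branch
        return best

    # Same toy tree as before; the root call uses alpha = -inf, beta = +inf:
    tree = [[3, 5], [2, 9]]
    children = lambda n: n if isinstance(n, list) else []
    evaluate = lambda n: n
    print(alphabeta(tree, 3, -math.inf, math.inf, True, children, evaluate))  # 3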
o An intelligent agent needs knowledge about the real world for taking decisions
and reasoning to act efficiently.
o Knowledge-based agents are those agents who have the capability of maintaining an
internal state of knowledge, reason over that knowledge, update their knowledge
after observations and take actions. These agents can represent the world with
some formal representation and act intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.
The diagram (omitted here) represents a generalized architecture for a knowledge-based
agent. The knowledge-based agent (KBA) takes input from the environment by perceiving
it. The input is taken by the inference engine of the agent, which also communicates with
the KB to decide what to do as per the knowledge stored in the KB. The learning element of
the KBA regularly updates the KB by learning new knowledge.
Knowledge base
A knowledge base is required for updating the knowledge of an agent so that it can learn
from experience and take actions as per its knowledge.
Inference system
Inference means deriving new sentences from old ones. The inference system allows us to
add a new sentence to the knowledge base. A sentence is a proposition about the world. The
inference system applies logical rules to the KB to deduce new information.
The inference system generates new facts so that an agent can update the KB. An inference
system works mainly with two rules, which are given as:
o Forward chaining
o Backward chaining
Following are three operations which are performed by KBA in order to show the
intelligent behavior:
1. TELL: This operation tells the knowledge base what it perceives from the
environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
function KB-AGENT(percept) returns an action
    persistent: KB, a knowledge base
                t, a counter, initially 0, indicating time
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
    action ← ASK(KB, MAKE-ACTION-QUERY(t))
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))
    t ← t + 1
    return action
The knowledge-based agent takes percept as input and returns an action as output. The agent
maintains the knowledge base, KB, and it initially has some background knowledge of the real
world. It also has a counter to indicate the time for the whole process, and this counter is
initialized with zero.
Each time the function is called, it performs its three operations: first it TELLs the KB what
it perceives, then it ASKs the KB what action to take, and finally it TELLs the KB which
action was chosen.
MAKE-PERCEPT-SENTENCE generates a sentence asserting that the agent perceived the
given percept at the given time. MAKE-ACTION-QUERY generates a sentence asking
which action should be done at the current time. MAKE-ACTION-SENTENCE generates a
sentence asserting that the chosen action was executed.
A knowledge-based agent can be viewed at different levels which are given below:
1. Knowledge level
Knowledge level is the first level of a knowledge-based agent; at this level we need to
specify what the agent knows and what the agent's goals are. With these specifications, we
can fix its behavior. For example, suppose an automated taxi agent needs to go from station
A to station B, and it knows the way from A to B; this comes at the knowledge level.
2. Logical level:
At this level, we understand how the knowledge is represented and stored. Here, sentences
are encoded into different logics: at the logical level, an encoding of knowledge into logical
sentences occurs. At the logical level we can expect the automated taxi agent to reach
destination B.
3. Implementation level:
This is the physical representation of logic and knowledge. At the implementation level, the
agent performs actions as per the logical and knowledge levels. At this level, the automated
taxi agent actually implements its knowledge and logic so that it can reach the destination.
Knowledge Representation
Humans are best at understanding, reasoning, and interpreting knowledge. Humans know
things (knowledge) and, as per their knowledge, perform various actions in the real world.
How machines do all these things falls under knowledge representation and reasoning.
Hence we can describe knowledge representation as follows:
Knowledge representation and reasoning (KR, KRR) is the part of Artificial intelligence
which concerned with AI agents thinking and how thinking contributes to intelligent
behavior of agents.
It is responsible for representing information about the real world so that a computer can
understand and can utilize this knowledge to solve the complex real world problems such
as diagnosis a medical condition or communicating with humans in natural language.
It is also a way which describes how we can represent knowledge in artificial intelligence.
Knowledge representation is not just storing data into some database, but it also enables an
intelligent machine to learn from that knowledge and experiences so that it can behave
intelligently like a human.
What to Represent:
Object: All the facts about objects in our world domain. E.g., guitars contain strings,
trumpets are brass instruments.
Facts: Facts are the truths about the real world and what we represent.
Types of knowledge
1. Declarative Knowledge: It is knowledge about facts, concepts, and objects — knowing
that something is the case.
2. Procedural Knowledge: It is knowledge of how to do something, such as rules,
strategies, and procedures.
3. Structural Knowledge: It describes relationships between various concepts such as
kind-of, part-of, and grouping of something.
Knowledge of the real world plays a vital role in intelligence, and the same holds for creating
artificial intelligence. Knowledge plays an important role in demonstrating intelligent
behavior in AI agents. An agent is only able to act accurately on some input when it has
some knowledge or experience about that input.
Suppose we meet some person who is speaking a language which we don't know; then how
will we be able to act on that? The same thing applies to the intelligent behavior of the
agents. The figure shows a decision maker which acts by sensing the environment and using
knowledge. But if the knowledge part is not present, then it cannot display intelligent
behavior.
AI knowledge cycle:
An Artificial intelligence system has the following components for displaying intelligent
behavior:
Perception
Learning
Knowledge Representation and Reasoning
Planning
Execution
The above diagram shows how an AI system can interact with the real world and which
components help it to show intelligence. The AI system has a Perception component by
which it retrieves information from its environment. It can be visual, audio, or another form
of sensory input. The learning component is responsible for learning from the data captured
by the Perception component. In the complete cycle, the main components are Knowledge
Representation and Reasoning. These two components are involved in showing intelligence
in machines, as in humans. They are independent of each other but also coupled together.
Planning and execution depend on the analysis of knowledge representation and reasoning.
There are mainly four approaches to knowledge representation, which are given below:
1. Simple relational knowledge:
It is the simplest way of storing facts which uses the relational method, and each fact about
a set of objects is set out systematically in columns.
Player Weight Age
Player1 65 23
Player2 58 18
Player3 75 24
2. Inheritable knowledge:
In the inheritable knowledge approach, all data must be stored in a hierarchy of classes.
This approach contains inheritable knowledge which shows a relation between instance and
class, called the instance relation.
Every individual frame can represent a collection of attributes and their values.
3. Inferential knowledge:
The inferential knowledge approach represents knowledge in the form of formal logic. It
guarantees correctness. Example: consider the two statements
1. Marcus is a man.
2. All men are mortal.
These can be represented as:
man(Marcus)
∀x: man(x) → mortal(x)
from which mortal(Marcus) can be inferred.
4. Procedural knowledge: The procedural knowledge approach uses small programs and
code which describe how to do specific things and how to proceed.
In this approach, one important rule is used, the If-Then rule.
With this knowledge, we can use various coding languages such as LISP and Prolog.
However, it is not always possible to represent all cases with this approach.
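As a small illustration of procedural knowledge, the following Python sketch encodes one hypothetical If-Then rule as a procedure (the rule and its threshold are assumptions for illustration; the same idea could equally be written in LISP or Prolog):

    def advise(temperature_c):
        # IF the temperature is below zero THEN warn about ice.
        if temperature_c < 0:
            return "Roads may be icy - drive carefully."
        return "No ice warning."

    print(advise(-3))  # Roads may be icy - drive carefully.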
A good knowledge representation system should possess the following properties:
1. Representational Accuracy:
A KR system should have the ability to represent all kinds of required knowledge.
2. Inferential Adequacy:
A KR system should have the ability to manipulate the representational structures to produce
new knowledge corresponding to the existing structures.
3. Inferential Efficiency:
The ability to direct the inferential mechanism in the most productive directions by storing
appropriate guides.
4. Acquisitional Efficiency:
The ability to acquire new knowledge easily using automatic methods.
There are mainly four ways of knowledge representation which are given as follows:
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
1. Logical Representation
Logical representation is a language with some concrete rules which deals with propositions
and has no ambiguity in representation. Logical representation means drawing a conclusion
based on various conditions. This representation lays down some important communication
rules. It consists of precisely defined syntax and semantics which support sound inference.
Each sentence can be translated into logic using syntax and semantics.
Syntax:
o Syntax consists of the rules which decide how we can construct legal sentences in the logic.
o It determines which symbols we can use in knowledge representation.
o It also determines how to write those symbols.
Semantics:
o Semantics are the rules by which we can interpret the sentence in the logic.
o Semantic also involves assigning a meaning to each sentence.
Logical representation can be categorized into mainly two logics:
a. Propositional logic
b. Predicate logic
Disadvantages of logical representation:
1. Logical representations have some restrictions and are challenging to work with.
2. The logical representation technique may not be very natural, and inference may not be
very efficient.
2. Semantic Network Representation
Semantic networks are an alternative to predicate logic for knowledge representation. In
semantic networks, we can represent our knowledge in the form of graphical networks. This
network consists of nodes representing objects and arcs which describe the relationships
between those objects. Semantic networks can categorize objects in different forms and can
also link those objects. Semantic networks are easy to understand and can be easily extended.
Example: Following are some statements which we need to represent in the form of nodes and
arcs.
Statements:
a. Jerry is a cat.
b. Jerry is a mammal.
c. Jerry is owned by Priya.
d. Jerry is brown colored.
e. All mammals are animals.
In the above diagram, we have represented the different types of knowledge in the form of
nodes and arcs. Each object is connected with another object by some relation.
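As a rough sketch, the Jerry statements above can be stored as (node, relation, node) arcs in plain Python; the relation names used here are illustrative assumptions:

    arcs = [
        ("Jerry", "is-a", "Cat"),
        ("Jerry", "is-a", "Mammal"),
        ("Jerry", "owned-by", "Priya"),
        ("Jerry", "has-color", "Brown"),
        ("Mammal", "is-a", "Animal"),
    ]

    def related(node, relation):
        """Follow arcs with the given label from a node."""
        return [end for start, rel, end in arcs
                if start == node and rel == relation]

    print(related("Jerry", "is-a"))  # ['Cat', 'Mammal']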
Drawbacks in semantic representation:
1. Semantic networks take more computational time at runtime, as we need to traverse the
complete network tree to answer some questions. It might be possible, in the worst-case
scenario, that after traversing the entire tree we find that the solution does not exist in
this network.
2. Semantic networks try to model human-like memory (which has about 10^15 neurons
and links) to store the information, but in practice it is not possible to build such a vast
semantic network.
3. These types of representations are inadequate as they do not have any equivalent
quantifier, e.g., for all, for some, none, etc.
4. Semantic networks do not have any standard definition for the link names.
5. These networks are not intelligent and depend on the creator of the system.
3. Frame Representation
A frame is a record-like structure which consists of a collection of attributes and their values
to describe an entity in the world. Frames are an AI data structure which divides knowledge
into substructures by representing stereotyped situations. A frame consists of a collection of
slots and slot values. These slots may be of any type and size. Slots have names and values
which are called facets.
Facets: The various aspects of a slot are known as facets. Facets are features of frames which
enable us to put constraints on frames. Example: IF-NEEDED facets are called when the data
of a particular slot is needed. A frame may consist of any number of slots, a slot may
include any number of facets, and facets may have any number of values. A frame is also known
as slot-filler knowledge representation in artificial intelligence.
Frames are derived from semantic networks and later evolved into our modern-day classes and
objects. A single frame is not of much use on its own. A frame system consists of a collection
of frames which are connected. In a frame, knowledge about an object or event can be stored
together in the knowledge base. The frame is a type of technology which is widely used in
various applications, including natural language processing and machine vision.
Example 1:
Let's take an example of a frame for a book:
Slots Fillers
Title Artificial Intelligence
Genre Computer Science
Author Peter Norvig
Edition Third Edition
Year 1996
Page 1152
Example 2:
Let's suppose we are taking an entity, Peter. Peter is an engineer by profession, his age is
25, he lives in the city of London, and the country is England. Following is the frame
representation for this:
Slots Fillers
Name Peter
Profession Engineer
Age 25
City London
Country England
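A minimal sketch of this frame as a Python dictionary mapping slots to fillers, matching the table above; an IF-NEEDED facet can be modeled as a value computed on demand (the reference year 2024 below is an assumption for illustration):

    peter_frame = {
        "Name": "Peter",
        "Profession": "Engineer",
        "Age": 25,
        "City": "London",
        "Country": "England",
    }

    # An IF-NEEDED facet modeled as a callable, computed only when asked for.
    peter_frame["Birth-Year"] = lambda: 2024 - peter_frame["Age"]

    print(peter_frame["Profession"])    # Engineer
    print(peter_frame["Birth-Year"]())  # 1999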
Advantages of frame representation:
1. The frame knowledge representation makes programming easier by grouping related
data.
2. The frame representation is comparably flexible and is used by many applications in AI.
3. It is very easy to add slots for new attributes and relations.
4. It is easy to include default data and to search for missing values.
5. Frame representation is easy to understand and visualize.
4. Production Rules
A production rules system consists of (condition, action) pairs which mean, "If condition then
action". It has mainly three parts:
o The set of production rules
o Working memory
o The recognize-act cycle
The working memory contains the description of the current state of problem solving, and a
rule can write knowledge to the working memory. This knowledge may then match and fire
other rules. If a new situation (state) is generated and multiple production rules become ready
to fire together, this set is called the conflict set. In this situation, the agent needs to select one
rule from the set, and this is called conflict resolution.
Example:
o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down).
o IF (on bus AND unpaid) THEN action (pay charges).
o IF (bus arrives at destination) THEN action (get down from the bus).
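The bus rules above can be run by a tiny recognize-act loop. The following Python sketch is illustrative only: the set-based working memory and the effect sets attached to each rule are assumptions, not part of the original example.

    rules = [
        ({"at bus stop", "bus arrives"}, "get into the bus", {"on the bus"}),
        ({"on the bus", "unpaid"}, "pay charges", {"paid"}),
        ({"on the bus", "paid", "empty seat"}, "sit down", {"seated"}),
    ]
    working_memory = {"at bus stop", "bus arrives", "unpaid", "empty seat"}

    fired = True
    while fired:  # recognize-act cycle
        fired = False
        for conditions, action, effects in rules:
            # Fire a rule whose conditions hold and whose effects are new.
            if conditions <= working_memory and not effects <= working_memory:
                print("Action:", action)
                working_memory |= effects
                fired = True

Running it prints the actions in order: get into the bus, pay charges, sit down.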
Disadvantages of production rules:
1. A production rule system does not exhibit any learning capability, as it does not store
the results of problems for future use.
2. During the execution of the program, many rules may be active; hence rule-based
production systems are inefficient.
Propositional logic (PL) is the simplest form of logic where all the statements are made by
propositions. A proposition is a declarative statement which is either true or false. It is a
technique of knowledge representation in logical and mathematical form.
Example:
a) It is Sunday.
b) The Sun rises from the West. (False proposition)
c) 3 + 3 = 7. (False proposition)
d) 5 is a prime number.
The syntax of propositional logic defines the allowable sentences for the knowledge
representation. There are two types of Propositions:
a. Atomic propositions: Atomic propositions are simple statements consisting of a single
proposition symbol, each of which is either true or false.
Example: "2 + 2 is 4" is an atomic proposition, as it is a true fact.
b. Compound propositions: Compound propositions are constructed by combining simpler
or atomic propositions using logical connectives.
Example: "It is raining today, and the street is wet."
Logical connectives are used to connect two simpler propositions or to represent a sentence
logically. We can create compound propositions with the help of logical connectives. There are
mainly five connectives, which are given as follows:
o Negation (¬)
o Conjunction (∧)
o Disjunction (∨)
o Implication (→)
o Biconditional (↔)
Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible
scenarios. We can combine all possible combinations of truth values with logical connectives,
and the representation of these combinations in a tabular format is called a truth table.
Following are the truth tables for all logical connectives:
Truth table with three propositions:
We can build a proposition composed of three propositions P, Q, and R. This truth table is
made up of 2³ = 8 rows, as we have taken three proposition symbols.
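A short Python sketch that prints such a truth table for three propositions; the sample formula (P ∧ Q) → R is an arbitrary illustration, encoded using the equivalence P → Q ≡ ¬P ∨ Q:

    from itertools import product

    def truth_table(formula, names=("P", "Q", "R")):
        """Print one row per combination of truth values (2**3 = 8 rows)."""
        print(" ".join(names), "| result")
        for values in product([True, False], repeat=len(names)):
            row = " ".join(str(v)[0] for v in values)  # T / F
            print(row, "|", str(formula(*values))[0])

    # Example formula: (P AND Q) -> R, written with Python operators.
    truth_table(lambda p, q, r: (not (p and q)) or r)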
Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectors or logical
operators. This order should be followed while evaluating a propositional problem. Following
is the list of the precedence order for operators:
Precedence Operators
First precedence Parenthesis
Second precedence Negation
Third precedence Conjunction (AND)
Fourth precedence Disjunction (OR)
Fifth precedence Implication
Sixth precedence Biconditional
Limitations of propositional logic:
o We cannot represent relations like ALL, some, or none with propositional logic.
Example:
a. All the girls are intelligent.
b. Some apples are sweet.
o Propositional logic has limited expressive power.
o In propositional logic, we cannot describe statements in terms of their properties or
logical relationships.
Inference:
In artificial intelligence, we need intelligent computers which can create new logic from old
logic or from evidence, so generating conclusions from evidence and facts is termed
inference.
Inference rules:
Inference rules are the templates for generating valid arguments. Inference rules are applied to
derive proofs in artificial intelligence, and the proof is a sequence of the conclusion that leads
to the desired goal.
In inference rules, the implication among all the connectives plays an important role. Following
are some terminologies related to inference rules:
o Implication: It is one of the logical connectives which can be represented as P → Q. It
is a Boolean expression.
o Converse: The converse of implication, which means the right-hand side proposition
goes to the left-hand side and vice-versa. It can be written as Q → P.
o Contrapositive: The negation of converse is termed as contrapositive, and it can be
represented as ¬ Q → ¬ P.
o Inverse: The negation of implication is called inverse. It can be represented as ¬ P →
¬ Q.
From the above terms, some of the compound statements are equivalent to each other, which
we can prove using a truth table:
Hence from the above truth table, we can prove that P → Q is equivalent to ¬ Q → ¬ P, and
Q→ P is equivalent to ¬ P → ¬ Q.
Types of inference rules:
1. Modus Ponens:
The Modus Ponens rule is one of the most important rules of inference, and it states that if P
and P → Q are true, then we can infer that Q will be true. It can be represented as:
(P → Q), P ⊢ Q
Example:
Statement-1: "If I am sleepy then I go to bed." ==> P → Q
Statement-2: "I am sleepy." ==> P
Conclusion: "I go to bed." ==> Q
Proof by Truth table:
2. Modus Tollens:
The Modus Tollens rule states that if P → Q is true and ¬Q is true, then ¬P will also be
true. It can be represented as:
(P → Q), ¬Q ⊢ ¬P
Example:
Statement-1: "If I am sleepy then I go to bed." ==> P → Q
Statement-2: "I do not go to the bed." ==> ¬Q
Statement-3: Which infers that "I am not sleepy." ==> ¬P
Proof by Truth table:
3. Hypothetical Syllogism:
The Hypothetical Syllogism rule states that if P → Q is true and Q → R is true, then P → R
will also be true. It can be represented as:
(P → Q), (Q → R) ⊢ (P → R)
Example:
Statement-1: "If you have my home key then you can unlock my home." ==> P → Q
Statement-2: "If you can unlock my home then you can take my money." ==> Q → R
Conclusion: "If you have my home key then you can take my money." ==> P → R
Proof by Truth table:
4. Disjunctive Syllogism:
The Disjunctive Syllogism rule states that if P ∨ Q is true and ¬P is true, then Q will be true.
It can be represented as:
(P ∨ Q), ¬P ⊢ Q
Example:
Statement-1: "Today is Sunday or Monday." ==> P ∨ Q
Statement-2: "Today is not Sunday." ==> ¬P
Conclusion: "Today is Monday." ==> Q
Proof by Truth table:
5. Addition:
The Addition rule is one of the common inference rules, and it states that if P is true, then
P ∨ Q will be true. It can be represented as:
P ⊢ (P ∨ Q)
Example:
Proof by Truth-Table:
6. Simplification:
The Simplification rule states that if P ∧ Q is true, then P (or Q) will also be true. It can be
represented as:
(P ∧ Q) ⊢ P
Proof by Truth-Table:
7. Resolution:
The Resolution rule states that if P ∨ Q and ¬P ∨ R are true, then Q ∨ R will also be true. It
can be represented as:
(P ∨ Q), (¬P ∨ R) ⊢ (Q ∨ R)
Proof by Truth-Table:
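A minimal Python sketch of the resolution rule, with clauses represented as frozensets of literals and a "~" prefix marking negation (an assumed toy encoding, not a standard library):

    def negate(literal):
        return literal[1:] if literal.startswith("~") else "~" + literal

    def resolve(clause1, clause2):
        """Return all resolvents of two clauses."""
        resolvents = []
        for lit in clause1:
            if negate(lit) in clause2:
                # Drop the complementary pair, union the remainders.
                resolvents.append((clause1 - {lit}) | (clause2 - {negate(lit)}))
        return resolvents

    print(resolve(frozenset({"P", "Q"}), frozenset({"~P", "R"})))
    # [frozenset({'Q', 'R'})]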
In the topic of propositional logic, we have seen how to represent statements using
propositional logic. But unfortunately, in propositional logic, we can only represent facts
which are either true or false. PL is not sufficient to represent complex sentences or natural
language statements. Propositional logic has very limited expressive power. Consider the
following sentences, which we cannot represent using PL:
o All the girls are intelligent.
o Some apples are sweet.
First-Order logic:
The syntax of FOL determines which collections of symbols form logical expressions in
first-order logic. The basic syntactic elements of first-order logic are symbols. We write
statements in short-hand notation in FOL.
Constant 1, 2, A, John, Mumbai, cat,....
Variables x, y, z, a, b,....
Predicates Brother, Father, >,....
Function sqrt, LeftLegOf,....
Connectives ∧, ∨, ¬, ⇒, ⇔
Equality ==
Quantifier ∀, ∃
Atomic sentences:Atomic sentences are the most basic sentences of first-order logic. These
sentences are formed from a predicate symbol followed by a parenthesis with a sequence of
terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Complex Sentences:
Complex sentences are made by combining atomic sentences using connectives.
First-order logic statements can be divided into two parts: subject and predicate. Consider the
statement "x is an integer."; it consists of two parts: the first part, x, is the subject of the
statement, and the second part, "is an integer," is known as the predicate.
Universal Quantifier:
A universal quantifier is a symbol of logical representation which specifies that the statement
within its range is true for everything or every instance of a particular thing. It is denoted by
the logical operator ∀. If x is a variable, then ∀x is read as:
o For all x
o For each x
o For every x
Example:
All men drink coffee.
Let a variable x refer to a man, so all x can be represented as:
∀x man(x) → drink(x, coffee)
It will be read as: For all x, where x is a man, x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement within its
scope is true for at least one instance of something.
It is denoted by the logical operator ∃, which resembles an inverted E. When it is used with a
predicate variable, it is called an existential quantifier.
If x is a variable, then the existential quantifier will be ∃x or ∃(x), and it will be read as:
o There exists an x
o For some x
o For at least one x
Example:
Some boys are intelligent.
∃x: boys(x) ∧ intelligent(x)
It will be read as: There are some x where x is a boy who is intelligent.
Points to remember:
o The main connective for universal quantifier ∀ is implication →.
o The main connective for existential quantifier ∃ is and ∧.
Properties of Quantifiers:
The quantifiers interact with the variables which appear within their scope. There are two types
of variables in first-order logic, which are given below:
Free Variable: A variable is said to be a free variable in a formula if it occurs outside the scope
of the quantifier.
Example: ∀x ∃(y) [P(x, y, z)], where z is a free variable.
Bound Variable: A variable is said to be a bound variable in a formula if it occurs within the
scope of the quantifier.
Example: ∀x ∃y [P(x, y)], here x and y are bound variables.
Alternative notes on resolution:
Steps for Resolution:
1. Conversion of facts into first-order logic.
2. Conversion of FOL statements into CNF.
3. Negate the statement which needs to be proved (proof by contradiction).
4. Draw the resolution graph (unification).
To better understand all the above steps, we will take an example in which we will apply
resolution.
Example:
a. John likes all kinds of food.
b. Apples and vegetables are food.
c. Anything anyone eats and is not killed by is food.
d. Anil eats peanuts and is still alive.
e. Harry eats everything that Anil eats.
Prove by resolution that: John likes peanuts.
Step-1: In the first step, we will convert all the given statements into first-order logic.
Step-2: Conversion of FOL into CNF. In first-order logic resolution, it is required to convert
the FOL into CNF, as the CNF form makes resolution proofs easier.
o Eliminate all implication (→) and rewrite
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y ¬ [eats(x, y) Λ ¬ killed(x)] V food(y)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀x ¬ eats(Anil, x) V eats(Harry, x)
f. ∀x¬ [¬ killed(x) ] V alive(x)
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
o Move negation (¬) inwards and rewrite
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y ¬ eats(x, y) V killed(x) V food(y)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀x ¬ eats(Anil, x) V eats(Harry, x)
f. ∀x killed(x) V alive(x)
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
o Rename variables or standardize variables
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀y ∀z ¬ eats(y, z) V killed(y) V food(z)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀w¬ eats(Anil, w) V eats(Harry, w)
f. ∀g killed(g) V alive(g)
g. ∀k ¬ alive(k) V ¬ killed(k)
h. likes(John, Peanuts).
o Eliminate existential quantifiers by Skolemization.
In this step, we eliminate the existential quantifier ∃; this process is known
as Skolemization. In this example problem there is no existential quantifier,
so all the statements remain the same in this step.
o Drop universal quantifiers.
In this step we will drop all universal quantifiers, since all the statements are
implicitly universally quantified, so we do not need to write them.
a. ¬ food(x) V likes(John, x)
b. food(Apple)
c. food(vegetables)
d. ¬ eats(y, z) V killed(y) V food(z)
e. eats (Anil, Peanuts)
f. alive(Anil)
g. ¬ eats(Anil, w) V eats(Harry, w)
h. killed(g) V alive(g)
i. ¬ alive(k) V ¬ killed(k)
j. likes(John, Peanuts).
Step-3: In this step, we will apply negation to the conclusion statement (for proof by
contradiction), which will be written as:
¬likes(John, Peanuts)
Step-4: Now, in this step, we will solve the problem by a resolution tree using substitution.
For the above problem, it will be given as follows:
Hence the negation of the conclusion has been proved as a complete contradiction with the
given set of statements.
o In the first step of resolution graph, ¬likes(John, Peanuts) , and likes(John, x) get
resolved(canceled) by substitution of {Peanuts/x}, and we are left with ¬
food(Peanuts)
o In the second step of the resolution graph, ¬ food(Peanuts) , and food(z) get resolved
(canceled) by substitution of { Peanuts/z}, and we are left with ¬ eats(y, Peanuts) V
killed(y) .
o In the third step of the resolution graph, ¬ eats(y, Peanuts) and eats (Anil,
Peanuts) get resolved by substitution {Anil/y}, and we are left with Killed(Anil) .
o In the fourth step of the resolution graph, Killed(Anil) and ¬killed(k) get resolved by
substitution {Anil/k}, and we are left with ¬alive(Anil).
o In the last step of the resolution graph ¬ alive(Anil) and alive(Anil) get resolved.
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence which
applies logical rules to the knowledge base to infer new information from known facts. The
first inference engines were part of expert systems. An inference engine commonly proceeds
in two modes, which are:
a. Forward chaining
b. Backward chaining
Horn clauses and definite clauses are forms of sentences which enable the knowledge base to
use a more restricted and efficient inference algorithm. Logical inference algorithms use
forward and backward chaining approaches, which require the KB in the form of first-order
definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one positive literal is
known as a definite clause or strict horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive literal is
known as horn clause. Hence all the definite clauses are horn clauses.
A. Forward Chaining
Forward chaining is also known as forward deduction or the forward reasoning method when
using an inference engine. Forward chaining is a form of reasoning which starts with atomic
sentences in the knowledge base and applies inference rules (Modus Ponens) in the forward
direction to extract more data until the goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are
satisfied, and adds their conclusions to the known facts. This process repeats until the problem
is solved.
Properties of Forward-Chaining:
o It is a bottom-up approach, as it moves from the facts up to the goal.
o It makes conclusions based on known facts or data, starting from the initial state and
reaching the goal state.
o The forward-chaining approach is also called data-driven, as we reach the goal using
the available data.
o The forward-chaining approach is commonly used in expert systems and in business
and production rule systems.
Consider the following famous example which we will use in both approaches:
Example:
"As per the law, it is a crime for an American to sell weapons to hostile nations. Country
A, an enemy of America, has some missiles, and all the missiles were sold to it by Robert,
who is an American citizen."
o solve the above problem, first, we will convert all the above facts into first-order definite
clauses, and then we will use a forward-chaining algorithm to reach the goal.
o It is a crime for an American to sell weapons to hostile nations. (Let's say p, q, and r are
variables)
American (p) ∧ weapon(q) ∧ sells (p, q, r) ∧ hostile(r) → Criminal(p) ...(1)
o Country A has some missiles. ∃p Owns(A, p) ∧ Missile(p). It can be written in two
definite clauses by using Existential Instantiation, introducing a new constant T1.
Owns(A, T1) ......(2)
Missile(T1) .......(3)
o All of the missiles were sold to country A by Robert.
∀p Missiles(p) ∧ Owns(A, p) → Sells(Robert, p, A) ......(4)
o Missiles are weapons.
Missile(p) → Weapons (p) .......(5)
o Enemy of America is known as hostile.
Enemy(p, America) →Hostile(p) ........(6)
o Country A is an enemy of America.
Enemy (A, America) .........(7)
o Robert is American
American(Robert). ..........(8)
Step-1:
In the first step we will start with the known facts and will choose the sentences which do not
have implications, such as American(Robert), Enemy(A, America), Owns(A, T1), and
Missile(T1). All these facts will be represented as below.
Step-2:
At the second step, we will see which facts can be inferred from the available facts with
satisfied premises.
Rule (1) does not have its premises satisfied, so it will not be added in the first iteration.
Rule (4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is added, which
infers from the conjunction of Rules (2) and (3).
Rule (6) is satisfied with the substitution {p/A}, so Hostile(A) is added, which infers from
Rule (7).
Step-3:
At step-3, as we can check, Rule (1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so
we can add Criminal(Robert), which infers from all the available facts. And hence we reached
our goal statement.
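The whole forward-chaining proof can be reproduced in a few lines of Python. In this sketch the first-order rules are written out as ground (already-substituted) propositions for simplicity; a full implementation would perform unification to find the substitutions shown above.

    facts = {"American(Robert)", "Missile(T1)", "Owns(A,T1)", "Enemy(A,America)"}

    rules = [
        ({"Missile(T1)"}, "Weapon(T1)"),                          # Rule 5 with {p/T1}
        ({"Missile(T1)", "Owns(A,T1)"}, "Sells(Robert,T1,A)"),    # Rule 4 with {p/T1}
        ({"Enemy(A,America)"}, "Hostile(A)"),                     # Rule 6 with {p/A}
        ({"American(Robert)", "Weapon(T1)", "Sells(Robert,T1,A)",
          "Hostile(A)"}, "Criminal(Robert)"),                     # Rule 1
    ]

    added = True
    while added:
        added = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                print("Inferred:", conclusion)
                added = True
    # Last line printed: Inferred: Criminal(Robert) -- the goal is reached.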
B. Backward Chaining:
Backward chaining is also known as backward deduction or the backward reasoning method
when using an inference engine. A backward-chaining algorithm starts with the goal and works
backward, chaining through rules to find known facts that support the goal.
Example:
In backward chaining, we will use the same example as above and will rewrite all the rules.
Backward-Chaining proof:
In backward chaining, we will start with our goal predicate, which is Criminal(Robert), and
then infer further rules.
Step-1:
At the first step, we will take the goal fact. And from the goal fact, we will infer other facts,
and at last, we will prove those facts true. So our goal fact is "Robert is Criminal," so following
is the predicate of it.
Step-2:
At the second step, we will infer other facts from the goal fact which satisfy the rules. As we
can see in Rule (1), the goal predicate Criminal(Robert) is present with substitution {Robert/p}.
So we will add all the conjunctive facts below the first level and will replace p with Robert.
Step-3:
At step-3, we will extract the further fact Missile(q), which infers from Weapon(q), as it
satisfies Rule (5). Weapon(q) is also true with the substitution of the constant T1 for q.
Step-4:
At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which
satisfies Rule (4) with the substitution of A in place of r. So these two statements are proved
here.
Step-5:
At step-5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies Rule (6).
And hence all the statements are proved true using backward chaining.
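For contrast, here is a minimal backward-chaining sketch over the same ground facts and rules; again this toy skips unification and works on already-substituted propositions:

    facts = {"American(Robert)", "Missile(T1)", "Owns(A,T1)", "Enemy(A,America)"}
    rules = {
        "Weapon(T1)": [{"Missile(T1)"}],
        "Sells(Robert,T1,A)": [{"Missile(T1)", "Owns(A,T1)"}],
        "Hostile(A)": [{"Enemy(A,America)"}],
        "Criminal(Robert)": [{"American(Robert)", "Weapon(T1)",
                              "Sells(Robert,T1,A)", "Hostile(A)"}],
    }

    def prove(goal):
        """Return True if the goal is a known fact or provable via some rule."""
        if goal in facts:
            return True
        for premises in rules.get(goal, []):
            if all(prove(p) for p in premises):
                return True
        return False

    print(prove("Criminal(Robert)"))  # True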
Difference between forward chaining and backward chaining:
1. Forward chaining starts from known facts and applies inference rules to extract more
data until it reaches the goal, whereas backward chaining starts from the goal and works
backward through inference rules to find the required facts that support the goal.
2. Forward chaining tests all the available rules, whereas backward chaining tests only
the few required rules.
3. Forward chaining is suitable for planning, monitoring, control, and interpretation
applications, whereas backward chaining is suitable for diagnostic, prescription, and
debugging applications.
4. Forward chaining is aimed at any conclusion, whereas backward chaining is aimed
only at the required data.
Observational learning, also known as social learning or vicarious learning, is a type of
learning in which individuals acquire new behaviors or knowledge by watching and imitating
others.
Definition: It is a process where humans or animals learn from each other’s actions, gestures,
and behavior. It is a psychological theory that explains the way people learn through
observation, imitation, and modeling.
Let's take an example to further explain the concept of observational learning. If a child sees
their parents performing a task in a particular manner, then the child is likely to learn that task
and mimic their parents. This is how observational learning works.
Explanation: The theory of observational learning was first proposed by the psychologist
Albert Bandura in the 1960s. According to Bandura, there are four key elements that are
involved in observational learning:
Attention – the individual needs to pay attention to the behavior that is being
demonstrated.
Retention – the individual needs to remember or retain the information or behavior
they have observed.
Reproduction – the individual needs to be able to reproduce the behavior that they
have observed.
Motivation – the individual needs to be motivated to reproduce the behavior that they
have observed.
For example, an AI system can learn to play a game by observing the behavior of a human
player. The AI system can observe the human's movements, decision-making process, and
strategies, and then use that information to improve its own performance in the game.
Another example is autonomous vehicles. Autonomous vehicles can learn by observing the
driving behavior of human drivers. The vehicles can observe how humans navigate through
different traffic scenarios, make decisions, and respond to different road conditions.
Autonomous vehicles can then use this observational learning to improve their own driving
capabilities.
Uncertainty:
Till now, we have learned knowledge representation using first-order logic and propositional
logic with certainty, which means we were sure about the predicates. With this knowledge
representation, we might write A→B, which means if A is true then B is true. But consider a
situation where we are not sure whether A is true or not; then we cannot express this
statement. This situation is called uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we need
uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
Following are some leading causes of uncertainty in the real world:
1. Information obtained from unreliable sources.
2. Experimental errors.
3. Equipment faults.
4. Temperature variation.
5. Climate change.
Probabilistic reasoning:
In the real world, there are lots of scenarios, where the certainty of something is not confirmed,
such as "It will rain today," "behavior of someone for some situations," "A match between two
teams or two players." These are probable sentences for which we can assume that it will
happen but not sure about it, so here we use probabilistic reasoning.
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:
o Bayes' rule
o Bayesian Statistics
Probability: Probability can be defined as the chance that an uncertain event will occur. It is
the numerical measure of the likelihood that an event will occur. The value of probability
always remains between 0 and 1:
0 ≤ P(A) ≤ 1, where P(A) is the probability of event A.
P(A) = 0 indicates total certainty that A will not occur, and P(A) = 1 indicates total certainty
that A will occur.
We can find the probability of an uncertain event by using the formula:
P(A) = (Number of desired outcomes) / (Total number of outcomes)
and P(¬A) = 1 − P(A), the probability of the event not occurring.
Sample space: The collection of all possible events is called sample space.
Random variables: Random variables are used to represent the events and objects in the real
world.
Prior probability: The prior probability of an event is probability computed before observing
new information.
Posterior Probability: The probability that is calculated after all evidence or information has
been taken into account. It is a combination of prior probability and new information.
Conditional probability:
Conditional probability is the probability of an event occurring when another event has already
happened.
Let's suppose we want to calculate the probability of event A when event B has already
occurred, "the probability of A under the condition B". It can be written as:
P(A|B) = P(A ∧ B) / P(B)
where P(A ∧ B) is the joint probability of A and B, and P(B) is the marginal probability of B.
Bayes' theorem:
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning, and it
determines the probability of an event with uncertain knowledge.
In probability theory, it relates the conditional probability and the marginal probabilities of
two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. Bayesian
inference is an application of Bayes' theorem, which is fundamental to Bayesian statistics. It
can be stated as:
P(A|B) = P(B|A) · P(A) / P(B)
Application of Bayes' theorem in Artificial intelligence:
o It is used to calculate the next step of the robot when the already executed step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
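A small numeric sketch of Bayes' theorem in Python; the disease-test numbers below are assumed purely for illustration:

    p_disease = 0.01            # prior P(A): 1% of people have the disease
    p_pos_given_disease = 0.95  # P(B|A): test sensitivity
    p_pos_given_healthy = 0.05  # P(B|not A): false-positive rate

    # Total probability of a positive test, P(B):
    p_pos = (p_pos_given_disease * p_disease
             + p_pos_given_healthy * (1 - p_disease))

    # Posterior P(A|B) by Bayes' theorem:
    p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
    print(round(p_disease_given_pos, 3))  # ~0.161

Even with a fairly accurate test, the posterior stays low because the prior is small, which is exactly the kind of reasoning under uncertainty Bayes' theorem supports.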
Dempster–Shafer theory (DST)
Uncertainty is a pervasive aspect of AI systems, as they often deal with incomplete or
conflicting information. Dempster–Shafer Theory, named after its inventors Arthur P.
Dempster and Glenn Shafer, offers a mathematical framework to represent and reason with
uncertain information. By utilizing belief functions, Dempster–Shafer Theory in Artificial
Intelligence systems enables them to handle imprecise and conflicting evidence, making it a
powerful tool in decision-making processes. While traditional probability theory is limited to
assigning probabilities to mutually exclusive single events, DST extends this to sets of events
in a finite discrete space. This generalization allows DST to handle evidence associated with
multiple possible events, enabling it to represent uncertainty in a more meaningful way. DST
also provides a more flexible and precise approach to handling uncertain information without
relying on additional assumptions about events within an evidential set.
Three crucial points illustrate the nature of uncertainty within this theory:
1. Belief: The belief function, Bel(K), measures the total support committed exactly to a set
of hypotheses K and its subsets, i.e., the minimum degree to which the evidence supports K.
2. Plausibility: The plausibility function, Pl(K), measures the extent to which the evidence
does not contradict K, i.e., the maximum degree to which K could be true.
3. Mass Function: The mass function, denoted m(K), quantifies the belief assigned exactly
to a set of hypotheses K. It provides a measure of uncertainty by allocating probabilities to
various hypotheses, reflecting the degree of support each hypothesis has from the available
evidence.
Inductive Learning
Inductive learning is a technique of machine learning that trains a model to make
predictions based on examples or observations. During inductive learning, the model picks up
knowledge from particular examples or instances and generalizes it so that it can predict
outcomes for brand-new data.
When using inductive learning, a rule or method is not explicitly programmed into the model.
Instead, the model is trained to spot trends and connections in the input data and then utilizes
this knowledge to predict outcomes for fresh data. The aim of inductive learning is to make a
model that can precisely anticipate the results for subsequent instances.
In supervised learning situations, where the model is trained using labeled data, inductive
learning is frequently utilized. A series of samples with the proper output labels are used to
train the model. The model then creates a mapping between the input data and the output data
using this training data. The output for fresh instances may be predicted using the model after
it has been trained.
Inductive learning is used by a number of well-known machine learning algorithms, such as
decision trees, k-nearest neighbors, and neural networks. Because it enables the development
of models that can accurately anticipate new data, even when the underlying patterns and
relationships are complicated and poorly understood, inductive learning is an essential method
for machine learning.
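As a minimal illustration of inductive learning, the following pure-Python 1-nearest-neighbour classifier generalizes from a toy labelled data set (the examples and labels are assumptions for illustration):

    examples = [((1.0, 1.0), "small"), ((1.2, 0.8), "small"),
                ((5.0, 5.5), "large"), ((5.5, 6.0), "large")]

    def predict(point):
        """Label a new point by the label of its closest training example."""
        def dist2(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))
        nearest = min(examples, key=lambda ex: dist2(ex[0], point))
        return nearest[1]

    print(predict((1.1, 0.9)))  # small -- induced from the examples, not from rules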
Advantages
Flexible and adaptive − Because inductive learning models are flexible and adaptive,
they are well suited to handling difficult, complex, and dynamic information.
Finding hidden patterns and relationships in data − Inductive learning models are
ideally suited for tasks like pattern recognition and classification because they
can identify links and patterns in data that may not be immediately apparent to
humans.
Huge datasets − Inductive learning models can efficiently handle enormous volumes
of data, making them suitable for applications requiring the processing of massive
quantities of data.
Appropriate for situations where the rules are ambiguous − Since inductive
learning models can learn from examples without explicit programming, they
are suitable for situations where the rules are not precisely described or
understood beforehand.
Disadvantages
May overfit to particular data − Inductive learning models can overfit to specific
training data, i.e., learn the noise in the data rather than the underlying patterns,
and then perform badly on fresh data.
May be computationally costly − The use of inductive learning models in real-time
applications may be constrained by their computational cost, especially for
complex datasets.
Limited interpretability − Inductive learning models may be difficult to understand,
making it hard to see how they arrive at their predictions, which matters in
applications where the decision-making process must be transparent and
explicable.
Dependent on data quality − Inductive learning models are only as good as the data
they are trained on; if the data is inaccurate or inadequate, the model may not
perform effectively.
Deductive Learning
Deductive learning is a method of machine learning in which a model is built using a set of
logical principles and steps. In deductive learning, the model is specifically designed to adhere
to a set of guidelines and processes in order to produce predictions on brand-new,
unseen data.
In rule-based systems, expert systems, and knowledge-based systems, where the rules and
processes are clearly set by domain experts, deductive learning is frequently utilized. The
model is trained to adhere to the guidelines and processes in order to derive judgments or
predictions from the input data.
Deductive learning begins with a set of rules and processes and utilizes these rules to generate
predictions on incoming data, in contrast to inductive learning, which learns from particular
examples. Making a model that can precisely adhere to a set of guidelines and processes in
order to generate predictions is the aim of deductive learning.
Deductive learning is used by a number of well-known machine learning algorithms, such as
decision trees, rule-based systems, and expert systems. Deductive learning is a crucial machine
learning strategy because it enables the development of models that can generate precise
predictions in accordance with predetermined rules and guidelines.
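For contrast with the inductive sketch earlier, here is a minimal deductive-style classifier: the rules are fixed in advance (here by a hypothetical domain expert) and simply applied to new inputs; nothing is learned from data:

    def classify(size_cm):
        # Expert-given rule 1: anything under 3.0 cm is "small".
        if size_cm < 3.0:
            return "small"
        # Expert-given rule 2: everything else is "large".
        return "large"

    print(classify(1.1))  # small -- deduced from the stated rules, not from examples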
Advantages
More effective − Since deductive learning begins with broad concepts and applies
them to particular cases, it is frequently quicker than inductive learning.
More accurate − Deductive learning can sometimes yield more accurate findings
than inductive learning, since it starts with established principles and applies
them to the data.
Requires less data − Deductive learning is more practical when data is sparse or
challenging to collect, since it requires less data than inductive learning.
Disadvantages
Constrained by existing rules − Deductive learning is constrained by the rules
currently in place, which may be insufficient or obsolete.
Not suitable for complex problems − Deductive learning is not appropriate for
complicated problems that lack precise rules or correlations between variables,
nor for ambiguous problems.
Results that are biased − The accuracy of deductive learning depends on the quality
of the rules and knowledge base, which might introduce biases and mistakes
into the results.
The main distinctions between inductive and deductive learning in machine learning are
outlined below:
Model creation − Inductive learning finds correlations and patterns in data, whereas
deductive learning obeys clearly stated guidelines and instructions.
Goal − Inductive learning generalizes and makes predictions on fresh data, whereas
deductive learning makes a model that precisely complies with the given guidelines and
instructions.