BOTTOM-UP PARSING
Introduction to bottom up parser:
Introduction to bottom up parser:
LR(0) LR(0)
LR Parser items
SLR(1)
Bottom up LALR(1) LR(1)
items
parser CLR(1)
Operator
precedence
parser
Bottom-up Parsing
o Bottom-Up Parser : In bottom up parsing we should constructs a
parse tree for an input string beginning at the leaves(the bottom)
and working up towards the root(the top)
o We can think of this process as one of “reducing” a string w to the
start symbol of a grammar.
o Bottom-up parsing is also known as shift-reduce parsing because
its two main actions are shift and reduce.
At each shift action, the current symbol in the input string is
pushed to a stack.
At each reduction step, the symbols at the top of the stack (this
symbol sequence is the right side of a production) will replaced
by the non-terminal at the left side of that production.
Shift-Reduce Parsing
o A shift-reduce parser tries to reduce the given input string into
the starting symbol.
a string the starting symbol
reduced to
o At each reduction step, a substring of the input matching to the
right side of a production rule is replaced by the non-terminal at
the left side of that production rule.
o If the substring is chosen correctly, the right most derivation of
that string is created in the reverse order.
*
rm
Rightmost Derivation: S
rm rm
Shift-Reduce Parser finds: ... S
Shift–Reduce Parsing-Example
o Consider the grammar Input string : abbcde
S aABe aAbcde
A Abc | b aAde reduction
B d aABe
S
By a sequence of four reductions we are able to reduce abbcde to S
Shift–Reduce Parsing-Example
o These reductions in fact trace out the following right-most derivation in reverse
S aABe aAde aAbcde abbcde
rm rm rm rm
Right Sentential Forms
Handle
o Informally, a “handle” of a string is a substring that matches the
right side of the production, and whose reduction to nonterminal
on the left side of the production represents one step along the
reverse of a rightmost derivation
▫ But not every substring matches the right side of a production rule is
handle.
o Formally , a “handle” of a right sentential form γ ( ) is a
production rule A and a position of where the string may
be found and replaced by A to produce the previous right-
*
sentential form in a rightmost
rm rm derivation of .
S A
then Aβ in the position following α is a handle of αβω
o The string to the right of the handle contains only terminal
symbols.
Example
o Consider the example discussed in the beginning, abbcde is a
right sentential form whose handle is Ab at position
2.Likewise,aAbcde is a right sentential form whose handle is
AAbc at position 2.
o Sometimes we say “the substring β is a handle of αβω” if
the position of β and the production Aβ we have in mind
are clear.
Handle Pruning
o A rightmost derivation in reverse can be obtained by “handle pruning". That is, we
start with a string of terminals w that we wish to parse. If ω is a sentence of
grammar at hand, then ω = γ,where γn is the nth right-sentential form of some as
yet unknown rightmost derivation
S = 0 1 2 ... n-1 n=
rm rm rm rm rm
Input string
A Shift-Reduce Parser
E E+T | T Right-Most Derivation of id+id*id
T T*F | F E E+T E+T*F E+T*id E+F*id
F (E) | id E+id*id T+id*id F+id*id id+id*id
Right-Most Sentential HANDLE Reducing Production
form
id+id*id id Fid
F+id*id F TF
T+id*id T ET
E+id*id id Fid
E+F*id F TF
E+T*id Id Fid
E+T*F T*F TT*F
E+T E+T EE+T
E
A Stack Implementation of a Shift-Reduce
Parser
o There are four possible actions of a shift-parser action:
1.Shift : The next input symbol is shifted onto the top of the stack.
2.Reduce: Replace the handle on the top of the stack by the non-
terminal.
3.Accept: Successful completion of parsing.
4.Error: Parser discovers a syntax error, and calls an error
recovery routine.
o Initial stack just contains only the end-marker $.
o The end of the input string is marked by the end-marker $.
A Stack Implementation of A Shift-Reduce
Parser
Stack Input Action
$ id+id*id$ shift Parse
$id +id*id$ Reduce by Fid Tree
$F +id*id$ Reduce by TF E8
$T +id*id$
$E +id*id$
$E+ Id*id$ Reduce by ET +
E3 T 7
$E+id *id$
$E+F *id$
$E+T *id$ *
$E+T* id$ Shift T 2
T 5 F 6
$E+T*i $
d $ Shift
$E+T*F $ F 1 F 4
id
$E+T $ Reduce by Fid
$E Reduce by TF
Shift
Shift id id
Reduce by Fid
LR Parser:
• The LR parsing method is the most general non-backtracking shift-reduce
parsing method known.
• LR parsers can parse a strictly larger class of grammars than (top-down)
predictive parsers.
• LR parsers can usually recognize all programming language construct that
can be specified by context-free grammars.
• LR parsers detect errors fast.
• Drawback: it is too much work to construct an LR parser by hand.
• Fortunately, we can use an LR parser generator such as YACC.
Introduction to bottom up parser:
LR(0) LR(0)
LR Parser items
SLR(1)
Bottom up LALR(1) LR(1)
items
parser CLR(1)
Operator
precedence
parser
An LR-Parser uses-
• states to memorize information during the parsing process,
• an action table to make decision (such as shift or reduce) and to compute states
• a goto table to compute states
• These states are embedded with grammar symbols in a stack. More, precisely, after
eahc iteration, the stack stores a word of the form s0X1s1X2 ... sm-1Xmsm where s0 is the
end-of-stack symbol and sm is the state on top of the stack.For a state s and a
terminal a, the entry action[s, a] has one of the four following forms
• shift s' where s' is a state,
• reduce A ,
• accept,
• error.
LR Parsing model:
Reasons for attractiveness of LR parser
• LR parsers can handle a large class of context-free grammars.
• The LR parsing method is a most general non-back tracking shift-
reduce parsing method.
• An LR parser can detect the syntax errors as soon as they can occur.
• LR grammars can describe more languages than LL grammars..
Drawbacks of LR parsers
• It is too much work to construct LR parser by hand. It needs an automated
parser generator.
• If the grammar contains ambiguities or other constructs then it is difficult
to parse in a left-to-right scan of the input
LR Parser:
• Action This function takes as arguments a state i and a terminal a (or $, the
input end marker). The value of ACTION [i, a] can have one of the four
forms:
• i) Shift j, where j is a state.
• ii) Reduce by a grammar production A—> β.
• iii) Accept.
• iv) Error.
• Goto This function takes a state and grammar symbol as arguments and
produces a state.
• If GOTO [Ii ,A] = Ij, the GOTO also maps a state i and non terminal A to
state j.
LR(0) Items
• An LR(0) item of a grammar G is a production of G with a dot at some
position of the body.
• For example-
• A —> •XYZ
• A —> XeYZ
• A —> XYeZ
• A —> XYZ•
• One collection of set of LR(0) items, called the canonical LR(0) collection,
provides finite automaton that is used to make parsing decisions. Such an
automaton is called an LR(0) automaton.