UNIT-3
Bottom-up parsing:- In Bottom-up parsing, the parse tree for the input string is constructed beginning at the
leaves and working up towards the root. Bottom-up parsing involves "reducing an input string 'w' to the start
symbol of the grammar".
Sentential form:- If A=> β , β is said to be in sentential form if β contains terminals (or) non-terminals.
Here, the symbol '=>' means a number of derivation steps used.
Left-sentential form:-if A=>β ,using leftmost derivation then β is a left-sentential form of the grammar.
Right-sentential form:-if A=> β , using rightmost derivation and β may contain non-terminals, then β is a
right-sentential form of the grammar.
Handle:- A Handle is a sub-string of a string such that the sub-string matches the right side of a Production
and it‟s (sub-string) reduction to the non-terminal on the left side of the production indicates a step along the
reverse of a rightmost derivation.
Handle pruning:-A Handle in a parse tree is a complete leftmost sub tree that consists of interior node & it's
children. The act of removing the children of non-terminal 'A' from the sub tree is called as 'Pruning the
Handle'. If 'W' is a string of some grammar G, then the RMD of 'W' is
S => α 0 => α 1 => . . . . α n-1 => α n
where αn is nothing but the string 'W'. The RMD in reverse order i.e.; 'α n' through 'S' is constructed by first
selecting a handle ' βn' within αn and then replacing βn with the non-terminal An. (An--> βn)
This process is repeated until we successfully reduce the input string to start symbol of the grammar. The
sequence of productions applied in reverse is the rightmost derivation for the input string.
Example: EE+E | E*E | (E) | id W= " id* id + id"
Right sentential form substring(handle) Reducing production
id * id + id id E->id
E * id + id id E->id
E * E + id E*E E->E*E
E + id id E->id
E+E E+E E->E+E
E start symbol
Types of Bottom up Parsing:-
Shift-Reduce parsing: Shift-Reduce parser uses the bottom-up parsing technique to construct a syntax tree
for an input string W. The construction of the syntax for an input string W of a grammar G begins from the
leaf nodes and ends with the starting node or root node.
To implement a shift-reduce parser is to use a stack to hold grammar symbols and an input buffer to hold the
string W to be parsed. We use $ to mark the bottom of the stack and also the right end of the input. Initially,
the stack is empty, and the string W is on the input, as follows
STACK INPUT
$ W$
The parser operators by shifting zero (or) more input symbols onto the stack until a handle β is on top of the
stack. The parser then reduces β to the left side of the appropriate production. The parser repeats this cycle
until it has detected an error (or) until the stack contains the start symbol and the input is empty.
A Shift-reduce parser uses 4-functions while passing a string, they are
Shift: In a Shift action, the input symbol is shifted onto the top of the stack.
Reduce: In a Reduce action, replacing the right side of a production with its left side.
Accept: In an Accept action, the parser announces successful completion of parsing.
Error: In an Error action, the parser discovers that a syntax error has occurred and calls an error
recovery routine.
Example: E E + E | E * E | ( E ) | id
we can take input string = id * id + id
Stack input Action
$ id * id + id$ shift
$id *id + id$ reduce Eid
$E *id + id$ shift
$E* id + id$ shift
$E*id +id$ reduce Eid
$E*E +id$ reduce EE*E
$E +id$ shift
$E+ id$ shift
$E +id $ reduce Eid
$E+E $ reduce EE+E
$E $ Accepted
Conflicts during Shift-Reduce parsing: Shift-reduce parser has two conflicts. They are,
1. Shift/reduce conflict
2. Reduce/reduce conflict
For the following grammar
E E + E | E * E | ( E ) | id
Shift/reduce conflict:- When the stack and the input buffer contains the contents as shown below
Stack input
$E*E +id$
The parser can take the action “Shift +” or “Reduce E E*E “. So it known as Shift/Reduce conflict. To
Solve this problem it gives first preference to “Shift + “.
Reduce/reduce conflict:- When the stack and the input buffer contains the contents as shown below
Stack input
$E*E+E $
The parser can take the action “Reduce EE+E” or “Reduce EE*E “. So it known as Reduce/Reduce
conflict. As Bottom-up parsing uses right most derivation in reverse it gives first preference to “Reduce
EE*E “.
Operator precedence parsing:
It is a bottom up parsing. A very small no of grammars are parsed in this technique. “The grammar that do not
have production right side is ε or has two adjacent non terminals. The grammar is called operator grammar”.
Example: EEAE|(E)|-E|id
A+|-|*|/|↑
The above grammar is not an operator grammar, because the right side EAE has two consecutive non-
terminals.
Example: EE+E|E-E|E*E|E/E|E↑ E|(E)|-E|id
The above grammar is operator grammar.
In operator precedence parsing, we define three disjoint precedence relations,<.,=,and.>.between certain pair
of terminals.
Relation Meaning
a<·b a “yields precedence to “b
a=b A ”has the same precedence
as” b
a·>b A ”takes precedence over” b
1. <∙ When used between two terminals, the <.relation indicates that the first terminal has less
precedence when compared to the second terminal.
2. ∙> This solution is used between two terminals only when the first terminal has more precedence than
the second one.
3. = This relation is used between two terminals only which same precedence.
Rules for operator precedence relation:
1. $ is always less than any operator or symbol
2. All operators has less precedence than the identifier
3. For two operators with same precedence, we can apply left associative or right associative rule.
4. Non terminal is less than any symbol expect $.
Construction of precedence Parse table:- The parse table is a table of size n x n where n is the no of
terminal symbols in the defined grammar.
1. Fill each cell with the relation between the vertical terminal symbol and the horizontal terminal symbol.
2. If the relation can‟t be defined, then it is marked as an error.
Operator precedence parser:- The operator precedence parser uses a parsing table called operator
precedence parsing table for making informed decisions to identify the correct handle during the reduction
step. As operator grammar doesn‟t have ε productions the selection of right production for reduction is
simplified. The operator precedence parser has the following components.
1. An input buffer that contains a string to be parsed followed by a $.
2. A stack containing a sequence of grammar symbols with a $ at the bottom of the stack.
3. An operator precedence parsing table
Example:- Refer Class notes.
Operator precedence parsing algorithm:
Input: An input string w and a table of precedence relations
Output: If w is in L(G) otherwise error.
Method:
Initially the stack contains $ and the input buffer the string w$.
Set input pointer to point to the first symbol of w$;
Let „a‟ be the top of the stack symbol and let „b‟ be the current input symbol;
If a<.b or a=b begin
Push b onto the stack and advance the input pointer;
End;
Else if a.>b then
Pop the stack
Until the top stack terminal is related by <.
Else error()
Operator Precedence function: Compilers using operator precedence parsing need not store the table of
precedence relations. In most cases, the table can be encoded by two precedence functions „f‟ and „g‟ that
maps the terminal symbols to integers.
1. f(a)<g(b) whenever a<.b
2. f(a)=g(b) whenever a=b
3. f(a)>g(b) whenever a.>b
Using these functions, we can determine the precedence relation between „a‟ and „b‟ comparing the integer
values of f(a) and f(b).
Algorithm:
Input: An operator precedence parsing table.
Output: Operator Precedence functions.
Method:
1. Create symbols fa and ga , where „a‟ is a terminal or $.
2. Partition the created symbols into as many groups as possible. Such that if there is a relation in table
marked as a=b then both fa and gb are in the same group.
3. Create a directed graph whose nodes are the groups found in step2.
1 .if a<.b, place an edge from gb to fa.
2. If a.>b , place an edge from fb to gb.
4. If the group constructed in step 3 has a cycle, there no precedence functions exist. If there are no cycles,
let f(a) be the length of the longest path beginning at the g(a) be the length of the longest path from the group
of ga.
Example: Refer the Class Notes
Advantages of operator precedence parser:-
1. It is very simple.
2. Operator precedence parser eliminates the shift reduce parser conflicts by defining the relation between the
operators.
3. By using the operator precedence parsing table these parsers can be very fast.
4. Easy to identify the next handle.
Disadvantages of operator precedence parser:-
1. Unary operators are difficult to resolve.
2. Operator precedence parsing table can be constructed only for operator grammar.
3. It works only for small class of grammar.
Difference between LR and LL parsers: -
LR Parsers LL Parsers
1. These are bottom up parsers. These are top down parsers.
2. This is complex to implement. This is simple to implement.
3. These are efficient parsers. These are less efficient parsers.
4. It is applied to a large class It is applied to small class of languages.
of programming languages.
LR(K)/LR parsing: Bottom-up syntax analysis technique that can be used to parse a large class of context-
free grammars. This technique is called LR(k) parsing. A general notation for the representation of LR
grammars is given by LR(K).
Where, L- left to right scanning of the input
R- R for constructing a rightmost derivation in reverse
K- The no. of input symbols of look head that are used in making parsing decisions.
WHY LR PARSERS or Advantages of LR Parser:
LR parsers can be constructed to recognize virtually all programming language constructs for which
context free grammars can be written.
The LR-parsing method is the most general non-backtracking shift-reduce parsing method known,
yet it can be implemented as efficiently as other, more primitive shift-reduce methods.
An LR parser can detect a syntactic error as soon as it is possible to do soon a left-to-right scan of the
input.
The class of grammars that can be parsed using LR methods is a proper superset of the class of
grammars that can be parsed with predictive or LL methods.
Construct LR Parsing:
The above fig, it consists of an input,an output, a stack and a parsing table that has two parts(action and goto ).
The parsing program reads characters from an input buffer one at a time. The program uses a stack to store
a string of the form S0X1S1X2S2....XmSm. where Sm is on top of the stack. Each Xi is a grammar symbol and Si
is a symbol called a state. The combination of the state symbol on top of the stack and the current input
symbol are used to index the parsing table and determine the shift reduce parsing decision.
Behavior of the LR Parser:- The next move of the parser is determined by reading ai, the current input
symbol, and Sm, the state on top of the stack, and then consulting the entry ACTION[Sm, ai ]in the parsing
action table. The action resulting after each of the four types of move are as follows
1. If ACTION[Sm, ai ]= shift Si, the parser executes a shift move; it shifts the next state „s‟ onto the stack and
advance the input pointer.
2. If ACTION[Sm, ai ]= reduce A β, then the parser executes a reduce move using the production A β.
If β has r symbols, then the parser first popped r symbols off the stack.
3. If ACTION[Sm, ai ]= accept, parsing is completed.
4. If ACTION[Sm, ai ]= error, the parser has discovered an error and calls an error recovery routine.
LR Parsing Algorithm: Refer class Notes
LR parsing table can be constructed in 3 ways:
1. SLR (LR(0))
2. CLR (Canonical LR)
3. LALR (Look ahead LR)
SLR≤LALR≤CLR This means, CLR can parse highest no. of grammars than LALR and SLR.
SLR(Simple LR Parsing):
step 1: An augmented grammar:
If 'G' is a grammar with start symbol S, then G', the augmented grammar for G, is G with a new start
symbol S' and production S'S. The purpose of this new starting production is to indicate to the parser when
it should stop parsing and announce acceptance of the input. That is, acceptance occurs when and only when
the parser is about to reduce by S'S.
step 2: LR(0) items: If we add dot(.) at some position on the right side of the production, then it is an item.
A.xyz
Ax.yz
Axy.z
Axyz.
Item identify the current position in the scanning process. The production A->ε generates only one item, A.
Closure operation: If 'I' is a set of items for a grammar G, then Closure(I) is the set of items constructed
from 'I' by the following rules:
i. Every item in 'I' is in Closure of (I)
ii. If Aα.Bβ is in Closure(I) and Bϒ is a production, then add the item B.ϒ to 'I', if it is not
already there. We apply this rule until no more new items can be added to Closure of (I)
goto operation: It is also a function which takes two arguments.
I (set of items)
X (grammar symbol)
goto(I,X) is defined to be the closure of the set of items [AαX.β] such that [Aα.Xβ ] is in 'I'.
Algorithm: Refer class Notes
Example: Refer Class notes
CLR: It is similar to SLR parsing table definition of item changes in SLR & CLR.Items in CLR have two
parts and are of the form A α.Bβ, a|b|c|$
One part is Aα.Bβ Other part is one (or) terminals
Aα.Bβ,a
Aα.Bβ,b
Aα.Bβ,c
we must add the item as it is
Aα.Bβ,a
B.Ɣ,First(βa)
Algorithm: Refer class Notes
LALR: The CLR parser avoids conflicts in the parse table. But it produces more no. of states when compared
to SLR parser. Hence more space is occupied by the table in memory. So LALR parsing can be used. Here,
the tables obtained are smaller than CLR parse tables. But it is also efficient as CLR parsers. This is because
LALR parse tables are constructed from LR(1) collection of items. Here LR(1) items that have same
productions but different look-a-heads are combined to form a single set of items.
Algorithm: Refer class Notes
The "Dangling-Else" Ambiguity: Refer Class Notes
Error Recovery in LR Parsing Refer Class Notes