Unit-4 Syntax Analysis
Role of parser
Token Parse
Source Lexical Parse tree Rest of front IR
Parser
tree
program analyzer end
Get next
token
Symbol table
▪ Parser obtains a string of token from the lexical analyzer and
reports syntax error if any otherwise generates syntax tree.
▪ There are two types of parser:
1. Top-down parser
2. Bottom-up parser
Context Free Grammar
Context Free Grammars
▪ A context free grammar (CFG) is a 4-tuple 𝐺 = (𝑉, Σ, 𝑆, 𝑃) where,
𝑉 is finite set of non terminals,
Σ is finite set of terminals,
𝑆 is an element of 𝑉 and it’s a start symbol,
𝑃 is a finite set formulas of the form 𝐴 → 𝛼 where 𝐴 ∈ 𝑉 and
𝛼 ∈ (𝑉 ∪ Σ)∗
▪ Example:
expr → expr op expr | (expr) | -expr | id
op → + | - | * | / | ↑
Terminals: id + - * / ↑ ( ) Non terminals: expr, op Start symbol: expr
Derivation & Ambiguity
Derivation
▪ Derivation is used to find whether the string belongs to a given
grammar or not.
▪ Types of derivations are:
1. Leftmost derivation
2. Rightmost derivation
Leftmost derivation
• A derivation of a string 𝑊 in a grammar 𝐺 is a left most derivation
if at every step the left most non terminal is replaced.
• Grammar: E→E+E | E-E | E*E | E/E | a Output string: a*a-a
E E
→E-E
E - E
→E*E-E
Parse tree represents the
→a*E-E structure of derivation
E * E a
→a*a-E
a a
→a*a-a
Parse tree
Leftmost Derivation
Rightmost derivation
• A derivation of a string 𝑊 in a grammar 𝐺 is a right most
derivation if at every step the right most non terminal is replaced.
• It is all called canonical derivation.
• Grammar: E→E+E | E-E | E*E | E/E | a Output String: a*a-a
E E
→E*E
E * E
→E*E-E
→E*E-a a E - E
→E*a-a
a a
→a*a-a
Parse Tree
Rightmost Derivation
Ambiguous Grammar
Precedence & associativity of operators
Operator Precedence Associative
↑ 1 right
*, / 2 left
+, - 3 left
Ambiguous grammar
• Ambiguous grammar is one that produces more than one leftmost
or more then one rightmost derivation for the same sentence.
• Grammar: E→E+E | E*E | (E) | a Output string: a+a*a
E E
E E
→E*E →E+E
E * E E + E
→E+E*E →a+E
→a+E*E E + E a →a+E*E a E * E
→a+a*E →a+a*E
→a+a*a a a →a+a*a a a
Here, Two leftmost derivation for string a+a*a is possible because
Rule of precedence is not maintained.
Ambiguous grammar
• Ambiguous grammar is one that produces more than one leftmost
or more then one rightmost derivation for the same sentence.
• Grammar: E→E+E | (E) | a Output string: a+a+a
E E
E E
→E+E →E+E
E + E E + E
→E+E+E →a+E
→a+E+E E + E a →a+E+E a E + E
→a+a+E →a+a+E
→a+a+a a a →a+a+a a a
Here, Two leftmost derivation for string a+a+a is possible because
Rule of associativity is not maintained.
Eliminating Ambiguity
▪ There are two causes of ambiguity in grammar.
1. Precedence is not maintained in grammar.
2. Associativity is not maintained in grammar.
Eliminating Ambiguity
▪ Example: E→E+E | E*E | (E) | a
Step-1
E→E+E | T
T→T*T | F
F→(E) | a
Step-2
E→E+T | T
Unambiguous Grammar
T→T*F | F
F→(E) | a
Left recursion
A grammar is said to be left recursive if it has a non terminal 𝐴 such
that there is a derivation 𝐴→𝐴𝛼 for some string 𝛼.
Types of Left recursion
Immediate Left recursion Indirect Left recursion
Left Recursion Elimination
𝐴 → 𝐴𝛼
𝛼 |𝛽 𝐴 → 𝛽𝐴’
𝐴’→ 𝛼𝐴’| 𝜖
Examples: Left Recursion Elimination
E→E+T | T
E→TE’
E’→+TE’ | ε
T→T*F | F
T→FT’
T’→*FT’ | ε
X→X%Y | Z
X→ZX’
X’→%YX’ | ε
Examples: Left Recursion Elimination
S→ Aa | b
A→ Ac | Sd | ε
Here, Non terminal S is left recursive because:
S→ Aa → Sda
S→ Aa | b
To remove indirect left A→ Ac
recursion replace S A→ Sd A→Ac | Aad | bd | ε
with productions of S
A→ ε
Now, remove left recursion
S→ Aa | b
S→ Aa | b
A→Ac | Aad | bd | ε A→ bdA’ | A’
A’→ cA’ | adA’ | ε
Left Factoring
Left factoring is a grammar transformation that is useful for producing a grammar
suitable for predictive parsing.
Algorithm to left factor a grammar
Input: Grammar G
Output: An equivalent left factored grammar.
Method:
For each non terminal A find the longest prefix 𝛼 common to two or more of its
alternatives. If 𝛼 ≠∈, i.e., there is a non trivial common prefix, replace all the A
productions 𝐴→ 𝛼𝛽1 𝛼𝛽2 … … … … . . 𝛼𝛽𝑛 𝛾 where 𝛾 represents all alternatives
that do not begin with 𝛼 by
𝐴 → 𝛼 𝐴′| 𝛾
𝐴′ → 𝛽1 | 𝛽2 | … … … … . |𝛽𝑛
Here A' is new non terminal. Repeatedly apply this transformation until no two
alternatives for a non-terminal have a common prefix.
Left Factoring
Left factoring is a grammar transformation that is useful for
producing a grammar suitable for predictive parsing.
𝐴 → 𝛼𝛽 | 𝛼δ A
Output string: 𝛼 δ
𝛼 𝛽 Backtrack
𝛼 δ
Left Factoring
𝐴→ 𝛼𝛽|𝛼 δ 𝐴 → 𝛼 𝐴′
𝐴′ → 𝛽 | δ
Example: Left Factoring
S→aAB | aCD
S→aS’
S’→AB | CD
A→ xByA | xByAzA | a
A→ xByAA’ | a
A’→ Є | zA
A→ aAB | aA |a
A→aA’
A’→AB | A | 𝜖
A’→AA’’ | 𝜖
A’’→B | 𝜖
Example: Left Recursion
Example: Left Factoring