Module 5

The document discusses the theory of computation, focusing on normal forms of context-free grammars (CFGs), specifically Chomsky Normal Form (CNF) and Greibach Normal Form (GNF). It outlines the transformation processes for CFGs to achieve these normal forms, including the elimination of ε-productions, unit productions, and useless symbols. Additionally, it provides algorithms and examples for converting CFGs to CNF and GNF, highlighting their significance in parsing and computational efficiency.

Theory of Computation

Course Code:
CSE1008

By:
Dr. Monali Bordoloi,
Assistant Professor Senior Grade 2
1 SCOPE.
02/07/2025 08:18 AM
Module 5
Normal Forms of CFG
Normal forms for CFGs: CNF and GNF; Closure properties of CFLs; Decision properties of CFLs: Emptiness, Finiteness and Membership; Pumping lemma for CFLs

Normal Forms of CFGs
CFGs can be transformed into certain standardized forms, known as normal forms, which
• simplify the CFG's structure (less complexity), and
• make the CFG easier to analyze or use in algorithms.

Popular normal forms of CFG:
• Chomsky Normal Form (CNF)
• Greibach Normal Form (GNF)
Others:
• Kuroda Normal Form
• Extended Chomsky Normal Form

Each normal form imposes specific restrictions on the production rules of the grammar.

Every CFG that does not generate the empty string can be transformed into an equivalent grammar in CNF or GNF.
• "Equivalent" here means that the two grammars generate the same language.
Chomsky Normal Form
Goal
To show that every CFL (without ε) is generated by a CFG in which all productions are of the form:
A → BC or
A → a,
where A, B, C are variables and a is a terminal. For productions of the form A → BC:
• the RHS consists only of variables,
• the RHS has length exactly 2, and
• the start symbol S does not occur in the RHS.

Key Properties of CNF

• A single CFG can be converted into different equivalent CNF grammars.
• The CNF grammar generates the same language as the original CFG (apart from ε).
• CNF is widely used in parsing algorithms such as:
   • the Cocke-Younger-Kasami (CYK) algorithm for membership checking, i.e., it checks in polynomial time whether a given string belongs to the language of a CFG;
   • bottom-up parsers in compilers.
• For a string of length n ≥ 1, every derivation in a CNF grammar takes exactly 2n − 1 steps (n − 1 steps of the form A → BC and n steps of the form A → a).
• Any CFG that does not generate ε has an equivalent CNF grammar.
Chomsky Normal Form
What are the different ways of simplifying CFGs? [Refer to Module 4 PPT]

Simplification of CFG
1. The elimination of useless symbols: variables or terminals that do not appear in any derivation of a terminal string from the start symbol.
2. The elimination of ε-productions, those of the form A → ε for some variable A.
3. The elimination of unit productions, those of the form A → B for variables A and B.

Result of all three elimination stages: a simplified grammar that is ready for conversion to CNF

Given a CFG G = (V, Σ, S, P),
• G1 = (V′, Σ, S, P′) is the grammar obtained after eliminating ε-productions, unit productions, and useless symbols from G. The CNF grammar is then built from G1.

Note: The eliminations are applied in a fixed order: first ε-productions, then unit productions, then useless symbols. Applying the eliminations in a different order may result in a grammar that does not have all the desired properties.
Chomsky Normal Form
Theorem: If G is a CFG that generates a language containing at least one string other than ε, then there is another CFG G1 such that:
L(G1) = L(G) – {ε},
G1 has no ε-productions,
and G1 has neither unit productions nor useless symbols.
Proposition
For any non-empty context-free language L, there is a grammar G such that L(G) = L and each rule in G has one of the forms shown below:
1. S → ε, where S is the start symbol (present iff ε ∈ L)
2. A → a, a non-terminal generating a single terminal a ∈ Σ, i.e., |a| = 1
(if the RHS has length 0, A must be the start symbol)
3. A → BC, a non-terminal generating two non-terminals, where neither B nor C is the start symbol (the start symbol cannot be in the RHS)
Also, G doesn't contain any useless symbols.

Example:
1. G1 = {S → AB, S → c, A → a, B → b}
2. G2 = {S → aA, A → a, B → c}
• The production rules of grammar G1 satisfy the rules specified for CNF, so the grammar G1 is in CNF.
• The production rules of grammar G2 do not satisfy the rules specified for CNF, as S → aA contains a terminal followed by a non-terminal. So, the grammar G2 is not in CNF.
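
The check applied to G1 and G2 above can be sketched in Python. This is only an illustration under stated assumptions: the grammar is stored as a dictionary from each variable to a list of right-hand sides (tuples of symbols), every symbol that is not a key of the dictionary is treated as a terminal, and the function name `is_cnf` is hypothetical.

```python
def is_cnf(rules, start="S"):
    """Return True if every production has one of the CNF shapes.

    rules: dict mapping a variable to a list of bodies (tuples of symbols).
    Assumption: any symbol that is not a key of `rules` is a terminal.
    """
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                # Only the start symbol may derive the empty string.
                if head != start:
                    return False
            elif len(body) == 1:
                # A -> a: the single symbol must be a terminal.
                if body[0] in variables:
                    return False
            elif len(body) == 2:
                # A -> BC: two variables, neither of them the start symbol.
                if not all(s in variables and s != start for s in body):
                    return False
            else:
                return False
    return True


G1 = {"S": [("A", "B"), ("c",)], "A": [("a",)], "B": [("b",)]}
G2 = {"S": [("a", "A")], "A": [("a",)], "B": [("c",)]}
print(is_cnf(G1), is_cnf(G2))
```

G2 is rejected because of the body `("a", "A")`, which mixes a terminal with a variable.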
CFG to CNF
Algorithm:
Step 1: Eliminate the start symbol from the RHS.
If the start symbol S is at the right-hand side of any production, create a new production
as:
S1 → S
where S1 is the new start symbol.

Step 2: In the grammar, remove the null, unit and useless productions. [Strictly follow
the order of eliminations] [Refer to Module 4 ppt]

Step 3: Eliminate terminals from the RHS of the production if they exist with other
non-terminals or terminals.
For example, production S → aA can be decomposed as:
S → RA
R→a

Step 4: Eliminate RHS with more than two non-terminals.


For example, S → ASB can be decomposed as:
S → RB
R → AS
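
Steps 3 and 4 can be sketched as two small Python functions. This is a minimal sketch, not the only way to do it: the dictionary representation, the function names, and the fresh variable names `T_a` and `N1, N2, ...` are assumptions made for the illustration, and the code assumes those fresh names do not clash with existing variables.

```python
def replace_terminals(rules):
    """Step 3: in every body of length >= 2, replace terminal a by a fresh variable T_a."""
    variables = set(rules)
    new_rules = {head: [] for head in rules}
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) >= 2:
                new_body = []
                for sym in body:
                    if sym in variables:
                        new_body.append(sym)
                    else:
                        name = "T_" + sym          # hypothetical naming scheme
                        new_rules.setdefault(name, [(sym,)])
                        new_body.append(name)
                body = tuple(new_body)
            new_rules[head].append(body)
    return new_rules


def binarize(rules):
    """Step 4: split bodies longer than 2, grouping the two leftmost symbols first."""
    new_rules = {head: [] for head in rules}
    counter = 0
    for head, bodies in rules.items():
        for body in bodies:
            while len(body) > 2:
                counter += 1
                name = f"N{counter}"               # hypothetical fresh variable
                new_rules[name] = [body[:2]]
                body = (name,) + body[2:]
            new_rules[head].append(body)
    return new_rules


print(replace_terminals({"S": [("a", "A")], "A": [("a",)]}))
print(binarize({"S": [("A", "S", "B")]}))
```

On S → ASB, `binarize` produces S → N1 B and N1 → AS, which mirrors the decomposition S → RB, R → AS shown above.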
CFG to CNF
Example 1: Production Rules of the CFG are :
S → ASB
A → aAS|a|ε
B → SbS|A|bb

Step 1: We will create a new production S1 → S, as the start symbol S appears on the RHS. The
grammar will be:
S1->S
S → ASB
A → aAS|a|ε
B → SbS|A|bb

Step 2: As grammar G1 contains A → ε null production, its removal from the grammar yields:
S1->S
S → ASB|SB
A → aAS|aS|a
B → SbS| A|ε|bb
Now, it creates null production B→ ε, its removal from the grammar yields:
S1->S
S → AS|ASB| SB| S
A → aAS|aS|a
B → SbS| A|bb
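
The two ε-removal rounds above can be reproduced in one pass by first computing all nullable variables. The following Python sketch assumes the same dictionary representation (variable to list of tuples, with `()` standing for ε); the function names are hypothetical.

```python
from itertools import product


def nullable_variables(rules):
    """Variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            if head in nullable:
                continue
            # An empty body, or a body made only of nullable variables.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon(rules):
    """Drop every ε-production and add the bodies obtained by omitting nullable symbols."""
    nullable = nullable_variables(rules)
    new_rules = {}
    for head, bodies in rules.items():
        out = []
        for body in bodies:
            # Each nullable symbol may be kept or dropped; others are always kept.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                new_body = tuple(sym for part in choice for sym in part)
                if new_body and new_body not in out:
                    out.append(new_body)
        new_rules[head] = out
    return new_rules


grammar = {
    "S": [("A", "S", "B")],
    "A": [("a", "A", "S"), ("a",), ()],
    "B": [("S", "b", "S"), ("A",), ("b", "b")],
}
for head, bodies in remove_epsilon(grammar).items():
    print(head, "->", " | ".join("".join(b) for b in bodies))
```

For the grammar of Example 1 (without the S1 rule) this yields the same productions as the second grammar above: S → ASB|AS|SB|S, A → aAS|aS|a, B → SbS|A|bb.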

Step 2: Now, we have unit production B->A. Its removal from the grammar yields:
S1->S
S → AS|ASB| SB| S
A → aAS|aS|a
B → SbS|bb|aAS|aS|a
Also, remove the unit production S1 → S. Its removal from the grammar yields:
S1-> AS|ASB| SB| S
S → AS|ASB| SB| S
A → aAS|aS|a
B → SbS|bb|aAS|aS|a
Also, removal of unit production S->S and S1->S from the grammar yields:
S1-> AS|ASB| SB
S → AS|ASB| SB
A → aAS|aS|a
B → SbS|bb|aAS|aS|a

Here, we do not have any useless symbols or variables.
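
The unit-production removal carried out above can be sketched as follows. The sketch assumes an ε-free grammar in the dictionary representation (variable to list of tuples); `remove_unit_productions` is a hypothetical helper name. For each variable it collects every variable reachable through unit productions and copies over their non-unit bodies.

```python
def remove_unit_productions(rules):
    """Replace unit productions A -> B by the non-unit bodies reachable through them."""
    variables = set(rules)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    new_rules = {}
    for head in rules:
        reachable = [head]
        for v in reachable:              # the list grows while we iterate over it
            for body in rules[v]:
                if is_unit(body) and body[0] not in reachable:
                    reachable.append(body[0])
        bodies = []
        for v in reachable:
            for body in rules[v]:
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        new_rules[head] = bodies
    return new_rules


grammar = {
    "S1": [("S",)],
    "S": [("A", "S"), ("A", "S", "B"), ("S", "B"), ("S",)],
    "A": [("a", "A", "S"), ("a", "S"), ("a",)],
    "B": [("S", "b", "S"), ("A",), ("b", "b")],
}
for head, bodies in remove_unit_productions(grammar).items():
    print(head, "->", " | ".join("".join(b) for b in bodies))
```

Run on the grammar obtained after ε-removal, it gives the same productions as the last grammar of Step 2.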



Step 3: In production rule A->aAS |aS and B-> SbS|aAS|aS, terminals a and b exist on RHS
with non-terminals. Removing them from the RHS yields:
S1-> AS|ASB| SB
S → AS|ASB| SB
A → XAS|XS|a
B → SYS|bb|XAS|XS|a
X →a
Y→b

Also, B->bb can’t be part of CNF, removing it from grammar yields:


S1-> AS|ASB| SB
S → AS|ASB| SB
A → XAS|XS|a
B → SYS|VV|XAS|XS|a
X→a
Y→b
V→b
Step 4: In production rule S1 → ASB, the RHS has more than two symbols. Removing it from the grammar yields:
S1 → AS|PB|SB
S → AS|ASB|SB
A → XAS|XS|a
B → SYS|VV|XAS|XS|a
X → a
Y → b
V → b
P → AS

Similarly, solving S → ASB yields:
S1 → AS|PB|SB
S → AS|QB|SB
A → XAS|XS|a
B → SYS|VV|XAS|XS|a
X → a
Y → b
V → b
P → AS
Q → AS

Similarly, solving A → XAS yields:
S1 → AS|PB|SB
S → AS|QB|SB
A → RS|XS|a
B → SYS|VV|XAS|XS|a
X → a
Y → b
V → b
P → AS
Q → AS
R → XA
Similarly, B → SYS has more than two symbols. Removing it from the grammar yields:
S1 → AS|PB|SB
S → AS|QB|SB
A → RS|XS|a
B → TS|VV|XAS|XS|a
X → a
Y → b
V → b
P → AS
Q → AS
R → XA
T → SY

Similarly, B → XAS has more than two symbols. Removing it from the grammar yields:
S1 → AS|PB|SB
S → AS|QB|SB
A → RS|XS|a
B → TS|VV|US|XS|a
X → a
Y → b
V → b
P → AS
Q → AS
R → XA
T → SY
U → XA
This is the required CNF.
CFG to CNF
Example 2: Production Rules of the CFG are :
S → a | aA | B
A → aBB | ε
B → Aa | b

Step 1: S is not present in the RHS. Elimination is not required.
Step 2: As grammar G1 contains A → ε null production, its removal from the grammar yields:
S → a | aA | B
A → aBB
B → Aa | b | a
Now, as grammar G1 contains Unit production S → B, its removal yields:
S → a | aA | Aa | b
A → aBB
B → Aa | b | a
Step 3: In the production rules, S → aA | Aa, A → aBB and B → Aa, terminal a exists on RHS
with non-terminals. So, we will replace terminal a with X:
S → a | XA | AX | b
A → XBB
B → AX | b | a
X→a
Step 4: In the production rule A → XBB, the RHS has more than two symbols, removing it from the
grammar yields:
S → a | XA | AX | b
A → RB
B → AX | b | a
X→a
R → XB

This is the required CNF


CFG to CNF
Practice Problems: Convert the given CFG to CNF. Try Yourself!!!

Q 3: Production Rules of the CFG are :
S → AACD
A → aAb | ε
C → ac | a
D → aDa | bDb | ε

Q 4: Production Rules of the CFG are :
S → ASA | aB
A → B | S
B → b | ε

Q 5: Production Rules of the CFG are :
S → aAD
A → aB / bAB
B → b
D → d

Q 6: Production Rules of the CFG are :
S → 1A / 0B
A → 1AA / 0S / 0
B → 0BB / 1S / 1
Greibach Normal Form
A CFG is in GNF if all the production rules satisfy one of the following conditions:
1. A start symbol generating ε.
For example, S → ε.
2. A non-terminal generating a terminal.
For example, A → a.
3. A non-terminal generating a terminal which is followed by any number of non-terminals.
For example, S → aASB.
Theorem:
If G = (V, T, R, S) is a CFG, then we can construct another CFG G1 = (V1, T, R1, S) in GNF such that
L(G1) = L(G) - {ε} and R1 is of the form A → aα, where a ∈ T and α ∈ V ∗, i.e., α is a string of zero
or more variables.

For example:
1. G1 = {S → aAB | aB, A → aA | a, B → bB | b}
2. G2 = {S → aAB | aB, A → aA | ε, B → bB | ε}
• The production rules of grammar G1 satisfy the rules specified for GNF, so the grammar G1 is in GNF.
• The production rules of grammar G2 do not satisfy the rules specified for GNF, because A → ε and B → ε are ε-productions (only the start symbol can generate ε). So the grammar G2 is not in GNF.
• For a given grammar, there can be more than one equivalent grammar in GNF.
• Every CFG can be converted to a grammar in GNF that generates the same language, apart from ε (which needs the extra rule S → ε).
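
The three GNF conditions can be checked mechanically. Below is a small Python sketch under the assumption that the grammar is a dictionary from each variable to a list of bodies (tuples of symbols, `()` for ε) and that every symbol which is not a key is a terminal; `is_gnf` is a hypothetical name.

```python
def is_gnf(rules, start="S"):
    """Return True if every body is ε (start symbol only) or a terminal followed by variables."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                if head != start:
                    return False
                continue
            if body[0] in variables:
                return False              # must begin with a terminal
            if any(s not in variables for s in body[1:]):
                return False              # the rest must be variables only
    return True


G1 = {"S": [("a", "A", "B"), ("a", "B")],
      "A": [("a", "A"), ("a",)],
      "B": [("b", "B"), ("b",)]}
G2 = {"S": [("a", "A", "B"), ("a", "B")],
      "A": [("a", "A"), ()],
      "B": [("b", "B"), ()]}
print(is_gnf(G1), is_gnf(G2))
```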
Greibach Normal Form
Importance of GNF

1. Simplified Grammars: GNF transforms CFGs into a simpler form where all rules
start with a terminal symbol, followed by zero or more non-terminal symbols.
2. Efficient Parsing: This structure makes it easier to build top-down parsers, which
analyze the input string from the start symbol, making it suitable for compiler design
and other parsing algorithms.
3. Real-time PDA: The conversion to GNF is crucial in proving that every CFL can be
recognized by a real-time PDA, a specific type of automaton that reads an input
symbol for every transition.
4. Elimination of Left Recursion: GNF, by its nature, eliminates left recursion, a
common problem encountered in parsing.
5. Understanding Language Structure: By standardizing the grammar structure, GNF
helps in understanding the language's characteristics and facilitates analysis of the
language's properties.
6. Foundation for Compiler Design: GNF provides a solid foundation for compiler
design by facilitating the development of efficient parsing algorithms, ensuring easier
analysis and optimization of compilers.
Greibach Normal Form
To transform a CFG to GNF, we have to eliminate left recursion.

Left Recursive Grammar:
A grammar that has a left-recursive pair of productions
A → Aα / β
where β does not begin with A.

Elimination of Left Recursion:

Original Formula:
We can eliminate left recursion by replacing the pair of productions with
A → βZ
Z → αZ / ε
(Right Recursive Grammar)

This right-recursive grammar generates the same strings as the left-recursive grammar.

Formula to be followed: (as null productions will be removed)
If A → Aα1|Aα2| . . .|Aαr|β1|β2| . . .|βs, then replace the above rules by
A → βi|βiZ, 1 ≤ i ≤ s
Z → αi|αiZ, 1 ≤ i ≤ r
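
The ε-free formula can be written as a short Python function. This is a sketch under assumptions: bodies are non-empty tuples of symbols, no body is the unit production A → A, and the helper name and the choice of the new variable name are hypothetical.

```python
def remove_left_recursion(head, bodies, new_var="Z"):
    """Apply A -> βi | βi Z and Z -> αi | αi Z to the productions of one variable."""
    # Bodies A α become the α parts; all other bodies are the β parts.
    alphas = [b[1:] for b in bodies if b and b[0] == head]
    betas = [b for b in bodies if not b or b[0] != head]
    if not alphas:
        return {head: list(bodies)}       # nothing to do
    head_rules = betas + [b + (new_var,) for b in betas]
    z_rules = alphas + [a + (new_var,) for a in alphas]
    return {head: head_rules, new_var: z_rules}


result = remove_left_recursion(
    "A4", [("b", "A3", "A4"), ("A4", "A4", "A4"), ("b",)]
)
for var, bodies in result.items():
    print(var, "->", " | ".join(" ".join(b) for b in bodies))
```

The sample call uses the productions A4 → bA3A4 | A4A4A4 | b and returns A4 → bA3A4 | b | bA3A4Z | bZ together with Z → A4A4 | A4A4Z.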
Greibach Normal Form
Algorithm:

Step 1: Eliminate null productions, unit productions and useless symbols from the grammar G and then
construct a G0 = (V0, T, R0, S) in CNF generating the language L(G0) = L(G) − {ε}

Step 2: Rename the variables like A1, A2, . . . An starting with S = A1

Step 3: Modify the rules in R0 so that if Ai → Aj γ ∈ R0 then j > i

Step 4: Starting with A1 and proceeding to An this is done as follows:


(a) Assume that productions have been modified so that for 1 ≤ i ≤ k, Ai →Aj γ ∈ R0 only if j > i
(b) If Ak → Ajγ is a production with j < k, generate a new set of productions substituting for the Aj
the body of each Aj production.
(c) Repeating (b) at most k − 1 times we obtain rules of the form Ak →Ap γ, p ≥ k
(d) Replace rules Ak → Akγ by removing left-recursion.

Step 5: Modify the rules Ai → Ajγ for i = n−1, n−2, . . ., 1 into the desired form, and at the same time change the Z production rules.
Greibach Normal Form
Example 1: Convert the following grammar G into GNF.
S → XA|BB
B → b|SB
X→b
A→a

1. Rewrite G in CNF
It is already in CNF

2. Re-label the variables


S with A1
X with A2
A with A3
B with A4

After re-labeling, the grammar looks like:


A1 → A2A3|A4A4
A4 → b|A1A4
A2 → b
A3 → a

3. Identify all productions which do not conform to any of the types listed below:
Ai → Ajγ such that j > i
Zi → Ajγ such that j ≤ n
Ai → aγ such that γ ∈ V∗ and a ∈ T

A4 → A1A4 ................ identified

A4 → A1A4|b.

To eliminate A1, we will use the substitution rule A1 → A2A3|A4A4.


Therefore, we have A4 → A2A3A4|A4A4A4|b

The above two productions still do not conform to any of the types in step 3

Substituting for A2 → b
A4 → bA3A4|A4A4A4|b

3. Now we have to remove left recursive production A4 → A4A4A4


A4 → bA3A4|b|bA3A4Z|bZ
Z → A4A4|A4A4Z

At this stage our grammar now looks like


A1 → A2A3|A4A4
A4 → bA3A4|b|bA3A4Z|bZ
Z → A4A4|A4A4Z
A2 → b
A3 → a
All rules now conform to one of the types in step 3.
But the grammar is still not in GNF!

All productions for A2, A3 and A4 are in GNF


for A1 → A2A3|A4A4
Substitute for A2 and A4 to convert it to GNF

3. Substitute for A2 and A4 to convert it to GNF


A1 → bA3|bA3A4A4|bA4|bA3A4ZA4|bZA4
for Z → A4A4|A4A4Z
Substitute for A4 to convert it to GNF
Z → bA3A4A4|bA4|bA3A4ZA4|bZA4|bA3A4A4Z|bA4Z|bA3A4ZA4Z|bZA4Z

4. Finally the grammar in GNF is


A1 → bA3|bA3A4A4|bA4|bA3A4ZA4|bZA4
A4 → bA3A4|b|bA3A4Z|bZ
Z → bA3A4A4|bA4|bA3A4ZA4|bZA4|bA3A4A4Z|bA4Z|bA3A4ZA4Z|bZA4Z
A2 → b
A3 → a
Greibach Normal Form
Example 2: Convert the following grammar G into GNF.
S → AB
A → BS | b
B → SA | a

The given CFG ‘G’ is in the form of CNF.

Re-labelling the variables, we get,


A1 = S
A2 = A
A3 = B

After re-labeling, the grammar looks like:


A1 → A2 A3 i < j = True
A2 → A3 A1 | b i < j = True
A3 → A1 A2 | a i < j = False
To eliminate A1, we will use the substitution rule using A1 → A2 A3
A3 → A2 A3 A2 | a i < j = False
Again, substitute A2 , by its production rule in variable A3
A3 → A3 A1 A3 A2 | b A3 A2 | a i=j

Now, we have to eliminate the left recursion for


A3 → A3 A1 A3 A2 | b A3 A2 | a
After elimination we get,
A3 → b A3 A2 | a | b A3 A2 B3 | a B3
B3 → A1 A3 A2 | A1 A3 A2 B3

All the productions of A3 are now in GNF. Substitute the A3 productions into A2:
A2 → b A3 A2 A1 | a A1 | b A3 A2 B3 A1 | a B3 A1 | b
All the productions of A2 are now in GNF. Substitute the A2 productions into A1:
A1 → b A3 A2 A1 A3 | a A1 A3 | b A3 A2 B3 A1 A3 | a B3 A1 A3 | b A3
All the productions of A1 are now in GNF. Substitute the A1 productions into B3:
B3 → b A3 A2 A1 A3 A3 A2 | a A1 A3 A3 A2 | b A3 A2 B3 A1 A3 A3 A2 | a B3 A1 A3 A3 A2 | b A3 A3 A2
B3 → b A3 A2 A1 A3 A3 A2 B3 | a A1 A3 A3 A2 B3 | b A3 A2 B3 A1 A3 A3 A2 B3 | a B3 A1 A3 A3 A2 B3 | b A3 A3 A2 B3

Final GNF:
A1 → b A3 A2 A1 A3 | a A1 A3| b A3 A2 B3 A1 A3 | a B3 A1 A3 | b A3

A2 → b A3 A2 A1 | a A1 | b A3 A2 B3 A1 | a B3 A1 | b

A3 → b A3 A2 | a | b A3 A2 B3 | a B3

B3 → b A3 A2 A1 A3 A3 A2 | a A1 A3 A3 A2 | b A3 A2 B3 A1 A3 A3 A2 | a B3 A1 A3 A3
A2 | b A3 A3 A2 | b A3 A2 A1 A3 A3 A2 B3 | a A1 A3 A3 A2 B3 | b A3 A2 B3 A1 A3
A3 A2 B3 | a B3 A1 A3 A3 A2 B3 | b A3 A3 A2 B3
Greibach Normal Form
Practice Problems: Try Yourself!!!

Example 3: Convert the following grammar G into GNF.
S → AA | 0
A → SS | 1

Example 4: Convert the following grammar G into GNF.
S → a | CD | CS
A → a | b | SS
C → a
D → AS

Example 5: Convert the following grammar G into GNF.
S → SS | (S) | a

Example 6: Convert the following grammar from CNF, into GNF
S → AB | ε
A → AB | CB | a
B → AB | b
C → AC | c
Context-Free Languages- Properties
Closure Properties
Closed Under:
1. Union operation
2. Concatenation
3. Kleene closure
4. Reversal operation
5. Homomorphism
6. Inverse homomorphism
7. Substitution
8. init or prefix operation
9. Quotient with regular language
10. Cycle operation
11. Union with regular language
12. Intersection with regular language
13. Difference with regular language

Not Closed Under:
1. Intersection
2. Complement
3. Subset
4. Superset
5. Infinite union
6. Difference, symmetric difference (XOR), NAND, NOR, or any other operation which gets reduced to intersection and complement

Decision Properties:
1. Test for membership: Decidable
2. Test for emptiness: Decidable
3. Test for finiteness: Decidable
The rest of the decision properties that are decidable for a regular language are undecidable for CFLs.
Context-Free Languages- Properties
Closure Properties
Suppose G1 and G2 are CFGs with start symbols S1 and S2 and no variables in common.
Example: For G1 we have
S1 → a S1 b | ε
For G2 we have
S2 → c S2 d | ε
Then L(G1) = {a^n b^n : n ≥ 0}. Also, L(G2) = {c^n d^n : n ≥ 0}.

Union: add a new start symbol S with S → S1 | S2.

Concatenation: add a new start symbol S with S → S1 S2.

Kleene Star: add a new start symbol S with S → S1 S | ε.

Context-free languages are not closed under intersection or complement.

Intersection with a regular language: the intersection of a CFL with a regular language is again a CFL.
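
The standard grammar constructions for union, concatenation and Kleene star can be sketched in Python. This is an illustration only: the dictionary representation, the function names, the new start symbol name `S`, and the assumption that the two grammars share no variable names are all choices made for this sketch.

```python
def union(g1, s1, g2, s2, new_start="S"):
    """Grammar for L(G1) ∪ L(G2): new start symbol with S -> S1 | S2."""
    rules = {new_start: [(s1,), (s2,)]}
    rules.update(g1)
    rules.update(g2)
    return rules


def concatenation(g1, s1, g2, s2, new_start="S"):
    """Grammar for L(G1) L(G2): new start symbol with S -> S1 S2."""
    rules = {new_start: [(s1, s2)]}
    rules.update(g1)
    rules.update(g2)
    return rules


def star(g1, s1, new_start="S"):
    """Grammar for L(G1)*: new start symbol with S -> S1 S | ε."""
    rules = {new_start: [(s1, new_start), ()]}
    rules.update(g1)
    return rules


# Assumed example grammars: a^n b^n and c^n d^n, n >= 0.
G1 = {"S1": [("a", "S1", "b"), ()]}
G2 = {"S2": [("c", "S2", "d"), ()]}
print(union(G1, "S1", G2, "S2"))
print(concatenation(G1, "S1", G2, "S2"))
print(star(G1, "S1"))
```

Each construction only adds rules for a new start symbol, which is why the result is again a context-free grammar.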


Context-Free Languages- Properties
Decision Properties
Decision properties are questions about a language for which we ask whether an algorithm exists, i.e., whether the problem is decidable or not.

Ambiguity, equivalence and regularity are undecidable for CFGs.

1. Emptiness Problem
• Check whether the language generated by the given CFG is empty
OR
• Check whether the given CFG can generate any string

If no string can be derived from the grammar, then the grammar is said to be an empty grammar.

Procedure:
1. Simplify the CFG.
2. If the start symbol is in the set of useless symbols, then the grammar is empty.
3. If the start symbol is not in the set of useless symbols, remove all useless symbols and try to generate a string from the remaining grammar.
4. If it can generate a string, then the grammar is non-empty; otherwise, it is said to be an empty grammar.
Context-Free Languages- Properties
Decision Properties
1. Emptiness Problem
Example 1: Check whether the given CFG is empty or not.
S → AB | a
A→ a
B → bB
C→ a

Solution: First, separate the sets of terminals and non-terminals.

T = { a, b }
NT or V = { S, A, B, C }

Let us check the productions one by one, starting from the bottom:

C → a
C derives the terminal 'a' but is not reachable from the start symbol; hence, symbol C is useless.
B → bB
B → bbB
B → bbbB
B never derives a string of terminals; hence, symbol B is useless.
A → a
A can be useful; let us check further.

Continuing the check from the start symbol:
S → AB
S → aB
B is a useless symbol, so the production S → AB can never produce a terminal string. A appears only together with B in this production; once the production is removed, A is no longer reachable from the start symbol, so A is also a useless symbol.

S → a
This is the only useful production.

Let us make a set of useful symbols and a set of useless symbols.

Useful symbols = { S }
Useless symbols = { A, B, C }
The final grammar is: S → a

The start symbol S does not belong to the set of useless symbols; hence, this grammar is non-empty.
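
The core of the emptiness test is finding the variables that can derive some terminal string; the language is empty exactly when the start symbol is not among them. A minimal Python sketch follows, assuming the dictionary representation (variable to list of tuples) and that every variable appears as a key; the function names are hypothetical.

```python
def generating_variables(rules):
    """Variables that derive at least one string of terminals."""
    variables = set(rules)
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            if head in generating:
                continue
            for body in bodies:
                # Every symbol is a terminal or an already-generating variable.
                if all(s in generating or s not in variables for s in body):
                    generating.add(head)
                    changed = True
                    break
    return generating


def is_empty(rules, start="S"):
    """The language is empty iff the start symbol generates no terminal string."""
    return start not in generating_variables(rules)


grammar = {
    "S": [("A", "B"), ("a",)],
    "A": [("a",)],
    "B": [("b", "B")],
    "C": [("a",)],
}
print(generating_variables(grammar), is_empty(grammar))
```

For the example grammar, B is the only non-generating variable. A and C are generating but become useless for a different reason: they are not reachable from S once S → AB is discarded.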
Context-Free Languages- Properties
Decision Properties
2. Finiteness Problem
Procedure:
1. Convert the grammar into CNF (with no useless symbols).
2. Draw the CNF graph of the converted grammar.
3. Make every non-terminal (variable) a node of the graph.
4. For each production A → BC, add directed edges from A to B and from A to C.
5. Do not repeat an edge once you have marked it in the graph.
6. After constructing the whole graph, check whether it contains a cycle.
7. If the graph contains a cycle, then the language generated by the grammar is infinite; otherwise, it is finite.
Context-Free Languages- Properties
Decision Properties
2. Finiteness Problem
Example: Check whether the language of the given grammar is finite or not.
S → AB / a
A → BC / a
B → CC / b
C → a

Solution:
There are no ε-productions and no unit productions. All the variables in the above CFG are useful; not a single variable is useless. The given grammar is in CNF.

Now, draw the CNF graph: the variables become the nodes, and each arrow from a variable to a variable on its RHS becomes an edge of the graph. The edges are S → A, S → B, A → B, A → C and B → C.

NO LOOP or CYCLES!!
The language of the grammar is finite.
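
The graph test can be sketched in Python with a depth-first search for cycles. The sketch assumes a CNF grammar without useless symbols, stored as a dictionary from variables to lists of bodies; `dependency_graph` and `is_finite` are hypothetical names.

```python
def dependency_graph(rules):
    """Edge A -> B whenever variable B occurs in some body of A."""
    variables = set(rules)
    return {head: {s for body in bodies for s in body if s in variables}
            for head, bodies in rules.items()}


def is_finite(rules):
    """True if the variable graph has no cycle (CNF, no useless symbols assumed)."""
    graph = dependency_graph(rules)
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    color = {v: WHITE for v in graph}

    def has_cycle_from(v):
        color[v] = GREY
        for w in graph[v]:
            if color[w] == GREY:
                return True               # back edge: a cycle exists
            if color[w] == WHITE and has_cycle_from(w):
                return True
        color[v] = BLACK
        return False

    return not any(color[v] == WHITE and has_cycle_from(v) for v in graph)


grammar = {
    "S": [("A", "B"), ("a",)],
    "A": [("B", "C"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("a",)],
}
print(dependency_graph(grammar))
print(is_finite(grammar))
```

A grammar such as S → SA | a, A → a has the edge S → S, so the same function reports that its language is not finite.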
Context-Free Languages- Properties
Decision Properties
3. Membership Problem
• Check whether a given string belongs to the language of a given CFG.

• Use the CYK (Cocke-Younger-Kasami) algorithm.

Note: The CYK algorithm is only applicable to a CFG that is in CNF.

• After applying the CYK algorithm, look at the last field of the table (the field that covers the whole string) and check whether one of the variables in it is the grammar's start symbol.

• If one of the variables is the start symbol, then conclude that the given string is a member of the language of the given CFG. Otherwise, the given string is not a member.
Context-Free Languages- Properties
Decision Properties
3. Membership Problem
CYK Algorithm
1. The CYK algorithm is a bottom-up parsing algorithm.
2. As we move to higher rows of the table, the number of products to evaluate per field increases.
3. For a field in the nth row, we need to evaluate n − 1 products.

Procedure:
1. Check the length of the given string.
2. Construct the table using that length; let the length of the string be 'n'.
3. Make 'n' columns and 'n' rows in the table, but note that the height of the table is 'n' for the first column, 'n − 1' for the second column, and so on; in the last column, the height of the table is one.
4. After constructing the outline of the table, write the terminals of the string, in order, on the top of the table.
5. Start from the first row and look up the string's first terminal in the CNF grammar.
6. If you find that terminal on the right-hand side of a production, note the variable on the left-hand side. Mark that variable in the first field of the first row.
7. If two or more variables produce the same terminal, mark all of them as a set in that field.
8. Fill the rest of the first row in the same way.
9. For the first field of the second row, multiply two fields of the first row: the field just above the current field and the field adjacent to it.
10. Do not change the order of variables after multiplication. Look up each product in the right-hand sides of the CNF grammar.
11. Then, mark all the corresponding left-hand-side variables in that field of the second row.
12. Fill all the fields of the second row in the same way.
13. For the first field of the third row you need to multiply two times, because the field is two levels down.
14. For the first multiplication, multiply the first field of the first row with the second field of the second row.
15. For the second multiplication, multiply the first field of the second row with the third field of the first row, moving in a diagonal direction.
16. After that, look up all the products in the CNF grammar, and if any of them matches a right-hand side, write the corresponding variables in the field.
17. Fill the remaining fields of the third row in the same way.
18. As you move downwards and the level increases, fill all the fields in the same way.
19. As shown in the figure, fill all the fields of the respective rows.
20. In the circles, place the results of multiplying the corresponding fields.
Context-Free Languages- Properties
Decision Properties
3. Membership Problem
Example 1: Verify whether the string 'bab' is a member of the language of the given CFG.
S → AB / a
A → BC / a
B → CC / b
C → a

The grammar is already in CNF. Now, applying the CYK algorithm –

1) Construct the table.

2) Fill all the fields of the corresponding rows using the CNF grammar above.

For Row 1 –
• Now for first field x11, finding ' b ' in the CNF form of given CFG. We can find the ‘ b ' on the right-hand side
with the corresponding variable ' B ' present on the left-hand side.
• Now similarly for second field x12, finding ' a ' in the above CFG. We can find the ' a ' on the right-hand side
with corresponding variables ' S ', ' A ' and ' C ' present on the left-hand side.
• And same for third field x13, finding ' b 'in the above CFG. Similarly, as the first field, we were able to find the
corresponding Variable, i.e., ‘ B ’.

For Row 2-
• For the first field, i.e., x21, we need to multiply the just above field, i.e., the first field of the first row ( x11 ),
with the adjacent field, i.e., the second field of the first row
• { B } x { S , A , C } results out as –
• { BS, BA, BC }
• Now finding the above set of elements in the above Grammar, we observe that only BC matches the
corresponding variable ‘ A ’ on the left-hand side.
• Similarly, for the second field, i.e., x22, we need to multiply the just above field, i.e., the second field of the
first row ( x12 ), with the adjacent field, i.e., the third field of the first row ( x13 ).
• { S , A , C } x { B } results out as –{ SB, AB, CB }
• Now finding the above set of elements in the above Grammar, we observe that only AB matches the
corresponding variable ‘ S ’ on the left-hand side.

For Row 3 –
For the remaining field, i.e., x31, we need two products. The first is the product of the first field of the first row, i.e., x11, with the second field of the second row, i.e., x22. The second is the product of the first field of the second row, i.e., x21, with the third field of the first row, i.e., x13.
The two products are as follows –
{ B } x { S } = { BS }
{ A } x { B } = { AB }
Looking up these elements in the grammar, we observe that only AB matches a right-hand side, with the corresponding variable 'S' on the left-hand side.

The variable present in the last field of the last row is the start symbol.
Hence the given string, i.e., 'bab', is a member of the language of the given context-free grammar.
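
The table filling in this example can be sketched in Python. This is a minimal illustration, assuming a CNF grammar stored as a dictionary from variables to lists of bodies (tuples); the function name `cyk` and the table layout (row index = substring length − 1, column index = start position) are choices made for the sketch.

```python
def cyk(rules, word, start="S"):
    """Return True if `word` is derivable from `start` in the CNF grammar `rules`."""
    n = len(word)
    if n == 0:
        return False                      # ε is not handled in this sketch
    # table[l][i] holds the variables deriving word[i : i + l + 1]
    table = [[set() for _ in range(n - l)] for l in range(n)]

    # Row 1: variables that produce each single terminal.
    for i, ch in enumerate(word):
        for head, bodies in rules.items():
            if (ch,) in bodies:
                table[0][i].add(head)

    # Higher rows: try every split of the substring into two parts.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[split - 1][i]
                right = table[length - split - 1][i + split]
                for head, bodies in rules.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[length - 1][i].add(head)

    return start in table[n - 1][0]


grammar = {
    "S": [("A", "B"), ("a",)],
    "A": [("B", "C"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("a",)],
}
print(cyk(grammar, "bab"))
```

For a substring of length k the inner loop tries k − 1 splits, which corresponds to the n − 1 products needed for a field in the nth row.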
Pumping Lemma for CFL
• The pumping lemma gives us a technique to show that certain languages are not context-free.
   • Just like we used the pumping lemma to show certain languages are not regular
   • But the pumping lemma for CFLs is a bit more complicated than the pumping lemma for regular languages

• The pumping lemma can be used to construct a proof by contradiction that a specific language is not context-free.

• Conversely, the pumping lemma does not suffice to guarantee that a language is context-free; there are other necessary conditions, such as Ogden's lemma or the Interchange lemma.

Informally
• The pumping lemma for CFLs states that for sufficiently long strings in a CFL, we can find two short, nearby substrings that we can "pump" in tandem, and the resulting string must also be in the language.

Consequences of Pumping Lemma:
• If L is context-free, then L satisfies the pumping lemma.
• If L satisfies the pumping lemma, that does not mean L is context-free.
• If L does not satisfy the pumping lemma, then L is not context-free.
Formal Definition:
Let L be a CFL. Then there exists a constant p such that if z is any string in L, where |z| ≥ p, then we can write z = uvwxy subject to the following conditions:

1. |vwx| ≤ p. This says the middle portion is not longer than p.
2. vx ≠ ε. We’ll pump v and x. One may be empty, but both may not be empty.
3. For all i ≥ 0, uvⁱwxⁱy is also in L. That is, we pump both v and x, each the same number of times i.

[Figure: z = u v w x y with |z| ≥ p; v and x are each repeated i times.]

Game View
Game between Defender, who claims L satisfies the
pumping condition, and Challenger, who claims L
does not.

If L is a CFL, then there is always a winning strategy for the defender (i.e., the challenger will get stuck).
(In contrapositive): If there is a winning strategy for the challenger, then L is not a CFL.
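The game can also be read as an alternation of quantifiers. The following is a sketch under one reading of the roles (an assumption: the defender chooses p and the split, the challenger chooses z and i). The first line restates the lemma; the second is its negation, which is what a non-context-freeness proof has to establish.

```latex
% Pumping condition (holds for every CFL L):
\[
\exists p \ge 1 \;\; \forall z \in L,\ |z| \ge p \;\; \exists\, u,v,w,x,y :\;
z = uvwxy,\ |vwx| \le p,\ vx \ne \varepsilon,\ \text{and}\ \forall i \ge 0:\ uv^{i}wx^{i}y \in L
\]
% Negation (a winning strategy for the challenger):
\[
\forall p \ge 1 \;\; \exists z \in L,\ |z| \ge p \;\; \forall\, u,v,w,x,y \ \text{with}\
z = uvwxy,\ |vwx| \le p,\ vx \ne \varepsilon \;\; \exists i \ge 0:\ uv^{i}wx^{i}y \notin L
\]
```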
How to use it?

1. Assume L is context-free:
Start by assuming that the language you want to prove non-context-free is in fact context-free. The pumping lemma then provides a pumping length p.
2. Find a suitable string:
Choose a string z in L where |z| >= p (the pumping length). The string usually depends on p.
3. Divide into uvwxy:
Consider every way to divide z into five parts (uvwxy) that satisfies conditions 1 and 2 of the pumping lemma, i.e., |vwx| <= p and vx ≠ ε.
4. Show that pumping/unpumping fails:
Demonstrate that for every such division, pumping or unpumping (changing the number of times v and x are repeated) results, for some i, in a string that is not in L.
5. Contradiction:
Since condition 3 of the pumping lemma cannot be satisfied, the initial assumption that L is context-free must be false. Therefore, L is not context-free.
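Steps 2 to 4 can be explored by brute force for one concrete string. The sketch below is illustrative only: the helper names (`in_anbncn`, `splits`, `surviving_split`) are made up here, and the fixed value p = 4 merely stands in for the unknown pumping length, so a run of this code is a sanity check and not a proof. It enumerates every split z = uvwxy with |vwx| ≤ p and vx ≠ ε and looks for one whose pumped strings all stay in the language for i = 0..max_i.

```python
def in_anbncn(s):
    """Membership test for L = { a^n b^n c^n | n >= 1 }."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n

def splits(z, p):
    """Yield every (u, v, w, x, y) with z = uvwxy, |vwx| <= p and |vx| >= 1."""
    for start in range(len(z) + 1):
        for end in range(start, min(start + p, len(z)) + 1):
            for v_end in range(start, end + 1):
                for x_start in range(v_end, end + 1):
                    v, x = z[start:v_end], z[x_start:end]
                    if v or x:
                        yield z[:start], v, z[v_end:x_start], x, z[end:]

def surviving_split(z, p, in_lang, max_i=3):
    """Return a split whose pumped strings stay in the language for
    i = 0..max_i, or None if every split is broken by some such i."""
    for u, v, w, x, y in splits(z, p):
        if all(in_lang(u + v * i + w + x * i + y) for i in range(max_i + 1)):
            return (u, v, w, x, y)
    return None
```

Calling `surviving_split("aaaabbbbcccc", 4, in_anbncn)` returns `None`, because every admissible split is already broken at i = 0. A returned split would only show survival up to `max_i`, so the function can refute a split but cannot certify one.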
Example 1:
Let L be the language { aⁿbⁿcⁿ | n ≥ 1 }. Show that this language is not a CFL.

• Suppose that L is a CFL. Then some integer n (the pumping length) exists and we pick z = aⁿbⁿcⁿ.

• Since z=uvwxy and |vwx| ≤ n, we know that the string vwx must consist of either:
– all a’s
– all b’s
– all c’s
– a combination of a’s and b’s
– a combination of b’s and c’s

• The string vwx cannot contain a’s, b’s, and c’s at once: |vwx| ≤ n and the n b’s lie between the a’s and the c’s, so vwx is not long enough to span all three symbols.

• Now “pump down” with i=0. This results in the string uwy, which can no longer contain an equal number of a’s, b’s, and c’s, because v and x together contain at least one symbol but at most two of the three kinds of symbol.

Therefore, the result is not in L and therefore L is not a CFL.


Illustration with n = 4 and pumping up (i = 2):

Case 1: v and x each contain only one type of symbol. {We consider only v and x because they are the parts that get repeated, as in uv²wx²y.}
Say n=4, so we have z = aaaabbbbcccc.

Now, aaaabbbbcccc = uvwxy with, for example, u = a, v = aa, w = a, x = b, y = bbbcccc, so that |vwx| = 4 ≤ n.

When i=2, uvⁱwxⁱy = uv²wx²y = aaaaaabbbbbcccc

= a⁶b⁵c⁴

a⁶b⁵c⁴ ∉ L, because the numbers of a’s, b’s, and c’s are no longer equal.

One failing case is not enough: every way of splitting z must fail, so the remaining case has to be checked as well.

Case 2: Either v or x has more than one kind of symbol.

Let aaaabbbbcccc = uvwxy with, for example, u = aaa, v = ab, w = ε, x = b, y = bbcccc.
When i=2, uvⁱwxⁱy = uv²wx²y = aaaababbbbbcccc
= a⁴b¹a¹b⁵c⁴
This string does not follow the pattern aⁿbⁿcⁿ of our language.
Therefore, the resulting string is not in L.

Since every split fails in one of these ways, the language is not a CFL.


Example 2:
Show that the language L = { aⁱbʲcᵏ | 0 < i < j < k } is not a context-free language.

This language is similar to the previous one, except proving that it is not context free requires the
examination of more cases.

If L were context free, then the pumping lemma should hold for some pumping length n.

Let z = aⁿbⁿ⁺¹cⁿ⁺².
Given this string and knowing that |z| = n+(n+1)+(n+2) = 3n+3 >= n, we want to define z as uvwxy such that |vwx| <= n, |vx| >= 1.

As |vwx| <= n, there are five possible descriptions of uvwxy:

1. vwx is a^p for some p<=n, p>=1
2. vwx is a^p b^q for some p+q<=n, p+q>=1
3. vwx is b^p for some p<=n, p>=1
4. vwx is b^p c^q for some p+q<=n, p+q>=1
5. vwx is c^q for some q<=n, q>=1

Note: As |vwx| <= n, vwx cannot contain both "a"s and "c"s, because the n+1 "b"s lie between them.
Case 1: vwx is entirely within the a’s (vwx = a^p)
Example: Let’s say vwx = a² ⇒ v = a, x = a

Then u = aⁿ⁻², w = ε, y = bⁿ⁺¹ cⁿ⁺²

Now pump i = 2:
Result: aⁿ⁺² bⁿ⁺¹ cⁿ⁺²

The exponents are now i = n+2, j = n+1, k = n+2 ⇒ i > j, which violates the condition i < j
Not in L
Case 2: vwx is in a’s and b’s (e.g., vwx = a^p b^q)
Suppose vwx = a¹b² ⇒ v = a, x = b

Then u = aⁿ⁻¹, w = b, y = bⁿ⁻¹ cⁿ⁺²

Now pump i = 2:
Result: aⁿ⁺¹ bⁿ⁺² cⁿ⁺²

The exponents are now i = n+1, j = n+2, k = n+2

Now j = k, which violates j < k
Not in L
Case 3: vwx is entirely within the b’s (vwx = b^p)
Suppose vwx = b² ⇒ v = b, x = b

Then u = aⁿ bⁿ⁻¹, w = ε, y = cⁿ⁺²

Pump i = 1:
Result: aⁿ bⁿ⁺¹ cⁿ⁺²
→ Original string, still in L

Pump i = 2: aⁿ bⁿ⁺³ cⁿ⁺²
⇒ the exponents are i = n, j = n+3, k = n+2
⇒ j > k, violates j < k
Not in L
Case 4: vwx is in b’s and c’s (e.g., vwx = b^p c^q)
Suppose vwx = b¹c¹ ⇒ v = b, x = c

Then u = aⁿ bⁿ, w = ε, y = cⁿ⁺¹

Pump i = 0:
Result: aⁿ bⁿ cⁿ⁺¹
⇒ the exponents are i = n, j = n, k = n+1
⇒ i = j, violates i < j
Not in L
Summary of the cases (running example: for n = 2, z = aabbbcccc):

In case 1, if i=2 we will be adding at least one "a" to the string, making the number of "a"s at least n+1, which is no longer less than the number of "b"s, and thus the string is not in the language.

The same argument holds for case 3, in which the number of "b"s will be greater than or equal to the number of "c"s.

A similar argument holds in case 5. In case 5, if i=0 then the number of "c"s will be less than or equal to the number of "b"s.

In case 2, when i=2 either the number of "a"s will be greater than or equal to the number of "b"s or the number of "b"s will be greater than or equal to the number of "c"s (depending on the distribution of v and x).

In case 4, when i=0 either the number of "b"s will be less than or equal to the number of "a"s or the number of "c"s will be less than or equal to the number of "b"s (depending on the distribution of v and x).

In each of these cases, uvⁱwxⁱy does not belong to the language L for the chosen i.
This is a contradiction to our assumption.
So, our assumption is wrong.
L is not a CFL.
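The case summary can be checked mechanically for small n. The following sketch is an illustration under assumptions: the names `in_lang`, `case_of`, `PUMP_FOR_CASE` and `failing_cases` are hypothetical, and a small concrete n stands in for the pumping length. It classifies every admissible window vwx of z = aⁿbⁿ⁺¹cⁿ⁺² into one of the five cases and applies the pump count used in the summary (i = 2 for cases 1 to 3, i = 0 for cases 4 and 5).

```python
def in_lang(s):
    """Membership test for L = { a^i b^j c^k | 0 < i < j < k }."""
    i = len(s) - len(s.lstrip("a"))
    rest = s[i:]
    j = len(rest) - len(rest.lstrip("b"))
    tail = rest[j:]
    return set(tail) <= {"c"} and 0 < i < j < len(tail)

def case_of(vwx):
    """Map a non-empty window to its case number by the letters it contains."""
    letters = "".join(sorted(set(vwx)))
    return {"a": 1, "ab": 2, "b": 3, "bc": 4, "c": 5}[letters]

# Pump count claimed to break each case in the summary.
PUMP_FOR_CASE = {1: 2, 2: 2, 3: 2, 4: 0, 5: 0}

def failing_cases(n):
    """Return the set of cases seen if every split of z leaves L under the
    pump count for its case, or None if some split survives."""
    z = "a" * n + "b" * (n + 1) + "c" * (n + 2)
    seen = set()
    for start in range(len(z)):
        for end in range(start + 1, min(start + n, len(z)) + 1):
            for v_end in range(start, end + 1):
                for x_start in range(v_end, end + 1):
                    v, w, x = z[start:v_end], z[v_end:x_start], z[x_start:end]
                    if not (v or x):
                        continue  # vx must not be empty
                    case = case_of(v + w + x)
                    i = PUMP_FOR_CASE[case]
                    if in_lang(z[:start] + v * i + w + x * i + z[end:]):
                        return None
                    seen.add(case)
    return seen
```

For n = 3, `failing_cases(3)` reports all five cases as failing. For n = 1 only windows of length 1 exist, so only cases 1, 3 and 5 occur.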
Practice Questions:

Example 3:
Show that the language L = {: i is a prime} is not a context-free language.

Example 4:
Is the language L = { : w is in {a,b}*} a context-free language? Prove or disprove your
answer.

Example 5:
Show that the language L = {|n} is not a context-free language.
By: Dr. Monali Bordoloi, Asst. Prof. Sr. Grade 1, VIT-AP
