FLAT-Regular Expression and Language
FLAT-Regular Expression and Language
UNIT –2
REGULAR EXPRESSIONS AND LANGUAGES
Regular Expressions serve s the input language for many systems that process strings
Regular expression uses a set of operators (like +,., *,….) to represent the language.
Search systems convert RE into DFA/NFA that stimulates the automation on the file
being searched.
Example: Lexical Analyzer accepts tokens as inputs and produces a DFA that
recognizes which token appears next on the input. The set of Stings accepted by finite
automata is known as regular language.
Representation
DFA Language Regular Expression
{Є} R.E = ε
{a} R.E = a
b {a, b} R. E = a + b
a b
{ab} R.E = a.b = ab
Theory of Computation
Formal Definition of a RE
Let ∑ be a given alphabet. Then,
(i) , , and a belongs to ∑, are all regular expressions. [primitive Regular
Expressions]
(ii) If r1 and r2 are regular expressions, then r1+r2, r1r2, r1*, (r1) are also regular
expression if and only if it can be derived from the primitive RE by the finite
number of applications of the rules in (ii).
Operators of Regular Expressions
The union of t wo languages, L and M is denoted by LUM which are a set of strings
that are either in L or M or both.
Example:
L = {001, 10, 111}
M = {, 001}
LUM = {, 00, 10,111}
The concatenation of languages L and M, denoted L.M which are a set of strings that
can be formed by taking any string in L and concatenating with any string in M.
Example:
L = {001, 10, 111}
M = {, 001}
LM = {001, 001001, 10, 1000, 111, 111001}
The closure of a language, L is denoted as L* represent the set of those strings that can
be formed by taking any number of strings from L, possibly with repetitions and
concatenating all of them.
Example:
L = {0, 1}
L0 = {}
L1 = {0, 1}
L2 = {01, 00, 11, 10}
PROBLEMS
1) What is {10, 11}*? Write atleast first seven terms. AU DEC 2009
(10, 11)* = (10, 11)) U (10, 11)‟U (10, 11) 2 U (10, 11) 3 U…..
= {} U {10, 11} U {1011, 1110} U {101110, 101011,…}
= { , 10, 11, 1011, 1110, 101110, 101011,….}
Regular Expressions and Languages 2.3
2) Write a RE for the language accepting all combinations of a‟s over the set ∑ = {a}
L = {all combination of a‟s over {a}}
= {Є, a, aa, aaa,…….}
R = a*
3) Design the RE for the language accepting all combinations of a‟s except the null string
over
{}
L = {all combination of a over {a }except }
= {a, aa, aaa,aaaa…….}
R = a+
1) Design a RE for the languages containing all thes strings containing any number of
a‟s and b‟s .
2) Construct the RE for the language contai9ning all strings having any number of a‟s
and b‟s except the null string.
L = {a, ab, b, aa, abb, aabb ,..}
R = (a+b)+
3) Construct a RE for the language accepting all strings which are ending with 00 over
the set, ∑ = {0,1}
L = {00, 000, 100, 0100, 1000, 01000, 01000, 11100,..}
R = (0+1)* 00.
4) Write a RE for the language accepting the strings which are starting with 1 and
ending with 0, over the set ∑ = {0,1}
L = {10, 100, 110, 1000, 10101, 111000 ….s,}
R = 1(0+1)*0
5) Write a RE to denote a language, L over ∑* where ∑ = {a, b} that the third character
from right end of the string is always a.
L = aab, aba, aaa, abb, babb, aaaa,…
R = (a+b)* a(a+b)(a+b)
4.2 Theory of Computation
6) Construct a RE for the language, L which accepts all the strings with at least 2 b‟s
over the alphabets ∑ = {a,b}
L = {bb, abb, bba, bab,bbb,bbaa, ababa,…}
R =(a+b)*b(a+b)*b(a+b)*
7) Construct a RE for the language which consist of exactly 2 b‟s over the set, ∑ = {a, b}
L = {bb, abb, baba, bba, aaaabab,….}
R = a*ba*ba*
8) Construct a RE which denotes a language, L over the set , ∑ = {0}, having even length
of string.
R = (00)*
9) Write a RE which denotes a language, L overt the set, ∑ = {1}, having odd length of
strings
L = {1, 111, 11111, 1111111,…}
R = (11)*1
10) Write the RE to denote the language, L over ∑ = {a, b}, such that all the strings do
not contain the substring ab.
L = {bbaa, ba, bbbaaa, bb, aa,….}
R = b* a*
11) Write a RE which contains L having the strings which have at least one 0 and 1.
L = {01,001,101,010,011, 10101,10,..}
→ R = (0+1)* 0(0+1)* 1(0+1)* + (0+!)* 1(0+1)*0(0+1)*
12) Describe the following by RE
a) L1 = the set of all strings of 0‟s and 1‟s ending in 00.
b) L2 = the set of all strings of 0‟s and 1‟s beginning with 0 and ending with 1
AU MAY 2004
a) R1 = (0+1)* 00
b) R2 = 0(0+1)* 1
Regular Expressions and Languages 2.5
b) R2 = 1* 0 1* 0 (1+0)*
c) R3 = (1+0)* (00 + 11 + 10)
CONVERTING DFA TO RE
THEOREM
If L is a language defined by some DFA, then there is a regular expression defining L.
(or)
If L = L(A) for some DFA, then there is a RE, R such that L = L [R]
PROOF
Let a be a DFA which defines a language L Assume that A has „n‟ states.
Let us construct the RE of the form, R(k)ij, whose language is the set of strings, w that
takes the DFA from state I to j without going to any state, numbered greater than k.
We construct R(k)ij by induction, k as follows
Basis
Let k = 0 [since all the states are numbered 1]. The Restriction on paths is that the path
must have no intermediate states at all.
There are only two kinds of paths that need such a condition,
An arc from state I to state j.
A path of length zero, that consists of only some state i.
Case 1:- if i≠j
Only (i) is possible. We must examine the DFA, A and find those input symbol, a
such that there is a transition from state, I to j on symbol, a
c) If there are symbols a1, a2,….ak that label arcs from state i to j then R(0)
ij
= a1 + a2
+ a3 + …. + ak
Case 2:- if i = j
The legal paths are the path of length zero and all loops from I to itself.
The path of length zero is represented by the regular expression, since the path has
no symbols along it. Thus we add, Є to the various expressions derived in (a) through (c)
above.
i.e., Case (a) → No symbol „a‟ → The expression becomes .
Case (b) → One symbol „a‟→ The expression becomes +a
Case (c) → Multiple symbols „a‟→ The expression becomes + a1 + a2 +a3 +….+ ak
Induction
Suppose there is a path from state i to state j that goes through no state higher than K.
There are two possible cases to consider.
Case 1
The path does not go through state k at all. In this case, the label of the path is in the
language of Rk1
ij
Case 2
The path goes through state k at least once. Then we can break the path into several
pieces. The first goes from state I to state k without passing through k.
The Last piece goes from k to j without passing through k and all the pieces in the
middle go from k to itself without passing through k. If a path goes through state k only once,
then there are not middle pieces just a path from I to k and a path from k to I.
The set of labels for all paths of this type is represented by the regular expression
R (R )* Rk1kj
k1
ik
k1
kk
, i.e., the first expression represents the part of the path that gets to state k,
the first time, the second represents the quotient that goes from k to itself „0„ times, once or
more than once. And the third expression represents the part of the path that leaves k for the
last time and goes to state, j.
When we combine the expressions for the paths of the two types, we have the
expression
PROBLEMS
1) Obtain the RE that denotes the language as accepted by the following DFA.
AU DEC 2004
1 0,1
0
1 2
i → start state = 1
j → Final state = 2
Basis Step
Step 1:
R(0) a a a ...... a ;i j
ij 1 2 3 k
a1 a2 ak ; i j
R11(0) 1 [i j]
R12(0) 0 [i j]
R12(0) [i j]
R22 0 1 [i j]
Inductive Step
ij R ij
R (k) R k1
(k1) k1 k1
ik (R kk )*R kj
k takes the values 1 and 2. For k =0, basis step has already been calculated.
Step 2:
k=1
ij R ij R i1 (R11 )*R 1j
R (1) (0) (0) (0) (0)
(1) [(1)(1)*(1)]
(1)1*(1) (1)*1*
(1)1* 1*(1)*1*
i=1; j=2 R
21
(1)
R
12
(0)
R11(0) (R11(0) )*R(0)
12
0 [(1)(1)*(0)]
0 [((1)1*(0)] (1)*1*
0 [1*0] (1)*1*
[ (1)*(1)]
01 Somethingsomething
Step 3:
k=2
To find R(2)
22
R ij(2) R (1)
ij Ri2(R )22
(1) (1)
*R (1)
2j
1 0
q2 q3
q1
1
0 1
Solution:
Here;
i=1
j=1
k=3
To find: R(k)
ij
R(3)11
Basis
Step 1:
a a ..... a ;i j
ij a
R(0) 1 2 3 k
a1 a2 ..... ak ; i j
(0)
R 11 0 (0)
R 22 (0)
R 31 0
12 1
R(0) 22 1
R(0) 32 1
R(0)
R 13(0)
23 0 33
R(0) R(0)
Inductive step
ij R ij
R (k) R (k1)
(k1)
ik (R (k1) (k1)
kk )*R kj
k = 0 computed in step 1
Step 2:
k=1
0* ( 0) 0*0*
(1)
R12 R 12
(0)
R 11
(0) (0)
(R11 (0)
)*R 12
[ ( 0) ( 0) *.]
. a
22 R 22 R 21 (R11 )*R 12
R (1) (0) (0) (0) (0)
[.( 0) *.( 0) . a
22 R 22 R 21 (R11 )*R 12
R (1) (0) (0) (0) (0)
(0)
. a
(1) aa
12. Theory of Computation
23 R 23 R 21 (R11 )*R 13
R (1) (0) (0) (0) (0)
0 [.( 0)*.]
0
. a
0 a a
31 R 31 R 31 (R11 )*R 11
R (1) (0) (0) (0) (0)
0[0.0*.] 0 00*00*
(1)
R 32 R 32
(0)
R (0) (0) (0)
31 (R11 )*R 12
1[( 0)*1]
1 00*1
(1)
R 33 R 33
(0)
R 31
(0) (0)
(R11 (0)
)*R 13
[0.( 0)*
Step 3:
k =2
R(2)
ij
R(1)
ij
Ri2(1) (R22(21) )*R2(2)
j
R11(2) R11(1) R (1)
12
(R(1)
22
)*R (1)21
R(2)
11
0* . a
0* a a
(2)
R13 R13
(1)
R 12
(1)
(R )(122
*) R (1)
23
[ 0 *1)(1)*.0]
[(0*1)(1*)] ( 1)* 1*
0 *11* 0 a a
33 R 33 R 32
R (2) (1) (1)
(R )(122
*) R (1)
23
[(1 00*1)(1)*(0)]
q1 0 q1 1 q3
0
0,1
14. Theory of Computation
Solution:
To find: Rkij
i = 1 ←Initial State.
j = 2 ← Final state
k = 3 ← No of states.
R1z
(3)
R(2)
12 R 13(R 33
(2) (2)
)*R(2)32
Basis
Step 1:
(0)
Rij a1 a2 ... ak ;i j
a1 a2 ak ;i j
R(0)
11
R(0)
21
0 R(0)
31
12 0
R(0) 22
R(0) 32 0 1
R(0)
13 1
R(0) 23 1
R(0) 33
R(0)
Inductive step
Step 2:
k=1
ij R ij R i1 (R11 )*R 1j
R (1) (0) (0) (0) (0)
(1)
R11 R11
(0)
R 11
(0) (0)
(R11 (0)
)*R 11
00 0 0 0
0
(1)
R13 R13
(0)
R 11
(0) (0)
(R11 (0)
)*R 13
11 111
1
21 R 21 R 21 (R 22 )*R 11
R (1) (0) (0) (0) (0)
0 [0()*]
0 0 .* None
0 0 0 0
22 R 22 R 21 (R 11 )*R 12
R (1) (0) (0) (0) (0)
[0()*.0]
00 *.0 0
R (1)
23
R (0)
23
R (0)
21
(0)
(R11 (0)
)*R 13
1[0()*1]
1 01 *.1 1
(1)
R 31 R (0)
31
R 31
(0) (0)
(R11 (0)
)*R 11
[ () *.]
.a
(0 1)
. a
(0 1) . a
(1)
R 33 R (0)
33
R 31
(0) (0)
(R11 (0)
)*R 13
[ ()*1]
0
. a
Step 3:
k=2
ij R ij R i2( R 22
R (2) (1) (1) (1) (1)
)*R 2j
(2)
R12 R 12
(1)
R 12
(1)
( R (22
)1)*R (1)
22
0 [0(00)*] 1*(1)1*
0(00)*
(2)
R13 R13
(1)
R 12
(1)
( R (1) (1)
22 )* R 23
33 R 33 R 32 ( R 22 )* R 23
R (2) (1) (1) (1) (1)
(2)
R 32 R 32
(1)
R 32
(1)
(R (1) (1)
22 )* R 22
(01)[(01)( 00)*(00)
Step 4:
k=3
(3)
R12 R 12
(2)
R 13
(1) (2)
(R 33 )* R (2)
32
4. Find the RE, R23(2) from the following DFA. AU DEC 2007
1
1
1 2 1 3
Solution:
To find: R23(2) R(1)
23
R(1)22(R(1)22)* R(1)23
Basis
Step 1:
a a ...a ; i j
ij
R(0) 1 2 k
a1 a2 ...ak ; i j
18. Theory of Computation
R(0)
11
21 0
R(0) R(0)
31
R(0)
j2
1 22
R(0) R(0)
32
1 1
33
(0) (0)
R 13 R 23 R(0)
Inductive Step
We Require R(1)
23
,R(1)22
Step 2:
k=1
Rij(1) R ij(0) R (0) (0) (0)
i1 (R11 )* R 1j
23 R 23 R 21 (R 11 )* R 13
R (1) (0) (0) (0) (0)
1 01
22 R 22 R 21 (R 11 )* R 12
R (1) (0) (0) (0) (0)
[ 0() *1]
01 * None
Step 3:
k=2
23 R 23 R 22 (R 22 )* R 23
R (2) (1) (1) (1) (1)
q1 1 q2 0
q3
0,1
0 1
Regular Expressions and Languages 2.19
Solution:
The language accepted by the above DFA is R(2)
11
R(2)12 [since there are two final states]
The state q3 is eliminated since it is a trap/dead state.
0 1
q1 1 q2
Basis
Step 1:
R(0) a a a ...... a ;ij
ij 1 2 3 k
ij R ij R 12 (R 22 )* R 21
R (1) (1) (1) (1) (1)
(2)
R12 R (1)
12 R 12 (R 22 )*R 22
(1) (1) (1)
(1)
R11 R11
(0)
R11
(0) (0)
(R11 (0)
)*R 11
0* (1) 1* 1*
(1)
R12 R12
(0)
R11
(0) (0)
(R11 (0)
)*R 12
0 *1 1 0 * 1 0 *1
21 R 21 R 21 (R 11 )*R 11
R (1) (0) (0) (0) (0)
.a
22 R 22 R 21 (R 11 )*R 12
R (1) (0) (0) (0) (0)
(1) .a
(1) a a
Step 3:
k =2
(2)
R11 R11
(1)
R12
(1) (1)
(R 22 )*R (1)
21
0* .a
0* a a
12 R 12R 12
R(2) (1) (1)
(R(1) )*R
22
(1)
22
0 *11* 0 1*01*
Rij
qi pj
Qj
Pj
S
Assume that the state, qi is predecessors of the state S. Pj is the successor of s. When we
eliminate the state s, the regular expression, is labeled to R from Qi to Pj. Note that Qi and Pj
may also be same.
Rij Qi S*Pj
q0 S
q
T
We can represent the accepting string of this two-state automaton in many ways.
In general, we take the string accepted by this automaton, of the form;
( R SU*T)* SU*
(iii) If the start state and the accepting state are the same, then we are left with one –
state automaton.
R
q0
(iv) The derived regular expression is the sum of all expressions from the reduced
automaton for each accepting states by rules (ii) and (iii)
PROBLEMS
1) Construct the transition diagram for the given DFA and give a R.E by using state
elimination technique.
∑
0 1
Q
q1 {q2} {q1}
q2 {q3} {q1}
* q3 {q3} {q2}
1 0
0
q1 1 q2 0
q3
1
Solution:-
Here, q1 → Start State
q3 → Final State
The state to be eliminated is q2.
We use the Regular Expression: Rij Qi S*Pj is labeled for the arcs from q1 to q3.
To be found
q1-q2-q3
q3-q2-q1
q1-q2-q1
q3-q2-q1
q1 q2 q3
Qi Pj
S
Regular Expressions and Languages 2.23
Here,
Rij =
Qi = 0
S =
Pj = 0
r1 0. * 0
00 * None
r1 00 aa
r4 = 0 + 1. * . 0
r4 = 0 + 10 ← * = None
00
1+01 0 + 10
R
S U
q1
q3 q1
q3
11
T
∑
0 1
Q
*p s p
q p s
r r q
s q r
1 0
P 0 q 1 r 1 s
Solution:-
[First eliminate the state that is far away from start state].
States to be eliminated: s, r, q
Regular Expressions and Languages 2.25
Rij Rij =
Q1 = 0
p - s -q S =
Pj = 0
Qi Pj
S
r1 0*0
s 00 * None
r1 00 a a
Rij Rij = 0
Qi = 1
q - s - p S =
Pj =
Qi Pj
S
26. Theory of Computation
r2 0 1*
0
xa
0 a a
Rij
Rij r3 Rij QiS* Pj
p - s - r Qi 0 0*1
S 01 * None
Qi S Pj
Pj 1 r3 01 a a
Qi
S Pj r3
Elimination of s from q – s – r (r5)
Rij R r R Q S* P
ij 5 ij i j
q - s -r Qi 1 1*1
r - s - q Qi 1 .*0
Qi Pj S 1 1 .a
S
P 0 r 1 a a
j 6
Regular Expressions and Languages 2.27
p - s - p Qi 0 1 .*
Qi Pj S 1 .a
S
P r 1 a a
j 7
q - s - q Qi 1 1*0
Qi Pj S 1 0 None
S
r - s -r Qi 0 *1
Qi Pj
S 0 .a
S
Pj 1 r9 0 a a
Reduced Automaton
10
1 0
00 11
P q r
1
0
Pj
01
Qi Pj
s0 r 2 0 . a a
S Pj
Qi S 0 r3 1 a a
Pj
S Pj
q - r - q Qi 11
S 0
Qi Pj
S Pj 1
p q
0
Regular Expressions and Languages 2.29
The RE = r*
R = [ 1 + (00 + 010*1) ( 10 + 110*1) * 0]*
3) Convert the following NFA that accepts all strings of 0‟s and 1‟s such that either the
second or third position from right end has 1 to a R.E using state elimination
technique.
AU NOV/DEC 2013
0, 1
A 1 B 0, 1 0, 1
C D
Solution:-
Eliminate B. A → Start state cannot be eliminated
C, D → Final states cannot be eliminated first
To find:
Elimination of state B
r1 : A – B - C
r2 : C – B - A
r3 : A – B - D
r4 : D – B - A
r5 : A – B - A
r6 : C – B - C
30. Theory of Computation
r7 : D – B - D
r8 : C – B - D
r9 : D – B – C
Step 1:
Elimination of state – B can be done using Rij Q1 S*Pj
A - B - D Qi 1
Qi Pj
S r3
S Pj
D - B - A Qi
Qi S r4
Pj
S Pj
Regular Expressions and Languages 2.31
C - B- C Qi
Qi Pj S r6
S Pj 01
D - B- D Qi
S r7
Qi Pj Pj
S
Eliminating B from C – B – D (r8)
Rij Rij 0 1 r8 (0 1) *
C - B - D Qi r8 (0 1)
Qi Pj S r8 0 1
S
Pj
D - B -C Qi
Qi Pj S r9
S Pj 0 1
32. Theory of Computation
Reduced Automaton
0+1
1 ( 0 + 1)
0+1
A C D
Step 2:
D - C - A Qi
Qi S r2 0
Pj
S Pj
D-C- D Qi
S r4
Qi
Pj
Pj
S
Resulting Automaton
0, 1
1(0+1) (0+1) R U
S
A D
1 2
T
A - C - D
D - C - A
A-C- A
A-D- C Qi 0 1
S r2
Qi Pj
S
Pj
Eliminating D from A – D – A (r3)
A - D- A Qi ( 01)
S r3 0 1
Qi Pj Pj
S
Eliminating D from C – D – A (r4)
Rij
Rij r4 ( 0 1) *
C-D- A Qi 0 1
S r4
Qi Pj
S Pj
The Reduced Automaton
0+1 R
1 (0+1)
S U
A C
1 2
T
Regular Expression (RE1) = (R + Su*T)* SU*
RE1 = (0+ 1) + (1 (0 + 1)) * )* (1 (0 + 1)) *
RE2 = (0 + 1) * 1 (0 + 1)
Final Automaton
R = RE1 + RE2
= (0 + 1)* 1(0+1)(0 + 1) + ( 0 + 1)* 1( 0 + 1)
Regular Expressions and Languages 2.35
(or)
Let „r‟ be a regular expression, then there exists a NFA with Є – transition that accepts
L (r).
Proof
Suppose L = L[R] for a regular expression, R. we show that L = L[E] for some – NFA,
with
(i) Exactly one accepting state.
(ii) No arcs into the initial state.
Structural Induction on R
Basis
Є
(a) r =
(b) r =
a
(c) r = a
In (a) → the RE, is handled. The language = {}, since the only part from the start
state to the accepting state, labeled Є.
In (b) → It shows the construction of clearly there are no paths from start state
L = {}
Inductive Step
Assume that the theorem is true for the immediate sub-expressions of a given regular
expression. The languages of Є- NFA‟s with a single accepting state.
It starts at a new state, can go to the start state of either R or S, and then reach the
accepting state of one of these automata following a path labeled by some string L[R]
or L[S] respectively.
(a)
Є
Є
Є
R+S
Є
Є
Є
We can follow one of the Є – arcs to the accepting state of the new automaton.
Thus the language of this automaton is L [R] U L [S].
(b) R.S
R Є S
(c) R*
Є R Є
o Directly from the start state to the accepting state along a path labeled . That
path which is in L [R*] doesn‟t matter what expression R is.
o Goes to the start state of the automaton for R and processes through that
automaton one or more times and then to the accepting state. This set of paths
allows us to accept strings in Є, L[R], L[R]L[R], L[R]L[R]L[R and so on.
Thus covering all strings in L[R*] except perhaps which was covered by the direct
arc to the accepting state mentioned in (c)
Since the parenthesis does not change the language defined by the expression.
PROBLEMS
1) Convert the RE: (0 + 1)* 1(0+1) to -NFA
(0+1)
0
2 4 Є
Є
6
1
1
Є 3 5 Є
38. Theory of Computation
(0+1)*
0
Є
3 5
Є
7 8
Є
1 2
Є
1
4 6
Є
Є
(0+1)*1
0
Є
3 5
Є Є 1
8 9 10
Є 7
1 2
Є
1
4 6
Є
Є
(0+1)* 1(0+1)
0
0 1 1
Є
3 5 Є Є 1
Є 1
Є
Є 7 8 9 1 1
1 2 Є Є
1 1
4 6 Є 1 1 Є
Є
Regular Expressions and Languages 2.39
1 2
1*
Є
Є 1 Є
1 2 3
4
Є
01*
Є
Є Є Є 1 Є
1 2 3 4 5 6
1 Є 0
1 2 3
4
40. Theory of Computation
0+11
0
Є
Є 2 4
8
1 Є Є
1 Є 1
3 5 6 7
0*1
Є
Є Є Є 1
1 2 0 3 4 5 6
Є
10 + (0 + 11)0*1
1 Є 0
2 3 4 5
Є
Є
1
1
Є 0
8 9
Є
1
Є Є 0 Є
1 1 1 1
6 7 11
Є Є
1 Є 1 Є Є
1
1 1 1
(0+1)*
0
3 4 Є
Є
Є Є
1 2 7 8
Є Є
1
5 6
Є
00(0+1)*
Є
0
9
8
0 Є 0 Є Є Є Є
13
1 2 3 4 5 6 7 12
Є
Є
1
10 11
Є
(0+1)*
Є 0
3 4
Є Є
1 2 7 8
Є Є
1
5 6
Є
42. Theory of Computation
00+11:-
0 Є 0
2 3 4 5
Є
Є
1 10
Є
Є 1 Є 1
6 7 8 9
(0+1)*(00+11)(0+1)*
Є 0 Є 0
0
1 Є
Є
Є Є
Є Є Є
Є
1 Є
1 Є 1
Є
Є
Є Є
1
Є Є
Є
Regular Expressions and Languages 2.43
w = xyz
such that
(i) y ≠
(ii) xy n
PROOF
Suppose L is regular, then L = L [A] for some DFA, A. Suppose A has n states.
Consider any string, w of length „n‟or more, say w = a1, a2,… am where m ≥ n and ai is
an input symbol. For i = 0,1,2,….,n define state Pi (q0 ,( a1, a2 , .. an )) where is the
transition function of A and q0 is the start state of A. i.e, Pi is the state A after reading the
first I symbols of w.
Assume that, q0 = P0 = (q0, Є). Since we have only n states, it is easy to see that
every string consisting „n‟ or more symbols must cause at least a state to repeat. (This
principle is called as Pigeon hole principle).
Therefore, we can find two different integers I and j such that 0 ≤ i < j ≤ n with Pi = Pj.
By assuming the loop occurs are Pi, we can break the string w as,
w = xyz such that
(i) x =x a1, a2, ….ai
44. Theory of Computation
P0 Pi Pn-1
That is, x takes us from P0 to Pi once, y takes us from Pi back to Pi (since Pi is also Pj )
and z takes us from Pi to Pn-1. Note that x may be empty in the case that i = 0, also z may be
empty if j = n = m. However y cannot be empty since I is strictly less than j. Now consider
what happens if the automaton A receives the input xykz for any k ≥ 0.
Case 1:
If k = 0, then the automaton from the start state q0 (which is also Po) to Pi on input x.
Since Pi is also Pj it must be that A goes from Pi to the accepting state on input z. Thus
accepts xz.
Case 2:
If k > 0 , then A goes from q0 to Pi on input, x circles from Pi to Pi, k – times on input
yk and then goes to the accepting state on input, z Thus for any k ≥ 0, xykz is also accepted by
A that xykz is in L. Therefore we can say that L is a regular language.
PROBLEMS
1) By using pumping lemma, P.T the language, L{on on / n 1} is not a regular
language.
Solution:
Assume that L is a regular language Let n be a constant and w on on such that
w n . By pumping lemma, w can be splitted into 3 parts such that w = xy with
(i) y ≠ [ i.e, y 1 ]
(ii) xy n
Since y≠
x= 0 n-1 // since xy n
Regular Expressions and Languages 2.45
y=0 // Since y 1
z = 10n
Thus
(i) y≠
(ii) xy ( n 1) 1 xy n
(2) xy n
x = an-1 xy = an-1. a1 = an xy n
y=a y 1
z = bn+1 cn+2
At k = 0:-
xy0z = xz
= an-1.bn+1.cn+2 = an-1.bn+1.cn+2
At k = 1:-
xy1z = an-1.a. bn+1.cn+2
= an.bn+1.c n+2
At k = 2:-
xy2z = an-1.a.a. bn+1.cn+2
46. Theory of Computation
= an+1.bn+1.cn+2
3) Show that L = {ap | p is a prime number} is not regular. AU – DEC 2007, NOV 2011
Assume that L is regular and n be any integer. Let P ≥ n is a prime number and w = ap
Є L. By pumping lemma, w can be written as w = xyz such that
(i) y 1 y
(ii) xy n
(iii) xykz L k 0
Let,
Y = am where m ≥ 1 y m n
Consider xykz = xyz . yk-1 Let w= ak ; k → first prime
xykz = xyz.yk-1 k+1 → Next prime > P
= P + (k-1) y =k+k y
= P + Pm ← k = p+1 P = k -1
= P(1+m)
Both p and (1+m) are greater than 1. The product of P and (1 + m) cannot be prime
xykz L for k = p+1
L is not regular
4) Show that L = {0n 1n | n ≥ 1} is not regular. AU MAY 2004, DEC 2004, DEC 2005
Assume that L is a regular language. Let n be a constant and w = 0n 1n such
that w n . By pumping lemma, w can be splitted into 3 parts such that w = xyz with
(i) y ≠ (i.e y 1 )
(ii) xy n
(iii) xykz L k 0
Regular Expressions and Languages 2.47
Since xy n and y ≠ ,
x= 0n-1
y= 0
z = 1n
By pumping lemma, xykz ≥ L for k ≥ 0
When k = 0;
xy0z = xz
0n-1.1n
L [Comparing the power of 0 and 1 with w]
L is not a regular language.
w n2
At k = 2 xy2 z x y2 z n2 3
From 2 and 3,
x y z > x y2 z
xy2 z n2 4
Since xy n;
4 xy2 z \ n2
48. Theory of Computation
When y n and x
xy2 z n2 n
From 4 and 5,
n2 < xy2 z < (n 1)2
Example: n = 2; n2 = 22 = 4
(n+1) = 2+1 (n+1)2 = 32 = 9
There is no perfect square between 4 and 9. Henceforth, xy2 z is not a perfect square
since there is no perfect square between n and (n+1)
k
xyz L since xy z L
L is not regular.
xy n
x o m1
y=0
z = 1n om+n
(ii) xy n
(iii) xyk z L k 0
Let w be splitted as,
x = a n-1
y=a
z = b.b.an
At k = 0:-
xykz xy0z xz = an-1.b.b.an
Since the powers of a does not match, L is not regular.
L = {ww} = {anbanb}
Let w =xyz such that
(i) y > 0
(ii) xy n
(iii) xyk z L k 0
w can be splitted as
x = a n-1
y=a
50. Theory of Computation
z =b.an b.
At k = 0:-
Proof
Let, L = L [R]
M = L[S]
= L [R+S]
Example
Let m1 = ( S, ∑, 1 , So, F) and
Regular Expressions and Languages 2.51
Where r0 → New start state having two Є – moves from r0 → s0 and r0 → q0 being added,
R = SUQ U{r0}
H = FUG
3 1 U 2 U{(ro , , so ), (ro ,, qo )}
Hence, L [m3] = L[m1] U L [m2]
Thus, Regular Language is closed under Union.
Proof
Let L= L [A] for some DFA, where
A = (Q, ∑, , q0, F)
i.e., B is exactly like A, but the accepting states of A have not become the accepting
states of B and vice-versa.
0,1
0 1
q0 q0
q0
52. Theory of Computation
Here,
QN = {q0, q1, q2}
N → ∑
∑= {0, 1} QN 0 1
D
∑
0 1
QN
{q0} {q0, q1} {q0}
{q1} {q2}
* {q2}
{q0, q1} {q0, q1} {q0, q2}
* {q0, q2} {q0, q1} {q0}
* {q1, q2} {q2}
* {q0, q1, q2} {q0, q1} {q0, q2}
0
1
0 1
{q0,q1}
{q0} {q0,q2}
0
1
Regular Expressions and Languages 2.53
Complement of DFA
0
1
1
{q0} 0
{q0,q1} {q0,q2 }
Proof
We define A = (QL x QM, ∑, (qL , qm ), qL xqm , FL xFm )
where,
By induction on w ,
Example
Construct A B where A and B is givenby
54. Theory of Computation
1
0,11
0
A p
q
0 0,11
1
r s
B
Where
Here,
QA x QB = {p,q} x {r,s}
Z = {0, 1}
qA x qB = {p, r}
FA x FB = {q, s}
δ(qA, qB) ∑
Q↓ 0 1
Finite Automata
{p, r } 1 {p, s }
0 0
{q, r } 1
{q,s} 0, 1
Theorem
If L is a regular language, then L R is also regular.
Proof
Assume L i9s defined by regular expression. The proof is a structural induction on the
size of E. We show that there is another regular expression.
ER such that L(ER) = (L(E))R.
Basis
If = Є / / a, then
ER = / / a
E = ER
Induction
There are 3 cases depending on the form of E.
1) E = E1 + E2 then ER = E1R + E2 R
The justification is that the reversal of the union of two languages is obtained by
computing the reversals of the two languages and taking the union of those languages.
2) E = E1.E2 then ER = E2R. E1R
Example: L (E1) = {01,111}; L (E2) = {00, 10}
56. Theory of Computation
L(E1)R = {10,111}
L (E2)R = {00,01}
The justification is that any string w in L (E) can be written as w1, w2,…, wn where each
w1 is in L(E). But,
w R w Rn w Rn1.....................
w1R
R R
Each wR is in L (E ), so w is in ( ER ) Conversely any string in L ( ER )* is of the form
1 1 1
w1, w2,…wn, where each w1 is the reversal of a string in L(E1). The reversal of the string,
wR wR ........ wR is therefore a string in L ( E* ) which is L (E).
n n1 1 1
Example: If L = (0 + 1)0*
LML M
Proof
Let M1 ( s, , 1,so , F)
Theorem
If two languages, M1 and M2 are regular, then L (M1) . L (M2) is also regular.
Let M1 = (Q, ∑, δ, q0, F) be the given automata. Then M2 can be constructed such that,
L(M2) = L (M1)*.
M2 is constructed as:
(i) A new start state so is added with an – move is from so to qo.
Homomorphism
A string homomorphism is a function on strings that works by substituting a particular
string for each symbol.
h(1) =
h(0011) = h(0)h(0)h(1)h(1)
= abab
Theorem
Proof
To prove: L [h(E)] = hb [L(E)]
Basis
If E = or , then
H(E) = or
L[h(E)] = L (E)
Induction
There are 3 cases in induction part → Union, Concatenation and closure.
Let E = F + G
L[h(E)] = L[h(F + G)]
= L[h(F) U h (G)]
= L[h(F)] U L[h(G)] 1
L[h(G)] = h [L(G)]
Similarly concatenation and closure is also true,
Inverse homomorphism
Theorem
If h is a homomorphism from alphabet, ∑ to alphabet T and L is regular language over
T, then h1 (L) is also a regular language.
Proof
We construct a DFA, A for L, then a DFA for h1 (L) by using A and h. This DFA uses
the states of A but translates the input symbol according to h before deciding on the next state.
Let L be L(A) where DFA, A |(Q, ∑,δ, q0, F) define a DFA, B = (Q, ∑, γ, q0, F) where the
transition function γ is constructed by,
Since the accepting states of A and B are same, B accepts w if and only if A accepts
h(w) i.e., B accepts exactly those strings w that are in h1 (L)
= h-1(a)h-1(ba)h-1(ba)
= 022.
60. Theory of Computation
Two finite automata m1 and m2 are said to be equivalent if they accept the same
language.
Two FA m1 and m2 are not equivalent over ∑ if there exists a string w such that,
w L(m1 ) and w L(m2 )
(or)
wL(m1 ) and w L(m2 )
EXAMPLE:
1) Show that the automata given below are equivalent or not.
c c
d
q1 d q4 q5
d q1
d
c d
d c c
q1 c q6 q7
M1 M2
Step 1:
Here the start states are <q1> and <q4>
We shall start with <q1, q4> over ∑ ={c, d}
3 ( q1, q4 c) 1 (q1, c), 2 (q4 , c)
q1 , q4
3 ( q1, q4 d) 1 (q1, d), 2 (q4 , d)
q2 , q6
q3 , q7
3 ( q2 , q6 , d) 1 (q2 , d), 2 (q6 , d)
q1 , q4
Step 3:
State <q3,q7> over ∑ ={c, d}
3 ( q3 , q7 , c) 1 (q3 , c), 2 (q7 , c)
q2 , q5
3 ( q3 , q7 , d) 1 (q3 , d), 2 (q7 , d)
q3 , q7
Step 4:
State <q2, q5> over ∑ ={c, d}
3 ( q2 , q5 , c) 1 (q2 , c), 2 (q5 , c)
q3 , q7
3 ( q2 , q5 , d) 1 (q2 , d), 2 (q5 , d)
q1 , q4
Thus every state in m3 has been expanded.
The construction of combine DFA is as follows.
d
<q1 , q4> <q2 , q5>
d c
d c
c
62. Theory of Computation
Example: c, cccc, dd, ddcc, dcdcd, dcddcdcc, …. are accepted by both m1 and m2.
MINIMIZATION OF DFA
Minimization of DFA is a process of constructing an equivalent DFA with minimum
number of states.
If there exists a string of length k, which can distinguish qi and qj, the states qi and qj
are said to be k – distinguishable.
Algorithm
Algorithm for minimization divides states of a DFA into blocks such that:
1. States in a block are equivalent.
2. No two states taken from two different blocks are equivalent.
The first step is to partition the states of M into blocks such that all states in a block are o-
equivalent. This is done by placing,
a. Final states into one block
b. Non-final states into another block
c. The next step is to obtain the partition, P1 whose blocks consists of the set of states
which are 1- equivalent [length = 1]
Regular Expressions and Languages 2.63
PROBLEMS
0 1 0
q0 q1 q2 q3
0 1
1
0 1
1 1 0
q4 q5 q6 q7
1
0
0
Solution:
0 - Equivalence partitioning:
The given set of states in the DFA are classified into two,
1) States that are not final states bock 1
2) Final states block 2
P0
block1 = {q0, q1, q3, q4, q5, q6, q7}
block2 = {q2}
1 – Equivalence partitioning:
This divides the existing non final states based on the inputs that the states accept.
Processing input „0‟:
( q0 , 0) q1 [ block1]
(q1 , 0) q6 [ block1]
( q3 , 0) q2 [ block 2]
Equivalent states since the transition
(q4 ,0) q7 [ block1] result in same output.
64. Theory of Computation
( q5 , 0) q2 [ block 2]
(q6 ,0) q6 [ block1]
(q7 ,0) q6 [ block1]
From the above process of input o and 1, {q3, q5}-, {q1, q7} are a set of equivalent
states.
1 – equivalence partitioning
2 – Equivalence partitioning:
Block – 11
(q3 ,0) q2
Mapped to the same block, Block 2
(q5 ,0) q2
( q3 ,0) q6
Mapped to the same block, Block 12
(q5,0) q6
Block – 12
( q1 ,0) q6
Mapped to the same block, Block 13
(q7 ,0) q6
Regular Expressions and Languages 2.65
(q1 ,1)q2
(q7 ,1) q2 Mapped to the same block, Block 2
2 – Equivalence partitioning
P2 = (q3, q5), (q1, q7) (q0, q4) (q6) (q2)
Since further partitioning cannot be done, the final minimized DFA is given by
0 1
0 1 < q2>
< q0, q4> < q1, q7>
0
1
0
1
<q6>
< q1, q7 >
1 0
Input
a b
States
→ q0 q0 q3
q1 q2 q5
q2 q3 q4
q3 q0 q5
q4 q0 q6
q5 q1 q4
* q6 q1 q3
66. Theory of Computation
0- equivalence partitioning
P0 = (q0, q1, q2, q3, q4, q5) (q6)
block 1 block 2
1 – equivalence partitioning
( q0 , a ) q0 ( q0 , b) q3
( q1 , a) q2 ( q1 , b) q5
( q2 , a) q3 ( q2 , b) q4
( q3, a ) q0 ( q3 , b) q5
( q4 , a) q0 ( q4 , b) q6
( q5 ,a) q1 ( q5 , b) q4
There are no equivalent states. But still, q4 is the distinguishable state from the rest of
the states
2 – equivalence partitioning:
From the above transitions, there are no equivalent states but there are distinguishable
states, {q2, q5} since
( q2 , a ) q3
→ block 11
( q5 , a) q1
( q2 , b) q3
→ block 12
(q3 , b) q0
3- Equivalence partitioning:
( q0 , a) q0 block 112
( q1 , a) q2 block 111
( q3 , a) q0 block 112
( q0 , b) q0 block 112
( q1 , b) q0 block 111
( q3, b) q0 block 111
Since no states are equivalent, q0, q1, q3 are distinguishable from each other.
4 - Equivalence partitioning:
But again in 4 – equivalence partitioning, (q2, q5) is considered.
(q2, a) = q3 → block 1123
(q5, a) = q1 → block 1122
(q2, b) = q4 → block 12
(q3, b) = q4 → block 12
P4 = (q2) (q5) (q0) (q1) (q3) (q4) (q6)
Thus the given DFA cannot be minimized since no equivalent states are obtained.
Inputs
0 1
States ↓
→ q0 q1 q4
q1 q2 q5
* q2 q3 q7
q3 q4 q7
q4 q5 q8
* q5 q6 q1
q6 q7 q1
q7 q8 q2
* q8 q0 q4
68. Theory of Computation
0- equivalence Partitioning:-
P0 =(q0, q1, q3,q4,q6,q7) (q2,q5,q8)
Block 1 Block 2
1- equivalence partitioning:-
(q0 , 0) q1 Block 1
( q1 , 0) q2 Block 2
(q3 , 0) q4 Block 1
( q4 , 0) q5 Block 2
(q6 , 0) q7 Block 1
( q7 , 0) q8 Block 2
( q0 , 1)q4 block 1
( q1 , 1) q5 block 2
( q3 , 1)q7 block 1
( q4 , 1)q8 block 2
( q6 , 1) q1 block 1
( q7 , 1)q2 block 2
2- Equivalence partitioniong:
( q0 , 0) q1 ( q0 , 1)q4
( q3 , 0) q4 Block 12 ( q3 , 1)q7 Block 12
( q6 ,0) q7 ( q6 ,1) q1
( q1 , 0) q2 ( q1 , 1)q5
( q4 , 0) q5 Block 2 ( q4 , 1)q8 Block 2
( q7 ,0) q8 ( q7 , 1)q2
Regular Expressions and Languages 2.69
0, 1
0