Lecture 03: Theory of Automata:2011
Theory of Automata
Regular Expressions
Lecture 03: Theory of Automata:2011
Defining Languages by Another
New Method
Lecture 03: Theory of Automata:2011
Regular Expressions
• Defining Languages by Another New Method
• Formal Definition of Regular Expressions
• Languages Associated with Regular Expressions
• Finite Languages Are Regular
• How Hard It Is to Understand a Regular Expression
• Introducing EVEN-EVEN
Theory Of Automata 3
Lecture 03: Theory of Automata:2011
Language-Defining Symbols
• As discussed earlier that a* generates Λ, a, aa, aaa, …
And
a+ generates a, aa, aaa, aaaa, …,
so the language L1= {Λ, a, aa, aaa, …} and L 2= {a, aa,
aaa, aaaa, …} can simply be expressed by a* and a+,
respectively.
a*and a+ are called the regular expressions (RE) for L1
and L2 respectively.
Theory Of Automata 4
Lecture 03: Theory of Automata:2011
Plus Sign
• Let us introduce another use of the plus sign. By
the expression
x+y
where x and y are strings of characters from an
alphabet, we mean either x or y.
• Care should be taken so as not to confuse this
notation with the notation + (as an exponent).
Theory Of Automata 5
Lecture 03: Theory of Automata:2011
Example
• Consider the language L={Λ, x, xx, xxx,…} of strings,
defined over Σ = {x}.
• We can write this language as the Kleene star closure of
alphabet Σ or L=Σ*={x}* .
• This language can also be expressed by the regular
expression x*
Theory Of Automata 6
Lecture 03: Theory of Automata:2011
Example
• Similarly the language L={x, xx, xxx,…},
defined over Σ = {x}, can be expressed by the regular
expression x+
Theory Of Automata 7
Lecture 03: Theory of Automata:2011
Example
• Now consider another language L, consisting of all
possible strings, defined over Σ = {a, b}.
This language can also be expressed by the regular
expression (a + b)*.
• Now consider another language L, of strings having
exactly one a, defined over Σ = {a, b},
• then it’s regular expression may be b*ab*.
Theory Of Automata 8
Lecture 03: Theory of Automata:2011
Example
• Now consider another language L, of even length,
defined over Σ= {a, b}, then it’s regular expression may
be
((a+b)(a+b))*
• Now consider another language L, of odd length, defined
over Σ = {a, b}, then it’s regular expression may be
• (a+b)((a+b)(a+b))*
Or
((a+b)(a+b))*(a+b).
Theory Of Automata 9
Lecture 03: Theory of Automata:2011
Example
• Consider the language, defined over Σ = {a , b} of words
having at least one a, may be expressed by a regular
expression
(a+b)*a(a+b)*.
• Consider the language, defined over Σ = {a, b} of words
having at least one a and one b, may be expressed by a
• regular expression
(a+b)*a(a+b)*b(a+b)*+ (a+b)*b(a+b)*a(a+b)*.
Theory Of Automata 10
Lecture 03: Theory of Automata:2011
Example
Consider the language, defined over Σ ={a, b}, of words
starting with double a and ending in double b then its
regular expression may be
aa(a+b)*bb
•Consider the language, defined over Σ ={a, b} of words
starting with a and ending in b OR starting with b and
ending in a, then its regular expression may be
a(a+b)*b+b(a+b)*a
Theory Of Automata 11
Lecture 03: Theory of Automata:2011
Remark
• It may be noted that a language may be
expressed by more than one regular
expression, while given a regular expression
there exist a unique language generated by that
regular expression.
12
Lecture 03: Theory of Automata:2011
Note
• It is important to be clear about the difference of
the following regular expressions:
• r1 = a*+b*
• r2 = (a+b)*
Here r1 does not generate any string of
concatenation of a and b, while r2 generates
such strings.
13
Lecture 03: Theory of Automata:2011
• Given the alphabet = {a, b}, suppose we wish to define the
language L that contains all words of the form one a followed by
some number of b’s (maybe no b’s at all); that is
L = {a, ab, abb, abbb, abbbb, …}
• Using the language-defining symbol, we may write
L = language (ab*)
• This equation obviously means that L is the language in which the
words are the concatenation of an initial a with some or no b’s.
• From now on, for convenience, we will simply say some b’s to
mean some or no b’s. When we want to mean some positive
number of b’s, we will explicitly say so.
Theory Of Automata 14
Lecture 03: Theory of Automata:2011
Example
• Consider the language T over the alphabet
Σ = {a; b; c}:
• T = {a; c; ab; cb; abb; cbb; abbb; cbbb; abbbb;
cbbbb; …}
• In other words, all the words in T begin with
either an a or a c and then are followed by some
number of b’s.
• Using the above plus sign notation, we may
write this as
T = language((a+ c)b*)
Theory Of Automata 15
Lecture 03: Theory of Automata:2011
Example
• Consider a finite language L that contains all the
strings of a’s and b’s of length three exactly:
L = {aaa, aab, aba, abb, baa, bab, bba, bbb}
• Note that the first letter of each word in L is
either an a or a b; so are the second letter and
third letter of each word in L.
• Thus, we may write
L = language((a+ b)(a + b)(a + b))
• or for short,
L = language((a+ b)3)
Theory Of Automata 16
Lecture 03: Theory of Automata:2011
Example
• In general, if we want to refer to the set of all possible
strings of a’s and b’s of any length whatsoever, we could
write
language((a+ b)*)
• This is the set of all possible strings of letters from the
alphabet Σ = {a, b}, including the null string.
• This is powerful notation. For instance, we can describe
all the words that begin with first an a, followed by
anything (i.e., as many choices as we want of either a or
b) as
a(a + b)*
Theory Of Automata 17
Lecture 03: Theory of Automata:2011
18
Lecture 03: Theory of Automata:2011
An important example
• The Language EVEN-EVEN Language of
strings, defined over Σ={a, b} having even
number of a’s and even number of b’s. i.e.
• EVEN-EVEN = {Λ, aa, bb, aaaa, aabb, abab,
abba, baab, baba, bbaa, bbbb,…}, its regular
expression can be written as
(aa+bb+(ab+ba)(aa+bb)*(ab+ba))*
19
Lecture 03: Theory of Automata:2011
20
Lecture 03: Theory of Automata:2011
Equivalent Regular Expressions
• Two regular expressions are said to be
equivalent if they generate the same language.
• Example
• Consider the following regular expressions
• r1 = (a + b)* (aa + bb)
• r2 = (a + b)*aa + ( a + b)*bb then both regular
expressions define the language of strings
ending in aa or bb.
21
Lecture 03: Theory of Automata:2011
Example
• The language of all words that have at least two a’s can
be defined by the expression:
(a + b)*a(a + b)*a(a + b)*
• Another expression that defines all the words with at
least two a’s is
b*ab*a(a + b)*
• Hence, we can write
(a + b)*a(a + b)*a(a + b)* = b*ab*a(a + b)*
where by the equal sign we mean that these two
expressions are equivalent in the sense that they
describe the same language.
Theory Of Automata 22
Lecture 03: Theory of Automata:2011
Example
• Let V be the language of all strings of a’s and b’s in
which either the strings are all b’s, or else an a followed
by some b’s. Let V also contain the word Λ. Hence,
V = {Λ, a, b, ab, bb, abb, bbb, abbb, bbbb, …}
• We can define V by the expression
b* + ab*
where Λ is included in b*.
• Alternatively, we could define V by
(Λ + a)b*
which means that in front of the string of some b’s, we have
either an a or nothing.
Theory Of Automata 23
Lecture 03: Theory of Automata:2011
Example contd.
• Hence,
(Λ + a)b* = b* + ab*
• Since b* = Λ b*, we have
(Λ + a)b* = b* + ab*
which appears to be distributive law at work.
• However, we must be extremely careful in
applying distributive law. Sometimes, it is difficult
to determine if the law is applicable.
Theory Of Automata 24