FORMAL LANGUAGES AND AUTOMATA THEORY
UNIT - V
Decidability and Undecidability:
Decidable Problems:
A problem is said to be Decidable if we can always construct a
corresponding algorithm that can answer the problem correctly. We can
intuitively understand Decidable problems by considering a simple example.
Suppose we are asked to compute all the prime numbers in the range of
1000 to 2000. To find the solutionof this problem, we can easily devise an
algorithm that can enumerate all the prime numbers in this range.
Decidability in terms of a Turing machine, a problem is said to be a
Decidable problem if there exists a corresponding Turing machine
which halts on every input with an answer- yes or no. It is also important
to know that these problems are termed as Turing Decidable since a Turing
machine always halts on every input, accepting or rejecting it.
Semi- Decidable Problems:
Semi-Decidable problems are those for which a Turing machine halts on the
input accepted by it but it can either halt or loop forever on the input which
is rejected by the Turing Machine. Such problems are termed as Turing
Recognisable problems.
Examples: We will now consider few important Decidable problems:
Are two regular languages L and M equivalent?
We can easily check this by using Set Difference operation.
L-M =Null and M-L =Null.
Hence (L-M) U (M-L) = Null, then L,M are equivalent.
Membership of a CFL?
We can always find whether a string exists in a given CFL by using an
algorithm based on dynamic programming.
Emptiness of a CFL
By checking the production rules of the CFL we can easily state
whether the language generates any strings or not.
Undecidable Problems:
The problems for which we can’t construct an algorithm that can answer the
problem correctly in finite time are termed as Undecidable Problems. These
problems may be partially decidable but they will never be decidable. That is
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
there will always be a condition that will lead the Turing Machine into an
infinite loop without providing an answer at all.
Simply, A problem is undecidable if there is no Turing machine which
will always halt in finite amount of time to give answer as ‘yes’ or ‘no’. An
undecidable problem has no algorithm to determine the answer for a given
input.
We can understand Undecidable Problems intuitively by
considering Fermat’s Theorem, a popular Undecidable Problem which
states that no three positive integers a, b and c for any n>=2 can ever satisfy
the equation: a^n + b^n = c^n.
If we feed this problem to a Turing machine to find such a solution which
gives a contradiction then a Turing Machine might run forever, to find the
suitable values of n, a, b and c. But we are always unsure whether a
contradiction exists or not and hence we term this problem as
an Undecidable Problem.
Examples: These are few important Undecidable Problems:
Ambiguity of context-free languages: Given a context-free language,
there is no Turing machine which will always halt in finite amount of
time and give answer whether language is ambiguous or not.
Equivalence of two context-free languages: Given two context-free
languages, there is no Turing machine which will always halt in finite
amount of time and give answer whether two context free languages
are equal or not.
Everything or completeness of CFG: Given a CFG and input
alphabet, whether CFG will generate all possible strings of input
alphabet (∑*)is undecidable.
Regularity of CFL, CSL, REC and REC: Given a CFL, CSL, REC or
REC, determining whether this language is regular is undecidable.
Some more Undecidable Problems related to Turing machine:
Membership problem of a Turing Machine?
Finiteness of a Turing Machine?
Emptiness of a Turing Machine?
Whether the language accepted by Turing Machine is regular or CFL?
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
Note: Two popular undecidable problems are halting problem of TM and PCP
(Post Correspondence Problem).
Recursive and Recursive Enumerable Languages:
Recursive Enumerable (RE) or Type -0 Language:
RE languages or type-0 languages are generated by type-0 grammars. An
RE language can be accepted or recognized by Turing machine which means
it will enter into final state for the strings of language and may or may not
enter into rejecting state for the strings which are not part of the language.
It means TM can loop forever for the strings which are not a part of the
language. RE languages are also called as Turing recognizable languages.
Recursive Language (REC):
A recursive language (subset of RE) can be decided by Turing machine
which means it will enter into final state for the strings of language and
rejecting state for the strings which are not part of the language.
Ex: L= {anbncn|n>=1} is recursive because we can construct a turing
machine which will move to final state if the string is of the form anbncn else
move to non-final state. So the TM will always halt in this case.
REC languages are also called as Turing decidable languages.
The relationship between RE and REC languages can be shown in following
figure:
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
Closure Properties of Recursive Languages:
Union: If L1 and If L2 are two recursive languages, their union L1∪L2
will also be recursive because if TM halts for L1 and halts for L2, it will
also halt for L1∪L2.
Concatenation: If L1 and If L2 are two recursive languages, their
concatenation L1.L2 will also be recursive.
For Example:
L1= {anbncn|n>=0}
L2= {dmemfm|m>=0}
L3= L1.L2
= {anbncndm emfm|m>=0 and n>=0} is also recursive.
L1 says n no. of a’s followed by n no. of b’s followed by n no. of c’s. L2
says m no. of d’s followed by m no. of e’s followed by m no. of f’s. Their
concatenation first matches no. of a’s, b’s and c’s and then matches no.
of d’s, e’s and f’s. So it can be decided by TM.
Kleene Closure: If L1is recursive, its kleene closure L1* will also be
recursive.
For Example:
L1= {anbncn|n>=0}
L1*= { anbncn||n>=0}* is also recursive.
Intersection and complement: If L1 and If L2 are two recursive
languages, their intersection L1 ∩ L2 will also be recursive.
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
For Example:
L1= {anbncndm|n>=0 and m>=0}
L2= {anbncndn|n>=0 and m>=0}
L3=L1 ∩ L2
= { anbncndn |n>=0} will be recursive.
L1 says n no. of a’s followed by n no. of b’s followed by n no. of c’s and
then any no. of d’s. L2 says any no. of a’s followed by n no. of b’s
followed by n no. of c’s followed by n no. of d’s. Their intersection says
n no. of a’s followed by n no. of b’s followed by n no. of c’s followed by n
no. of d’s. So it can be decided by turing machine, hence recursive.
Similarly, complementof recursive language L1 which is ∑*-L1, will also
be recursive.
Note: As opposed to REC languages, RE languages are not closed under
complementon which means complement of RE language need not be RE.
Universal Turing Machine:
A Turing machine is said to be universal Turing machine if it can accept:
The input data, and
An algorithm (description) for computing.
This is precisely what a general purpose digital computer does. A digital
computer accepts a program written in high level language. Thus, a
general purpose Turing machine will be called a universal Turing machine
if it is powerful enough to simulate the behavior of any digital computer,
including any Turing machine itself.
More precisely, a universal Turing machine can simulate the behavior of an
arbitrary Turing machine over any set of input symbols. Thus, it is possible
to create a single machine that can be used to compute any computable
sequence.
Turing Machine Halting Problem:
Input − A Turing machine and an input string w.
Problem − Does the Turing machine finish computing of the string w in a
finite number of steps? The answer must be either yes or no.
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
Proof − At first, we will assume that such a Turing machine exists to solve
this problem and then we will show it is contradicting itself. We will call
this Turing machine as a Halting machine that produces a ‘yes’ or ‘no’ in a
finite amount of time. If the halting machine finishes in a finite amount of
time, the output comes as ‘yes’, otherwise as ‘no’. The following is the block
diagram of a Halting machine −
Now we will design an inverted halting machine (HM)’ as −
If H returns YES, then loop forever.
If H returns NO, then halt.
The following is the block diagram of an ‘Inverted halting machine’ −
Further, a machine (HM)2 which input itself is constructed as follows −
If (HM)2 halts on input, loop forever.
Else, halt.
Here, we have got a contradiction. Hence, the halting problem
is undecidable.
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
Rice Theorem:
Rice theorem states that any non-trivial semantic property of a language
which is recognized by a Turing machine is undecidable. A property, P, is
the language of all Turing machines that satisfy that property.
Formal Definition
If P is a non-trivial property, and the language holding the property, Lp , is
recognized by Turing machine M, then Lp = {<M> | L(M) ∈ P} is undecidable.
Description and Properties
Property of languages, P, is simply a set of languages. If any language
belongs to P (L ∈ P), it is said that L satisfies the property P.
A property is called to be trivial if either it is not satisfied by any
recursively enumerable languages, or if it is satisfied by all
recursively enumerable languages.
A non-trivial property is satisfied by some recursively enumerable
languages and are not satisfied by others. Formally speaking, in a
non-trivial property, where L ∈ P, both the following properties hold:
o Property 1 − There exists Turing Machines, M1 and M2 that
recognize the same language, i.e. either ( <M1>, <M2> ∈ L ) or (
<M1>,<M2> ∉ L )
o Property 2 − There exists Turing Machines M1 and M2, where
M1 recognizes the language while M2 does not, i.e. <M1> ∈ L
and <M2> ∉ L
Proof
Suppose, a property P is non-trivial and φ ∈ P.
Since, P is non-trivial, at least one language satisfies P, i.e., L(M0) ∈ P , ∋
Turing Machine M0.
Let, w be an input in a particular instant and N is a Turing Machine which
follows −
On input x
Run M on w
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
If M does not accept (or doesn't halt), then do not accept x (or do not
halt)
If M accepts w then run M0 on x. If M0 accepts x, then accept x.
A function that maps an instance ATM = {<M,w>| M accepts input w} to a N
such that
If M accepts w and N accepts the same language as M0, Then L(M) =
L(M0) ∈ p
If M does not accept w and N accepts φ, Then L(N) = φ ∉ p
Since ATM is undecidable and it can be reduced to Lp, Lp is also
undecidable.
Post Correspondence Problem:
The Post Correspondence Problem (PCP), introduced by Emil Post in 1946,
is an undecidable decision problem. The PCP problem over an alphabet ∑ is
stated as follows −
Given the following two lists, M and N of non-empty strings over ∑ −
M = (x1, x2, x3,………, xn)
N = (y1, y2, y3,………, yn)
We can say that there is a Post Correspondence Solution, if for some
i1,i2,………… ik, where 1 ≤ ij ≤ n, the condition xi1 …….xik =
yi1 …….yik satisfies.
Example 1
Find whether the lists M = (abb, aa, aaa) and N = (bba, aaa, aa) have a
Post Correspondence Solution?
Solution
x1 x2 x3
M Abb aaa
N Bba aaa aa
Here,
x2x1x3 = ‘aaabbaaa’
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
and y2y1y3 = ‘aaabbaaa’
We can see that
x2x1x3 = y2y1y3
Hence, the solution is i = 2, j = 1, and k = 3.
Example 2
Find whether the lists M = (ab, bab, bbaaa) and N = (a, ba, bab) have a
Post Correspondence Solution?
Solution
x1 x2 x3
M ab bab bbaaa
N a ba bab
In this case, there is no solution because −
| x2x1x3 | ≠ | y2y1y3 | (Lengths are not same)
Hence, it can be said that this Post Correspondence Problem
is undecidable.
Intractable problems
A problem is called tractable iff there is an efficient (i.e. polynomial-time)
algorithm that solves it.
A problem is called intractable iff there is no efficient algorithm that solves
it.
Most intractable problems have an algorithm – the same algorithm – that
provides a solution, and that algorithm is the brute-force search.
This algorithm, however, does not provide an efficient solution and is,
therefore, not feasible for computation with anything more than the
smallest input.
The reason there is no efficient algorithms for these problems is that these
problems are all in a category which I like to refer to as “slightly less than
random.” They are so close to random, in fact, that they do not as yet allow
for any meaningful algorithm other than that of brute-force.
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
If any problem were truly random, there would not be even the possibility of
any such algorithm.
Example:
1. SAT, the satisfiability problem to test whether a given Boolean
formula is satisfiable.
All sets of input must be tried systematically (brute-force) until a
satisfying case is discovered.
2. Integer partition: Can you partition n integers into two subsets
such that the sums of the subsets are equal?
As the size of the integers (i.e. the size of n) grows linearly, the size of the
computations required to check all subsets and their respective sums
grows exponentially. This is because, once again, we are forced to use
the brute-force method to test the subsets of each division and their
sums.
3. Graph coloring: How many colors do you need to color a graph
such that no two adjacent vertices are of the same color?
Given any graph with a large number of vertices, we see that we are
again faced with resorting to a systematic tracing of all paths,
comparison of neighboring colors, backtracking, etc., resulting in
exponential time complexity once again.
P, NP, NP-Complete and NP-Hard problems:
P is set of problems that can be solved by a deterministic Turing machine
in Polynomial time.
NP is set of decision problems that can be solved by a Non-deterministic
Turing Machine in Polynomial time.
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
P is subset of NP (any problem that can be solved by deterministic machine
in polynomial time can also be solved by non-deterministic machine in
polynomial time). Informally, NP is set of decision problems which can be
solved by a polynomial time via a “Lucky Algorithm”, a magical algorithm
that always makes a right guess among the given set of choices.
NP-complete problems are the hardest problems in NP set. A decision
problem L is NP-complete if:
1) L is in NP (Any given solution for NP-complete problems can be verified
quickly, but there is no efficient known solution).
2) Every problem in NP is reducible to L in polynomial time (Reduction is
defined below).
A problem is NP-Hard if it follows property 2 mentioned above, doesn’t need
to follow property 1. Therefore, NP-Complete set is also a subset of NP-Hard
set.
Vertex Cover Problem:
A vertex cover of an undirected graph is a subset of its vertices such that for
every edge (u, v) of the graph, either ‘u’ or ‘v’ is in vertex cover. Although the
name is Vertex Cover, the set covers all edges of the given graph. Given an
undirected graph, the vertex cover problem is to find minimum size
vertex cover.
Following are some examples.
Vertex Cover Problem is a known NP Complete problem, i.e., there is no
polynomial time solution for this unless P = NP. There are approximate
polynomial time algorithms to solve the problem though. Following is a
simple approximate algorithm adapted from CLRS book.
Department of Computer Science and Engineering R.Akshara
FORMAL LANGUAGES AND AUTOMATA THEORY
Approximate Algorithm for Vertex Cover:
1) Initialize the result as {}
2) Consider a set of all edges in given graph. Let the set be E.
3) Do following while E is not empty
a) Pick an arbitrary edge (u, v) from set E and add 'u' and 'v' to result
b) Remove all edges from E which are either incident on u or v.
4) Return result
Below diagram to show execution of above approximate algorithm:
3-SAT Problem:
A Boolean formula is in 3CNF if it is of the form C1 ∧ C2 ∧ · · · ∧ Ck where
each Ci is an ∨ of three or less literals.
A Boolean formula is in 3SAT if it in 3CNF form and is also SATisfiable.
3SAT is NP-complete and all NPC problems can be coded into SAT.
Department of Computer Science and Engineering R.Akshara