Automata Language, Grammar definition and Rules with
examples
A set of strings all of which are chosen form some ∑ *,
where ∑ is a particular alphabet, is called a language. If ∑
is an alphabet and L⊆∑*, then L named as language over
alphabet ∑. Informally a language is an equivalent
member of the power set of ∑* or any subsets of the
Kleene closure of an alphabet ∑ can be considered as
languages.
Rules of a languages: As informally any subsets of ∑*
over alphabet ∑ considered as languages but the vast
majority of strings there is not useful. So, to filter the
useful subset over ∑* language follows some rules that
make it useful and fully acceptable by automata. Useful
languages have following rules;
A language on the alphabets ∑ is that only the strings
having symbols over ∑ are member of that
language. For example if ∑ = {a, b, c} then string
‘abc’ is a member of that language.
A language over alphabet ∑ need not include strings
with all the symbols of ∑.
The alphabets included in language are strictly
different programming languages, but it generally
includes the upper- and lower-case letters, the
digits, punctuation and mathematically symbols.
A language consists all alphabets that are finite.
Generally, we have some symbols to be talk about
languages. Consider an example of a language L over
alphabet ∑ = {a, b, c}:
L = ab*(c|a)b
The language L accepts all the strings that start with a, are
followed by Kleene closure of b’s, then by ether single
occurrence of c’s or a’s, then end with single b’s. From
the following list of strings given below, which strings
belong to language L?
1. abbcb
2. abbb
3. bbb
4. aabbbab
5. abbbab
6. acb
7. bbcb
8. aab
9. abbbbbbbcab
10. abbbbcb
The strings number 1, 5, 6, 8 and 10 belong to the
language L. There are also some many languages that
appear when we study automata. Some are abstract
examples, such as:
The language of all stings consisting of n 0’s
followed by n 1’s, for some n ≥ 0: {ε, 01, 0011,
000111, .. .. ..}.
The set of strings of 0’s and 1’s with an equal number
of each:
{ε, 01, 10, 0011, 0101, 1001, 1010, .. .. .. ..}.
∅, the empty language, is a language over any
∑* is a language for any alphabet ∑.
alphabet.
{ε}, the language consisting of only the empty
Note: ∅ ≠ {ε}; the former has no strings while the latter
strings, is also a language over any alphabet.
has one string.
Set-formers to Define languages: Set-former is another
way to define a language. It is common to describe a
language. Format of set-formers is;
{x | something about x}.
This expression is read as, “The set of words x such that
(whatever is said about x to the right of the vertical bar).”
For example;
{x | x consists of an equal number of 0’s and 1’s}.
{x | x is a binary integer that is prime}.
{x | x is a syntactically correct C program}.
In this set-former it is common to replace x with some
expression with parameters and describe the strings in the
language by string conditions on the parameters. For
example;
{0n1n | n ≥1 }. This set-former is read as, “the set of 0
to the n 1 to the n such that n is greater than or equal
to 1,” This language consist of the stings {01, 0011,
000111, .. .. ..}.
i j
{0 1 | 0 ≤ i ≤ j}. This set-former define a language
that consists of strings with some 0’s (possibly
none) followed by at least as many 1’s.
Grammar:
A grammar is a set of rules for a strings generation in a
formal language. These rules describe how does strings
forms from the language that are valid according to the
language syntax. A grammar does not describe the
meaning of the generated string.
For example:
S → SbS
a → a|ε
b → b|ε