USN
Internal Assessment Test 1 – Oct 2022
Sub: Natural Language Processing Sub Code: 18CS743 Branch: CSE
Date: 21/10/2022 Duration: 90 mins Max Marks: 50 Sem / Sec: VII / A and B OBE
Answer any FIVE FULL Questions MARKS CO RBT
1 (a) Explain the components of transformational grammar. [06] CO1 L2
Transformational grammar has three components:
1. Phrase Structure Grammar
2. Transformational Rules
3. Morphophonemic rules
All three components consist of a set of rules.
Phrase structure grammar:
Phrase structure grammar consists of rules that generate natural language sentences and assign a structural
description to them. The symbols used in these rules are:
S - sentence
VP - verb phrase
NP - noun phrase
V - verb
Det - determiner
Aux - Auxiliary verb
For example, the phrase structure rules for the sentence - I ate apple:
S -> NP + VP
NP -> I
VP -> V + NP
V -> ate
NP -> apple
Phrase structure rules for the sentence - Pooja plays veena:
S -> NP + VP
NP -> Pooja
VP -> V + NP
V -> plays
NP -> veena
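These rules can be checked mechanically. Below is a minimal sketch, assuming the NLTK library is installed; the grammar string simply re-encodes the rules above:

# Encode the phrase structure rules above with NLTK and parse a sentence.
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
VP -> V NP
NP -> 'Pooja' | 'veena'
V -> 'plays'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("Pooja plays veena".split()):
    tree.pretty_print()   # prints the structural description (parse tree)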
2. Transformational rules:
The second component of transformational grammar is a set of transformational rules, which transform one
phrase-maker into another phrase-maker. These rules are applied to the terminal string generated by phrase
structure rules. Transformational rules are heterogeneous and may have more than one symbol on their left-hand
side. These rules are used to transform one surface representation into another, for example, an active sentence
into a passive sentence. The rule relating to active and passive sentences is -
NP1 - Aux - V - NP2 -> NP2 - Aux + be + en - V - by - NP1
Transformational rules can be obligatory or optional. An obligatory transformation is one that must apply, for
example to ensure number agreement between subject and verb. An optional transformation is one that modifies
the structure of a sentence while preserving its meaning (e.g., the passive transformation).
Example for transformational rules:
Consider the active sentence - The police will catch the snatcher. - (1)
The application of phrase structure rules will assign the structure as shown below
Fig: Parse structure of the sentence - ‘The police will catch the snatcher’
The passive transformation rules will convert the sentence into
The + snatcher + will + be + en + catch + by + the + police
Fig: Structure of sentence (1) after applying passive transformation
The police will catch the snatcher.         NP1 - Aux - V - NP2
The snatcher will be caught by the police.  NP2 - Aux + be + en - V - by - NP1
The rule relating to the conversion of active to passive sentence -
NP1 - Aux - V - NP2 -> NP2 - Aux + be + en - V - by - NP1
3. Morphophonemic rules:
Morphophonemic rules match each sentence representation to a string of phonemes. Morphophonemics involves
an investigation of the phonological variations within morphemes, usually marking different grammatical
functions; e.g., the vowel changes in “sleep” and “slept,” “bind” and “bound,” “vain” and “vanity,” and the
consonant alternations in “knife” and “knives,” “loaf” and “loaves.”
The Transformational rule will reorder ‘en + catch’ to ‘catch + en’ and subsequently one of the morphophonemic
rules will convert ‘catch + en’ to ‘caught’.
'en + catch' -> 'catch + en' -> 'caught'
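The transformational and morphophonemic steps can be mimicked in a few lines of code. This is a toy sketch of our own, not part of the original answer; the past participle is supplied by hand in place of a full set of morphophonemic rules:

# Toy passive transformation:
# NP1 - Aux - V - NP2  ->  NP2 - Aux + be + en - V - by - NP1
def passivize(np1, aux, verb_en, np2):
    # verb_en stands for the morphophonemic result of 'en + V',
    # e.g. 'en + catch' -> 'caught'
    return f"{np2} {aux} be {verb_en} by {np1}"

print(passivize("the police", "will", "caught", "the snatcher"))
# -> the snatcher will be caught by the police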
(b) Write the transformational grammar for the sentence “The boy hit the girl.” [04] CO1 L3
Transformational grammar for the sentence - The boy hit the girl:
S -> NP + VP
NP -> Det + Noun
VP -> Verb + NP
Verb -> Aux + V
Det -> a, the
Noun -> boy, girl
V -> hit
2 What is Government and Binding theory? Explain with an example. [10] CO1 L2
Chomsky was the first to put forward the Government and Binding (GB) theory. GB theory has four levels:
1. s-level (the surface level of transformational grammar, renamed s-level)
2. d-level (the deep level of transformational grammar, renamed d-level)
3. Phonetic Form (PF)
4. Logical Form (LF)
Phonetic Form and Logical Form are parallel to each other.
Language is the representation of some 'meaning' in a 'sound' form:
'meaning' -> Logical Form (LF)
'sound' -> Phonetic Form (PF)
GB is concerned with Logical Form rather than Phonetic Form.
The vision of GB is that if rules are defined for structural units at the deep level, it becomes possible to generate
any language with fewer rules, since these deep-level structures are common to all languages. The existence of
deep-level, language-independent, abstract structures, and their expression in surface-level, language-specific
structures with the help of simple rules, is the main concern of GB theory.
Example to explain d-structure and s-structure:
Consider the sentence: Mukesh was killed
s-structure of the sentence: Mukesh was killed
d-structure of the sentence: [e] was killed Mukesh (the object 'Mukesh' moves to the empty subject position, leaving a trace)
Components of GB:
GB comprises a set of theories that map structures from d-structure to s-structure and then to Logical Form; the
Phonetic Form is not considered. A general transformational rule called 'Move α' is applied at the d-structure
level and the s-structure level. It can move constituents to any place, provided it does not violate the constraints
imposed by the various theories and principles.
GB is represented as shown in fig.
Move α:
Chomsky reduced all long-distance movements to one rule called 'Move α', where alpha stands for any
constituent. The rule does not specify where α can be moved from, or where it should be moved to: all this is
left 'free'. In effect, this rule says, 'Move anything from anywhere to anywhere.'
GB consists of 'a series of modules that contain constraints and principles' applied at various levels of its
representation, together with the transformation rule Move α.
The modules include
– X-bar theory
– Projection Principle
– θ -Theory
– θ - Criterion
– C-Command and Government
– Case Theory and Case Filter
– Empty Category Principle (ECP)
– Binding Theory
General Characteristics of GB:
X-Bar Theory:
The x bar theory is one of the central concepts in GB. Instead of defining several phrase structures and the
sentence structure with separate sets of rules, X bar theory defines them both as maximal projections of some
head. Hence the entities defined become language-independent.
Noun phrase (NP) is the maximal projection of Noun (N)
Verb phrase (VP) is the maximal projection of Verb (V)
Adjective phrase (AP) is the maximal projection of Adjective (A)
Prepositional phrase (PP) is the maximal projection of Preposition (P)
Each phrase is represented as a projection of its head X (where X = {N, V, A, P}).
Even the sentence structure S', which is a projection of the sentence S, can be regarded as the maximal
projection of inflection (INFL).
GB envisages projection at two levels, as shown in the schema below:
1. The projection of the head at the semi-phrase level, denoted by X-bar (X')
2. The maximal projection at the phrase level, denoted by the X-double-bar (X'')
For sentences, the first-level projection is denoted as S and the second-level (maximal) projection as S'.
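In schematic form (a standard presentation of the X-bar schema, written here in the same rule notation used earlier; the original figure is not reproduced):
X'' -> Specifier + X'
X' -> X + Complement
where X is the head, X' the semi-phrase level projection, and X'' (= XP) the maximal projection.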
General Phrase and Sentence Structure:
Projection principle
The projection principle places a constraint on the three syntactic representations and their mapping from one to
the other. It states that representations at all syntactic levels (i.e., d-level, s-level, and LF-level) are projections
from the lexicon (head). Thus, lexical properties of categorial structure (sub-categorization) must be observed at
each level.
Example: If the 'object' is not present at the d-level, then another NP cannot take this position at the s-level.
Mukesh was killed [OBJECT] (the d-level object position is retained at the s-level as an empty category)
The projection principle, in conjunction with the possibility of an empty category and other theories (such as
Binding theory), ensures correct movement and well-formed structures.
Theta Theory (θ-theory) or the Theory of Thematic Relations
GB puts restrictions on lexical heads, through which certain roles are assigned to their arguments. These roles
are pre-assigned and, by the projection principle, cannot be violated at any syntactic level. These role
assignments are called theta-roles and are related to 'semantic selection'.
Theta-role and theta-criterion
There are certain thematic roles from which a head can select. These are called theta-roles, and they are
mentioned in the lexicon. The theta-criterion requires that each argument bear exactly one theta-role and that
each theta-role be assigned to exactly one argument.
Example: The verb 'eat' can take arguments with the θ-roles (Agent, Theme).
Agent is a special role which a head can assign to an outside argument (the external argument), whereas other
roles are assigned within its domain (internal arguments).
C-command and government
C-Command
C-command defines the scope of a maximal projection. It is the basic mechanism through which many
constraints on Move α are defined. If a word or phrase (say α or β) falls within the scope of, and is determined
by, a maximal projection, then it is dominated by that maximal projection.
The necessary and sufficient condition for C-command:
– If two structures α and β are related in such a way that 'every maximal projection dominating α
dominates β', we say that α C-commands β.
– The definition of C-command considers only the maximal projections dominating α, not all those
dominating β.
Government:
Government is a special case of C-command.
– α governs β iff:
– α C-commands β,
– α is an X head (e.g., noun, verb, preposition, adjective, or inflection), and
– every maximal projection dominating β also dominates α; that is, no maximal projection intervenes
between the governor and the governee.
Binding theory:
Binding theory applied at A-positions is given as -
(a) An anaphor (+a) is bound in its governing category.
(b) A pronominal (+p) is free in its governing category
(c) An R-expression (-a, -p) is free.
Example:
A: Mukesh_i knows himself_i (the anaphor 'himself' is bound in its governing category)
B: Mukesh_i believes that Amrita knows him_i (the pronoun 'him' is free within its own clause, but may corefer with 'Mukesh' outside it)
C: Mukesh believes that Amrita_j knows Nupur_k (the R-expression 'Nupur' is free)
Empty category principle (ECP)
‘Proper government’
α properly governs β iff
– α governs β and α is lexical (i.e., N, V, A, or P), or
– α locally A-binds β.
Bounding and control theory:
• In English, long-distance movement from complement clauses can be explained by bounding theory if NP
and S are taken to be the bounding nodes.
• The theory says that an application of Move α may not cross more than one bounding node.
• The theory of control involves syntax, semantics and pragmatics.
Case Theory and Case Filter
Case Theory:
In GB, case theory deals with the distribution of NPs and mentions that each NP must be assigned a case. In
English, we have Nominative, Objective, Genitive, etc., cases which are assigned to NPs at particular positions.
Case Filter:
An NP that has phonetic content, or that is an argument, is ungrammatical if it is not case-marked. The case
filter thus restricts the movement of an NP to any position where no case is assigned.
3 Describe C-Structure and F-Structure in LFG. Write C-Structure and F-Structure for the sentence - ‘She saw [10] CO1 L3
stars’ using CFG rules
S -> NP VP
VP -> V {NP} {NP} PP* {S’}
PP -> P NP
NP -> Det N {PP}
S’ -> Comp S
The term 'lexical functional' is composed of two parts:
The lexical part is derived from the fact that lexical rules can be formulated to help define the structure of a
sentence.
The functional part is derived from grammatical functions, such as subject and object, or the roles played by
various arguments in a sentence.
LFG represents sentences at two syntactic levels –
1. Constituent structure (c-structure)
2. Functional structure (f-structure).
C-Structure and F-structure in LFG
C-Structure
The c-structure is derived from the phrase and sentence structure syntax. It encodes linear order, constituency,
and hierarchical relations.
F-structure
The f-structure encodes the information obtained from the phrase and sentence structure rules together with
functional specifications. Since grammatical functional roles cannot be derived directly from phrase and
sentence structure, functional specifications are annotated on the nodes of the c-structure; applying these
annotations to a sentence yields its f-structure.
Example: She saw stars
CFG rules to handle this sentence are:
S -> NP VP
VP -> V {NP} {NP} PP* {S’}
PP -> P NP
NP -> Det N {PP}
S’ -> Comp S
where
S: sentence; NP: noun phrase; VP: verb phrase; V: verb;
PP: prepositional phrase; P: preposition; N: noun; Det: determiner;
S': clause; Comp: complement;
{}: optional;
*: the phrase can appear any number of times, including zero.
Lexical entries of the words in the sentence - She saw stars
C-structure of the sentence - She saw stars
Finally, the f-structure is the set of attribute-value pairs, represented as shown in the sketch below.
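The following is a reconstruction in standard LFG notation (a sketch; the original figures are not reproduced here, so the exact feature names are assumptions):

Lexical entries:
she:   Pron, PRED = 'pro', NUM = SG, PERS = 3, GEND = FEM
saw:   V, PRED = 'see<SUBJ, OBJ>', TENSE = PAST
stars: N, PRED = 'star', NUM = PL

C-structure (the parse tree, written as a labelled bracketing):
[S [NP [N she]] [VP [V saw] [NP [N stars]]]]

F-structure (attribute-value pairs):
[ PRED  'see<SUBJ, OBJ>'
  TENSE PAST
  SUBJ  [ PRED 'pro', NUM SG, PERS 3, GEND FEM ]
  OBJ   [ PRED 'star', NUM PL ] ]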
4 (a) Describe the Paninian framework for Indian languages. Explain the layered representation of Paninian [06] CO1 L2
Grammar and Karaka theory.
Paninian grammar (PG) was written by Panini in 500 BC in Sanskrit. The PG framework can be used for other
Indian languages and even for some other Asian languages. Unlike English, these languages are SOV
(Subject-Object-Verb) ordered and inflectionally rich. The inflections provide important syntactic and semantic
information for language analysis and understanding. The classical Paninian grammar facilitates the task of
obtaining the semantics through a syntactic framework. PG provides an extensive treatment of phonology,
morphology, syntax, and semantics.
Layered representation in panini grammar:
The Paninian Grammar (PG) framework is said to be syntactico-semantic, that is, one can go from the surface
layer to deep semantics by passing through intermediate layers. PG works at various levels of language analysis
to arrive at the meaning of the sentence from the hearer's perspective. To achieve the desired meaning, the
grammatical analysis is divided internally into the levels shown in the figure below.
fig: Layered Representation in PG
Semantic Level:
• Represents the speaker’s actual intention, that is, his real thought for the sentence.
Surface Level:
• Surface level is the actual string or the sentence. It captures the written or the spoken sentences as it is.
Vibhakti Level:
• Vibhakti is the word suffix, which helps to identify the participants, the gender, and the form of the
word.
• The vibhakti level is purely syntactic. At this level, case endings (or postpositions) are used to form
local word groups: a noun group contains a noun or pronoun together with its case markers.
• Vibhakti for verbs includes the verb form and the auxiliary verbs.
• Vibhakti gives the Tense, Aspect, and Modality details of the word, popularly known as TAM.
Karaka Level:
• At the Karaka level, the relation of the participant noun, in the action, to the verb is determined.
• Karaka relations are Syntactico-semantic.
• These relations are established between the verb and the other constituent nouns present in the
sentence. Through these relations, the karakas try to capture information from the semantics of the
text.
• The karaka level processes the semantics of the language but represents it at the syntactic level. Hence
it acts as a bridge between the semantic and syntactic analysis of a language.
Karaka Theory:
• The etymological meaning of the word Karaka is ‘one who does something’, i.e. one who performs an
action.
• The Karaka and the Kriya, i.e. the cases and verb are bounded with the sense of mutual requirement.
• The one who performs an action, accepts an action, or otherwise helps to perform an action is known
as a Karaka.
• There is a mutual expectancy between the action, i.e. the Kriya, and the adjuncts, i.e. the Karakas.
• The presence of one calls for the existence of the other. In other words, Kriya and Karaka are mutually
dependent.
Various Karakas -
1. Karta - subject
2. Karma - object
3. Karana - instrument
4. Sampradana - beneficiary
5. Apadana - separation
6. Adhara or Adhikarana - locus
7. Sambandh - relation
8. Tadarthya - purpose
The Karta, Karma, and Karana are considered the foremost Karakas while Sampradana, Apadana, and
Adhikarana Karakas are known as the influenced Karakas.
Karta Karaka:
The Karta Karaka is the premier one with respect to the action; it performs the action independently, of its own
accord. The action indicated in a sentence depends entirely upon the Karta Karaka. The activity either resides in
or arises from the Karta only.
Eg. Tiger killed the goat
Tiger - karta.
Karma Karaka:
Karma is the aashraya (locus) of the result. That which is most desired by the Karta through the action is the
Karma Karaka. When the Karta carries out an activity, the result of that activity rests in the Karma. As the
Karma (object) is the basis of the outcome of the primary action, it is one of the most prominent Karakas.
Eg. Tiger killed the goat.
goat is karma.
Karana Karaka:
The Karaka which helps in carrying out the Kriya is known as the Karana Karaka. The Karana helps in attaining
the desired result ascertained by the Kriya; it is the most important tool by means of which the action is
achieved. The Karana is the only direct participant in the action, and the Karta and Karma also depend directly
on the Karana for performing the action. Only when the Karana executes its auxiliary actions is the main action
executed by the Karta. This is why the Karana is considered the efficient means of accomplishing the action.
Example:
The man cut the wood with an axe.
Ram cuts the apple with a knife.
Karana: axe, knife
Sampradana Karaka:
The word Sampradana can be interpreted as 'he to whom something is given properly'. The Sampradana Karaka
receives, or benefits from, the action. It can also be said that the person or object for which the Karma is
intended is the Sampradana. In this regard, the Sampradana is the final destination of the action.
Example:
Dipti gave chocolates to Shambhavi. (Shambhavi is Sampradana)
Ram gave me a book. (me is Sampradana)
He gave flowers for Shambhavi. (Shambhavi is Sampradana)
Apadana Karaka:
About the Apadana Karaka, Panini stated that when separation is effected by a verbal action, the point of
separation is called Apadana. During the execution of the action, whenever separation from a certain entity
takes place, whatever remains unmoved or constant is the Apadana. Thus, the Apadana denotes the starting
point of an action of separation: the entity from which something gets separated.
Example:
Shambhavi tore the page from the book with scissors.
'From the book' is Apadana.
Adhikarana Karaka:
'Adhikarana' is the place or thing which serves as the location of the action existing in the agent or the object.
The Adhikarana is assigned to the locus of the action, i.e. the Kriya. It may indicate the place at which the Kriya
(the action) takes place or the time at which the Kriya is carried out: any action is bounded either by space
(place) or by time.
Example:
‘Yesterday Shambhavi hit the dog with the stick in front of the shop.’
The Karaka annotation of the above sentence can be given as
hit : verb (root)
Yesterday : Kala-Adhikarana (time)
Shambhavi : Agent i.e. Karta
Dog : Karma
Stick : Karana
Shop : Desh-Adhikarana (location i.e. Place)
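For machine processing, such an annotation can be held in a simple data structure. The following is a toy sketch of our own design (the key names merely transliterate the karaka labels above):

# Karaka annotation of the example sentence as a Python dict (illustrative only).
annotation = {
    "verb": "hit",
    "Karta": "Shambhavi",            # agent
    "Karma": "dog",                  # object
    "Karana": "stick",               # instrument
    "Kala-Adhikarana": "yesterday",  # time locus
    "Desh-Adhikarana": "shop",       # place locus
}
print(annotation["Karta"])   # -> Shambhavi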
(b) Identify different karaka’s in the following Hindi sentence “Maan putree ko angan mein haath se roti khilaathi [4] CO1 L3
hai”
Karaka roles (the sentence means 'The mother feeds bread to the daughter in the courtyard with her hand'):
khilaathi - Verb (root)
Maan - Karta
roti - Karma
haath - Karana
putree - Sampradana
angan - Adhikarana
5 Explain Statistical language model and find the probability of the test sentence – P(“They play in a big [10] CO1 L3
garden”) in the following training set using the bi-gram model
<S>There is a big garden.
Children play in the garden.
They play inside beautiful garden. </S>
A statistical language model is a probability distribution P(s) over all possible word sequences (or any other
linguistic unit like words, sentences, paragraphs, documents, or spoken utterances). A number of statistical
language models have been proposed in the literature. The dominant approach in modeling is the n-gram
model:
n-gram Model:
The goal of a statistical language model is to estimate the probability of a sentence. This is achieved by
decomposing sentence probability into a product of conditional probabilities using the chain rule as follows:
P(s) = P(w1, w2, w3, ..., wn)
     = P(w1) P(w2|w1) P(w3|w1 w2) P(w4|w1 w2 w3) ... P(wn|w1 w2 ... wn-1)
     = ∏ (i = 1 to n) P(wi | hi)
where hi, the history of the word wi, is the sequence w1 w2 ... wi-1.
To find the sentence probability, we need to calculate the probability of a word given the sequence of words
preceding it. The n-gram model simplifies the task by approximating the probability of a word given all the
previous words by the conditional probability given only the previous n-1 words; that is, it models language as
a Markov model of order n-1. A model that limits the history to the previous one word only is termed a bi-gram
(n = 2) model, and a model that conditions the probability of a word on the previous two words is called a
tri-gram (n = 3) model. Using bi-gram and tri-gram estimates, the probability of a sentence can be calculated as
P(s) ≈ ∏ (i = 1 to n) P(wi | wi-1)          (bi-gram)
and
P(s) ≈ ∏ (i = 1 to n) P(wi | wi-2 wi-1)     (tri-gram)
A special word (pseudoword) <s> is introduced to mark the beginning of the sentence in bi-gram estimation. The
probability of the first word in the sentence is conditioned on <s>. Similarly, in tri-gram estimation, two pseudo words
<s1> <s2> are introduced.
Estimation of probabilities is done by training the n-gram model on the training corpus. n-gram parameters are
estimated using the Maximum Likelihood Estimation (MLE) technique, that is, using relative frequencies. We count a
particular n-gram in the training corpus and divide it by the sum of all n-grams that share the same prefix.
P(wi | wi-n+1, ..., wi-1) = C(wi-n+1, ..., wi-1, wi) / C(wi-n+1, ..., wi-1)
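For example, on the training set below, P(garden | big) = C(big garden) / C(big) = 1/1 = 1, and
P(in | play) = C(play in) / C(play) = 1/2.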
Training Set:
<S> There is a big garden. </S>
<S> Children play in the garden. </S>
<S> They play inside a beautiful garden. </S>
Test Sentence:
They play in a big garden
Bi-gram model:
P(They play in a big garden)
= P(They | <S>) *
P(play | They) *
P(in | play) *
P(a | in) *
P(big | a) *
P(garden | big)
= 1/3 * 1/1 * 1/2 * 0/1 * 1/2 * 1/1
= 0
The bi-gram 'in a' never occurs in the training set, so the unsmoothed estimate of the sentence probability is zero.
Applying Add-one Smoothing:
Number of unique words in the training set, V = 12
P(They play in a big garden)
= (1+1)/(3+12) * (1+1)/(1+12) * (1+1)/(2+12) * (0+1)/(1+12) * (1+1)/(2+12) * (1+1)/(1+12)
= 2/15 * 2/13 * 2/14 * 1/13 * 2/14 * 2/13
≈ 0.133 * 0.154 * 0.143 * 0.077 * 0.143 * 0.154
≈ 4.95 * 10^-6
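The whole computation can be checked with a short script (a minimal sketch of our own, assuming plain Python; <s> and </s> mark the sentence boundaries):

# Bi-gram model with add-one smoothing over the training set above.
from collections import Counter

sentences = [
    "<s> there is a big garden </s>",
    "<s> children play in the garden </s>",
    "<s> they play inside a beautiful garden </s>",
]

unigrams = Counter()
bigrams = Counter()
for sent in sentences:
    words = sent.split()
    unigrams.update(words[:-1])           # history counts (exclude </s>)
    bigrams.update(zip(words, words[1:]))

# Vocabulary size V excludes the <s>/</s> pseudo-words (V = 12 here).
V = len({w for s in sentences for w in s.split()} - {"<s>", "</s>"})

def p_add_one(w, prev):
    # Add-one smoothed bi-gram probability P(w | prev)
    return (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)

test = "<s> they play in a big garden".split()
prob = 1.0
for prev, w in zip(test, test[1:]):
    prob *= p_add_one(w, prev)
print(f"P(test sentence) = {prob:.3e}")   # -> about 4.95e-06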
6 What is morphological parsing? Explain the two-level morphological model with an example. Write a simple [10] CO2 L2
FST for mapping English nouns.
Morphology is a sub-discipline of linguistics. It studies word structure and the formation of words from smaller
units (morphemes). The goal of morphological parsing is to discover the morphemes that build a given word.
Morphemes are the smallest meaning-bearing units in a language. The morphological parser should be able
to tell us that the word ‘eggs’ is the plural form of the noun stem ‘egg’.
Morphological parsing takes as input the inflected surface form of each word in a text and produces a parsed
form consisting of the canonical form (or lemma) of the word and a set of tags showing its syntactic category
and morphological characteristics, e.g., possible part of speech and/or inflectional properties (gender, number,
person, tense, etc.).
Two-level Morphological Model:
The two-level morphological model was proposed by Koskenniemi (1983) and can be used even for highly
inflected languages. In this model, a word is represented as a correspondence between its lexical level form and
its surface level form.
1) The surface level represents the actual spelling of the word.
2) The lexical level form represents the concatenation of its constituent morphemes
Morphological parsing is viewed as a mapping from the surface level into morpheme and feature sequences on
the lexical level.
Example:
1. The surface form 'playing' is represented in the lexical form as play + V + PP.
The lexical form consists of the stem 'play' followed by the morphological information +V +PP, which tells us
that 'playing' is the present participle form of the verb.
surface level: playing
lexical: play + V + PP
2. books
surface level: books
lexical level: book +N +PL
where the first component is the stem (book) and the second component (+N +PL) is the morphological
information, which tells us that the surface level form is a plural noun.
This model is usually implemented with a kind of finite-state automaton called a finite-state transducer (FST).
As shown in the figure, two steps are involved in the conversion from the surface form of a word to its
morphological analysis.
First Step:
Split the words into possible components
Ex: birds = bird + s, boxes = box + s (or) boxe + s
Second Step:
Use a lexicon to look up the categories of the stems and the meaning of the affixes.
Ex: bird + s = bird + N + PL
box + s = box + N + PL
We can now see that 'boxe' is not a legal stem.
spouses = spouse + s ✔
parses = parse + s ✔
Orthographic rules are used to handle spelling variations. One such rule says:
Insert e after -s, -z, -x, -ch, -sh before the plural s.
e.g., dish -> dishes, box -> boxes
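This rule is easy to express with a regular expression (a toy sketch of our own, not the textbook's transducer):

import re

# E-insertion: add 'e' after -s, -z, -x, -ch, -sh before the plural 's'.
def pluralize(stem):
    if re.search(r"(s|z|x|ch|sh)$", stem):
        return stem + "es"
    return stem + "s"

print(pluralize("box"))    # -> boxes
print(pluralize("dish"))   # -> dishes
print(pluralize("bird"))   # -> birds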
The two steps are implemented with the help of transducers. Thus, we need two transducers:
1. One to map the surface form to the intermediate form
2. One to map the intermediate form to the lexical form
Develop an FST-based morphological parser for singular and plural nouns in English.
The plural form of regular nouns usually ends with -s or -es. However, a word ending with 's' need not
necessarily be a plural form (e.g., miss, ass). One of the required transductions is the deletion of 'e' when
introducing a morpheme boundary; this deletion is required for words ending in -xes, -ses, and -zes.
e.g., suffixes, boxes
The transducer shown in the figure below maps English nouns to the intermediate form
The figure below shows the possible sequences of states that the transducer undergoes for the surface forms
birds and boxes as input.
The second step is to develop a transducer that does the mapping from the intermediate level to the lexical
level.
The input to the transducer has one of the following forms:
1. Regular noun stem, Eg. bird, cat
2. Regular noun stem + s, eg. Bird + s
3. Singular irregular noun stem, eg. Goose
4. Plural irregular noun stem, eg. Geese
1. Regular noun stem (e.g., bird, cat):
The transducer maps all symbols of the stem to themselves and then outputs N and SG.
2. Regular noun stem + s (e.g., bird + s):
The transducer maps all symbols of the stem to themselves, then outputs N and replaces the plural morpheme
's' with PL.
3. Singular irregular noun stem (e.g., goose):
The transducer maps all symbols of the stem to themselves and then outputs N and SG.
4. Plural irregular noun stem (e.g., geese):
The transducer maps the irregular plural noun stem to the corresponding singular stem (e.g., geese to goose)
and then outputs N and PL.
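The two-step mapping can be imitated in a few lines of Python (a toy sketch of our own design, not the textbook's FST; '^' marks the morpheme boundary in the intermediate form):

import re

IRREGULAR = {"geese": "goose", "mice": "mouse"}   # plural -> singular stem

def surface_to_intermediate(word):
    # First transducer: split off the plural morpheme.
    if word in IRREGULAR:
        return word                      # handled by the second step
    if re.search(r"(s|z|x|ch|sh)es$", word):
        return word[:-2] + "^s"          # boxes -> box^s (e-deletion)
    if word.endswith("s"):
        return word[:-1] + "^s"          # birds -> bird^s
    return word

def intermediate_to_lexical(inter):
    # Second transducer: look up the stem and emit feature tags.
    if inter in IRREGULAR:
        return IRREGULAR[inter] + " +N +PL"   # geese -> goose +N +PL
    if inter.endswith("^s"):
        return inter[:-2] + " +N +PL"         # bird^s -> bird +N +PL
    return inter + " +N +SG"                  # goose -> goose +N +SG

for w in ["birds", "boxes", "goose", "geese"]:
    print(w, "->", intermediate_to_lexical(surface_to_intermediate(w)))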