Natural Language Processing: Stages, Ambiguity,
Applications, and Challenges
August 8, 2025
Contents
1 Stages of the NLP Process
1.1 Lexical Analysis
1.2 Syntactic Analysis (Parsing)
1.3 Semantic Analysis
1.4 Discourse Integration
1.5 Pragmatic Analysis
1.6 Conclusion
2 Ambiguity in Natural Language
2.1 Lexical Ambiguity
2.2 Syntactic Ambiguity
2.3 Semantic Ambiguity
2.4 Metonymy Ambiguity
2.5 Conclusion
3 Applications of Natural Language Processing (NLP)
3.1 Machine Translation
3.2 Speech Recognition
3.3 Speech Synthesis
3.4 Information Retrieval
3.5 Information Extraction
3.6 Question Answering
3.7 Text Summarization
3.8 Sentiment Analysis
3.9 Conclusion
4 Challenges of Natural Language Processing (NLP)
4.1 Contextual Words and Phrases & Homonyms
4.2 Synonyms
4.3 Irony and Sarcasm
4.4 Ambiguity
4.5 Errors in Text or Speech
4.6 Idioms and Slang
4.7 Domain-Specific Language
4.8 Low-Resource Languages
4.9 Conclusion
1 Stages of the NLP Process
Natural Language Processing (NLP) is the technology used by machines to understand,
interpret, and respond to human languages. It involves a sequence of stages to convert
unstructured language into structured and meaningful data.
The five major stages in NLP are:
1.1 Lexical Analysis
This is the first stage of NLP and is also known as Morphological Analysis.
It involves identifying and analyzing the structure of words.
The text is broken down into tokens, which can be words, numbers, or punctuation
marks.
It helps in understanding the root form of words, identifying prefixes, suffixes, and
stems.
Example:
The word unbelievable is broken into:
→ un (prefix) + believe (root) + able (suffix)
Purpose: Helps machines understand the smallest meaningful units in text.
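The tokenization and affix-splitting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the affix lists PREFIXES and SUFFIXES and the helper names tokenize and split_affixes are hypothetical, and real morphological analyzers use far richer rules and dictionaries.

```python
import re

# Hypothetical affix lists, chosen only to cover the example in the text.
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("able", "ing", "ed")


def tokenize(text):
    """Split text into word, number, and punctuation tokens."""
    return re.findall(r"[A-Za-z]+|\d+|[^\w\s]", text)


def split_affixes(word):
    """Peel at most one known prefix and one known suffix off a word."""
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES if rest.endswith(s)), "")
    stem = rest[:len(rest) - len(suffix)] if suffix else rest
    return prefix, stem, suffix


tokens = tokenize("It cost 20 dollars, unbelievable!")
# ['It', 'cost', '20', 'dollars', ',', 'unbelievable', '!']
parts = split_affixes("unbelievable")
# ('un', 'believ', 'able')
```

Note that plain string splitting yields the stem believ rather than the root believe, because the final e is dropped before -able. Mapping the stem back to its dictionary form is a separate step (lemmatization) that this sketch does not attempt.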
1.2 Syntactic Analysis (Parsing)
This stage checks the grammatical structure of the sentence.
It involves analyzing whether the words in a sentence are in proper order and follow
the rules of grammar.
It uses parse trees to represent sentence structure.
Common parsing techniques include top-down parsing, bottom-up parsing, and the CYK
(Cocke-Younger-Kasami) algorithm.
Example:
Incorrect: The school goes to girl.
Correct: The girl goes to school.
→ The first sentence will be rejected by the syntactic analyzer due to incorrect word
arrangement.
Purpose: Ensures that the sentence is grammatically valid.
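As a rough illustration of how a parser can accept one word order and reject another, here is a minimal CYK recognizer. The grammar is a hypothetical toy written for this example, not a real grammar of English: it lets school stand alone as a noun phrase (as in "goes to school") but requires a determiner before girl. The names BINARY, LEXICON, and cyk_accepts are assumptions made for the sketch.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form, written for this example.
# Binary rules map a pair of child symbols to the parents that can produce it.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "PP"): {"VP"},
    ("P", "NP"): {"PP"},
}
# Lexical rules map a word to its possible categories. "school" may stand
# alone as a noun phrase ("goes to school"); "girl" needs a determiner.
LEXICON = {
    "the": {"Det"},
    "girl": {"N"},
    "school": {"N", "NP"},
    "goes": {"V"},
    "to": {"P"},
}


def cyk_accepts(sentence):
    """Return True if the toy grammar derives the sentence from S."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the symbols that can span words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, ()))
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            end = start + length - 1
            for split in range(start, end):
                left, right = table[start][split], table[split + 1][end]
                for pair in product(left, right):
                    table[start][end] |= BINARY.get(pair, set())
    return "S" in table[0][n - 1]


print(cyk_accepts("The girl goes to school."))   # True
print(cyk_accepts("The school goes to girl."))   # False
```

Under this toy grammar the second sentence fails because girl cannot form a noun phrase without a determiner, so no prepositional phrase, and therefore no verb phrase, can be built.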
1.3 Semantic Analysis
In this stage, the system focuses on the meaning of the words and sentences.
It maps syntactic structures to real-world concepts to check meaningfulness.
It removes semantic ambiguity by analyzing word senses and context.
Example:
Sentence: Hot ice-cream
→ Though grammatically correct, semantically it is contradictory and may be rejected
as illogical.
Purpose: Ensures that the sentence conveys a valid and meaningful message.
1.4 Discourse Integration
This stage considers the meaning of a sentence in relation to the previous and following
sentences.
It helps in resolving references like he, she, it, which depend on context.
It brings coherence to a sequence of sentences and builds continuity in dialogue or
text.
Example:
Meena is a girl. She goes to school.
→ The word she is understood to refer to Meena using discourse-level understanding.
Purpose: Maintains context across multiple sentences and ensures consistent mean-
ing.
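A very naive way to picture reference resolution is to link each pronoun to the most recently mentioned name of matching gender. The sketch below assumes a hypothetical hand-made lexicon (NAME_GENDER, PRONOUN_GENDER); practical coreference systems use much richer features and learned models.

```python
import re

# Hypothetical mini-lexicon: grammatical gender for the names we expect.
NAME_GENDER = {"Meena": "f", "John": "m", "Mary": "f"}
PRONOUN_GENDER = {"she": "f", "her": "f", "he": "m", "him": "m"}


def resolve_pronouns(text):
    """Link each pronoun to the most recently mentioned matching name."""
    last_seen = {}  # gender -> most recent name with that gender
    links = []
    for token in re.findall(r"[A-Za-z]+", text):
        if token in NAME_GENDER:
            last_seen[NAME_GENDER[token]] = token
        elif token.lower() in PRONOUN_GENDER:
            antecedent = last_seen.get(PRONOUN_GENDER[token.lower()])
            links.append((token, antecedent))
    return links


links = resolve_pronouns("Meena is a girl. She goes to school.")
# [('She', 'Meena')]
```

A pronoun with no earlier matching name is linked to None, which shows how much this heuristic depends on the order in which entities are mentioned.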
1.5 Pragmatic Analysis
This is the final stage where the focus is on the real-world meaning and intent behind
the sentence.
It interprets what the speaker actually meant, considering situational context and
world knowledge.
It handles ambiguity, sarcasm, indirect meaning, and implied messages.
Example:
John saw Mary in a garden with a cat.
→ Ambiguity: Is the cat with Mary or with John?
→ Only with pragmatic understanding and real-world logic can the sentence be
interpreted correctly.
Purpose: Determines the speaker's intention and the real context beyond the literal words.
1.6 Conclusion
All five stages (Lexical Analysis, Syntactic Analysis, Semantic Analysis, Discourse
Integration, and Pragmatic Analysis) work together to allow machines to understand
human language. Each stage plays a crucial role in transforming raw text into useful,
structured, and meaningful data.
2 Ambiguity in Natural Language
Natural language has a very rich form and structure, but it is also highly ambiguous.
Ambiguity means that a sentence, phrase, or word can have more than one possible
interpretation. In NLP, ambiguity creates a major challenge because machines cannot
easily decide which meaning is correct without using context or additional information.
Any sentence in a language with a large-enough grammar can have multiple valid
interpretations. Humans resolve ambiguity naturally, but machines must rely on linguistic
rules, statistical models, and contextual understanding.
The main types of ambiguity in natural language are as follows:
2.1 Lexical Ambiguity
Lexical ambiguity occurs when a single word can have multiple meanings.
For example, the word back can be used as:
• Noun: the back of the stage
• Adjective: the back door
In such cases, the correct meaning can only be identified by looking at the surrounding
words in the sentence. In NLP, lexical ambiguity is resolved using Word Sense Disam-
biguation (WSD) techniques.
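One classic WSD idea is the simplified Lesk approach: choose the sense whose dictionary gloss shares the most words with the sentence. The sketch below uses the word bank with two hand-written glosses; the sense inventory SENSES, the stopword list, and the function names are all hypothetical and exist only for this illustration.

```python
# Hypothetical sense inventory: each sense has a short hand-written gloss.
SENSES = {
    "bank": {
        "financial institution":
            "an institution that accepts money deposits and makes loans",
        "river bank":
            "the sloping land beside a river or stream of water",
    }
}
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "at", "on",
             "i", "my"}


def content_words(text):
    """Lowercase the text, strip punctuation, and drop stopwords."""
    return {w.strip(".,!?").lower() for w in text.split()} - STOPWORDS


def simplified_lesk(word, sentence):
    """Pick the sense whose gloss shares the most words with the sentence."""
    context = content_words(sentence)
    scores = {
        sense: len(context & content_words(gloss))
        for sense, gloss in SENSES[word].items()
    }
    return max(scores, key=scores.get)


print(simplified_lesk("bank", "I went to the bank to deposit money"))
# financial institution
print(simplified_lesk("bank", "We sat on the bank of the river and watched the water"))
# river bank
```

When no gloss overlaps with the sentence, this sketch simply returns the first listed sense, which is one reason practical systems add sense frequencies or learned context models.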
2.2 Syntactic Ambiguity
Syntactic ambiguity (or structural ambiguity) occurs when a sentence can be parsed in
more than one way according to grammar rules.
For example:
I saw the girl on the beach with my binoculars.
Here, the phrase with my binoculars could describe:
• The act of seeing (I used binoculars to see the girl), or
• The girl herself (the girl had binoculars).
Such ambiguity causes problems for parsing algorithms in NLP, as multiple parse trees
are possible for the same sentence.
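To make the "multiple parse trees" point concrete, the following sketch counts the parse trees that a small grammar assigns to a sentence, using a counting variant of the CYK algorithm. The grammar and lexicon are hypothetical toys (BINARY_RULES, LEXICON), chosen so that a prepositional phrase may attach either to a noun phrase or to a verb phrase.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left, right).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP modifies the action
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP modifies the noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "i": {"NP"}, "saw": {"V"}, "the": {"Det"}, "my": {"Det"},
    "girl": {"N"}, "beach": {"N"}, "binoculars": {"N"},
    "on": {"P"}, "with": {"P"},
}


def count_parses(sentence):
    """Count distinct parse trees rooted in S under the toy grammar."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    # counts[(i, j)][symbol] = number of trees for words[i:j] with that root.
    counts = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word, ()):
            counts[(i, i + 1)][symbol] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    counts[(i, j)][parent] += (
                        counts[(i, k)][left] * counts[(k, j)][right]
                    )
    return counts[(0, n)]["S"]


print(count_parses("I saw the girl with my binoculars."))               # 2
print(count_parses("I saw the girl on the beach with my binoculars."))  # 5
```

The shorter sentence has exactly the two readings listed above. Under this toy grammar the full sentence has five, because on the beach can itself attach in more than one place.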
2.3 Semantic Ambiguity
Semantic ambiguity is related to sentence meaning. Even if the grammatical structure is
correct, the meaning may still be unclear.
For example:
I saw the girl on the beach with my binoculars.
This can mean:
• I saw the girl through my binoculars, or
• The girl had my binoculars with her.
In NLP, resolving semantic ambiguity requires context analysis and semantic role
labelling to determine the intended relationships between words.
2.4 Metonymy Ambiguity
Metonymy ambiguity occurs when the intended meaning of a phrase is different from its
literal meaning.
For example:
Nokia is screaming for new management.
This does not mean the company is literally screaming; it means the company urgently
needs new management. Figurative expressions like this are common in human language
and require cultural or domain-specific knowledge to interpret correctly.
2.5 Conclusion
Ambiguity is one of the most important challenges in NLP because it can occur at multiple
levels: lexical, syntactic, semantic, and figurative language. Resolving ambiguity requires
advanced algorithms, statistical models, and context-aware AI systems. While complete
removal of ambiguity is not possible, modern NLP techniques greatly reduce its impact,
enabling more accurate translation, question answering, and information retrieval.
3 Applications of Natural Language Processing (NLP)
Natural Language Processing (NLP) deals with the complete linguistic analysis of nat-
ural language as well as linguistic generation of output sentences. There has been vast
progress in the NLP field, and it is now applied in many real-world scenarios. The major
applications of NLP are as follows:
3.1 Machine Translation
Machine Translation is the process where the text in one human language is automatically
translated into another human language.
For performing translation, it is important to have knowledge of:
• Words and phrases of both languages
• Grammar of both languages
• Semantics of both languages
• Contextual meaning of the words
Example: Translating an English sentence into Marathi using Google Translate.
3.2 Speech Recognition
Speech Recognition is the process where acoustic speech signals are mapped to a set of
words.
Pronunciation varies widely with accent and dialect, and recognition is further complicated
by homophones (e.g., sea and see) and acoustic ambiguities (e.g., rest and interest).
NLP helps in recognising speech accurately despite such variations.
3.3 Speech Synthesis
Speech Synthesis is the automatic production of speech in natural language from text
input.
It means converting written text into spoken words.
Examples include reading emails aloud on a phone or reading storybooks automatically.
For generating utterances, text processing is required, making NLP an important com-
ponent in speech synthesis systems.
3.4 Information Retrieval
Information Retrieval is the process of finding documents or resources relevant to a user's
query.
Techniques involved include:
• Indexing
• Query modification
• Word sense disambiguation
• Use of lexical resources (WordNet, Longman Dictionary of Contemporary English)
These help in improving the performance of search and retrieval tasks.
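Indexing, the first technique listed above, is often implemented as an inverted index: a map from each term to the set of documents that contain it. The sketch below is a minimal version with boolean AND search; the sample documents and the names build_index and search are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term.strip(".,!?")].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = [t.strip(".,!?") for t in query.lower().split()]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical three-document collection.
docs = {
    1: "Machine translation converts text between languages.",
    2: "Speech recognition maps speech signals to text.",
    3: "Sentiment analysis classifies opinions.",
}
index = build_index(docs)
print(search(index, "text"))         # {1, 2}
print(search(index, "speech text"))  # {2}
```

Real retrieval systems add ranking (for example by term weighting) on top of this lookup; the sketch only shows the indexing step.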
3.5 Information Extraction
Information Extraction is used for extracting structured information from unstructured
or semi-structured, machine-readable documents.
It captures factual information from within a document and outputs it in a structured
format.
Unlike Information Retrieval, it does not just fetch documents; it finds specific informa-
tion that fits predefined templates.
Example: Extracting names, dates, and places from a news article.
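A small sketch of template filling: the code below pulls dates and place names out of free text and returns them in a structured record. It assumes dates are written in "Month D, YYYY" form and relies on a hypothetical place list (KNOWN_PLACES); real systems typically use trained named-entity recognizers instead of fixed patterns.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Matches dates written like "August 8, 2025".
DATE_RE = re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b")
# Hypothetical gazetteer of place names known to the system.
KNOWN_PLACES = {"Mumbai", "Pune", "Delhi"}


def extract(text):
    """Fill a simple template with the dates and places found in text."""
    places = sorted(
        p for p in KNOWN_PLACES if re.search(rf"\b{re.escape(p)}\b", text)
    )
    return {"dates": DATE_RE.findall(text), "places": places}


record = extract(
    "The conference opened in Pune on August 8, 2025 and moves to Mumbai next year."
)
# {'dates': ['August 8, 2025'], 'places': ['Mumbai', 'Pune']}
```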
3.6 Question Answering
Question Answering systems attempt to find exact answers to user queries rather than
returning full documents.
They make use of Information Retrieval and Information Extraction along with deeper
NLP processing, including:
• Understanding the question
• Analysing portions of the text
• Using semantic and background knowledge to locate answers
Example: A QA system answering "Who wrote Hamlet?" directly with "William
Shakespeare".
3.7 Text Summarization
Text Summarization creates a short, correct summary of longer text documents.
Automatic text summarization reduces the time needed to extract useful information
from large documents.
It involves:
• Syntactic analysis
• Semantic analysis
• Discourse-level processing
Example: Summarising a research article into a short abstract.
3.8 Sentiment Analysis
Sentiment Analysis (opinion mining) analyses text to determine the attitude, emotion,
or sentiment expressed by the author.
It uses NLP along with statistical methods to classify text as positive, negative, or neutral
and can also detect emotions such as happiness, sadness, or anger.
Example: Analysing social media posts to measure public opinion about a product.
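The simplest form of this classification is a lexicon-based scorer that counts positive and negative words. The word lists below are hypothetical and tiny; statistical and neural classifiers trained on labelled data generally perform much better, so treat this only as a sketch of the idea.

```python
# Hypothetical sentiment lexicons, far smaller than any real resource.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}


def classify(text):
    """Label text by comparing counts of positive and negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("I love this phone, the camera is great!"))  # positive
print(classify("Terrible battery and poor screen."))        # negative
print(classify("The box arrived on Monday."))               # neutral
```

A word-counting approach like this cannot handle negation ("not good") or sarcasm, which is one reason sentiment analysis remains difficult.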
3.9 Conclusion
NLP applications include a wide range of tasks, from translating languages to answer-
ing questions, generating summaries, and analysing emotions. These applications make
human-computer interaction more natural and effective, and they continue to improve as
NLP technology advances.
4 Challenges of Natural Language Processing (NLP)
Natural Language Processing (NLP) is a powerful tool with enormous benefits; how-
ever, there are still several challenges in processing natural languages. Human languages
are highly complex, context-dependent, and constantly evolving, making it difficult for
machines to understand them perfectly.
The major challenges are as follows:
4.1 Contextual Words and Phrases & Homonyms
The same words and phrases can have different meanings depending on the sentence
context.
Example:
• I ran to the store because we ran out of milk.
• Can I run something past you really quick?
• The house is looking really run down.
In each sentence, the meaning of run changes with context.
Homonyms are words that sound or are spelled alike but have different meanings; the
examples below are homophones (same pronunciation, different spelling).
Example: there / their, right / write.
These create problems for tasks like question answering and speech-to-text applications.
4.2 Synonyms
Synonyms cause contextual misunderstanding because different words can express the
same idea.
Some synonyms convey exactly the same meaning, while others have subtle differences.
Example: small, little, tiny, minute all indicate smallness but may vary in tone or
usage.
4.3 Irony and Sarcasm
Irony and sarcasm are difficult for NLP models since the intended meaning is often
opposite to the literal meaning.
Example: Saying "Yeah, right" sarcastically means disbelief, not agreement.
Although models can be trained using context clues and embeddings, detecting sarcasm
remains complex.
4.4 Ambiguity
Ambiguity occurs when a sentence or phrase can have two or more interpretations.
Types include:
• Lexical ambiguity: a word has multiple meanings (bank as a financial institution
or riverbank).
• Syntactic ambiguity: sentence structure allows multiple interpretations.
• Semantic ambiguity: unclear meaning at the sentence level.
Ambiguity makes correct machine interpretation difficult.
4.5 Errors in Text or Speech
Misspelled or misused words cause problems for text analysis.
While autocorrect and grammar checkers can fix common mistakes, they may not always
understand the writer's true intention.
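A common building block for spelling correction is edit (Levenshtein) distance: the number of single-character insertions, deletions, and substitutions needed to turn one word into another. The sketch below suggests the closest words from a hypothetical vocabulary, and shows why distance alone cannot always recover what the writer meant.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming over two rows."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_words(word, vocabulary):
    """Return every vocabulary word at the minimum edit distance."""
    distances = {v: edit_distance(word, v) for v in vocabulary}
    best = min(distances.values())
    return [v for v in vocabulary if distances[v] == best]


# Hypothetical vocabulary for the example.
print(closest_words("thier", ["their", "there", "tree"]))
# ['their', 'there']
```

Both their and there are two edits away from thier, so a corrector must use the surrounding context to choose between them.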
4.6 Idioms and Slang
Informal phrases, idioms, and culture-specific expressions are hard for NLP to understand.
Idioms often have no direct dictionary meaning and may differ between regions.
Cultural slang changes quickly, creating new words every day.
Example: "Break the ice" means to start a conversation, not literally breaking ice.
4.7 Domain-Specific Language
Different industries and fields have their own technical vocabulary (jargon).
An NLP model for healthcare would be very different from one for legal document pro-
cessing.
4.8 Low-Resource Languages
Most NLP systems are built for widely used languages like English or Chinese.
Many regional languages lack large datasets for training, leading to poor accuracy.
Example: Africa is home to well over 1,000 languages (some counts exceed 3,000), but most
have very little digital data.
4.9 Conclusion
NLP faces challenges from multiple factors such as context, ambiguity, informal language,
and lack of resources for many languages. Overcoming these requires advanced AI models,
domain-specific training, and continuous adaptation to language evolution.