NATURAL LANGUAGE PROCESSING (NLP)
CSE 4022
INTRODUCTION
Module 1
TEXT BOOKS
1. Daniel Jurafsky and James H. Martin, “Speech and Language Processing”,
2nd edition, Prentice Hall, 2009.
Reference Books
1. Chris Manning and Hinrich Schütze, “Foundations of Statistical Natural
Language Processing”, 2nd edition, MIT Press, Cambridge, MA, 2003.
2. Nitin Indurkhya and Fred J. Damerau, “Handbook of Natural Language
Processing”, 2nd edition, CRC Press, 2010.
3. James Allen, “Natural Language Understanding”, 8th edition, Pearson,
2012.
Basics
• NLP is about everything we do using language:
  we speak, we write, we read.
• NLP is an integral part of AI (alongside ML).
• NLP is driven by advances in AI.
Introduction
• Natural language processing (NLP) is a subfield of computer
science, information engineering, and artificial intelligence concerned with
the interactions between computers and human (natural) languages, in
particular how to program computers to process and analyze large amounts
of natural language data.
AI and ML
AI - The theory and development of computer systems able to perform tasks
that normally require human intelligence, such as visual perception, speech
recognition, decision making and translation between languages.
ML - Machine learning is a subset of AI that uses statistical techniques to give
computers the ability to learn from data without being explicitly programmed.
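To make "learning from data without being explicitly programmed" concrete, here is a minimal sketch of a Naive Bayes text classifier. Naive Bayes is chosen only for illustration and is not named in the slides; the training sentences and the names `train` and `predict` are hypothetical. No word is hard-coded as positive or negative: the program only counts words in labelled examples.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest Naive Bayes log-score."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Log prior: how common the label is in the training data.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training data (assumption, not from the course material).
TRAINING_DATA = [
    ("great movie loved it", "positive"),
    ("wonderful acting great story", "positive"),
    ("terrible plot hated it", "negative"),
    ("boring and terrible acting", "negative"),
]

model = train(TRAINING_DATA)
print(predict("great story", *model))
print(predict("boring plot", *model))
```

Changing the behaviour means changing the training data, not the code, which is the contrast with hand-written rules.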
Introduction
• NLP describes the ability of computers to understand human speech as it is
spoken.
• NLP is a branch of artificial intelligence with important implications for
the ways that computers and humans interact.
• Machine learning has helped computers parse the ambiguity of human language.
History of NLP
• The history of NLP is generally traced to 1950, when Alan Turing published an article
titled "Computing Machinery and Intelligence", which proposed what is now called the
Turing test as a criterion of intelligence.
• The Georgetown experiment in 1954 involved fully automatic translation of more than
sixty Russian sentences into English.
• Up to the 1980s, most natural language processing systems were based on complex sets
of hand-written rules.
• In the 2010s, representation learning and deep neural network-style machine learning
methods became widespread in natural language processing.
Need of NLP
• Analysing very large amounts of data
• Identifying various languages
• Quantitative analysis: analysing and understanding the data
• Classification
Components of NLP
Natural Language Understanding (NLU)
• Mapping the given input in natural language into useful representations.
• Analyzing different aspects of the language.
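As an illustration of "mapping the input into a useful representation", the sketch below turns one kind of English request into an intent-and-slots dictionary. The pattern, the intent name `get_weather` and the function `understand` are all hypothetical; a real NLU component would need far broader coverage than a single regular expression.

```python
import re

# Hypothetical pattern covering a single kind of request (assumption).
WEATHER_PATTERN = re.compile(
    r"weather in (?P<city>[A-Za-z ]+?)(?: (?P<day>today|tomorrow))?\??$",
    re.IGNORECASE,
)

def understand(utterance):
    """Map a natural-language utterance to a structured representation."""
    match = WEATHER_PATTERN.search(utterance.strip())
    if match is None:
        # Anything the pattern does not cover is left uninterpreted.
        return {"intent": "unknown", "slots": {}}
    slots = {"city": match.group("city").strip()}
    if match.group("day"):
        slots["day"] = match.group("day").lower()
    return {"intent": "get_weather", "slots": slots}

print(understand("What is the weather in New Delhi tomorrow?"))
print(understand("Play some music"))
```

The output dictionary is the "useful representation": a downstream program can act on the intent and slots without looking at the original wording.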
Natural Language Generation (NLG)
It is the process of producing meaningful phrases and sentences in the form of
natural language from some internal representation.
• Text planning − It includes retrieving the relevant content from the knowledge
base.
• Sentence planning − It includes choosing the required words, forming
meaningful phrases, and setting the tone of the sentence.
• Text realization − It is mapping the sentence plan into sentence structure.
• NLU is generally considered harder than NLG.
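The three NLG steps listed above can be sketched as a small template-based pipeline. This is a simplified illustration under stated assumptions: the knowledge base contents and the function names `plan_text`, `plan_sentence`, `realize` and `generate` are hypothetical, and real systems use much richer planning.

```python
# Hypothetical knowledge base (assumption, for illustration only).
KNOWLEDGE_BASE = {
    "Chennai": {"condition": "sunny", "temperature_c": 34},
    "Shimla": {"condition": "snowy", "temperature_c": -2},
}

def plan_text(city):
    """Text planning: retrieve the relevant content from the knowledge base."""
    return {"city": city, **KNOWLEDGE_BASE[city]}

def plan_sentence(content, formal=True):
    """Sentence planning: choose words, form phrases, set the tone."""
    subject = f"the weather in {content['city']}" if formal else content["city"]
    verb = "is" if formal else "looks"
    complement = (
        f"{content['condition']} at {content['temperature_c']} degrees Celsius"
    )
    return {"subject": subject, "verb": verb, "complement": complement}

def realize(plan):
    """Text realization: map the sentence plan onto a sentence structure."""
    sentence = f"{plan['subject']} {plan['verb']} {plan['complement']}"
    return sentence[0].upper() + sentence[1:] + "."

def generate(city, formal=True):
    """Run the three stages in order."""
    return realize(plan_sentence(plan_text(city), formal=formal))

print(generate("Chennai"))
print(generate("Shimla", formal=False))
```

Keeping the stages separate means the tone can change (the `formal` flag) without touching content retrieval or the final sentence assembly.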
Challenges
• Speech recognition
• Natural language understanding
• Natural language generation
NLP considers the hierarchical structure of language:
several words make a phrase, several phrases make a sentence, and sentences
convey ideas.
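The word, phrase and sentence hierarchy can be represented as a nested structure. The sketch below assumes a hand-built tree for one example sentence; the labels S, NP and VP (sentence, noun phrase, verb phrase) and the helper names `words` and `phrases` are illustrative choices, not part of the course material.

```python
# Each node is (label, child, child, ...); a child is a sub-node or a word.
# Hand-built example tree (assumption, not produced by a parser).
TREE = (
    "S",
    ("NP", "the", "student"),
    ("VP", "reads", ("NP", "a", "book")),
)

def words(node):
    """Collect the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node[1:]:
        result.extend(words(child))
    return result

def phrases(node):
    """List (label, text) for the node and every phrase nested inside it."""
    if isinstance(node, str):
        return []
    found = [(node[0], " ".join(words(node)))]
    for child in node[1:]:
        found.extend(phrases(child))
    return found

print(words(TREE))
for label, text in phrases(TREE):
    print(label, "->", text)
```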
Stages in NLP
• Morphological Analysis − Individual words are analyzed into their components.
• Syntactic Analysis − Linear sequences of words are transformed into structures
that show how the words relate to each other.
• Semantic Analysis − A transformation is made from the input text to an internal
representation that reflects the meaning.
• Discourse Analysis − Resolving references between sentences.
• Pragmatic Analysis − To reinterpret what was said to what was actually meant.
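As a small illustration of the first stage, morphological analysis, the sketch below splits a word into a stem and a suffix using a short suffix list. This is a deliberately naive approach chosen for brevity: the suffix list, the minimum stem length and the function name `analyze` are assumptions, and it will mis-handle many words (for example, doubled consonants as in "running").

```python
# Hypothetical suffix list, checked in order (assumption).
SUFFIXES = ("ing", "ed", "es", "s")
MIN_STEM_LENGTH = 3  # avoids splitting short words such as "is" or "red"

def analyze(word):
    """Split a word into (stem, suffix); suffix is "" if none applies."""
    for suffix in SUFFIXES:
        stem_length = len(word) - len(suffix)
        if word.endswith(suffix) and stem_length >= MIN_STEM_LENGTH:
            return word[:stem_length], suffix
    return word, ""

for token in "the student walked reading books".split():
    print(token, "->", analyze(token))
```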
NLP stages (top to bottom)
• Discourse
• Pragmatics
• Semantics
• Syntax
• Morphology
* We can move up and down between the stages and combine steps.
* Every step is equally complex.