1-Introduction To NLP - Part1

The document outlines a course on Natural Language Processing (NLP) for the Spring Semester of 2024-2025, detailing the syllabus, assessment criteria, and fundamental concepts of NLP. It covers various linguistic levels, definitions, and applications of NLP, emphasizing the importance of understanding human language through computational methods. Key topics include the classification of languages, challenges in NLP, and the categories of knowledge necessary for effective language processing.


Year: 2024-2025

Spring Semester

Natural Language Processing
Dr. Wafaa Samy
Dr. Hanaa Eissa
Introduction to Natural Language Processing (NLP) (Part 1)
Lecture (1)

Contents
• Text Books
• Assessment Criteria
• Basic Definitions
• Linguistics Levels (Disciplines)
• Applications of Natural Language Processing
• Categories of Knowledge of Language
Text Books
• Main Book:
o Daniel Jurafsky and James H. Martin, Speech and Language
Processing, Prentice-Hall, Second Edition, 2009.

• Other Books:
o Christopher D. Manning and Hinrich Schütze, Foundations of
Statistical Natural Language Processing, The MIT Press, 1999.
o James Allen, Natural Language Understanding (Second Edition),
The Benjamin/Cummings Publishing Company, Inc., 1995.
o Dexter Kozen, Automata and Computability, Springer-Verlag,
2000.
Assessment Criteria
• Course Activities: 30
o Quizzes
o Project
o Section Tasks

• Mid-Term Exam: 20
• Final Exam: 50

Languages
• Languages can be classified as: Natural Language and
Artificial Language.

What is Natural Language?
• It means human language.
• Natural Language is one of the fundamental aspects of
human behavior and is a crucial component of our lives.
• The most common way that people communicate is by
speaking or writing in one of the natural languages, such as
English, Chinese, German, or French.
o It is a communication mechanism whose medium is text or
speech.
o There are two forms of natural language: written and spoken
forms.

Computational Linguistics (CL)
• Linguistics is the study and the description of human
languages.
• Computational Linguistics (CL) is the part of the science of
human language that uses computers to aid observation of
or experiment with language.
o The use of computers in the study of languages.
o The goal of CL is to design mathematical models of language
structures that enable the automation of language processing by a
computer.
• Computational linguistics can be considered as:
o The formalization of linguistic theories and models or their
implementation in a machine.
o i.e. to develop new linguistic theories with the aid of a
computer.
Linguistics Levels (Disciplines)
• Linguistics has been divided into disciplines or
levels, which go from sounds to meaning.

Linguistics Levels (Disciplines) (Cont.)
• Phonetics: Concerns the production and perception of acoustic
sounds that form the speech signal. In each language, sounds can
be classified into a finite set of phonemes.
• Morphology: The second level concerns the words. Morphology is
the study of the structure and the forms of a word.
o The word set of a language is called a lexicon. Usually a lexicon consists of
root words.
o Words can appear under several forms, for instance, the singular and
plural forms of a noun (book and books) or different forms of a verb
(walk and walking).
• Syntax: Studies the order of words in a sentence and their
relationships. Syntax defines word categories and functions.
o Subject, verb, object is a sequence of functions that corresponds to a
common order in many European languages including English and French.
o Parsing determines the structure of a sentence.
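The idea that syntax assigns categories and functions to words can be sketched in a few lines of Python. This is only a toy illustration: the lexicon, the category names (DET, NOUN, VERB) and the function parse_svo are assumptions made for this example, and the parser accepts nothing beyond the simple subject, verb, object order mentioned above.

```python
# Toy lexicon mapping words to word categories (hypothetical, for illustration only).
LEXICON = {
    "the": "DET", "a": "DET",
    "student": "NOUN", "book": "NOUN", "books": "NOUN",
    "reads": "VERB", "walks": "VERB",
}


def tag(sentence):
    """Return the category of each word, or 'UNK' if it is not in the lexicon."""
    return [LEXICON.get(w, "UNK") for w in sentence.lower().split()]


def parse_svo(sentence):
    """Split a 'DET? NOUN VERB DET? NOUN' sentence into subject, verb, object.

    Returns a dict of functions, or None if the word order does not match.
    """
    words = sentence.lower().split()
    tags = tag(sentence)

    def noun_phrase(i):
        # An optional determiner followed by a noun, starting at position i.
        start = i
        if i < len(tags) and tags[i] == "DET":
            i += 1
        if i < len(tags) and tags[i] == "NOUN":
            return i + 1, " ".join(words[start:i + 1])
        return None

    subj = noun_phrase(0)
    if subj is None:
        return None
    i, subject = subj
    if i >= len(tags) or tags[i] != "VERB":
        return None
    verb = words[i]
    obj = noun_phrase(i + 1)
    # The object must use up the rest of the sentence.
    if obj is None or obj[0] != len(tags):
        return None
    return {"subject": subject, "verb": verb, "object": obj[1]}
```

For instance, parse_svo("The student reads a book") yields the three functions, while a scrambled order such as "Reads the student a book" is rejected.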
Linguistics Levels (Disciplines) (Cont.)
• Semantics: Considers the meaning of words and sentences.
• Pragmatics: Studies the meaning of words and sentences in specific
situations, i.e. how the context of a sentence contributes to its meaning.
o Pragmatics is semantics restricted to a specific context and relies on facts that
are external to the sentence.
o These facts contribute to the inference of a sentence’s meaning or prove its
truth or falsity.
• Discourse: The production of language consists of a stream of sentences
that are linked together to form a discourse.
o This discourse is usually aimed at other people, who can answer,
ideally through a dialogue.
o A dialogue is a set of linguistic interactions that enables the exchange of
information and sometimes eliminates misunderstandings or ambiguities.

What is Natural Language Processing
(NLP)?
• Natural language processing (NLP) is an
interdisciplinary subfield of linguistics, computer science and
artificial intelligence concerned with the interactions
between computers and human languages.
o In particular, how to program computers to process and analyze
large amounts of natural language data.
o The goal is a computer capable of "understanding" the contents
of documents, including the contextual nuances of the language
within them.
• Challenges in natural language processing involve:
o Speech Recognition.
o Natural Language Understanding (NLU).
o Natural Language Generation (NLG).
What is Natural Language Processing (NLP)? (Cont.)

[Figure not recovered. Surviving labels: "Natural Language" (twice).]

Applications of Natural Language
Processing
• Text-Based Applications.
• Dialogue-Based Applications.

Text-Based Applications
• Text-based applications involve the “processing of written
text”, such as books, newspapers, reports, manuals, e-mail
messages, etc.
• Examples of text-based applications:
o Finding appropriate documents on certain topics from a
database of texts.
▪ e.g. finding relevant books in a library.
o Machine Translation (MT) from one language to another.
▪ e.g. translating documents from one language to another.
o Extracting information from messages or articles on
certain topics (e.g. ATS systems).
o Summarizing texts for certain purposes.
o Web-based question answering.
Dialogue-Based Applications
• Dialogue-based applications involve human-machine
communication.
• Examples of dialogue-based applications:
o Question-answering systems, where natural language is
used to query a database.
o Automated customer service over the telephone.
▪ e.g. to perform banking transactions or order items from a
catalogue.
o Tutoring systems, where the machine interacts with a
student.
Categories of Knowledge of Language
• The knowledge of language needed to engage in complex
language behavior can be separated into distinct categories.
• A natural language-system must use considerable knowledge
about the structure of the language itself, including:
o What the words are.
o How words combine to form sentences.
o What the words mean.
o How word meanings contribute to sentence meanings.
▪ e.g. “bank” and “river bank”.
o and so on.
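The “bank” versus “river bank” example can be made concrete with a crude cue-word overlap heuristic, loosely in the spirit of dictionary-overlap methods for word sense disambiguation. The sense inventory SENSES and the function guess_sense below are hypothetical names invented for this sketch, and the cue words are chosen by hand.

```python
# Hypothetical sense inventory: each sense of "bank" lists a few cue words.
SENSES = {
    "financial institution": {"money", "account", "loan", "deposit"},
    "side of a river": {"river", "water", "shore", "fishing"},
}


def guess_sense(sentence):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears at all.
    """
    words = set(sentence.lower().replace(".", "").split())
    best, best_overlap = None, 0
    for sense, cues in SENSES.items():
        overlap = len(words & cues)
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best
```

The sketch shows how neighbouring words contribute to the meaning chosen for an ambiguous word; a sentence with no cue words, such as "The bank is closed.", is left undecided.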
Categories of Knowledge of Language
(Cont.)
• The following are the different forms of knowledge relevant
for natural language understanding:
1. Phonetic and phonological knowledge
o Concerns how words are related to the sounds that
realize them.
o Such knowledge is crucial for speech-based systems.
2. Morphological knowledge
o Concerns how words are constructed from more basic
meaning units called morphemes.
▪ A morpheme is the primitive unit of meaning in a language.
o For example, the meaning of the word "friendly" is
derivable from the meaning of the noun "friend" and the
suffix "-ly", which transforms a noun into an adjective.
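As a minimal sketch of morphological knowledge, the following Python fragment splits a word into a root and a suffix and reports the resulting category. The tables ROOTS and SUFFIXES and the function segment are assumptions made for this illustration; they cover only the slide examples (friendly, books, walking) and ignore spelling changes.

```python
# Hypothetical mini-lexicon of roots and suffixes.
ROOTS = {"friend": "noun", "book": "noun", "walk": "verb"}
SUFFIXES = {
    "ly": ("noun", "adjective"),  # friend + -ly -> friendly
    "s": ("noun", "noun"),        # book + -s -> books (plural)
    "ing": ("verb", "verb"),      # walk + -ing -> walking
}


def segment(word):
    """Return (root, suffix, category); suffix is None for bare roots.

    Returns None if the word cannot be analyzed with the tables above.
    """
    if word in ROOTS:
        return word, None, ROOTS[word]
    for suffix, (needs, gives) in SUFFIXES.items():
        if word.endswith(suffix):
            root = word[: -len(suffix)]
            # The suffix only attaches to roots of the category it needs.
            if ROOTS.get(root) == needs:
                return root, "-" + suffix, gives
    return None
```

Calling segment("friendly") recovers the noun root "friend" and the suffix "-ly", and labels the result an adjective, mirroring the derivation described above.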
Categories of Knowledge of Language
(Cont.)
3. Syntactic knowledge
o Concerns how words can be put together to form correct
sentences.
o Determines what structural role each word plays in the
sentence and what phrases are subparts of what other
phrases.
4. Semantic knowledge
o Concerns what words mean and how these meanings
combine in sentences to form sentence meanings.
o This is the study of context-independent meaning - the
meaning a sentence has regardless of the context in which it is
used.
5. Pragmatic knowledge
o Concerns how sentences are used in different situations and
how use affects the interpretation of the sentence.
Categories of Knowledge of Language
(Cont.)
6. Discourse knowledge
o Concerns how the immediately preceding sentences affect the
interpretation of the next sentence.
o This information is especially important for interpreting
pronouns and for interpreting the temporal aspects of the
information conveyed.
7. World knowledge
o Includes the general knowledge about the structure of the
world that language users must have in order to, for example,
maintain a conversation.
o It includes what each language user must know about the other
user’s beliefs and goals.
• For example, to answer questions or to participate in a
conversation, a person not only must know a lot about the
structure of the language being used, but also must know about the
world in general and the conversational setting in particular.
Example (1)
• The following examples may help you understand the distinction
between syntax, semantics, and pragmatics.
• Consider each example below as a candidate for the initial sentence
of the book concerning natural language processing:
o Language is one of the fundamental aspects of human behavior and is
a crucial component of our lives.

This sentence appears to be a reasonable start.
It agrees with all that is known about syntax, semantics, and
pragmatics.

Example (1) (Cont.)
• Consider each example below as a candidate for the
initial sentence of the book concerning natural
language processing:
o Green frogs have large noses.

Syntax?
Semantics?
Pragmatics?

Example (1) (Cont.)
• Consider each example below as a candidate for the
initial sentence of the book concerning natural
language processing:
o Green frogs have large noses.

The sentence is well-formed syntactically and semantically,
but not pragmatically.
It fares poorly as the first sentence of the book because
the reader would find no reason for using it.

Example (1) (Cont.)
• Consider each example below as a candidate for the initial
sentence of the book concerning natural language
processing:
o Green ideas have large noses.

Syntax?
Semantics?
Pragmatics?

Example (1) (Cont.)
• Consider each example below as a candidate for the initial
sentence of the book concerning natural language
processing:
o Green ideas have large noses.

This sentence is much worse.
The sentence is well-formed syntactically.
Not only is it obviously pragmatically ill-formed, it is also
semantically ill-formed.
However, the sentence does have some structure, for we can
discuss what is wrong with it: ideas cannot be green and,
even if they could, they certainly cannot have large noses.
Example (1) (Cont.)
• Consider each example below as a candidate for the
initial sentence of the book concerning natural
language processing:
o Large have green ideas nose.

Syntax?
Semantics?
Pragmatics?
Example (1) (Cont.)
• Consider each example below as a candidate for the
initial sentence of the book concerning natural
language processing:
o Large have green ideas nose.

This sentence is even worse.
In fact, it is unintelligible, even though it contains the same
words as sentence 3 (i.e. the previous sentence).
It does not even have enough structure to allow you to say
what is wrong with it. Thus, it is syntactically ill-formed.
Besides, it is obviously pragmatically ill-formed and
semantically ill-formed.
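The syntactic and semantic judgments in Example (1) can be imitated with a toy checker. Everything here is an assumption made for illustration: the lexicon, the single accepted word-order pattern, and the two hand-written restrictions (colour and size adjectives need physical nouns; only animate things have noses). Pragmatics is not modelled, since it depends on the situation in which the sentence is used.

```python
# Hypothetical toy lexicon: word -> (category, semantic features).
LEXICON = {
    "green": ("ADJ", {}),
    "large": ("ADJ", {}),
    "frogs": ("NOUN", {"physical": True, "animate": True}),
    "ideas": ("NOUN", {"physical": False, "animate": False}),
    "noses": ("NOUN", {"physical": True, "animate": False}),
    "nose": ("NOUN", {"physical": True, "animate": False}),
    "have": ("VERB", {}),
}


def words_of(sentence):
    """Lower-case the sentence, drop a final period, and split into words."""
    return sentence.lower().rstrip(".").split()


def is_syntactic(sentence):
    """Accept only the pattern ADJ NOUN VERB ADJ NOUN used in the examples.

    Assumes every word is in the toy lexicon.
    """
    cats = [LEXICON[w][0] for w in words_of(sentence)]
    return cats == ["ADJ", "NOUN", "VERB", "ADJ", "NOUN"]


def is_semantic(sentence):
    """Check the toy restrictions; a syntactically ill-formed sentence fails."""
    if not is_syntactic(sentence):
        return False
    _adj1, subj, _verb, _adj2, obj = words_of(sentence)
    subj_f, obj_f = LEXICON[subj][1], LEXICON[obj][1]
    # Colour and size adjectives need a physical noun.
    if not (subj_f["physical"] and obj_f["physical"]):
        return False
    # Only animate things can "have noses" in this toy model.
    return subj_f["animate"]
```

With these rules, "Green frogs have large noses." passes both checks, "Green ideas have large noses." passes only the syntactic check, and "Large have green ideas nose." fails both.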
Example (2)
• Utterance: The spoken correlate of a sentence.
• Incidentally, there are cases in which a sentence may be
pragmatically well-formed but not syntactically well-formed.
• For example, suppose I ask you where you are going and you
reply “I go store”.
• The response would be understandable even though it is
syntactically ill-formed (a well-formed response would be
“I am going to the store”).
• Thus, it is at least pragmatically well-formed and may even
be semantically well-formed.

Example (3)
• Given that the person uttering the following
sentences is responding to a complaint that the car
is too cold.
a. The heater are on.
b. The tires are brand new.
• Classify these sentences along each of the following
dimensions:
i. Syntactically correct or not.
ii. Semantically correct or not.
iii. Pragmatically correct or not.

Example (3) (Cont.)
• Given that the person uttering the following
sentences is responding to a complaint that the car
is too cold.
a. The heater are on. (The correct syntax is “The heater is on”.)
i. Syntactically incorrect.
ii. Semantically correct.
iii. Pragmatically correct.

b. The tires are brand new.
i. Syntactically correct.
ii. Semantically correct.
iii. Pragmatically incorrect (it does not address the complaint).
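The syntactic error in sentence (a) is a subject-verb number mismatch, which a very small check can detect. The feature tables NOUN_NUMBER and VERB_NUMBER and the function agrees are hypothetical and cover only the words in this example; the sketch assumes sentences of the form "The NOUN VERB ...".

```python
# Hypothetical number features for the words in the two example sentences.
NOUN_NUMBER = {"heater": "singular", "tires": "plural"}
VERB_NUMBER = {"is": "singular", "are": "plural"}


def agrees(sentence):
    """Check number agreement in sentences of the form 'The NOUN VERB ...'."""
    words = sentence.lower().rstrip(".").split()
    _det, noun, verb = words[:3]
    return NOUN_NUMBER[noun] == VERB_NUMBER[verb]
```

The check flags "The heater are on." and accepts both "The heater is on." and "The tires are brand new."; it says nothing about the semantic or pragmatic dimensions.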
