INTRODUCTION
Natural Language Processing (NLP) is a subfield of Artificial Intelligence that focuses on
enabling machines to understand, interpret, generate, and respond to human languages. It
bridges the gap between human communication and computer understanding.
Origins and Challenges of NLP
The origins of NLP can be traced to the 1950s, with early efforts in machine translation
and rule-based parsing. Over the decades, it evolved through symbolic methods, statistical
models, and now deep learning. Challenges in NLP include ambiguity, contextual
understanding, world knowledge, sarcasm, multilingualism, and noisy data.
Language Modeling
Language models predict the next word in a sequence or assign probabilities to
sentences.
1. Grammar-Based Language Modeling:
   - Relies on predefined syntactic rules (e.g., Context-Free Grammars).
   - Ensures grammatical correctness but lacks flexibility for real-world data.
2. Statistical Language Modeling:
   - Based on probabilities derived from large corpora.
   - Examples: Unigram, Bigram, Trigram models (see the bigram sketch below).
   - Applications: Speech recognition, text prediction.
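For illustration, a minimal maximum-likelihood bigram model in Python. The toy corpus and function name are hypothetical placeholders; a real model needs a large corpus and smoothing for unseen bigrams.

    from collections import Counter

    def train_bigram_model(sentences):
        """Estimate P(w2 | w1) by maximum likelihood from tokenized sentences."""
        unigrams, bigrams = Counter(), Counter()
        for tokens in sentences:
            padded = ["<s>"] + tokens + ["</s>"]  # sentence boundary markers
            unigrams.update(padded)
            bigrams.update(zip(padded, padded[1:]))
        # P(w2 | w1) = count(w1, w2) / count(w1)
        return {pair: count / unigrams[pair[0]] for pair, count in bigrams.items()}

    corpus = [["i", "like", "nlp"], ["i", "like", "ai"]]  # toy corpus
    model = train_bigram_model(corpus)
    print(model[("i", "like")])  # 1.0: "like" always follows "i" here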
Regular Expressions
Regular Expressions (regex) are patterns used to match character combinations in text.
They are widely used in text preprocessing, pattern recognition, and lexical analysis.
Examples:
- \d: Matches any digit.
- [a-z]: Matches any lowercase letter.
- ^start: Matches a string starting with "start".
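These patterns can be tried directly with Python's standard re module; the sample text below is arbitrary.

    import re

    text = "Order 66 starts at 9am. start now!"

    print(re.findall(r"\d", text))          # ['6', '6', '9'] - every digit
    print(re.findall(r"[a-z]+", text))      # runs of lowercase letters
    print(bool(re.match(r"^start", text)))  # False - text begins with "Order"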
Finite-State Automata (FSA)
FSA is a computational model used to recognize regular languages. It is composed of
states, transitions, a start state, and accepting states. Two variants exist:
- Deterministic FSA (DFA)
- Non-deterministic FSA (NFA)
FSAs are used for token recognition, lexical analysis, and morphological processing.
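A minimal DFA sketch in Python, recognizing the regular language of binary strings with an even number of 1s; the transition-table encoding is one common choice, not the only one.

    def dfa_accepts(string):
        """DFA over {0, 1} accepting strings with an even number of 1s."""
        transitions = {
            ("even", "0"): "even", ("even", "1"): "odd",
            ("odd",  "0"): "odd",  ("odd",  "1"): "even",
        }
        state = "even"                        # start state
        for symbol in string:                 # symbols outside {0,1} would raise KeyError
            state = transitions[(state, symbol)]
        return state == "even"                # "even" is the only accepting state

    print(dfa_accepts("1011"))  # False: three 1s
    print(dfa_accepts("1001"))  # True: two 1s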
English Morphology
Morphology is the study of the structure of words. In English:
- Inflectional Morphology: modifies tense, number, etc. (e.g., cat → cats)
- Derivational Morphology: creates new words (e.g., teach → teacher)
Morphological analysis is crucial for POS tagging and lemmatization.
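To make the inflectional case concrete, here is a deliberately simplified pluralization sketch; real English morphology includes many irregular forms these rules do not cover.

    def pluralize(noun):
        """Toy inflectional rules for English noun plurals (regular cases only)."""
        if noun.endswith(("s", "x", "z", "ch", "sh")):
            return noun + "es"            # fox -> foxes, church -> churches
        if noun.endswith("y") and noun[-2] not in "aeiou":
            return noun[:-1] + "ies"      # city -> cities
        return noun + "s"                 # cat -> cats

    print([pluralize(n) for n in ["cat", "fox", "city", "boy"]])
    # ['cats', 'foxes', 'cities', 'boys']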
Transducers for Lexicon and Rules
Finite-State Transducers (FSTs) extend FSAs by associating output with transitions. They
map between lexical forms and surface forms. Applications:
- Morphological analysis
- Phonological rules
- Spelling correction
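A hand-rolled toy transducer sketch, assuming a one-word lexicon; production systems typically use FST toolkits such as OpenFST or HFST.

    # Each transition maps (state, input symbol) -> (next state, output string),
    # rewriting the lexical form "cat +N +PL" to the surface form "cats".
    TRANSITIONS = {
        ("start", "cat"): ("stem", "cat"),
        ("stem",  "+N"):  ("noun", ""),    # part-of-speech tag emits nothing
        ("noun",  "+PL"): ("end",  "s"),   # plural feature surfaces as "s"
        ("noun",  "+SG"): ("end",  ""),    # singular feature surfaces as nothing
    }
    ACCEPTING = {"end"}

    def transduce(symbols):
        state, output = "start", []
        for sym in symbols:
            state, out = TRANSITIONS[(state, sym)]
            output.append(out)
        assert state in ACCEPTING, "input rejected"
        return "".join(output)

    print(transduce(["cat", "+N", "+PL"]))  # "cats"
    print(transduce(["cat", "+N", "+SG"]))  # "cat"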
Tokenization
Tokenization is the process of splitting text into tokens (words, punctuation marks, etc.).
Types:
- Word Tokenization: "NLP is fun" → ["NLP", "is", "fun"]
- Sentence Tokenization: Splitting paragraphs into sentences.
Challenges:
- Handling abbreviations, hyphenation, and contractions.
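A regex-based word tokenizer sketch; the pattern is a simplification that keeps contractions together and splits off punctuation, assuming English text.

    import re

    def tokenize(text):
        """Split text into word, number, and punctuation tokens."""
        # letters with an optional contraction suffix | digits | single punctuation
        return re.findall(r"[A-Za-z]+(?:'[a-z]+)?|\d+|[^\w\s]", text)

    print(tokenize("NLP is fun"))             # ['NLP', 'is', 'fun']
    print(tokenize("Don't stop; it's 2024."))
    # ["Don't", 'stop', ';', "it's", '2024', '.']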
Detecting and Correcting Spelling Errors
Spelling errors can be detected and corrected using dictionary lookup, language models,
and context.
- Non-word errors: "speling" → "spelling"
- Real-word errors: "form" vs. "from"
Approaches:
- Edit distance
- Noisy Channel Model
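A compact sketch of dictionary-based non-word correction in the spirit of the noisy channel model. The tiny WORD_FREQS table is a hypothetical stand-in for the language model P(w), and only candidates one edit away are generated.

    import string

    # Hypothetical frequency table standing in for the language model P(w).
    WORD_FREQS = {"spelling": 120, "spewing": 3, "form": 90, "from": 500}

    def edits1(word):
        """All strings one edit (insert, delete, substitute) away from word."""
        letters = string.ascii_lowercase
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = {a + b[1:] for a, b in splits if b}
        subs = {a + c + b[1:] for a, b in splits if b for c in letters}
        inserts = {a + c + b for a, b in splits for c in letters}
        return deletes | subs | inserts

    def correct(word):
        """Pick the in-dictionary candidate with the highest frequency."""
        if word in WORD_FREQS:
            return word
        candidates = edits1(word) & WORD_FREQS.keys()
        return max(candidates, key=WORD_FREQS.get, default=word)

    print(correct("speling"))  # 'spelling': beats 'spewing' on frequency

Note that both "spelling" and "spewing" are one edit from "speling"; the frequency table is what breaks the tie, which is the channel-plus-language-model intuition in miniature.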
Minimum Edit Distance
Minimum Edit Distance (MED) quantifies how dissimilar two strings are by counting the
minimum number of operations (insertions, deletions, substitutions) required to transform
one string into the other. Applications:
- Spell checking
- DNA sequence alignment
- Plagiarism detection
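A standard dynamic-programming implementation with unit costs for all three operations (Levenshtein distance); some textbooks instead charge 2 for substitutions.

    def min_edit_distance(source, target):
        """Levenshtein distance via dynamic programming (unit costs)."""
        m, n = len(source), len(target)
        # dist[i][j] = cost of turning source[:i] into target[:j]
        dist = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            dist[i][0] = i                  # i deletions
        for j in range(n + 1):
            dist[0][j] = j                  # j insertions
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                sub = 0 if source[i - 1] == target[j - 1] else 1
                dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                                 dist[i][j - 1] + 1,        # insertion
                                 dist[i - 1][j - 1] + sub)  # substitution / match
        return dist[m][n]

    print(min_edit_distance("intention", "execution"))  # 5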