CS563-NLP-2024 - Assignment 1

The document outlines an assignment to implement hidden Markov models for part-of-speech tagging. It provides details on estimating HMM parameters from training data, implementing bigram and trigram models, testing models with Viterbi decoding, and evaluating performance on a benchmark dataset.

CS 563: Natural Language Processing

Assignment-1: Part-of-Speech Tagging

Deadline: 15 February 2024

● Marks will be awarded based on the correctness and soundness of the outputs.

● Marks will be deducted in case of plagiarism.
● Proper indentation and appropriate comments (if necessary) are mandatory.
● Use of frameworks such as scikit-learn is not allowed.
● All benchmarks (accuracy etc.), answers to questions and supporting examples
should be added in a separate file with the name ‘report’.
● All code needs to be submitted in ‘.py’ format. Even if you write it in ‘.ipynb’
format, download it in ‘.py’ format and then submit it.
● You should zip all the required files and name the zip file as:
○ <roll_no>_assignment_<#>.zip, e.g., 1501cs11_assignment_01.zip.
● Upload your assignment (the zip file) at the following link:
○ CS563-NLP-2024-Assignments

Problem Statement:
● The goal of this assignment is to implement a Hidden Markov Model (HMM) for the
Part-of-Speech (PoS) tagging task.

Implementation:

HMM based Model:

● HMM Parameter Estimation

○ Input: Annotated tagged dataset
○ Output: HMM parameters
○ Procedure:
■ Step 1: Find the states (the set of PoS tags).
■ Step 2: Calculate the start probability (π).
■ Step 3: Calculate the transition probability (A).
■ Step 4: Calculate the emission probability (B).
● Features for HMM:
○ Train two HMM models based on:
■ First-order Markov assumption (bigram), where the current word's PoS
tag depends on the previous word's tag and the current word
■ Second-order Markov assumption (trigram), where the current word's
PoS tag depends on the tags of the previous two words along with the
current word
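A minimal sketch of Steps 1-4 for the bigram model, assuming the training data is already loaded as a list of sentences, each a list of (word, tag) pairs. The function name `estimate_hmm` and the dictionary layout are illustrative choices, not part of the assignment. The estimates are plain relative frequencies with no smoothing, so unseen words and tag pairs get no probability at all.

```python
from collections import Counter, defaultdict


def estimate_hmm(tagged_sentences):
    """Estimate bigram-HMM parameters by relative frequency (no smoothing).

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    Returns (states, pi, A, B), where pi[tag], A[prev_tag][tag] and
    B[tag][word] are probabilities.
    """
    start_counts = Counter()             # how often each tag opens a sentence
    trans_counts = defaultdict(Counter)  # trans_counts[prev_tag][tag]
    emit_counts = defaultdict(Counter)   # emit_counts[tag][word]

    for sentence in tagged_sentences:
        prev_tag = None
        for word, tag in sentence:
            if prev_tag is None:
                start_counts[tag] += 1
            else:
                trans_counts[prev_tag][tag] += 1
            emit_counts[tag][word] += 1
            prev_tag = tag

    # Step 1: the states are the tags observed in the training data.
    states = sorted(emit_counts)

    # Step 2: start probabilities (pi).
    total_starts = sum(start_counts.values())
    pi = {tag: start_counts[tag] / total_starts for tag in states}

    # Step 3: transition probabilities (A), normalised per previous tag.
    A = {}
    for prev_tag, row in trans_counts.items():
        row_total = sum(row.values())
        A[prev_tag] = {tag: count / row_total for tag, count in row.items()}

    # Step 4: emission probabilities (B), normalised per tag.
    B = {}
    for tag, row in emit_counts.items():
        row_total = sum(row.values())
        B[tag] = {word: count / row_total for word, count in row.items()}

    return states, pi, A, B
```

For the trigram model, one possible adaptation is to key the transition counts by the pair of previous tags instead of a single previous tag; the emission counts stay the same.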

Testing:
● After estimating the parameters, use them to tag the test input sequence with
the Viterbi algorithm.
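A minimal sketch of Viterbi decoding for the bigram model, assuming the parameters are stored as dictionaries `pi[tag]`, `A[prev_tag][tag]` and `B[tag][word]`. The function name, the dictionary layout and the `floor` parameter are assumptions made for illustration. Probabilities are combined in log space to avoid underflow, and missing or zero entries are replaced by a small floor value, which is only a stand-in for a proper smoothing method.

```python
import math


def viterbi(words, states, pi, A, B, floor=1e-12):
    """Return the most likely tag sequence for `words` under a bigram HMM.

    Missing or zero probabilities are replaced by `floor` (an illustrative
    choice, not a real smoothing scheme).
    """
    if not words:
        return []

    def logp(table, key):
        return math.log(max(table.get(key, 0.0), floor))

    # best[i][t]: best log-probability of any tag path ending in tag t at i.
    best = [{t: logp(pi, t) + logp(B.get(t, {}), words[0]) for t in states}]
    # back[i][t]: the previous tag on that best path.
    back = [{}]

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in states:
            emit = logp(B.get(t, {}), words[i])
            prev = max(
                states,
                key=lambda p: best[i - 1][p] + logp(A.get(p, {}), t),
            )
            best[i][t] = best[i - 1][prev] + logp(A.get(prev, {}), t) + emit
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    last = max(states, key=lambda t: best[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path
```

This runs in time proportional to the sentence length times the square of the number of tags. A trigram decoder would track pairs of tags as states, which raises the cost accordingly.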

Dataset:
● The dataset consists of sentences in which each word is tagged with its
corresponding PoS tag
● Brown dataset: Brown_train.txt
● Format of dataset:
○ Each line contains <Word/Tag> (word followed by ‘/’ and tag)
○ Sentences are separated by a new line
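A minimal sketch of a reader for this format, assuming each line of the file holds one sentence as whitespace-separated Word/Tag tokens. That reading of the format, the function names, and the tag names in the usage example are assumptions; check them against Brown_train.txt. Each token is split at its last ‘/’, so a word that itself contains ‘/’ keeps its full form.

```python
def parse_tagged_line(line):
    """Split one line of 'word/tag' tokens into a list of (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # Split at the LAST '/', so "1/2/NUM" becomes ("1/2", "NUM").
        word, sep, tag = token.rpartition("/")
        if not sep or not word or not tag:
            continue  # skip tokens that are not of the form word/tag
        pairs.append((word, tag))
    return pairs


def read_tagged_corpus(path, encoding="utf-8"):
    """Read a file with one tagged sentence per line; blank lines are skipped."""
    sentences = []
    with open(path, encoding=encoding) as handle:
        for line in handle:
            pairs = parse_tagged_line(line)
            if pairs:
                sentences.append(pairs)
    return sentences
```

For example, `parse_tagged_line("The/DET dog/NOUN barks/VERB")` gives `[("The", "DET"), ("dog", "NOUN"), ("barks", "VERB")]`.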

Documents to submit:
● Model code
● Perform 5-fold cross-validation on the training dataset and report both the average and
the individual fold results (Accuracy, Precision, Recall and F-Score).
● Create a confusion matrix using a Python library.
● Briefly discuss the bigram vs. trigram assumptions when training HMMs.
● Using some examples (good pairs and bad pairs), show where the model is confused
and where it gives correct results. Analyze and explain the reasons behind this.
● Also implement an RNN-based model for this task and compare the results of the
RNN and the HMM.
● Discuss which model is better, with justification and analysis: when the RNN is
better than the HMM, when the HMM is better than the RNN, and when both fail and
why.
● Write a report (doc or pdf format) on how you are solving the problems as well as
all the results including model architecture (if any).
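A minimal sketch of the evaluation bookkeeping using only the standard library, since frameworks such as scikit-learn are not allowed. The function names are illustrative. Two assumptions are made here because the assignment does not fix them: folds are contiguous slices (shuffle the sentences beforehand if the file is ordered by genre), and Precision, Recall and F-Score are macro-averaged over tags.

```python
from collections import Counter


def k_fold_splits(items, k=5):
    """Yield (train, test) lists for k contiguous, near-equal folds."""
    n = len(items)
    bounds = [i * n // k for i in range(k + 1)]
    for i in range(k):
        lo, hi = bounds[i], bounds[i + 1]
        yield items[:lo] + items[hi:], items[lo:hi]


def confusion_counts(gold_tags, predicted_tags):
    """Count (gold, predicted) tag pairs; this is the confusion matrix."""
    return Counter(zip(gold_tags, predicted_tags))


def scores(confusion):
    """Compute accuracy plus per-tag and macro-averaged P, R and F1."""
    tags = sorted({tag for pair in confusion for tag in pair})
    total = sum(confusion.values())
    correct = sum(c for (g, p), c in confusion.items() if g == p)

    per_tag = {}
    for tag in tags:
        tp = confusion[(tag, tag)]
        predicted = sum(c for (g, p), c in confusion.items() if p == tag)
        gold = sum(c for (g, p), c in confusion.items() if g == tag)
        precision = tp / predicted if predicted else 0.0
        recall = tp / gold if gold else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_tag[tag] = (precision, recall, f1)

    if per_tag:
        macro = tuple(
            sum(v[i] for v in per_tag.values()) / len(per_tag) for i in range(3)
        )
    else:
        macro = (0.0, 0.0, 0.0)

    return {
        "accuracy": correct / total if total else 0.0,
        "per_tag": per_tag,
        "macro": macro,
    }
```

In each fold, the gold and predicted tags of all test sentences would be flattened into two parallel lists before calling `confusion_counts`. The fold results can then be reported individually and averaged.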

For any queries regarding this assignment, contact:

Ramakrishna Appicharla (ramakrishnaappicharla@gmail.com),
Arpan Phukan (arpanphukan@gmail.com),
Sandeep Kumar (sandeep.kumar82945@gmail.com), and,
Aizan Zafar (aizanzafar@gmail.com)
