Text pre-processing, tokenization, and stemming/lemmatization
N-grams and bag-of-words models
Part-of-speech tagging and named entity recognition
Sentiment analysis and text classification
Word embeddings (e.g. word2vec, GloVe) and deep learning techniques for NLP such as
LSTMs and Transformers
Knowledge of Python and NLP libraries such as NLTK, spaCy, and gensim
Familiarity with machine learning frameworks like TensorFlow and PyTorch
Experience with NLP applications such as language modeling, text generation,
summarization, question answering, and machine translation
Assignment: Text Classification using Hugging Face
Objective: The goal of this assignment is to build a text classification model
using the Hugging Face library to classify a dataset of text into one of multiple
categories. The candidate will use a pre-trained model such as BERT or GPT-2 as a
starting point and fine-tune it on the classification task.
Instructions:
Choose a text dataset with multiple categories (e.g. news articles labeled as
sports, politics, or entertainment). The dataset should have at least 1000
samples per category.
Preprocess the text data by cleaning it and removing stopwords, punctuation, and
other irrelevant characters.
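The cleaning step above can be sketched with plain Python; the stopword set here is a small illustrative stand-in (in practice NLTK's stopwords.words("english") or spaCy's list would be used). Note that for Transformer fine-tuning, aggressive stopword removal is often unnecessary and can even hurt, since pre-trained tokenizers expect natural text.

```python
import re
import string

# Small illustrative stopword set (an assumption for this sketch);
# swap in NLTK's or spaCy's full English stopword list in practice.
STOPWORDS = {"the", "a", "an", "is", "are", "and", "or", "of", "to", "in"}

def preprocess(text):
    """Lowercase, strip punctuation and digits, and drop stopwords."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)  # drop digits
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(tokens)

print(preprocess("The striker scored 2 goals in the final!"))
# striker scored goals final
```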
Use the Hugging Face transformers library in Python to fine-tune a pre-trained
model such as BERT or GPT-2 on the classification task.
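A minimal fine-tuning sketch with the transformers Trainer API might look like the following. The category names, the "text"/"label" column names, and the hyperparameters are illustrative assumptions; the train/eval datasets would typically come from the Hugging Face datasets library.

```python
# Hedged sketch: fine-tuning a pre-trained BERT for multi-class text
# classification with the Hugging Face Trainer API. Labels, column
# names, and hyperparameters below are assumptions, not requirements.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["sports", "politics", "entertainment"]  # example categories
label2id = {name: i for i, name in enumerate(LABELS)}
id2label = {i: name for name, i in label2id.items()}

def build_trainer(train_dataset, eval_dataset, model_name="bert-base-uncased"):
    """Return a Trainer ready to fine-tune `model_name` on the datasets.

    Datasets are assumed to have "text" and "label" columns.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=len(LABELS),
        id2label=id2label, label2id=label2id)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=128)

    args = TrainingArguments(
        output_dir="classifier-out",
        num_train_epochs=3,
        per_device_train_batch_size=16,
        learning_rate=2e-5)  # common starting point for BERT fine-tuning

    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset.map(tokenize, batched=True),
        eval_dataset=eval_dataset.map(tokenize, batched=True),
        tokenizer=tokenizer)
```

Calling `build_trainer(...)` followed by `trainer.train()` runs the fine-tuning loop; GPT-2 would need a padding token set on its tokenizer before use.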
Train the model on the dataset and evaluate the performance using metrics such as
accuracy, precision, recall and F1-score.
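The requested metrics can be computed with scikit-learn; a helper along these lines (names are illustrative) could also be adapted into a compute_metrics callback for the Trainer. Macro averaging is one reasonable choice for multi-class data; weighted averaging is another.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return {"accuracy": accuracy_score(y_true, y_pred),
            "precision": precision, "recall": recall, "f1": f1}

# Toy example with three classes (0=sports, 1=politics, 2=entertainment):
print(classification_metrics([0, 1, 2, 0], [0, 1, 1, 0]))
```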
Use the trained model to predict the categories of a few samples from the test set.
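For the prediction step, a fine-tuned sequence-classification model emits one logit per category; mapping logits to a label and a confidence is a softmax plus argmax. The category names below are illustrative assumptions, and the commented lines show (hedged) how the logits would typically be obtained from a fine-tuned model.

```python
import numpy as np

CATEGORIES = ["sports", "politics", "entertainment"]  # illustrative labels

def logits_to_prediction(logits, labels=CATEGORIES):
    """Map raw model logits to (label, confidence) via softmax + argmax."""
    logits = np.asarray(logits, dtype=float)
    probs = np.exp(logits - logits.max())  # shift for numerical stability
    probs /= probs.sum()
    idx = int(probs.argmax())
    return labels[idx], float(probs[idx])

# With a fine-tuned model, the logits would come from something like:
#   outputs = model(**tokenizer(sample_text, return_tensors="pt"))
#   label, conf = logits_to_prediction(outputs.logits[0].detach().numpy())
label, conf = logits_to_prediction([0.2, 3.1, -0.5])
print(label)  # politics
```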
Write a report that includes the following:
A brief introduction to the task and the dataset used
The preprocessing steps taken
The architecture of the model used, and how it was fine-tuned
The evaluation metrics and the results obtained
A discussion of the performance of the model and possible ways to improve it.
Sample predictions and their explanations
Submit the report, the code and the dataset used for the task.
Notes:
Use the latest versions of transformers and Python.
Feel free to experiment with different pre-trained models and fine-tuning
techniques.
The report should be clear, concise and well-structured.
The code should be well-commented and easy to understand.
Good luck!
Please reach out if any part of the assignment needs clarification.