COURSE NO: CSE 4114
Course Title: Pattern Recognition and Machine Learning
A Comprehensive Study to Sentiment Analysis of Bangla
Cricket-Related Social Media Comments Using ML and LSTM
Models
Research Participants
Adibul Haque (ID: 20200204029)
Yousuf Ali (ID: 20200204037)
Miftahul Sheikh (ID: 20200204038)
Slide 02
Research Paper Presentation
Outline
01 Abstract
02 Introduction
03 Literature Review
04 Datasets
05 Pre-processing Techniques
06 Methodology
07 Evaluation Metrics
08 Research Gap
09 Conclusion
10 Contribution of group members
11 References
Slide 03
ABSTRACT
• The rise of social media has sadly increased
online bullying, harming victims.
• Machine Learning Models like LR, KNN, SVM,
Random Forest and XGBoost can detect toxic
comments.
• Deep learning models like LSTM perform
exceptionally well in detecting Bengali toxic
comments.
Slide 04
INTRODUCTION
• Bullying in cricket communities affects
players and fans, both emotionally and
socially.
• With social media growth in Bangladesh,
online harassment in cricket communities is
rising.
• Anonymous trolling makes it hard to control.
• Machine learning and deep learning models
can effectively detect abusive cricket comments.
Slide 05
Motivation
• The rise of social media in Bangladesh has
increased online discussions about cricket.
• Limited research exists on analyzing Bangla-
language sentiments regarding cricket.
• We aim to use advanced models to analyze and
understand the sentiments of cricket fans in
Bangladesh.
Slide 04
LITERATURE REVIEW
EVALUATION OF NAÏVE BAYES AND SUPPORT VECTOR MACHINES ON BANGLA TEXTUAL MOVIE REVIEWS.
• The paper analyzes Bangla movie reviews for sentiment.
• It uses Naive Bayes (NB) and Support Vector Machines (SVM) for polarity detection.
• SVM, with stemmed unigram features, achieved a precision of 0.86.
A DEEP LEARNING APPROACH TO DETECT ABUSIVE BENGALI TEXT.
• Outperformed ANN (81.10%), LinearSVC (75.70%), Logit (75.20%), MNB (73.90%), and RF (70.50%).
• LSTM achieved an accuracy of 91%, better than other models.
Slide 06
LITERATURE REVIEW
A STUDY TOWARDS BANGLA FAKE NEWS DETECTION USING MACHINE LEARNING AND DEEP LEARNING.
• The study used 57,000 Bangla news items to identify fake news.
• Bi-LSTM models with GloVe and FastText achieved up to 96% accuracy.
• GRU model accuracy was 77%.
CRICKET SENTIMENT ANALYSIS FROM BANGLA TEXT USING RECURRENT NEURAL NETWORK WITH LONG SHORT TERM MEMORY MODEL.
• Applies RNN with LSTM for Bangla cricket sentiment dataset.
• The LSTM model achieves an accuracy of 95%.
• Support Vector Machine (SVM) achieved an accuracy of 71.03%.
Slide 07
DATASETS
Size: 3000 instances,
manually labeled.
Toxic Categories:
• negative: 2152 (72.24%)
• positive: 566 (19.00%)
• neutral: 261 (8.76%)
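As a quick arithmetic check on the class distribution above, the following minimal sketch recomputes each share from the listed counts. The three counts sum to 2,979 rather than 3,000, and the listed percentages are shares of 2,979. The variable names are illustrative only.

```python
# Class counts as listed on the slide.
counts = {"negative": 2152, "positive": 566, "neutral": 261}

total = sum(counts.values())  # sum of the three listed counts

# Share of each class relative to that total, as a percentage.
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}

print(total)
for label, pct in shares.items():
    print(f"{label}: {pct:.2f}%")
```

Because the negative class makes up roughly 72% of the labeled comments, a classifier that always predicts "negative" would reach about that accuracy on any split that keeps the same class proportions, which is a useful baseline when reading the accuracy figures.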
Slide 09
PRE-PROCESSING TECHNIQUES
01 CLEANING NON-BENGALI TEXT
02 TOKENIZATION
03 REMOVE PUNCTUATION
04 REMOVE EMOJI AND URLS
05 REMOVE STOPWORDS
06 LEMMATIZATION
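The six steps above could be chained roughly as in the sketch below. It is not the group's actual pipeline: the stopword set is a tiny hypothetical sample, `lemmatize` is a placeholder that returns its input unchanged, and keeping only the Bengali Unicode block (U+0980 to U+09FF) is one assumed way to drop non-Bengali text, emoji, and most punctuation in a single pass.

```python
import re

# Hypothetical, tiny stopword sample for illustration; a real list is much longer.
STOPWORDS = {"এবং", "ও", "এই", "যে"}

# URLs are removed first so that Bengali characters inside a link do not survive.
URL_RE = re.compile(r"https?://\S+|www\.\S+")

# Anything outside the Bengali block (U+0980-U+09FF) or whitespace is replaced
# by a space: Latin text, emoji, ASCII punctuation, and the danda (U+0964).
NON_BENGALI_RE = re.compile(r"[^\u0980-\u09FF\s]")


def lemmatize(token):
    # Placeholder (assumption): a real system would map inflected forms to a
    # lemma with a Bangla lemmatizer; this sketch returns the token unchanged.
    return token


def preprocess(comment):
    """Return a list of cleaned Bengali tokens for one comment."""
    text = URL_RE.sub(" ", comment)
    text = NON_BENGALI_RE.sub(" ", text)
    tokens = text.split()  # whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [lemmatize(t) for t in tokens]


print(preprocess("বাংলাদেশ আজ দারুণ খেলেছে!!! 😍 https://example.com great match"))
```

Zero-width joiners, which some Bangla text uses, fall outside the Bengali block and would also be stripped by this filter, so a production pipeline may need to handle them separately.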
Slide 10
METHODOLOGY
Dataset
Preprocessing
Feature Extraction:
• TF-IDF
• Bag of Words
Machine Learning Models:
• LR
• KNN
• SVM
• Random Forest
• XGBoost
Deep Learning Model:
• LSTM
Result Analysis
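To make the feature extraction step concrete, here is a minimal pure-Python sketch of Bag of Words counts and TF-IDF weights. The slides do not say which weighting variant or library was used; the smoothed formula idf = ln((1 + N) / (1 + df)) + 1 is an assumption, as are the function names and the toy documents.

```python
import math
from collections import Counter


def bag_of_words(tokens):
    """Raw term counts for one tokenized document."""
    return Counter(tokens)


def tf_idf(docs):
    """TF-IDF weights per document, with idf = ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = bag_of_words(doc)
        weights.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights


# Toy tokenized comments (illustrative only, not from the dataset).
docs = [["ভালো", "খেলা"], ["বাজে", "খেলা"], ["ভালো", "দল", "ভালো"]]
for w in tf_idf(docs):
    print(w)
```

Terms that appear in fewer documents receive a larger weight, so a rare word contributes more to a comment's feature vector than a word shared by most comments. The resulting vectors would then be passed to the classifiers listed above.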
Slide 11
Evaluation Metrics
CLASSIFIER            ACCURACY    PRECISION   RECALL
LOGISTIC REGRESSION   [missing]   [missing]   [missing]
SVM                   0.7214      0.6852      0.7215
RANDOM FOREST         0.7065      0.6754      0.7012
KNN                   0.7114      0.6814      0.7114
XGBOOST               0.7449      0.7350      0.7450
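The three metrics in the table can be computed from true and predicted labels as sketched below. The slides do not state how precision and recall were averaged over the three classes; support-weighted averaging is an assumption here, and the labels are toy data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_precision_recall(y_true, y_pred):
    """Per-class precision and recall, averaged with class-support weights."""
    support = Counter(y_true)    # true instances per class
    predicted = Counter(y_pred)  # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    precision = recall = 0.0
    for label, count in support.items():
        weight = count / n
        # Precision is taken as 0 for a class that is never predicted.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        precision += weight * p
        recall += weight * r
    return precision, recall


# Toy labels (illustrative only).
y_true = ["negative", "negative", "negative", "positive", "positive", "neutral"]
y_pred = ["negative", "negative", "positive", "positive", "negative", "negative"]
print(accuracy(y_true, y_pred))
print(weighted_precision_recall(y_true, y_pred))
```

With support weighting, recall always equals accuracy, which would be consistent with the near-identical Accuracy and Recall values for SVM, KNN, and XGBoost, though the slides do not confirm this.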
Slide 12
RESEARCH GAP
Paper [1]:
• Limited research on deep learning for Bangla toxic comment classification.
• Larger datasets and advanced models needed for better accuracy.
Paper [2]:
• The paper notes a gap in advanced deep learning models for Bengali abusive text detection.
• It highlights the need for an effective Bengali spelling correction mechanism to improve accuracy.
Paper [3]:
• Research on Bangla fake news detection is limited compared to English, and the lack of tools like NLTK hampers accuracy.
• Enhancing Bangla fake news classification requires better data preprocessing and advanced models.
Paper [4]:
• Limited application of sentiment analysis on Bangla text hinders insights.
• A lack of structured resources for cricket-related Bangla sentiment analysis restricts research.
Slide 13
CONCLUSION
1 Extensive research of ML techniques.
2 Random Forest and Neural Networks are highly accurate.
3 Feature engineering and preprocessing are crucial.
4 Larger datasets and real-world tests are needed.
5 The study suggests future cybersecurity improvements.
Slide 14
Related Papers
01 Nayan Banik and Md Hasan Hafizur Rahman. Evaluation of naïve Bayes and support vector machines on Bangla textual movie reviews. In 2018 International Conference on Bangla Speech and Language Processing (ICBSLP), pages 1–6. IEEE, 2018.
02 Estiak Ahmed Emon, Shihab Rahman, Joti Banarjee, Amit Kumar Das, and Tanni Mittra. A deep learning approach to detect abusive Bengali text. In 2019 7th International Conference on Smart Computing & Communications (ICSCC), pages 1–5. IEEE, 2019.
03 Elias Hossain, Md Nadim Kaysar, Abu Zahid Md Jalal Uddin Joy, Md Mizanur Rahman, and Wahidur Rahman. A study towards Bangla fake news detection using machine learning and deep learning. In Sentimental Analysis and Deep Learning: Proceedings of ICSADL 2021, pages 79–95. Springer, 2022.
04 Md Ferdous Wahid, Md Jahid Hasan, and Md Shahin Alom. Cricket sentiment analysis from Bangla text using recurrent neural network with long short term memory model. In 2019 International Conference on Bangla Speech and Language Processing (ICBSLP), pages 1–4. IEEE, 2019.
Slide 15
CONTRIBUTION OF GROUP MEMBERS
MEMBER            WRITING REPORT PAPER                                   PREPARING PRESENTATION
Yousuf Ali        Abstract, Introduction, Conclusion.                    Adib
Adibul Haque      Literature Review, Datasets, References                Mifta
Miftahul Sheikh   Pre-processing Techniques, Models, Evaluation Metrics  Yousuf
Slide 16
THANK YOU