Cyberbullying Detection Using Machine Learning

This document discusses cyberbullying detection using machine learning. It begins with an abstract describing how cyberbullying involves sending harmful messages online to threaten or insult others. It then discusses the need for automatic cyberbullying detection systems using machine learning models. The document proposes developing a live chat application using Naive Bayes classification to detect cyberbullying in messages. It reviews related work on cyberbullying detection using techniques like decision trees, deep learning models, and neural networks. Finally, it describes a proposed two-part cyberbullying detection framework involving natural language processing followed by machine learning classification.

Cyberbullying Detection Using Machine Learning

Submitted by Logasree.S, Harshini.M

B.Sc. Computer Science
Arasu College of Arts and Science for Women

Abstract—
Cyberbullying is the act of sending harmful messages to a person or community, which often provokes heated exchanges between users. It is mostly seen on social networking sites, where users reply to posts with bullying words to threaten or insult other users, and it is considered a misuse of technology. According to recent worldwide surveys, cases of cyberbullying are increasing day by day. Various authors have proposed natural language processing techniques to address this problem, but these are time-consuming and not automatic. With the advancement of machine learning and artificial intelligence, models can be trained and detection can be automated. To demonstrate this, a live chat application with multiple clients and one server is developed in Python. The Naive Bayes algorithm is used to train a model on a Twitter dataset, and this model is used to detect cyberbullying in live messages and to show alert messages in the chat application.

Keywords: Cyberbullying, Machine learning, Naïve Bayes, dataset.

© 2023, IRJEdT Volume: 05 Issue: 04 | April-2023
PROBLEM STATEMENT:

Social networking and online chat applications provide a platform for any user to share knowledge and talent, but a few users exploit these platforms to threaten others with cyberbullying attacks, which makes the platforms harder for everyone to use.

OBJECTIVE:

To provide a better platform for users to share knowledge on social networking sites, there is a need for an effective detection system that can automate the process of detecting cyberbullying and deciding how to respond.

I. INTRODUCTION

Social media is a platform that allows people to post anything, such as photos, videos and documents, and to interact with society. People connect to social media using their computers or smartphones. The most popular social media platforms include Facebook, Twitter, Instagram and TikTok. Nowadays, social media is involved in different sectors such as education and business, and is also used for noble causes. Social media is also boosting the world's economy by creating many new job opportunities.

Although social media has a lot of benefits, it also has some drawbacks. Using these media, malevolent users carry out unethical and fraudulent acts to hurt others' feelings and damage their reputation. Recently, cyberbullying has been one of the major social media issues. Cyberbullying or cyber-harassment refers to bullying or harassment carried out by electronic means. Both are also known as online bullying. As the digital realm has grown and technology has progressed, cyberbullying has become relatively common, particularly amongst adolescents.

As social life moves beyond the physical limits of human interaction and involves unregulated contact with strangers, it is necessary to analyse and study the context of cyberbullying. Cyberbullying makes victims feel that they are being attacked everywhere, as the internet is just a click away. It can have mental, physical, and emotional effects on the victim. Cyberbullying mainly takes place in the form of text or images on social media. If bullying text can be distinguished from non-bullying text, then a system can act accordingly. An efficient cyberbullying detection system can be useful for social media websites and other messaging applications to counter such attacks and reduce the number of cyberbullying cases. The objective of the cyberbullying detection system is to identify cyberbullying text while also taking its meaning into consideration.

II. RELATED STUDY

There are several works on machine learning-based cyberbullying detection. A supervised machine learning algorithm was proposed that uses a bag-of-words approach to detect the sentiment and contextual features of a sentence. This algorithm achieved only 61.9% accuracy. The Massachusetts Institute of Technology conducted a project called Ruminate, which employed a support vector machine to detect cyberbullying in YouTube comments. The researchers combined detection with common-sense reasoning by adding social parameters. Applying probabilistic modelling improved the accuracy of this project to 66.7%. Reynolds proposed a language-based cyberbullying detection method that achieved 78.5% accuracy. The authors used a decision tree and an instance-based learner to achieve this accuracy. To improve cyberbullying detection, the authors of another paper used personality, emotion and sentiment as features.

Several deep learning-based models have also been introduced to detect cyberbullying. A deep neural network-based model was applied to cyberbullying detection using real-world data. The authors first analysed cyberbullying systematically and then used transfer learning for the detection task. Badjatiya presented a method using deep neural network architectures for detecting hate speech. A convolutional neural network-based model has also been proposed to detect cyberbullying. The authors employed word embeddings, in which similar words have similar embeddings. Cheng studied the novel problem of cyberbullying identification in a multi-modal context by collaboratively exploiting social media data. This is difficult, however, because of the complex combination of cross-modal associations among multiple modalities, structural correlations between various social media sessions, and the complex attribute information of different modalities. To overcome these challenges they propose XBully, a novel cyberbullying identification system that first reformulates multi-modal social media data as a heterogeneous network and then tries to learn node embedding representations on it.

Much of the literature on cyberbullying over the past few decades has concentrated on text analysis. Cyberbullying, however, is becoming multi-objective, multi-channel, and multi-form. Conventional text analysis techniques cannot cope with the variety of bullying data on social platforms.

Using neural networks to identify online bullying has become common in recent years. These networks often rely on Long Short-Term Memory (LSTM) layers, either alone or in combination with other layer types. Buan introduced a new neural network model that can be applied to textual media to identify evidence of cyberbullying. The design builds on existing architectures that combine the strengths of LSTM layers with convolutional layers. In addition, their architecture uses stacked core layers, which their study shows improves the network's performance. The design also includes a new type of activation, called "Support Vector Machine-like activation". It is obtained by using L2 weight regularisation and a linear activation function in the activation layer together with a hinge loss function.

Cyberbullying has recently been identified by users of online social networks as a significant national health problem, and the creation of an effective detection model has considerable scientific merit. Other researchers introduced a collection of specific Twitter-derived features covering behaviour, user, and tweet content. They built a supervised machine learning solution for the detection of cyberbullying on the Twitter network. An assessment shows that, based on their proposed features, their detection system obtained an area under the receiver operating characteristic curve of 0.943 and an f-measure of 0.936.

III. BULLYING DETECTION MODEL

In this section, we describe the cyberbullying detection framework, which consists of two major parts as shown in Fig. 1. The first part is called NLP (Natural Language Processing) and the
second part is called ML (Machine Learning). In the first phase, datasets containing bullying texts, messages or posts are collected and prepared for the machine learning algorithms using natural language processing. The processed datasets are then used to train the machine learning algorithms to detect any harassing or bullying message on social media, including Facebook and Twitter.

A. Methodology

• Natural Language Processing: Real-world posts and texts contain various unnecessary characters or words. For example, numbers and punctuation are irrelevant to bullying detection. Before applying the machine learning algorithms to the comments, we need to clean and prepare them for the detection phase. This phase involves various processing tasks, including removal of irrelevant items such as stop-words, punctuation and numbers, as well as tokenization and stemming. After pre-processing, we prepare two important features of the texts as follows:
1) Bag-of-Words: Machine learning algorithms cannot work directly with raw text, so before applying them we must convert the text to vectors of numbers. The processed data is therefore converted to a Bag-of-Words (BoW) representation for the next phase.
2) TF-IDF: This is another feature that we consider for our model. TF-IDF (Term Frequency-Inverse Document Frequency) is a statistical measure that evaluates how relevant a word is to a document in a collection of documents. In bag-of-words, every word is given equal importance, whereas TF-IDF gives more weight to words that occur frequently in a document but rarely across the collection, as they are more useful for classification.

• Machine Learning: This module applies various machine learning approaches, such as Decision Tree (DT), Random Forest, Support Vector Machine and Naive Bayes, to detect bullying messages and text. The classifier with the highest accuracy is identified for a particular public cyberbullying dataset. In the next section, some common machine learning algorithms for detecting cyberbullying in social media texts are discussed.

B. Machine Learning Algorithms

In this section, we discuss the basic mechanisms of several machine learning algorithms. Decision Tree, Naive Bayes and Support Vector Machine are each presented in their own subsection.
1) Decision Tree: The decision tree classifier can be used for both classification and regression. It can help to represent a decision as well as to make one. A decision tree is a tree-like structure where each internal node represents a condition and each leaf node represents a decision. A classification tree returns the class into which the target falls. A regression tree yields the predicted value for a given input.
2) Naive Bayes: Naive Bayes is an efficient machine learning algorithm based on Bayes' theorem. The algorithm makes predictions based on the probability of an object belonging to each class. Binary and multi-class classification problems can be solved quickly using this technique. Based on Bayes' theorem, it finds the probability of an event occurring given the probability of another event that has already occurred, as follows:
p(y|X) = p(X|y) × p(y) / p(X)

3) Support Vector Machine: Support Vector Machine (SVM) is a supervised machine learning algorithm which, like a decision tree, can be applied to both classification and regression. It separates the classes in n-dimensional space, and in many cases it gives accurate results efficiently. In practice, SVM constructs a set of hyperplanes in a high- or infinite-dimensional space and is implemented with a kernel, which transforms the input data space into the required form. For example, the linear kernel uses the ordinary dot product of any two instances as follows:
K(x, xi) = sum(x × xi)
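
As a small worked illustration of the linear kernel above, the following Python sketch computes the dot product of two feature vectors. The function name `linear_kernel` and the two example vectors are hypothetical and chosen only for illustration; they are not taken from the project's code.

```python
def linear_kernel(x, xi):
    """Linear kernel: the dot product of two equal-length feature vectors."""
    if len(x) != len(xi):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, xi))


# Two hypothetical bag-of-words count vectors over the same 4-word vocabulary.
message_a = [1, 0, 2, 1]
message_b = [0, 1, 1, 3]

# 1*0 + 0*1 + 2*1 + 1*3 = 5
print(linear_kernel(message_a, message_b))
```

A larger kernel value means the two messages share more weighted vocabulary, which is the similarity signal a linear SVM works with.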

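The two-part framework of Section III (pre-processing, Bag-of-Words features, then a Naive Bayes classifier) can be sketched end to end in plain Python. This is a minimal sketch under stated assumptions, not the project's implementation: the stop-word list, the four training messages, the labels and all function names are hypothetical, stemming is omitted for brevity, and add-one (Laplace) smoothing is an assumption that the paper does not mention. Because p(X) is the same for every class, the sketch compares only log p(y) + log p(X|y).

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "to", "and", "so", "i"}


def preprocess(text):
    """Lowercase, drop digits and punctuation, tokenize, and remove stop-words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def train_naive_bayes(samples):
    """Count class frequencies and per-class word frequencies (bag-of-words)."""
    class_counts = Counter()
    word_counts = {}
    vocabulary = set()
    for text, label in samples:
        tokens = preprocess(text)
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label that maximises log p(y) + sum of log p(word | y)."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior p(y)
        # Add-one smoothing: every vocabulary word gets one extra count.
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for token in preprocess(text):
            if token in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, for illustration only.
TRAINING = [
    ("You are so stupid and ugly", "bullying"),
    ("Nobody likes you, loser", "bullying"),
    ("Have a great day, friend", "normal"),
    ("Thanks for the great help", "normal"),
]

model = train_naive_bayes(TRAINING)
print(predict(model, "you are a stupid loser"))
print(predict(model, "great, thanks friend"))
```

In a chat application, `predict` could be called on each incoming message and an alert shown when the returned label is "bullying". A realistic model would need a much larger labelled dataset than this toy example.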
IV. FEATURES

❖ Detection of Non-Textual Cyberbullying
We plan to develop an application that handles tweets or other online data containing images. Such images will be fetched from Twitter, their text will be extracted with OCR, and classification will then be done by our SVM or Naive Bayes model.
❖ Expanding Cyberbullying Role Detection beyond Victims and Bullies
Roles such as instigators, defenders, and bystanders will be identified by a model that we train on data collected and labelled for this purpose.
❖ Determining a Victim's Emotional State after a Cyberbullying Incident
A victim may change his/her profile details following such interactions, post content containing negative sentiments, or leave the network abruptly. Such instigating interactions can be flagged for subsequent review by a human, who can then follow up with appropriate actions. Twitter does not allow access to a user's profile for this purpose, so we might create our own system that can identify such changes and determine how the bullying affected the person.
❖ Word Representation Learning for Cyberbullying Detection
Experiments can be performed to generate word embeddings from different datasets, ranging from general corpora (e.g., Wikipedia) to more specialised datasets (e.g., abusive tweets), to compare their effectiveness for cyberbullying detection.
❖ Detecting Cyberbullying in Streaming Data and Real-time
We will detect cyberbullying on Twitter data: an OAuth token will be generated for a Twitter account and used to fetch the tweets.
❖ Evaluating Annotation Judgement
We will annotate each Twitter sentence, and the generated output will be shown as text.

V. FUTURE MODIFICATION

The validity and accuracy of the predictive models used to detect cyberbullying on Twitter depend, in this case, primarily on the correct psychometric categorization of the text. In future, we intend to improve the system by using a more accurate dataset to decide whether or not a message is cyberbullying. We will also apply other machine learning algorithms and check the accuracy of the resulting models. A model with higher accuracy will help to detect bullying more accurately. Another interesting direction for future work would be the detection of fine-grained cyberbullying categories such as threats, curses and expressions of racism and hate. When applied in a cascaded model, the system could find severe cases of cyberbullying with high precision. This would be particularly interesting for monitoring purposes. Additionally, our dataset allows for detection of participant roles typically involved in cyberbullying.

VI. CONCLUSION

The goal of this project is the automatic detection of cyberbullying-related posts on social media. Given the information overload on the web, manual monitoring for cyberbullying has become unfeasible. Automatic detection of signals of cyberbullying would enhance moderation and allow moderators to respond quickly when necessary. However, these posts could just as well indicate that cyberbullying is going on. The main contribution of this project is a system that automatically detects signals of cyberbullying on social media, including different types of cyberbullying, covering posts from bullies, victims and bystanders.

