FAKE SOCIAL MEDIA PROFILE DETECTION USING MACHINE LEARNING
Contents
■ Problem Statement
■ Introduction
■ Literature Survey
■ Objectives
■ Proposed Architecture
■ Methodology
■ Project Modules
■ SRS
■ System Design and Modeling
■ Project Plan
■ Project Progress
■ Conclusion
■ References
Problem Statement
■ The goal of this project work is to develop a machine
learning model that can accurately detect fake social
media accounts while minimizing false positives.
■ The model should be able to handle large volumes of
social media data, be resistant to adversarial attacks, and
be scalable to handle future growth in social media usage.
■ The model should also be transparent and interpretable to
help users understand the factors that contribute to the
detection of fake social media accounts.
Introduction
■ Fake social media accounts are a major problem in the digital
world. These accounts can be used to spread misinformation,
conduct online fraud, and manipulate public opinion.
■ Detecting fake social media accounts is a challenging task due to
the large volume of social media users and the sophisticated
techniques used by fraudsters to create these accounts.
■ The input to the machine learning model is a set of features that
describe each social media account and its behavior. These features
can include account creation date, number of followers, number of
posts, content of posts, and engagement metrics.
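As a rough illustration of the feature set described above, the sketch below flattens one account into a numeric vector. The class name `AccountFeatures`, its field names, and the two summary fields standing in for "content of posts" and "engagement metrics" are assumptions made for this example, not the project's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class AccountFeatures:
    """Hypothetical container for the per-account features listed above."""
    created_on: date            # account creation date
    followers: int              # number of followers
    posts: int                  # number of posts
    avg_post_length: float      # assumed stand-in for "content of posts"
    avg_likes_per_post: float   # assumed stand-in for "engagement metrics"

    def to_vector(self, today: date) -> List[float]:
        """Flatten the features into a numeric vector for a classifier."""
        account_age_days = (today - self.created_on).days
        # Guard against division by zero for accounts created today.
        posts_per_day = self.posts / max(account_age_days, 1)
        return [
            float(account_age_days),
            float(self.followers),
            float(self.posts),
            posts_per_day,
            self.avg_post_length,
            self.avg_likes_per_post,
        ]


example = AccountFeatures(
    created_on=date(2023, 1, 1),
    followers=120,
    posts=30,
    avg_post_length=80.0,
    avg_likes_per_post=4.5,
)
vector = example.to_vector(today=date(2023, 1, 31))
print(vector)
```

Derived values such as account age and posts per day are often more informative to a classifier than the raw creation date, which is why the sketch converts the date instead of passing it through.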
Literature Survey
1. Analysis and detection of fake profile over social network.
   Author: Vijay Tiwari. Year: 2017.
   Description and limitations: Reviews many methods for detecting fake profiles and
   their online social bots, and analyses the multi-agent perspective of online social
   networks. However, it does not itself detect fake profiles on social media sites.
2. Detection of Fake and Clone accounts in Twitter using Classification and Distance
   Measure Algorithms
   Authors: P. Sowmya; Madhumita Chatterjee. Year: 2020.
   Description and limitations: Uses two methods for profile cloning detection, one
   based on similarity measures and the other on the C4.5 decision tree algorithm.
   The system detects only cloned social media profiles.
3. Profile Similarity Communication Matching Approaches for Detection of Duplicate
   Profiles in Online Social Network.
   Authors: S. Revathi; Dr. M. Suriakala. Year: 2018.
   Description and limitations: Examines the Node Similarity Communication Matching
   algorithm for recognizing profile cloning in online social networks, based on a
   malicious user's latest activities in the network. Like the previous paper, the
   system detects only cloned profiles.
4. Survey on Fake Profile Detection on Social Sites by Using Machine Learning Algorithm
   Author: Kumud Patel; Year: 2020.
   Description and limitations: Presents a review of fake profile detection on social
   sites using machine learning. The accuracy of this system is less than 80%.
5. Understanding User Profiles on Social Media for Fake News Detection
   Authors: Kai Shu; Suhang Wang; Huan Liu. Year: 2018.
   Description and limitations: The authors performed a comparative analysis of explicit
   and implicit profile features between these user groups, which reveals their
   potential to differentiate fake news. The findings lay the foundation for future
   research on automatic fake news detection.
Objectives
■ To develop a system to detect fraudulent
social media accounts accurately using
machine learning algorithms.
■ To create a highly accurate machine learning
model to classify fraudulent social media
accounts.
■ To reduce malicious activities such as
harassing a person, identity theft, and
privacy violations using machine learning
techniques.
Block Diagram
Methodology
■ Data Collection: Collect a dataset of social media profiles with labeled information
about their authenticity. This dataset can be created manually by experts or using
automated tools to gather data from different social media platforms.
■ Feature Engineering: Extract and select the most relevant features from the
collected data. Features can include user-generated content, user behavior,
metadata, network properties, and other characteristics that differentiate fake and
genuine accounts.
■ Data Preprocessing: Preprocessing steps can include cleaning the data, removing
irrelevant or noisy features, and scaling the features.
■ Model Training: The model should learn to distinguish between fake and genuine
accounts by identifying patterns and behaviors associated with each type of
account.
■ Model Deployment: Once the machine learning model has been trained and
evaluated, it can be deployed for use in detecting fake social media accounts.
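The data preprocessing step above (cleaning and scaling) could look roughly like the following. This is a minimal sketch in plain Python, assuming each account has already been converted to a row of numeric features with `None` marking a missing value; `clean_rows` and `min_max_scale` are hypothetical helper names, not functions from the project.

```python
from typing import List, Optional, Sequence


def clean_rows(rows: Sequence[Sequence[Optional[float]]]) -> List[List[float]]:
    """Drop rows that contain missing values and exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None for value in row):
            continue  # incomplete record
        key = tuple(row)
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        cleaned.append([float(value) for value in row])
    return cleaned


def min_max_scale(rows: List[List[float]]) -> List[List[float]]:
    """Rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0  # constant column
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]


raw = [
    [10, 200],
    [10, 200],    # duplicate of the first row
    [None, 50],   # missing value
    [30, 100],
    [20, 0],
]
cleaned = clean_rows(raw)
scaled = min_max_scale(cleaned)
print(scaled)
```

In a real pipeline the minimum and maximum of each column would normally be computed on the training split only and then reused for validation and deployment data, so that no information leaks from unseen accounts into training.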
Project Modules
1. Dataset Collection
2. Data Preprocessing
3. Model Training
4. Classification/Prediction using ML model
5. Visualization
SOFTWARE REQUIREMENT SPECIFICATION
Scope of Project
■ In this work, we use a Support Vector Machine (SVM)
classifier to identify fraudulent social media accounts.
■ We provide a user interface for entering user features to
determine whether a user account is fraudulent.
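To make the Support Vector Machine idea concrete, here is a minimal sketch of a linear SVM trained by sub-gradient descent on the L2-regularised hinge loss. It is a toy illustration, not the project's implementation: the two features, the data points, the label convention (+1 = fake, -1 = genuine), and the hyperparameters are all assumptions. In practice a library implementation such as scikit-learn's `SVC` would be a more likely choice, although scikit-learn is not among the libraries listed for this project.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],              # assumed labels: +1 = fake, -1 = genuine
    learning_rate: float = 0.1,
    reg: float = 0.01,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Minimise the L2-regularised hinge loss with sub-gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Inside the margin or misclassified: the hinge term is active.
                w = [wj - learning_rate * (reg * wj - yi * xj)
                     for wj, xj in zip(w, xi)]
                b += learning_rate * yi
            else:
                # Outside the margin: only the regulariser contributes.
                w = [wj - learning_rate * reg * wj for wj in w]
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 (fake) or -1 (genuine) from the sign of the decision score."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, already-scaled features: [follower ratio, profile completeness].
X = [
    [0.10, 0.20], [0.20, 0.10], [0.15, 0.30],   # fake
    [0.80, 0.90], [0.90, 0.70], [0.70, 0.80],   # genuine
]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
train_predictions = [predict(w, b, x) for x in X]
print(train_predictions)
```

A linear model also supports the interpretability goal in the problem statement, since the sign and size of each learned weight indicate how a feature pushes an account toward the fake or genuine class.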
User and System requirements
■ User requirements
– The user can detect fake social media accounts using this
system.
– The user needs only an internet-connected device, such as a
mobile, tablet, or laptop, to access the application.
■ System requirements
– The system must be accurate in order to be useful.
– The system should be cloud-based, highly secure, and
protect sensitive data.
Functional Requirements
■ The system should be highly accurate.
■ The system should be scalable and 100% available.
■ The system should be accessible on any device – Mobile, Tablet,
etc.
Non-Functional Requirements
■ Performance Requirements - The system should respond
immediately and show results as quickly as possible.
■ Safety Requirements - The system/application is still under
development, so it should not yet be used in the real world.
■ Security Requirements - NA
SYSTEM DESIGN AND MODELING
Data Flow diagram 0
Data Flow diagram 1
Data Flow diagram 2
Use case diagram
Class diagram
Sequence diagram
Activity diagram
IMPLEMENTATION DETAILS
Software requirements
■ Programming Language – Python
■ Framework – Streamlit (For User Interface)
■ Libraries – NumPy, Pandas, TensorFlow, Keras
■ Database – SQLite
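The deck states that user information is stored in SQLite after signup. A minimal sketch of how that could work with Python's built-in `sqlite3` module is shown below. The table layout, the function names, and the choice of salted PBKDF2 password hashing are assumptions for illustration; the original slides do not describe the schema.

```python
import hashlib
import hmac
import os
import sqlite3


def create_users_table(conn: sqlite3.Connection) -> None:
    """Create the (assumed) users table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS users (
            username      TEXT PRIMARY KEY,
            salt          BLOB NOT NULL,
            password_hash BLOB NOT NULL
        )
        """
    )


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def sign_up(conn: sqlite3.Connection, username: str, password: str) -> bool:
    """Insert a new user; return False if the username is already taken."""
    salt = os.urandom(16)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (username, salt, password_hash) VALUES (?, ?, ?)",
                (username, salt, hash_password(password, salt)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def check_login(conn: sqlite3.Connection, username: str, password: str) -> bool:
    """Verify a password against the stored salted hash."""
    row = conn.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored_hash = row
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


conn = sqlite3.connect(":memory:")  # in-memory database for the demo
create_users_table(conn)
first_signup = sign_up(conn, "alice", "s3cret")
duplicate_signup = sign_up(conn, "alice", "other")
good_login = check_login(conn, "alice", "s3cret")
bad_login = check_login(conn, "alice", "wrong")
print(first_signup, duplicate_signup, good_login, bad_login)
```

Parameterised queries (the `?` placeholders) keep user input out of the SQL text, which matters for a signup form exposed through a web interface.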
Algorithm / Pseudo Code
■ Step 1: Data Collection and Preprocessing - Collect data from social media platforms
(e.g., Twitter, Facebook, Instagram) and preprocess it by removing noise, irrelevant
information, and duplicate records.
■ Step 2: Feature Engineering - Extract features from the preprocessed data that could
potentially distinguish real from fake accounts. Features can include, but are not limited
to, account metadata (e.g., account creation date, number of followers), post content
(e.g., sentiment analysis, topic modeling), and user engagement (e.g., likes, retweets).
■ Step 3: Model Selection and Training - Select a suitable machine learning algorithm
to classify the accounts as real or fake. Train the model using the extracted features and
labeled data (i.e., real or fake account labels).
■ Step 4: Model Evaluation and Deployment - Evaluate the model's performance on a
holdout dataset (i.e., data that the model has not seen before). Fine-tune the model if
necessary and deploy it to detect fake accounts in real time.
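For the evaluation in Step 4, accuracy alone can hide the false positives that the problem statement asks to minimise. The sketch below computes accuracy, precision, recall, and false positive rate on a holdout set. The label convention (1 = fake, 0 = genuine), the function names, and the example labels are assumptions made for illustration and are not results from the project.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count outcomes, treating label 1 (fake) as the positive class."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, guess in zip(y_true, y_pred):
        if guess == 1 and truth == 1:
            counts["tp"] += 1
        elif guess == 1 and truth == 0:
            counts["fp"] += 1   # genuine account wrongly flagged as fake
        elif guess == 0 and truth == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1   # fake account that was missed
    return counts


def safe_ratio(numerator: int, denominator: int) -> float:
    """Return 0.0 instead of failing when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    c = confusion_counts(y_true, y_pred)
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    return {
        "accuracy": safe_ratio(c["tp"] + c["tn"], total),
        "precision": safe_ratio(c["tp"], c["tp"] + c["fp"]),
        "recall": safe_ratio(c["tp"], c["tp"] + c["fn"]),
        "false_positive_rate": safe_ratio(c["fp"], c["fp"] + c["tn"]),
    }


# Made-up holdout labels and predictions, for illustration only.
holdout_truth = [1, 1, 1, 0, 0, 0, 0, 0]
holdout_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = evaluate(holdout_truth, holdout_pred)
print(metrics)
```

Tracking the false positive rate separately makes it possible to tune the decision threshold so that fewer genuine users are flagged, at the cost of missing some fake accounts.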
RESULT
*This is the main home page of the project.
*This is the signup page of the project. The user information is stored in the
SQLite database after signup.
*This is the main page of the project. The user can fill in the profile information to
predict whether the profile is fake or not using ML.
*This output shows that the user is not fake.
Conclusion
■ Detecting fake social media accounts using machine learning
techniques can be an effective way to prevent various types of
social media fraud. Machine learning algorithms can learn
from a large dataset of social media accounts and their
associated features, such as user activity patterns, user profile
information, and network structure, to identify patterns that
distinguish genuine accounts from fake ones.
Future scope
Fake social media account detection using machine learning is an
active research area, and there are several potential future
directions for this field. Here are some of them:
■ Developing more accurate and efficient algorithms
■ Improving feature engineering
■ Addressing data privacy concerns
■ Using deep learning techniques
References
1. Yang, C., Harkreader, R.C., Gu, G.: Die free or live hard? Empirical evaluation and new design for fighting evolving Twitter spammers.
In: Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection, RAID 2011, pp. 318–337. Springer,
Heidelberg (2011)
2. Elyusufi, Y., Seghiouer, H., Alimam, M.A.: Building profiles based on ontology for recommendation custom interfaces. In:
International Conference on Multimedia Computing and Systems (ICMCS), IEEE, pp. 558–562 (2014)
3. Elyusufi, Y., Alimam, M.A., Seghiouer, H.: Recommendation of personalized RSS feeds based on ontology approach and multi-agent
system in web 2.0. J. Theor. Appl. Inf. Technol. 70(2), 324–332 (2014)
4. Elyusufi, Z., Elyusufi, Y., Ait Kbir, M.: Customer profiling using CEP architecture in a Big Data context. In: SCA 2018 Proceedings of
the 3rd International Conference on Smart City Applications, Article No. 64, Tetouan, Morocco, 10–11 October 2018. ISBN: 978-1-4503-6562-8
5. Granik, M., Mesyura, V.: Fake news detection using naive Bayes classifier. In: IEEE First Ukraine Conference on
Electrical and Computer Engineering (UKRCON), May 2017
6. Ameena, A., Reeba, R.: Survey on different classification techniques for detection of fake profiles in social networks. Int. J. Sci.
Technol. Manage. 04(01), (2015)
7. Beatriche, G.: Detection of fake profiles in Online Social Networks (OSNs), Master's degree in Applied Telecommunications and
Engineering Management (MASTEAM), (2018)
THANK YOU