BEHAVIORAL BIOMETRICS DECIPHERED: UNVEILING WITH MACHINE LEARNING TECHNIQUES
BACHELOR OF TECHNOLOGY
in
INFORMATION
TECHNOLOGY
by
N. VAISHNAVI 22K81A1247
J. SHRUTHI 22K81A1227
S. SAI PRANEETH 22K81A1255
K. AJAY KUMAR 22K81A1232
Under the Guidance of
Dr. A. BHEEM RAJ
PROFESSOR
CERTIFICATE
This is to certify that the project entitled “Behavioral Biometrics Deciphered: Unveiling
with Machine Learning Techniques” is being submitted by
N. VAISHNAVI (22K81A1247), J. SHRUTHI (22K81A1227), S. SAI PRANEETH
(22K81A1255) and K. AJAY KUMAR (22K81A1232) in fulfilment of the requirements for
the award of the degree of BACHELOR OF TECHNOLOGY in INFORMATION
TECHNOLOGY, and is a record of bonafide work carried out by them. The results embodied
in this report have been verified and found satisfactory.
DEPARTMENT OF
INFORMATION TECHNOLOGY
DECLARATION
N. VAISHNAVI 22K81A1247
J. SHRUTHI 22K81A1227
ACKNOWLEDGEMENT
First and foremost, we would like to express our deep sense of gratitude
and indebtedness to our College Management for their kind support and permission to
use the facilities available in the Institute.
LIST OF FIGURES
6.10 Result 50
LIST OF TABLES
LIST OF ACRONYMS AND DEFINITIONS
4. NB Naïve Bayes
CONTENTS
ACKNOWLEDGEMENT I
ABSTRACT II
LIST OF FIGURES III
LIST OF TABLES IV
LIST OF ACRONYMS AND DEFINITIONS V
CHAPTER 1 INTRODUCTION 01
CHAPTER 2 LITERATURE SURVEY 02
CHAPTER 3 SYSTEM ANALYSIS AND DESIGN 05
3.1 Existing System 05
3.2 Proposed System 05
CHAPTER 4 SYSTEM REQUIREMENTS & SPECIFICATIONS 07
4.1 Database 07
4.2 DNN & SVM Algorithm 09
4.3 Design 11
4.3.1 System Architecture 11
4.3.2 Architecture diagrams 11
4.3.3 Input and Output Design 16
4.4 Modules 18
4.4.1 Modules Description 18
4.5 System Requirements 20
4.5.1 Hardware Requirements 20
4.5.2 Software Requirements 20
4.5.3 System study 21
4.6 Testing 23
4.6.1 Unit Testing 23
4.6.2 Integration Testing 23
4.6.3 Functional Testing 24
4.6.4 System Testing 24
4.6.5 White Box Testing 24
4.6.6 Black Box Testing 25
4.6.7 Unit Testing 25
4.6.8 Integration Testing 25
4.6.9 Acceptance Testing 26
CHAPTER 5 SOURCE CODE 28
CHAPTER 6 EXPERIMENTAL RESULTS 47
CHAPTER 7 CONCLUSION & FUTURE ENHANCEMENT 51
7.1 CONCLUSION 51
7.2 FUTURE ENHANCEMENT 51
REFERENCES 52
Patent/Publication
CHAPTER 1
INTRODUCTION
CHAPTER 2
LITERATURE SURVEY
“Sensor-Based Continuous Authentication of Smartphones' Users Using Behavioral
Biometrics: A Contemporary Survey”
2) AUTHORS: Abuhamad, M., Abusnaina, A., Nyang, D. and Mohaisen, D.
Mobile devices and technologies have become increasingly popular, offering storage and
computational capabilities comparable to those of desktop computers and allowing users to
store and interact with sensitive and private information. The security and protection of such
personal information are becoming more and more important, since mobile devices are
vulnerable to unauthorized access or theft. User authentication is a task of paramount
importance that grants access to legitimate users at the point of entry and continuously
throughout the usage session. This task is made possible by today's smartphones' embedded
sensors, which enable continuous and implicit user authentication by capturing behavioral
biometrics and traits. In this paper, we survey more than 140 recent behavioral
biometric-based approaches for continuous user authentication, including motion-based
methods (28 studies), gait-based methods (19 studies), keystroke dynamics-based methods
(20 studies), touch gesture-based methods (29 studies), voice-based methods (16 studies),
and multimodal-based methods (34 studies). The survey provides an overview of the current
state-of-the-art approaches for continuous user authentication using behavioral biometrics
captured by smartphones' embedded sensors, including insights and open challenges for
adoption, usability, and performance.
“A Systematic Review on Gait Based Authentication System”
3) AUTHORS: Divya, R. and Lavanya, R.
Biometric frameworks are becoming progressively important, since they are more
reliable and proficient for identity confirmation. One such biometric is gait. The pattern by
which an individual walks is referred to as gait. It is a form of locomotion achieved through
the movement of a person's limbs. Unlike several other approaches, gait is a behavioral
biometric that is taken into consideration for user authentication, as it shows distinct
patterns for every individual. Also, its low obtrusiveness for the user has made this
biometric method more advantageous compared to others. In this survey we concentrate on
varied gait approaches and applications, and on the various machine learning techniques
that can be used for classification of gait features.
“Identifying users of portable devices from gait pattern with
accelerometers”
AUTHORS: Mantyjarvi, J., Lindholm, M., Vildjiounaite, E., Makela, S. and Ailisto, H.
Identifying users of portable devices from gait signals acquired with three-dimensional
accelerometers was studied. Three approaches were used: correlation, frequency domain
analysis and data distribution statistics. Test subjects (N = 36) walked at fast, normal and
slow walking speeds in enrolment and test sessions on separate days, wearing the
accelerometer device on their belt, at the back. It was shown to be possible to identify users
with this novel gait recognition method. The best equal error rate (EER = 7%) was achieved
with the signal correlation method, while the frequency domain method and two variations
of the data distribution statistics method produced EERs of 10%, 18% and 19%, respectively.
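For reference, an equal error rate such as the EER = 7% quoted above is the operating point at which the false accept rate equals the false reject rate. The sketch below shows one common way to estimate it from genuine and impostor match scores using scikit-learn; the score arrays are made-up placeholders, not data from the cited study.

# Minimal sketch: estimating the equal error rate (EER) from match scores.
import numpy as np
from sklearn.metrics import roc_curve

genuine_scores = np.array([0.91, 0.85, 0.78, 0.88, 0.95])    # same-user comparisons (placeholder values)
impostor_scores = np.array([0.41, 0.55, 0.62, 0.30, 0.47])   # different-user comparisons (placeholder values)

labels = np.concatenate([np.ones_like(genuine_scores), np.zeros_like(impostor_scores)])
scores = np.concatenate([genuine_scores, impostor_scores])

fpr, tpr, _ = roc_curve(labels, scores)    # false positive and true positive rates per threshold
fnr = 1 - tpr                              # false negative (false reject) rate
eer_index = np.nanargmin(np.abs(fpr - fnr))
eer = (fpr[eer_index] + fnr[eer_index]) / 2
print("Approximate EER:", eer)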
CHAPTER 3
SYSTEM ANALYSIS AND DESIGN
ADVANTAGES OF PROPOSED SYSTEM:
CHAPTER 4
SYSTEM REQUIREMENTS & SPECIFICATIONS
4.1 DATABASE:
Biometric Data: This includes fingerprints, facial recognition data, iris scans, and related
physiological traits.
Examples:
Labeled Faces in the Wild (LFW): A database for studying the problem of unconstrained
face recognition.
Textual Data: This can be used for natural language processing tasks like
authorship identification or text-based identity verification.
Examples:
Behavioral Data: Data capturing user behavior, such as typing patterns, mouse
movements, and smartphone usage patterns.
Examples:
MIT Mouse Dynamics Challenge Dataset: Contains mouse movement data for
behavioral biometrics research.
Aalto University Dataset for Mobile Behavioral Biometrics: Contains smartphone usage
data.
Demographic Data: Includes data about individuals' age, gender, location, etc.,
often used in conjunction with other data types for identity verification.
Examples:
UCI Adult Dataset: Contains demographic information for machine learning tasks.
Multimodal Data: Combines multiple data types, such as video, audio, and text,
to improve the accuracy of identity verification systems.
Examples:
CASIA Iris Dataset: Iris images for biometric research.
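As a concrete illustration of working with one of the datasets listed above, the short sketch below loads the Labeled Faces in the Wild (LFW) images through scikit-learn's built-in fetcher. It is only an illustrative example and is not part of the project's own pipeline.

# Minimal sketch: loading the LFW face dataset with scikit-learn's built-in fetcher.
from sklearn.datasets import fetch_lfw_people

# keep only people with at least 70 images; shrink images to 40% of their original size
lfw = fetch_lfw_people(min_faces_per_person=70, resize=0.4)

print(lfw.images.shape)    # (n_samples, height, width) grey-scale face images
print(lfw.target_names)    # names of the identities in the dataset
print(len(lfw.target))     # one identity label per image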
4.2 DNN & SVM ALGORITHM:
A Deep Neural Network (DNN) is organised as a stack of layers of interconnected neurons:
1. Input Layer:
o The initial layer that receives the input data.
2. Hidden Layers:
o Multiple layers where each neuron applies a weighted sum of its inputs followed
by a non-linear activation function (like ReLU, sigmoid, or tanh).
3. Output Layer:
o The final layer that produces the prediction or classification. In classification
tasks, the softmax activation function is commonly used to output probabilities.
Training Process
1. Forward Propagation:
o Input data is passed through the network, layer by layer, with each layer
transforming the data using weights and activation functions.
2. Loss Calculation:
o The network's output is compared to the true labels using a loss function (e.g.,
cross-entropy for classification tasks).
3. Backpropagation:
o Gradients of the loss with respect to each weight are calculated using the chain
rule, and weights are updated using optimization algorithms like stochastic
gradient descent (SGD).
4. Iteration:
o The process is repeated for many epochs until the model converges to an optimal
set of weights (a minimal code sketch of this training loop follows below).
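The forward-propagation, loss and backpropagation steps described above can be expressed compactly with a deep learning library. The sketch below uses Keras (shipped with TensorFlow, which the project code in Chapter 5 imports); the layer sizes and the synthetic data are illustrative assumptions, not the project's actual configuration.

# Minimal DNN training sketch: forward pass, cross-entropy loss, backpropagation via SGD.
import numpy as np
import tensorflow as tf

num_features, num_classes = 91, 18                           # illustrative sizes
X = np.random.rand(1000, num_features).astype("float32")     # placeholder feature matrix
y = np.random.randint(0, num_classes, size=1000)             # placeholder labels

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_features,)),
    tf.keras.layers.Dense(128, activation="relu"),             # hidden layer 1
    tf.keras.layers.Dense(64, activation="relu"),              # hidden layer 2
    tf.keras.layers.Dense(num_classes, activation="softmax"),  # output class probabilities
])
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=32, verbose=0)           # repeated forward/backward passes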
Support Vector Machines (SVMs) are supervised learning models used for
classification and regression tasks. SVMs work by finding the hyperplane that best
separates the classes in the feature space.
Key Concepts
1. Hyperplane:
o A decision boundary that separates different classes in the feature space. The
optimal hyperplane maximizes the margin between the closest points of the
classes (support vectors).
2. Support Vectors:
o The data points closest to the hyperplane, which are critical in defining the
position and orientation of the hyperplane.
3. Kernel Trick:
o SVMs can be extended to non-linear classification using kernel functions (e.g.,
linear, polynomial, RBF). The kernel trick maps the input features into higher-
dimensional space where a linear separator is possible.
Training Process
1. Define Objective:
o The goal is to find the hyperplane that maximizes the margin between classes.
This involves solving a quadratic optimization problem.
2. Optimization:
o Use optimization techniques to find the support vectors and the optimal
hyperplane.
3. Prediction:
o For a new data point, the SVM predicts the class based on which side of the
hyperplane it falls (a short scikit-learn sketch follows this list).
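The following is a minimal scikit-learn sketch of the SVM workflow just described: a maximum-margin classifier with an RBF kernel is fitted and then used to score unseen samples. The toy iris data and the hyperparameter values are illustrative choices, not the project's settings.

# Minimal sketch: training an SVM with an RBF kernel and scoring held-out samples.
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)            # small toy dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler().fit(X_train)                 # SVMs are sensitive to feature scale
clf = SVC(kernel="rbf", C=1.0, gamma="scale")          # kernel trick: RBF feature mapping
clf.fit(scaler.transform(X_train), y_train)

print("support vectors per class:", clf.n_support_)    # the points that define the margin
print("test accuracy:", clf.score(scaler.transform(X_test), y_test))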
In some scenarios, combining the strengths of DNNs and SVMs can yield better results.
For example:
1. Data Preprocessing:
o Collect and preprocess the data (e.g., images, text) to be used for identity
recognition.
2. Feature Extraction:
o Train a DNN to extract meaningful features from the data. For images, this could
involve using convolutional layers followed by fully connected layers.
3. Feature Selection:
o Select the most relevant features from the DNN output, potentially reducing
dimensionality with techniques like PCA.
4. Training SVM:
o Train an SVM using the extracted features to classify identities.
5. Evaluation and Tuning:
o Evaluate the model's performance using metrics like accuracy, precision, recall,
and F1-score. Tune hyperparameters for both the DNN and SVM to improve
performance (a schematic sketch of this hybrid pipeline follows).
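The five steps above can be sketched end to end as follows. This is only a schematic illustration under simplifying assumptions (a small Keras feature extractor, PCA to 16 components, synthetic data); it is not the implementation used in Chapter 5.

# Schematic sketch of the DNN-feature-extraction + PCA + SVM pipeline (synthetic data).
import numpy as np
import tensorflow as tf
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X = np.random.rand(500, 91).astype("float32")   # placeholder preprocessed features
y = np.random.randint(0, 5, size=500)           # placeholder identity labels

# Steps 1-2: train a small DNN, then reuse its penultimate layer as a feature extractor.
dnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(91,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu", name="feature_layer"),
    tf.keras.layers.Dense(5, activation="softmax"),
])
dnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
dnn.fit(X, y, epochs=5, verbose=0)
extractor = tf.keras.Model(dnn.input, dnn.get_layer("feature_layer").output)

# Step 3: reduce the dimensionality of the learned features with PCA.
features = PCA(n_components=16).fit_transform(extractor.predict(X, verbose=0))

# Steps 4-5: train an SVM on the reduced features and evaluate it on a hold-out split.
f_train, f_test, y_train, y_test = train_test_split(features, y, test_size=0.2, random_state=0)
svm = SVC(kernel="rbf").fit(f_train, y_train)
print("hold-out accuracy:", svm.score(f_test, y_test))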
4.3 DESIGN
1. The DFD is also called a bubble chart. It is a simple graphical formalism that can
be used to represent a system in terms of the input data to the system, the various
processing carried out on this data, and the output data generated by the system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It is
used to model the system components. These components are the system processes,
the data used by the processes, the external entities that interact with the system and
the information flows in the system.
3. The DFD shows how information moves through the system and how it is modified
by a series of transformations. It is a graphical technique that depicts information
flow and the transformations that are applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction, and may be
partitioned into levels that represent increasing information flow and functional detail.
UML DIAGRAMS
The goal is for UML to become a common language for creating models of
object-oriented computer software. In its current form, UML comprises two major
components: a meta-model and a notation. In the future, some form of method or process
may also be added to, or associated with, UML.
The UML represents a collection of best engineering practices that have proven successful
in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented software and the
software development process. The UML uses mostly graphical notations to express the
design of software projects.
GOALS:
CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language (UML) is a
type of static structure diagram that describes the structure of a system by showing the
system's classes, their attributes, operations (or methods), and the relationships among the
classes. It also explains which class contains which information.
SEQUENCE DIAGRAM:
A sequence diagram in Unified Modeling Language (UML) is a kind of interaction
diagram that shows how processes operate with one another and in what order. It is a
construct of a Message Sequence Chart. Sequence diagrams are sometimes called event
diagrams, event scenarios, and timing diagrams.
ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise activities and
actions with support for choice, iteration and concurrency. In the Unified Modeling
Language, activity diagrams can be used to describe the business and operational step-by-
step workflows of components in a system. An activity diagram shows the overall flow of
control.
4.3.3 INPUT AND OUTPUT DESIGN:
INPUT DESIGN
The input design is the link between the information system and the user. It
comprises developing the specifications and procedures for data preparation, that is, the
steps necessary to put transaction data into a usable form for processing. This can be
achieved by having the computer read data from a written or printed document, or by
having people key the data directly into the system. The design of input focuses on
controlling the amount of input required, controlling errors, avoiding delay, avoiding
extra steps and keeping the process simple. The input is designed in such a way that it
provides security and ease of use while retaining privacy. Input design considered the
following objectives:
OBJECTIVES
1. Input design is the process of converting a user-oriented description of the
input into a computer-based system. This design is important to avoid errors in the data
input process and to show the correct direction to the management for getting correct
information from the computerized system.
2. It is achieved by creating user-friendly screens for data entry that can handle
large volumes of data. The goal of designing input is to make data entry easier and
free from errors. The data entry screen is designed in such a way that all data
manipulations can be performed. It also provides record-viewing facilities.
3. When the data is entered, it is checked for validity. Data can be entered
with the help of screens. Appropriate messages are provided as and when needed, so that
the user is never left in a maze. Thus the objective of input design is to create an input
layout that is easy to follow.
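Since the project is implemented as a Django application (see Chapter 5), these input-design goals, validating entered data and showing appropriate messages, are typically met with a Django form. The sketch below is a generic, hypothetical registration form written for illustration; it is not taken from the project's source code, and its field names are assumptions.

# Hypothetical Django form illustrating server-side input validation and error messages.
from django import forms

class RegistrationInputForm(forms.Form):
    name = forms.CharField(max_length=100)
    email = forms.EmailField()              # rejects malformed email addresses automatically
    mobile = forms.CharField(max_length=10)

    def clean_mobile(self):
        mobile = self.cleaned_data["mobile"]
        if not mobile.isdigit() or len(mobile) != 10:
            # surfaced to the user as an appropriate message on the data entry screen
            raise forms.ValidationError("Enter a valid 10-digit mobile number.")
        return mobile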
OUTPUT DESIGN
A quality output is one which meets the requirements of the end user and
presents the information clearly. In any system, the results of processing are communicated
to the users and to other systems through outputs. In output design it is determined how the
information is to be displayed for immediate need, as well as the hard copy output. It is the
most important and direct source of information to the user. Efficient and intelligent output
design improves the system's relationship with the user and supports decision-making.
1. Designing computer output should proceed in an organized, well-thought-out
manner; the right output must be developed while ensuring that each output element is
designed so that people find the system easy to use and effective. When analysts design
computer output, they should identify the specific output that is needed to meet the
requirements.
3. Create documents, reports, or other formats that contain information
produced by the system.
The output form of an information system should accomplish one or more of the
following objectives.
4.4 MODULES:
User
Admin
Data Preprocessing
Machine Learning
MODULES DESCRIPTION:
User:
The user registers first. While registering, the user must provide a valid email address and
mobile number for further communication. Once the user has registered, the admin can
activate the account, and only then can the user log in to the system. The user can upload a
dataset whose columns match the expected dataset format; for algorithm execution the data
must be in int or float format. Here we took the Adacel Technologies Limited dataset for
testing purposes. The user can also add new records to the existing dataset through our
Django application. The user can click Data Preparations in the web page so that the
data-cleaning process starts; the cleaned data and the required graphs are then displayed.
Admin:
The admin can log in with his login details and activate the registered users; only after
activation can a user log in to the system. The admin can view the users, view the overall
data in the browser, and load the data. The admin can view the training data list and the
test data list, load the data, and view the forecast results.
Data Preprocessing:
A dataset can be viewed as a collection of data objects, which are often also called
sentiment tweets, labelled as positive, negative or neutral. Data objects are described by a
number of features that capture the basic characteristics of an object, such as the mass of a
physical object or the time at which an event occurred.
The study is based on a pipeline that involves preprocessing, sentiment analysis, topic
modeling, natural language processing and statistical analysis of Twitter data extracted in
the form of tweets. We use a large amount of tweet and sentiment data.
Machine learning:
Based on the split criterion, the cleaned data is split into 80% training and 20% test data,
and the dataset is then subjected to a machine learning classifier such as a Natural Language
Processing (NLP) model. Sentiment analysis is performed by fine-tuning auto-encoding
models like BERT and ALBERT to achieve a comprehensive understanding of public
sentiment. Thus, we have analyzed the results of our experiment and methodology using
the contextual information and verified the insights. The split step is sketched below.
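The 80% / 20% split criterion mentioned above corresponds directly to scikit-learn's train_test_split. The sketch below shows the split followed by fitting and scoring a simple classifier on placeholder data; the project's actual training code appears in Chapter 5.

# Minimal sketch: 80% training / 20% test split of the cleaned data, then fit and score a classifier.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X = np.random.rand(200, 10)              # placeholder cleaned feature matrix
y = np.random.randint(0, 2, size=200)    # placeholder labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)   # 80% train, 20% test

clf = DecisionTreeClassifier().fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))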
4.5 SYSTEM REQUIREMENTS
4.5.1 HARDWARE REQUIREMENTS:
System : Intel i3
RAM : 4 GB
4.5.2 SOFTWARE REQUIREMENTS:
Designing : HTML, CSS, JavaScript
4.5.3 SYSTEM STUDY:
FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase and a business proposal is
put forth with a very general plan for the project and some cost estimates. During
system analysis, the feasibility study of the proposed system is carried out. This is
to ensure that the proposed system is not a burden to the company. For feasibility
analysis, some understanding of the major requirements for the system is essential.
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have
on the organization. The amount of funds that the company can pour into the research and
development of the system is limited. The expenditures must be justified. The developed
system is well within the budget, and this was achieved because most of the technologies
used are freely available; only the customized products had to be purchased.
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not have a high demand on the
available technical resources, as this would place high demands on the client. The developed
system must therefore have modest requirements, as only minimal or null changes are
required for implementing this system.
SOCIAL FEASIBILITY
This aspect of the study is to check the level of acceptance of the system by the user.
This includes the process of training the user to use the system efficiently. The user must
not feel threatened by the system, but must instead accept it as a necessity. The level of
acceptance by the users solely depends on the methods that are employed to educate the
user about the system and to make him familiar with it. His level of confidence must be
raised so that he is also able to make some constructive criticism, which is welcomed, as
he is the final user of the system.
4.6 TESTING
SYSTEM TESTING:
The purpose of testing is to discover errors. Testing is the process of trying to discover
every conceivable fault or weakness in a work product. It provides a way to check the
functionality of components, subassemblies, assemblies and/or the finished product. It is the
process of exercising software with the intent of ensuring that the software system meets
its requirements and user expectations and does not fail in an unacceptable manner. There
are various types of tests, and each test type addresses a specific testing requirement.
TYPES OF TESTS
4.6.3 Functional Testing
4.6.6 Black Box Testing
Black box testing is testing the software without any knowledge of the
inner workings, structure or language of the module being tested. Black box tests, like most
other kinds of tests, must be written from a definitive source document, such as a
specification or requirements document. It is testing in which the software under test is
treated as a black box: you cannot “see” into it. The test provides inputs and responds to
outputs without considering how the software works.
4.6.7 Unit Testing
Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit testing to
be conducted as two distinct phases.
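As an illustration of this phase, the snippet below sketches a Django unit test that exercises the user-login view shown in Chapter 5. The URL path and the set of model fields supplied here are assumptions made for illustration and would need to match the project's actual routes and model definition.

# Hypothetical Django unit test for the user-login flow (URL and model fields are assumed).
from django.test import TestCase
from users.models import UserRegistrationModel

class UserLoginTests(TestCase):
    def setUp(self):
        # create a pre-activated user, mirroring the admin-activation step
        UserRegistrationModel.objects.create(
            name="test", loginid="test", password="secret",
            email="test@example.com", status="activated")

    def test_activated_user_can_log_in(self):
        # field names 'loginname' and 'pswd' follow the UserLoginCheck view in Chapter 5
        response = self.client.post("/UserLoginCheck/",
                                    {"loginname": "test", "pswd": "secret"})
        self.assertEqual(response.status_code, 200)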
Test strategy and approach
Field testing will be performed manually and functional tests will be
written in detail.
Test objectives
Features to be tested
4.6.9 Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
4.7 SIMPLE TEST CASES
S.No | Test Case | Expected Result | Result | Remarks (If Fails)
1 | User Register | User registration completes successfully. | Pass | If the user email already exists, registration fails.
2 | User Login | If the username and password are correct, the user reaches the valid page. | Pass | Unregistered users will not be logged in.
3 | SVM | The request will be accepted by the SVM. | Pass | If the request is not accepted by the SVM, the test fails.
4 | Naive Bayes | The request will be accepted by the Naive Bayes classifier. | Pass | If the request is not accepted by the Naive Bayes classifier, the test fails.
5 | View dataset by user | The dataset will be displayed to the user. | Pass | If the results are not correct, the test fails.
7 | Calculate accuracy, macro avg and weighted avg | Macro avg and weighted avg are displayed. | Pass | If macro avg and weighted avg are not calculated, the test fails.
8 | Prediction | The result will be cyberbullying or not cyberbullying. | Pass | Otherwise the test fails.
9 | Admin login | The admin can log in with his credentials; on success he reaches his home page. | Pass | Invalid login details are not allowed.
10 | Admin can activate the registered users | The admin can activate the registered user id. | Pass | If the user id is not found, the user cannot log in.
CHAPTER 5
SOURCE CODE
def UserLoginCheck(request):
    if request.method == "POST":
        loginid = request.POST.get('loginname')
        pswd = request.POST.get('pswd')
        print("Login ID = ", loginid, ' Password = ', pswd)
        try:
            check = UserRegistrationModel.objects.get(loginid=loginid, password=pswd)
            status = check.status
            print('Status is = ', status)
            if status == "activated":
                request.session['id'] = check.id
                request.session['loggeduser'] = check.name
                request.session['loginid'] = loginid
                request.session['email'] = check.email
                print("User id At", check.id, status)
                return render(request, 'users/UserHome.html', {})
            else:
                messages.success(request, 'Your Account has not been activated by Admin.')
                return render(request, 'UserLogin.html')
        except Exception as e:
            print('Exception is ', str(e))
            pass
        messages.success(request, 'Invalid Login id and password')
    return render(request, 'UserLogin.html', {})
def UserHome(request):
    return render(request, 'users/UserHome.html', {})
def TrainModel(request):
import os
import tensorflow as tf
import pandas as pd
import numpy as np
from django.conf import settings
from matplotlib import pyplot as plt
activity_codes_mapping = {'A': 'walking',
'B': 'jogging',
'C': 'stairs',
'D': 'sitting',
'E': 'standing',
'F': 'typing',
'G': 'brushing teeth',
'H': 'eating soup',
'I': 'eating chips',
'J': 'eating pasta',
'K': 'drinking from cup',
'L': 'eating sandwich',
'M': 'kicking soccer ball',
'O': 'playing catch tennis ball',
'P': 'dribbling basket ball',
'Q': 'writing',
'R': 'clapping',
'S': 'folding clothes'}
activity_codes_mapping['E']: 'yellow',
activity_codes_mapping['F']: 'lightgreen',
activity_codes_mapping['G']: 'greenyellow',
activity_codes_mapping['H']: 'magenta',
activity_codes_mapping['I']: 'gold',
activity_codes_mapping['J']: 'cyan',
activity_codes_mapping['K']: 'purple',
activity_codes_mapping['L']: 'lightgreen',
activity_codes_mapping['M']: 'violet',
activity_codes_mapping['O']: 'limegreen',
activity_codes_mapping['P']: 'deepskyblue',
activity_codes_mapping['Q']: 'mediumspringgreen',
activity_codes_mapping['R']: 'plum',
activity_codes_mapping['S']: 'olive'}
# plot the x/y/z accelerometer traces for one activity, optionally limited to an interval (in seconds)
if interval_in_sec is None:
    ax = df1[:].plot(kind='line', x='duration', y=['x', 'y', 'z'], figsize=(25, 7), grid=True)  # , title=act
else:
    ax = df1[:interval_in_sec * 20].plot(kind='line', x='duration', y=['x', 'y', 'z'],
                                         figsize=(25, 7), grid=True)  # , title=act
#accel_phone
raw_par_10_phone_accel['activity'] = raw_par_10_phone_accel['activity_code'].map(activity_codes_mapping)
print(raw_par_10_phone_accel)
#accel_watch
raw_par_20_watch_accel.z = raw_par_20_watch_accel.z.str.strip(';')
raw_par_20_watch_accel.z = pd.to_numeric(raw_par_20_watch_accel.z)
raw_par_20_watch_accel['activity'] = raw_par_20_watch_accel['activity_code'].map(activity_codes_mapping)
print(raw_par_20_watch_accel)
for key in activity_codes_mapping:
show_accel_per_activity('Watch', raw_par_20_watch_accel,
activity_codes_mapping[key], 50)
#gyro_phone
raw_par_35_phone_ang_vel = pd.read_csv(datasetpath + '/' +
raw_par_35_phone_ang_vel.z = raw_par_35_phone_ang_vel.z.str.strip(';')
raw_par_35_phone_ang_vel.z = pd.to_numeric(raw_par_35_phone_ang_vel.z)
raw_par_35_phone_ang_vel['activity'] = raw_par_35_phone_ang_vel['activity_code'].map(activity_codes_mapping)
raw_par_35_phone_ang_vel = raw_par_35_phone_ang_vel[['participant_id',
'activity_code', 'activity', 'timestamp', 'x', 'y', 'z']]
print(raw_par_35_phone_ang_vel)
#gyro_watch
raw_par_45_watch_ang_vel = pd.read_csv(datasetpath + '/' +
'raw/watch/gyro/data_1635_gyro_watch.txt', names = ['participant_id' , 'activity_code' ,
'timestamp', 'x', 'y', 'z'], index_col=None, header=None)
raw_par_45_watch_ang_vel.z = raw_par_45_watch_ang_vel.z.str.strip(';')
raw_par_45_watch_ang_vel.z = pd.to_numeric(raw_par_45_watch_ang_vel.z)
raw_par_45_watch_ang_vel['activity'] = raw_par_45_watch_ang_vel['activity_code'].map(activity_codes_mapping)
raw_par_45_watch_ang_vel = raw_par_45_watch_ang_vel[['participant_id',
'activity_code', 'activity', 'timestamp', 'x', 'y', 'z']]
print(raw_par_45_watch_ang_vel)
features = ['ACTIVITY',
'X0', # 1st bin fraction of x axis acceleration distribution
'X1', # 2nd bin fraction ...
'X2',
'X3',
'X4',
'X5',
'X6',
'X7',
'X8',
'X9',
'Y0', # 1st bin fraction of y axis acceleration distribution
'Y1', # 2nd bin fraction ...
'Y2',
'Y3',
'Y4',
'Y5',
'Y6',
'Y7',
'Y8',
'Y9',
'Z0', # 1st bin fraction of z axis acceleration distribution
'Z1', # 2nd bin fraction ...
'Z2',
'Z3',
'Z4',
'Z5',
'Z6',
'Z7',
'Z8',
'Z9',
'XAVG', # average sensor value over the window (per axis)
'YAVG',
'ZAVG',
'XPEAK', # Time in milliseconds between the peaks in the wave associated with
most activities. heuristically determined (per axis)
'YPEAK',
'ZPEAK',
'XABSOLDEV', # Average absolute difference between each of the 200
readings and the mean of those values (per axis)
'YABSOLDEV',
'ZABSOLDEV',
'XSTANDDEV', # Standard deviation of the 200 window's values (per axis)
***BUG!***
'YSTANDDEV',
'ZSTANDDEV',
'XVAR', # Variance of the 200 window's values (per axis) ***BUG!***
'YVAR',
'ZVAR',
'XMFCC0', # short-term power spectrum of a wave, based on a linear cosine
transform of a log power spectrum on a non-linear mel scale of frequency (13 values per
axis)
'XMFCC1',
'XMFCC2',
'XMFCC3',
'XMFCC4',
'XMFCC5',
'XMFCC6',
'XMFCC7',
'XMFCC8',
'XMFCC9',
'XMFCC10',
'XMFCC11',
'XMFCC12',
'YMFCC0', # short-term power spectrum of a wave, based on a linear cosine
transform of a log power spectrum on a non-linear mel scale of frequency (13 values per
axis)
'YMFCC1',
'YMFCC2',
'YMFCC3',
'YMFCC4',
'YMFCC5',
'YMFCC6',
'YMFCC7',
'YMFCC8',
'YMFCC9',
'YMFCC10',
'YMFCC11',
'YMFCC12',
'ZMFCC0', # short-term power spectrum of a wave, based on a linear cosine
transform of a log power spectrum on a non-linear mel scale of frequency (13 values per
axis)
'ZMFCC1',
'ZMFCC2',
'ZMFCC3',
'ZMFCC4',
'ZMFCC5',
'ZMFCC6',
'ZMFCC7',
'ZMFCC8',
'ZMFCC9',
'ZMFCC10',
'ZMFCC11',
'ZMFCC12',
'XYCOS', # The cosine distances between sensor values for pairs of axes (three
pairs of axes)
'XZCOS',
'YZCOS',
'XYCOR', # The correlation between sensor values for pairs of axes (three pairs of
axes)
'XZCOR',
'YZCOR',
'RESULTANT', # Average resultant value, computed by squaring each matching
x, y, and z value, summing them, taking the square root, and then averaging these values
over the 200 readings
'PARTICIPANT'] # Categorical: 1600-1650
import glob
# path = r'media/wisdm-dataset/arff_files/phone/accel'
path = datasetpath + '/' + 'arff_files/phone/accel'
all_files = glob.glob(path + "/*.arff")
list_dfs_phone_accel = []
print(all_phone_accel)
print(all_phone_accel.info())
all_phone_accel_breakpoint = all_phone_accel.copy()
# all_phone_accel['ACTIVITY'].map(activity_codes_mapping).value_counts()
#_=
all_phone_accel['ACTIVITY'].map(activity_codes_mapping).value_counts().plot(kind =
'bar', figsize = (15,5), color = 'purple', title = 'row count per activity', legend = True,
fontsize = 15)
y = all_phone_accel.ACTIVITY
X = all_phone_accel.drop('ACTIVITY', axis=1)
y_train = X_train['Y']
print('-----Y Train-------')
print(y_train)
print('-------X_test-------')
print(X_test)
print(X_test)
print('len - ',len(X_test))
print('type - ',type(X_test))
X_test.to_csv('TestDataFrame.csv')
import pandas as pd
import matplotlib.pyplot as plt
import os
import pickle
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import classification_report
dt_classifier = DecisionTreeClassifier()
# grid search over the decision-tree hyperparameters (illustrative grid; 5-fold cross-validation)
dt_model_gs = GridSearchCV(dt_classifier, param_grid={'max_depth': [5, 10, 20, None]}, cv=5)
dt_model_gs.fit(X_train, y_train)
print('-------Fit Done---------')
print(dt_model_gs.best_params_)
dt_best_classifier = dt_model_gs.best_estimator_
pickle.dump(dt_best_classifier, open('Ajmodel.pkl', 'wb'))
print('-------Pickling Model Dumped------')
y_test_pred = dt_best_classifier.predict(X_test)
report_dict = classification_report(y_true=y_test, y_pred=y_test_pred, output_dict=True)
classification_report_html = pd.DataFrame(report_dict).transpose().to_html()
def Predict(request):
    if request.method == 'POST':
        activity_codes_mapping = {'A': 'walking',
                                  'B': 'jogging',
                                  'C': 'stairs',
                                  'D': 'sitting',
                                  'E': 'standing',
                                  'F': 'typing',
                                  'G': 'brushing teeth',
                                  'H': 'eating soup',
                                  'I': 'eating chips',
                                  'J': 'eating pasta',
                                  'K': 'drinking from cup',
                                  'L': 'eating sandwich',
                                  'M': 'kicking soccer ball',
                                  'O': 'playing catch tennis ball',
                                  'P': 'dribbling basket ball',
                                  'Q': 'writing',
                                  'R': 'clapping',
                                  'S': 'folding clothes'}
        import os
        from django.conf import settings
        import pickle
        import pandas as pd
        index_no = request.POST.get('index_no')
        print(index_no)
        print('type ----> ', type(index_no))
        modelPath = os.path.join(settings.MEDIA_ROOT, 'Ajmodel.pkl')
        testDataPath = os.path.join(settings.MEDIA_ROOT, 'TestDataFrame.csv')
        # note: the three loading/prediction lines below are an assumed completion of the
        # truncated excerpt: load the pickled model and the saved test rows, then predict
        model = pickle.load(open(modelPath, 'rb'))
        test_df = pd.read_csv(testDataPath, index_col=0)
        pred_result = model.predict(test_df.iloc[[int(index_no)]])
        activity = activity_codes_mapping.get(pred_result[0])
        return render(request, 'users/prediction.html', {'activity': activity})
    else:
        return render(request, 'users/prediction.html', {})
Base.html:
{% load static %}
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>offensive language</title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Vendor CSS Files -->
<link href="{% static 'vendor/animate.css/animate.min.css'%}" rel="stylesheet">
<link href="{% static 'vendor/aos/aos.css'%}" rel="stylesheet">
<link href="{% static 'vendor/bootstrap/css/bootstrap.min.css'%}" rel="stylesheet">
<link href="{% static 'vendor/bootstrap-icons/bootstrap-icons.css'%}" rel="stylesheet">
<link href="{% static 'vendor/boxicons/css/boxicons.min.css'%}" rel="stylesheet">
<link href="{% static 'vendor/glightbox/css/glightbox.min.css'%}" rel="stylesheet">
<link href="{% static 'vendor/swiper/swiper-bundle.min.css'%}" rel="stylesheet">
<!-- =======================================================
* Template Name: Flattern
* Updated: May 30 2023 with Bootstrap v5.3.0
* Template URL: https://bootstrapmade.com/flattern-multipurpose-bootstrap-template/
* Author: BootstrapMade.com
* License: https://bootstrapmade.com/license/
======================================================== -->
</head>
<body>
<!-- ======= Header ======= -->
<header id="header" class="d-flex align-items-center">
<div class="container d-flex justify-content-between">
<div class="logo">
<h1 class="text-light"><a href="/index">OFFENSIVE LANGUAGE
DETECTION</a></h1>
<!-- Uncomment below if you prefer to use an image logo -->
</div>
</header><!-- End Header -->
<div class="carousel-container">
<div class="carousel-content animate__animated animate__fadeInUp">
<h2>Machine learning approach: </h2>
<p>Train a machine learning model using labeled data to classify text as
offensive or non-offensive. This typically involves extracting features from the text, such
as n-grams, word embeddings, or syntactic features, and using algorithms like Gaussian
Naive Bayes, Decision Tree (DT), Support Vector Machines (SVM), Random Forest (RF),
Logistic Regression (LR), Multi-Layer Perceptron (MLP), Gradient Boosting (GB) and
AdaBoost.</p>
<div class="text-center"><a href="" class="btn-get-started">Read
More</a></div>
</div>
</div>
</div>
</div>
</div>
{% endblock %}
<main id="main">
<div class="row">
<div class="col-lg-4 col-md-6">
<div class="icon-box" data-aos="fade-up">
<div class="icon"><i class="bi bi-briefcase"></i></div>
<h4 class="title"><a href="">Machine Learning</a></h4>
<p class="description">Machine Learning is a program that analyses data and
learns to predict the outcome.Machine Learning is making the computer learn from
studying data and statistics.</p>
</div>
</div>
<div class="col-lg-4 col-md-6">
<div class="icon-box" data-aos="fade-up" data-aos-delay="100">
<div class="icon"><i class="bi bi-card-checklist"></i></div>
<h4 class="title"><a href="">Data preprocessing</a></h4>
<p class="description">It is a crucial step in data analysis and machine learning
tasks. It involves preparing the raw data to make it suitable for further analysis or model
training.</p>
</div>
</div>
<div class="col-lg-4 col-md-6">
<div class="icon-box" data-aos="fade-up" data-aos-delay="200">
<div class="icon"><i class="bi bi-bar-chart"></i></div>
<h4 class="title"><a href="">Handling Missing Values</a></h4>
<p class="description"> Identify and handle missing values in the dataset. This
can involve techniques such as imputation (replacing missing values with estimated
values) or deletion (removing rows or columns with missing values).</p>
</div>
</div>
<div class="col-lg-4 col-md-6">
<div class="icon-box" data-aos="fade-up" data-aos-delay="200">
<div class="icon"><i class="bi bi-binoculars"></i></div>
<h4 class="title"><a href="">Training</a></h4>
<p class="description">The training set is used to train the machine learning
model. It typically consists of a large portion of the available data, usually around 70-
80%.</p>
</div>
</div>
<div class="col-lg-4 col-md-6">
<div class="icon-box" data-aos="fade-up" data-aos-delay="300">
<div class="icon"><i class="bi bi-brightness-high"></i></div>
<h4 class="title"><a href="">Testing Set</a></h4>
<p class="description">
The testing set is used to evaluate the performance of the trained model. It
should be independent of the training set and should not be used during the training
process.</p>
</div>
</div>
<div class="col-lg-4 col-md-6">
<div class="icon-box" data-aos="fade-up" data-aos-delay="400">
<div class="icon"><i class="bi bi-calendar4-week"></i></div>
<h4 class="title"><a href="">After Traing and Testing</a></h4>
<p class="description">The predicted values are compared with the actual target
values in the testing set to assess the model's accuracy, precision, recall, F1 score, or other
performance metrics, depending on the specific task.</p>
</div>
</div>
</div>
</div>
</section><!-- End Services Section -->
<div class="footer-top">
<div class="container">
<div class="row">
</div>
</div>
</div>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="{% static 'vendor/aos/aos.js' %}"></script>
<script src="{% static 'vendor/bootstrap/js/bootstrap.bundle.min.js' %}"></script>
<script src="{% static 'vendor/glightbox/js/glightbox.min.js' %}"></script>
<script src="{% static 'vendor/isotope-layout/isotope.pkgd.min.js' %}"></script>
<script src="{% static 'vendor/swiper/swiper-bundle.min.js' %}"></script>
<script src="{% static 'vendor/waypoints/noframework.waypoints.js' %}"></script>
<script src="{% static 'vendor/php-email-form/validate.js' %}"></script>
</body>
</html>
Admin side views:
from django.shortcuts import render, HttpResponse
from django.contrib import messages
from users.models import UserRegistrationModel
else:
messages.success(request, 'Please Check Your Login Details')
return render(request, 'AdminLogin.html', {})
def AdminHome(request):
    return render(request, 'admins/AdminHome.html')

def RegisterUsersView(request):
    data = UserRegistrationModel.objects.all()
    return render(request, 'admins/viewregisterusers.html', {'data': data})

def ActivaUsers(request):
    if request.method == 'GET':
        id = request.GET.get('uid')
        status = 'activated'
        print("PID = ", id, status)
        UserRegistrationModel.objects.filter(id=id).update(status=status)
        data = UserRegistrationModel.objects.all()
        return render(request, 'admins/viewregisterusers.html', {'data': data})

def DeleteUsers(request):
    if request.method == 'GET':
        id = request.GET.get('uid')
        status = 'activated'
        print("PID = ", id, status)
        UserRegistrationModel.objects.filter(id=id).delete()
        data = UserRegistrationModel.objects.all()
        return render(request, 'admins/viewregisterusers.html', {'data': data})
CHAPTER 6
EXPERIMENTAL RESULTS
REGISTER FORM
ADMIN HOME PAGE
ACTIVE USER
USER LOGIN PAGE
TRAIN MODEL
PREDICTION
PREDICTION RESULTS
CHAPTER 7
CONCLUSION & FUTURE ENHANCEMENT
7.1 CONCLUSION:
REFERENCES
[1] Weiss, G., Yoneda, K. and Hayajneh, T. (2019) ‘Smartphone and Smartwatch-
Based Biometrics Using Activities of Daily Living', IEEE Access, 7, pp. 133190-133202.
[2] Abuhamad, M., Abusnaina, A., Nyang, D. and Mohaisen, D. (2021) ‘Sensor-Based
Continuous Authentication of Smartphones' Users Using Behavioral Biometrics: A
Contemporary Survey', IEEE Internet of Things Journal, 8(1), pp. 65-84.
[3] Divya, R. and Lavanya, R. (2020) ‘A Systematic Review on Gait Based
Authentication System', 2020 6th International Conference on Advanced Computing and
Communication Systems (ICACCS), pp. 505-509.
[4] Gafurov, D., Snekkenes, E. and Bours, P. (2007) ‘Gait Authentication and
Identification Using Wearable Accelerometer Sensor', 2007 IEEE Workshop on
Automatic Identification Advanced Technologies, pp. 220-225.
[5] Mantyjarvi, J., Lindholm, M., Vildjiounaite, E., Makela, S. and Ailisto, H. (2005)
‘Identifying users of portable devices from gait pattern with accelerometers', IEEE
International Conference on Acoustics, Speech, and Signal Processing, 2, pp. 973-976.