CCF Using ML and DL

This document discusses using machine learning and deep learning algorithms to detect credit card fraud. It begins with an introduction that describes the growing problem of credit card fraud and how financial institutions need automated fraud detection systems. It then reviews related work that has applied machine learning techniques like decision trees, random forests, and neural networks for credit card fraud detection. The document proposes using a deep learning model with additional layers to extract features and classify credit card transactions as fraudulent or not. It will perform feature selection, test supervised machine learning and deep learning models on a credit card dataset, and evaluate the results using various performance metrics to assess accuracy. The goal is to detect fraudulent transactions using machine learning and deep learning algorithms.

Credit Card Fraud Detection Using ML and DL 2022-23

CHAPTER 1
INTRODUCTION
1.1 CCF
Credit card fraud (CCF) is a form of identity theft in which someone other than the owner makes
an unlawful transaction using a credit card or its account details. A stolen, lost, or
counterfeit credit card can be used to commit fraud. Card-not-present fraud, in which a credit
card number is misused in e-commerce transactions, has also become increasingly common with the
growth of online shopping. The expansion of e-banking and online payment environments has
increased fraud such as CCF, resulting in billions of dollars in annual losses. In this era of
digital payments, CCF detection has become one of the most important goals. For business
owners, it cannot be disputed that the future is heading towards a cashless culture.

As a result, traditional payment methods will be used less in the future and will not help a
firm grow. Customers will only sometimes come into the store with cash in hand; they now give
debit and credit card payments a higher priority. To accept all forms of payment, businesses
will need to upgrade their payment environments. With more card payments, the fraud problem is
anticipated to worsen further during the coming years.
Out of more than 1.4 million reports of identity theft in 2020, 393,207 were incidents
of CCF. This made CCF the second most common type of identity theft reported that year,
behind benefits fraud and theft of official documents. In 2020 there were 365,597 instances of
fraud committed using newly opened credit card accounts [10]. Between 2019 and 2020,
the number of identity theft complaints increased by 113%, while credit card identity theft
reports rose by 44.6%. The theft of credit and debit cards cost the world economy $24.26
billion in 2017. The United States is the most susceptible to card fraud, with 38.6% of reported
card fraud losses in 2018 occurring there.
Financial institutions should therefore prioritise having an automated fraud detection
system. Supervised CCF detection aims to build a machine learning (ML) model from existing
credit card transaction data. The model should be able to distinguish between fraudulent and
non-fraudulent transactions so that it can decide whether an incoming transaction is
fraudulent. The problem involves several fundamental issues, such as the system's response
time, cost sensitivity, and feature pre-processing. ML is a branch
of artificial intelligence that uses a computer to make predictions based on historical data
trends. Several studies have applied ML models to address a wide range of problems.

Department of CSE, VVIT

Deep learning (DL) algorithms have been used in a variety of fields, including computer networks,
intrusion detection, banking, insurance, mobile cellular networks, health care fraud
detection, malware detection in the medical field, video surveillance fraud detection,
location tracking, Android malware detection, home automation, and the prediction of heart
disease. In this study, we investigate the practical use of ML, particularly DL algorithms, to
recognise credit card fraud in the banking sector. The support vector machine (SVM) is a
supervised ML approach for classification problems. It is used in many different fields,
such as public safety [16], image recognition [25], and credit scoring [5]. SVM finds a
hyperplane, determined by the support vectors, that separates the input data, and it can
handle both linear and nonlinear binary classification problems. Historically, neural
networks were the first technique used to detect credit card fraud [4]. DL, a branch of ML,
is the present focus. Deep learning
techniques have drawn much attention recently because of their significant and promising
results in various applications, including computer vision, natural language processing, and
speech.

Deep neural network applications for recognising CCF have been studied [3], employing
a variety of deep learning methods. For this study, we choose the CNN model
and its layers to ascertain whether a transaction in the prepared datasets is fraudulent or
normal. Datasets often contain transactions that are labelled fraudulent
and others that show suspect transaction behaviour. As a result, in this study, we concentrate on
supervised and unsupervised learning.
A major issue in ML is class imbalance, which occurs when the total amount of one class
of data (positive) is significantly less than that of the other class (negative). Several studies
have focused on the classification problem of the unbalanced dataset, and a vast body of research
offers numerous solutions. Nevertheless, to the best of our knowledge, the issue of class
imbalance still needs to be resolved. We propose modifying the DL algorithm of the CNN model by
including additional layers for feature extraction and for classifying credit card transactions
as fraudulent or not. The top attributes from the prepared dataset are ranked using feature selection
methods. Then, CCF is classified using several supervised machine learning and deep
learning models.
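
To make the class imbalance described above concrete, the sketch below computes per-class weights that are inversely proportional to class frequency, one common way of compensating for a rare fraud class. It is only an illustration: the function name `class_weights`, the 0/1 label encoding and the weighting rule are assumptions made here, not the method used in this report.

```python
from collections import Counter


def class_weights(labels):
    """Return a weight per class, inversely proportional to its frequency.

    Uses the common heuristic n_samples / (n_classes * class_count), so the
    rare (fraud) class receives a proportionally larger weight.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy data: 99 legitimate transactions (0) and 1 fraudulent one (1).
labels = [0] * 99 + [1]
weights = class_weights(labels)
print(weights)
```

With such weights, an error on a fraudulent transaction counts far more during training than an error on a legitimate one.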

In this study, the main aim is to detect fraudulent credit card transactions with
the help of ML algorithms and deep learning algorithms. This study makes the following
contributions:

• Feature selection algorithms rank the top features from the CCF transaction
dataset, which helps in class label prediction.
• A deep learning model is proposed by adding several additional layers that are
then used for feature extraction and classification on the credit card fraud
detection dataset.
• The performance of the CNN model is analysed by applying different architectures of CNN layers.
• A comparative analysis is performed between ML and DL algorithms and between the
proposed CNN and a baseline model; the results show that the proposed approach
outperforms existing approaches.
• The performance evaluation measures accuracy, precision, and recall are used to assess
the classifiers. Experiments are performed on the latest credit card
dataset.
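
The evaluation measures named in the last contribution can be sketched as follows. This is a minimal illustration that assumes binary labels with 1 for fraud and 0 for legitimate; the function names are hypothetical and the report's own evaluation code is not shown in the source.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN, treating label 1 as fraud and 0 as legitimate."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Return accuracy, precision and recall as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Defined as 0.0 when the denominator is empty, to avoid dividing by zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical data: 98 legitimate and 2 fraudulent transactions,
# scored by a classifier that never flags anything.
y_true = [0] * 98 + [1] * 2
y_pred = [0] * 100
print(evaluate(y_true, y_pred))
```

On this toy data the classifier reaches an accuracy of 0.98 while its recall is 0.0, which is why accuracy alone is not a sufficient measure on unbalanced fraud data.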

The remainder of this report is organised as follows. Related work is
examined in the second section. Section 3 provides a comprehensive description of the
proposed model and its approach. Section 4 describes the dataset and evaluation metrics,
and presents the analysis and the results of our testing on a real dataset. The report is concluded
in Section 5.

CHAPTER 2
LITERATURE SURVEY
A model based on a decision tree and a fusion of Luhn's and Hunt's algorithms has
been put forth by Prajal Save et al. Luhn's algorithm is employed to validate the credit
card number supplied as input for an incoming transaction. The degree of outliers and
address mismatch is used to evaluate how far each
incoming transaction deviates from the typical profile of the cardholder. In the final stage,
Bayes' theorem strengthens or weakens the general belief. Then an advanced
combination heuristic is used to combine the computed probability with the initial
belief in fraud.
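
Luhn's algorithm, mentioned above as the card number validation step, is a simple checksum. The sketch below shows the standard algorithm; the function name `luhn_is_valid` is chosen here for illustration, and how Prajal Save et al. combine the check with the rest of their model is not described in this report.

```python
def luhn_is_valid(card_number: str) -> bool:
    """Return True if the digits in card_number pass the Luhn checksum."""
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


# A well-known Luhn test number, with and without a corrupted check digit.
print(luhn_is_valid("79927398713"))
print(luhn_is_valid("79927398710"))
```

A passing checksum only shows that the number is well formed; it says nothing about whether the transaction itself is fraudulent.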
J. Vimala Devi et al. described and used three machine learning techniques, Support
Vector Machine, Random Forest, and Decision Tree, to find fraudulent transactions. The
performance of the classifiers or predictors is assessed using a variety of metrics. These
measurements can either be dependent on or independent of prevalence. Additionally,
similar methods are employed in mechanisms that detect credit card fraud, and the outcomes
of these algorithms have been compared.
Popat and Chaudhary presented supervised algorithms. The methods
covered include Deep Learning, Logistic Regression, Naïve Bayes, Support Vector
Machine (SVM), Neural Network, Artificial Immune System, K Nearest Neighbor,
Decision Tree, Fuzzy Logic Based System, and Genetic Algorithm. Algorithms for
detecting credit card fraud show which transactions are likely to be fraudulent. They
compared machine learning techniques for prediction, clustering, and outlier
detection.
Shiyang Xuan et al. employed the Random Forest classifier, training it on the
behavioural traits of credit card transactions. Two kinds of random forest are used to learn
the characteristics of legitimate and fraudulent behaviour: a random forest based on CART
and a random forest based on random trees. Performance measurements are computed to
evaluate the model's efficacy.
Akila and Deepa [17] utilised different methods to detect fraud, including the
Anomaly Detection Algorithm, K-Nearest Neighbor, Random Forest, K-Means, and
Decision Tree. Based on a given scenario, multiple strategies were presented, and the ideal
algorithm for spotting fraudulent transactions was predicted. The system generated a fraud
score for each specific transaction using various criteria and algorithms to forecast the
outcome of fraud.
An approach for detecting fraud using deep networks has been proposed by Xiaohan
Yu et al. The paper described a deep neural network approach for identifying credit card
fraud and discussed the neural network algorithm and the uses of deep neural networks.
The dataset's data skew problems were addressed using focal loss and
preprocessing techniques. According to Siddhant Bagga et al. [24], there are numerous methods
for evaluating whether a transaction is legitimate or fraudulent. They evaluated and compared
the performance of nine techniques on credit card fraud data using various parameters and
metrics: logistic regression, KNN, RF, quadratic discriminant analysis, naive
Bayes, multilayer perceptron, AdaBoost, ensemble learning, and pipelining.
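
The focal loss mentioned above for handling data skew can be sketched for a single binary prediction as follows. This is the general form of the loss, not necessarily the exact variant used by Xiaohan Yu et al.; the function name and the default values of `gamma` and `alpha` are illustrative assumptions.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example.

    p     : predicted probability of the fraud class (label 1)
    y     : true label, 1 for fraud and 0 for legitimate
    gamma : focusing parameter; larger values down-weight easy examples
    alpha : weight of the positive class (1 - alpha for the negative class)
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct prediction contributes far less than a badly wrong one.
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

With `gamma = 0` the expression reduces to a class-weighted cross-entropy; increasing `gamma` shifts the training signal towards the hard, typically minority-class, examples.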
The dataset is balanced using the ADASYN approach. The classifier's performance
is evaluated using accuracy, recall, F1 score, balanced classification rate, and Matthews
correlation coefficient. Based on these measures, the optimum strategy to address the
problem is determined. By analysing the alerts produced by a fraud detection
system (FDS), Urban and Carrasco tested deep neural networks to see how well they can
identify false positives. Ten neural network topologies were used to categorise a set of alerts
generated by an FDS as either valid alerts, representing actual fraud incidents, or invalid
alerts, representing false positives.

CHAPTER 3
ANALYSIS PHASE
3.1 RELATED WORK
In the field of CCF detection, several research studies have been carried out. This
section presents different research studies revolving around CCF detection. Moreover, we
strongly emphasise the research that addressed fraud detection under the problem of class
imbalance. Many techniques are used to detect credit card fraud. Therefore, to study the most
related work in this domain, the main approaches can be categorised as DL, ML, CCF
detection, ensemble and feature ranking, and user authentication approaches.

FIGURE 1. Payment card authorisation process.

Figure 1 shows the commonly used payment card authorisation process for credit
card authentication. There are two ways of authentication: passwords and
biometrics. Biometrics-based authentication can be further divided
into three groups: physiological authentication, behavioural authentication, and
combined authentication.

3.2 SUPERVISED MACHINE LEARNING APPROACHES


Each of ML's various branches can handle a different type of learning problem,
and there are various types of ML framework. ML techniques such as random forest (RF)
provide a solution for CCF detection. A random forest is an
ensemble of decision trees, and most studies use the RF method. Network
analysis can be combined with RF in a single model; this approach is known as APATE. Different
ML methods are available to researchers, including supervised and
unsupervised methods. ML techniques like LR, ANN, DT, SVM, and NB are frequently
employed for CCF identification. These methods can be used in conjunction with ensemble
methods to build reliable detection classifiers.

An artificial neural network (ANN) is a network of interconnected neurons, or nodes. A
feed-forward multilayer perceptron comprises numerous layers: an input layer, an output layer
and one or more hidden layers. The first layer contains the input nodes, which represent the
explanatory variables. Each input is multiplied by a specific weight, and at each
hidden layer node the weighted inputs are added together with a particular bias. An activation
function is then applied to this sum to create the output of each neuron, which is
then transferred to the next layer. Finally, the algorithm's response is provided by the output
layer. The weights are first set randomly and then adjusted using the training set to minimise
the error. All these weights are adjusted by specific algorithms such as backpropagation.
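
The forward pass described above (weighted sum, bias, activation, then on to the next layer) can be sketched in a few lines. The sigmoid activation, the layer sizes and all names below are assumptions chosen for illustration; training by backpropagation is not shown.

```python
import math


def sigmoid(z):
    """Logistic activation; adequate for the moderate inputs used here."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: each node sums its weighted inputs, adds its bias, then activates."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Feed-forward pass through one hidden layer and one output layer."""
    hidden = dense(x, hidden_w, hidden_b, sigmoid)
    return dense(hidden, out_w, out_b, sigmoid)


# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output node.
x = [0.5, -1.2]
hidden_w = [[0.1, 0.4], [-0.3, 0.2]]
hidden_b = [0.0, 0.1]
out_w = [[0.7, -0.5]]
out_b = [0.05]
print(forward(x, hidden_w, hidden_b, out_w, out_b))
```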
The Bayesian belief network (BBN) is a graphical model of the probabilistic relationships
between a group of variables. It was developed to relax the independence assumption in
naïve Bayes and allow for dependencies among variables.
Variables are represented as nodes, while conditional dependencies
between variables are shown as arcs between nodes. Each node has an associated conditional
probability table, which gives the probabilities of the node's variable conditional on the
values of its parent nodes. The construction of a BBN
is as follows. Finding a structure for the network is the first step: it can be specified by human
experts or learned from the data by specific algorithms. Once the
network topology is known, the network is fitted to historical data in a straightforward way, as in
naïve Bayes, where continuous variables are either discretised or assumed to be normally distributed.
Correspondingly, in a BBN, each node is assumed to be independent of its non-descendants
given its parents in the graph. This is known as the Markov condition.
The support vector machine (SVM) is a linear model for classification and regression
problems. According to the SVM algorithm, we find the points closest to the separating line
from both classes. These points are called support vectors. This paper is concerned with
integrating unsupervised techniques with supervised techniques for the classification of
CCF. Table 1 presents the summary of machine learning algorithms.

3.3 DEEP LEARNING APPROACHES


DL algorithms include the convolutional neural network (CNN),
deep belief networks (DBNs) and deep autoencoders;
these are considered representation learning methods. They have numerous layers for processing data,
representation learning and pattern classification. Deep learning is concerned with the
study of artificial neural networks. The standard technique for training neural networks
is backpropagation. The efficiency of the backpropagation algorithm
decreases significantly as the depth of the neural network increases, which can cause problems
such as poor local optima and a dilution of errors. Deep architectures should nevertheless be
considered an achievement: in theory, they can address this optimisation difficulty within the
training of their parameters.

The training technique of the deep belief network is often considered the first successful
case of deep architecture training. Traditional ML algorithms, such as SVM, DT and LR,
have been extensively proposed for CCF detection, but these traditional algorithms are not well
suited to large datasets. A CNN is a DL method that is closely associated with three-dimensional
data, such as images. It is similar to the ANN: the CNN has the same hidden-layer
structure, together with a different number of channels in each layer and dedicated convolution
layers. The word convolution refers to the idea of moving filters across the data, which can
capture critical information and automatically perform feature reduction. Thus, the CNN is widely
used in image processing. A CNN does not require heavy data pre-processing for training.
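
The idea of moving a filter across the data can be sketched for one-dimensional input, such as a row of transaction features. As in most DL libraries, the sketch slides the kernel without flipping it (strictly, a cross-correlation). The function name and the `stride` and `padding` parameters are illustrative assumptions, not the report's implementation.

```python
def conv1d(signal, kernel, stride=1, padding=0):
    """Slide `kernel` along `signal` and return one weighted sum per position.

    stride  : how many positions the kernel moves at each step
    padding : number of zeros added at both ends of the signal
    """
    padded = [0.0] * padding + list(signal) + [0.0] * padding
    width = len(kernel)
    feature_map = []
    for start in range(0, len(padded) - width + 1, stride):
        window = padded[start:start + width]
        feature_map.append(sum(w * x for w, x in zip(kernel, window)))
    return feature_map


# A kernel of width 2 that adds neighbouring values.
print(conv1d([1, 2, 3, 4], [1, 1]))             # every position
print(conv1d([1, 2, 3, 4], [1, 1], stride=2))   # every second position
```

A larger stride shortens the resulting feature map, which is one way the convolution layers reduce the number of features passed on.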

For image processing, the purpose of using a CNN is to minimise processing by reducing the
image without losing the features that are critical for making predictions. The main terms in CNN
are feature maps, channels, pooling, stride, and padding. CNN models are conventionally
used for text, image and video processing and take two-dimensional data as input; this is called the
2DCNN. Feature mapping is used to learn an internal representation from the input data.
Because the location of features is irrelevant, the same procedure can be used
for one-dimensional data. Natural language processing is a well-known example of a 1DCNN
application, where the problem is sequence classification. In a 1DCNN, the kernel filter
moves from top to bottom along a data sample sequence, rather than moving left to right and top to
bottom as in the 2DCNN.

Raghavan defined an autoencoder as an artificial neural network. An
autoencoder encodes the data and then decodes it. In this method, the autoencoder is
trained on non-anomalous points. It then classifies each point as 'fraud' or 'no fraud'
according to the reconstruction error: data on which the system has not been trained is
expected to produce a higher error. A value slightly above the upper bound, or threshold,
is therefore considered an anomaly. This technique is also used in autoencoder-based network
anomaly detection.

A generative adversarial network (GAN) is an ML model in which two neural networks
compete to improve the accuracy of their predictions. GANs are often unsupervised and learn
using a zero-sum game framework. The GAN is a fundamental category of deep learning
model and is regarded as one of the most promising directions for DL development.
A GAN has two main modules, a generator (G) and a discriminator (D), each of which is a
neural network. The generator produces simulated data, and the discriminator judges the
difference between the simulated data and the target data, yielding a true-or-false
determination for the simulated data.
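
The reconstruction-error rule described above for autoencoders can be sketched as follows. The autoencoder itself is not implemented here: the sketch assumes that reconstructions are already available, and the mean-plus-k-standard-deviations threshold is one possible choice made for illustration, not the rule used in the cited work.

```python
import statistics


def reconstruction_error(original, reconstructed):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def threshold_from_normal(errors_on_normal, k=3.0):
    """Upper bound on the error, estimated from legitimate transactions only."""
    mean = statistics.mean(errors_on_normal)
    spread = statistics.pstdev(errors_on_normal)
    return mean + k * spread


def flag_fraud(errors, threshold):
    """Label each transaction by comparing its error with the threshold."""
    return ["fraud" if err > threshold else "no fraud" for err in errors]


# Hypothetical reconstruction errors measured on legitimate training data.
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.10]
threshold = threshold_from_normal(normal_errors)
print(flag_fraud([0.11, 0.95], threshold))
```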
Finally, the trained model can generate higher-quality simulated data to complete the data
generation process. A variational autoencoder (VAE) is an autoencoder whose training is
regularised to guarantee that its latent space has good properties, allowing us to generate new data. A
VAE is obtained by introducing variation into the autoencoder. The VAE and the
GAN are very similar: once again, the goal is to transform and match the data
distribution so as to generate simulated data close to the target. Usually, the samples are
assumed to follow a normal distribution. If the distribution of the samples can be found, the
approach can be very successful. Consequently, researchers frequently use neural networks to
approximate the mean and variance of the normal distribution.

Long short-term memory (LSTM) is an
artificial recurrent neural network (RNN) architecture used in DL models. The LSTM
network is well suited to classifying, processing and making predictions based on time
series data. LSTM is the most common type of RNN. An ordinary neural network (NN)
cannot keep track of earlier information from a learning task each time it has to
perform a task. In simple words, an RNN is a neural network with memory. RNNs tend to
have short-term memory because of the vanishing gradient problem. The backbone of
neural networks is backpropagation, which reduces the loss by adjusting the network
weights using the computed gradients. In RNNs, the gradient shrinks as it propagates
back through the network, so the resulting weight updates are very small. These small updates
affect the earlier layers in the network most: they learn very little, and the RNN loses the ability
to recall early examples in long sequences, making it a short-term memory network.
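
The shrinking gradient described above can be illustrated with simple arithmetic. In the sketch below, the gradient reaching a step far back in the sequence is modelled as a product of identical per-step factors; the single scalar factor and the names used are simplifying assumptions, since a real RNN multiplies by matrices that change from step to step.

```python
def backpropagated_gradient(local_derivative, recurrent_weight, steps):
    """Return the gradient magnitude after each step of backpropagation through time."""
    gradient = 1.0
    history = []
    for _ in range(steps):
        # Each step back multiplies by the activation derivative and the weight.
        gradient *= local_derivative * recurrent_weight
        history.append(gradient)
    return history


# 0.25 is the largest value the derivative of the sigmoid can take.
history = backpropagated_gradient(local_derivative=0.25, recurrent_weight=1.0, steps=10)
print(history[0], history[-1])
```

After only ten steps the factor has fallen to 0.25 to the power of 10, under one millionth, so the earliest inputs barely influence the weight update.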

The use of DL methods for CCF detection is still limited, and techniques such as CNN, LSTM
and RBM are mostly applied to image classification and natural language processing (NLP)
because of their ability to handle massive datasets. This study's primary focus is how these
DL methods perform CCF classification. In addition, data pre-processing is an important
stage in the ML process. How classification performance is affected by data
pre-processing when detecting credit card fraud is another question that needs to be answered.
The table presents the summary of deep learning algorithms.

TABLE 1. Algorithms of machine learning and their accuracy.



CHAPTER 4
PROBLEM STATEMENT

4.1 Existing System


Credit card breaches have risen significantly in recent years. Consequently,
it is essential to create credit card fraud detection methods to defend against illicit activity.
Several challenges make this difficult to accomplish. One of the main issues with
fraud detection is the dearth of real-world data on which academic researchers can conduct
experiments, and of literature that reports real-world outcomes. The cause is that the sensitive
financial information related to fraud must be kept secret to protect the consumer's privacy.
Here, we list some qualities that a fraud detection system should possess to produce
accurate results:
There must be a suitable way to deal with noise. Noise refers to data errors, such as
inaccurate dates. The noise in the actual data limits the precision of the generalisations
that can be attained, regardless of how large the training set is.
Overlapping data is another issue in this subject. Many transactions can appear to be
fraudulent when they are legitimate. The inverse also occurs, when a fraudulent transaction
looks legitimate.
The system ought to be able to adapt itself to accommodate novel fraud types.
Successful fraud techniques eventually lose some effectiveness as they become publicly
recognised, so an effective fraudster is continuously looking for new and creative ways to
commit fraud.

4.2 Drawbacks of the existing system


• Unbalanced data: The data used to detect credit card fraud are unbalanced, meaning that
only a tiny percentage of all credit card transactions are fraudulent. As a result,
detecting fraudulent transactions is highly challenging and prone to inaccuracy.
• Overlapping data: Many transactions may be falsely reported as fraudulent when
they are legitimate (false positives) and vice versa (false negatives). Therefore, one of the main
challenges facing fraud detection systems is achieving low false positive and false negative
rates.
• Lack of adaptability: Classification algorithms frequently struggle to recognise brand-new
varieties of legitimate or fraudulent patterns. New regular and fraudulent activity patterns are
difficult for supervised and unsupervised fraud detection systems to identify.
• Cost of fraud detection: The system should account for both the cost of the fraudulent
behaviour discovered and the cost of stopping it.
• Absence of uniform metrics: There are no uniform evaluation criteria with which the
effectiveness of fraud detection systems can be evaluated or compared.

4.3 Proposed system

The detection of credit card fraud has two goals. It assists banks in lowering the
incidence of payment fraud while also helping merchants increase sales.

To operate a sustainable firm, merchants need to turn a profit, which is what is left over after
business expenses are subtracted from revenue. Consequently, a company's tolerance for
payment fraud depends on various factors, including its gross margin (sell price - cost of goods
sold). The tolerance for payment fraud decreases as margins become smaller.

In practice, when fraud occurs, the cardholder disputes the charge and the debit is
usually cancelled, which means either the cardholder’s bank or the merchant absorbs the loss
(see [1] for more details). Cumulatively, fraud represents a significant financial risk to the
merchant and the issuing bank. To reduce fraud, chip and pin technology, 3DSecure, and fraud
detection techniques are used. But if chip and pin technology and 3DSecure exist, why is fraud
detection used? There are two main reasons.

First, compared to the cost of fraud detection, the cost of chip and pin technology and
3DSecure is relatively high. For instance, online retailers care about
conversion, and 3DSecure lowers it by several percentage points (> 5%). Therefore, when given a
choice, many online retailers choose to disable 3DSecure and take responsibility for managing
the risk of payment fraud themselves.

Second, increasing the number of security layers during the checkout process
significantly decreases checkout velocity and customer convenience. Although convenience
for consumers may seem a vague benefit at first, for businesses like Amazon, which invented
one-click checkout, it is a marketing argument and a way to increase conversions and income.

4.4 Objectives of the proposed system


Areas like fraud detection and prevention have recently attracted much research attention,
particularly from payment card issuers. The cause of this rise in research is the
significant annual financial losses that card issuers suffer from fraudulent use of their card
products. A practical fraud management approach can result in annual operational cost savings
of millions of dollars.

Fraud prevention is of great interest to financial institutions. The amount of fraud loss for
many banks has increased with the introduction of new technology like the telephone,
automated teller machines (ATMs), and credit card systems. Manually conducting the analysis
is virtually impossible, and automating the procedure poses significant challenges.

It is costly to determine whether each transaction is authentic, and it takes
so much time that it is impractical. Checking with the client whether they or a fraudster
carried out a transaction is a better choice, but calling every cardholder to check
on every transaction would be prohibitively expensive.

Further, it might also lead to customer dissatisfaction. Automatic fraud detection is where
well-known classification methods can be applied to fraud prevention, and pattern
recognition systems play a vital role. One can learn from the past (fraud that happened in the
past) and classify new instances (transactions). Past data about the customer is available in
massive amounts and can be mined for valuable information. This historical data can be analysed
to obtain the user's buying behaviour. This pattern can be compared with current transactions
to determine a transaction's legitimacy. The fraud detection model is among the most
complicated models used in the credit card industry.

Among the issues one must deal with while constructing a model are the skewness of the data,
the dimensionality of the search space, the varying costs of false positives and negatives, the
durability of the model, and the short response time required of a fraud prevention system.

CHAPTER 5
ARCHITECTURE

5.1 Algorithms

5.1.1 Convolutional Neural Network (CNN)


CNNs, also known as ConvNets, contain multiple layers and are mainly used to
process images and detect objects. They are widely used for image processing and classification,
time series forecasting and anomaly detection.
Layers in the CNN Model: There are six distinct layers in the CNN model:
1) Input layer
2) Convolution layer (Convolution + ReLU)
3) Pooling layer
4) Fully connected layer (FC)
5) SoftMax/logistic layer
6) Output layer

FIGURE. CNN output layers.
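The six layers listed above can be sketched as a single forward pass. This is a minimal illustration in plain Python, assuming a one-dimensional convolution over a transaction's feature vector; the kernel, weights, and input values are hand-picked hypothetical numbers, not trained parameters or data from this report.

```python
import math


def conv1d(x, kernel, bias=0.0):
    """Valid 1D convolution (cross-correlation) of a feature vector with one kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def max_pool1d(v, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(v[i:i + size]) for i in range(0, len(v) - size + 1, size)]


def dense(v, weights, bias):
    """Fully connected layer producing a single score."""
    return sum(a * w for a, w in zip(v, weights)) + bias


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, kernel, fc_weights, fc_bias):
    """1) input -> 2) conv + ReLU -> 3) pooling -> 4) FC -> 5) logistic -> 6) output."""
    h = relu(conv1d(x, kernel))
    h = max_pool1d(h, size=2)
    z = dense(h, fc_weights, fc_bias)
    p = sigmoid(z)
    return (1 if p >= 0.5 else 0), p


# Hypothetical six-feature transaction and hand-picked parameters.
x = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0]
label, p = forward(x, kernel=[1.0, -1.0], fc_weights=[0.5, 2.0], fc_bias=-1.0)
print(label, p)
```

A real model would stack several convolution and pooling layers and learn the kernels and weights from data; the sketch only shows how the output of each layer feeds the next.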

5.1.2 Support Vector Machine (SVM)


The SVM classification algorithm works well in practice. The SVM separates positive
and negative examples with a large margin. In earlier investigations on fraud detection, the
SVM outperformed naive Bayes. A decision surface defined by the support vectors divides the
training points into two groups. The formula for optimisation is as follows:
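The formula itself did not survive in this copy. For reference, the standard soft-margin SVM objective is given below; it is an assumption that the authors used this form, and their notation may differ. Here w is the weight vector, b the bias, the ξ_i are slack variables, y_i ∈ {−1, +1} are the class labels, and C weighs margin violations against margin width.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n
```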

5.1.3 Logistic Regression


Logistic regression is a simple procedure that estimates the likelihood of an event by
assessing the relationship between one binary dependent variable and a set of independent
variables. The regularisation parameter C controls the trade-off between keeping the model
simple (risking underfitting) and allowing it to become more complex (risking overfitting).
For high values of C, the regularisation weakens and the model becomes more complex,
overfitting the data. For each of the datasets (the original, the standardised, and the dataset
with the most crucial features), the parameter C is tuned using RandomizedSearchCV(). After C
is defined for each dataset, the logistic regression model is initialised and then fitted to
the training data, as stated in the technique. The logistic regression hypothesis function is
given here, along with the function g(z):

The logistic regression hypothesis can be written as follows:
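The equations themselves are missing from this copy. Assuming the authors used the standard definitions, g is the sigmoid function and the hypothesis applies it to a linear combination of the features x:

```latex
g(z) = \frac{1}{1 + e^{-z}},
\qquad
h_\theta(x) = g(\theta^{\top} x) = \frac{1}{1 + e^{-\theta^{\top} x}}
```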

Here theta is the vector of parameters that the model learns in order to fit the classifier.
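As a rough illustration of how theta and C interact, the sketch below fits theta by plain batch gradient descent with an L2 penalty scaled by 1/C, following the scikit-learn convention in which C is the inverse of the regularisation strength. The function names, learning rate, epoch count, and toy data are assumptions made for illustration; the report itself tunes C with RandomizedSearchCV().

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x[0] is assumed to be a constant 1 (bias term)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def fit(X, y, C=1.0, lr=0.1, epochs=2000):
    """Batch gradient descent on log-loss plus an L2 penalty of strength 1/C."""
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = hypothesis(theta, xi) - yi
            for j in range(d):
                grad[j] += err * xi[j]
        for j in range(d):
            penalty = theta[j] / C if j > 0 else 0.0  # the bias term is not penalised
            theta[j] -= lr * (grad[j] + penalty) / n
    return theta


# Toy, linearly separable data: [bias, feature] with labels 0/1.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]

theta_weak_reg = fit(X, y, C=100.0)   # high C: weak regularisation, larger weights
theta_strong_reg = fit(X, y, C=0.1)   # low C: strong regularisation, smaller weights
print(theta_weak_reg, theta_strong_reg)
```

With the high value of C the learned weight on the feature grows much larger than with the low value, which is the mechanism behind the overfitting risk described above.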

5.1.4 Random Forest


Random forest (RF) is an ensemble learning technique for classification and
regression. Deep trees tend to learn irregular patterns. RF averages multiple deep trees, each
trained on a different part of the same training sample, which reduces the variance. Given the
training data (P = p1, ..., pn) with responses (Q = q1, ..., qn), bagging repeatedly (X times)
selects a random sample with replacement from the training set and fits trees to these samples
as follows: For x = 1, ..., X:
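The body of the loop is missing from this copy. In the standard bagging procedure, each iteration draws n training examples with replacement and fits one tree to them; predictions are then combined by majority vote for classification or by averaging for regression. The sketch below assumes that procedure, and it uses a single-threshold decision stump as a hypothetical stand-in for a full decision tree. The toy data and the names `fit_stump`, `bagging`, and `predict` are illustrative only.

```python
import random
from collections import Counter


def fit_stump(P, Q):
    """Stand-in for a decision tree: the best single-threshold split on one feature."""
    best = None
    for threshold in sorted(set(P)):
        for left, right in ((0, 1), (1, 0)):
            preds = [left if p <= threshold else right for p in P]
            errors = sum(1 for pr, q in zip(preds, Q) if pr != q)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda p: left if p <= threshold else right


def bagging(P, Q, X=25, seed=0):
    """For x = 1, ..., X: draw n examples with replacement and fit one tree to them."""
    rng = random.Random(seed)
    n = len(P)
    trees = []
    for _ in range(X):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of indices
        trees.append(fit_stump([P[i] for i in idx], [Q[i] for i in idx]))
    return trees


def predict(trees, p):
    """Majority vote over the individual trees."""
    votes = Counter(tree(p) for tree in trees)
    return votes.most_common(1)[0][0]


# Toy one-feature data: small values are class 0, large values are class 1.
P = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
Q = [0, 0, 0, 0, 1, 1, 1, 1]
trees = bagging(P, Q, X=25, seed=0)
print(predict(trees, 0.05), predict(trees, 0.95))
```

A full random forest additionally restricts each split to a random subset of the features, which this sketch omits.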


CHAPTER 6
DESIGN

6.1 Data Flow Diagram

6.2 Use Case Diagram


6.3 Activity Diagram


The graphs below show the behaviour of the model, namely its accuracy and loss,
across the training epochs. The epoch with the lowest loss and the highest accuracy
is selected.
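One way to pick such an epoch from a recorded training history is sketched below. The dictionary keys follow the Keras naming convention (`val_loss`, `val_accuracy`), which is an assumption about the tooling, and the numbers are made-up placeholders rather than results from this report.

```python
def best_epoch(history):
    """Return the 1-based epoch with the lowest validation loss.

    Ties on loss are broken in favour of the higher validation accuracy.
    """
    epochs = range(len(history["val_loss"]))
    best = min(epochs,
               key=lambda e: (history["val_loss"][e], -history["val_accuracy"][e]))
    return best + 1


# Placeholder history for five epochs (illustrative values only).
history = {
    "val_loss": [0.40, 0.22, 0.15, 0.15, 0.19],
    "val_accuracy": [0.90, 0.95, 0.96, 0.97, 0.96],
}
chosen = best_epoch(history)
print(chosen)
```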

Figure: Training and validation history of accuracy and loss of the CNN model using 100 epochs

Figure: Model accuracy when epoch sizes are 20 and 50.


CHAPTER 7
CONCLUSION AND FUTURE SCOPE

The threat posed by CCF to financial institutions is growing. Fraudsters frequently
develop novel fraud techniques. A robust classifier can handle the evolving fraud landscape. A
fraud detection system's principal aim is to accurately predict fraud cases while lowering
the number of false-positive cases.
ML approaches function differently depending on the specific business scenario.
The choice of ML approach is driven mainly by the type of incoming data. The model's efficacy
for detecting CCF is heavily influenced by the number of features, the volume of transactions,
and the correlation between the features. Deep learning (DL) techniques, such as CNNs and
their layers, are associated with text processing and the baseline model.
These techniques perform better than conventional algorithms for the detection of credit
card fraud. When all algorithm performances are compared side by side, the CNN with 20 layers
and the baseline model comes out on top with a 99.72% accuracy rate. Several sampling
approaches are applied to improve performance on existing examples; nevertheless, they
severely degrade performance on unseen data. As the class imbalance rose, performance on
unseen data improved. Future research may examine the application of additional cutting-edge
deep learning techniques to enhance the performance of the model suggested in this study.
