Muhammad Musab Khalid
(21262)
Assignment: Data Mining
Topic: Crime Investigation Using Data Mining in Market
MSCS (1st semester)
Sr. # | TITLE OF PAPER | AUTHORS
1 | A Review of Data Mining Applications in Crime | Hossein Hassani, Xu Huang, Emmanuel S. Silva, and Mansi Ghodsi
2 | AN OVERVIEW OF A CRIME DETECTION SYSTEM USING THE ART OF DATA MINING | SRITHA ZITH DEY BABU, DIGVIJAY PANDEY, ISMAIL SHEIK
3 | A Survey on Malware Detection Using Data Mining Techniques | YANFANG YE, TAO LI, DONALD ADJEROH, S. SITHARAMA IYENGAR
4 | Data Mining for the Internet of Things: Literature Review and Challenges | Feng Chen, Pan Deng, Jiafu Wan, Daqiang Zhang, Athanasios V. Vasilakos, and Xiaohui Rong
5 | Detecting Financial Fraud Using Data Mining Techniques: A Decade Review from 2004 to 2015 | Mousa Albashrawi
6 | An Overview of Data mining application in judicial cases to identify patterns of crime and crime detection | Nima Norouzi, Elham Ataei
7 | Using Data Mining to Detect Health Care Fraud and Abuse: A Review of Literature | Hossein Joudaki, Arash Rashidian, Behrouz Minaei-Bidgoli, Mahmood Mahmoodi, Bijan Geraili, Mahdi Nasiri & Mohammad Arab
8 | A Systematic Survey of Online Data Mining Technology Intended for Law Enforcement | MATTHEW EDWARDS, AWAIS RASHID, and PAUL RAYSON
9 | Big data analytics for security and criminal investigations | M.I. Pramanik, Raymond Y.K. Lau, Wei T. Yue, Yunming Ye and Chunping Li
10 | Data mining in anti-money laundering field | Noriaki Yasaka
Sr. # | YEAR | JOURNAL | HEC CATEGORY | IMPACT FACTOR | PROPOSED ALGO / METHOD
1 | 2016 | Water Environment Research | X | 1.946 | The naive Bayes classifier was proposed
2 | 2020 | Scientometrics | W | 2.77 | The naive Bayes formula is used. We propose real-time crime prediction, though accurate prediction will be difficult because criminals commit their crimes using different and complex methods.
3 | 2017 | Communications of the ACM | W | 4.654 | Intelligent malware detection methods
4 | 2015 | SAGE Open | W | 1.54 | Hierarchical clustering methods used
5 | 2016 | International Journal of Data Science and Analytics | X | 3.11 | The proposed classification framework (i.e., naïve Bayes, decision tree, neural network, and SVM)
6 | 2021 | Scientometrics | W | 2.77 | Classification techniques are used to predict discrete features, while predictive methods model continuous functions. Prediction techniques include linear and nonlinear regression, neural networks, and support vector machines
7 | 2015 | Annals of Internal Medicine | W | 25.39 | Knowledge Discovery from Databases (KDD)
8 | 2015 | Communications of the ACM | W | 4.654 | Naive Bayes classifier running
9 | 2017 | Water Environment Research | X | 1.946 | The proposed security games are bi-level models [20] that consider an attacker’s ability to gather information about the defense strategy before planning an attack
10 | 2017 | International Journal of Development Issues | X | 0.8 | Sophisticated algorithms that can detect illegal behaviors quickly
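Several of the methods listed above rely on the naive Bayes classifier. The following is only a minimal sketch of that technique for categorical features, with add-one (Laplace) smoothing. The function names, the toy records, and the labels are all hypothetical illustrations and are not taken from any of the reviewed papers.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> number of occurrences
    value_counts = defaultdict(lambda: defaultdict(Counter))
    # feature_values[feature_index] -> set of values seen in training
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the class with the highest log-posterior for one record."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior of the class
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product
            numerator = value_counts[label][i][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (day type, time of day) -> incident type
rows = [
    ("weekend", "night"),
    ("weekend", "night"),
    ("weekday", "day"),
    ("weekday", "day"),
    ("weekend", "day"),
    ("weekday", "night"),
]
labels = ["assault", "assault", "theft", "theft", "theft", "assault"]

model = train_naive_bayes(rows, labels)
prediction = predict(model, ("weekend", "night"))
print(prediction)
```

The classifier treats each feature as independent given the class, which is the "naive" assumption; real crime data would need far more records and features than this toy example.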
MUEEN UD DIN
MS CS - 21439
Sr. # | COMPARISON / BENCHMARK / EXISTING METHOD / MEASURES | DATASETS | EVALUATION PARAMETERS
1 | An overall 80–90% accuracy for tests on New Zealand crime newspapers and generally about 75% accuracy for cross-country scenarios | Open-source communities and datasets | By evaluating the performance on the suspect description module, an overall recall rate of 70% and 100% precision was achieved.
2 | The bridge of data between the police station and the data mining system will report on further and upcoming crimes | Network datasets used | (1) Crime month. (2) Crime day of the week. (3) The real crime time
3 | Naive Bayes Classifier (NBC) | Using 121 datasets | Parameters including the feature selection method (e.g., Document Frequency (DF), Gain Ratio (GR), or FS) in malware detection
4 | Classification | CLOUDS: a decision tree classifier for large datasets | The parameters of the transformation remain the same for every time series regardless of its nature; related research includes DFT [86], wavelet functions [87], and PAA [72].
5 | Expert Systems with Applications and Decision Support Systems | European region was only reported | Country, Frequency, and Percentage (%)
6 | Data mining methods used (Descriptive and Predictive) | This paper examines existing databases with a crime data mining approach (Data Bank) | PSN project used data mining and predictive analytics to investigate violent shooting-related crimes
7 | Data Mining (DM), Knowledge Discovery from Databases (KDD) and Business Intelligence (BI) | Primary studies (Dataset) that used data mining for detecting health care fraud and abuse | We think that low- and middle-income countries can use data mining techniques as an instrument for evaluating provider’s behavior
8 | Their detection system reached an AUC of 91% on their experimental dataset, rising to 99.7% with additional components | Both experiments reused the Second Life and Entropia datasets | An evaluation against manually identified ground truth would be stronger justification of the method’s validity
9 | Artificial neural network (ANN) | Handle voluminous datasets | Systematic evaluation of state-of-the-art data mining technologies including intelligent agents, link analysis, text mining, machine learning (ML)
10 | A method of estimating money laundering is the suspicious transaction report (STR), which is reported to financial intelligence units (FIU) | United Nations Office on Drugs and Crime | The suspiciousness evaluation rules are the primary repository of knowledge in FAIS
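The recall and precision figures reported for paper 1 above can be related to raw counts with a short worked example. This is a minimal sketch: the function name and the counts (7 true positives, 0 false positives, 3 false negatives) are hypothetical values chosen only because they reproduce a 70% recall and 100% precision, not numbers reported by the paper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall from raw detection counts."""
    predicted = true_positives + false_positives   # everything flagged
    actual = true_positives + false_negatives      # everything that should be flagged
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, chosen only to illustrate the reported rates
precision, recall = precision_recall(
    true_positives=7, false_positives=0, false_negatives=3
)
print(f"precision={precision:.0%}, recall={recall:.0%}")
```

Read this way, 100% precision means every flagged item was correct, while 70% recall means three out of every ten relevant items were missed.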
Sr. # | RESULTS | DRAWBACKS | ADVANTAGES
1 | The results are promising, with 76% detection accuracy on real world ‘blacklist’ data and sound evidence that the approach can provide effective warnings several months ahead of the official release of blacklists | Crime continues to remain a severe threat to all communities and nations across the globe alongside the sophistication in technology and processes that are being exploited to enable highly complex criminal activities | Data also points toward the need for more training and investment in educating and empowering youth with knowledge on the advantages
2 | We found that crimes also vary with the criminal's age, health, or political power | To predict crime and analyze crime activity | The data mining system will also help people to make a direct connection with police and justice officials
3 | Their results showed that the FS was very accurate using just the top 50 features | To protect legitimate users from these threats, anti-malware software products from different companies, including Comodo, Kaspersky, Kingsoft, and Symantec | Advantage of being exhaustive in detecting malicious logic. In other words, static analysis does not have the coverage problem that dynamic analysis has
4 | Data mining technologies are integrated with IoT technologies for decision making support and system optimization. | Data mining involves discovering novel, interesting, and potentially useful patterns from large data sets and applying algorithms to the extraction of hidden information. | Advantage of distributed parallel computer systems
5 | data mining technique in detecting financial fraud with 13%, followed by both neural network and decision tree with 11%, while support vector machine is represented by 9% and naïve Bayes by 6%. Besides fraud detection. | This paper aims to review research studies conducted to detect financial fraud using data mining tools within one decade and communicate the current trends to academic scholars and industry practitioners | Besides this benefit, researchers can take advantage of knowing the most frequently used methods
6 | The results should be evaluated, and their importance should be explained. Usually, in classification problems, a confusion matrix is a useful tool | The main purpose of this research is to study the method based on data mining. This method can be studied in various fields such as identification, forecasting, and crime prevention using data mining tools and algorithms and using the existing database and their military arrangement at the crime scene to prevent crime. | Advantage of the industrial opportunity to reap the true benefits of advanced computing technology in areas that include position monitoring, intelligent control, and database knowledge discovery
7 | Recommends seven general steps for mining health care claims (or insurance claims) to detect fraud and abuse (after preprocessing of data): 1) Identifying the most important attributes of data by domain experts | The scale of this problem is large enough to make it a priority issue for health systems. | As advantages, Shin et al. used a simple definition of anomaly score and extracted 38 features for detecting abuse
8 | This increases the risk of reviewer bias and human error affecting results, especially with regard to the classification of borderline cases | Technical issues with a module of the Dark Web Portal | Behind web data comes the other significant data source, email, which has the advantage of being both long established and well used
9 | accountable for executing a specific task and then synthesizing these intermediate results for the final solution | We also describe some challenging issues of big data analytics in the context of security and criminal investigation perspectives. | We identify five major technologies, namely link analysis, intelligent agents, text mining, ANNs, and ML, which have been widely used in various domains for developing the technical foundations of an automated security and criminal investigation system
10 | We recognized that data-mining theory such as multivariate data analysis (linear regression, logistic regression, cluster analysis) and artificial intelligence technique | It is difficult to determine the definition of money laundering. The United Nations Office on Drugs and Crime estimated the figure to be some 3.6% of global GDP (2.3%-5.5%), equivalent to about US$2.1 trillion (2009), in which is included 2.7% of global GDP (2.3%-5.5%) | the advantages of the decision to the same degree as the decision maker himself” (Luhmann, N., Risk (2008), 4th ed., New Brunswick, New Jersey, p. 68)
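The confusion matrix mentioned for paper 6 in the table above (rendered as a "matrix of complexity" in some translations) tabulates actual against predicted classes. The sketch below shows one way it might be built and summarized; the function names, the labels, and the six toy predictions are hypothetical and do not come from the reviewed papers.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def accuracy(matrix):
    """Share of records on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical toy labels for six records
actual = ["fraud", "fraud", "legit", "legit", "legit", "fraud"]
predicted = ["fraud", "legit", "legit", "legit", "fraud", "fraud"]
classes = ["fraud", "legit"]

matrix = confusion_matrix(actual, predicted, classes)
for label, row in zip(classes, matrix):
    print(label, row)
print("accuracy:", round(accuracy(matrix), 3))
```

Off-diagonal cells show which kinds of errors a classifier makes (missed fraud versus false alarms), which a single accuracy figure hides.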
Sr. # | SUMMARY
1 | Summary of data mining applications in crime that can act as a quick reference guide for researchers.
2 | To predict crime and analyze crime activity we need to proceed with a systematic approach with data mining. By using a data mining system one can predict locations that have a high probability of crime.
3 | To summarize, malware detection is now conducted in a client-server manner with the cloud-based architecture.
4 | The Internet of Things concept arises from the need to manage, automate, and explore all devices, instruments, and sensors in the world. In order to make wise decisions both for people and for the things in IoT, data mining technologies are integrated with IoT technologies for decision making support and system optimization.
5 | Financial fraud has been a big concern for many organizations across industries and in different countries. This review provides a fast and easy-to-use source for both researchers and professionals, and classifies financial fraud applications into a high-level and detailed-level framework.
6 | Human social conditions make confronting the phenomenon of crime inevitable. The main purpose of this research is to study the method based on data mining.
7 | Inappropriate payments by insurance organizations or third party payers occur as a result of error, abuse or fraud. Previous studies focused on the technical methods used in KDD and data mining, and paid little attention to the practical implications of their findings for health care managers and decision makers.
8 | Internet accessibility is widening and an increasing amount of crime is taking on a digital aspect. This study could be used as a basis for deeper systematic exploration of the literature regarding a particular subgroup of topics or techniques identified in our results.
9 | The purpose of this review article is to provide researchers and practitioners with a retrospective view of several methodologies and technologies. We identify five major technologies, namely link analysis, intelligent agents, text mining, ANNs, and ML, which have been widely used in various domains for developing the technical foundations of an automated security and criminal investigation system.
10 | The detection and decision making process should be done using a risk-based approach drawing on professional knowledge. However, there is no specific rule or standard as to what constitutes suspicious activities.