
Neural Computing and Applications (2022) 34:10597–10610

https://doi.org/10.1007/s00521-020-04862-2

ORIGINAL ARTICLE

Hybrid method for mining rules based on enhanced Apriori algorithm with sequential minimal optimization in healthcare industry
M. Sornalakshmi1,2 • S. Balamurali1,2 • M. Venkatesulu3 • M. Navaneetha Krishnan3 • Lakshmana Kumar Ramasamy4 •
Seifedine Kadry5 • Gunasekaran Manogaran6 • Ching-Hsien Hsu7,8,9 • Bala Anand Muthu10

Received: 30 December 2019 / Accepted: 14 March 2020 / Published online: 31 March 2020

© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract

Data mining may enable healthcare organizations to anticipate trends in patients' medical conditions and behavior by analyzing different prospects and the connections between seemingly unrelated information. Raw data from healthcare organizations are large and heterogeneous. They need to be collected and arranged, and their integration enables medical information systems to be integrated in a unified way. Health data mining offers unlimited possibilities to evaluate numerous less obvious or hidden data patterns using common analysis techniques. Association rule mining (ARM) is an effective technique for detecting connections in data, and the Apriori algorithm is one of the most commonly used and influential ARM algorithms. However, it generates a large number of rules and does not guarantee the efficiency and value of the knowledge created. To overcome this issue, a methodology (EAA-SMO) that combines an enhanced Apriori algorithm (EAA), based on the knowledge of a context ontology, with sequential minimal optimization (SMO) is suggested. The basic idea is to establish the concepts of the ontology as a hierarchical structure of conceptual clusters of specific subjects, each comprising "similar" concepts that denote a specific category of knowledge within the domain. An interesting rule is produced for each cluster based on the correlation between the items. In addition, the rules developed are used to build a prediction model for anomaly detection based on SMO regression. The experimental analysis demonstrates that the proposed method improves accuracy by 2% and reduces execution time by 25% compared with semantic ontology.

Keywords Context ontology · Enhanced Apriori algorithm (EAA) · Healthcare system · Sequential minimal optimization (SMO) regression · Wireless sensor network

List of symbols

Cl     Candidate list
TID    Transaction ID
FI     Frequent itemset
F      Linear function
ξi     Slack variable
xmin   Minimum value
xmax   Maximum value
χ      Region of input
ε      Maximum error

✉ Seifedine Kadry, s.kadry@bau.edu.lb

1 Department of Computer Applications, Kalasalingam Academy of Research and Education, Krishnankoil, Tamilnadu 626126, India
2 Department of Computer Science and Engineering, Kalasalingam Academy of Research and Education, Krishnankoil, Tamilnadu 626126, India
3 St. Joseph College of Engineering, Sriperumbudur, Chennai, Tamilnadu 602117, India
4 Hindusthan College of Engineering and Technology, Coimbatore, India
5 Department of Mathematics and Computer Science, Faculty of Science, Beirut Arab University, Beirut, Lebanon
6 University of California, Davis, USA
7 Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan
8 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
9 Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan
10 V.R.S. College of Engineering and Technology, Viluppuram, India


1 Introduction

A Wireless Sensor Network (WSN) is a network of battery-powered wireless sensors with energy constraints [1], which are distributed within a large area. Recent progress in low-power architectures and communication protocols [2–6] enables the application of WSNs in highly data-intensive applications. WSNs are used in medical applications for the management of healthcare services [7]. Such a network comprises battery-operated wireless medical sensors that are attached to the body of the patient to collect the physiological parameters. These sensors are capable of analyzing abnormalities in the physiological conditions [8]. The data obtained from the medical sensors provide information that lets physicians, nurses and caretakers assess the situation of the patient. In-hospital monitoring of patients for extended periods is an expensive task. Hence, it is an easy option to keep non-emergency patients in their homes. These patients can be checked continuously using remote medical sensors [9]. Medical sensors such as MICAz [10], TinyNode [11], TelosB [12], Shimmer [13] and IRIS [14] are flexible and mobile [15]. A caregiver may not be available round the clock to watch the patient. Hence, it is vital to safeguard data reliability in order to detect an abnormality during emergency situations. Due to the current developments in the healthcare industry, there is a need for wearable and portable devices to monitor the condition of patients.

For successful medical research, data mining techniques have been developed to help the clinician achieve better diagnostics. The researcher may collect data from the HIS, which holds patient information such as name, age, condition, location, area and laboratory dates, and which continues to grow year after year. In this analysis, the researcher will be able to compile data. This work will identify recurrent illnesses using hybrid approaches by collecting data from the patient information system.

This work aims to compute data on common diseases using an innovative method to analyze the training data. The Patient Information System in the medical field is used to collect numerous types of medical records of illnesses and of patients from various areas. It is a herculean task to identify the recurrent illnesses and their causes from the large data collection. There are various clinics with patients from different locations. In the same location, they do not overlap.

The clinics where patients are treated retain their records. It is not easy to collect information on diseases that occur frequently. Information on these types of illnesses may be obtained through association rules. Association rule mining here follows the Apriori algorithm. Data on the incidence of these diseases in a given period can also be mined with the Apriori algorithm.

For the Apriori algorithm, the number of cycles required to obtain the best rules depends on the minimum support value used, irrespective of the number of attributes or instances. The minimum support level influences both the best rules created and the average size of the frequent itemsets.

In the WSN, the raw input data are gathered from the medical sensors and transmitted to the data repository for further processing. The main problems associated with the raw data are data quality, heterogeneity and storage space. Association rule (AR) mining is one of the most important tasks in knowledge discovery in databases [16]. This method aims to discover implicit associations among items in databases that can be of great interest to domain specialists. A classic and commonly used application of AR mining is the medical field. The exponential growth in the size and complexity of medical data increases the need for data organization and knowledge discovery. Nevertheless, the huge quantity of mined association rules makes the step of processing and interpreting them very difficult for domain specialists. To address this problem, researchers have suggested ranking the produced rules according to their potential interest and making the highly ranked rules directly available to decision makers.

Approaches for effective rule evaluation can be classified into objective and subjective approaches. In subjective analysis, the evaluation of ARs is built on prior domain information that is to be represented in diverse forms (such as rule schemes, ontologies, etc.). The objective approaches, in contrast, are based on statistical knowledge in the database. While the latter method can give valuable information about the dataset structure, such as the support and confidence measures, the effectiveness of a rule highly depends on the domain information. The more the knowledge is represented in an expressive and formal way, the more efficient the rule evaluation becomes. To this end, some methods have suggested including ontologies in the post-processing task to model domain information, since they deliver semantic support for structuring vocabularies. Recently, in medical domains, ontologies have become the basis of knowledge acquisition and management [17].

1.1 Motivation

The concept of prior knowledge of the process or environment is one of the most important and difficult tasks for data mining. Prior knowledge is useful for the selection of suitable data and mining techniques, the reduction of volume, the comprehensibility of results and the


enhancement of the overall process. This approach is useful for identifying the common diseases in a large medical dataset. The findings of this study will allow physicians to agree on medications for chronic diseases. Information from different geographical areas is analyzed over different periods.

Our proposed method uses a hybrid ontology-based Apriori algorithm together with SMO optimization. Apriori produces the candidate itemsets by joining the large (frequent) itemsets of the previous pass and removing those candidates that have a subset that was small (infrequent) in the previous pass, without consulting the transactions in the database. The number of candidate large itemsets is significantly reduced by considering only the large itemsets from the previous pass. SMO splits the large quadratic programming (QP) problem that arises in training into a series of the smallest possible QP problems. These small QP problems can be solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The memory SMO requires is linear in the size of the training set, which allows SMO to handle very large training sets.

By combining ARM with the ontology, interesting relationships among items can be obtained. In our proposed work, the context ontology is integrated with ARM for healthcare monitoring, since storage problems can be solved by using the ontology to describe the concepts and the relationships between the data in a huge dataset [18]. Moreover, new relations can be obtained from the ontology representation of the initial concepts. This improves the data quality and checks the possible variations that occur during data integration. Association rule mining (ARM) is a prevalent technique used for discovering relations among data.

Apriori is a common algorithm used for finding associations between different sets of data. This algorithm is designed to operate on a database containing several transactions. However, the performance of the Apriori algorithm degrades for various reasons. The Apriori algorithm needs 'n' database scans to generate 'n' frequent itemsets. It does not identify transactions with identical itemsets. Hence, it consumes resources unnecessarily by repeatedly generating itemsets from identical transactions. In the proposed work, the enhanced Apriori algorithm (EAA) is applied for mining the frequent itemsets without generating new candidates. Sequential minimal optimization (SMO) regression is used to generate the prediction model for the physiological parameters. This model is used to spot anomalies in the physiological parameters. The ontology is populated with the measurement data received from the sensors and enriched with the context to achieve efficient data mining results.

Predictive analysis on a population-wide scale will greatly reduce costs by identifying which people are more at risk of disease and performing early treatment before complications arise. It consists of incorporating data connected to a number of factors, including medical history, social or socioeconomic standing and related characteristics.

The remaining sections of the paper are arranged as follows: Section 2 gives a brief study of the ontology techniques used in healthcare applications. Section 3 describes the proposed work, including the context ontology and EAA. Section 4 presents the experimental analysis of the proposed approach. Section 5 gives the conclusion of the proposed work.

2 Related works

Kim et al. [19] proposed a healthcare system based on the ontology concept along with wearable sensor-based smart wear to ensure personalized healthcare service. The performance of the healthcare system and user satisfaction are improved. Kim and Chung [20] presented a context information model applied to healthcare services based on the ontology concept. The medical references and service environments are considered to develop a healthcare model. Subhani and Kent [21] generated an audit rule ontology from the rules for Continuous Process Auditing in Healthcare Decision Support Systems. Kumar [22] presented an ontology model for the healthcare information system. The enhancement of the ontology process is used for confirming the treatment process. Lamine et al. [23] proposed an ontology-driven design approach to enable the creation of personalized careflows. Mohan and Aramudhan [24] introduced an access control model based on the ontology concept to enable authorized data access control between the users and the service providers. The communication between the entities is restricted by certain rules. The data access control is enabled using the permissions that are defined with the object properties. Mehmood et al. [25] introduced a framework to classify devices depending on their functionality. An abstraction layer avoids the hardware constraints and network connectivity issues. Ongenae et al. [26] proposed an ontology framework based on the self-learning concept that allows context-aware applications to adapt to the behavior of the users at run-time. The proposed framework achieved a high detection rate and low relative error, execution time and memory usage. Campbell and Pereira [27] suggested a novel ontology-based method to develop personalized healthcare applications for patients. Larburu et al. [28] presented an ontology that specified the relationship among the technical context, clinical data quality and patient treatment. The safety
clinical data quality data and patient treatment. The safety

123
10600 Neural Computing and Applications (2022) 34:10597–10610

of the patient is guaranteed even during variation in the technological context.

To address the problem that the traditional Apriori algorithm has low mining efficiency and produces a huge number of irrelevant rules, and taking into account the characteristics of the process parameters of automated production in the battery industry, Zhou et al. [29] proposed the BI_Apriori algorithm, an association rule algorithm for automated industrial battery production. The BI_Apriori algorithm performs better in mining efficiency, number of generated rules and correlation of generated rules. Huang et al. [30] proposed a new optimization algorithm based on a bit set matrix. It only needs to scan the database twice to create the bit set matrix structure needed by the algorithm. During mining, infrequent itemsets were deleted in time to reduce the scanning range of the algorithm, and bit operations were used to speed up subset detection and the join step. When obtaining the support of a candidate itemset, it is not necessary to re-scan the database: the support can be computed with bit operations on the bit vectors of the items in the candidate itemset, and a non-frequent itemset can be found early in the calculation so that the remaining computation can be terminated. These operations are aimed at improving the performance of the algorithm. Xueyuan et al. [31] applied the Apriori algorithm to build an efficient recommendation system for a college library. This system is able to provide efficient and personalized services, enabling readers to find books of interest more quickly. Hasan et al. [32] suggested a technique for avoiding this issue by using the binomial distribution (BD) to adaptively discover an adequate minimum support. Mining optimal frequent itemsets helped their suggested method perform better than the current benchmark.

Given the advantages stated above, we need to model a method that can find strongly correlated physiological data environments in the medical dataset. These environments can help Apriori to mine correlation feature-based association rules, which are used to forecast abnormalities. Clustering methods were therefore considered for creating environments in the whole dataset. Unlike other clustering algorithms, the ontology concept is an automatic clustering approach that does not require specifying a certain number of clusters, and it has advantages in generating clusters with random numbers. Therefore, a hybrid EAA-SMO is proposed in this paper. The contribution of EAA-SMO contains two points: (1) We designed a novel strategy to obtain related attributes in the medical dataset by context ontology. (2) A newly generated rule was modeled to introduce relevant attribute series into the Apriori-based association rule mining method, and an effective classifier based on these association rules was constructed to predict health abnormality using SMO. The table below summarizes different rule mining algorithms and techniques, their benefits and constraints.

A concise overview of the types of algorithms used for the analysis of medical datasets is given. Regarding breast cancer detection, different classification algorithms are suggested. The records are separated by classifying the disorder as benign or malignant. Association rules are used to generate the FP classifier with the aid of an FP-growth algorithm [33]. The frequent patterns collected are used to identify the target group. Four separate classifiers, including decision trees, Naive Bayes, Naive Networks and Association Data Sets, are considered [34].

For achieving high classification precision, the decision tree is the ultimate classifier. The aim of the algorithm is to find a large percentage of the leaves. Permutation experiments have also been conducted to verify the observed relation statistically. Essential connections are obtained by removing the best leaf from the tree. The precision of the algorithm is determined by averaging the results of tenfold cross-validation on the training data. In contrast to the artificial neural network and logistic regression, the decision tree also does well in the classification of breast cancer survival [35]. For tree construction using the vector threshold, statistical tests such as the Gini index and Chi-square are used. For classification where weights are assigned to the attributes by domain experts based on their significance level, the author proposes a weighted classification technique [36]. Two classifiers have been created, one against the strong rule list with the most important attributes, and the other against the spare rule list with the least significant attributes. The Naïve Bayes classifier, which has the greatest accuracy relative to the artificial neural network [37], incorporates known classification methods for forecasting chronic kidney disease.

The analysis is carried out in the quick miner process. By considering variables like diabetes and age for ckdclass, Naïve Bayes achieves a 100% accuracy rate. Support vector machines and K-Nearest Neighbor (KNN) [38] are other data mining classifiers used in this forecast. The support vector machine achieves a higher accuracy, 78.09%, than the KNN classifier, and also a lower error rate of 21.9%. Both algorithms are implemented in MATLAB. The prediction of survival on kidney dialysis is examined through artificial neural networks, decision trees and logistic


regression [39]. Among them, the multilayer perceptron with a back-propagation artificial neural network is the most efficient predictor, with the highest accuracy. The most effective prediction techniques for cardiopathy, including rule-set classifiers, decision trees, neural networks and Bayesian networks, are proposed [40].

The set of rules is considered the most favorable indicator, with IF–THEN rules for significant patterns. Regular trends lower than the fixed threshold are structured to obtain strict rules. Neural networks [41] appear to be the ideal way to predict Cleveland cardiovascular data with the greatest precision when compared with decision trees and Naïve Bayes algorithms. Furthermore, obesity and smoking features are incorporated to produce a more realistic outcome. For cardiovascular disease diagnosis, the Associative Classification methodology [42] uses the FP-growth algorithm to improve diagnostic precision through various parametric approaches. To prune the rules, a rule compatibility check is added. Relative to Naive Bayes, decision trees and CBA, the precision and quality of the experimental results are fairly high. The periodic rules provided help to improve prediction.

2.1 Limitation

In addition to research and efficiency issues, implementing the big data strategy in actual clinical practice raises a range of technological obstacles. New clinical methods and ways of thinking are required. Big data methods based on machine learning and data mining require that patterns be examined without knowing in advance what may occur. This methodology is very different from the traditional scientific approach, which starts from a specific research question and requires strict procedures to verify the conclusions and ensure reliability and statistical significance.

3 Proposed work

Figure 1 displays the flow diagram of our work. Initially, the data are collected from the medical sensors and analyzed. Data normalization is performed to transform the input variables. The context ontology is applied to generate rules. The EAA generates the frequent itemsets with the context ontology, and SMO regression generates the prediction model for the physiological parameters.
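
To illustrate the frequent-itemset step mentioned above, the following is a minimal sketch of the classical Apriori procedure (join, prune, count) in Python. It is not the EAA itself, whose details are not given at this point. The function names `apriori` and `confidence`, the item labels and the support threshold are all assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets.

    Classical Apriori, shown only as an illustration; `min_support`
    is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one scan of the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


def confidence(itemsets, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from support counts."""
    a = frozenset(antecedent)
    return itemsets[a | frozenset(consequent)] / itemsets[a]


# Hypothetical transactions: abnormal flags observed per monitoring window.
transactions = [
    {"high_bp", "high_hr"},
    {"high_bp", "high_hr", "low_spo2"},
    {"high_bp", "low_spo2"},
    {"high_hr"},
    {"high_bp", "high_hr"},
]
frequent_itemsets = apriori(transactions, min_support=3)
rule_conf = confidence(frequent_itemsets, {"high_bp"}, {"high_hr"})
```

With a minimum support count of 3, this toy input keeps {high_bp}, {high_hr} and {high_bp, high_hr}, and the rule high_bp -> high_hr has confidence 3/4. The level-wise loop also makes visible the cost noted earlier: each level requires another scan of the transactions.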

Approach: Metwally [43]; Year: 2013
Objective: They proposed a decision support tool for breast cancer based on three different classifiers: single decision tree (SDT), boosted decision tree (BDT) and decision tree forest. It is intended to support decisions in the detection of breast cancer
Pros: They stated that the precision of BDT (98.83%) was better than that of SDT (97.07%)
Cons: The imbalance and cost sensitivity issue of the decision tree classifier in medical diagnosis

Approach: Chowdhury et al. [44]; Year: 2011
Objective: The suggested method included the multilayered perceptron with a back-propagation algorithm for the development of an ANN and the recognition of patterns for neonatal disease diagnosis and prediction
Pros: The method shows ANN-based neonatal disease predictions with 75% precision, using 64 training sets, 15 test sets and 15 validation sets
Cons: The structure of artificial neural networks is not determined by any particular rule

Approach: Vanisree et al. [45]; Year: 2011
Objective: The suggested scheme is based on a multi-layered feed-forward back-propagation neural network trained through a supervised delta learning rule
Pros: The suggested scheme used 80% of the data for training and 20% for testing, and achieved 90% precision and a mean square error of 0.016
Cons: Networks are also sensitive to the number of neurons in their hidden layers. Too few neurons can cause insufficiency

Approach: Ratnakar et al. [46]; Year: 2013
Objective: They proposed a solution using genetic algorithms to select the best set of characteristics for the prediction of cardiovascular diseases, and the Naïve Bayes technique to establish connections among attributes using conditional probability models
Pros: The 13 attributes provided to the Naïve Bayes classifier are reduced to seven via GA
Cons: They scale rather poorly as the amount of data rises (with regard to time complexity)

Approach: Chitra et al. [47]; Year: 2013
Objective: The suggested technique utilizes a machine-learning method, Support Vector Machine, as a diagnostic classifier for diabetes. The machine-learning technique focuses on high-dimensional medical data classification of diabetes disease
Pros: The findings of experiments indicate that the support vector machine can be used effectively for the diagnosis of diabetes
Cons: The training method is very slow

Approach: Masethe [48]; Year: 2014
Objective: A comparison of several WEKA data mining algorithms for the prediction of heart attacks was conducted in order to find the best forecast technique
Pros: Among the algorithms used, J48 is found to have the highest accuracy
Cons: -

Approach: Archana and Sandeep [49]; Year: 2015
Objective: They suggested a hybrid forecast model based on K-means clustering with multilayer perceptron with missing value imputation (HPM-MI)
Pros: The results demonstrate that HPM-MI produced 99% accuracy, sensitivity, specificity, kappa and ROC for the Hepatitis data, and 96.0% for the Pima Indian Diabetes Dataset
Cons: A large number of learning iterations, which does not suit real-time use

Approach: Turabieh [50]; Year: 2016
Objective: They suggested a hybrid algorithm that includes two strong computational intelligence methods, gray wolf optimization (GWO) and artificial neural networks (ANN)
Pros: GWO is used to find optimal weights and biases for the ANN model, to lessen the likelihood of the ANN getting stuck in local minima and converging slowly to the global optimum
Cons: The findings show that the proposed model increases the speed of convergence and prediction accuracy (time reduced to half)

Approach: Tina Patil et al. [51]; Year: 2013
Objective: Two classification algorithms were used, i.e., Naive Bayes, based on probabilities, and J48, based on decision trees, to classify objects into predefined classes according to their characteristics
Pros: The outcomes show that J48 is more precise and cost-effective than Naive Bayes
Cons: -

Approach: Sunila et al. [52]; Year: 2012
Objective: They designed an improved algorithm for multilayer perceptrons (MLP), which works on multiple training settings
Pros: The outcome demonstrates that the proposed method is better than the MLP algorithm and reached a precision of 82.8%
Cons: Does not scale from smaller research systems to larger real systems

Approach: Zheng et al. [53]; Year: 2014
Objective: They suggested the K-means algorithm to individually find the hidden tumor patterns
Pros: With tenfold cross-validation, the proposed algorithm reaches a precision of 97.38%
Cons: K-means has problems clustering information with clusters of different dimensions and densities

Approach: Vadicherla et al. [54]; Year: 2013
Objective: They recommended the sequential minimal optimization (SMO) technique for the SVM method in the diagnostic scheme for heart disease
Pros: The outcome demonstrates that SMO gives excellent results and time efficiency even with huge data sets
Cons: It is not simple to choose a good kernel function. Long time for huge datasets

Approach: Kumari et al. [55]; Year: 2013
Objective: They compared RIPPER, decision tree, artificial neural network (ANN) and support vector machine (SVM) methods for the classification of cardiovascular diseases
Pros: The outputs of the above-mentioned methods are compared in terms of sensitivity, specificity, precision, error rate, true positive rate and false positive rate. It can therefore be concluded that SVM predicts cardiovascular illnesses with the lowest error and the greatest precision
Cons: -

Approach: Miao Wang [56]; Year: 2014
Objective: They created a multi-model based on the Apriori algorithm for extracting helpful information from the hospital database for the decision-making process
Pros: The suggested Apriori algorithm, which identified the features of headache in traditional medicine, facilitated physicians' choice of prescriptions for different types of headache patients
Cons: The original Apriori algorithm wastes time scanning the entire database to search for frequent items

In this paper they introduce an ontology for chronically ill patients and incorporate two personalization mechanisms and a method to test the association rules. The first personalization process applies the ontology to the particularities contained in the hospital report for a particular individual condition and then creates a personalized ontology with only the therapeutic information relevant to the treatment of the patient by the healthcare professionals. Additionally, the Apriori algorithm is used for the classification rules. They proposed a context ontology approach based on Apriori, which predicts unexpected problems in the medical field performed by the exploration of surfaces. They developed rules with the suggested method by detecting the presence of outliers to increase the accuracy of disease prediction.

3.1 Data analysis

Data analysis is the first phase in the data mining process. In order to process the data, it is essential to analyze them.


3.3 Context ontology

Context is the information used for characterizing the state


of entities that are found to be significant to the commu-
nication among the user and applications. The context is
classified into two types. The low-level context is measured
as explicit context then directly detected from the context
provider. The high-level context is considered as implicit

LE
context and resultant from the low-level contexts. For the
semantic creation of high-level context, context ontology
modeling and context interpretation are required based on
the domain. The ontology is used to make a high-level

IC
context, to create a semantic seamless high-level context
for the context-aware service. The ontology is well-defined
Fig. 1 Flow diagram of proposed work
as a proper description of the shared conceptualization. The

RT
benefits of ontology are knowledge sharing and reusable
and logic inference. The expressive concept and support for
the syntactic and semantic interoperability can be provided
using the ontology. It offers classes of the objects, asso-
ciations and domain constraints on their properties.

A
The structured information can be shared by mapping
the concepts in various ontologies. Therefore, ontology is
used to state the context and domain knowledge. Context
ED
ontology includes the domain languages and represents the
relationship of languages. Also, it describes the mutual
model of the domain knowledge. The ontology is designed
as two classes. The general class is a general context
concept in the prevalent environment. The domain class is
CT

the definite ontology in accordance with the domain area.


The system using the biocontext can reuse the context
Fig. 2 Ontology structure
ontology. The service rules are generated depending on the
First step in the data analysis is the credentials of the core context ontology formed by the Web Ontology Language
(OWL) to create semantically perfect rules and inference.
A

components of the data to be cast off. In this work, the


patient data are the main block. The data types, forms and It is necessary to apply ontology for interoperability and
the information are to be understood, and feature study of knowledge reuse in the computing environment [57]. Fig-
TR

data is required. The medical sensors are used for moni- ure 2 depicts the ontology structure. The physiological data
toring the blood pressure, heart rate, respiration rate, pulse describes the physiological parameters.
rate and oxygen saturation rate (SpO2) of the patient [1].
3.3.1 Abnormality detection rule
RE

3.2 Data normalization


If blood pressure exceeds threshold value, it is determined
Data normalization performs transformation of all input as abnormal.
variables in the data to a predefined range. It scales the If heart rate exceeds threshold value, it is determined as
numerical variables in the range [0, 1]. It is defined as abnormal.
If pulse rate exceeds threshold value, it is determined as
x  xmin
xnew ¼ ð1Þ abnormal.
xmax  xmin
If respiratory level exceeds threshold value, it is deter-
where ‘x’ represents the input data, xnew denotes the nor- mined as abnormal.
malized data, xmin and xmax indicates the minimum value If oxygen saturation exceeds threshold value, it is
and maximum value of the input data, respectively. determined as abnormal.
The data normalization development removes repeated
data.
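The min-max scaling of Eq. (1) and the threshold rules of Sect. 3.3.1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary keys and the threshold values are assumptions made for illustration. The thresholds simply reuse the upper limits of the normal ranges quoted in Sect. 4, and a real system would set them per patient and per sensor.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] following Eq. (1)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # All values are equal, so Eq. (1) is undefined; map everything to 0.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical upper thresholds (illustrative only, not taken from the paper's rules).
THRESHOLDS = {
    "blood_pressure": 120.0,    # mmHg
    "heart_rate": 100.0,        # bpm
    "pulse_rate": 100.0,        # bpm
    "respiration_rate": 20.0,   # breaths per minute
    "spo2": 100.0,              # percent
}


def abnormal_parameters(reading, thresholds=THRESHOLDS):
    """Return the names of all parameters whose value exceeds its threshold."""
    return sorted(
        name
        for name, value in reading.items()
        if name in thresholds and value > thresholds[name]
    )


if __name__ == "__main__":
    print(min_max_normalize([60, 90, 120]))
    print(abnormal_parameters({"heart_rate": 130, "spo2": 97, "blood_pressure": 110}))
```

The rules are applied to the raw readings here; if they were applied to normalized values instead, the thresholds would have to be scaled with the same minimum and maximum.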


3.3.2 Populating the ontology

The ontology is populated when the sensors generate the monitored patient data. Before the ontology is populated, the data are converted into a specific format, because the populating tool requires a specific data representation. A mapping file is created to convert the data into Resource Description Framework (RDF)/eXtensible Markup Language (XML) format. The data are then added as a repository to the existing ontology. After population, new context-dependent data are defined from the existing ontology data [58].
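The conversion step can be sketched in Python with the standard library only. This is an illustrative assumption about what such a mapping might look like, not the tool or mapping file used in the paper: the namespace `http://example.org/health#`, the property names and the function `reading_to_rdf_xml` are all hypothetical, and the dictionary `FIELD_TO_PROPERTY` stands in for the mapping file.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/health#"  # hypothetical ontology namespace

# Stand-in for the mapping file: sensor field name -> ontology property name.
FIELD_TO_PROPERTY = {
    "heart_rate": "heartRate",
    "pulse_rate": "pulseRate",
    "blood_pressure": "bloodPressure",
    "respiration_rate": "respirationRate",
    "spo2": "oxygenSaturation",
}


def reading_to_rdf_xml(patient_id, reading):
    """Serialize one sensor reading as an RDF/XML description of the patient."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": EX_NS + patient_id},
    )
    for field, value in reading.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            continue  # fields without a mapping entry are skipped
        ET.SubElement(description, f"{{{EX_NS}}}{prop}").text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(reading_to_rdf_xml("patient1", {"heart_rate": 72, "spo2": 98}))
```

The resulting document could then be loaded into an ontology editor or triple store. Which tool the authors used for that step is not stated in this section.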

IC
3.4 Enhanced Apriori algorithm (EAA)

In the standard Apriori algorithm, the occurrence frequencies of the itemsets are tested when the candidate itemsets are generated. The EAA is used to mine the association rules by generating frequent k-itemsets. Building on the standard Apriori algorithm, it finds the frequent itemsets directly and eliminates the infrequent subsets, instead of determining whether the candidates are frequent itemsets after new candidates have been generated. The EAA mines the frequent itemsets without generating new candidate itemsets. Figure 3 shows the generation of candidate and frequent itemsets. A detailed description of the algorithm is given below. The EAA reduces the number of database scans and the redundancy incurred while creating and checking the subsets in the database. It requires less time to generate the frequent itemsets than the standard Apriori algorithm [59].

Fig. 3 Generation of candidate and frequent itemsets

Algorithm: EAA-AMO
Input:
    D = Transactional dataset (TID, List of ID)
    min_sup = Minimum support
    min_con = Minimum confidence
Output:
    Generated rule
begin
    1: for all transactions
    2:     C1 = (TID, List of ID)
    3:     C1 = count_support(TID, min_sup)
    4:     for all items
    5:         count = count_item(D)
    6:         L1 = count + 1 // for all count
    7:     end for
    8:     FI = Null
    9:     for each item in L1
    10:        if Support(C1) > min_sup
    11:            FI = FI ∪ C1
    12:        end if
    13:    end for
    14:    D2 = generate_candidate(L1)
    15:    go to steps 2–13 // for support and L2 candidate generation
    16:    Calculate C2, L2
    17:    D3 = generate_candidate(L2)
    18:    go to steps 2–13 // for support and tertiary dataset calculation
    19:    Calculate C3
    20:    Rule = Generate_rule(FI, min_con)
    21: end for
end

3.5 SMO regression

Fig. 4 SMO regression output

The SMO regression algorithm [60] is used to create the prediction model that detects abnormality in the physiological parameters. The training data are $\{(x_1, y_1), \ldots, (x_l, y_l)\} \subset \chi \times \mathbb{R}$, in which $\chi$ denotes the space of the input data patterns. A function $f(x)$ with maximum

error ($\varepsilon$) is derived from the real training data. Errors smaller than $\varepsilon$ are ignored. This matters for medical data, because deviations larger than the maximum error would degrade the system performance. The linear function $f$ is defined as

$$f(x) = \langle w, x \rangle + b \quad \text{with } w \in \chi,\ b \in \mathbb{R} \tag{2}$$

where $\langle \cdot, \cdot \rangle$ denotes the dot product in the input space. A small $w$ is ensured by minimizing the norm, i.e., $\|w\|^2 = \langle w, w \rangle$. This is written as a convex optimization problem:

$$\text{Minimize } \frac{1}{2}\|w\|^2 \quad \text{subject to } \begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon \\ \langle w, x_i \rangle + b - y_i \le \varepsilon \end{cases} \tag{3}$$

The slack variables $\xi_i, \xi_i^*$ are introduced to cope with otherwise infeasible constraints of the optimization problem:

$$\text{Minimize } \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \quad \text{subject to } \begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i \\ \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0 \end{cases} \tag{4}$$

where $C > 0$ is a constant that controls how strongly deviations larger than $\varepsilon$ are penalized. The SMO regression output is shown in Fig. 4.

4 Performance analysis

The performance of the proposed approach is evaluated using the PhysioNet database [61]. The PhysioBank database contains 72 complete records, along with the periodic measurements for all 121 Multiparameter Intelligent Monitoring in Intensive Care (MIMIC) records, including multiple recordings of 90 ICU patients. The Association Rule Miner and Deduction Analysis (ARMADA) tool [62] is used in the simulation process to evaluate the proposed approach. SMO regression is used to generate the prediction model for all physiological parameters. Figure 5a–e presents the variations in the physiological parameters for the 221 dataset. The heart rate is measured in beats per minute (bpm). The normal heart rate of a healthy adult at rest lies within the interval 60–100 bpm. Blood pressure and pulse rate are expressed in millimeters of mercury (mmHg) and bpm, respectively. The normal blood pressure ranges between 80 and 120 mmHg, depending on the physical condition of the person. The average respiration rate of an adult is about 12–20 breaths per minute, and the standard SpO2 rate lies in the range 95–100% [16]. Typically, the heart rate and pulse values should be identical and vary together, because they represent the same physiological parameter measured by different sensor devices. When the heart rate and pulse values differ, the point is marked as an inconsistency for the heart rate sensor.

Fig. 5 a Respiration, b arterial blood pressure (ABP), c oxygen saturation (SpO2), d heart rate and e pulse

Fig. 6 a, b Association rule generation using ARMADA tool

To assess the performance of the proposed algorithm in discovering frequent temporal patterns, we conducted several experiments on the healthcare datasets. MATLAB provides tools for viewing and preprocessing data, simplifies building machine-learning and predictive models, and supports deploying models to industrial IT systems. The experiments therefore use ARMADA (Association Rule Miner and Deduction Analysis), implemented in MATLAB 2015, on a 2.4 GHz Intel processor with 512 MB of RAM running Windows 8 Professional.
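The kind of rule mining performed in these experiments can be illustrated with a minimal Python sketch of the standard Apriori procedure: level-wise frequent-itemset generation followed by rule extraction under minimum support and minimum confidence. This is the textbook algorithm, not the authors' EAA and not the ARMADA tool. The function names, item labels and threshold values are assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_sup):
    """Return {itemset: support} for all itemsets with support >= min_sup."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)  # assumed to be non-zero
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # One database scan per level: count the support of each candidate.
        level = {}
        for candidate in current:
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_sup:
                level[candidate] = support
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def generate_rules(frequent, min_conf):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                confidence = support / frequent[antecedent]
                if confidence >= min_conf:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical transactions: abnormal findings observed together per record.
    records = [
        {"high_bp", "high_hr"},
        {"high_bp", "high_hr", "low_spo2"},
        {"high_bp", "high_hr"},
        {"low_spo2"},
    ]
    itemsets = frequent_itemsets(records, min_sup=0.5)
    for antecedent, consequent, sup, conf in generate_rules(itemsets, min_conf=0.9):
        print(sorted(antecedent), "->", sorted(consequent), sup, conf)
```

Each level of the loop scans the whole transaction list once, which is the cost the EAA is described as reducing. The sketch makes no attempt to reproduce that optimization.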

Fig. 7 a Rule support analysis and b rule confidence analysis

Fig. 8 Accuracy analysis of semantic ontology and proposed EAA-SMO (y-axis: accuracy in %; x-axis: rule generation method)

Fig. 9 Execution time analysis of semantic ontology and proposed EAA-SMO (y-axis: time in ms; x-axis: rule generation method)

Data analysis was chosen and analyzed for the health record of people with diabetes mellitus. Different characteristics of the other side-effects and illnesses of diabetic individuals have been identified. This is achieved through numerous. The dataset takes into account over eight characteristics, such as age, sex, different levels of cholesterol in the diabetic body, triglycerides, various physical imbalances and levels of blood pressure. Gender and diabetic disease have been taken as attributes in the data of 2 years, along with hypertension and co-disease.

In the third set of data, the characteristics are diabetic, thyroid, platelet count, hemoglobin, age and sex. The fourth takes account of the average of all of these. The proposed model was trained on all the datasets. The research has been conducted under the optimal conditions of the proposed method for the derivation and prediction of rules. By improving on existing traditional methods, the new association rules can forecast the data accurately and efficiently in various contexts of disease research.

Association rules are a basic but very valuable form of data mining that define the probabilistic co-occurrence of particular events in a database. In Fig. 3, the input healthcare dataset is used for ARMADA. They were originally designed to analyze healthcare data, in which the likelihood of dependent features of the same patient was analyzed. A rule is evaluated using two numerical measures, support and confidence. Figure 6a shows the input dataset being given to the ARMADA tool, with the initial support and confidence values set. Figure 6b illustrates the rules generated by the ARMADA tool in the form Item 1, Item N → Count. Furthermore, 27 rules are generated in 87.11 s. The minimum confidence and minimum

support for the generated rules were set to 90% and 10%, respectively.

Figure 7a, b illustrates the rule support and rule confidence analysis. Support indicates how frequently the items occur in the database, and confidence represents the number of times the statements are found to be true. From Fig. 7, it is observed that the proposed EAA yields better support and confidence values. Figure 8 shows the accuracy analysis of the semantic ontology and the proposed EAA-SMO. The prediction accuracy rate of EAA-SMO is 99% more than that of the semantic ontology. With the same sup_min and conf_min, the proposed method generates more and better association rules than the semantic ontology. This indicates that the approach is advantageous for mining effective association rules in the medical field. Figure 9 illustrates the execution time analysis of the semantic ontology and the proposed EAA-SMO. The proposed EAA-SMO requires 25% less execution time than the semantic ontology.

5 Conclusion

In this paper, a combined method of EAA and context ontology is presented for mining and modeling the physiological data using the concepts and relationships through the generated rules. The ontology is developed from the contextual concepts and constraints and populated with the measured physiological data obtained from the medical sensors. By using the EAA with the context ontology, an increasing number of rules is obtained. The performance analysis shows that the proposed approach yields improved support and confidence. The comparative analysis demonstrates that the proposed EAA-SMO approach achieves higher accuracy and lower execution time than the semantic ontology. Hence, the proposed EAA-SMO approach is more efficient than the semantic ontology. In future work, the performance of the Apriori algorithm will be further improved by optimizing the rules with datasets of different sizes and types, such as bioinformatics, CRM and telecommunication data. Several changes would also be made to other algorithms for generating association rules, so that they work more efficiently in current recommendation systems. New and efficient rule mining methods may be used to boost the efficiency of recommendation systems. To enhance the reliability of recommendation systems, efficient neural network approaches may be integrated into association rule mining.

Acknowledgements The authors would like to express their very great appreciation to the reviewers for their valuable suggestions. Their willingness to give their time so generously has been very much appreciated.

References

1. Haque SA, Rahman M, Aziz SM (2015) Sensor anomaly detection in wireless sensor networks for healthcare. Sensors 15:8764–8786
2. Aziz SM, Pham DM (2013) Energy efficient image transmission in wireless multimedia sensor networks. IEEE Commun Lett 17:1084–1087
3. Pham DM, Aziz SM (2011) FPGA architecture for object extraction in wireless multimedia sensor network. In: Seventh international conference on intelligent sensors, sensor networks and information processing (ISSNIP), pp 294–299
4. Pham DM, Aziz SM (2011) FPGA-based image processor architecture for wireless multimedia sensor network. In: IFIP 9th international conference on embedded and ubiquitous computing (EUC), pp 100–105
5. Pham DM, Aziz SM (2013) Object extraction scheme and protocol for energy efficient image communication over wireless sensor networks. Comput Netw 57:2949–2960
6. Pham DM, Aziz SM (2013) An energy efficient image compression scheme for wireless sensor networks. In: IEEE eighth international conference on intelligent sensors, sensor networks and information processing, pp 260–264
7. Alemdar H, Ersoy C (2010) Wireless sensor networks for healthcare: a survey. Comput Netw 54:2688–2710
8. Yilmaz T, Foster R, Hao Y (2010) Detecting vital signs with wearable wireless sensors. Sensors 10:10837–10862
9. Milenković A, Otto C, Jovanov E (2006) Wireless sensor networks for personal health monitoring: issues and an implementation. Comput Commun 29:2521–2533
10. C. M. T. (CMT) (2017) MICAz ZigBee Series (MPR2400). http://www.cmt-gmbh.de/Produkte/WirelessSensorNetworks/MPR2400.html. Accessed 20 Nov 2019
11. Dubois-Ferrière H, Fabre L, Meier R, Metrailler P (2006) TinyNode: a comprehensive platform for wireless sensor network applications. In: Proceedings of the 5th international conference on information processing in sensor networks, pp 358–365
12. T. W. R. Group (2017) The sensor network museum - Tmote Sky. http://www.snm.ethz.ch/Projects/TmoteSky. Accessed 28 Oct 2019
13. Burns A, Greene BR, McGrath MJ, O'Shea TJ, Kuris B, Ayer SM et al (2010) SHIMMER™ - a wireless sensor platform for noninvasive biomedical research. IEEE Sens J 10:1527–1534
14. Sun Q, Hu F, Hao Q (2014) Mobile target scenario recognition via low-cost pyroelectric sensing system: toward a context-enhanced accurate identification. IEEE Trans Syst Man Cybern Syst 44:375–384
15. Benferhat D, Guidec F, Quinton P (2012) Cardiac monitoring of marathon runners using disruption-tolerant wireless sensors. In: International conference on ubiquitous computing and ambient intelligence, pp 395–402
16. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. ACM SIGKDD Explor Newsl 11:10–18
17. Mohd IN (2011) Interestingness measures for association rules based on statistical validity. Knowl Based Syst 24:386–392
18. Gruber TR (1993) A translation approach to portable ontology specifications. Knowl Acquis 5:199–220
19. Kim J, Kim J, Lee D, Chung K-Y (2014) Ontology driven interactive healthcare with wearable sensors. Multimed Tools Appl 71:827–841
20. Kim J, Chung K-Y (2014) Ontology-based healthcare context information model to implement ubiquitous environment. Multimed Tools Appl 71:873–888
21. Subhani N, Kent R (2014) Novel design approach to build audit rule ontology for healthcare decision support systems. In: Proceedings of the international conference on e-learning, e-business, enterprise information systems, and e-government (EEE), p 1
22. Kumar V (2015) Ontology based public healthcare system in internet of things (IoT). Procedia Comput Sci 50:99–102
23. Lamine E, Tawil ARH, Bastide R, Pingaud H (2014) An ontology-driven approach for the management of home healthcare process. In: Enterprise interoperability VI. Springer, pp 151–161
24. Mohan K, Aramudhan M (2015) Ontology based access control model for healthcare system in cloud computing. Indian J Sci Technol 8:218–222
25. Mehmood NQ, Culmone R, Mostarda L (2014) An ontology driven software framework for the healthcare applications based on ANT+ protocol. In: 28th international conference on advanced information networking and applications workshops (WAINA), pp 245–250
26. Ongenae F, Claeys M, Dupont T, Kerckhove W, Verhoeve P, Dhaene T et al (2013) A probabilistic ontology-based platform for self-learning context-aware healthcare applications. Expert Syst Appl 40:7629–7646
27. Campbell D, Pereira E (2016) A novel ontology-based approach to personalised mHealth application development. In: SAI computing conference (SAI), 2016, pp 985–989
28. Larburu N, Bults RG, Van Sinderen MJ, Hermens HJ (2015) An ontology for telemedicine systems resiliency to technological context variations in pervasive healthcare. IEEE J Transl Eng Health Med 3:1–10
29. Zhou N, Qiao M, Zhou J (2019) BI_Apriori algorithm: research and application based on battery production data. In: 2019 IEEE 9th international conference on electronics information and emergency communication (ICEIEC), Beijing, China, pp 1–5
30. Huang Y, Lin Q, Li Y (2018) Apriori-BM algorithm for mining association rules based on bit set matrix. In: 2018 2nd IEEE advanced information management, communicates, electronic and automation control conference (IMCEC), Xi'an, pp 2580–2584
31. Xueyuan W, Bo Y (2018) Design and implementation of an apriori-based recommendation system for college libraries. In: 2018 international conference on engineering simulation and intelligent control (ESAIC), Changsha, pp 372–375
32. Hasan MM, Zaman Mishu S (2018) An adaptive method for mining frequent itemsets based on apriori and FP growth algorithm. In: 2018 international conference on computer, communication, chemical, material and electronic engineering (IC4ME2), Rajshahi, pp 1–4
33. Majali J, Niranjan R, Phatak V, Tadakhe O (2015) Data mining techniques for diagnosis and prognosis of cancer. Int J Adv Res Comput Commun Eng 4(3):613–616
34. Kharya S (2012) Using data mining techniques for diagnosis and prognosis of cancer disease. arXiv preprint arXiv:1205.1923
35. Delen D, Walker G, Kadam A (2005) Predicting breast cancer survivability: a comparison of three data mining methods. Artif Intell Med 34(2):113–127
36. Alwidian J, Hammo BH, Obeid N (2018) WCBA: weighted classification based on association rules algorithm for breast cancer disease. Appl Soft Comput 62:536–549
37. Kunwar V et al (2016) Chronic kidney disease analysis using data mining classification techniques. In: 2016 6th international conference cloud system and big data engineering (confluence). IEEE
38. Kaur G, Sharma A (2017) Predict chronic kidney disease using data mining algorithms in hadoop. In: 2017 international conference on inventive computing and informatics (ICICI). IEEE
39. Lakshmi KR, Nagesh Y, Veera Krishna M (2014) Performance comparison of three data mining techniques for predicting kidney dialysis survivability. Int J Adv Eng Technol 7(1):242
40. Srinivas K, Kavihta Rani B, Govrdhan A (2010) Applications of data mining techniques in healthcare and prediction of heart attacks. Int J Comput Sci Eng 2(02):250–255
41. Dangare CS, Apte SS (2012) Improved study of heart disease prediction system using data mining classification techniques. Int J Comput Appl 47(10):44–48
42. Noh K et al (2006) Associative classification approach for diagnosing cardiovascular disease. In: Intelligent computing in signal processing and pattern recognition. Springer, Berlin, pp 721–727
43. Azar AT, El-Metwally SM (2013) Decision tree classifiers for automated medical diagnosis. Neural Comput Appl 23(7–8):2387–2403
44. Chowdhury DR, Chatterjee M, Samanta RK (2011) An artificial neural network model for neonatal disease diagnosis. Int J Artif Intell Expert Syst 2(3):96–106
45. Vanisree K, Singaraju J (2011) Decision support system for congenital heart disease diagnosis based on signs and symptoms using neural networks. Int J Comput Appl 19(6):6–12
46. Ratnakar S, Rajeswari K, Jacob R (2013) Prediction of heart disease using genetic algorithm for selection of optimal reduced set of attributes. Int J Adv Comput Eng Netw 1(2):51–55
47. Anuja Kumari V, Chitra R (2013) Classification of diabetes disease using support vector machine. Int J Eng Res Appl 3(2):1797–1801
48. Masethe HD, Masethe MA (2014) Prediction of heart disease using classification algorithms. In: World congress on engineering and computer science 2014 Vol II WCECS 2014, San Francisco, USA, 22–24 Oct 2014
49. Purwar A, Singh SK (2015) Hybrid prediction model with missing value imputation for medical data. Expert Syst Appl 42:5621–5631
50. Turabieh H (2016) A hybrid ANN-GWO algorithm for prediction of heart disease. Am J Oper Res 6:136–146
51. Tina Patil R, Sherekar SS (2013) Performance analysis of Naive bayes and J48 classification algorithm for data classification. Int J Comput Sci Appl 6(2):256–261
52. Panday P, Godara N (2012) Decision support system for cardiovascular heart disease diagnosis using improved multilayer perceptron. Int J Comput Appl 45(8):12–20
53. Zheng B, Yoon SW, Lam SS (2014) Breast cancer diagnosis based on feature extraction using a hybrid of K-means and support vector machine algorithms. Expert Syst Appl 41(4):1476–1482
54. Technologies E, Vadicherla D, Sonawane S (2013) Decision support system for heart disease based on sequential minimal optimization in support. Int J Eng Sci Emerg Technol 4(2):19–26
55. Ishtake SH, Sanap SA (2013) Intelligent heart disease prediction system using data mining techniques. Int J Healthc Biomed Res 1(3):94–101
56. Wang M, Zhang L, Zhang Z, Xu C, Chen G, Shang H (2014) The application characteristics of traditional Chinese medical science treatment on headache based on data-mining apriori algorithm. In: IEEE international conference on bioinformatics and biomedicine, pp 153–157
57. Ko E-J, Lee H-J, Lee J-W (2006) Ontology-based context-aware service engine for u-healthcare. In: The 8th international conference on advanced communication technology, 2006. ICACT 2006, pp 632–637
58. Bytyçi E, Ahmedi L, Kurti A (2016) ARM with context ontologies: an application to mobile sensing of water quality. In: Metadata and semantics research: 10th international conference, MTSR 2016, Göttingen, Germany, 22–25 November 2016, Proceedings, pp 67–78
59. Patil SP, Patil U, Borse S (2012) The novel approach for improving Apriori algorithm for mining association rule. World J Sci Technol 2:75–78
60. Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14:199–222
61. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG et al (2000) Physiobank, physiotoolkit, and physionet. Circulation 101:e215–e220
62. Abdallah MA, Alshreef MHA (2014) Extracting associations from kidney transplantations dataset. Sudan University of Science and Technology, Khartoum

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.