Neurocomputing 385 (2020) 100–110

Contents lists available at ScienceDirect

Neurocomputing
journal homepage: www.elsevier.com/locate/neucom

Deep learning based software defect prediction


Lei Qiao a, Xuesong Li a,∗, Qasim Umer a, Ping Guo b
a School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
b Image Proc. and Patt. Recog. Lab., School of Systems Science, Beijing Normal University, Beijing, 100875, China

Article info

Article history:
Received 20 June 2019
Revised 18 October 2019
Accepted 13 November 2019
Available online 5 December 2019
Communicated by Dr. Kaizhu Huang

Keywords:
Software defect prediction
Deep learning
Software quality
Software metrics
Robustness evaluation

Abstract

Software systems have become larger and more complex than ever. Such characteristics make it very challenging to prevent software defects. Therefore, automatically predicting the number of defects in software modules is necessary and may help developers allocate limited resources efficiently. Various approaches have been proposed to identify and fix such defects at minimal cost. However, the performance of these approaches requires significant improvement. Therefore, in this paper, we propose a novel approach that leverages deep learning techniques to predict the number of defects in software systems. First, we preprocess a publicly available dataset, including log transformation and data normalization. Second, we perform data modeling to prepare the data input for the deep learning model. Third, we pass the modeled data to a specially designed deep neural network-based model to predict the number of defects. We also evaluate the proposed approach on two well-known datasets. The evaluation results illustrate that the proposed approach is accurate and can improve upon the state-of-the-art approaches. On average, the proposed method significantly reduces the mean square error by more than 14% and increases the squared correlation coefficient by more than 8%.

© 2019 Elsevier B.V. All rights reserved.

∗ Corresponding author.
E-mail addresses: qiaolei@bit.edu.cn (L. Qiao), lixuesong@bit.edu.cn (X. Li), qasimumer667@hotmail.com (Q. Umer), pguo@bnu.edu.cn (P. Guo).
https://doi.org/10.1016/j.neucom.2019.11.067
0925-2312/© 2019 Elsevier B.V. All rights reserved.

1. Introduction

The complexity of modern software systems is increasing, and the resulting software applications often contain defects that can have severe negative impacts on the reliability and robustness of these applications [1]. A software defect is commonly defined as a deviation from the software specifications or requirements [2]. Such defects might lead to failures or produce unexpected results [2,3]. To reduce failures and improve software quality, many software quality assurance activities (e.g., defect prediction, code review and unit testing) are employed. Such activities typically cost approximately 80% of the total budget of a project [4]. To minimize the cost, software engineers want to know which software modules contain more defects and inspect such modules first. As a result, software defect prediction techniques [5] have been proposed.

Software defect prediction techniques help identify software system modules that are more likely to contain defects [6]. Defect prediction techniques can be used to build models that rank software modules by the predicted number of defects, defect probability, or classification results [7]. This ranked list can reflect the priority for code inspection or unit testing and can thus be used to determine the order in which code should be inspected. Consequently, the developers can allocate the limited test resources to the code areas most likely to contain bugs. The resulting savings in labor and time costs [8] can reduce the overall cost of maintenance activities and maximize company profits [9].

Software defect prediction is a process that uses a model to predict the code areas that potentially contain defects [10]. The process of software defect prediction [11] includes three main steps: collecting a historical defect dataset; using the historical data to train a regression or classification model using machine learning or deep learning techniques; and applying the trained model to predict the number or probability of software defects. From the perspective of different dataset granularities, software defect prediction techniques can be divided into four categories: package level [12], file (modules) level [7], method level [13], and change level [9]. From the perspective of different metrics, it can be divided into two defect prediction categories: static and dynamic. Static defect prediction techniques mainly involve predicting the number of defects or the defects distribution using static software metrics [14]. Dynamic defect prediction techniques mainly involve predicting the distribution of system defects over time using the defect generation time [10]. Most defect prediction techniques build defect prediction models that use traditional hand-crafted features, including Halstead’s software volume metrics [15] and McCabe’s cyclomatic complexity metrics [16]. Such approaches are based on statistical and machine learning theories.
These approaches, which include support vector regression (SVR) [17], fuzzy support vector regression (FSVR) [18] and random forest (RF) [19] models, first build a regression or classification model and then use the model to predict the number or probability of defects for a given unit of source code. These approaches have significantly facilitated the prediction of software defects; however, their performance (e.g., mean square error and squared correlation coefficient) requires significant improvements [20].

While many defect prediction approaches have been proposed [10,18,21–28], most of the research on defect prediction involves classification models [21]. These classification models classify a software module only into fault-prone or non-fault-prone. However, this type of defect prediction does not provide a specific number of defects. Predicting the defect probability of a given software module is not sufficient to help in practical software testing situations [29]. Moreover, allocating limited resources solely based on faulty or non-faulty judgements may result in an inefficient use of resources [26]. Some fault-prone modules may have more defects than other fault-prone modules; thus, inspecting and testing these modules requires additional effort [26].

In contrast, predicting the number of defects provides a specific estimate of the defects in a given module [26,30]. Thus, the practitioners of software quality assurance activities can pay more attention to modules that have more defects and allocate their limited test resources more efficiently and optimally. Consequently, the developers can concentrate their testing efforts on modules that have more defects and accelerate the release schedule. Therefore, predicting the number of defects in a module is a good way to improve on predictions of whether a module is likely to be faulty or non-faulty. Although the existing approaches to defect prediction classify software modules as buggy or non-buggy modules, few approaches predict the number of defects within modules [30,31]. Such approaches exploit regression models for the prediction of the number of defects but require significant improvement in the process of software development. To this end, in this paper, our deep learning-based defect prediction model uses regression to predict the number of defects.

Deep learning has been applied in a wide variety of fields such as speech recognition [32], natural language processing [33] and image processing [34], and has proven to be a powerful technique. To explore the power of deep learning for defect prediction and further improve the accuracy of defect prediction, in this paper, we propose a novel defect prediction approach, deep learning neural network-based defect prediction (DPNN). We employ deep learning for defect prediction because deep learning models can learn and capture the discriminative features from data automatically, thus resulting in a more accurate defect prediction model.

Wan et al. [35] conducted a large scale mixed qualitative and quantitative study and found that the top 3 preferred granularity levels of the practitioners are the feature (i.e., requirement or conceptual concerns proposed by customers/users), commit and component levels. Although defect prediction has finer granularity, it can be helpful in facilitating defect inspection and unit tests. A coarse granularity can help participants gain a good grasp of the overall quality of the software. In this paper, we make defect predictions at the module level.

The paper makes the following contributions:

• We propose a deep learning-based software defect prediction method.
• We evaluate the proposed approach with metrics on real-world software datasets and find that the proposed approach outperforms the state-of-the-art approaches.

The remainder of this paper is structured as follows. Section 2 reviews related work regarding software defect prediction. Section 3 describes the proposed approach. Section 4 presents an evaluation of the proposed approach and compares it to previous approaches. Section 5 introduces the threats to the validity of defect prediction. Finally, Section 6 provides conclusions and suggests potential future work.

2. Related work

In this section, we introduce related work on software metrics, classification and regression models for software defect prediction, and deep learning and its applications.

2.1. Software metrics

Software metrics [9,15,16,36,37] have been widely exploited in software defect prediction [38]. Zimmermann et al. [39] argued that combinations of complex metrics can be used to predict defects. They generated the Eclipse datasets for use with defect prediction models and made the datasets publicly available. Catal and Diri [40] conducted a systematic review of previous software defect prediction research on various types of software metrics, methods, and datasets. They found that method-level metrics dominate in previous defect prediction research. Defect prediction that uses class-level metrics to determine software defects beyond acceptable levels should be applied more frequently because this approach can predict defects in the design phase [40]. D’Ambros et al. [41] used different metrics (i.e., source code churn, the entropy of source code metrics, and process metrics) and built different classification models for defect prediction. Nam and Kim [42] proposed two novel approaches, CLA and CLAMI, to label unlabeled datasets automatically using the magnitudes of metric values to facilitate defect prediction for unlabeled datasets. Ozakinci and Tarhan [43] conducted a trend overview of early software defect prediction in the development life-cycle using process-based software metrics. Nam et al. [5] proposed heterogeneous defect prediction based on matching metrics to address limitations of cross-project defect prediction that cannot be addressed via heterogeneous metric sets.

A number of software defect prediction techniques that use software metrics have been proposed [7,9,21–23,25,27,28,44,45]. Such approaches can be divided into two types: supervised defect prediction approaches and unsupervised defect prediction approaches. Supervised defect prediction approaches use historical datasets to train a defect prediction model [7]. Guo and Lyu [46] applied the pseudoinverse learning algorithm to build a software reliability growth model using the stacked generalization technique. They also adopted a support vector machine (SVM) to predict software quality [47]. Hata et al. [13] used an RF algorithm to build a method-level software defect prediction model based on historical method-level metrics. Unsupervised defect prediction approaches can predict defect proneness without requiring a defect dataset [45]. These approaches can be used when a training dataset is insufficient or is not available [48]. Yang et al. [45] proposed an unsupervised approach that ranks the change metrics in descending order based on the reciprocal of the raw value of each change metric.

2.2. Classification models for software defect prediction

Software defect prediction models can help developers locate and fix more bugs promptly with less effort and can be classified as either classification or regression models. Most defect prediction models are classification models [21]. Lessmann et al. [21] proposed a framework to compare software defect classification prediction and conducted a large-scale empirical comparison of 22 classification models over 10 public datasets from the
NASA Metrics Data repository. Ghotra et al. [25] replicated the research of Lessmann et al. [21], built different classification models using different classification techniques and tested them on three datasets. They found that defect prediction models vary significantly in performance and that significant differences exist among different classification techniques. Herbold et al. [49] replicated 24 approaches that used different classifiers (e.g., logistic regression, C4.5 Decision Tree, RF, etc.). They found that the logistic regression model proposed by Camargo Cruz et al. [50] was best at predicting fault-prone software. Jing et al. [10] adopted a dictionary learning technique to predict software defects. They classified modules into buggy and non-buggy using the powerful classification capability of dictionary learning. Cross-project defect prediction requires some degree of homogeneity (i.e., different projects must be describable using the same metrics) between the projects used for training and those used for testing [5]. Zhang et al. [44] used a connectivity-based unsupervised classifier to solve this problem.

Laradji et al. [51] studied the positive effects of combining feature selection and ensemble learning on the performance of defect classification models. When building accurate classification models, it is necessary to select the features carefully. Tantithamthavorn et al. [28] studied the impacts of mislabeling datasets on the performance of classifier models. In another work, Tantithamthavorn et al. [52,53] investigated classifier model performance and applied an automated parameter optimization technique called Caret. They found that Caret can improve the classifier performance significantly and that the resulting classifiers are as stable as those trained using the default settings. Parameter settings can have a large impact on the performance of defect prediction models. Ghotra et al. [54] studied the impact of feature selection techniques on the performance of defect classification models. They recommended applying feature selection techniques when building classification models for defect prediction.

He et al. [55] investigated the effects of using a simplified metric set to build defect predictors in the within-project and cross-project defect prediction scenarios. They found that naive Bayes models achieve good defect prediction results. Tan et al. [11] conducted the first study that applied online change classification to improve defect prediction performance. To improve the performance of cross-project defect prediction, Zhang et al. [56] studied many composite algorithms that integrated multiple machine learning classifiers for defect prediction.

2.3. Regression models for software defect prediction

Whereas classification models have been widely studied in defect prediction, research on defect prediction using regression models is more limited [30,31]. A regression model considers defect prediction as a ranking task. The software modules are ranked according to the number of predicted defects they contain. Predicting the number of defects explicitly in software modules is not easy. Therefore, an accurate regression model is necessary.

Mockus and Weiss [57] conducted the first study using a linear regression model to predict software failures on a change-level dataset. Kamei et al. [9] built an effort-aware defect prediction model by using linear regression on a dataset consisting of change metrics. Ohlsson and Alberg [58] proposed the Alberg diagram as a performance measure to assess the regression model performance. Yang et al. [59] proposed a learning-to-rank approach to optimize the performance measure of the ranking model. They used fault-percentile-average (FPA) and the defects percentage in the first 20% modules to evaluate the prediction models. Felix and Lee [60] proposed a machine-learning-inspired (MACLI) approach using the predictor variables derived from defect acceleration to predict the number of defects in an upcoming product release and determine the correlation of each predictor variable with the number of defects. They found that only the average defect velocity shows a strong positive correlation with the true number of defects.

SVM was proposed by Vapnik et al. [61]. The theory of SVM is based on the idea of the structural risk minimization (SRM) approach [62]. The SVM was introduced to address classification problems. SVR is an extension of SVM used in solving linear and non-linear regression problems [63].

FSVR is commonly considered to be a combination of an SVR with fuzzy logic [64]. SVR considers that all the data points have the same importance in classification problems [65]. To reduce the sensitivity of the less important data, fuzzy logic (i.e., fuzzy membership) is introduced to each data point in SVR. Fuzzy memberships assign different membership values as weights to control the importance of the corresponding data point [65]. Yan et al. [18] employed FSVR to predict the number of defects in software and achieved better performance. Fuzzifying the input to their regression approach can address problems with unbalanced datasets. Lee et al. [64] applied an FSVR model to predict the minimum departure from the nucleate boiling ratio (DNBR) in a reactor core to prevent fuel cladding from melting and causing a crisis. Zhang et al. [66] proposed a fuzzy density weight SVR (FDW-SVR) denoising algorithm that allocated a fuzzy priority to each sample according to its corresponding density weight.

Sun and Sun [67] applied fuzzy SVM to the regression estimation problem. They constructed a multi-layer SVM by combining fuzzy logic with the SVMs. In the first layer, they applied a fuzzy membership function to the SVM; then, in the second layer, they used a generalized SVM. Liu et al. [68] proposed a three-domain fuzzy support vector regression model that integrates the kernel and fuzzy membership functions into a three-domain function that performs uncertain image denoising for a humanoid robot. Chen et al. [69] utilized a three-layer weighted fuzzy SVR model based on emotion-identification information to understand human intention. This model includes three layers: adjusted weighted kernel fuzzy c-means, fuzzy SVR, and weighted fuzzy regression. Their model also has the ability to reveal the weights assigned to each feature.

Quinlan proposed an approach to synthesizing decision trees and demonstrated through several practical applications that the technology for building decision trees is robust [70]. Naeem and Khoshgoftaar [71] used C4.5 and RF decision tree classification algorithms in the context of cost-sensitive learning to build a software quality prediction model. Chen and Ma [31] used decision tree regression (DTR) to build a defect prediction model to predict the number of defects in different scenarios (i.e., within- and cross-project defect scenarios). They found that the defect prediction models can obtain similar performance results in the within- and cross-project defect scenarios. Rathore and Kumar [30] applied the DTR to predict the number of defects in two scenarios: intra- and inter-release defect prediction. Yu et al. [29] used the DTR model as a baseline for defect prediction. Rathore and Kumar [26] conducted an empirical study on defect prediction models that predict the number of defects. They used the average absolute error, average relative error, and level-l measures as the performance criteria to evaluate the performance of the defect prediction models and found that the DTR achieved the best performance.

2.4. Deep learning and its applications

Many deep learning algorithms have been proposed, including the convolutional neural networks (CNN) [72], recurrent neural network (RNN) [73], and long-short term memory (LSTM) [74]. Deep learning has been implemented in a variety of domains such as speech recognition [32], natural language processing [33] and image processing [34].
Currently, deep learning is becoming increasingly prevalent in the field of software engineering. Zhao and Huang [75] proposed a new approach, DeepSim, to measure the functional similarities of code. They encoded code control-flow and data-flow into a matrix to create a semantic representation and then used a deep neural network (DNN) model to learn features from the matrix and conducted a binary classification. Ma et al. [76] proposed MODE, an automated neural network debugging technique. Similar to software debugging and regression testing, they conducted a model-state analysis to identify model bugs and performed training input selection. MODE can efficiently identify buggy neurons and fix model bugs. Guo et al. [77] used word embedding and an RNN to generate trace links. The word embedding learns word vectors; then, the RNN uses these word vectors to learn the sentence semantics. White et al. [78] used a deep learning software language model (i.e., a feed-forward network and an RNN) to make code suggestions. They also identified avenues for software engineering tasks using deep software language models to make predictions. They reported that deep learning techniques are applicable to source code files and found that deep learning can create high-quality models from a corpus of Java projects. Balog et al. [79] employed a neural network to solve programming-competition style problems from input-output examples. Tian et al. [80] applied a DNN to automatically generate test cases to perform automated testing of erroneous behaviors of DNN-controlled vehicles.

Huo et al. [81] applied a CNN to locate potential buggy source code based on a bug report. They used both lexical and program structural information to learn unified features from natural language and source code to perform bug localization. Their approach used a CNN to extract complete and semantic features. Gu et al. [82] proposed a new DEEPAPI approach built with an RNN encoder–decoder model to generate an API suggestion sequence for a given API-related query. They were the first to apply an RNN to API sequence suggestions. White et al. [83] proposed a deep learning technique to detect code clones. Their approach automatically discovers discriminating features in source code. They suggested that all the content in the terms and fragments of source code can be represented and used for clone detection. Xu et al. [84] proposed a deep learning approach to solve the problem of predicting semantically linkable knowledge units. They considered it as a multi-class classification problem rather than a traditional binary classification problem. They used word embeddings and a CNN to capture the word-and-document-level semantics of the knowledge units.

Lam et al. [85] proposed a novel approach that combined a DNN and revised vector space model (rVSM) for bug localization in which they used the DNN to learn to relate the terms in bug reports and source files. Mou et al. [86] proposed a novel tree-based convolutional neural network (TBCNN) for processing programming languages in which the convolutional layer is used to detect structural features. Trees of different sizes and shapes can be dealt with by continuous binary trees and pooling. Gu et al. [87] proposed a novel deep neural network named Code-Description Embedding Neural Network (CODEnn) to help developers perform code search. The code snippets and corresponding natural language descriptions are represented as a high-dimensional vector space. They built a deep neural network to perform the code search. Hellendoorn et al. [88] proposed DEEPTYPE, a deep learning model, to provide type suggestions. This model learns which types occur in certain contexts and can then be used to predict variable and function type annotations. Henkel et al. [89] proposed an approach that transforms programs into a more suitable representation. They then used the trace abstractions obtained from a program as a representation for learning word embeddings.

3. Approach

In this section, we propose a deep learning-based model for software defect prediction. An overview of the proposed approach is presented next; then the details are presented in the remainder of this section.

3.1. Overview

An overview of the deep learning-based software defect prediction is depicted in Fig. 1, which shows that the proposed approach is composed of two steps: training a deep learning-based defect prediction model (i.e., DPNN model) and performing defect prediction for new modules based on the trained model. To predict the number of defects in a software module, we perform the following key steps.

• First, we collect and reuse datasets containing software metrics.
• Second, to train the DPNN model, we conduct some preprocessing operations (data log transformation and data normalization) on the given software metrics.
• Third, we train a specially designed deep learning neural network-based model to predict the number of defects in code.
• Finally, in the prediction phase, we perform preprocessing to acquire the software metrics and then input the modeled data to the trained regression model to predict the number of defects in each provided module.

Fig. 1. Overview of the proposed approach.

Each of the key steps in the proposed approach is elaborated in the following sections.

3.2. Modeling

The key goal of the proposed approach is to predict the number of defects in software modules.

The model for defect prediction can be defined as a mapping:

$$y = f(m) \tag{1}$$

where $m$ is a given module and $y$ is the predicted number of defects in module $m$.

A module $m$ consists of a set of software metrics and can be defined as follows:

$$m = \langle sm_1, sm_2, \ldots, sm_n \rangle \tag{2}$$

where $sm_1, sm_2, \ldots, sm_n$ represent the software metrics of the module $m$.

The mapping $f$ can be explored using different techniques. Existing defect prediction approaches apply statistical theories (e.g., linear regression or DTR) to search for the mappings that best fit a given dataset [9,26]. In this paper, we attempt to find such mappings through deep learning techniques.
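
As a rough illustration of Eqs. (1) and (2), the sketch below represents each module as a vector of software metrics and fits the mapping f with ordinary least squares, i.e., the kind of linear-regression baseline mentioned above rather than the deep model proposed in this paper. The two metrics, their values, the defect counts, and the function names `fit_linear_mapping` and `predict_defects` are all invented for illustration; the toy counts were constructed to be exactly linear in the metrics so that the fit is easy to check.

```python
import numpy as np


def fit_linear_mapping(modules: np.ndarray, defects: np.ndarray) -> np.ndarray:
    """Fit y = f(m) as a linear map with an intercept, using least squares."""
    # Append a column of ones so the last weight acts as the intercept.
    design = np.hstack([modules, np.ones((modules.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, defects, rcond=None)
    return weights


def predict_defects(weights: np.ndarray, modules: np.ndarray) -> np.ndarray:
    """Apply the fitted mapping f to one or more modules."""
    design = np.hstack([modules, np.ones((modules.shape[0], 1))])
    return design @ weights


# Hypothetical modules m = <sm_1, sm_2>: (lines of code, cyclomatic complexity).
modules = np.array(
    [[100, 5], [300, 10], [200, 0], [500, 20], [0, 5], [400, 5]], dtype=float
)
# Hypothetical defect counts, built as LOC / 100 + complexity / 5.
defects = np.array([2, 5, 2, 9, 1, 5], dtype=float)

weights = fit_linear_mapping(modules, defects)
predicted = predict_defects(weights, modules)

# Rank modules by predicted number of defects, most defect-prone first.
ranking = np.argsort(-predicted)
print("predicted defects:", np.round(predicted, 3))
print("inspection order (module indices):", ranking)
```

The ranking step mirrors how predicted defect counts are meant to be used: modules with the highest predicted counts are inspected first.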
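
The preprocessing operations named in the second key step of Section 3.1 (log transformation followed by data normalization) might look as follows in practice. This is only a sketch under stated assumptions: metric values are taken to be strictly positive so that the natural logarithm is defined, each metric (column) is scaled separately with min-max normalization, constant columns are mapped to zero to avoid division by zero, and the numbers and function names are made up.

```python
import numpy as np


def log_transform(metrics: np.ndarray) -> np.ndarray:
    """Natural-log transform of every metric value (assumes values > 0)."""
    return np.log(metrics)


def min_max_normalize(metrics: np.ndarray) -> np.ndarray:
    """Scale each metric (column) to the range [0, 1]."""
    lowest = metrics.min(axis=0)
    highest = metrics.max(axis=0)
    # Assumption: a constant column gets span 1.0, so it maps to all zeros.
    span = np.where(highest > lowest, highest - lowest, 1.0)
    return (metrics - lowest) / span


# Hypothetical raw metrics: one row per module, one column per metric.
raw_metrics = np.array([[10.0, 1.0], [100.0, 2.0], [1000.0, 4.0]])

preprocessed = min_max_normalize(log_transform(raw_metrics))
print(np.round(preprocessed, 3))
```

Because both toy columns grow geometrically, the log transform spaces them evenly, and min-max scaling then places the three modules at 0, 0.5 and 1 for each metric.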

3.3. Data preprocessing

Determining the number of defects in software modules requires a preprocessing step to acquire the software metrics [9,90,91]. To accomplish this task, we apply natural logarithm transformation and perform data normalization on the extracted software metrics.

3.3.1. Natural logarithm

Log transformation converts the dataset distribution from highly skewed to less skewed [92]. We apply log transformation to each software metric due to the highly skewed distribution of our dataset. After log transformation, a module $m$ can be defined as follows:

$$m = \langle sm'_1, sm'_2, \ldots, sm'_n \rangle \tag{3}$$

$$SM = \langle \ln(sm_1), \ln(sm_2), \ldots, \ln(sm_n) \rangle \tag{4}$$

where $sm'_1, sm'_2, \ldots, sm'_n$ represent the log-transformed $n$ software metrics, $SM$ represents a set of log-transformed $n$ software metrics, and $\ln(sm_1), \ln(sm_2), \ldots, \ln(sm_n)$ represent the less skewed value of each software metric.

3.3.2. Normalization

Data normalization is a commonly used technique that transforms large data value ranges into small range values (or binary values). Notably, we perform min-max normalization [93] because it is a commonly used normalization approach due to its high accuracy and high learning speed [23,94–96]. Moreover, min-max normalization does not change the dataset distribution [97].

We perform data normalization for the following reasons. First, the range of extracted software metrics varies greatly. Second, it is compulsory when using deep learning algorithms [98,99]. Finally, it may reduce the estimation errors and calculation time (required in the training process) [100]. Min-max normalization can be defined as follows:

$$\mathrm{Normalization}(sm'_i) = \frac{sm'_i - \min(sm')}{\max(sm') - \min(sm')} \tag{5}$$

where $\min(sm')$ and $\max(sm')$ represent the minimum and maximum values of a set of software metric $sm'$, respectively, $sm'_i$ represents a logarithmic transformation value of the metric, and $\mathrm{Normalization}(sm'_i)$ represents the normalized value of $sm'_i$. After data normalization, module $m$ can be defined as follows:

$$m = \langle sm''_1, sm''_2, \ldots, sm''_n \rangle \tag{6}$$

where $m$ is a set of log-transformed and data normalized software metrics ($sm''_i$).

3.4. Defect prediction using deep learning

We propose a deep learning neural network-based model for defect prediction. An overview of this fully connected neural network is shown in Fig. 2, consisting of one input layer, two hidden layers, and one output layer. We preprocess the software metrics and input them to the neural network (as presented in Eq. (6)). The neural network predicts the number of defects y for each module, where W1, W2, ..., W3 are the weights of inputs (i.e., a set of preprocessed software metrics) to the neurons. We set the input dimension, output dimension, kernel initializer and activation function for each layer as follows:

• First layer: input_dim = 11, output_dim = 20, kernel_initializer
• Third layer: output_dim = 6, kernel_initializer = ‘normal’ and activation function = ‘ReLU’.
• Last layer: output_dim = 1 and kernel_initializer = ‘normal’.

Notably, we set different input_dim values of the neural network for different datasets because the number of software metrics differs. Because the Medical Imaging System (MIS) dataset¹ and the NASA Metrics Data Program (MDP) for NASA PROMISE (KC2) dataset² have 11 and 21 software metrics, we set input_dim to 11 and 21 for them, respectively. The output of the neural network is the predicted number of defects in the software module.

We explored using different neural network model architectures. In our algorithm, the input and output dimensions are fixed. We can change the number of layers and the activation function of the neural network. For the number of layers, we use a strategy similar to a greedy search. We changed the number of layers from one to four and added one layer to the neural network architecture each time. Finally, we found that the above parameter setting achieves the best performance.

4. Evaluation

In this section, we evaluate the proposed approach against three state-of-the-art approaches, SVR [101], FSVR [18] and DTR [26], on two well-known datasets: the MIS dataset and the KC2 dataset.

4.1. Research questions

The evaluation investigates the following research questions:

• RQ1: Does the proposed approach outperform existing approaches?
• RQ2: How efficient is the proposed approach? How long does it take to train the neural network-based regression model, and how long does it take to perform defect prediction?

Research question RQ1 investigates the effectiveness of the proposed model against the SVR, FSVR, and DTR models. We choose the SVR, FSVR, and DTR models for the following reasons. First, these state-of-the-art approaches provide automatic predictions of the number of defects in software modules. Second, to the best of our knowledge, these state-of-the-art approaches have been reported to be more accurate than other related approaches [18,26,101].

Research question RQ2 reflects the time complexity of the proposed approach. Generally speaking, training a deep learning model requires a long time. To compute the time complexity of the proposed approach, we recorded the training and testing (prediction) times to reveal the time efficiency of the compared approaches.

4.2. Datasets and metrics

We reuse the software metrics extracted from the MIS dataset and the KC2 dataset. Notably, the selected software metrics (mentioned in Tables 1 and 2) from both datasets (MIS and KC2) are the most commonly used software metrics for defect prediction [5].

We collected the MIS dataset from the compact disc that accompanies the “Handbook of Software Reliability Engineering” [102]. The MIS dataset is a commercial medical imaging system consisting of 390 records, and approximately 400,000 lines of code [103]. This application can be divided into 4500 modules. It is written in Pascal, Fortran, assembly language, and PL/M (the Intel-8
= ‘uniform’ and activation function = ‘Tanh’.
• Second layer, output_dim = 10, kernel_initializer = ‘normal’ and 1
http://www.cse.cuhk.edu.hk/∼lyu/book/reliability/.
activation function = ‘ReLU’. 2
http://promise.site.uottawa.ca/SERepository/datasets-page.html.
L. Qiao, X. Li and Q. Umer et al. / Neurocomputing 385 (2020) 100–110 105

Fig. 2. Neural network overview.

Table 1 ond dimension is Halstead. The Halstead metric is commonly used


Metrics of MIS.
for defect prediction [5]. The third dimension is line count, includ-
Software metrics Descriptions ing Halstead’s line count, Halstead’s count of comment lines, Hal-
LOC Total number of code lines stead’s count of blank lines, Halstead’s code and comment count
CL Total code lines for each module. The fourth dimension, operator operand, includes
TChar Total number of characters the number of unique operators, number of unique operands, total
TComm Total comments operators and total operands. The fifth dimension is branch, which
MChar Number of comment characters
considers only the total number of branches.
DChar Number of code characters
N Halstead’s program length
NE Halstead’s estimated program length 4.3. Performance criteria
NF Jensen’s estimated program length
V(G) McCabe’s cyclomatic complexity
BW Belady’s bandwidth metric
The proposed model is evaluated on the first 80% of the mod-
ules and the last 20% of the modules (mentioned in Section 4.4).
Notably, we use the number of defects for evaluating and calculat-
Table 2 ing the mean squared error (denoted as MSE) [105] and the coef-
Metrics of KC2.
ficient of determination (denoted as R2 ) [106] to evaluate the per-
Metric type Software metrics Descriptions formance of the defect prediction model.
McCabe LOC McCabe’s code line count The number of defects is the number of defects contained in a
V(G) McCabe’s “cyclomatic complexity” software module. For a given module, our defect prediction model
EV(G) McCabe’s “essential complexity” and other state-of-the-art approaches can predict the number of
IV(G) McCabe “design complexity”
defects. The closer the number of predicted defects is to the actual
Halstead N Halstead total operators + operands
V Halstead’s “volume” for each module number of defects contained in a module, the better the perfor-
L Halstead’s program length mance of the defect prediction model.
D Halstead’s “difficulty” The MSE measures the average of the squared errors—the
I Halstead’s “intelligence” squared difference between the estimated parameter and the true
E Halstead’s “effort”
B Halstead
value of the parameter [107]. The MSE measures the quality of a
T Halstead’s time estimator prediction model and better values are non-negative and closer to
Line count LOCode Halstead’s line count zero. Thus, a smaller MSE represents better prediction model per-
LOComment Halstead’s count of comment lines formance [105]. Moreover, a smaller MSE means that the predic-
LOBlank Halstead’s count of blank lines
tion model is closer to the optimal model [108,109]. The MSE can
LOCodeAnd Halstead’s code and comment count
Comment for each module be computed as follows [110]:
Uniq_Op Unique operators
1
n
Uniq_Opnd Unique operands
MSE = (yi − yˆi )2 (7)
Operator operand Total_Op Total operators n
i=1
Total_Opnd Total operands
Branch BranchCount of the flow graph where yi represents the vector of actual values of the variable, yˆi
represents the predicted variable values, and n is the number of
data points [111].
programming language for microcomputers). The software metrics R2 measures the proportion of the variance in the dependent
of MIS are briefly introduced in Table 1. variable that is predictable from the independent variables [112],
We collected the KC2 dataset from the PROMISE software and it is calculated by squaring the correlation coefficient be-
dataset repository [104]. The KC2 dataset includes five dimen- tween the observed and predicted values in a regression model
sions, i.e., McCabe, Halstead, LineCount, Operator and Operand, and [112]. R2 represents the strength of the relationship between an
Branch. We used 21 software metrics of the KC2 dataset (e.g., LOC, independent and dependent variable(s) [106], and it measures the
V(G), EV(G), etc.) that are briefly introduced in Table 2. performance of a model. Its values range from 0 to 1. An R2
Tables 1 and 2 list the metrics in the MIS and KC2 dataset, re- value between 0 and 1 shows the extent to which the dependent
spectively. The metrics of MIS dataset include the total number of variable is predictable, which represents how well the regression
code lines (LOC), total code lines (CL),..., and Belady’s bandwidth model fits the datasets. An R2 value closer to 1 indicates that the
metric (BW). The KC2 dataset involves twenty-one software met- model has achieved a good fit to this problem and represents the
rics and five dimensions. The first dimension is McCabe. The sec- strength of the relationship between an independent and depen-
106 L. Qiao, X. Li and Q. Umer et al. / Neurocomputing 385 (2020) 100–110

Table 3 Table 4
Experimental results on the complete dataset. Total number of defects in the last 20% of the modules.

Approach MIS-MSE MIS-R2 KC2-MSE KC2-R2 Dataset Training SVR FSVR DTR Proposed
set approach
SVR 47.35 0.32 0.13 0.193
FSVR 57.64 0.41 0.127 0.228 MIS-20% 2521 1924 2101 1629 2472
DTR 66.30 0.245 0.126 0.234 KC2-20% 205 172 179 58 202
Proposed 46.01 0.42 0.109 0.297

Table 5
MSE of first 80% and last 20% subsets of MIS and KC2.
dent variable(s). A larger R2 value represents a better model fit
Approach MIS-80% MIS-20% KC2-80% KC2-20%
[113]. The R2 can be computed as follows:
SVR 5.9 128.79 0.017 4.292
n
2 i=1 (yˆi − ȳ )2 FSVR 6.41 93.68 0.016 3.656
R = n (8) DTR 6.24 121.23 0.007 3.22
i=1 (yi − ȳ )2 Proposed 6.05 54.91 0.005 2.970

where yi represents the observed values of the dependent variable,


ȳ is the mean value, and yˆi is the predicted value. Table 6
MSE for training and testing subsets.
4.4. Process Approach MIS-train MIS-test KC2-train KC2-test

SVR 68.89 72.33 0.825 0.976


Feng et al. [114] proposed that a small number of modules FSVR 60.85 63.61 0.885 0.989
contain most of the defects. Similarly, Koru et al. [115] proposed DTR 57.8 63.61 0.922 1.36
that smaller modules are proportionally more defective than larger Proposed 46.56 51.07 0.754 0.958
ones. Some other empirical studies show that only a few (approxi-
mately 20%) files contain nearly 80 percent faults (i.e., Pareto prin-
The rows show the performance results of SVR, FSVR, DTR, and
ciple, 80/20 rule) [27,114,116]. Software testing engineers should al-
the proposed approach. The MSE and R2 of SVR, FSVR, DTR, and
locate more resources to the modules that contain more defects.
the proposed approach on the MIS dataset are (47.35, 0.32), (57.64,
We follow Yan et al. [18] and rank the dataset based on the
0.41), (66.30, 0.245), and (46.01, 0.42), respectively. Similarly, the
number of defects in software modules in ascending order. Finally,
MSE and R2 of SVR, FSVR, DTR, and the proposed approach on KC2
we divide the datasets into the first 80% and last 20% of modules
dataset are (0.13, 0.193), (0.127, 0.228), (0.126, 0.234), and (0.109,
based on their number of defects. We use the first 80% of modules
0.297), respectively.
for training and the last 20% of modules for prediction. For testing
Table 4 presents the number of predicted defects achieved by
data, we predict the number of defects of every module and then
the proposed approach compared to SVR, FSVR, and DTR on the last
add the number of defects to calculate the total number of defects
20% of modules of the MIS and KC2 datasets. The first column lists
based on the rule: predicting the number of defects 4 and 5 is
the datasets. Column 2 lists the total number of defects in the last
more accurate than predicting the number of defects 6 and 5 when
20% of modules in the training datasets of the MIS and KC2, and
two modules are input with 5 and 6 defects. Notably, we use cross-
columns 3–6 present the performance results of SVR, FSVR, DTR,
validation to evaluate the proposed approach on both datasets.
and the proposed approach, respectively. The rows show the per-
Cross-validation is used to flag problems such as overfitting or
formance results of these approaches on the last 20% of modules
selection bias. To validate the defect prediction model, we conduct
of the MIS and KC2 datasets, respectively. We use the training data
a k-fold (k = 10) cross-validation on two state-of-the-art publicly
to train the defect prediction model. Then, the defect prediction
available datasets. In 10-fold cross-validation, the datasets are ran-
model is used to predict the number of defects in the last 20%
domly divided into 10 equal subsets. For each fold of the 10-fold
modules. We predict the number of defects of every module of the
cross-validation, one subset is used as testing data, and the re-
testing data. Finally, we add the number of defects to calculate the
maining nine subsets are used as training data. Thus, each subset
total number of defects.
is used as testing data only once. The following process is used for
Table 5 presents the MSE results achieved by the proposed ap-
each fold of the evaluation:
proach compared to SVR, FSVR, and DTR on the first 80% of mod-
• Step 1, we train the proposed approach with the training ules and the last 20% of modules of the MIS and KC2 datasets. The
datasets and obtain the trained model. first column presents the approaches, and columns 2&3 and 4&5
• Step 2, we evaluate the trained models (the proposed approach, list the performance results of the first 80% modules and the last
SVR, FSVR, and DTR) on the testing data. 20% modules on the MIS and KC2 datasets, respectively. The rows
• Step 3, we calculate the performance metrics (i.e., MSE and R2 ) show the performance results of SVR, FSVR, DTR, and the proposed
for each of the evaluated approaches. approach.
The MIS dataset contains 78 modules in the last 20% of mod-
4.5. Results ules. We randomly select 10 samples from these 78 modules as the
testing dataset. Other modules and the first 80% of modules are re-
4.5.1. RQ1: comparison with three state-of-the-art approaches garded as the training dataset. This process is repeated 10 times for
To answer research question RQ1, we conduct a 10-fold cross- 10-fold cross-validation. We then calculate the average MSE. The
validation on two well-known datasets and compare the proposed same experiment is performed on the KC2 dataset. Table 6 presents
approach against the three state-of-the-art approaches (i.e., SVR, the average MSE achieved by the proposed approach against SVR,
FSVR, and DTR). The evaluation results are presented in Tables 3–6. FSVR, and DTR on the training and testing samples of the MIS and
Table 3 presents the performance results of SVR, FSVR, DTR, and KC datasets. The first column lists the approaches, and columns
the proposed approach on the complete dataset and we treat the 2&3 and 4&5 list the performance results with the training and
whole dataset as the training dataset. The first column lists the testing samples on the MIS and KC2 datasets, respectively. The
approaches, and columns 2&3 and 4&5 present the MSE and R2 rows show the performance results of SVR, FSVR, DTR, and the pro-
performance results on the MIS and KC2 datasets, respectively. posed approach.
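
The min-max normalization of Eq. (5) and the two performance criteria of Eqs. (7) and (8) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, mapping a constant metric to 0.0 is an assumption made here to avoid division by zero, and R² follows Eq. (8) as printed (explained over total sum of squares), which can differ from the 1 − SSE/SST definition used by some libraries.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Eq. (5): rescale one (already log-transformed) metric to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption (not from the paper): a constant metric maps to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (7): mean of the squared differences between y_i and y-hat_i."""
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Eq. (8): sum((y-hat_i - y-bar)^2) / sum((y_i - y-bar)^2)."""
    y_bar = sum(actual) / len(actual)
    explained = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
    total = sum((y - y_bar) ** 2 for y in actual)
    if total == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return explained / total


if __name__ == "__main__":
    # Two modules with 5 and 6 actual defects, predicted as 4 and 5.
    print(min_max_normalize([2.0, 4.0, 6.0]))
    print(mse([5, 6], [4, 5]))
    print(r_squared([1, 2, 3], [1, 2, 3]))
```

Each metric column would be normalized separately before being fed to a regression model, and the two criteria would be evaluated on the predicted and actual defect counts of the test modules.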
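
The evaluation process of Section 4.4 (ranking modules by defect count, the 80%/20% split, and 10-fold cross-validation) and the relative differences used when comparing MSE values can be sketched as follows. This is only an illustrative sketch: the helper names, the `(name, defects)` tuple layout, the integer percentage cut, and the fixed random seed are assumptions, not details taken from the paper.

```python
import random
from typing import Iterator, List, Sequence, Tuple

Module = Tuple[str, int]  # hypothetical layout: (module name, number of defects)


def ranked_split(modules: Sequence[Module],
                 train_percent: int = 80) -> Tuple[List[Module], List[Module]]:
    """Rank modules by defect count (ascending), then cut into first/last parts."""
    ranked = sorted(modules, key=lambda m: m[1])
    cut = len(ranked) * train_percent // 100
    return ranked[:cut], ranked[cut:]


def k_fold_indices(n_samples: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices); each index is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def relative_difference(baseline_mse: float, proposed_mse: float) -> float:
    """(baseline - proposed) / proposed, the form used in the MSE comparisons."""
    return (baseline_mse - proposed_mse) / proposed_mse


if __name__ == "__main__":
    # Synthetic stand-in for a 390-record dataset (defect counts are made up).
    modules = [("m%d" % i, i % 7) for i in range(390)]
    first_80, last_20 = ranked_split(modules)
    print(len(first_80), len(last_20))  # 312 78
    print(round(relative_difference(93.68, 54.91), 2))  # 0.71
```

With 390 records the split leaves 78 modules in the last 20%, which agrees with the module count stated for MIS, and the relative difference reproduces the 0.71 figure quoted for FSVR versus the proposed approach on that subset.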

From Tables 3–6, we can make the following observations:

• First, the proposed approach outperforms SVR, FSVR, and DTR on both the MIS and KC2 datasets. The MSE of MIS varies from 66.30 to 46.01, and the R² of MIS varies from 0.32 to 0.42. Similarly, the MSE of KC2 varies from 0.13 to 0.109, and the R² of KC2 varies from 0.193 to 0.297.
• Second, the proposed approach outperforms SVR, FSVR, and DTR by improving the number of predicted defects from 23 = (202 − 179) to 371 = (2472 − 2101).
• Third, the proposed approach outperforms SVR, FSVR, and DTR on MIS (last 20% of modules) and KC2 (last 20% of modules and first 80% of modules) datasets by 0.71 = (93.68 − 54.91)/54.91, 0.4 = (0.007 − 0.005)/0.005, and 0.08 = (3.22 − 2.97)/2.97, respectively. In contrast, SVR performs marginally better than the proposed approach on the first 80% of modules of MIS, by 0.03 = (6.05 − 5.9)/5.9. However, the proposed approach performs better than the other approaches (FSVR and DTR) on the first 80% of modules of MIS by 0.06 = (6.41 − 6.05)/6.05 and 0.03 = (6.24 − 6.05)/6.05, respectively.
• Finally, the proposed approach significantly outperforms the SVR, FSVR, and DTR models in terms of MSE on the training and testing sets of both the MIS and KC2 datasets. The performance improvement of the proposed approach against SVR, FSVR, and DTR on all training and testing sets is 0.24 = (57.80 − 46.56)/46.56, 0.25 = (63.61 − 51.07)/51.07, 0.09 = (0.825 − 0.754)/0.754, and 0.02 = (0.976 − 0.958)/0.958, respectively.

We note that our proposed approach, which leverages a neural network to predict the number of defects, is effective (i.e., our approach achieves lower MSE and higher R²) compared to the state-of-the-art approaches, and the improvement is achieved by our approach for the following reasons. First, in the fully connected neural network, a fully connected layer learns features from all the combinations of the features of the previous layer. The connection of each neuron to every neuron in the previous layer makes this process possible, and each connection has its own weight. Second, the neural network may use expressive, continuous-valued representations to learn and capture the discriminative and useful features from a dataset consisting of software metrics. Third, we preprocess the dataset adequately. Fourth, we use cross-validation to reduce the bias of our defect prediction model.

Based on the preceding analysis, we can conclude that the proposed approach is accurate and significantly outperforms the state-of-the-art approaches.

4.5.2. RQ2: efficiency of the proposed approach

To answer research question RQ2 (i.e., to investigate the efficiency of the deep learning-based regression model) we record the time cost of the training and prediction processes. The evaluation results suggest that the proposed approach is efficient. On average, training the proposed approach is completed within 3 minutes. Using the trained model, it takes 0.04 minutes to predict the number of defects. On average, the training process for SVR, FSVR, and DTR requires 0.08 minutes, 1.2 minutes and 0.9 minutes, respectively. The prediction times of the proposed approach, SVR, FSVR, and DTR are 0.04 minutes, 0.0005 minutes, 0.0003 minutes, and 0.0006 minutes, respectively.

5. Threats to validity

A threat to the construct validity of this study involves the selection of the performance criteria (i.e., MSE and R²) for evaluating the defect prediction model. We adopted these metrics because they have been widely used in previous studies [18,101].

A threat to the internal validity of this study involves the replication of the competing approaches and the parameters of the defect prediction models. Internal validity is concerned with various uncontrolled internal factors that may affect the results. To mitigate this threat, we used third-party libraries, such as packages from scikit-learn, and double checked the implementation of the competing approaches.

The first threat to external validity involves dataset selection. The different metrics used for the different datasets may affect the defect prediction performance, which in turn may affect the performance of the proposed approach on other datasets.

The second threat to external validity involves the generalizability of the proposed approach. We evaluated our proposed approach on only two publicly available datasets. Thus, the results may not be generalizable to other software projects, such as commercial projects.

6. Conclusions and future work

In this paper, we proposed a deep learning-based approach to predict the number of defects in software modules. Our proposed approach trains a deep learning model to predict the number of defects. On the given datasets, the performance improvement of the proposed approach upon the SVR, FSVR, and DTR is significant. Compared with these state-of-the-art approaches on two well-known datasets, the proposed approach achieves a significant reduction in the mean square error (varying between 3% and 13%) and improves the squared correlation coefficient (varying between 2% and 27%).

In future work, we plan to investigate the prediction of the number of defects in software modules by including more projects written in different programming languages and commercial projects from industry. We are also interested in predicting the number of defects from the change level rather than from the module level. Cooperating with software testing engineers to construct new software complexity metric datasets would be another interesting research task. In the defect prediction field, additional new and commercial datasets are needed. It would also be interesting to investigate new performance criteria for defect prediction models. Finally, we plan to investigate effort-aware defect prediction models and attempt to determine the agent of the effort through an effort-aware defect prediction model and the relations between the effort and the agent.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The research work described in this paper was partially supported by the National Natural Science Foundation of China (61772071) and the grant from the Joint Research Fund in Astronomy (U1531242) under cooperative agreement between the NSFC and CAS.

References

[1] D. Bowes, T. Hall, J. Petrić, Software defect prediction: do different classifiers find the same defects? Softw. Qual. J. 26 (2) (2018) 525–552, doi:10.1007/s11219-016-9353-3.
[2] N.E. Fenton, M. Neil, A critique of software defect prediction models, IEEE Trans. Softw. Eng. 25 (5) (1999) 675–689, doi:10.1109/32.815326.
[3] V.A. Prakash, D.V. Ashoka, V.N.M. Aradya, Application of data mining techniques for defect detection and classification, in: Proceedings of the Third International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014, 327, Springer International Publishing, Cham, 2015, pp. 387–395.
[4] G. Tassey, The economic impacts of inadequate infrastructure for software testing, Tech. Rep., National Institute of Standards and Technology (2002) 85–87.

[5] J. Nam, W. Fu, S. Kim, T. Menzies, L. Tan, Heterogeneous defect prediction, International Conference on Software Engineering, Vol. 1, IEEE Press, Piscat-
IEEE Trans. Softw. Eng. 44 (9) (2018) 874–896, doi:10.1109/TSE.2017.2720603. away, NJ, USA, 2015, pp. 812–823.
[6] K.E. Bennin, J. Keung, P. Phannachitta, A. Monden, S. Mensah, Mahakil: di- [29] X. Yu, J. Liu, Z. Yang, X. Jia, Q. Ling, S. Ye, Learning from imbalanced
versity based oversampling approach to alleviate the class imbalance issue data for predicting the number of software defects, in: Proceedings of the
in software defect prediction, IEEE Trans. Softw. Eng. 44 (6) (2018) 534–550, 2017 IEEE Twenty-eighth International Symposium on Software Reliability
doi:10.1109/TSE.2017.2731766. Engineering (ISSRE), IEEE Computer Society, Los Alamitos, CA, USA, 2017,
[7] M. Yan, Y. Fang, D. Lo, X. Xia, X. Zhang, File-level defect prediction: Unsu- pp. 78–89, doi:10.1109/ISSRE.2017.18. https://doi.ieeecomputersociety.org/10.
pervised vs. supervised models, in: Proceedings of the Eleventh ACM/IEEE 1109/ISSRE.2017.18.
International Symposium on Empirical Software Engineering and Measure- [30] S.S. Rathore, S. Kumar, A decision tree regression based approach for the
ment, ESEM ’17, IEEE Press, Piscataway, NJ, USA, 2017, pp. 344–353, doi:10. number of software faults prediction, SIGSOFT Softw. Eng. Notes 41 (1) (2016)
1109/ESEM.2017.48. 1–6, doi:10.1145/2853073.2853083.
[8] Y. Kamei, T. Fukushima, S. McIntosh, K. Yamashita, N. Ubayashi, A.E. Has- [31] M. Chen, Y. Ma, An empirical study on predicting defect numbers, in:
san, Studying just-in-time defect prediction using cross-project models, Em- Proceedings of the 27th International Conference on Software Engineer-
pir. Softw. Eng. 21 (5) (2016) 2072–2106, doi:10.1007/s10664- 015- 9400- x. ing and Knowledge Engineering, SEKE’15, 2015, pp. 397–402, doi:10.18293/
[9] Y. Kamei, E. Shihab, B. Adams, A.E. Hassan, A. Mockus, A. Sinha, N. Ubayashi, SEKE2015-132.
A large-scale empirical study of just-in-time quality assurance, IEEE Trans. [32] Y. Qian, M. Bi, T. Tan, K. Yu, Very Deep Convolutional Neural Networks for
Softw. Eng. 39 (6) (2013) 757–773, doi:10.1109/TSE.2012.70. Noise Robust Speech Recognition, IEEE/ACM Transactions on Audio, Speech,
[10] X.Y. Jing, S. Ying, Z.W. Zhang, S.S. Wu, J. Liu, Dictionary learning based soft- and Language Processing 24 (12) (2016) 2263–2276, doi:10.1109/TASLP.2016.
ware defect prediction, in: Proceedings of the Thirty-Sixth International Con- 2602884.
ference on Software Engineering, ICSE 2014, ACM, New York, NY, USA, 2014, [33] T. Young, D. Hazarika, S. Poria, E. Cambria, Recent trends in deep learning
pp. 414–423. based natural language processing, IEEE Computational Intelligence Magazine
[11] M. Tan, L. Tan, S. Dara, C. Mayeux, Online defect prediction for imbalanced 13 (3) (2017) 55–75, doi:10.1109/MCI.2018.2840738.
data, in: Proceedings of the Thirty-seventh International Conference on Soft- [34] T. Goswami, Impact of deep learning in image processing and computer
ware Engineering – Volume 2, ICSE ’15, IEEE Press, Piscataway, NJ, USA, 2015, vision, in: Microelectronics, Electromagnetics and Telecommunications, 471,
pp. 99–108. http://dl.acm.org/citation.cfm?id=2819009.2819026. Springer Singapore, Singapore, 2018, pp. 475–485.
[12] A. Schröter, T. Zimmermann, A. Zeller, Predicting component failures at de- [35] Z. Wan, X. Xia, A.E. Hassan, D. Lo, J. Yin, X. Yang, Perceptions, expectations,
sign time, in: Proceedings of the 2006 ACM/IEEE International Symposium and challenges in defect prediction, IEEE Trans. Softw. Eng. (2018) 1, doi:10.
on Empirical Software Engineering, ISESE ’06, ACM, New York, NY, USA, 2006, 1109/TSE.2018.2877678.
pp. 18–27, doi:10.1145/1159733.1159739. [36] M.H. Tang, M.H. Kao, M.H. Chen, An empirical study on object-oriented
[13] H. Hata, O. Mizuno, T. Kikuno, Bug prediction based on fine-grained mod- metrics, in: Proceedings Sixth International Software Metrics Symposium
ule histories, in: Proceedings of the Thirty-forth International Conference on (Cat. No.PR00403), IEEE Computer Society, Washington, DC, USA, 1999,
Software Engineering, ICSE ’12, IEEE Press, Piscataway, NJ, USA, 2012, pp. 200– pp. 242–249.
210. http://dl.acm.org/citation.cfm?id=2337223.2337247. [37] S.R. Chidamber, C.F. Kemerer, A metrics suite for object oriented design, IEEE
[14] H. Tang, T. Lan, D. Hao, L. Zhang, Enhancing defect prediction with static Trans. Softw. Eng. 20 (6) (1994) 476–493, doi:10.1109/32.295895.
defect analysis, in: Proceedings of the Seventh Asia-Pacific Symposium on [38] T. Yu, W. Wen, X. Han, J. Hayes, Conpredictor: concurrency defect prediction
Internetware, Internetware ’15, ACM, New York, NY, USA, 2015, pp. 43–51, in real-world applications, IEEE Trans. Softw. Eng. (2018) , 1–1, doi:10.1109/
doi:10.1145/2875913.2875922. TSE.2018.2791521.
[15] M.H. Halstead, Elements of Software Science (Operating and Programming [39] T. Zimmermann, R. Premraj, A. Zeller, Predicting defects for eclipse, in: Pro-
Systems Series), Elsevier Science Inc., New York, NY, USA, 1977. ceedings of the Third International Workshop on Predictor Models in Soft-
[16] T.J. McCabe, A complexity measure, in: Proceedings of the Second Interna- ware Engineering (PROMISE’07: ICSE Workshops 2007), IEEE Computer Soci-
tional Conference on Software Engineering, ICSE ’76, IEEE Computer Society ety, Washington, DC, USA, 2007, p. 99, doi:10.1109/PROMISE.2007.10.
Press, Piscataway, NJ, USA, 1976, pp. 308–320, doi:10.1109/TSE.1976.233837. [40] C. Catal, B. Diri, A systematic review of software fault prediction studies, Ex-
[17] F. Xing, P. Guo, Support vector regression for software reliability growth mod- pert Syst. Appl. 36 (4) (2009) 7346–7354, doi:10.1016/j.eswa.2008.10.027.
eling and prediction, in: Proceedings of the Advances in Neural Networks [41] M. D’Ambros, M. Lanza, R. Robbes, Evaluating defect prediction approaches:
– ISNN 2005, 3496, Springer Berlin Heidelberg, Berlin, Heidelberg, 2005, a benchmark and an extensive comparison, Empir. Softw. Eng. 17 (4) (2012)
pp. 925–930. 531–577, doi:10.1007/s10664- 011- 9173- 9.
[18] Z. Yan, X. Chen, P. Guo, Software defect prediction using fuzzy support vector [42] J. Nam, S. Kim, Clami: Defect prediction on unlabeled datasets (t), in: Pro-
regression, in: Proceedings of the Advances in Neural Networks – ISNN 2010, ceedings of the 2015 Thirtieth IEEE/ACM International Conference on Au-
Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 17–24. tomated Software Engineering (ASE), IEEE Press, Piscataway, NJ, USA, 2015,
[19] K.E. Bennin, K. Toda, Y. Kamei, J. Keung, A. Monden, N. Ubayashi, Empiri- pp. 452–463, doi:10.1109/ASE.2015.56.
cal evaluation of cross-release effort-aware defect prediction models, in: Pro- [43] R. Ozakinci, A. Tarhan, The role of process in early software defect predic-
ceedings of the 2016 IEEE International Conference on Software Quality, Reli- tion: methods, attributes and metrics, in: Software Process Improvement and
ability and Security (QRS), IEEE, 2016, pp. 214–221. Capability Determination, 609, Springer International Publishing, Cham, 2016,
[20] S. Amasaki, Cross-version defect prediction using cross-project defect predic- pp. 287–300.
tion approaches: Does it work? in: Proceedings of the Fourteenth Interna- [44] F. Zhang, Q. Zheng, Y. Zou, A.E. Hassan, Cross-project defect prediction us-
tional Conference on Predictive Models and Data Analytics in Software Engi- ing a connectivity-based unsupervised classifier, in: Proceedings of the 2016
neering, PROMISE’18, ACM, New York, NY, USA, 2018, pp. 32–41, doi:10.1145/ IEEE/ACM Thirty-Eighth International Conference on Software Engineering
3273934.3273938. (ICSE), ACM, New York, NY, USA, 2016, pp. 309–320, doi:10.1145/2884781.
[21] S. Lessmann, B. Baesens, C. Mues, S. Pietsch, Benchmarking classification 2884839.
models for software defect prediction: a proposed framework and novel find- [45] Y. Yang, Y. Zhou, J. Liu, Y. Zhao, H. Lu, L. Xu, B. Xu, H. Leung, Effort-aware just-
ings, IEEE Trans. Softw. Eng. 34 (4) (2008) 485–496, doi:10.1109/TSE.2008.35. in-time defect prediction: Simple unsupervised models could be better than
[22] T. Jiang, L. Tan, S. Kim, Personalized defect prediction, in: Proceedings of the supervised models, in: Proceedings of the 2016 Twenty-forth ACM SIGSOFT
2013 Twenty-Eighth IEEE/ACM International Conference on Automated Soft- International Symposium on Foundations of Software Engineering, FSE 2016,
ware Engineering (ASE), IEEE Press, Piscataway, NJ, USA, 2013, pp. 279–289. ACM, New York, NY, USA, 2016, pp. 157–168, doi:10.1145/2950290.2950353.
[23] T. Menzies, A. Butcher, D. Cok, A. Marcus, L. Layman, F. Shull, B. Turhan, [46] P. Guo, M. Lyu, A pseudoinverse learning algorithm for feedforward neu-
T. Zimmermann, Local versus global lessons for defect prediction and effort ral networks with stacked generalization applications to software reliability
estimation, IEEE Trans. Softw. Eng. 39 (6) (2013) 822–834, doi:10.1109/TSE. growth data, Neurocomputing 56 (1) (2004) 101–121.
2012.83. [47] F. Xing, P. Guo, M.R. Lyu, A novel method for early software quality pre-
[24] E. Kocaguneli, T. Menzies, A. Bener, J.W. Keung, Exploiting the essential as- diction based on support vector machine, in: Proceedings of the Sixteenth
sumptions of analogy-based effort estimation, IEEE Trans. Softw. Eng. 38 (2) IEEE International Symposium on Software Reliability Engineering (ISSRE’05),
(2012) 425–438, doi:10.1109/TSE.2011.27. IEEE Computer Society, Washington, DC, USA, 2005, pp. 213–222, doi:10.1109/
[25] B. Ghotra, S. McIntosh, A.E. Hassan, Revisiting the impact of classification ISSRE.2005.6.
techniques on the performance of defect prediction models, in: Proceed- [48] S. Zhong, T.M. Khoshgoftaar, N. Seliya, Unsupervised learning for expert-based
ings of the Thirty-seventh International Conference on Software Engineering software quality estimation, in: Proceedings of the Eighth IEEE International
– Volume 1, ICSE ’15, IEEE Press, Piscataway, NJ, USA, 2015, pp. 789–800. Conference on High Assurance Systems Engineering, IEEE Computer Society,
http://dl.acm.org/citation.cfm?id=2818754.2818850. Washington, DC, USA, 2004, pp. 149–155. http://dl.acm.org/citation.cfm?id=
[26] S.S. Rathore, S. Kumar, An empirical study of some software fault prediction 1890580.1890595.
techniques for the number of faults prediction, Soft Comput. 21 (24) (2017) [49] S. Herbold, A. Trautsch, J. Grabowski, A comparative study to benchmark
7417–7434, doi:10.10 07/s0 050 0- 016- 2284- x. cross-project defect prediction approaches, IEEE Trans. Softw. Eng. 44 (9)
[27] T.J. Ostrand, E.J. Weyuker, R.M. Bell, Predicting the location and number of (2018) 811–833, doi:10.1109/TSE.2017.2724538.
faults in large software systems, IEEE Trans. Softw. Eng. 31 (4) (2005) 340– [50] A.E.C. Cruz, K. Ochimizu, Towards logistic regression models for predict-
355, doi:10.1109/TSE.2005.49. ing fault-prone code across software projects, in: Proceedings of the 2009
[28] C. Tantithamthavorn, S. McIntosh, A.E. Hassan, A. Ihara, K. Matsumoto, The Third International Symposium on Empirical Software Engineering and Mea-
impact of mislabelling on the performance and interpretation of defect pre- surement, IEEE Computer Society, Washington, DC, USA, 2009, pp. 460–463,
diction models, in: Proceedings of the 2015 IEEE/ACM Thirty-Seventh IEEE doi:10.1109/ESEM.2009.5316002.
[51] I.H. Laradji, M. Alshayeb, L. Ghouti, Software Defect Prediction Using Ensemble Learning on Selected Features, Inf. Softw. Technol. (2015) 388–402.
[52] C. Tantithamthavorn, S. McIntosh, A.E. Hassan, K. Matsumoto, Automated parameter optimization of classification techniques for defect prediction models, in: 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE), ACM, New York, NY, USA, 2016, pp. 321–332.
[53] C. Tantithamthavorn, S. McIntosh, A.E. Hassan, K. Matsumoto, The impact of automated parameter optimization on defect prediction models, IEEE Trans. Softw. Eng. (2018) 11, doi:10.1109/TSE.2018.2794977.
[54] B. Ghotra, S. Mcintosh, A.E. Hassan, A large-scale study of the impact of feature selection techniques on defect classification models, in: Proceedings of the Fourteenth International Conference on Mining Software Repositories, MSR ’17, IEEE Press, Piscataway, NJ, USA, 2017, pp. 146–157, doi:10.1109/MSR.2017.18.
[55] P. He, B. Li, X. Liu, J. Chen, Y. Ma, An empirical study on software defect prediction with a simplified metric set, Inf. Softw. Technol. 59 (C) (2015) 170–190, doi:10.1016/j.infsof.2014.11.006.
[56] Y. Zhang, D. Lo, X. Xia, J. Sun, Combined classifier for cross-project defect prediction: an extended empirical study, Front. Comput. Sci. 12 (2) (2018) 280–296, doi:10.1007/s11704-017-6015-y.
[57] A. Mockus, D.M. Weiss, Predicting risk of software changes, Bell Labs Tech. J. 5 (2) (2000) 169–180, doi:10.1002/bltj.2229.
[58] N. Ohlsson, H. Alberg, Predicting fault-prone software modules in telephone switches, IEEE Trans. Softw. Eng. 22 (12) (1996) 886–894, doi:10.1109/32.553637.
[59] X. Yang, K. Tang, X. Yao, A learning-to-rank approach to software defect prediction, IEEE Transactions on Reliability 64 (1) (2015) 234–246, doi:10.1109/TR.2014.2370891.
[60] E.A. Felix, S.P. Lee, Integrated approach to software defect prediction, IEEE Access 5 (2017) 21524–21547, doi:10.1109/ACCESS.2017.2759180.
[61] V. Vapnik, S.E. Golowich, A. Smola, Support vector method for function approximation, regression estimation and signal processing, in: Proceedings of the 9th International Conference on Neural Information Processing Systems, MIT Press, Cambridge, MA, USA, 1996, pp. 281–287. http://dl.acm.org/citation.cfm?id=2998981.2999021.
[62] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, Berlin, Heidelberg, 1995, pp. 267–290.
[63] H. Drucker, C.J.C. Burges, L. Kaufman, A. Smola, V. Vapnik, Support vector regression machines, in: Proceedings of the 9th International Conference on Neural Information Processing Systems, MIT Press, Cambridge, MA, USA, 1996, pp. 155–161. http://dl.acm.org/citation.cfm?id=2998981.2999003.
[64] S.W. Lee, D.S. Kim, M.G. Na, Prediction of dnbr using fuzzy support vector regression and uncertainty analysis, IEEE Trans. Nuclear Sci. 57 (3) (2010) 1595–1601, doi:10.1109/TNS.2010.2047265.
[65] T. Le, D. Tran, W. Ma, D. Sharma, A new fuzzy membership computation method for fuzzy support vector machines, in: Proceedings of the International Conference on Communications and Electronics 2010, IEEE, 2010, pp. 153–157.
[66] Y. Zhang, S. Xu, K. Chen, Z. Liu, C.L.P. Chen, Fuzzy density weight-based support vector regression for image denoising, Inf. Sci. 339 (2016) 175–188, doi:10.1016/j.ins.2016.01.007. http://www.sciencedirect.com/science/article/pii/S0020025516000098.
[67] Z. Sun, Y. Sun, Fuzzy support vector machine for regression estimation, in: Proceedings of the 2003 IEEE International Conference on Systems, Man and Cybernetics SMC’03. Conference Theme – System Security and Assurance (Cat. No.03CH37483), Vol. 4, 2003, pp. 3336–3341, doi:10.1109/ICSMC.2003.1244404.
[68] Z. Liu, S. Xu, C.L.P. Chen, Y. Zhang, X. Chen, Y. Wang, A three-domain fuzzy support vector regression for image denoising and experimental studies, IEEE Trans. Cybern. 44 (4) (2014) 516–525, doi:10.1109/TSMCC.2013.2258337.
[69] L. Chen, M. Zhou, M. Wu, J. She, Z. Liu, F. Dong, K. Hirota, Three-layer weighted fuzzy support vector regression for emotional intention understanding in human–robot interaction, IEEE Trans. Fuzzy Syst. 26 (5) (2018) 2524–2538, doi:10.1109/TFUZZ.2018.2809691.
[70] J.R. Quinlan, Induction of decision trees, Mach. Learn. 1 (1) (1986) 81–106, doi:10.1007/BF00116251.
[71] S. Naeem, T.M. Khoshgoftaar, The use of decision trees for cost-sensitive classification: an empirical study in software quality prediction, Wiley Interdiscipl. Rev. Data Min. Knowl. Discov. 1 (5) (2011) 448–459, doi:10.1002/widm.38.
[72] A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, Communications of the ACM 60 (6) (2017) 84–90, doi:10.1145/3065386.
[73] J.L. Elman, Finding structure in time, Cognit. Sci. 14 (2) (1990) 179–211, doi:10.1016/0364-0213(90)90002-E.
[74] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput. 9 (8) (1997) 1735–1780, doi:10.1162/neco.1997.9.8.1735.
[75] G. Zhao, J. Huang, Deepsim: Deep learning code functional similarity, in: Proceedings of the 2018 Twenty-sixth ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2018, ACM, New York, NY, USA, 2018, pp. 141–151, doi:10.1145/3236024.3236068.
[76] S. Ma, Y. Liu, W.-C. Lee, X. Zhang, A. Grama, Mode: Automated neural network model debugging via state differential analysis and input selection, in: Proceedings of the 2018 Twenty-sixth ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2018, ACM, New York, NY, USA, 2018, pp. 175–186, doi:10.1145/3236024.3236082.
[77] J. Guo, J. Cheng, J. Cleland-Huang, Semantically enhanced software traceability using deep learning techniques, in: Proceedings of the 2017 IEEE/ACM Thirty-Ninth International Conference on Software Engineering (ICSE), IEEE Press, Piscataway, NJ, USA, 2017, pp. 3–14.
[78] M. White, C. Vendome, M. Linares-Vásquez, D. Poshyvanyk, Toward deep learning software repositories, in: Proceedings of the Twelfth Working Conference on Mining Software Repositories, MSR ’15, IEEE Press, Piscataway, NJ, USA, 2015, pp. 334–345.
[79] M. Balog, A.L. Gaunt, M. Brockschmidt, S. Nowozin, D. Tarlow, Deepcoder: Learning to write programs, in: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 2017, pp. 1–19.
[80] Y. Tian, K. Pei, S. Jana, B. Ray, Deeptest: Automated testing of deep-neural-network-driven autonomous cars, in: Proceedings of the Fortieth International Conference on Software Engineering, ICSE ’18, ACM, New York, NY, USA, 2018, pp. 303–314, doi:10.1145/3180155.3180220.
[81] X. Huo, M. Li, Z.H. Zhou, Learning unified features from natural and programming languages for locating buggy source code, in: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI’16, AAAI Press, 2016, pp. 1606–1612. http://dl.acm.org/citation.cfm?id=3060832.3060845.
[82] X. Gu, H. Zhang, D. Zhang, S. Kim, Deep api learning, in: Proceedings of the 2016 Twenty-fourth ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2016, ACM, New York, NY, USA, 2016, pp. 631–642, doi:10.1145/2950290.2950334.
[83] M. White, M. Tufano, C. Vendome, D. Poshyvanyk, Deep learning code fragments for code clone detection, in: 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE), ACM, New York, NY, USA, 2016, pp. 87–98. http://doi.acm.org/10.1145/2970276.2970326.
[84] B. Xu, D. Ye, Z. Xing, X. Xia, G. Chen, S. Li, Predicting semantically linkable knowledge in developer online forums via convolutional neural network, in: Proceedings of the 2016 31st IEEE/ACM International Conference on Automated Software Engineering, ACM, New York, NY, USA, 2016, pp. 51–62. http://doi.acm.org/10.1145/2970276.2970357.
[85] A.N. Lam, A.T. Nguyen, H.A. Nguyen, T.N. Nguyen, Combining deep learning with information retrieval to localize buggy files for bug reports (n), in: Proceedings of the 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE Press, Piscataway, NJ, USA, 2015, pp. 476–481. https://doi.org/10.1109/ASE.2015.73.
[86] L. Mou, G. Li, L. Zhang, T. Wang, Z. Jin, Convolutional neural networks over tree structures for programming language processing, in: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI’16, AAAI Press, 2016, pp. 1287–1293. http://dl.acm.org/citation.cfm?id=3015812.3016002.
[87] X. Gu, H. Zhang, S. Kim, Deep code search, in: Proceedings of the Fortieth International Conference on Software Engineering, ICSE ’18, ACM, New York, NY, USA, 2018, pp. 933–944, doi:10.1145/3180155.3180167.
[88] V.J. Hellendoorn, C. Bird, E.T. Barr, M. Allamanis, Deep learning type inference, in: Proceedings of the 2018 Twentieth ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2018, ACM, New York, NY, USA, 2018, pp. 152–162, doi:10.1145/3236024.3236051.
[89] J. Henkel, S.K. Lahiri, B. Liblit, T. Reps, Code vectors: Understanding programs through embedded abstracted symbolic traces, in: Proceedings of the 2018 Twenty-sixth ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2018, ACM, New York, NY, USA, 2018, pp. 163–174, doi:10.1145/3236024.3236085.
[90] Q. Huang, X. Xia, D. Lo, Supervised vs unsupervised models: A holistic look at effort-aware just-in-time defect prediction, in: Proceedings of the 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), IEEE, 2017, pp. 159–170.
[91] F.R. Porto, L.L. Minku, E. Mendes, A. Simão, A systematic study of cross-project defect prediction with meta-learning, ArXiv abs/1802.06025 (2018) 6–7.
[92] C. Feng, H. Wang, N. Lu, T. Chen, H. He, Y. Lu, X.M. Tu, Log-transformation and its implications for data analysis, Shanghai Archives of Psychiatry 26 (2) (2014) 105–109, doi:10.3969/j.issn.1002-0829.2014.02.009.
[93] S.G.K. Patro, K.K. Sahu, Normalization: a preprocessing stage, ArXiv abs/1503.06462 (2015) 1–2.
[94] J. Pan, Y. Zhuang, S. Fong, The impact of data normalization on stock market prediction: using SVM and technical indicators, in: Proceedings of the 2016 Second International Conference Soft Computing in Data Science (SCDS 2016), 652, Springer Singapore, Singapore, 2016, pp. 72–78.
[95] J. Nam, S.J. Pan, S. Kim, Transfer defect learning, in: Proceedings of the 2013 35th International Conference on Software Engineering (ICSE), IEEE Press, Piscataway, NJ, USA, 2013, pp. 382–391. http://dl.acm.org/citation.cfm?id=2486788.2486839.
[96] V. Gajera, Shubham, R. Gupta, P.K. Jana, An effective multi-objective task scheduling algorithm using min-max normalization in cloud computing, in: Proceedings of the 2016 Second International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT), IEEE, 2016, pp. 812–816.
[97] Y.K. Jain, S.K. Bhandare, Min max normalization based data perturbation method for privacy protection, Int. J. Comput. Commun. Technol. 3 (4) (2014) 45–47.
[98] S. Bhanja, A. Das, Impact of data normalization on deep neural network for time series forecasting, ArXiv abs/1812.05519 (2018) 2–3.
[99] P. Sane, R. Agrawal, Pixel normalization from numeric data as input to neural networks: for machine learning and image processing, in: 2017 International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), IEEE, 2017, pp. 2221–2225.
[100] J. Sola, J. Sevilla, Importance of input data normalization for the application of neural networks to complex industrial problems, IEEE Trans. Nuclear Sci. 44 (3) (1997) 1464–1468, doi:10.1109/23.589532.
[101] X. Jin, Z. Liu, R. Bie, G. Zhao, J. Ma, Support vector machines for regression and applications to software quality prediction, in: Proceedings of the 6th International Conference on Computational Science, Springer Berlin Heidelberg, Berlin, Heidelberg, 2006, pp. 781–788.
[102] M.R. Lyu, Handbook of Software Reliability Engineering, McGraw-Hill, Inc., Hightstown, NJ, USA, 1996.
[103] S. Dick, A. Meeks, M. Last, H. Bunke, A. Kandel, Data mining in software metrics databases, Fuzzy Sets and Systems 145 (1) (2004) 81–110, doi:10.1016/j.fss.2003.10.006.
[104] Promise software engineering repository. http://promise.site.uottawa.ca/SERepository/datasets-page.html.
[105] M. Torabi, J. Rao, Estimation of mean squared error of model-based estimators of small area means under a nested error linear regression model, J. Multivar. Anal. 117 (2013) 76–87, doi:10.1016/j.jmva.2013.02.008.
[106] T.O. Kvalseth, Cautionary note about r², The American Statistician 39 (4) (1985) 279–285, doi:10.1080/00031305.1985.10479448.
[107] Z. Wang, A.C. Bovik, Mean squared error: love it or leave it? a new look at signal fidelity measures, IEEE Signal Process. Mag. 26 (1) (2009) 98–117, doi:10.1109/MSP.2008.930649.
[108] J.E. Figueroa-López, C. Mancini, Optimum thresholding using mean and conditional mean squared error, J. Econometr. 208 (2019) 179–210, doi:10.1016/j.jeconom.2018.09.011. http://www.jstor.org/stable/24310360.
[109] R.C. Steorts, M. Ghosh, On estimation of mean squared errors of benchmarked empirical Bayes estimators, Stat. Sin. 23 (2) (2013) 749–763.
[110] R. Niu, J. Lu, False information detection with minimum mean squared errors for Bayesian estimation, in: Proceedings of the 2015 Forty-Ninth Annual Conference on Information Sciences and Systems (CISS), IEEE, 2015, pp. 1–6.
[111] R. Hawkes, P. Date, A statistical test for the mean squared error, J. Stat. Comput. Simul. 63 (4) (1999) 321–347, doi:10.1080/00949659908811960.
[112] D.L.J. Alexander, A. Tropsha, D.A. Winkler, Beware of R²: simple, unambiguous assessment of the prediction accuracy of qsar and qspr models, J. Chem. Inf. Model. 55 (7) (2015) 1316–1322, doi:10.1021/acs.jcim.5b00206.
[113] A. Schneider, G. Hommel, M. Blettner, Linear regression analysis: part 14 of a series on evaluation of scientific publications, Deutsches Arzteblatt Int. 107 (44) (2010) 776–782, doi:10.3238/arztebl.2010.0776.
[114] N.E. Fenton, N. Ohlsson, Quantitative analysis of faults and failures in a complex software system, IEEE Trans. Softw. Eng. 26 (8) (2000) 797–814, doi:10.1109/32.879815.
[115] A.G. Koru, D. Zhang, K. El Emam, H. Liu, An investigation into the functional form of the size-defect relationship for software modules, IEEE Trans. Softw. Eng. 35 (2) (2009) 293–304, doi:10.1109/TSE.2008.90.
[116] T. Galinac Grbac, P. Runeson, D. Huljenić, A second replicated quantitative analysis of fault distributions in complex software systems, IEEE Trans. Softw. Eng. 39 (4) (2013) 462–476, doi:10.1109/TSE.2012.46.

Lei Qiao received B.S. and M.S. degrees in computer science and technology from North China University of Technology, Beijing, China. He is currently pursuing a Ph.D. degree in computer science and technology at Beijing Institute of Technology, Beijing, China. His research interests include defect prediction, software testing, machine learning and neural networks in software engineering.

Xuesong Li received his Ph.D. degree from the Center for Biomedical Imaging Research, Tsinghua University, and he is an assistant professor at Beijing Institute of Technology. His academic and clinical focus is the use of algorithms (deep learning, machine learning, big data analysis, etc.), advanced medical imaging techniques (sMRI, DTI, fMRI, etc.) and software defect prediction technology to improve clinical practice for neurological diseases.

Qasim Umer received a B.S. degree in computer science from Punjab University, Pakistan in 2006, an M.S. degree in .NET distributed system development from University of Hull, UK in 2009, and an M.S. degree in computer science from University of Hull, UK in 2012. He is currently pursuing a Ph.D. degree in computer science at Beijing Institute of Technology, China. He is particularly interested in machine learning, data mining and software maintenance.

Ping Guo is currently a professor at the School of Systems Science at Beijing Normal University and the founding Director of the Key Laboratory of Graphics, Image and Pattern Recognition at Beijing Normal University. He is an IEEE senior member and a CCF senior member. His research interests include computational intelligence theory as well as applications in pattern recognition, image processing, software reliability engineering, and astronomical data processing. He has published more than 360 papers, holds 6 patents, and is the author of two Chinese books: “Computational intelligence in software reliability engineering” and “Image semantic analysis”. He received the 2012 Beijing municipal government award of science and technology (third rank) entitled “regularization method and its application”. Professor Guo received his master’s degree in optics from the Department of Physics, Peking University, and his PhD degree in computer science from the Department of Computer Science and Engineering, Chinese University of Hong Kong.
