Research Article
Development of Integrated Neural Network Model for
Identification of Fake Reviews in E-Commerce Using
Multidomain Datasets
Received 28 February 2021; Revised 20 March 2021; Accepted 5 April 2021; Published 15 April 2021
Copyright © 2021 Saleh Nagi Alsubari et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
Online product reviews play a major role in the success or failure of an E-commerce business. Before procuring products or services, shoppers usually go through the online reviews posted by previous customers to learn the details of products and make purchasing decisions. Nevertheless, specific E-business products can be promoted or damaged by posting fake reviews, written by persons known as fraudsters. Such reviews can cause financial loss to E-commerce businesses and misguide consumers into making wrong decisions or searching for alternative products. Thus, a fake review detection system is essential for E-commerce business. The proposed methodology uses four standard fake review datasets drawn from multiple domains: hotel, restaurant, Yelp, and Amazon reviews. Preprocessing methods such as stopword removal, punctuation removal, and tokenization are performed, and a padding method gives every input sequence a fixed length during training, validation, and testing of the model. As this methodology uses datasets of different sizes, various input word-embedding matrices of the n-gram features of the review text are created with the help of a word-embedding layer, which is one component of the proposed model. Convolutional and max-pooling layers of the CNN technique are implemented for feature extraction and dimensionality reduction, respectively. An LSTM layer, based on gate mechanisms, is combined with the CNN technique to learn and handle the contextual information of the n-gram features of the review text. Finally, a sigmoid activation function in the last layer of the proposed model receives the input sequences from the previous layer and performs binary classification of the review text into fake or truthful. The proposed CNN-LSTM model was evaluated in two types of experiments: in-domain and cross-domain. In the in-domain experiment, the model is applied to each dataset individually, while in the cross-domain experiment, all datasets are gathered into a single data frame and evaluated as a whole. The in-domain testing accuracies were 77%, 85%, 86%, and 87% for the restaurant, hotel, Yelp, and Amazon datasets, respectively. In the cross-domain experiment, the proposed model attained 89% accuracy. Furthermore, a comparative analysis of the in-domain results with existing approaches, based on the accuracy metric, shows that the proposed model outperformed the compared methods.
customer reviews to detect product issues and to discover business intelligence about their competitors [1]. Fraudsters post fake comments, termed misleading reviews, to affect a business by manipulating the reputation of product brands. Fake reviews are divided into three types: (1) untrusted (fake) reviews, (2) reviews on the product brand only, and (3) nonreviews. Fake reviews of the first type are posted deliberately to mislead and deceive buyers and consumers. These reviews contain unjustly positive comments on particular desired products to promote them and unfavorable comments on worthy products to deprecate them; hyperactive fake reviewers are linked to this type of review. Reviews on the product brand only are the second type of fake review and can be created to manipulate product brands. Nonreviews are composed of two subsets, namely, (a) advertisements and (b) unrelated reviews [2]. Larger amounts of positive reviews lead shoppers and customers to buy products and enhance companies' financial benefits, whereas negative reviews can make customers search for substitute products, resulting in revenue loss. However, a significant number of review comments are generated across social media applications, complicating the extraction of opinions and making accurate findings difficult to obtain. In addition, there is no monitoring of the reliability of digital content generated on E-commerce websites, which encourages the creation of many low-quality reviews. Various companies hire persons to write fake reviews to raise the purchases of their online products and services. Such persons are known as fake reviewers or spammers, and the activities they perform are called review faking [3]. Therefore, the existence of fake and spam reviews makes the issue all the more important to handle, because such reviews can change the buying decisions of customers and shoppers. A large number of positive reviews encourages a consumer to purchase a product and improves the manufacturer's financial profits, whereas negative reviews push consumers to search for substitutes, thereby causing financial losses [3, 4]. Consumer-generated reviews can have a huge influence on the reputation of products and brands; hence, E-business companies may be motivated to produce positive fake reviews for their own products and negative deceptive reviews for their competitors' products [5-7]. Electronic commerce sites can be spammed with fake reviews in numerous ways, for instance, by hiring experts who specialize in generating fraudulent reviews, by recruiting review fraudsters through crowdsourcing websites, or by using automated bots for feedback [8, 9]. The capability of vendors to produce misleading opinions, as a way of either promoting their products or defaming the reputation of their competitors, is indeed worth remembering. Fake reviews have a tremendous influence on consumer satisfaction: when a consumer is tricked or misled by a fake review, that consumer will not use the E-commerce website again for purchasing. Ott et al. [10] reported that the average testing accuracy of human judges in distinguishing fake reviews from truthful ones is only about 57%; therefore, further research on identifying misleading (fake) reviews is required. A limitation of existing studies of fake/deceptive/spam review detection is the lack of automated methods for detecting and discriminating between fake and truthful reviews on online E-commerce websites. In order to mitigate the problems of online review mining systems, it is necessary to develop a model to detect and eliminate online fake reviews, given their effect on customers and E-commerce companies.

2. Literature Review

This section sheds light on the methods and datasets used in previous studies for fake/spam review detection. Online product reviews serve as guidelines that are widely used by potential customers when purchasing online; they influence whether or not to purchase a particular product, help identify problems with manufacturers' products, and yield intelligence about competitors in marketing research. Recently, media outlets such as the New York Times and the BBC have reported that counterfeit reviews are widespread in E-commerce; for example, a photography company was recently targeted by thousands of fraudulent reviews [11]. Over the last two decades, fake/spam review detection has become a popular topic of study. Since fake reviews have such a significant effect on E-commerce and customers, several researchers have conducted various studies on spam/fake review analysis.

2.1. Fake Review Detection Based on Machine Learning Methods. Jindal et al. [2] presented the first research on spam review detection. The authors treated duplicate or near-duplicate Amazon product reviews as fake reviews and used attributes of both the review text and the reviewer. They applied the logistic regression technique to classify reviews as truthful or fake, reaching 78% accuracy.

Ott et al. [10] utilized the crowdsourcing website Amazon Mechanical Turk to create a dataset, and a natural language processing tool was used to obtain linguistic features from the review contents. They trained and compared several types of supervised machine learning techniques. However, the results obtained on real market datasets were not very good. Lau et al. [11] presented a model for fake opinion identification using an LDA algorithm, named Topic Spam, that categorizes review text by calculating a spam-likelihood index from the small dissimilarity between the keyword distributions of spam and nonspam reviews.

Shojaee et al. [12] proposed syntactic, grammar-based, and lexicon-based attributes named stylometric attributes, which are used to distinguish fake reviews among online hotel reviews. Using lexical features, the authors implemented SMO (sequential minimal optimization) and Naive Bayes methods to classify the reviews as fake or truthful, obtaining F1-scores of 81% and 70%, respectively. They then enhanced the performance of the model by merging lexical and syntactic features, with which the SMO technique attained an 84% F1-score. Xu and Zhao [13] suggested a parsing-tree-based model for detecting and classifying fake reviews. They used textual features of the review text that were extracted from the parsing
tree by syntactic analysis and fed them to the model for identifying fake reviews. They concentrated only on textual features and ignored behavioral features. Allahbakhsh et al. [14] examined the involvement of reviewers who place prejudiced rating scores in online rating systems, collected through attributes that can help point out a set of spammers. In their model, they utilized the Amazon log (AMZLog) dataset for carrying out the experiments. Duan and Yang [15] explored fake review identification based on hotel reviews. In their method, they measured and used three features of the review for detecting spam actions: the overall score, subscores, and review content. Feng et al. [16] concentrated on the distributional footprints of reviewers and established an association between distribution abnormalities and spammers' actions. Using a Markov model, they assessed a product review dataset collected from the TripAdvisor website. Barbado et al. [17] proposed a framework of significant features for deceptive review detection. Based on online Yelp product reviews, they carried out experiments using different supervised machine learning techniques. In terms of features, reviewer features (personal, social, review activity, and trust) and review features (sentiment score) were used. Their experimental results showed that the AdaBoost algorithm provided the best performance, obtaining 82% accuracy. Noekhah et al. [18] presented a novel graph-based approach for detecting opinion spam in Amazon product reviews. First, they calculated an average value for review and reviewer features individually. Then, they asked three experts to assign a weight to every feature. Finally, they multiplied the weight of each feature by its average value to calculate the spamicity of the review text and the reviewer. Their approach achieved 93% accuracy. Alsubari et al. [3] proposed different models based on supervised machine learning algorithms such as Random Forest, AdaBoost, and Decision Tree. They used the standard Yelp product review dataset, with the information gain method applied for feature selection. From their experimental results, it is observed that the AdaBoost algorithm provided the best performance, recording 97% accuracy.

2.2. Fake Review Detection Based on Deep Learning Methods. The use of deep learning neural network models for fake review identification has three key advantages. First, deep learning models utilize real-valued hidden layers for automatic feature composition, which can capture complicated global semantic information that is difficult to obtain with typical handcrafted features. This provides an effective way to overcome the shortcomings of the traditional models mentioned above. Second, neural networks take clustered word embeddings as inputs that can be learned from raw text, hence mitigating the shortage of labeled data. Third, neural models can learn coherent text structure directly. Based on an Amazon electronic product review dataset, Hajek et al. [19] implemented two neural network methods: a Deep Feed-Forward Neural Network (DFFNN) and a convolutional neural network. They extracted features from the review text, such as word emotions and n-grams. Their methodology's results were 82% and 81% accuracy for the DFFNN and CNN methods, respectively. Goswami et al. [20, 21] proposed an Artificial Neural Network model to investigate the influence of reviewers' social relations on deception recognition in online customer reviews; in their experiment, a Yelp review dataset was gathered and preprocessed. They then mined behavioral and social-relation features of customers and applied the backpropagation neural network algorithm to classify reviews as genuine or fake, with a detection rate of 95% accuracy. Ren and Ji [22] proposed a hybrid deep learning model consisting of a gated recurrent neural network and a convolutional neural network (GRNN-CNN) for detecting deceptive opinion spam on in-domain and cross-domain datasets. They used doctor, restaurant, and hotel reviews with sizes of 432, 720, and 1280 reviews, respectively. Combining all these datasets, they applied their proposed method to classify the reviews as spam or nonspam. The best classification result obtained was 83% accuracy. Using the same datasets as [22], Zeng et al. [23] proposed a recurrent neural network-bidirectional long short-term memory technique for deceptive review detection. They divided the review text into three parts: the first sentence, the middle context, and the last sentence. The best result achieved by their method was 85% accuracy.

3. Methodology

Figure 1 shows the proposed methodology for the fake review identification system, which consists of six modules, namely, datasets, preprocessing, the CNN-LSTM method, data splitting, evaluation metrics, and results. The details of the framework are discussed below.

3.1. Datasets. This module presents the datasets used in the experiments performed for the identification of deceptive/fake reviews. We have employed four standard fake review datasets: hotel, restaurant, Amazon, and Yelp.

3.1.1. Amazon-Based Dataset. This standard fake review dataset of Amazon products consists of 21,000 reviews (10,500 truthful and 10,500 fake), and each review has metafeatures such as product ID, product name, reviewer name, verified purchase (yes or no), and rating value, as well as a class label. In a statistical analysis of the dataset, we found that the average rating value of the reviews was 4.13 and that 55.7% of the data was recognized as verified purchases. The reviews of this dataset are equally distributed across 30 discrete product categories (e.g., wireless, PC, health, etc.), and each category has 700 reviews (350 fake and 350 truthful). Furthermore, the reference for labeling this dataset is the filtering algorithm employed by the Amazon website [20, 21, 24].

3.1.2. Yelp-Based Dataset. This standard fake review dataset of electronic products was combined from four US cities (Los Angeles, Miami, NY, and San Francisco) by Barbado et al. [17]. The reference for labeling this dataset is the filtering algorithm utilized by the http://Yelp.com/ website [25]. The dataset includes 9461 reviews and reviewers.
Table 1: The datasets with their training (70%), validation (10%), and testing (20%) splits.

Dataset name   Total samples   Training set (70%)   Validation set (10%)   Testing set (20%)   Deceptive reviews   Truthful reviews
Amazon         21,000          15,120               1,680                  4,200               10,500              10,500
Yelp           9,460           6,622                946                    1,892               4,730               4,730
Restaurants    110             80                   8                      22                  55                  55
Hotels         1,600           1,152                128                    320                 800                 800
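As a minimal sketch, a 70/10/20 split such as the one in Table 1 can be produced in two stages; the paper does not publish its splitting code, and the DataFrame column name "label" is a hypothetical placeholder:

```python
from sklearn.model_selection import train_test_split

def split_dataset(df, seed=42):
    """Split a review DataFrame into 70% train, 10% validation, 20% test,
    stratified on the fake/truthful label so class balance is preserved."""
    # Hold out 30% first, then divide that holdout into validation (1/3)
    # and test (2/3), giving 10% and 20% of the original data, respectively.
    train_df, rest_df = train_test_split(
        df, test_size=0.30, stratify=df["label"], random_state=seed)
    val_df, test_df = train_test_split(
        rest_df, test_size=2 / 3, stratify=rest_df["label"], random_state=seed)
    return train_df, val_df, test_df
```

Stratification mirrors the balanced deceptive/truthful counts shown in Table 1 for each dataset.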
[Figure 2 (architecture diagram): the input word sequence x1, x2, x3, ..., xt passes through the convolutional layer of the convolutional neural network for information extraction, producing the feature sequence.]
(A) Embedding Layer. The input words of a review are annotated as X1, X2, X3, ..., Xt, as shown in Figure 2 above, and each word is assigned a specific integer index. The embedding layer converts the index of each word into a D-dimensional word vector representation. In our proposed model, we have used datasets from dissimilar domains, and for each dataset we have created an embedding matrix of a different size [V × D], where V represents the vocabulary size and D is the dimension of the vector representation of each word in V. For the input sequence length, we assigned a fixed length of 500 words for all datasets. The embedding matrix can be symbolized as E ∈ R^(V×D).

(B) Convolution Layer. In the CNN-LSTM model, the convolution layer is the second layer; it performs a mathematical operation applied to two functions that produces a third function. The convolutional operation is calculated on the dimension vectors of three matrices: the input matrix (P), the filter matrix (F), and the output matrix (O). These matrices can be expressed as in equations (1), (2), and (3) below:

P ∈ R^(l×w),  (1)
F ∈ R^(l×m),  (2)
O ∈ R^(l×d),  (3)

where P, F, and O indicate the input, filter, and output matrices, respectively; R represents the real numbers; l is the sequence length; and w denotes the width of the input matrix, which is R^(30000×100) for the Amazon and Yelp datasets and R^(10000×100) for the restaurant and hotel datasets. m is the width of the filter matrix, and d is the width of the output matrix. A convolutional layer is utilized to mine the sequence information and decrease the dimensions of the input sequences [30-32]. Its parameters include the number of filters and their window size; here, we set the window size to 2 × 2 and the number of filters to 100, and these filters pass over the input matrix.
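As an illustration, the preprocessing and embedding-input steps described above could be wired together as follows. This is a hedged sketch in Keras: the vocabulary sizes, the 100-dimensional embeddings, and the 500-word padded length follow the text, while the cleaning details (lowercasing, punctuation stripping, stopword filtering) are assumptions about steps the paper names but does not specify:

```python
import re
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

VOCAB_SIZE = 30000   # V: 30,000 for Amazon/Yelp, 10,000 for restaurant/hotel
SEQ_LEN = 500        # fixed input sequence length used for all datasets

def clean(text, stopwords):
    """Lowercase, strip punctuation, and remove stopwords (assumed steps)."""
    tokens = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    return " ".join(t for t in tokens if t not in stopwords)

def build_sequences(reviews, stopwords):
    """Map each review to a fixed-length sequence of word indices."""
    cleaned = [clean(r, stopwords) for r in reviews]
    tok = Tokenizer(num_words=VOCAB_SIZE)   # keep only the V most frequent words
    tok.fit_on_texts(cleaned)
    seqs = tok.texts_to_sequences(cleaned)
    # Pad or truncate so every sequence has exactly SEQ_LEN indices.
    return pad_sequences(seqs, maxlen=SEQ_LEN, padding="post"), tok
```

The resulting matrix of integer indices is what the embedding layer consumes in order to build the V × D matrix E described above.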
[Figure 4: Confusion matrix for restaurant dataset.]
[Figure 5: Confusion matrix for hotel dataset.]
[Figure 6: Confusion matrix for Yelp dataset.]
[Figure 7: Confusion matrix for Amazon dataset.]

Table 2: Classification results of the in-domain experiments.

Dataset      Sensitivity (%)   Specificity (%)   Precision (%)   F1-score (%)   Accuracy (%)
Restaurant   82                72                75              78             77
Hotel        77.5              92                90              83             85
Yelp         87                86                88              87             86
Amazon       85                90                87              86             87

[Figure 8: Visualization of the classification results for in-domain experiment.]

Table 3: Classification results of the cross-domain experiment.

Cross-domain dataset            Sensitivity (%)   Specificity (%)   Precision (%)   F1-score (%)   Accuracy (%)
Restaurant+hotel+Yelp+Amazon    89                90                90              89             89

[Figure 9: Confusion matrix for the cross-domain dataset.]
[Figure 10: Visualization of the classification results for the cross-domain experiment.]
[Figure 11: The performance (accuracy, left) and loss (right) of the CNN-LSTM model on cross-domain datasets.]

Based on word embedding of the n-gram features of the review text, we create a specific word-embedding matrix for every dataset using a hidden neural-network embedding layer, which is one component of the proposed CNN-LSTM model.
The LSTM layer with a sigmoid function is used for learning and classifying the input sequences into fake or truthful reviews. Figures 4-7 show the confusion matrices for the restaurant, hotel, Yelp, and Amazon datasets. In the confusion matrices depicted in Figures 4-7, a true negative (TN) is a sample that the model correctly predicted as a fake review; a false negative (FN) is a sample that the model incorrectly predicted as a truthful review; a true positive (TP) is a sample that the model correctly predicted as a truthful review; and a false positive (FP) is a sample that the model incorrectly predicted as a fake review. Table 2 and Figure 8 summarize and visualize the results of the in-domain experiments.
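Under this labeling convention (truthful reviews as the positive class, fake reviews as the negative class), the five reported metrics follow directly from the confusion matrix counts; the helper below is an illustrative sketch, not the authors' code:

```python
def metrics_from_confusion(tp, tn, fp, fn):
    """Compute the five reported metrics from confusion-matrix counts,
    where truthful reviews are the positive class and fake the negative."""
    sensitivity = tp / (tp + fn)            # recall on truthful reviews
    specificity = tn / (tn + fp)            # recall on fake reviews
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, precision, f1, accuracy
```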
3.6.2. Cross-Domain Experiment. In this experiment, we gathered all domain datasets into a single data frame in order to discover more robust features. The size of this dataset is 32,170 review texts, distributed as 21,000 Amazon product reviews, 9,460 Yelp electronic product reviews, 110 restaurant reviews, and 1,600 hotel reviews. We split the dataset into 70% as a training set, 10% as a validation set, and 20% as a testing set.
Based on word embedding of the n-gram features of the review text, we created an input embedding matrix of size V × D (where V is the vocabulary size of the dataset and D is the embedding dimension of each word in V), which here equals 50,000 × 100. Further, the convolutional and max-pooling layers of the CNN slide over the input matrix and extract the feature maps from the input sequences. The LSTM layer then receives the output of the max-pooling layer and handles the contextual information of the sequences through its gate mechanism. Finally, the last layer is the sigmoid function, which classifies the input sequence as truthful or fake. The experimental results show that the CNN-LSTM model performs better on the cross-domain dataset than on the in-domain datasets. Figure 9 presents the confusion matrix for the cross-domain dataset.
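A minimal Keras sketch of the pipeline just described, assuming the hyperparameters stated in the text (a 50,000 × 100 embedding, 100 convolution filters with a window size of 2, max pooling, an LSTM layer, and a sigmoid output); unstated details such as the LSTM width, the optimizer, and the pooling size are assumptions:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, MaxPooling1D, LSTM, Dense

def build_cnn_lstm(vocab_size=50000, embed_dim=100):
    """CNN-LSTM sketch: embedding -> convolution -> max pooling -> LSTM -> sigmoid."""
    model = Sequential([
        # Maps each word index of the padded 500-word sequence to a 100-d
        # vector, i.e., the V x D embedding matrix (50,000 x 100 here).
        Embedding(input_dim=vocab_size, output_dim=embed_dim),
        # 100 filters with a window size of 2 extract n-gram feature maps.
        Conv1D(filters=100, kernel_size=2, activation="relu"),
        # Max pooling reduces the dimensionality of the feature maps.
        MaxPooling1D(pool_size=2),
        # The LSTM gates learn the contextual information of the feature sequence.
        LSTM(100),  # layer width is an assumption; the paper does not state it
        # Sigmoid output: probability that the review is truthful.
        Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Hypothetical usage with the padded index matrices from the preprocessing step:
# model = build_cnn_lstm()
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=5)
```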
From the experimental results carried out in this research work, we conclude that a large number of n-gram features leads to better accuracy with deep learning neural network techniques. Table 3 and Figure 10 show the classification results of the cross-domain experiment and their visualization.

In Figure 11, the X-axis of the left plot represents the number of epochs, i.e., the number of iterations for which the CNN-LSTM model was trained and validated on the dataset, and the Y-axis represents the training and validation accuracy. The right plot shows the model loss.

4. Comparative Analysis

In this section, we compare the results of the in-domain experiments performed by the proposed model (CNN-LSTM) with existing works based on the accuracy metric. Table 4 reports the comparative analysis using the accuracy metric.

According to the literature review of fake review detection, no research work has used the same datasets in a cross-domain experiment. Thus, we are unable to make comparative analyses for the cross-domain datasets.

5. Conclusion

In this paper, two types of experiments, in-domain and cross-domain, have been carried out on four standard fake review datasets (hotel, restaurant, Yelp, and Amazon). Preprocessing methods such as lowercasing, removal of stopwords and punctuation, and tokenization were conducted for dataset cleaning, and a padding method was used to give all input sequences a fixed length. Further, an embedding layer, as one component of the proposed model, was applied to create different word-embedding matrices of size V × D (where V is the vocabulary size of the dataset and D is the embedding dimension of each word in V) for the in-domain and cross-domain experiments. The convolutional and max-pooling layers of the CNN technique perform feature extraction and selection. The LSTM technique is combined with the CNN for contextual information processing of the input sequences, based on gate mechanisms, and forwards its output to the last layer. A sigmoid function, as the last layer of the proposed model, is used to classify the review text sequences into fake or truthful. For the in-domain experiments, the proposed model was applied to each dataset individually for fake review detection, while the cross-domain experiment was performed on the mixed data of restaurant, hotel, Yelp, and Amazon reviews. From the experimental results, we conclude that a large number of features leads to better accuracy when using deep learning neural network (DLNN) algorithms. Notably, the proposed model surpassed existing baseline and state-of-the-art fake review identification techniques in terms of the accuracy and F1-score measures in the in-domain experiments. The experimental results also revealed that the proposed model performs better in the cross-domain experiment than in the in-domain experiments, because the former is applied to a larger dataset with more features. According to the literature review of fake review detection methods, no research work has used the same datasets in a cross-domain experiment; thus, we are unable to make comparative analyses with cross-domain datasets.
Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] D. U. Vidanagama, T. P. Silva, and A. S. Karunananda, "Deceptive consumer review detection: a survey," Artificial Intelligence Review, vol. 53, no. 2, pp. 1323-1352, 2020.
[2] N. Jindal and B. Liu, "Opinion spam and analysis," in Proceedings of the 2008 International Conference on Web Search and Data Mining, pp. 219-230, Palo Alto, California, USA, 2008.
[3] S. N. Alsubari, M. B. Shelke, and S. N. Deshmukh, "Fake reviews identification based on deep computational linguistic," International Journal of Advanced Science and Technology, vol. 29, pp. 3846-3856, 2020.
[4] S. Rayana and L. Akoglu, "Collective opinion spam detection: bridging review networks and metadata," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 985-994, Sydney, NSW, Australia, 2015.
[5] C. Miller, Company settles case of reviews it faked, New York Times, 2009.
[6] Y. Ren, D. Ji, and H. Zhang, "Positive unlabeled learning for deceptive reviews detection," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 488-498, Doha, Qatar, 2014.
[7] D. Streitfeld, For $2 a star, an online retailer gets 5-star product reviews, vol. 26, New York Times, 2012.
[8] A. Heydari, M. Ali Tavakoli, N. Salim, and Z. Heydari, "Detection of review spam: a survey," Expert Systems with Applications, vol. 42, no. 7, pp. 3634-3642, 2015.
[9] M. Arjun, V. Vivek, L. Bing, and G. Natalie, "What yelp fake review filter might be doing," in Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM-2013), Massachusetts, USA, 2013.
[10] M. Ott, Y. Choi, C. Cardie, and J. T. Hancock, "Finding deceptive opinion spam by any stretch of the imagination," in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1, Oregon, USA, 2011.
[11] R. Y. Lau, S. Y. Liao, R. C. Kwok, K. Xu, Y. Xia, and Y. Li, "Text mining and probabilistic language modeling for online review spam detection," ACM Transactions on Management Information Systems (TMIS), vol. 2, no. 4, pp. 1-30, 2011.
[12] S. Shojaee, M. A. A. Murad, A. B. Azman, N. M. Sharef, and S. Nadali, "Detecting deceptive reviews using lexical and syntactic features," in 2013 13th International Conference on Intelligent Systems Design and Applications (ISDA), pp. 53-58, Selangor, Malaysia, 2013.
[13] Q. Xu and H. Zhao, "Using deep linguistic features for finding deceptive opinion spam," in Proceedings of COLING 2012: Posters, pp. 1341-1350, 2012.
[14] M. Allahbakhsh, A. Ignjatovic, B. Benatallah, S. M. R. Beheshti, N. Foo, and E. Bertino, "Detecting, representing and querying collusion in online rating systems," 2012, https://arxiv.org/abs/1211.0963.
[15] H. Duan and P. Yang, "Building robust reputation systems for travel-related services," in Proceedings of the 10th Annual Conference on Privacy, Security and Trust (PST 2012), Paris, France, 2012.
[16] S. Feng, "Distributional footprints of deceptive product reviews," in Sixth International AAAI Conference on Weblogs and Social Media, Dublin, Ireland, 2012.
[17] R. Barbado, O. Araque, and C. A. Iglesias, "A framework for fake review detection in online consumer electronics retailers," Information Processing & Management, vol. 56, no. 4, pp. 1234-1244, 2019.
[18] S. Noekhah, E. Fouladfar, N. Salim, S. H. Ghorashi, and A. A. Hozhabri, "A novel approach for opinion spam detection in e-commerce," in Proceedings of the 8th IEEE International Conference on E-commerce with Focus on E-trust, Mashhad, Iran, 2014.
[19] P. Hajek, A. Barushka, and M. Munk, "Fake consumer review detection using deep neural networks integrating word embeddings and emotion mining," Neural Computing and Applications, vol. 32, no. 23, pp. 17259-17274, 2020.
[20] K. Goswami, Y. Park, and C. Song, "Impact of reviewer social interaction on online consumer review fraud detection," Journal of Big Data, vol. 4, no. 1, pp. 1-9, 2017.
[21] M. Young, The Technical Writer's Handbook, University Science, Mill Valley, CA, 1989.
[22] Y. Ren and D. Ji, "Neural networks for deceptive opinion spam detection: an empirical study," Information Sciences, vol. 385, pp. 213-224, 2017.
[23] Z. Y. Zeng, J. J. Lin, M. S. Chen, M. H. Chen, Y. Q. Lan, and J. L. Liu, "A review structure based ensemble model for deceptive review spam," Information, vol. 10, no. 7, p. 243, 2019.
[24] L. Garcia, Deception on Amazon: an NLP exploration, 2018, https://medium.com//@lievgarcial/deception-on-amazon-cle30d977cfd.
[25] S. Kim, H. Chang, S. Lee, M. Yu, and J. Kang, "Deep semantic frame-based deceptive opinion spam analysis," in Proceedings of the 24th ACM International Conference on Information and Knowledge Management, pp. 1131-1140, Melbourne, Australia, 2015.
[26] L. Gutierrez-Espinoza, F. Abri, A. S. Namin, K. S. Jones, and D. R. Sears, "Ensemble learning for detecting fake reviews," in 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 1320-1325, Madrid, Spain, 2020.
[27] F. Abri, L. F. Gutierrez, A. S. Namin, K. S. Jones, and D. R. Sears, "Fake reviews detection through analysis of linguistic features," 2020, https://arxiv.org/abs/2010.04260.
[28] M. Ott, C. Cardie, and J. Hancock, "Estimating the prevalence of deception in online review communities," in Proceedings of the 21st International Conference on World Wide Web, pp. 201-210, Lyon, France, 2012.
[29] M. Ott, C. Cardie, and J. T. Hancock, "Negative deceptive opinion spam," in Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 497-501, Atlanta, Georgia, 2013.
[30] S. Ahmad, M. Z. Asghar, F. M. Alotaibi, and I. Awan, "Detection and classification of social media-based extremist affiliations using sentiment analysis techniques," Human-centric Computing and Information Sciences, vol. 9, no. 1, p. 24, 2019.
[31] Understanding LSTM cells using C#, https://msdn.microsoft.com/en-us/magazine/mt846470.aspx.