
Advanced Persistent Threat Attack Detection Method

1. The document proposes a deep learning method using an autoencoder and softmax regression algorithm to detect Advanced Persistent Threat (APT) attacks in cloud computing. 2. The autoencoder is used to study features from network traffic data in an unsupervised manner. Then a softmax regression layer is added to classify APT attacks. 3. The proposed method achieved an average detection accuracy of 98.32% on a database, outperforming existing methods. A two-factor authentication system using one-time passwords is also proposed to strengthen cloud security against APT attacks.


Array 10 (2021) 100067

Contents lists available at ScienceDirect

Array
journal homepage: www.elsevier.com/journals/array/2590-0056/open-access-journal

Advanced Persistent Threat attack detection method in cloud computing based on autoencoder and softmax regression algorithm
Fargana J. Abdullayeva
Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141, Baku, Azerbaijan

ARTICLE INFO

Keywords: Advanced persistent threat; Cloud computing; Cybersecurity; Two-factor authentication; One-time password; Deep learning; Autoencoder; Softmax regression algorithm

ABSTRACT

APT (Advanced Persistent Threat) is a complex type of attack that steals personal data by staying in the infected system for a long time. When APT attacks take place in a dynamic and complex infrastructure such as the cloud, their detection by traditional methods is very difficult. To overcome the limitations of existing methods, the paper proposes an autoencoder-based deep learning approach for APT attack detection. The advantage of this model is that it achieves a high classification result by identifying complex relationships between features in a database. Additionally, the model simplifies the process of classifying large volumes of data by reducing the size of data in the encoder. Here, first of all, the autoencoder neural network was applied, and informative features were learned from the network traffic data in an unsupervised manner. After the informative features were learned, a softmax regression layer was added to the top layer of the constructed autoencoder network to classify APT attacks. In this study, a deep neural network model constructed by adding different layers was tested on a database open to scientific research and compared to existing methods; the proposed method gave superior results in detection of APT attacks. The average detection accuracy of the proposed APT detection framework was 98.32%. A model for the application of the proposed approach to the cloud environment has been developed, and a two-factor authentication system based on the OTP (One-Time Password) mechanism has been proposed to strengthen the security of the cloud information system against APT attacks.

E-mail address: a_farqana@mail.ru.
https://doi.org/10.1016/j.array.2021.100067
Received 29 October 2020; Received in revised form 8 March 2021; Accepted 12 April 2021
Available online 21 April 2021
2590-0056/© 2021 Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

APT is one of the fastest growing information security threats faced by organizations today. They are carried out by the most experienced and well-funded attackers and target the confidential information of private organizations. The purpose of APTs is to exfiltrate information to external hosts (data theft, data exfiltration). The theft of 9 GB of encrypted password data from Adobe Leak in 2013 and the theft of 40 GB of Ashley Madison’s database in 2015 were registered as APT facts [1,2].

As APTs attempt to remain anonymous and typically use Zero-Day attacks (faults in software which have not been discovered by application developers or hardware vendors and can be exploited), they cannot be detected using existing IDS solutions. Most of these attacks go unnoticed for many years. For example, the APT attack known as Red October has been active for more than five years [3].

APT attacks are continuous, targeted attacks on any specific organization and are carried out in several stages [4]. The APT attack is modeled from six components [5]. The detection of APT in the literature sources raises the issue of detection of threats in the relevant components. Numerous studies have been conducted to identify the main components of an APT attack: the detection of malicious PDF files in phishing emails; detection of malicious SSL certificates during Command and Control (C&C) communications; detection of data leakage in the final stage of APT attack.

In a situation where the APT is successfully organized and ready to attack, it is considered too late to build the defense in different phases, especially in the last phase. In the context of such long-term and inconspicuous attacks, it is necessary to develop new artificial intelligence methods and appropriate analytical technologies that intelligently collect threats in order to detect APT-type attacks and protect from them before exfiltration is carried out.

Simple and deep neural network models have been developed to detect cyberattacks on hosts and network systems. The ANN, a simple neural network, consists of one or two hidden layers, while the deep network has a large number of hidden layers and different architectures. Deep learning is widely used by researchers because of its ability to study in-depth the computational process that mimics the natural behavior of the human brain [6]. However, these methods have very high false alarm rates and have difficulty detecting new types of attacks.

Deep learning was used in the presented study due to its ability to

analyse the network data in-depth and automatically derive the feature vector itself.

The idea of deep learning, first used by Hinton along with feature learning, implies learning feature descriptions from the input data [7]. The idea of the approach proposed in the article is that, rather than being selected manually, the informative features are selected automatically using a deep learning method. According to the approach, an autoencoder is first applied to the training data to learn the informative features, and then the softmax regression layer is used to classify the types of APT.

The main contributions of the work include:

1. An APT attack detection model has been established.
2. The multi-step structure of APTs is described.
3. The architecture of the APT detection system is proposed.
4. The application model of the APT detection system in the cloud environment has been established.
5. A two-factor authentication system based on the OTP mechanism has been proposed to ensure that the cloud information system is resistant to APT attacks.
6. The proposed method for detecting APT has been tested on network traffic.

The paper is structured as follows. Section 2 describes the computer network attack categories, presents an overview of the Advanced Persistent Threats concept, and defines the distinctive features of APT attacks. Section 3 reviews the related literature. Section 4 defines the problem statement. Section 5 presents the proposed APT detection model based on Autoencoder, describes the deployment model of the proposed approach to the cloud computing environment, and presents the architecture of the defense system against the APT attack. Section 6 presents the results of the experimental evaluation on real network traffic data. Finally, Section 7 presents the conclusion of the work.

2. Categories of computer network attacks

Attacks on computer networks are classified into two categories [8]:

1) Active attacks. An attacker manipulates compromised data or hardware of the network in case of an active attack. Information injection, data modification, package attacks pertain to active attacks.
2) Passive attacks. Passive attacks perform the collection of critical information, network characteristics, or the study of data to be transferred via sniffing and eavesdropping.

APT, as an advanced attack category, is composed of several attack components. During this attack, the exfiltration of information is carried out in the form of a passive attack in order to study the state of the system and its bugs, and an active attack is then carried out for the purpose of serious system damage. APT comprises several attack methods.

2.1. Overview of Advanced Persistent Threats

APT attacks were first carried out against military organizations as a cyber intervention [9]. One of the first APT attacks was Moonlight Maze developed in 1996. This attack targeted several military and state networks of the United States. However, these attacks have later been expanded and now target industrial and state organizations [10,11]. Moreover, education, finance, space exploration and aviation, energy supply, chemistry, telecommunication, medicine and consulting areas have become targets of APT attacks as well.

APT attacks exploit the vulnerabilities of devices of various types in order to perform the attack. Due to their characteristics, APT attacks target personal computers and mobile devices. Attackers control compromised hardware remotely by utilizing complex attack techniques and exfiltrate confidential information of organizations and government bodies.

APTs usually use attachments consisting of a malicious application able to compromise the system, or spear-phishing emails containing a link [12].

APTs establish continuous and covert connections with the information technology structures of an organization chosen as a target in order to obtain (exfiltrate) information that can destroy critical aspects of the organization or create an obstacle. APTs use several attack steps (for instance, social engineering, C&C communications) in order to bypass security solutions [13].

APT can be explained through the terms that make up the expression:

Advanced – expresses that attackers are well-trained, well-organized and well-funded, and network infiltration technologies are utilized in full spectrum.
Persistent – reflects the continuous character of these attacks. In this case, attackers establish a long-term network presence and attempt to compromise the system seriously. So far, the longest-term APT attack has been the APT1 group and it lasted four years and 10 months.
Threat – reflects the exfiltration of classified data of an organization possessing strategic information. As the aim is to steal confidential data, APT attacks usually cause large-scale damage to a target.

The analysis of specific APT samples shows that there is no resemblance among all APT attacks and they are specifically tailored to each target [14].

However, the steps of APT attacks are similar in most attacks and differ only in the specific method used at each step. During all work processes of APTs, a threat agent initially collects open sources in order to determine targets. After the attacker obtains a secret key and establishes a support point, it escalates its privileges, spreads in the cybernetwork and eventually gains access to confidential information.

Different steps of APT attacks are given in various sources according to the implementation sequence of an APT attack [5,15].

The multistep structure of APTs can be described as in Fig. 1.

Fig. 1. Multistep structure of APTs.

2.1.1. Reconnaissance
At this stage, attackers collect information about resources, employees of a target organization and their relations with other subjects utilizable for reaching the target. They scan the network in order to determine open services of the network, security systems used by the network perimeter and employees that have access to target information. Thereafter, the attackers create profiles for each target employee by utilizing open information in social networks (for instance, LinkedIn, Facebook) where they can open an account.

2.1.2. Initial compromise
A spear-phishing email is created at this stage by using information collected at the reconnaissance stage. This email can include an invitation to an event organized by the organization and a URL address for a person to upload documents related to this event. When the phishing letter with an attachment is opened, the attacker compromises the machine by using a zero-day vulnerability in order to enter the target network. Malware such as a remote access trojan (RAT) or Remote Administration Tool is then installed on the victim's hardware. Electronic mail is a common access vector used by malware for entering the organization’s infrastructure; however, other channels can be used for this purpose as well (e.g. USB-based malware, trojans activated with timers).

2.1.3. Capturing passwords and maintaining access
Malware is installed and activated after being downloaded to the hardware of the target employee in the organization’s network. Thereafter, the malware establishes a C&C connection from the victim’s hardware to a remote master. After establishing a connection, the attackers continue to

collect information about security configurations of the hardware and related system information covertly, capture passwords and collect user emails in order to perform next attacks and collect network users’ names and directory lists in general network folders. At this stage, RAT connects to the C&C server of the attacker in order to receive commands to be executed on the target network. The characteristic of this stage is that the connection attempt is not carried out by the attacker, but by the victim host.

2.1.4. Privilege escalation
This stage provides long-term persistent presence in the organization’s network. Attackers move horizontally in the network, detect servers storing sensitive information and users authorized with priority access and create a strategy to collect and export target information. Operators regularly target privileged users with phishing letters. If exploits are successful, they escalate their access privilege to information.

2.1.5. Data collection and lateral movement
At this stage, operators attempt to maintain access to target information by using the account data of priority users collected at the previous stage. By using complex tools, attackers establish redundant C&C channels in case sudden changes take place in the security configuration of the organization. When access is gained to target information, one or multiple redundant copies of this information are created in servers acting as “staging points”. Information is segmented, defragmented and coded before being exfiltrated at this stage. The attacker attempts to gain access to other hosts within the target network via the more escalated privileges required for access to valuable resources at this stage. For example, RAT can perform internal scanning in the network where it is present or initiate a new connection in other internal hosts (via Secure Shell, SSH).

2.1.6. Exfiltration
At this stage, information collected and packaged in staging point servers is transmitted via encrypted channels to several external servers acting as drop points. The utilization of multiple drop point servers is a deceptive strategy in order to prevent the detection of the final drop point of data by researchers. Stolen data is sent to one or several remote servers managed by the attacker at this stage. Information can either be exfiltrated completely at once or leaked covertly and at low speed if the attacker aims to steal data persistently.

2.2. Distinctive features of APT attacks

The possession of specific characteristics by APTs makes their detection very hard. Unlike traditional attacks, they usually use zero-day vulnerabilities and target a specific organization. APTs’ simulation of normal behavior and use of social engineering strategies complicates their detection.

Another distinctive feature is that APTs are performed not by private individuals, but by cybercriminal organizations [15]. Each member of the group possesses specific experience and knowledge.

Unlike DDoS attacks, APTs are types of attacks organized via computer viruses, trojans and worms and usually try to disguise themselves in network devices (personal computers, servers, mobile devices). The feature of APT is to exfiltrate data inside the network to external devices of the network.

While DDoS attacks are large-scale and destructive, APTs have an opposite nature; they are unnoticeable, covert and can be organized as small- or large-scale attacks. The goal of long-term attacks is to remain unnoticeable as long as possible in order to achieve maximum exfiltration. APT is not considered a distributed attack. The feature of APT attacks is to be covert.

An expert attacker imitates normal behavior in APT and does not attempt to spread the virus widely, compromising only a limited number of hosts. The opposite is the case for traditional malware, which tries to spread as much as possible.

There are substantial differences between botnet attacks and APTs. Thousands or millions of hosts participate in a botnet, while APTs are attacks oriented to a specific organization and controlled by a specific person. Botnet approaches, the aim of which is to detect similar behaviors (via the clustering of traffic features) in hosts’ groups, cannot be applied in an APT domain. It is due to the fact that APT can only compromise a given number of hosts while C&C servers use the protocols of a subset of victim hosts. Hence, it is not possible to carry out wide-range cluster analysis suggested in botnet detection approaches in case of APT in order to determine multiple hosts which are abnormal traffic templates. Moreover, the compromise strategies are different: APTs utilize spear-phishing and zero-day exploits in order to compromise victim computers while botnets replicate themselves in a more aggressive manner.

Insider threats have some common features with APT attacks. APT attempts to gain control over the real host within the organization, but an attacker tries to imitate normal behavior to avoid being detected. The main difference of APTs from insider threats is that insiders do not carry out the exfiltration of data via network. Hence, the majority of approaches for detecting insider threats are based on the analysis of host-based log files and honeypot strategies. Unlike APT detection, the analysis of network traffic is not carried out in detecting insider threats [16].

3. Related work

Neupane et al. [17] propose an approach called Dolus for detecting target-oriented attacks (DDoS and APT) towards services hosted in cloud platforms. The detection of DDoS attacks in the system of two-stage Dolus ensemble learning proposed in this study utilizes the data mining of threats. The first stage encompasses the detection of anomalies for the implementation of detection of noticeable events (port exhaustion). The second stage serves to distinguish the event of DDoS attack within

the main five attack vectors. ADAPT (Automated Defense against Advanced Persistent Threats) strategy is applied in the Dolus system in order to carry out the resistance to APT attacks. The goal of the ADAPT module is to detect devices compromised by APT by tracking the data exfiltrated outside the boundaries of the corporate network. Suspiciousness scores are used in order to detect APTs and determine systems compromised by APT. The suspiciousness score is assigned to each device within or outside the network. The score value is determined based on the number of unique drop points, total number of connections, and the total number of transmitted bytes. External devices considered as suspicious are eventually isolated from devices of the internal network. A multivariate Gaussian algorithm is used to detect anomalies. In order to develop an ensemble approach, averaging or Bayesian-based majority voting method is employed.

Stojanović et al. [18] conduct an analysis of existing databases in the area of detecting APT attacks and study APT attacks carried out in large corporate networks, cyber-physical systems, cloud computing systems, financial networks and networks of the internet of things. Distinctive stages of this type of APT attack are described. Ghafir et al. [19] develop a system titled MLAPT (machine learning based APT) based on machine learning in order to detect APT attacks. MLAPT consists of three blocks: detection of a threat, correlation of events and forecasting of an attack. The function of the block of the correlation of events is to create a correlation between detected events and the types of APT attacks. The rationale for using the correlation approach is to reduce the false positive rate of the MLAPT detection system. Giura et al. [20] model APT as an attack pyramid. The goal of the attack is placed on the upper layer of the attack pyramid and lateral planes describe environments where attack-related events can be recorded (for instance, physical, user, network, application planes, etc.). The layers of the attack pyramid are constituted of steps of the APT attack. The proposed detection scheme correlates all events relevant to the recorded security in the organization. Huang et al. [21] propose a dynamic approach based on game theory which detects a long-term mutual connection between a covert attacker and proactive defender in cyber-physical systems.

Andrew [22] proposes a detection approach based on the APT network flows. The approach carries out the detection via the statistical modeling of APT communications. Zimba et al. [23] propose a method for weighted modeling of attack routes based on Bayesian network for the modeling of mutually connected attack routes generated by APT attackers via the exploitation of faults in cloud components. The penetration to the components of the cloud with faults during the attack generates virtual attack routes. The faults of the target system are described as attack graphs. The nodes and arrows of each attack route are determined in the constructed directed acyclic graph. These nodes and arrows are deemed important for selecting resistance strategies. An optimization algorithm is proposed to find the shortest attack route.

Usually, all APT attacks start by incentivizing the users to use social phishing, e-mail spam, e-mail phishing. Here, phishing directs the users to fake domains and users end up downloading malware as a result. Hence, the detection of APT attacks by determining unknown domains plays an important role. Cho et al. [24] propose a method based on the monitoring of accesses to unknown domains. Here, when unknown domains are detected, warning signals are generated for users. Zimba et al. [25] propose a semi-supervised machine learning method in order to detect APT attacks. The target network here is modeled as a global network, while the detected APT attack network is modeled as a scale-free network. The transition states of nodes in time domain are modeled as finite automata in order to characterize the state changes during APT attack. Marchetti et al. [26] review the issue of detecting suspicious hosts. The approach proposed here detects APT attacks at the exfiltration stage. Traffic data are collected and features pertaining to the exfiltration stage are determined for this purpose. An approach is proposed for the ranking of internal hosts engaged in data exfiltration of APT. The behavior of each host is modeled as a feature point in a multidimensional space. Thereafter, a score value is assigned to each internal host based on the suspiciousness positions in feature space and ranking is conducted. Johnson et al. [27] propose an assessment method of the growth risk of privileges of network users based on graph theory for detecting APT at the stage of privilege escalation. Vance et al. [28] develop an approach for detecting APT attacks by applying a method of statistical detection of anomalies for the purpose of the analysis of network communications.

Xiao et al. [29] provide a prospect theoretic study on APT defense. This study discloses the impact of the subjective view of an APT attacker on the data safety levels of a cloud storage. In this paper, an asymmetric evolutionary game between the APT attacker and the cloud storage defender was formulated to find the evolutionary stable strategies in the APT defense games. Rosenberg et al. [30] propose the DeepAPT model based on deep neural networks to detect APT attacks. The difference between this work and ours is that it does not provide information on the stage at which APT is detected, and the features used to detect APT do not reflect the actual features of APT. Additionally, an approach based on the synthesis of a softmax classifier with an autoencoder model in the field of detection of APT attacks was first proposed in our study, and the approach showed a high result with a classification accuracy of 98.32%. The advantage of using an autoencoder is that this model can operate efficiently and faster with large amounts of data by reducing the size in the encoder. Additionally, the model achieves a high classification result by identifying complex relationships between features in a dataset.

Existing approaches are confined to detecting only one step of APT and disregard other APT activities. This implies that if a detection system disregards any malicious block of APT, the complete APT scenario will remain undetected. Moreover, the detection of separate malicious activities at different APT stages such as data exfiltration, malicious URL connection, etc. cannot be considered as the complete detection of APT. Another shortcoming of these methods is that they allow high false positive detection errors while detecting APT attacks due to the matching of real and anomalous events.

4. Problem statement

Assume that training data $D = \{(x_i, y_i)\}_{i=1}^{n}$ consisting of $n$ APT samples are given. Here, $x_i \in \mathbb{R}^D$ is a $D$-dimensional APT attack vector and $y_i \in [1, K]$ are the corresponding types of attack classes. It is required to detect new unknown attacks not present in the given dataset at the learning stage.

5. Proposed APT detection model based on autoencoder

The architecture of the APT detection system is represented in Fig. 2.

Fig. 2. Proposed APT detection framework.

The proposed detection framework consists of two blocks: extraction of features and classification of attacks. Firstly, features are extracted by applying an autoencoder neural network to the network traffic in order to detect the techniques used in the APT lifecycle. The vectors of the features of events generated by separate techniques are produced as module output. The generated vector of features is transmitted to the classification module thereafter. The feature vectors most similar to the same APT attack scenario are categorized in this module.

The presented article utilizes a deep autoencoder neural network for the purpose of learning training data $\{x_i\}_{i=1}^{n}$. Autoencoder is a symmetric neural network and usually learns the features of databases in an unsupervised manner. Autoencoder builds the description of features by reconstructing input data $x_i$. Sometimes an autoencoder is used for dimensionality reduction, as in PCA. PCA employs a linear function for carrying out the data transformation, while autoencoder utilizes a nonlinear one. In the

simplest sense, autoencoder is comprised of encoder, hidden and decoder layers (Fig. 3).

Non-linear function $f_\Theta$ is applied in order to map input vector $x_i$ to hidden layer $z_i$ in the encoder part:

$$f_\Theta(x_i) = S(W x_i + b) \tag{1}$$

where $W$ is the weight matrix of the encoder, $b$ is the bias vector of the encoder, $S$ is a sigmoid activation function and $\Theta$ are the mapping parameters $[W, b]$. The sigmoid function is calculated as $S(v) = 1 / (1 + \exp(-v))$.

At the decoding stage, to reconstruct the input data $x_i$, the hidden representation $z_i$ is mapped back to the input space via a nonlinear activation function:

$$g_{\Theta'}(z_i) = S(W' z_i + b') \tag{2}$$

where $W'$ is a $d_h \times d_o$ dimensional weight matrix, $b'$ is a bias vector, and $\Theta'$ are the mapping parameters $[W', b']$.

In the autoencoder model, the input data is compressed first and thereafter, these data are used as input data of the decoder in order to restore the original data for the purpose of learning the hidden layer. It is attempted to minimize reconstruction errors (the difference between original data and its small-scale reconstruction) during the training process. This difference is calculated for the training data as follows:

$$E(x, \tilde{x}) = \frac{1}{2} \sum_{i=1}^{s} \lVert x_i - \tilde{x}_i \rVert^2 \tag{3}$$

$$\Theta = \{W, b\} = \arg\min_{\Theta} E(x, \tilde{x}) \tag{4}$$

The proposed APT detection model has utilized a softmax regression layer in order to carry out the multiclass classification (logistic regression can be used for a binary classification) (Fig. 4).

As the network data is large, it is important to reduce its size in order to increase the effectiveness of detection. The data volume in the proposed approach is reduced via the hidden layer of the autoencoder. By using a nonlinear function in the encoder layer, multiple features are transformed into a feature set. The selection of features is carried out via an algorithm without using human knowledge. The goal of choosing the features is to find better learned observations.

5.1. Application of APT detection system to cloud environment

The developed attack detection system must be located correctly in order to provide the protection of the cloud environment against the impact of attacks. As cloud systems transmit the data to users via Internet, it is deemed necessary to locate the detection module on the transmission line between Internet and cloud. The application model of the proposed detection module to cloud environment is described in Fig. 5.
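
The encoder, decoder, reconstruction error and softmax layer of Section 5 (Eqs. (1)-(4)) can be illustrated with a minimal NumPy sketch. This is not the author's implementation: the synthetic two-class data, the layer sizes, the learning rate, the plain batch gradient descent, and all function names (`train_autoencoder`, `encode`, `train_softmax`) are assumptions made for illustration only. Class labels are 0-based here, whereas the text uses $y_i \in [1, K]$.

```python
import numpy as np

def sigmoid(v):
    """Sigmoid activation S(v) = 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + np.exp(-v))

def softmax(a):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_autoencoder(X, d_h, lr=0.2, epochs=500, seed=0):
    """Fit a one-hidden-layer autoencoder by batch gradient descent.

    Hypothetical helper: returns encoder parameters (W, b) and the
    per-epoch reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n, d_o = X.shape
    W, b = rng.normal(0.0, 0.1, (d_h, d_o)), np.zeros(d_h)    # encoder
    W2, b2 = rng.normal(0.0, 0.1, (d_o, d_h)), np.zeros(d_o)  # decoder (stored as d_o x d_h)
    errors = []
    for _ in range(epochs):
        Z = sigmoid(X @ W.T + b)        # Eq. (1): hidden representation
        X_rec = sigmoid(Z @ W2.T + b2)  # Eq. (2): reconstruction
        diff = X_rec - X
        errors.append(0.5 * np.sum(diff ** 2) / n)  # Eq. (3), averaged over samples
        # Backpropagate the squared error through both sigmoid layers.
        d_out = diff * X_rec * (1.0 - X_rec)
        d_hid = (d_out @ W2) * Z * (1.0 - Z)
        # Gradient step towards the minimiser in Eq. (4).
        W2 -= lr * (d_out.T @ Z) / n
        b2 -= lr * d_out.mean(axis=0)
        W -= lr * (d_hid.T @ X) / n
        b -= lr * d_hid.mean(axis=0)
    return W, b, errors

def encode(X, W, b):
    """Map inputs to the learned low-dimensional features."""
    return sigmoid(X @ W.T + b)

def train_softmax(Z, y, n_classes, lr=0.2, epochs=500):
    """Fit a softmax regression layer on top of the encoded features."""
    n, d_h = Z.shape
    V, c = np.zeros((n_classes, d_h)), np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        G = (softmax(Z @ V.T + c) - Y) / n  # cross-entropy gradient w.r.t. logits
        V -= lr * (G.T @ Z)
        c -= lr * G.sum(axis=0)
    return V, c

# Synthetic stand-in for traffic features: two classes in [0, 1]^8 (assumption).
rng = np.random.default_rng(1)
centers = np.array([[0.2] * 4 + [0.8] * 4, [0.8] * 4 + [0.2] * 4])
y = rng.integers(0, 2, size=200)
X = np.clip(centers[y] + rng.normal(0.0, 0.05, (200, 8)), 0.0, 1.0)

W, b, errors = train_autoencoder(X, d_h=3)  # unsupervised feature learning
Z = encode(X, W, b)                         # 8 features reduced to 3
V, c = train_softmax(Z, y, n_classes=2)     # supervised classification layer
probs = softmax(Z @ V.T + c)
pred = probs.argmax(axis=1)
print("reconstruction error: first epoch %.4f, last epoch %.4f" % (errors[0], errors[-1]))
print("training accuracy: %.3f" % np.mean(pred == y))
```

The two training stages are kept separate to mirror the description in the text: the autoencoder never sees the labels, and the softmax layer only sees the compressed features. A practical system would likely use a deep learning framework, more hidden layers, and a held-out test set rather than training accuracy.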

5.2. Architecture of protection module against APT

During the preparation stage, APT attackers steal the password of a user who has access to information in order to obtain confidential information (the third stage of APT). In an APT, login credentials can be acquired by social engineering, by some form of side-channel attack, by eavesdropping on (unprotected) communication, or by guessing passwords. During an APT attack, an attacker acquires the user's passwords and exfiltrates confidential information by logging into the user's computer on his or her behalf. The article proposes an architecture for a cloud-service security system that is resilient to APT attacks, in order to prevent such attacks from occurring (Fig. 6).
The architecture is based on a two-factor authentication mechanism and uses both a traditional static password and a one-time password (OTP) to carry out user authentication. An OTP can be used only once and is usually limited in time. OTPs are dynamic: each new OTP is generated on request as a unique sequence of numbers. The user enters the OTP into the web terminal. For the authentication system, the user's possession of the authentic OTP code confirms that the account belongs to that user. The utilization of OTP with a traditional

Fig. 3. Autoencoder neural network.

Fig. 4. Constructed APT detection autoencoder neural network model.

Fig. 5. Deployment model of the proposed approach on cloud computing environment.

Fig. 6. Two-Factor Authentication (2FA) of the user into cloud sensitive data.
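The two-factor mechanism of Fig. 6 can be illustrated with a minimal in-memory sketch. Everything named here is an assumption made for illustration and not the author's implementation: the `TwoFactorServer` class, the 120-second validity window, the PBKDF2 password hashing, and the HMAC-based derivation of a six-digit code from the user name and the request time. The derivation is an ad hoc construction, not a standardized OTP scheme, and SMS delivery is simulated by returning the code to the caller.

```python
import hashlib
import hmac
import secrets
import time

OTP_TTL_SECONDS = 120  # assumed validity window; the paper gives no value


class TwoFactorServer:
    """Toy in-memory server: static password first, then a one-time code."""

    def __init__(self):
        self._secret = secrets.token_bytes(32)  # server-side key for OTP derivation
        self._users = {}    # username -> (salt, password_hash)
        self._pending = {}  # username -> (otp, expiry_timestamp)

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def register(self, username, password):
        salt = secrets.token_bytes(16)
        self._users[username] = (salt, self._hash(password, salt))

    def login(self, username, password, now=None):
        """First factor: check the static password, then issue an OTP.

        Returns the OTP (standing in for SMS delivery), or None on failure.
        """
        now = time.time() if now is None else now
        record = self._users.get(username)
        if record is None:
            return None
        salt, stored = record
        if not hmac.compare_digest(stored, self._hash(password, salt)):
            return None
        # OTP derived from user information and the recorded time.
        message = f"{username}:{now}".encode()
        digest = hmac.new(self._secret, message, hashlib.sha256).digest()
        number = int.from_bytes(digest[:4], "big") % 1_000_000
        otp = f"{number:06d}"
        self._pending[username] = (otp, now + OTP_TTL_SECONDS)
        return otp

    def verify_otp(self, username, code, now=None):
        """Second factor: compare the submitted code with the issued one.

        The pending OTP is removed on every attempt, so it works at most once.
        """
        now = time.time() if now is None else now
        otp, expiry = self._pending.pop(username, (None, 0.0))
        if otp is None or now > expiry:
            return False
        return hmac.compare_digest(otp, code)
```

In this sketch a stolen static password alone is not enough to log in: `login` only issues a code, and `verify_otp` rejects a code that is reused or submitted after the assumed validity window.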


Table 1
The features of the MalwareTrainingSets dataset.
# Feature # Feature # Feature

1 file_access 28 critical_process 96 ransomware_file_modifications


2 infostealer_ftp 49 service_start 97 antivm_vbox_files
3 sig_modifies_hostfile 50 net_dns 98 static_pe_anomaly
4 removes_zoneid_ads 51 ransomware_files 99 copies_self
5 disables_uac 52 virus 100 antianalysis_detectfile
6 static_versioninfo_anomaly 53 file_write 101 antidbg_devices
7 stealth_webhistory 54 antisandbox_suspend 102 file_drop
8 reg_write 55 sniffer_winpcap 103 driver_load
9 network_cnc_http 56 antisandbox_cuckoocrash 104 antimalware_metascan
10 api_resolv 57 file_delete 105 modifies_certs
11 stealth_network 58 antivm_vmware_devices 106 antivm_vpc_files
12 antivm_generic_bios 59 ransomware_recyclebin 107 stealth_file
13 polymorphic 60 infostealer_keylog 108 mimics_agent
14 antivm_generic_disk 61 clamav 109 disables_windows_defender
15 antivm_vpc_keys 62 packer_vmprotect 110 ransomware_message
16 antivm_xen_keys 63 antisandbox_productid 111 network_http
17 creates_largekey 64 persistence_service 112 injection_runpe
18 exec_crash 65 antivm_generic_diskreg 113 antidbg_windows
19 antisandbox_sboxie_libs 66 recon_checkip 114 antisandbox_sleep
20 mimics_icon 67 ransomware_extensions 115 stealth_hiddenreg
21 stealth_hidden_extension 68 network_bind 116 disables_browser_warn
22 modify_proxy 69 antivirus_virustotal 117 antivm_vmware_files
23 office_security 70 recon_beacon 118 infostealer_mail
24 bypass_firewall 71 deletes_shadow_copies 119 ipc_namedpipe
25 encrypted_ioc 72 browser_security 120 persistence_autorun
26 dropper 73 modifies_desktop_wallpaper 121 stealth_hide_notifications
27 delete 74 network_torgateway 122 service_create
28 mimics_filetime 75 injection_createremotethread 123 reads_self
29 banker_zeus_url 76 imports 124 mutex_access
30 origin_langid 77 process_interest 125 antiav_detectreg
31 antiemu_wine_reg 78 bootkit 126 antivm_vbox_libs
32 process_needed 79 reg_read 127 antisandbox_sunbelt_libs
33 antisandbox_restart 80 stealth_window 128 antiav_detectfile
34 recon_programs 81 downloader_cabby 129 reg_access
35 str 82 multiple_useragents 130 stealth_timeout
36 antisandbox_unhook 83 pe_sec_character 131 antivm_vbox_keys
37 antiav_servicestop 84 disables_windowsupdate 132 persistence_ads
38 antivm_generic_system 85 packer_upx 133 fraudguard_threat_intel_api
39 cmd_exec 86 disables_system_restore 134 deepfreeze_mutex
40 net_con 87 ransomware_radamant 135 modify_uac_prompt
41 bcdedit_command 88 infostealer_browser 136 api_spamming
42 pe_sec_entropy 89 injection_rwx 137 modify_security_center_warnings
43 pe_sec_name 90 deletes_self 138 antivm_generic_disk_setupapi
44 creates_nullvalue 91 file_read 139 pony_behavior
45 packer_entropy 92 recon_fingerprint 140 banker_zeus_mutex
46 origin_resource_langid 93 antivm_vmware_keys 141 net_http
47 rat_spynet 94 infostealer_bitcoin 142 dridex_behavior
48 cryptAM 95 antiemu_wine_func 143 internet_dropper
144 label

static password can be the most effective security mechanism for cloud infrastructure against APT attacks. The proposed system works as follows:

Step 1. Users send their user name and password to the server while logging in.
Step 2. The server checks whether the user is registered in the system, generates a one-time password (OTP) based on the user information and the recorded time, and sends it to the user's mobile phone via SMS (Short Message Service).
Step 3. Thereafter, the user enters the OTP code received via SMS into the system.
Step 4. If the one-time password submitted by the client matches the one-time password generated by the server, the user is authenticated.

Using a username and password together with an OTP code makes it harder for APT attacks to maintain access to the system and to steal individuals' personal information. The goal of the OTP generator is to further complicate unauthorized access to restricted resources, for instance, a confidential database. With this approach, it becomes impossible for attackers to steal users' identification data.
Nowadays, most systems are protected with a username and password, which is a one-factor authentication mechanism. However, one-factor authentication systems are not considered sufficient for protecting cloud infrastructure against cyberattacks that steal personal information.
Passwords have several shortcomings and are considered a weak authentication mechanism [31]. Two-factor authentication is used to eliminate the problems of passwords. In this case, a special algorithm generates a one-time password that is sent to a mobile phone via SMS in order to provide two-factor authentication.

6. Experiments

6.1. Dataset description

The detection of APT attacks is considered a complex and highly popular research area in the scientific community, and the lack of benchmark datasets in this area causes serious problems. The MalwareTrainingSets dataset has been used in this study in order to


conduct experiments [32]. The compiled dataset includes 292, 2024, 434, and 2014 samples of the APT1, Crypto, Locker, and Zeus malware families, respectively.
The features of the dataset are given in Table 1.
Behavioral characteristics of the network were used to detect APT. These features are passed in vector form to the input of the classification algorithm. Examples of the properties used are the following:

1) The size, in megabytes, of the data transmitted from internal hosts to external hosts. This feature indicates the change in the amount of information transmitted. If a sharp increase is observed in the amount of information transmitted from a host, that host can be considered affected by an APT attack.
2) The number of connections initiated from internal hosts to external hosts. This feature reflects data transfers initiated by internal hosts, because exfiltration is initiated by internal hosts. Rarely established connections are considered here, as the APT attacker tries to create a small number of communications to avoid detection.
3) The number of external IP addresses contacted in communications initiated by internal hosts. This feature refers to the change in the number of distinct destinations associated with each internal host. If the number of external IP addresses connected to an internal host remains unchanged while the number of transmitted bytes increases significantly, this may indicate that the host has been affected by an APT attack and that data exfiltration has occurred.

Three data classes are constructed for the experiments in this study: APT1, Crypto, and other types of attacks. In the preprocessing stage, the data are separated into training and test sets. In addition, the data were normalized so that the features have a comparable effect on the classification results. The training data are used to fit the model and the test data to evaluate it, so that accuracy and precision can be checked. In this study, 80% of the data was used for training and 20% for testing.
During the experiment, the autoencoder neural network is first trained on the MalwareTrainingSets dataset; the attacks are then classified by passing the output of the autoencoder to the input of the softmax regression algorithm. The experiments were carried out in the Data Center of the Institute of Information Technology of the Azerbaijan National Academy of Sciences (AzScienceNet) on a system with the following characteristics: Ubuntu 16.04.3 LTS AMD64, 331.2 GB of memory, and a 2933.437 MHz CPU.
In this paper, the autoencoder model was built using the Theano library for Python and is illustrated in Fig. 7. One part of the autoencoder architecture is an encoder and the other is a decoder. The encoder part of the network consists of 5 layers. The decoder performs the reverse operation of the encoder and also consists of 5 layers.
The encoding part comprises five layers with 14, 7, and 7 nodes. The encoding architecture is connected to a latent view space of 3 nodes, which is in turn connected to the decoding architecture with 7, 7, and 14 nodes. The final layer has exactly the same number of nodes as the input layer; in this study, the input layer has 3 nodes. ReLU is used as the activation function in the encoding and decoding layers. The final layer, which reconstructs the input data, uses the softmax activation function.
A summary of the constructed autoencoder model is shown in Table 2.
The parameters given in Table 2 result from experiments conducted on the MalwareTrainingSets dataset. The prediction results obtained by applying the autoencoder model to this dataset are shown in Table 3.
In the experiments, the autoencoder model provided the best results on the dataset. Table 3 shows that the RMSE values of this method in training and testing were 0.0010 and 0.0011, respectively, which demonstrates the advantage of the constructed autoencoder model. In contrast, these values are worse for the Convolutional Neural Network and the Simple Neural Network.
As shown in Table 3, the autoencoder trained the neural network with little loss and high accuracy (training loss = 0.0010, training accuracy = 0.9932). During testing, the model also achieved low loss and high accuracy (testing loss = 0.0011, testing accuracy = 0.9897). These results show that the neural network did not suffer a great loss during prediction and carried out the prediction well. As the number of iterations increases during training of the developed autoencoder model, the model produces more accurate results with

Table 2
Constructed Autoencoder model.
Layer (type) Output Shape Param #

input_23 (InputLayer) (None, 3) 0


dense_89 (Dense) (None, 14) 56
dense_90 (Dense) (None, 7) 105
dense_91 (Dense) (None, 7) 56
dense_92 (Dense) (None, 3) 24
Total params: 241
Trainable params: 241
Non-trainable params: 0
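The parameter counts in Table 2 can be checked by hand: a fully connected layer with n_in inputs and n_out outputs has n_in × n_out weights plus n_out biases. The short sketch below reproduces the "Param #" column from the layer widths in the "Output Shape" column; the helper name `dense_param_counts` is hypothetical and used only for illustration.

```python
def dense_param_counts(layer_sizes):
    """Weights plus biases for each fully connected layer."""
    return [
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


# Layer widths read from the "Output Shape" column of Table 2.
layer_sizes = [3, 14, 7, 7, 3]
counts = dense_param_counts(layer_sizes)
print(counts)       # [56, 105, 56, 24]
print(sum(counts))  # 241
```

The per-layer values and the total of 241 agree with the table.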

Table 3
Prediction results of the autoencoder model on MalwareTrainingSets dataset.
Method Metrics Values

Autoencoder Train loss 0.0010


Train accuracy 0.9932
Test loss 0.0011
Test accuracy 0.9897
Convolutional Neural Network (CNN) Train loss 1.5943
Train accuracy 0.5208
Test loss 1.6912
Test accuracy 0.5103
Simple Neural Network (SNN) Train loss 0.7479
Train accuracy 0.6613
Test loss 0.7881
Test accuracy 0.6906
Fig. 7. Constructed autoencoder model.
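Returning to the three behavioral features listed in Section 6.1, the sketch below shows one way they could be derived per internal host from a list of connection records and then normalized. It is an illustration under stated assumptions, not the author's pipeline: the record layout `(internal_ip, external_ip, bytes_sent)`, the function names, the convention of 1 MB = 1,000,000 bytes, and the use of min-max scaling are all assumed (the paper states only that the data were normalized).

```python
from collections import defaultdict


def host_features(flows):
    """Per-host (megabytes sent, connections, distinct external IPs).

    flows: iterable of (internal_ip, external_ip, bytes_sent) tuples,
    one per connection initiated by an internal host (assumed layout).
    """
    sent = defaultdict(int)
    conns = defaultdict(int)
    dests = defaultdict(set)
    for src, dst, nbytes in flows:
        sent[src] += nbytes
        conns[src] += 1
        dests[src].add(dst)
    return {
        host: (sent[host] / 1_000_000, conns[host], len(dests[host]))
        for host in sent
    }


def min_max_normalize(rows):
    """Scale every column of a list of equal-length tuples to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [
        tuple(
            (value - low) / span if span else 0.0
            for value, low, span in zip(row, lows, spans)
        )
        for row in rows
    ]


# Hypothetical records using documentation-reserved IP ranges.
flows = [
    ("10.0.0.5", "203.0.113.7", 2_000_000),
    ("10.0.0.5", "203.0.113.7", 3_000_000),
    ("10.0.0.8", "198.51.100.2", 500_000),
    ("10.0.0.8", "198.51.100.9", 500_000),
]
features = host_features(flows)
normalized = min_max_normalize(list(features.values()))
```

Here host 10.0.0.5 sends more data to a single destination, which is the pattern the third feature description associates with possible exfiltration.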


Fig. 8. Loss and accuracy functions of the proposed method.
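For reference, the two quantities tracked in Fig. 8 and Table 3 can be computed as follows. The text describes the loss as RMSE; the function names and the toy inputs are illustrative assumptions, not values from the experiments.

```python
import math


def rmse(targets, predictions):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(true_labels, predicted_labels):
    """Fraction of labels predicted correctly."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Toy values, not taken from the paper.
print(rmse([0.0, 0.0], [1.0, 1.0]))          # 1.0
print(accuracy([0, 1, 2, 2], [0, 1, 2, 1]))  # 0.75
```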

Table 4
Attacks classification accuracy.
Attack types Autoencoder KNN SVM CNN SNN Number of samples

APT1 (0) 0.9832 0.3321 0.3741 0.4233 0.7311 119


Crypto (1) 0.6088 0.5834 0.7301 0.0741 0.5912 795
OtherType Attacks (2) 0.1268 0.7342 0.5724 0.1112 0.6523 1041

minimal loss. This can be seen clearly in Fig. 8.

In a good prediction model, the testing curve should follow the training curve and stay as close to it as possible. As seen from Fig. 8, the training curve almost completely overlaps the testing curve.
Table 4 compares the accuracy of the autoencoder with that of classic classification algorithms.
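The sketch below shows how per-class values such as those in the Autoencoder column of Table 4 relate to a confusion matrix, assuming that "accuracy" there means the fraction of samples of each class that are labeled correctly (per-class recall). The row totals are the sample counts from Table 4. The diagonal counts were back-computed here as accuracy × sample count and do not come from the paper, and the off-diagonal counts are hypothetical placeholders chosen only so that each row sums to its total.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


# Rows are true classes, columns are predicted classes:
# APT1 (0), Crypto (1), OtherType Attacks (2).
# Row totals (119, 795, 1041) are the sample counts from Table 4.
# Diagonal counts are back-computed; off-diagonal counts are hypothetical.
confusion = [
    [117,   1,   1],
    [200, 484, 111],
    [500, 409, 132],
]
print([round(a, 4) for a in per_class_accuracy(confusion)])
# [0.9832, 0.6088, 0.1268]
```

Under this assumption the three Autoencoder entries of Table 4 correspond to whole numbers of correctly labeled samples (117, 484, and 132).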
As seen in Table 4, the autoencoder outperforms the other algorithms on the APT1 attack class. For example, the accuracy of the KNN algorithm on this class was 0.3321, while the autoencoder achieved 0.9832. The autoencoder also performed reasonably in classifying the Crypto (1) attack type: it recognized the points from this class with an accuracy of 0.6088, while KNN recognized them with an accuracy of 0.5834. The autoencoder performed poorly in recognizing the data belonging to the third class; in practice, it was not able to recognize the points from this class. This is because the third attack class was formed by combining tens of attacks with different characteristics. KNN produced false recognition errors and recognized this class of attacks with an accuracy of 0.7342. The same picture is observed for the other algorithms. In addition, the CNN algorithm could not work on this dataset and, as a result, could not recognize the samples of all classes.
For a better demonstration of the results, Fig. 9 visually illustrates the comparison of the methods.

Fig. 9. Comparison of the methods on MalwareTrainingSets dataset.

The confusion matrix of the autoencoder algorithm on MalwareTrainingSets is shown in Fig. 10.
As can be seen from Fig. 10, the autoencoder concentrates the data along the diagonal of the confusion matrix and produces a smaller number of false recognitions. This is the expected result. The KNN algorithm, however, produced a very high number of false recognitions and, in practice, was not able to recognize the points from the APT1 class (Fig. 11).

Fig. 10. Confusion matrix of the Autoencoder.

Fig. 11. Confusion matrix of the KNN algorithm.

7. Conclusion

APT attacks use a variety of increasingly complex methods and tools when attacking targets in order to steal confidential and sensitive information. APT is a type of advanced attack with multiple steps and specific strategies.
The presented article has proposed a deep autoencoder neural network for the classification of types of APT attacks. The gist of the proposed approach is that, while features are selected manually in existing studies, this study employs deep learning to choose informative features automatically. According to this approach, informative features are first learned by applying the autoencoder to the training data, and the softmax regression layer is then employed to classify APT types.
In future studies, we plan to develop new methods for detecting APT attacks using other types of autoencoder-based classifiers (convolutional autoencoders) on various databases. Additionally, one of the main goals is the development of new self-training methods for ranking hosts involved in data exfiltration under the influence of APT.

Author statement

Fargana J. Abdullayeva: Conceptualization, Investigation, Methodology, Implementation, Writing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported by the Science Development Foundation under the President of the Republic of Azerbaijan – Grant No. EIF-BGM-4-RFTF-1/2017–21/08/1.

References

[1] World most popular data breaches. 2015. http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/.
[2] The 15 biggest data breaches of the 21st century. https://www.csoonline.com/article/2130877/the-biggest-data-breaches-of-the-21st-century.html.
[3] Virvilis N, Gritzalis D, Apostolopoulos T. Trusted computing vs. Advanced persistent threats: can a defender win this game? In: Proc. of the IEEE 10th international conference on Ubiquitous intelligence and computing; 2013. p. 396–403.
[4] Ghafir I, Prenosil V. Advanced persistent threat attack detection: an overview. International Journal of Advances in Computer Networks and Its Security 2014;4(4):50–4.
[5] Paul Giura, Wang Wei. A context-based detection framework for advanced persistent threats. In: Proc. of the international conference on cyber security; 2012. p. 69–74.
[6] Dijk CV, Williams P. The history of artificial intelligence. Expert systems in auditing 1990;Part 1. 21-16.
[7] Hinton GE, Salakhutdinov RR. Reducing the dimensionality of data with neural networks. Science 2006;313(5786):504–7.
[8] Obaidat MS, Nicopolitidis P, Zarai F. Modeling and simulation of computer networks and systems: methodologies and applications. 2015. p. 924.
[9] Chen P, Desmet L, Huygens C. A study on advanced persistent threats. In: Proc. of the IFIP international conference on communications and multimedia security; 2014. p. 63–72.
[10] McWhorter D. Mandiant exposes APT1 - one of China's cyber espionage units & releases 3,000 indicators. 2013. https://www.fireeye.com/blog/threat-research/2013/02/mandiant-exposes-apt1-chinas-cyber-espionage-units.html.
[11] Villeneuve N. Operation Ke3chang: targeted attacks against ministries of foreign affairs. 2013.
[12] FireEye advanced threat report. 2014. p. 22.
[13] NIST SP 800-39. Managing information security risk. 2011. p. 88.
[14] Assessing outbound traffic to uncover advanced persistent threat. SANS Technology Institute; 2011. p. 35.
[15] Brewer R. Advanced persistent threats: minimising the damage. Netw Secur 2014;4:5–9.
[16] Marchetti M, Pierazzi F, Colajanni M, Guido A. Analysis of high volumes of network traffic for advanced persistent threat detection. Comput Network 2016;109:127–41.
[17] Neupane RL, Neely T, Calyam P, Chettri N, Vassell M, Durairajan R. Intelligent defense using pretense against targeted attacks in cloud platforms. Future Generat Comput Syst 2019;93:609–26.
[18] Stojanović B, Hofer-Schmitz K, Kleb U. APT datasets and attack modeling for automated detection methods: a review. Comput Secur 2020;92:1–66.
[19] Ghafir I, Hammoudeh M, Prenosil V, Han L, Hegarty R, Rabie K, Aparicio-Navarro FJ. Detection of advanced persistent threat using machine-learning correlation analysis. Future Generat Comput Syst 2018;89:349–59.
[20] Giura P, Wang W. A context-based detection framework for advanced persistent threats. In: Proc. of the international conference on cyber security; 2012. p. 69–74.
[21] Huang L, Zhu Q. A dynamic games approach to proactive defense strategies against Advanced Persistent Threats in Cyber-Physical Systems. Comput Secur 2020;89:1–16.
[22] Andrew V. Flow based analysis of advanced persistent threats: detecting targeted attacks in cloud computing. In: Proc. of the first international scientific-practical conference on problems of infocommunications science and Technology; 2014. p. 173–6.
[23] Zimba A, Chen H, Wang Z. Bayesian network based weighted APT attack paths modeling in cloud computing. Future Generat Comput Syst 2019;96:525–37.
[24] Cho DX, Nam HH. A method of monitoring and detecting APT attacks based on unknown domains. Proc. of the 13th International Symposium Intelligent Systems 2019:316–23.
[25] Zimba A, Chen H, Wang Z, Chishimba M. Modeling and detection of the multi-stages of Advanced Persistent Threats attacks based on semi-supervised learning and complex networks characteristics. Future Generat Comput Syst 2020;106:501–17.
[26] Marchetti M, Pierazzi F, Colajanni M, Guido A. Analysis of high volumes of network traffic for advanced persistent threat detection. Comput Network 2016;109:127–41.


[27] Johnson JR, Hogan EA. A graph analytic metric for mitigating advanced persistent threat. Proc. of the IEEE International Conference on Intelligence and Security Informatics 2013:129–33.
[28] Vance A. Flow based analysis of advanced persistent threats detecting targeted attacks in cloud computing. In: Proc. of the IEEE first international scientific-practical conference on problems of infocommunications science and Technology; 2014. p. 173–6.
[29] Xiao Liang, et al. Cloud storage defense against advanced persistent threats: a prospect theoretic study. IEEE J Sel Area Commun 2017;35(3):534–44.
[30] Rosenberg I, Sicard G, David EO. DeepAPT: nation-state APT attribution using end-to-end deep neural networks. Proc. of the International Conference on Artificial Neural Networks (ICANN) 2017;10614:91–9.
[31] Kora D, Simi D. Fishbone model and universal authentication framework for evaluation of multifactor authentication in mobile environment. Comput Secur 2019;85:313–32.
[32] Ramilli M. Malware Training Sets: a machine learning dataset for everyone. 2016. https://marcoramilli.com/2016/12/16/malware-training-sets-a-machine-learning-dataset-for-everyone.
