
2023 IEEE 16th International Conference on Cloud Computing (CLOUD)

Towards Confidential Computing: A Secure Cloud Architecture for Big Data Analytics and AI
Naweiluo Zhou, Florent Dufour, Vinzent Bode, Peter Zinterhof, Nicolay J Hammer, Dieter Kranzlmüller
Leibniz Supercomputing Centre (LRZ), Munich, Germany
Email: naweiluo.zhou@ieee.org, Firstname.Lastname@lrz.de

979-8-3503-0481-7/23/$31.00 ©2023 IEEE | DOI: 10.1109/CLOUD60044.2023.00042

Abstract—Cloud computing provisions computer resources in a cost-effective way based on demand. It has therefore become a viable solution for big data analytics and artificial intelligence, which have been widely adopted in various domain sciences. Data security in certain fields such as biomedical research remains a major concern when moving workflows to the cloud, because cloud environments are generally outsourced and therefore more exposed to risks. We present a secure cloud architecture and describe how it enables workflow packaging and scheduling while keeping its data, logic and computation secure in transit, in use and at rest.

Index Terms—Secure Cloud, Confidential Computing, OpenStack, Container, VM, Encryption

I. INTRODUCTION

Cloud computing offers on-demand compute and storage resources in a cost-effective way, and has therefore become a viable solution for big data analytics and Artificial Intelligence (AI). The two technologies have been applied in various domain sciences to process massive amounts of data generated by scientific experiments and simulations, and to facilitate decision making. In biomedical research, for instance, AI is utilised to detect and predict diseases, and to improve personalised medicine [1]. Concern over data security is raised when moving such applications to the cloud because of data sensitivity in these fields. Data anonymisation is a typical way to preserve data privacy; however, certain information, such as a genome, which is unique to each individual, requires further protection strategies against threats in the cloud.

Often, hardware resources are shared among users in a cloud environment so as to achieve cost efficiency and maximise resource usage, e.g. multiple virtual machines (VMs) share one physical host. Virtualisation not only offers resource efficiency, but also preserves isolation at the software level. Resource isolation reduces the likelihood of a breach and limits the scope of damage when a breach occurs [2]. Unfortunately, virtualisation also introduces new risks. For example, VMs are normally managed by hypervisors, and a compromised hypervisor can seize control of all the VMs and access their data. Data encryption plays a crucial part in preserving workflow security, while the other imperative part is to encrypt workflow logic and computation [3]. Conventionally, workflow encryption deals with its status at rest (on disk) and in transit (transfer between devices). Confidential computing [4] addresses the security of data and code in use (e.g. in memory and registers). Together, these three technologies enable a workflow to move freely from device to device: it stays encrypted and is only temporarily decrypted for use within an isolate before being encrypted again for storage [4].

In this paper, we prototype a secure cloud system that finds a trade-off between security, performance, maintenance and usability. This cloud architecture can safeguard user workflows in transit, in use and at rest. Furthermore, it offers a choice of different security levels depending on the data sensitivity. Three main encryption technologies are adopted to achieve this goal, in addition to resource isolation: encrypted containers, encrypted VMs and an encrypted storage system. The rest of the paper is organised as follows. Section II briefly reviews related work. The proposed architecture is described in Section III, followed by some preliminary results in Section IV. Lastly, Section V concludes the paper and proposes future work.

II. RELATED WORK

Höb et al. [5] presented a software framework that automatically converts Docker [6] container images into Charliecloud [7] solutions and schedules them onto high performance computing (HPC) systems. Charliecloud is a container engine specifically designed for HPC systems. We adopt the concept of automatic container image generation and scheduling. Additionally, we include container encryption and adapt the scheduling strategies for cloud systems.

Nolte et al. [8] described a secure workflow for HPC systems that aims to process sensitive data on a system that is presumed to be untrusted. Workflow security is realised by isolating compute nodes; encapsulating the workflow inside encrypted containers; and requiring decryption keys to mount data into the workflow containers. Asymmetric encryption and decryption keys are generated and distributed to users via the key management system Vault¹. Similarly, our architecture adopts the idea of container encryption and makes use of Vault to ensure application security.

¹ https://github.com/hashicorp/vault (accessed on 17/03/2023)

III. DESIGN OF THE SECURE CLOUD

This section describes our cloud architecture, as illustrated in Fig. 1, including the hardware and software stack. Table I lists the main software. The hardware incorporates a compute node cluster, storage located within the same network as the compute nodes (referred to as local storage herein, limited capacity), and storage (large capacity) located in a separate network. The software stack performs three main functions: resource provision, threat defence, and program encapsulation and scheduling.

[Figure: labelled components are the user network; the middleware system client (workflow encapsulation, workflow encryption); an ssh connection; Vault (pubkey for encryption, private key for decryption); encrypted VMs hosting the middleware system server (workflow decryption and scheduling) and the Quobyte client (access to the Quobyte data store); the compute node cluster, where each compute node has local storage; and the encrypted storage cluster running the Quobyte storage system, which is mounted and receives transferred data.]

Fig. 1. Architecture of the secure cloud powered by confidential computing technologies.

TABLE I
THE SOFTWARE STACK OF THE CLOUD ARCHITECTURE.

Software name       Description
OpenStack           Resource provision
AMD SEV             VM encryption for data in use
Middleware system   Containerisation, encryption and scheduling
Singularity         Containerisation, encryption
Vault               Key management system
Quobyte             Storage management system; data encryption at rest

A. Resource Provision

Our architecture utilises OpenStack to (1) provision the VM resources and local storage; (2) offer network isolation; and (3) enforce role-based access control and organisation-based access control.

B. Workflow Containerisation And Scheduling

The middleware system, as illustrated in Fig. 1, is composed of two parts: the middleware system client and the middleware system server. The client is distributed to users for encapsulating their workflow inside a Singularity [9] container image. Containers enable application portability and environment compatibility. Large data sets are packed in a separate LUKS (Linux Unified Key Setup) filesystem image. The server schedules the containerised workflow to multiple VM nodes via invocation of MPI (Message Passing Interface) [10], and passes the private key to decrypt the containers and LUKS images.

C. Security Measures To Defend Threats

This section describes three types of data encryption strategies. Furthermore, network separation, storage partition and isolation (III-C3) are proposed to protect data at rest.

1) Workflow Encryption In Transit: Besides ssh, which is the de facto standard for securely connecting to remote machines, workflows are encrypted during transmission from a user network to the cloud to ward off network security breaches. The middleware system client builds an encrypted Singularity image. The encryption and decryption are performed asymmetrically via an RSA key pair (i.e. pubkey and private key). The pubkey is generated and offered to users via the key management system Vault. Similarly, LUKS images are encrypted and decrypted using the same key pair.

2) Encryption In Use: Container decryption occurs only briefly at runtime within the kernel, and the container image stays encrypted when stored on disk. Vault passes the decryption key to the user’s VMs. This key only resides on the encrypted storage system mounted to the VM nodes (see Section III-C3). When a higher security level is required, which is likely to incur a performance penalty, users can choose to spawn the VM nodes powered by the AMD SEV (Secure Encrypted Virtualisation) technology [11]. AMD SEV is a pioneering technology towards confidential computing through memory encryption, which isolates VMs from their hypervisor. When a hypervisor reads the VM memory, it only sees encrypted bytes. This protects VMs from malicious administrators and defends them against attacks from a compromised hypervisor.

3) Encryption At Rest: The storage is split into two isolated networks for security reasons. The local storage is a general filesystem where a user’s workflow at rest is encrypted inside a Singularity image and a LUKS image. Data requiring long-term and more secure storage can be moved to the Quobyte storage system, where the disk is encrypted with the symmetric AES-XTS [12] algorithm using a 128-bit or 256-bit cryptographic key. Quobyte² also partitions the storage into isolated domains, which enables organisation-wise access restrictions. Without Quobyte, user workflows experience reduced security.

² https://www.quobyte.com/ (accessed on 13/03/2023)

IV. PRELIMINARY PERFORMANCE ANALYSIS

The preliminary results show the performance costs of the two encryption technologies in our architecture: Quobyte system encryption and Singularity encryption. To evaluate the encryption cost introduced by Quobyte, we measure the change in I/O bandwidth using the IOR benchmark³. The execution time difference caused by Singularity encryption is evaluated with an MPI benchmark, BPMF [13]. The BPMF benchmarks are containerised inside Singularity.

³ https://ior.readthedocs.io/en/latest/intro.html (accessed on 13/03/2023)
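Each of the two costs measured in this section can be expressed as a single relative figure. The sketch below is illustrative only: the function names and the sample values are hypothetical and are not taken from the paper's measurements.

```python
def bandwidth_loss(baseline_mib_s: float, encrypted_mib_s: float) -> float:
    """Fraction of write bandwidth lost when encryption is enabled."""
    if baseline_mib_s <= 0:
        raise ValueError("baseline bandwidth must be positive")
    return (baseline_mib_s - encrypted_mib_s) / baseline_mib_s


def runtime_overhead(plain_s: float, encrypted_s: float) -> float:
    """Fractional increase in execution time caused by encryption."""
    if plain_s <= 0:
        raise ValueError("baseline execution time must be positive")
    return (encrypted_s - plain_s) / plain_s


# Hypothetical sample values, chosen only to exercise the two functions.
print(f"bandwidth loss:   {bandwidth_loss(500.0, 490.0):.1%}")
print(f"runtime overhead: {runtime_overhead(200.0, 203.0):.1%}")
```

A value close to zero for either figure would correspond to what this section calls a negligible or non-visible cost.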

Fig. 2 compares the I/O bandwidth of the local storage and the Quobyte-managed storage (with 128-bit AES-XTS encryption and with encryption disabled). Only write operations are shown. The local storage achieves a higher bandwidth than the Quobyte-managed storage, but the difference is insignificant. Contrary to a priori expectations, the performance impact of Quobyte encryption is negligible. This may stem from network latency, which costs more than encryption. Fig. 3 presents the execution time of BPMF with and without Singularity encryption. No visible performance degradation is introduced by container encryption.
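One way to see why latency could mask the cost of encryption is a simple per-transfer model, in which the time of one write is a fixed network latency plus the payload size divided by the raw storage throughput. The sketch below is an assumption made for illustration, not the authors' analysis; the function name and every numeric value are hypothetical.

```python
def observed_bandwidth(size_mib: float, latency_s: float, raw_mib_s: float) -> float:
    """Effective bandwidth of one write: size / (latency + size / raw throughput)."""
    return size_mib / (latency_s + size_mib / raw_mib_s)


# Hypothetical parameters: a 0.4 MiB transfer, 2 ms of network latency, and a
# raw throughput that drops by 25% once encryption is switched on.
size_mib, latency_s = 0.4, 0.002
plain = observed_bandwidth(size_mib, latency_s, raw_mib_s=2000.0)
encrypted = observed_bandwidth(size_mib, latency_s, raw_mib_s=1500.0)

# Relative gap that a benchmark would report between the two configurations.
observed_gap = 1.0 - encrypted / plain
print(f"observed gap: {observed_gap:.1%}")
```

With these assumed values the 25% raw slowdown shrinks to a gap of about 3% in observed bandwidth, because the fixed latency term dominates the transfer time.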
[Figure: write bandwidth (MiB/s, axis from 0 to 800) against transfer size (400k, 800k, 1600k, 3200k, 6400k) for six series: local storage (1P), local storage (10P), quobyte (1P), quobyte (10P), quobyte AES-XTS (1P), quobyte AES-XTS (10P).]

Fig. 2. The I/O bandwidth comparison for the normal local filesystem and the Quobyte storage system (with and without 128-bit AES-XTS encryption). The bandwidth shows the operations of transferring various sizes of data blocks with 1 process and 10 processes. Local storage refers to the storage within the same network as the compute node.

[Figure: bar chart of execution time (seconds, axis from 0 to 400) for encrypted and unencrypted containers with 1, 2, 4, 6, 8 and 10 processes.]

Fig. 3. Execution time of BPMF encapsulated in encrypted and unencrypted containers running with different process numbers on one single node.

V. CONCLUSION AND FUTURE WORK

This article presented a secure cloud architecture that safeguards user workflows and enables moving them freely between user and cloud. A salient feature of this cloud system is that it offers the flexibility to provision resources as a general purpose cloud or a highly-secured cloud to host domain science workflows that have different data sensitivity.

Next, the cost of VM encryption, which is expected to have a non-trivial impact on performance, will be benchmarked. To alleviate the performance decline, future work will focus on scalability of compute resources and introduce GPU support. Performance evaluation will be carried out to compare the scheduling efficiency of applications in big data analytics and AI when different security levels are featured.

ACKNOWLEDGMENT

We gratefully acknowledge the support of the Bavarian State Ministry of Health and Care, which funded this work through DigiMed Bayern (Grant No: DMB-1805-0001) within its Masterplan Bayern Digital II.

REFERENCES

[1] A. Harvey, A. Brand, S. T. Holgate, L. V. Kristiansen, H. Lehrach, A. Palotie, and B. Prainsack, “The future of technologies for personalised medicine,” New Biotechnology, vol. 29, no. 6, pp. 625–633, 2012. Molecular Diagnostics & Personalised Medicine.
[2] A. Randal, “The Ideal Versus the Real: Revisiting the History of Virtual Machines and Containers,” ACM Comput. Surv., vol. 53, Feb. 2020.
[3] M. Barika, S. Garg, A. Y. Zomaya, L. Wang, A. V. Moorsel, and R. Ranjan, “Orchestrating Big Data Analysis Workflows in the Cloud: Research Challenges, Survey, and Future Directions,” ACM Comput. Surv., vol. 52, Sept. 2019.
[4] D. P. Mulligan, G. Petri, N. Spinale, G. Stockwell, and H. J. M. Vincent, “Confidential Computing – a brave new world,” in 2021 International Symposium on Secure and Private Execution Environment Design (SEED), pp. 132–138, 2021.
[5] M. Höb and D. Kranzlmüller, “Enabling EASEY Deployment of Containerized Applications for Future HPC Systems,” in Lecture Notes in Computer Science, pp. 206–219, Springer International Publishing, 2020.
[6] D. Merkel, “Docker: Lightweight Linux Containers for Consistent Development and Deployment,” Linux J., vol. 2014, pp. 76–90, Mar. 2014.
[7] R. Priedhorsky and T. Randles, “Charliecloud: Unprivileged Containers for User-Defined Software Stacks in HPC,” in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 17, (New York, NY, USA), Association for Computing Machinery, 2017.
[8] H. Nolte, S. H. S. Sabater, T. Ehlers, and J. Kunkel, “A Secure Workflow for Shared HPC Systems,” in 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 965–974, May 2022.
[9] G. M. Kurtzer, V. V. Sochat, and M. Bauer, “Singularity: Scientific containers for mobility of compute,” in PloS one, (San Francisco, California, United States), PLOS, 2017.
[10] Message Passing Interface Forum, “MPI: A Message-Passing Interface Standard,” June 2021.
[11] “Strengthening VM isolation with integrity protection and more,” tech. rep., AMD, Jan. 2020.
[12] “The XTS-AES Tweakable Block Cipher,” Apr. 2008.
[13] T. V. Aa, I. Chakroun, and T. Haber, “Distributed Bayesian probabilistic matrix factorization,” in CLUSTER, pp. 346–349, IEEE Computer Society, 2016.
