A Syscall-Level Binary-Compatible Unikernel
Abstract—Unikernels are minimal single-purpose virtual machines. They are highly popular in the research domain due to the benefits they provide. A barrier to their widespread adoption is the difficulty or impossibility of porting existing applications to current unikernels.
HermiTux is the first unikernel providing system call-level binary compatibility with Linux applications. It is composed of a hypervisor
and a lightweight kernel layer emulating the load- and runtime Linux ABI. HermiTux relieves application developers from the burden of
porting software, while providing unikernel benefits such as security through hardware-assisted virtualized isolation, swift boot time,
and low disk/memory footprint. Fast system calls and kernel modularity are enabled through binary rewriting and analysis techniques,
as well as shared library substitution. HermiTux’s design principles are architecture-independent and we present a prototype on both
the x86-64 and ARM aarch64 ISAs, targeting various cloud as well as edge/embedded deployments. We demonstrate HermiTux’s
compatibility over a range of native C/C++/Fortran/Python Linux applications. We also show that it offers a similar degree of
lightweightness compared to other unikernels, and that it performs similarly to Linux in many cases: its performance overhead averages
3% in memory- and compute-bound scenarios, and its I/O performance is acceptable.
[11] offer such binary compatibility by interfacing at the level of the C library, acting similarly to a dynamic loader. This prevents them from supporting the wide range of applications requiring OS services through system calls made without going through the C library. Contrary to these works, and in order to maximize compatibility, HermiTux is compatible at the system call level, which is a standardized interface used by all applications and libraries compiled for Linux.

The first challenge HermiTux tackles is how to provide system call-level binary compatibility. To that end, HermiTux sets up the execution environment and emulates OS interfaces at runtime in accordance with Linux's Application Binary Interface (ABI). A custom hypervisor-based ELF loader is used to run a Linux binary alongside a minimal kernel in a single address space Virtual Machine (VM). System calls made by the program are redirected to the implementations the unikernel provides. A second challenge HermiTux faces is how to maintain unikernel benefits while providing such binary compatibility. Some come naturally (small disk/memory footprints, virtualization-enforced isolation), while others (fast system calls and kernel modularity) pose technical challenges when assuming no access to sources. To enable such benefits, HermiTux uses binary rewriting and analysis techniques for static executables, and substitutes at runtime a unikernel-aware C library for dynamically linked executables. Finally, HermiTux is optimized for low disk/memory footprint and attack surface, which are as low as or lower than those of existing unikernel models.

Because of the wide range of unikernel application cases, HermiTux aims to be compatible with both server and embedded virtualization scenarios. Thus, our system is developed for the Intel x86-64 and ARM aarch64 (ARM64) Instruction Set Architectures (ISAs). The fundamental principles of HermiTux's design are architecture independent. However, its implementation, as well as the design of the binary rewriting/analysis techniques we use to bring back unikernel benefits, are ISA specific. These differences are described in this paper.

The contributions presented in this paper are:

- A new unikernel model designed to execute native Linux executables while maintaining the classical unikernel benefits;
- Two prototype implementations of that model on the x86-64 and aarch64 architectures;
- An evaluation of these prototypes comparing their performance to Linux, containers, and other unikernel models: OSv [10], Rump [9] and Lupine Linux [11].

This paper is organized as follows: we give some background and motivation in Section 2. In Section 3, we present the design of HermiTux, then give implementation details in Section 4. A performance evaluation is presented in Section 5. We present related works in Section 6, before concluding in Section 7.

2 BACKGROUND AND MOTIVATION

2.1 Unikernels

A unikernel [2] is an application statically compiled with the necessary libraries and a thin OS layer into a binary able to be executed as a virtualized guest on top of a hypervisor. Unikernels are qualified as: (A) single purpose: a unikernel contains only one application; and (B) single address space: because of (A), there is no need for memory protection within the unikernel; consequently, the application and the kernel share a single address space and all the code executes with the highest privilege level.

Such a model provides significant benefits. In terms of security, the strong isolation between unikernels provided by the hypervisor makes them good candidates for cloud deployments. Moreover, a unikernel contains only the necessary software needed to run a given application. Combined with the very small size of the kernel, this leads to a significant reduction in the application attack surface compared to regular VMs [12]. Some unikernels are also written in languages providing memory-safety guarantees [2]. Concerning performance, unikernel system calls are fast because they are common function calls: there is no costly world switch between privilege levels [5]. Context switches are also swift as there is no page table switch or TLB flush. In addition to the codebase reduction due to small kernels, unikernel OS layers are generally modular: it is possible to configure them to include only the necessary features for a given application. Small size and modularity lead to a reduction in resource usage (RAM, disk), which translates into cost reduction for the cloud user, and high per-host VM density for the cloud provider [1].

All these benefits mean that the application domains for unikernels are plentiful. They are a perfect fit for the datacenter [1], [2], which runs the majority of cloud applications requiring a high degree of isolation, or compute-intensive jobs necessitating high performance and low OS overheads. Furthermore, the reduced resource usage of unikernels makes them uniquely suited for embedded virtualization [4], [12], a domain of growing importance with the emergence of paradigms such as edge computing and IoT. Because the application domains of unikernels include both server and embedded machines, the system presented in this paper targets two ISAs: Intel x86-64, which is unarguably the dominant architecture in the datacenter, and aarch64, widely used in embedded devices.

2.2 Porting Existing Applications to Unikernels

Porting existing software to run as a unikernel in order to reap these benefits can be difficult or even impossible. First, in some situations, the unavailability of an application's sources (proprietary software) makes porting it to any existing unikernel impossible, as all require recompilation/relinking. Second, porting legacy software to a unikernel that supports only modern programming languages requires a full application rewrite in that target language [2], which in many scenarios is unacceptable. Third, considering the unikernels supporting legacy languages, the task still represents a significant challenge [1], [6], [8] for multiple reasons. A given unikernel supports a limited set of kernel features and software libraries. If a feature, library, or a particular version of a library required by an application is not supported, the application would need to be adapted [6]. In many cases the lack of a feature/library means that the application cannot be ported at all. Moreover, unikernels
use complex build infrastructures, and it can be burdensome to port the large build infrastructures of some legacy applications (large/generated Makefiles, autotools/cmake environments) to unikernel toolchains. The same goes for changing the compiler or build options.

We believe that this large porting cost, combined with the fact that it is the responsibility of the application programmer, represents an important factor explaining the slow adoption of unikernels in the industry. One solution is to have a unikernel provide binary compatibility for regular executables while still keeping the classical unikernel benefits such as small codebase/footprint, fast boot times, modularity, etc. This new model allows unikernel developers to work on generalizing the unikernel layer to support a maximum number of applications, and relieves application developers from any porting effort. Such an approach should also support developer tools such as debuggers. In that context, HermiTux allows running Linux binaries as unikernels, while maintaining the aforementioned benefits.

2.3 The Lightweight Virtualization Design Space

The lightweight virtualization design space includes unikernels, security-oriented LibOSes such as Graphene [13], [14], and containers with software [15] and hardware [16] hardening techniques. HermiTux requires no application porting effort, and further differs from other binary-compatible systems for two reasons. First, as a unikernel, HermiTux runs in hardware-enforced (Extended Page Tables) VMs, an isolation mechanism that is fundamentally stronger than software-enforced isolation (containers/software LibOS). This is shown by the current trend of running containers within VMs for security (clear containers [17]) and the efforts to strengthen containers' isolation (such as gVisor [15]). This is generally used as a security argument in favor of unikernels versus containers [1], [18]. Second, SGX-based isolation such as that used in Graphene-SGX [14] has a non-negligible performance impact that is fundamentally higher than the very low performance overhead of the direct-execution, hardware-enforced virtualization techniques leveraged in HermiTux.

HermiTux enables a wide range of applications to transparently reap unikernel benefits without any porting effort: there is no need for code modification and the potential complexities of maintaining a separate branch. Given the security and footprint reduction features provided by unikernels, this is highly valuable in today's computer systems landscape, where software and hardware vulnerabilities regularly make the news, and where datacenter architects are seeking ways to increase consolidation and reduce resource/energy consumption. Being binary compatible makes HermiTux the only way to run proprietary software (whose sources are not available) as unikernels. Finally, HermiTux allows commodity software to reap traditional benefits of VMs such as checkpoint/restart or migration without the associated overhead of a large disk/memory footprint.

2.4 System Call-Level Binary Compatibility

Two existing unikernels already claim binary compatibility with applications, OSv [10] and Lupine Linux [11]. It is important to note that both offer binary compatibility at the standard C Library (libc) level: the unikernel includes a dynamic loader that catches at runtime the calls to libc functions such as printf or fopen and redirects them to the kernel.

Such a method of interfacing implies the assumption that all syscalls are made through the libc, which does not hold true when considering the wide variety of modern application binaries. We analyzed the entirety of the Debian 10 x86-64 repositories (main, contrib and non-free) and counted 553 ELF executables including at least one invocation of the syscall instruction: these represent programs that perform system calls without going through the libc, and that as such would not be supported by libc-level binary compatible unikernels. This limited libc-level compatibility prevents these systems from running a relatively large range of applications that would highly benefit from execution as unikernels. To give just a few examples, a plethora of cloud services are written in Go, a language that performs most system calls without going through a standard C library. Furthermore, due to the lack of compatibility at the system call level, OSv does not support the most popular HPC shared memory programming framework, OpenMP (https://github.com/cloudius-systems/osv/issues/590). Finally, libc-interfacing precludes support for static binaries.

HermiTux represents an attempt to push the degree of compatibility of unikernels further by interfacing at a much more standard and universally used interface: the system call level.

3 SYSTEM DESIGN

The design of HermiTux is based on the following assumptions: we assume that the sources of the binaries we consider are unavailable. We make no assumption about the compiler used, the level of optimizations, or whether the binary is stripped or obfuscated. Thus, disassembling and reassembling, generally considered quite unreliable [19], [20], is not a suitable solution. We rather decide to offer binary compatibility with a commodity OS, Linux.

The Linux ABI. To offer binary compatibility, HermiTux's kernel needs to comply with the set of rules constituting the Linux ABI [21]. These rules are partially ISA-specific, and can be broadly classified into load-time and runtime rules. Load-time rules include the binary format supported (ELF), which area of the 64-bit address space is accessible to the application, the method of setting up the address space by loading segments from the binary file, and the particular register state (ISA-specific) and stack layout (command line arguments, environment variables, ELF auxiliary vector) expected at the application entry point. Runtime rules include the instruction used to trigger a system call, and the registers containing its arguments and return value: these are obviously ISA-specific. Finally, Linux applications also expect to communicate with the OS by reading and writing various virtual filesystems (/proc, /sys, etc.) [22] as well as through a memory area shared with the kernel: the vDSO/vsyscall.
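To make the runtime rules concrete, the following minimal program (an illustration, not part of HermiTux) performs a raw write system call on x86-64 exactly as a libc-bypassing binary would, with the number in %rax, the arguments in %rdi, %rsi, and %rdx, and the return value in %rax:

```c
/* A raw write(2) system call on x86-64, made without any libc wrapper.
 * The kernel clobbers %rcx and %r11 as part of the syscall mechanism. */
int main(void) {
    static const char msg[] = "raw syscall\n";
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)                /* return value in %rax */
                      : "a"(1L),                 /* __NR_write on x86-64 */
                        "D"(1L),                 /* fd: stdout, in %rdi */
                        "S"(msg),                /* buffer, in %rsi */
                        "d"(sizeof(msg) - 1)     /* length, in %rdx */
                      : "rcx", "r11", "memory");
    return ret == sizeof(msg) - 1 ? 0 : 1;
}
```

A syscall-level compatible kernel must correctly serve this invocation even though no C library wrapper is involved.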
3.1 System Overview

HermiTux's design objective is to emulate the Linux ABI at load- and runtime while providing unikernel principles. Load
small subset of that interface [28]. It has also been shown that one can support 90% of a standard distribution's binaries by implementing as few as 200 system calls [22]. To support the applications presented in the evaluation section (Section 5), our prototype implements 107 system calls. Source-compatible unikernels such as OSv [10] or Rumprun [9] also showed that supporting a relatively large portion of the Linux system call API does not lead to a significantly large codebase size or attack surface.

3.4 Unikernel Benefits & Isolation

System call latency in unikernels is low as they are common function calls. Despite a system call handler optimized for the unikernel context, we observed that in HermiTux this latency still does not approach that of function calls: this is due to the instruction used to perform a system call in unmodified Linux binaries. In both the x86-64 and aarch64 ISAs, that instruction relies on an exception. The latency of such an operation is significantly higher than that of a common call instruction.

Without assuming access to the application sources, we rely on two techniques to offer fast system calls in HermiTux (Fastcall in Fig. 1). For static binaries, we use binary instrumentation to rewrite the system call instructions found in the application with regular function calls to the corresponding unikernel implementations. This process is ISA-specific and is detailed for both x86-64 and aarch64 in Section 4. For dynamically linked programs, we observe that in a vast majority of application binaries, most of the system calls are made by the standard C library. With that in mind, in HermiTux dynamic binaries are linked at load time against a unikernel-aware C standard library that we designed, in which all the system calls are replaced by function calls to the kernel. We call this technique library substitution. It bears a resemblance to the way the OSv [10] kernel and LibC interface with an application; however, in our case we do not write a Libc from scratch but automatically adapt an existing one using a code transformation tool, Coccinelle [29]. Compared to writing a Libc from scratch, we believe this solution not only provides larger support for Libc functionalities, but is also more robust and future proof.

Modularization is another important benefit of unikernels. Because of our binary compatibility goal, the system call codebase is relatively large in HermiTux. We design the kernel so that the implementation of each system call can be compiled in or out at kernel build time. In addition to memory footprint reduction, this has security benefits that are stronger than traditional system call filtering (such as seccomp): not only is it impossible to call the concerned system calls, but the fact that their implementation is entirely absent from the kernel means that they cannot be used in code-reuse attacks. To compile a tailored kernel for a given application whose sources are not necessarily available, we designed a binary analysis tool able to scan an executable and detect the various system calls that can be made by that program.

HermitCore forwarding filesystem calls to the host raises obvious concerns about security/isolation. We implemented a basic RAM filesystem, MiniFS, within HermiTux's kernel, disabling any dependence on the host in that regard. Building a full-fledged filesystem is out of the scope of this work; however, MiniFS's simple implementation is sufficient to run with honorable performance the benchmarks used in our performance evaluation (see Section 5), including Postmark. MiniFS also emulates pseudo files with configurable per-file read and write functions: we emulate /dev/zero with a custom read function filling the user buffer with zeros, /proc/cpuinfo with a read function populating the buffer with Linux-like textual information about the CPU, etc.
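The following sketch illustrates how such pseudo files can be represented with per-file callbacks; the structure and function names are hypothetical and do not correspond to HermiTux's actual sources:

```c
/* Hypothetical sketch of MiniFS-style pseudo files: each file carries
 * its own read/write callbacks instead of backing data. */
#include <string.h>
#include <sys/types.h>

struct pseudo_file {
    const char *path;
    ssize_t (*read)(char *buf, size_t len, off_t off);
    ssize_t (*write)(const char *buf, size_t len, off_t off);
};

/* /dev/zero: every read fills the user buffer with zeros. */
static ssize_t zero_read(char *buf, size_t len, off_t off) {
    (void)off;                /* position is meaningless for /dev/zero */
    memset(buf, 0, len);
    return (ssize_t)len;
}

static const struct pseudo_file dev_zero = {
    .path  = "/dev/zero",
    .read  = zero_read,
    .write = NULL,            /* writes are simply discarded */
};
```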
4 IMPLEMENTATION

HermiTux is built on top of HermitCore [5], with 15K additional LoC on top of HermitCore's 20K LoC. It supports both x86-64 and aarch64. Although our system's design principles are architecture independent, a (small) subset of its implementation is architecture-specific.

Loading and Initialization. The hypervisor sets up the VM and loads both the kernel and the application in memory, according to the ELF metadata in the binaries. If the application supports PIC/PIE, it is loaded at a random location. Next, the kernel initializes and creates a page table defining a single address space. After initialization, the kernel creates a task for the application. The application will share its stack with the kernel, so the stack is filled with elements (command line parameters, etc.) according to the ABI convention, with a series of push operations on x86-64. Doing the same is not practical on aarch64, as this ISA only supports 16-byte-aligned push operations, and many elements we wish to push are 8 bytes in size. Thus, we fill a temporary buffer that ends up being copied to the stack with potentially one byte of padding.

System Call Handling. The kernel installs and implements a system call handler adhering to the Linux ABI: system call number in %rax/%x8; arguments in %rdi, %rsi, %rdx, %r10, %r8, %r9/%x0-%x5; and return value in %rax/%x0 for x86-64/aarch64, respectively. The handler saves the registers' contents, calls the implementation of the invoked system call, and restores the registers before returning. It is optimized: many 'world switch' operations are unnecessary in a unikernel (for example stack switches). When returning, we can also avoid costly instructions such as sysret on x86-64 and replace it with a simple jump.
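A minimal sketch of such a handler's dispatch stage is shown below, with a hypothetical table layout and sys_* stand-ins rather than HermiTux's actual code; because the application and the kernel share a single address space and privilege level, dispatch reduces to an ordinary C function call:

```c
/* Illustrative syscall dispatcher for a unikernel: the saved register
 * values arrive as plain C arguments and the sys_* implementation is
 * reached through a regular function call, with no privilege switch. */
#include <stddef.h>
#include <errno.h>

typedef long (*syscall_fn)(long, long, long, long, long, long);

/* Stand-in for the kernel's implementation. */
static long sys_write(long fd, long buf, long len, long a3, long a4, long a5) {
    (void)fd; (void)buf; (void)a3; (void)a4; (void)a5;
    return len;
}

static const syscall_fn syscall_table[] = {
    [1] = sys_write,        /* x86-64 Linux numbering: __NR_write = 1 */
    /* ... one entry per supported system call ... */
};

long syscall_dispatch(long nr, long a0, long a1, long a2,
                      long a3, long a4, long a5) {
    size_t max = sizeof(syscall_table) / sizeof(syscall_table[0]);
    if (nr < 0 || (size_t)nr >= max || syscall_table[nr] == NULL)
        return -ENOSYS;     /* same error convention as Linux */
    return syscall_table[nr](a0, a1, a2, a3, a4, a5);
}
```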
HermiTux currently supports 107 system calls. Many are only partially supported; for example, ioctl only supports the necessary commands for LibC initialization. 4K LoC are dedicated to the system call layer, showing that HermiTux can keep a small unikernel codebase while supporting a wide range of applications, as presented in the evaluation section. With the supported system calls, HermiTux is able to emulate Linux's support of networking, filesystem, multithreading and synchronization, memory mappings, process management, break management, signals, time management, and scheduling.

Fast System Calls. In its basic form, HermiTux uses a traditional system call handler and thus loses the unikernel benefit of low-latency system calls. To recover that feature for dynamically compiled binaries, we link at runtime against a unikernel-aware standard C Library. The
unikernel-aware C library loaded at runtime with dynamic binaries is adapted from Musl Libc. We use the Coccinelle [29] tool to describe high-level code transformation rules updating system call invocations into function calls to HermiTux's kernel. With a small set of rules (80 lines), we are able to update 97.5% of the 500+ system call invocations within the entire library. We confirmed the success of this method over different versions of Musl released multiple years apart.
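The following simplified before/after sketch illustrates the kind of transformation these rules perform on a three-argument syscall wrapper; Musl's actual __syscall machinery is more involved, and sys_write here is a stand-in for the kernel's implementation, not HermiTux's real symbol:

```c
/* Before substitution: a libc wrapper ends in a trap-based syscall. */
static long syscall3(long nr, long a, long b, long c) {
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a), "S"(b), "d"(c)
                      : "rcx", "r11", "memory");
    return ret;
}

long write_before(int fd, const void *buf, unsigned long len) {
    return syscall3(1 /* __NR_write */, fd, (long)buf, (long)len);
}

/* After substitution: the invocation becomes a direct call to the
 * unikernel's implementation (stand-in below), resolved like any
 * other function at load time, with no trap. */
static long sys_write(long fd, long buf, long len) {
    (void)fd; (void)buf;
    return len;
}

long write_after(int fd, const void *buf, unsigned long len) {
    return sys_write(fd, (long)buf, (long)len);
}
```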
Regarding static binaries, we resort to binary rewriting, realized statically to avoid any runtime overhead. Our goal is to replace the occurrences of the system call instruction (syscall for x86-64, SVC for aarch64) with function calls to the kernel. For x86-64, an ISA whose instructions have variable sizes, the main challenge lies in the small size of syscall: 2 bytes. It is too small to allow replacement by any kind of call or jump-like instruction without overwriting the next instruction(s) in the code segment. To address that issue, we overwrite each occurrence of the syscall instruction as well as the next instruction(s) with a jmp to a snippet of code that we developed. This code is in charge of first adapting the Linux syscall ABI convention to the System V function call convention (i.e., moving %r10 to %rcx). Second, the system call implementation in the kernel is invoked with a common function call instruction, callq. Finally, the instructions following the syscall that were originally overwritten are replayed, before jumping back to the instruction following the last overwritten instruction. Although this process includes a few operations in addition to the function call, it is still much faster than a traditional system call invocation.
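As an illustration, the sketch below shows the spirit of such a generated snippet in a simplified, runnable form; all symbol names are invented for the example, and for brevity the snippet tail-jumps to the implementation instead of performing the call, the replay of the overwritten instructions, and the jump back:

```c
/* Sketch of a per-site "Fastcall" snippet on x86-64 (GNU assembler
 * syntax). The rewriter replaces `syscall` plus the following
 * instruction(s) with a 5-byte `jmp fastcall_snippet`. */
long sys_write_impl(long fd, const void *buf, unsigned long len);

__asm__(
    ".text\n"
    ".globl fastcall_snippet\n"
    "fastcall_snippet:\n"
    "    mov %r10, %rcx\n"       /* Linux syscall ABI -> SysV call ABI */
    "    jmp sys_write_impl\n"   /* plain control transfer, no trap;
                                  * the real snippet uses call, then
                                  * replays the overwritten bytes and
                                  * jumps back to the call site */
);

long sys_write_impl(long fd, const void *buf, unsigned long len) {
    (void)fd; (void)buf;
    return (long)len;            /* stand-in for the kernel's code */
}

extern long fastcall_snippet(long fd, const void *buf, unsigned long len);

int main(void) {
    /* Invoke the snippet as the rewritten call site would. */
    return fastcall_snippet(1, "hi", 2) == 2 ? 0 : 1;
}
```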
On the other hand, aarch64 is a fixed-size instruction set and thus does not suffer from the same issue as x86-64. Intuitively, the system call instruction SVC can simply be overwritten with a function call, i.e., a BL (Branch and Link), without side effects on the following instructions. The actual challenge for aarch64 lies in an important ABI point: contrary to x86-64, which stores the return address on the stack, aarch64 holds it in the special register %x30. Thus, when we replace the system call instruction SVC with its function call counterpart BL, BL overwrites %x30, which holds the return address of the current function, with the address that the newly inserted function (i.e., the system call implementation) is supposed to return to.

In that context, one may think that we would lose the possibility to return from functions invoking system calls, which of course would break the program. However, we realized that in many cases this issue could be tackled without resorting to a complex solution such as the one we used for x86-64. First, in the common case, a function invoking a system call also calls other functions, which mandates that the value of %x30 is saved on the stack by the compiler-generated code and restored at the time of returning from the function in question; thus, even if overwriting SVC with BL loses the value of %x30, it will be properly restored before returning. Second, in the relatively common case where SVC is directly followed by a return instruction RET, SVC can be overwritten with a simple branch instruction B: as this function preserves %x30, when the system call implementation returns, it will simply return to the function initially invoking the system call. Third, a small number of system calls such as exit never return, thus losing %x30 is acceptable. Combined, these three cases cover more than 90% of the system call invocations in a standard libc (Musl). We used the angr [30] binary analysis tool to identify them and perform safe replacements of system calls by function calls. The remaining 10% of syscalls go through the standard trap-based handling mechanism.

System Call-Based Modularity. With the growing support for the Linux ABI, the subset of HermiTux's codebase that concerns system call implementation is relatively large: it currently represents about 25% of the entire unikernel codebase. To bring back the "modularity" feature of unikernels into HermiTux, we propose the compilation of tailored kernels containing only the implementation of the system calls required for an application. This is achieved by having as much as possible of each system call's processing code implemented within its own compilation unit (C source file), and using preprocessor directives to enable/disable calls to the system call implementations (sys_*) from the generic system call handler as necessary at build time.
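The sketch below illustrates this build-time toggling; the configuration macro name is hypothetical:

```c
/* Each sys_* lives in its own compilation unit; the generic handler
 * only references it when the corresponding build option is enabled.
 * CONFIG_SYS_GETPID is a hypothetical configuration macro. */
#include <errno.h>

long sys_getpid(void);   /* defined in its own C file, compiled in or out */

long dispatch_getpid_example(long nr) {
    switch (nr) {
#ifdef CONFIG_SYS_GETPID
    case 39:             /* __NR_getpid on x86-64 */
        return sys_getpid();
#endif
    default:
        /* When compiled out, the implementation is entirely absent
         * from the kernel image: it cannot be invoked, nor reused in
         * a code-reuse attack. */
        return -ENOSYS;
    }
}
```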
To leverage this functionality and build a kernel tailored for a given application, it is necessary to know the entire set of system calls that can possibly be invoked by the said application. To that aim, we decide to rely on static analysis. As we do not assume access to the application source code, we resort to decompiling the binary. Armed with the knowledge of the system call invocation ABI convention, we explore system call sites in the decompiled machine code and determine at these points what value is present in the register holding the system call identifier: %rax for x86-64 and %x8 for aarch64. This technique works for both statically and dynamically compiled binaries, as for the latter it can be applied to the application binary as well as to libraries. We use Dyninst [31] for x86-64 and angr [30] for aarch64 to decompile the binaries and obtain the CFG. We iterate on the instruction flow backwards until we find the value loaded in %rax/%x8 identifying a system call. In the vast majority of cases this identifier comes from a constant in the original code and this search is straightforward. For Glibc, we found one call site where this value came from memory, making it impossible to identify statically. Looking at the corresponding C code allows us to easily determine that it was in fact a read system call. To tackle such scenarios, we created a lookup table that returns the system calls being made by library functions that contain such statically unidentifiable system calls.
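As a toy illustration of this search, the program below scans raw x86-64 bytes for syscall opcodes (0x0F 0x05) and checks whether the preceding instruction is a mov eax, imm32 (opcode 0xB8) loading the identifier; the real tools walk the CFG produced by Dyninst or angr rather than scanning linearly:

```c
/* Toy identifier search over raw x86-64 code bytes. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static void find_syscall_ids(const uint8_t *code, size_t len) {
    for (size_t i = 0; i + 1 < len; i++) {
        if (code[i] != 0x0F || code[i + 1] != 0x05)
            continue;                      /* not a syscall instruction */
        if (i >= 5 && code[i - 5] == 0xB8) {
            uint32_t id;
            memcpy(&id, &code[i - 4], 4);  /* little-endian immediate */
            printf("site at +%zu: system call %u\n", i, id);
        } else {
            printf("site at +%zu: not statically identifiable here\n", i);
        }
    }
}

int main(void) {
    /* mov eax, 60 (__NR_exit); syscall */
    const uint8_t code[] = { 0xB8, 0x3C, 0x00, 0x00, 0x00, 0x0F, 0x05 };
    find_syscall_ids(code, sizeof(code));
    return 0;
}
```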
In addition to the syscall-based modularity, we also enabled modularized coarse-grained components that were originally (in HermitCore) included in all builds, such as the LWIP TCP/IP stack.

5 EVALUATION

The objective of the performance evaluation is to answer the following questions: First, can HermiTux run native Linux binaries while still keeping the lightweightness benefits of unikernels, i.e., low disk/memory footprints and fast boot times? How does it compare to other lightweight virtualization solutions regarding these metrics? (Section 5.1). Second, as we focus on native/legacy executables, can HermiTux execute binaries that are written in different languages, stripped, obfuscated, compiled
with full optimizations and different compilers/libraries? (Section 5.2). Finally, how does HermiTux's performance compare with other lightweight virtualization solutions? (Section 5.3).

We evaluate HermiTux over multiple macro- and micro-benchmarks. The proposed system is compared to several lightweight virtualization solutions, including a Linux VM running an Alpine distribution on top of the Firecracker hypervisor, Docker, and three unikernel models focusing on compatibility with existing applications: Lupine Linux [11], OSv [10] and Rumprun [9]. For each of them we use the latest version available on their respective git repositories. Note that contrary to HermiTux, none of these unikernels are binary compatible with Linux at the system call level (Lupine's "pure" unikernel form is enabled by Kernel Mode Linux, KML (http://www.yl.is.s.u-tokyo.ac.jp/tosh/kml/), which forces the interfacing to take place at the level of the libc). Lupine and OSv run on top of Firecracker, and Rump on top of Solo5 for compatibility reasons. Lupine does not support aarch64. In network-bound setups, we also run all VMs on top of Qemu for performance reasons. The macro-benchmarks we used include C/Fortran/C++/Python NPB [32], PARSEC [33] and the Python Performance Benchmark Suite (https://pyperformance.readthedocs.io/). We also built an edge computing benchmark based on PARSEC's StreamCluster compute kernel. Micro-benchmarks include redis-benchmark and LMbench (http://lmbench.sourceforge.net/) to measure system call latency.

We wish to assess HermiTux's efficiency in both datacenter/cloud and edge contexts, and we run experiments on both x86-64 and aarch64 architectures. The x86-64 machine is an Intel Xeon E5-2637 (3.0 GHz, 64 GB RAM), running Ubuntu Server 16.04 with Linux v4.4.0 as the host. It is a typical server found in the datacenter. The aarch64 machine is a LibreComputer LePotato single board computer, with an aarch64 CPU clocked at 1.5 GHz and 2 GB of RAM. It runs Ubuntu 18.04 with Linux v4.19.0 as the host. It is representative of a certain class of low-power embedded systems found at the edge of the cloud. Unless otherwise stated, the compilers used are GCC/G++ v6.3.0 (x86-64) and v8.3.0 (aarch64), and the -O3 level of optimizations is used.

In addition to the experiments presented here, we also validated our syscall-level binary compatibility by confirming HermiTux's basic support for additional languages such as Rust, Lua, and Nim.

5.1 Lightweightness: Footprint Reduction & Boot Time

Boot Time. This metric is critical for unikernels [1], [12], [34] in situations where reactivity and/or elasticity is required. Boot and destruction latencies have been measured in various ways in related work. Although hypervisor initialization time can sometimes be non-negligible, guests can run on top of various virtual machine monitors, and we chose to exclude hypervisor initialization time from our study and only consider guest boot time. We thus define the boot time as the latency between the moment the hypervisor starts to execute guest code when the unikernel is launched and the moment when the first instruction of user code is run after guest kernel initialization. To that aim, we instrumented both the hypervisors (Uhyve, Firecracker and Solo5) and the guest kernels. The hypervisors are modified to take a timestamp right before the start of guest execution. The guest kernels are instrumented by inserting, right after the kernel boot process, a trap to the hypervisor which in turn takes a timestamp. For Docker, we used docker events to compute the difference between the container start and container die events.

Fig. 2. Boot time, memory usage and image size comparison for several virtualization solutions running on x86-64 and aarch64. HermiTux runs on Uhyve. Lupine, Alpine, and OSv run on Firecracker, and Rump runs on Solo5. NOKML means KML patch disabled.

The results are presented in Fig. 2. HermiTux inherits the fast and optimized boot time of its basis, HermitCore: 33 ms on x86-64, and 5 ms on aarch64. On x86-64 it is moderately slower than OSv (13 ms) and Rump (17 ms), but much faster on aarch64 (34 ms for OSv, 50 ms for Rump). HermiTux also boots considerably faster than Docker: 3x for x86-64, 26x for aarch64. Regarding Lupine, as mentioned in the related paper [11], its boot time is impacted by the KML patch: with KML, Lupine's applications can enjoy fast system calls; however, the boot time is 3x that of HermiTux: 94 ms. Without KML, it drops to 41 ms. Unsurprisingly, the traditional kernel (Alpine) numbers are much higher, being 20x (x86-64) and 237x (aarch64) higher than HermiTux's.

Memory Usage. A low memory footprint is one of the promises of the unikernel model. Similar to boot time, various ways have been used by related works to measure RAM usage. Once again we chose to exclude hypervisors' internal memory footprint, and as such we define RAM usage as the minimal amount of memory one can give to a VM for the execution of a dummy "hello world" program. We used this method for the unikernels and the Alpine VM, and used docker stats for Docker.

The results are presented in Fig. 2. The minimalist design of HermiTux, inherited from HermitCore's, allows it to offer a low memory footprint: 11 MB on both x86-64 and aarch64. Rump has a slightly smaller memory usage on x86-64 (8 MB) but it is more than twice as high on aarch64 (24 MB). OSv's RAM footprint is also higher than HermiTux's on both ISAs: more than 2x on x86-64 and 1.3x on aarch64. On x86-64, Lupine's footprint is 1.8x higher than HermiTux's. Unsurprisingly, the Alpine VM has the highest memory usage on both ISAs, 34 MB, and the docker container has the lowest, 6 MB.
TABLE 1. System Call-Based Modularity Efficiency.
Fig. 7. LMbench system call latency on x86-64 (left) and aarch64 (right).
Lupine [11] is a unikernel version of Linux that reduces kernel size through configuration and eliminates the user/kernel boundary with the Kernel Mode Linux patch. Although it claims binary compatibility, it is important to note that contrary to HermiTux, which is binary compatible at the system call level, Lupine's compatibility is achieved at the standard C library level through a dynamic loader and a modified version of Musl Libc. As a result, contrary to HermiTux, with Lupine some unikernel benefits (such as fast system calls) cannot be achieved for programs that do not dynamically link against Musl, such as static binaries. UKL [41] is another unikernel version of Linux; however, it is still under development. As our experimental comparison with Lupine shows, it is unlikely that even a heavily shrunk-down version of a large monolithic OS such as Linux can achieve the same degree of lightweightness as a unikernel built from scratch such as HermiTux.

Graphene [13] is a LibOS running on top of Linux, capable of executing unmodified, multi-process applications. Graphene's security can be enhanced with Intel SGX [14], but this involves significant overhead (up to 2x). While binary compatibility comes for free in containers and in some software LibOSes such as Graphene, we show that it is also doable in unikernels. Unikernels such as HermiTux are an interesting alternative to containers and software LibOSes as they benefit from the strong isolation enforced by hardware-assisted virtualization [1], [18], which comes at a very low performance overhead. Google proposes gVisor [15], a Go framework addressing containers' security concerns by providing some degree of software isolation through system call filtering/interposition. This framework comes at a non-negligible performance overhead [42], and is not able to reach the same level of isolation provided by the VMs used in the context of unikernels [43].

Dune [44] uses hardware-assisted virtualization to provide a process-like abstraction, and implements in particular a sandboxing mechanism for native Linux binaries. It is important to note that its isolation model is quite different from HermiTux's: Dune either redirects system calls to the host kernel or blocks them, which limits compatibility when blocking or decreases isolation when redirecting.

The authors of a Linux API study [22] on x86-64 classify system calls by popularity. Such knowledge can be used to prioritize system call development in HermiTux. A system call binary identification technique is also mentioned, but few implementation details are given, and the authors report that identification fails for 4% of the call sites.

Finally, contrary to HermiTux, some of the systems referenced here (OSv, LightVM, X-Containers, Graphene-SGX, Lupine) only support a single ISA, x86-64.

7 CONCLUSION

HermiTux runs native Linux executables as unikernels by providing binary compatibility, relieving application programmers from the effort of porting their software. In this model, not only can unikernel benefits be obtained for free in unmodified applications, but it is also possible to run previously un-portable software. HermiTux achieves this goal with, in most cases, negligible to acceptable overhead compared to Linux, and performs generally better than other unikernels (OSv, Rump) for unikernel-critical metrics. HermiTux is available online under an open-source license: https://ssrg-vt.github.io/hermitux/.

REFERENCES

[1] F. Manco et al., "My VM is lighter (and safer) than your container," in Proc. 26th Symp. Oper. Syst. Princ., 2017, pp. 218–233. [Online]. Available: http://doi.acm.org/10.1145/3132747.3132763
[2] A. Madhavapeddy et al., "Unikernels: Library operating systems for the cloud," in Proc. 18th Int. Conf. Architectural Support Program. Lang. Operating Syst., 2013, pp. 461–472.
[3] J. Martins et al., "ClickOS and the art of network function virtualization," in Proc. 11th USENIX Conf. Netw. Syst. Des. Implementation, 2014, pp. 459–473. [Online]. Available: http://dl.acm.org/citation.cfm?id=2616448.2616491
[4] B. Duncan, A. Happe, and A. Bratterud, "Enterprise IoT security and scalability: How unikernels can improve the status quo," in Proc. IEEE/ACM 9th Int. Conf. Utility Cloud Comput., 2016, pp. 292–297.
[5] S. Lankes, S. Pickartz, and J. Breitbart, "HermitCore: A unikernel for extreme scale computing," in Proc. 6th Int. Workshop Runtime Operating Syst. Supercomputers, 2016, pp. 1–8, Art. no. 4. [Online]. Available: https://dl.acm.org/doi/10.1145/2931088.2931093
[6] Porting native applications to OSv: Problems you may run into, Accessed: Feb. 05, 2018. [Online]. Available: https://github.com/cloudius-systems/osv/wiki/Porting-native-applications-to-OSv
[7] S. Kuenzer et al., "Unikraft: Fast, specialized unikernels the easy way," in Proc. 16th Eur. Conf. Comput. Syst., 2021, pp. 376–394.
[8] Unikernels are secure, Accessed: Nov. 27, 2017. [Online]. Available: https://news.ycombinator.com/item?id=14736909
[9] A. Kantee and J. Cormack, "Rump kernels: No OS? No problem!" USENIX ;login: Magazine, vol. 39, no. 5, pp. 11–17, 2014. [Online]. Available: https://www.usenix.org/system/files/login/articles/login_1410_03_kantee.pdf
[10] A. Kivity, D. L. G. Costa, and P. Enberg, "OSv - Optimizing the operating system for virtual machines," in Proc. USENIX Annu. Tech. Conf., 2014, Art. no. 61.
[11] H.-C. Kuo, D. Williams, R. Koller, and S. Mohan, "A Linux in unikernel clothing," in Proc. 15th Eur. Conf. Comput. Syst., 2020, pp. 1–15.
[12] A. Madhavapeddy et al., "Jitsu: Just-in-time summoning of unikernels," in Proc. 12th USENIX Symp. Netw. Syst. Des. Implementation, 2015, pp. 559–573.
[13] C.-C. Tsai et al., "Cooperation and security isolation of library OSes for multi-process applications," in Proc. 9th Eur. Conf. Comput. Syst., 2014, pp. 1–14, Art. no. 9. [Online]. Available: https://dl.acm.org/doi/10.1145/2592798.2592812
[14] C.-C. Tsai, D. E. Porter, and M. Vij, "Graphene-SGX: A practical library OS for unmodified applications on SGX," in Proc. USENIX Annu. Tech. Conf., 2017, pp. 645–658, Art. no. 8. [Online]. Available: https://www.usenix.org/system/files/conference/atc17/atc17-tsai.pdf
[15] Google, "gVisor GitHub webpage," Accessed: Mar. 05, 2018. [Online]. Available: https://github.com/google/gvisor
[16] S. Arnautov et al., "SCONE: Secure Linux containers with Intel SGX," in Proc. 12th USENIX Conf. Oper. Syst. Des. Implementation, 2016, vol. 16, pp. 689–703.
[17] Intel Corp., "Intel clear containers," Accessed: Apr. 08, 2018. [Online]. Available: https://clearlinux.org/documentation/clear-containers
[18] R. Pavlicek, "Containers 2.0: Why unikernels will rock the cloud," Accessed: May 08, 2018. [Online]. Available: https://techbeacon.com/containers-20-why-unikernels-will-rock-cloud
[19] R. Wang et al., "Ramblr: Making reassembly great again," in Proc. Netw. Distrib. Syst. Secur. Symp., 2017, pp. 1–15. [Online]. Available: https://www.ndss-symposium.org/wp-content/uploads/2017/09/ndss2017_10-5_Wang_paper_0.pdf
[20] S. Wang, P. Wang, and D. Wu, "Reassembleable disassembling," in Proc. 24th USENIX Conf. Secur. Symp., 2015, pp. 627–642.
[21] M. Matz, J. Hubicka, A. Jaeger, and M. Mitchell, "System V application binary interface: AMD64 architecture processor supplement," Draft version 0.99.7, 2014. [Online]. Available: https://uclibc.org/docs/psABI-x86_64.pdf
[22] C.-C. Tsai, B. Jain, N. A. Abdul, and D. E. Porter, "A study of modern Linux API usage and compatibility: What to support when you're supporting," in Proc. 11th Eur. Conf. Comput. Syst., 2016, pp. 1–16, Art. no. 16. [Online]. Available: https://dl.acm.org/doi/10.1145/2901318.2901341
[23] Y. Zhang et al., "KylinX: A dynamic library operating system for simplified and efficient cloud virtualization," in Proc. USENIX Annu. Tech. Conf., 2018, pp. 173–185.
[24] D. Gruss, J. Lettner, F. Schuster, O. Ohrimenko, I. Haller, and M. Costa, "Strong and efficient cache side-channel protection using hardware transactional memory," in Proc. 26th USENIX Conf. Secur. Symp., 2017, pp. 217–233.
[25] A. Arcangeli, I. Eidus, and C. Wright, "Increasing memory density by using KSM," in Proc. Linux Symp., 2009, pp. 19–28.
[26] W. Dietz and V. Adve, "Software multiplexing: Share your libraries and statically link them too," Proc. ACM Program. Lang., vol. 2, no. OOPSLA, 2018, Art. no. 154.
[27] C. S. Collberg, J. H. Hartman, S. Babu, and S. K. Udupa, "Slinky: Static linking reloaded," in Proc. Annu. Conf. USENIX Annu. Tech. Conf., 2005, pp. 309–322.
[28] A. Quach, R. Erinfolami, D. Demicco, and A. Prakash, "A multi-OS cross-layer study of bloating in user programs, kernel and managed execution environments," in Proc. Workshop Forming Ecosystem Around Softw. Transformation, 2017, pp. 65–70.
[29] Y. Padioleau, J. Lawall, R. R. Hansen, and G. Muller, "Documenting and automating collateral evolutions in Linux device drivers," in Proc. 3rd ACM SIGOPS/EuroSys Eur. Conf. Comput. Syst., 2008, pp. 247–260. [Online]. Available: http://doi.acm.org/10.1145/1352592.1352618
[30] Y. Shoshitaishvili et al., "SoK: (State of) The Art of War: Offensive techniques in binary analysis," in Proc. IEEE Symp. Security Privacy, 2016, pp. 138–157.
[31] C. C. Williams and J. K. Hollingsworth, "Interactive binary instrumentation," in Proc. 2nd Int. Workshop Remote Anal. Meas. Softw. Syst., 2004, pp. 25–28.
[32] D. H. Bailey et al., "The NAS parallel benchmarks," The Int. J. Supercomputing Appl., vol. 5, no. 3, pp. 63–73, 1991.
[33] C. Bienia, S. Kumar, J. P. Singh, and K. Li, "The PARSEC benchmark suite: Characterization and architectural implications," in Proc. 17th Int. Conf. Parallel Architectures Compilation Techn., 2008, pp. 72–81.
[34] V. Nitu et al., "Swift birth and quick death: Enabling fast parallel guest boot and destruction in the Xen hypervisor," in Proc. 13th ACM SIGPLAN/SIGOPS Int. Conf. Virt. Execution Environ., 2017, pp. 1–14. [Online]. Available: http://doi.acm.org/10.1145/3050748.3050758
[35] C. Lattner and V. Adve, "LLVM: A compilation framework for lifelong program analysis & transformation," in Proc. Int. Symp. Code Gener. Optim., 2004, pp. 75–86.
[36] P. Junod, J. Rinaldini, J. Wehrli, and J. Michielin, "Obfuscator-LLVM: Software protection for the masses," in Proc. IEEE/ACM 1st Int. Workshop Softw. Protection, 2015, pp. 3–9.
[37] MicroPython contributors, "MicroPython webpage," Accessed: May 08, 2018. [Online]. Available: https://micropython.org/
[38] R. Pettersen, H. D. Johansen, and D. Johansen, "Secure edge computing with ARM TrustZone," in Proc. 2nd Int. Conf. Internet of Things Big Data Secur., 2017, pp. 102–109.
[39] W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu, "Edge computing: Vision and challenges," IEEE Internet of Things J., vol. 3, no. 5, pp. 637–646, Oct. 2016.
[40] L. W. McVoy et al., "lmbench: Portable tools for performance analysis," in Proc. Annu. Conf. USENIX Annu. Tech. Conf., 1996, pp. 279–294.
[41] A. Raza, "UKL: A unikernel based on Linux," Accessed: Dec. 12, 2018. [Online]. Available: https://next.redhat.com/2018/11/14/ukl-a-unikernel-based-on-linux/
[42] Z. Shen et al., "X-Containers: Breaking down barriers to improve performance and isolation of cloud-native containers," in Proc. 24th Int. Conf. Architectural Support Program. Lang. Oper. Syst., 2019, pp. 121–135.
[43] H. Fingler, A. Akshintala, and C. J. Rossbach, "USETL: Unikernels for serverless extract transform and load: Why should you settle for less?" in Proc. 10th ACM SIGOPS Asia-Pacific Workshop Syst., 2019, pp. 23–30.
[44] A. Belay, A. Bittau, A. J. Mashtizadeh, D. Terei, D. Mazieres, and C. Kozyrakis, "Dune: Safe user-level access to privileged CPU features," in Proc. 10th USENIX Conf. Oper. Syst. Des. Implementation, 2012, vol. 12, pp. 335–348.

Pierre Olivier received the BS and MS degrees from the University of Western Brittany, Brest, France, in 2009 and 2011, respectively, and the PhD degree from the University of South Brittany, Lorient, France, in 2014. He was a postdoc from 2015 to 2018 and a research assistant professor from 2018 to 2019 at Virginia Tech, Blacksburg, Virginia, before joining the University of Manchester, U.K., as a lecturer. His research interests include many areas of systems software.

Hugo Lefeuvre received the BS degree from the Karlsruhe Institute of Technology, Germany, in 2020. He is currently working toward the PhD degree in computer systems at the University of Manchester, U.K. His research interests include systems software, security, and networking.

Daniel Chiba received the MS degree in computer engineering from Virginia Tech, Blacksburg, Virginia, in 2018, where his research centered on virtualization and unikernels. He currently works in the graphics software team at Qualcomm, Boston, Massachusetts.

Stefan Lankes received the doctorate degree from RWTH Aachen University, Germany. Between 2007 and 2017, he was academic councilor at the Chair for Operating Systems at RWTH Aachen University, Germany. Since 2017, he has been working as academic director at the Institute for Automation of Complex Power Systems, RWTH Aachen University, Germany. His research interests include operating systems, cloud computing and high performance computing.

Changwoo Min received the PhD degree from Sungkyunkwan University, South Korea, in 2014. He is currently an assistant professor in the Electrical and Computer Engineering Department, Virginia Tech, Blacksburg, Virginia, where his research focuses on many-core scalability and concurrency of in-memory and non-volatile memory systems. His research interests include operating systems, storage systems, database systems, and system security. Before joining Virginia Tech in 2017, he was a research scientist in computer science at the Georgia Institute of Technology, Atlanta, Georgia. Before starting his PhD, he developed various software products, including a Linux-based mobile platform (Tizen), a Java virtual machine (J9), and a desktop operating system (OS/2) at Samsung Electronics and IBM Korea.

Binoy Ravindran is currently a professor of electrical and computer engineering at Virginia Tech, Blacksburg, Virginia, where he leads the Systems Software Research Group, which conducts research on distributed systems, operating systems, virtualization, compilers, concurrency, and verification. His group has published more than 290 papers in these spaces, including eight best paper awards and nominations. Several of his group's results have been transitioned to the U.S. DOD, in particular the Navy. He has mentored six research faculty members, 14 postdoctoral scholars, and 18 PhD students, ten of whom currently hold tenured or tenure-track faculty positions. He is an ACM distinguished scientist, a former Office of Naval Research faculty fellow, and serves or has served on the editorial boards of the IEEE Transactions on Computers, IEEE Transactions on Parallel and Distributed Systems, ACM Transactions on Embedded Computing Systems, IEEE Design & Test, and IEEE Transactions on Sustainable Computing.