Docker Security for Developers
† Department of Computer and Cyber Sciences, United States Air Force Academy
Abstract: Docker is popular within the software development community due to the versatility, portability, and scalability of containers. However, concerns over vulnerabilities have grown as the security of applications becomes increasingly dependent on the security of the images that serve as the applications’ building blocks. As more development processes migrate to the cloud, validating the security of images that are pulled from various repositories is paramount. In this paper, we describe a continuous integration and continuous deployment (CI/CD) system that validates the security of Docker images throughout the software development life cycle. We introduce images with vulnerabilities and measure the effectiveness of our approach at identifying the vulnerabilities. In addition, we use dynamic analysis to assess the security of Docker containers based on their behavior and show that it complements the static analyses typically used for security assessments.

I. INTRODUCTION

Containers have gained significant traction within the software development community because they allow developers to avoid the time-consuming configuration of libraries and dependencies. An image is a file that contains the required code, configuration, and libraries to run an application. Containers are instances of these images, with every instance having the same underlying dependencies. Because they only encompass the instructions and code that are necessary for the application to run, containers are lightweight compared to alternative approaches such as virtual machines (VMs). Multiple containers can run on a single physical or virtual machine, making them ideal for many phases of the software development cycle.

A microservice is a software architectural paradigm that is commonly used with containers. Microservices aim to make software development easier through optimal use of various resources. In essence, each software function is provisioned in a single application component. The components then use an application programming interface (API) to interact with each other. Microservices serve as building blocks to design and develop scalable and maintainable software. This approach decreases the dependencies between major software functions and minimizes the code base with which developers interact.

Docker is a popular Platform as a Service (PaaS) for containers. Many organizations such as Google and Amazon are incorporating Docker into their software development life cycle, as Docker allows developers to rapidly design, develop, test, and deploy applications while optimizing resource usage [1].

Due to Docker’s popularity, the development community has created repositories of Docker images, such as Docker Hub,¹ to increase reusability and to encourage file sharing. With the paradigm presented by containers and microservices, software development is increasingly dependent on small, reusable components that are developed independently and distributed by different organizations. This dependency, in turn, raises concerns regarding the security of the entire Docker image distribution pipeline. Architects at Docker now encourage developers and publishers to include risk analysis that considers the entire distribution pipeline itself as actively malicious [1]. Introducing a multi-layered security mechanism at the Docker registry level may help prevent vulnerabilities from being introduced into Docker repositories.

The contributions of our work are two-fold:
• We implement a multi-stage continuous integration and continuous deployment (CI/CD) pipeline to evaluate the security of Docker images. Our pipeline prevents the publishing and consequent reuse of images with known vulnerabilities.
• We demonstrate dynamic analysis for developers to use when selecting Docker images. Such dynamic analysis may be incorporated into our CI/CD pipeline, and we evaluate it using images with malicious content to show how it detects abnormal runtime behavior.
In a nutshell, our work is designed to improve security practices when using containers as part of software development.

The remainder of this paper is organized as follows. In Section II, we present an overview of various Docker mechanisms used to enforce the security of images. In Section III, we explain how our CI/CD pipeline is an extension of single-layered security tools that reduce the surface area where vulnerabilities may be introduced into the image distribution pipeline. Section IV describes the implementation of our prototype using Amazon Web Services (AWS). Section V covers our experimental evaluation. Section VI summarizes related work, and we conclude in Section VII.
1 https://hub.docker.com/
II. BACKGROUND
Malware analysis usually takes the form of examining files
or executables to detect compromises. There are two categories
of this analysis—static analysis and dynamic (or behavioral)
analysis. This section describes these two types of analysis as
well as how they pertain to Docker images.
Historically, software and hardware vendors used various scoring metrics to measure software vulnerabilities. The resulting lack of uniformity eventually led to the creation of the Common Vulnerabilities and Exposures (CVE) system [2]. The CVE system provides a framework to quantify and assess vulnerabilities and exposures, and it also enables such information to be shared publicly. One common way to prevent vulnerabilities from being introduced to the distribution pipeline is to regularly scan Docker images against CVEs. Detecting vulnerabilities within Docker images encourages actions to address them [3].

Scanning for CVEs can be considered a part of static analysis, but the term static analysis covers a broader set of actions. In static analysis, the content of data is examined without executing the instructions that are captured in the data. Static analysis has the capability to detect bugs in source code such as unreachable code, variable misuse, uncalled functions, improper memory usage, and boundary value violations. Static analysis also uses signatures based on file names, hashes, and file types to indicate if a file is malicious.

In comparison, dynamic analysis observes a container’s behavior. Some methods of dynamic analysis are port scans before or after execution, process monitoring, recording changes in firewall rules, registry changes, and network activity monitoring. While dynamic analysis typically takes longer than static analysis, the results may be more intuitive. However, Docker containers must be launched in a confined sandbox so that other services and resources in production are not impacted by the container.

Although there have been some efforts across the Docker community to encourage security analysis by users, they are often ignored. Thus, it would be ideal to incorporate security analysis tools into the development cycle of Docker images. Due to rising concerns over vulnerabilities introduced through Docker images, several open source tools are available that may be incorporated into such a process. CoreOS Clair² is one such tool that performs static analysis of image vulnerabilities. Another tool that is currently available is Anchore Engine,³ which includes CVE-based reporting. Anchore Engine’s security policy enables users to have fine-grained control over security enforcement by allowing customized security policies, helping users to achieve NIST 800-190 compliance [4].

III. DESIGN

To address these vulnerabilities and shortcomings, we propose an automated process for developers to scan and analyze images for vulnerabilities. Combining Clair and Anchore Engine into a CI/CD pipeline lessens the burden on developers by automating security analyses. In addition, we implement a service API that scans for malware on public images, which allows developers to research an image and verify its security before incorporating it into their products. Our service minimizes setup costs and allows developers to perform static and dynamic analyses more quickly, more easily, and with minimal resource investment.

A. CI/CD Pipeline

We propose a multi-layered security approach for CI/CD during the development of Docker images. Our design enables developers to flexibly incorporate desired security analysis tools into their development processes. In addition, it encourages well-defined security policies for Docker images used in their development environments.

Figure 1 illustrates the workflow of our CI/CD pipeline for Docker images. The source Docker project initiates the CI/CD process. Each pipeline may contain an arbitrary number of stages with vulnerability scans and static or dynamic analysis, whichever is appropriate for the development environment. The specified security checks are triggered when the source code of a new Docker project enters the pipeline and on every subsequent update to that source code. In order to give developers and administrators flexibility to extend our pipeline, each layer may enforce its security checks in an arbitrary manner. If, and only if, the source Docker project passes the security checks at each layer, then the project is published to the Docker registry.

Fig. 1. CI/CD workflow for Docker image security

B. Dynamic Analysis

Due to the increased risk of implementing dynamic analysis on the same system as the pipeline, we propose a second tool to accomplish this goal. We implement an API to automate and capture useful information about specified Docker images to identify bugs and malicious content. This API service serves two purposes: first, it is a research tool for developers to analyze public images, and second, it serves as a second layer of our multi-layered security mechanism. Our service allows developers to analyze any public image for vulnerabilities and malware, using Clair for vulnerability analysis and VirusTotal⁴ to scan for malicious content. Lastly, it executes basic dynamic analysis by running the image for a period of time to capture file changes and network traffic and to list processes throughout the image’s execution.

2 https://coreos.com/clair/
3 https://anchore.com/
4 https://www.virustotal.com/

Malicious images often include bash scripts that create secure shell (SSH) tunnels back to a command-and-control
server when the container runs. These malicious images try to download additional binaries and install shell code. However, only dynamic analysis captures this behavior. To address this, we developed a sub-service within the tool to automate the dynamic examination of Docker images. The process uses Docker-in-Docker (dind) to automate and capture such behavior in a sandbox. Though Docker-in-Docker does provide some isolation, it is still executed in a privileged mode, still has access to the host, and has similar permissions to Docker running on the host. Thus, it is important to run this Docker-in-Docker sandbox in a hypervisor or a cloud service that provides the appropriate level of isolation.

Fig. 2. API service flow for developers to research public images

IV. IMPLEMENTATION

In this section, we describe the realization of our design using AWS.

A. CI/CD Pipeline

We implemented a two-stage pipeline to demonstrate our design. Clair and Anchore Engine provide the security analyses. Figure 3 illustrates our architecture. A Virtual Private Cloud (VPC) logically isolates the various resources within the AWS ecosystem. We further isolate Clair and Anchore Engine using two availability zones. Each availability zone has a public subnet and a private subnet, and within each public subnet, the gateway allows components within the private subnet to access the Internet. A PostgreSQL database cluster spans the two private subnets.

Fig. 3. AWS architecture diagram for backend services

To store the source content of a Docker project, we use an AWS CodeCommit repository. Whenever the state of the repository changes, the event triggers the execution of the security layers in our CI/CD pipeline.

We perform vulnerability scanning within our first layer using CoreOS Clair. Clair integrates with our pipeline by providing API endpoints where client applications can run vulnerability scanning. Clair categorizes CVEs into seven categories: unknown, negligible, low, medium, high, critical, and defcon1, with unknown being least severe and defcon1 being most severe. Two configuration values, category threshold and vulnerability count, determine if an image passes or fails the CVE scan. If the scan finds vulnerabilities that are more severe than the threshold category and if the vulnerability count exceeds the threshold count, then the CI/CD stage fails on the image. One AWS availability zone runs services related to Clair with a Fargate Elastic Container Service (ECS) cluster in the private subnet. A load balancer distributes requests across the ECS cluster.

If the source Docker project passes the first layer’s security analysis, it goes through similar steps in the second layer using Anchore Engine. Although a custom security policy may be defined, our prototype uses Anchore’s default security policy, which performs light vulnerability checks (e.g., the image does not contain packages with critical vulnerabilities) and Dockerfile checks (e.g., port 22 should not be exposed). Although Anchore Engine also performs CVE-based reporting, we considered CVE-based reporting alone insufficient for the entire CI/CD process. Due to the microservice nature of how Anchore is deployed, a Fargate ECS cluster turned out to be infeasible. Thus, we used an ECS cluster that is capable of launching c4.xlarge Elastic Compute Cloud (EC2) instances and housing Anchore Engine’s microservices.

If an image passes all the security layers, the image is published to a dedicated AWS Elastic Container Registry (ECR) repository where it is available to others who have access to that repository. This operation is performed within the final layer of our pipeline.

Our implementation improves upon prior work [3], [5] in two ways. First, multiple layers are incorporated into our CI/CD pipeline. Second, it requires significantly less manual setup: for example, developers are not required to create an ECR repository for each security analysis [3]. Using our implementation, developers and administrators are able to provision all of the required services and resources for their CI/CD process using a single AWS CloudFormation template. In addition, our load balancers use auto-scaling so that the system scales horizontally based on load.
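The first-layer pass/fail decision and the layer-by-layer gating described in this subsection can be sketched in a few lines of Python. This is an illustrative sketch rather than the prototype's code: the function names (`passes_cve_gate`, `run_layers`), the list-of-severity-strings input, and the default thresholds are assumptions. It also assumes the reading used in the evaluation, in which findings at or above the threshold category are counted and the stage fails once that count reaches the threshold count.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence

# Clair's seven categories as listed in the text, least to most severe.
SEVERITIES = ("unknown", "negligible", "low", "medium",
              "high", "critical", "defcon1")


def passes_cve_gate(findings: Iterable[str],
                    threshold_category: str = "high",
                    threshold_count: int = 50) -> bool:
    """Return True when an image passes the CVE gate.

    Assumed rule: count the findings whose severity is at or above
    threshold_category; fail once that count reaches threshold_count.
    """
    rank = {name: i for i, name in enumerate(SEVERITIES)}
    floor = rank[threshold_category.lower()]
    counts = Counter(f.lower() for f in findings)
    unrecognized = set(counts) - set(rank)
    if unrecognized:
        raise ValueError(f"unrecognized severity labels: {sorted(unrecognized)}")
    severe = sum(n for sev, n in counts.items() if rank[sev] >= floor)
    return severe < threshold_count


def run_layers(image: str, layers: Sequence[Callable[[str], bool]]) -> bool:
    """Run the security layers in order; True only if every layer passes.

    all() short-circuits, so later layers are skipped after a failure.
    """
    return all(layer(image) for layer in layers)
```

With these defaults, `passes_cve_gate(["low"] * 143)` returns `True` because no finding reaches the "high" category, whereas `passes_cve_gate(["high"] * 50)` returns `False`. A caller would publish the image only when `run_layers` returns `True`.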
B. Dynamic Analysis

A simple way to determine if an image contains malicious code is to use a virus scanner. We use VirusTotal because of the large number of virus scanners that it supports. The downside of relying on antivirus is that most virus scanners only perform static analysis and look for signatures, while some do perform dynamic analysis with their own emulators. However, there is evidence that malware is starting to evade antivirus emulators [6]. By running the image dynamically in Docker-in-Docker, the image runs in privileged mode, so there is no emulation. The image is essentially running as if it is on the host, which should allow more accurate analysis.

In order to capture the traffic of an image, another container with tcpdump runs alongside the target container. These two containers share network settings. The tcpdump container requires a few seconds to start before it captures traffic, so any initial traffic may be lost. There are some cases where the container of interest does not run long enough for the tcpdump container to start. In cases of malicious images, the containers often run for longer periods of time because they need to communicate back to a control server for further instructions. We save the network traffic in a packet capture (pcap) file and push it to Simple Storage Service (S3) for later analysis.

Docker has the option to log file system changes, which we also use in our dynamic analysis. The reported difference consists of the changes made to the base image while the container was running. This log helps when analyzing malicious images that download and execute new binaries that are not part of the original image. Another analysis capability is listing all the processes in the container. The dynamic portion of the API performs a process listing every second during the run.

Fig. 4. Screenshot of ftfy.io, the web portal for dynamic analysis

V. EVALUATION

This section evaluates our design, starting with the CI/CD pipeline before progressing to the dynamic analysis.

TABLE I
DOCKER IMAGES USED IN THE EVALUATION

Image           Tag
postgres        11.5
ubuntu          18.04
python          3.6-alpine
node            10.16.0-alpine
nginx           1.17.2
buamod/eicar    latest
zoolu2/jauto    latest

A. CI/CD Pipeline

We use two Docker projects to evaluate our CI/CD security pipeline (see Table I). One project, an nginx website⁵ that exposes ports 80 and 443 for HTTP and HTTPS traffic, is the same as used in prior work [3]. We chose PostgreSQL 11 as our second Docker project because we believed that the image would be relatively secure against vulnerabilities. For CoreOS Clair, an image passes if, and only if, the image did not flag fifty or more vulnerabilities that were categorized as “high.”

Figure 5 displays the vulnerability count that was produced by CoreOS Clair in our CI/CD pipeline. The result validates our prediction of PostgreSQL 11’s security: the image was not flagged for any vulnerabilities. The nginx website image, however, was flagged for 143 total vulnerabilities. Even with all these vulnerabilities, the image still passed because it did not reach the threshold of fifty vulnerabilities that were “high” or more severe.

Fig. 5. Clair vulnerability scan results (lower is better). [Bar chart; y-axis: Vulnerabilities; groups: PostgreSQL 11, nginx website.]

5 https://github.com/aws-samples/aws-codepipeline-docker-vulnerability-scan

The second layer of our pipeline enforces Anchore Engine’s default security policy, which includes light vulnerability checks and Dockerfile checks. The PostgreSQL 11 image met all of the security requirements of the default security policy and was published to our Docker registry. Conversely, the sample nginx website did not pass the security policy and was not published to our registry.
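The collection steps described in Section IV-B (a tcpdump sidecar that shares the target container's network settings, the file system change log, and a process listing) map onto standard Docker CLI commands. The sketch below only builds the command lines and parses `docker diff` output, so it can be read and tested without a Docker daemon; it is not the authors' implementation. The sidecar image name `example/tcpdump`, the capture path, and all function names are hypothetical placeholders.

```python
from typing import Dict, List

# Hypothetical image with tcpdump installed and no custom entrypoint.
TCPDUMP_IMAGE = "example/tcpdump"


def sidecar_capture_cmd(target: str, host_dir: str) -> List[str]:
    """Build a `docker run` command line for a tcpdump sidecar.

    `--network container:<name>` places the sidecar in the target
    container's network namespace, so it sees the target's traffic.
    """
    return [
        "docker", "run", "--rm", "--detach",
        "--network", f"container:{target}",
        "--volume", f"{host_dir}:/capture",
        TCPDUMP_IMAGE,
        "tcpdump", "-i", "any", "-w", "/capture/traffic.pcap",
    ]


def diff_cmd(target: str) -> List[str]:
    """Command that lists file system changes relative to the image."""
    return ["docker", "diff", target]


def top_cmd(target: str) -> List[str]:
    """Command that lists the processes running in the container."""
    return ["docker", "top", target]


def parse_docker_diff(output: str) -> Dict[str, List[str]]:
    """Group `docker diff` lines by change type.

    Each line holds a one-letter code and a path:
    A (added), C (changed), or D (deleted).
    """
    kinds = {"A": "added", "C": "changed", "D": "deleted"}
    changes: Dict[str, List[str]] = {name: [] for name in kinds.values()}
    for line in output.splitlines():
        code, _, path = line.strip().partition(" ")
        if code in kinds and path:
            changes[kinds[code]].append(path)
    return changes
```

Each command list can be passed to `subprocess.run`, and calling `top_cmd` once per second during the run would approximate the periodic process listing. Building the commands separately from executing them keeps the parsing logic testable outside the sandbox.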
[...] number of file modifications, running processes, and Domain Name System (DNS) queries for each image following 30 seconds of execution.

The file system changes relate to the functionality of the image and what actions the image is trying to perform. It is especially helpful when examining malicious files and in the case of the test images. For example, the zoolu2/jauto image writes two scripts, cmd.sh and cmd1.sh, in /root. The image also surveys the system and saves information such as CPU metadata. This image also performs some IP searches and scans using Shodan, a search engine for Internet connected devices. A larger red flag appeared when the image started SSH [...]

[...] that safe Docker images should pass at least a virus scan and make few modifications to the file system.

[Figure: bar chart; y-axis: Antivirus Engines; legend: positive, negative.]

Fig. 7. Summary of results using the dynamic analysis API

VI. RELATED WORK

Many tools exist to scan for CVEs. For example, OpenSCAP⁶ [...]

6 https://www.open-scap.org/
7 https://nvd.nist.gov/
images. We could easily incorporate such tools in our CI/CD pipeline, but similar functionality is provided by CoreOS Clair and Anchore Engine, and our CI/CD pipeline is extensible, not being limited to only CVE scanning. The Docker Trusted Registry offers Docker Security Scanning, which scans images for vulnerabilities listed in a CVE database. Regrettably, this service requires an enterprise license and an additional security scanning extension. Our CI/CD pipeline could be deployed by organizations that desire a free alternative to this service.

Adethyaa and Jernigan [3] demonstrate a CI/CD process for Docker images that uses AWS resources. Valance [5] performs similar analysis using the Anchore Engine. Both use a single-stage security mechanism (CoreOS Clair or the Anchore Engine, respectively) that executes static security analysis on Docker images. A major limitation of Adethyaa and Jernigan’s work is the requirement for manual provisioning of the AWS services and an inability to define custom security policies. Valance’s approach lacks a source that initiates the entire CI/CD process and lacks auto-scaling for the static analyses. In comparison, our approach is extensible, supporting complex workflows with multiple security analysis tools, and is scalable.

Related to our dynamic analysis, Wan et al. [7] sandbox containers by mining rules based on legitimate system calls encountered in automated testing; in production, the sandbox restricts system calls that have not been whitelisted. CIMPLIFIER [8] uses dynamic analysis to debloat (i.e., remove unnecessary files from) containers and partition them according to the principle of least privilege. Both sandbox mining and CIMPLIFIER require automated tests to identify required resources, and, in later work [9], static analysis and symbolic execution improve coverage when the automated tests are incomplete. The dynamic analysis used in these works is comparable to our own (i.e., recording system calls and files being accessed), but our dynamic analysis tool is designed for users to explore an unknown container where automated tests may not be available. Thus, our goal differs in that our focus is the exploration of a container rather than hardening one.

VII. CONCLUSION

Developers using containers are currently vulnerable to malware and lack tools that effectively quantify this risk. Existing tools, while effective, are time-consuming and challenging to implement and may introduce new risks if not implemented correctly. Our work addresses these issues by creating user-friendly tools to detect vulnerabilities and malicious code. Our results show that virus scans and dynamic analysis are effective at detecting malicious behavior in Docker containers. In particular, safe images show few file modifications, running processes, and DNS queries, whereas malicious images tend to download and execute files not initially present in the image. By using our API service for dynamic analysis, developers are better equipped to make decisions regarding which base images to use. Automating the static and runtime checks also frees developers to build more secure applications.

A. Recommendations

There is currently no formal process to report malicious images to Docker Hub. It would be beneficial for Docker Hub to create a reporting process and to investigate such images. In previously reported cases, Docker Hub took over eight months to act and remove an account, potentially allowing users to pull malicious images unknowingly in the interim [10]. Creating a dedicated reporting mechanism not only mitigates the risk of such attacks but also encourages users to be wary of what they are downloading and possibly to perform their own security checks.

B. Future Work

We would like to integrate our dynamic analysis with our CI/CD pipeline, which requires creating a sandbox for Docker images so that they cannot affect other services or resources. Exploring how to use Anchore Engine to perform dynamic analysis within our CI/CD process may be worthwhile. Another interesting extension is to apply machine learning to the dynamic analysis portion of the API service. Instead of users manually reviewing the file system changes and network traffic logs that the API outputs, machine learning could automate the classification of images as malicious or benign based on their activity. The file modifications, running processes, and DNS queries are an ideal starting point for signatures.

REFERENCES

[1] S. Winkel, “Security Assurance of Docker Containers: Part 1,” ISSA Journal, April 2017.
[2] P. Mell, K. Scarfone, and S. Romanosky, “The Common Vulnerability Scoring System (CVSS) and Its Applicability to Federal Agency Systems,” National Institute of Standards and Technology, Tech. Rep. Interagency Report 7435, August 2007.
[3] V. Adethyaa and T. Jernigan, “Scanning Docker Images for Vulnerabilities using Clair, Amazon ECS, ECR, and AWS CodePipeline,” AWS Compute Blog, November 2018, online: https://aws.amazon.com/blogs/compute/scanning-docker-images-for-vulnerabilities-using-clair-amazon-ecs-ecr-aws-codepipeline/.
[4] J. Valance, “Using Anchore Policies to Help Achieve the CIS Docker Benchmark,” Anchore Blog, May 2019, online: https://anchore.com/cis-docker-benchmark/.
[5] J. Valance, “Adding Container Security and Compliance Scanning to your AWS CodeBuild pipeline,” Anchore Blog, February 2019, online: https://anchore.com/adding-container-security-and-compliance-scanning-to-your-aws-codebuild-pipeline/.
[6] J. Blackthorne, A. Bulazel, A. Fasano, P. Biernat, and B. Yener, “AVLeak: Fingerprinting Antivirus Emulators through Black-Box Testing,” in 10th USENIX Workshop on Offensive Technologies. Austin, TX: USENIX Association, Aug. 2016. [Online]. Available: https://www.usenix.org/conference/woot16/workshop-program/presentation/blackthorne
[7] Z. Wan, D. L. Lo, X. Xia, L. Cai, and S. Li, “Mining Sandboxes for Linux Containers,” in Proceedings of the 2017 IEEE International Conference on Software Testing, Verification and Validation, ser. ICST ’17, March 2017, pp. 92–102.
[8] V. Rastogi, D. Davidson, L. De Carli, S. Jha, and P. McDaniel, “Cimplifier: Automatically Debloating Containers,” in Proceedings of the 11th Joint Meeting on Foundations of Software Engineering, ser. ESEC/FSE 2017. New York, NY, USA: ACM, September 2017, pp. 476–486.
[9] V. Rastogi, C. Niddodi, S. Mohan, and S. Jha, “New directions for container debloating,” in Proceedings of the 2017 Workshop on Forming an Ecosystem Around Software Transformation, ser. FEAST ’17. New York, NY, USA: ACM, November 2017, pp. 51–56.
[10] D. Goodin, “Backdoored images downloaded 5 million times finally removed from Docker Hub,” Online: https://arstechnica.com/information-technology/2018/06/backdoored-images-downloaded-5-million-times-finally-removed-from-docker-hub/, June 2018.