CHRISTOPHER HEISTAND
DART FSW Lead, JHUAPL
Automated Hardware Testing
Using Docker for Space
DART Devops Team:
Justin Thomas
Andrew Badger
Austin Bodzas
Double Asteroid Redirection Test
DART
NASA Planetary Defense Coordination Office
• DART is a tech demonstration to hit
a representative asteroid
• Mission managed by Johns Hopkins
Applied Physics Lab
• The PDCO is responsible for:
• Finding and tracking near-Earth objects that pose a hazard of impacting Earth
• Characterizing those objects
• Planning and implementing measures to deflect or disrupt an object on an impact course with Earth
Step 1: Build the spacecraft
Roll Out Solar Arrays
NEXT-C Ion Thruster
High Gain Antenna
DRACO Imager
Mission phases (concept of operations):
1. Launch: Low Energy Escape with Rideshare (rideshare orbit) <Jun 15 – Oct 15 2021>
2. Cruise / Calibration: Flyby of a PHA allows sensor calibration and control-gain tuning <7 months until impact, ~10^8 km from target>
3. Target Detection / Coarse Acquisition: Weeks prior to impact, seeker detects the primary <30 days until impact, ~10^7 km from target>
4. Scene Classification: Seeker counts and classifies closely spaced objects <3 hours until impact, 65,000 km from target>
5. Target Selection: With sufficient confidence, seeker selects target and locks on <1.5 hours until impact, 32,000 km from target>
6. Deploy Selfie-Sat: Selfie-Sat releases and executes a separation maneuver to trail DART <~1.4 hours until impact, ~30,000 km from target>
7. Homing Until Intercept: Pro-Nav executes precision engagement and is robust to target uncertainties <executed until final 2 minutes, 6.0 km/s impact>
8. Impact Assessment: Earth tracking & Selfie-Sat images quantify intercept success <up to 3 months>
* 16 months total flight time
Step 2: Hit the target
Goddard Space Flight Center
Johnson Space Center
Langley Research Center
Glenn Research Center
Marshall Space Flight Center
Planetary Defense Coordination Office
Step 3: Save the world
(by validating the kinetic impact technique)
Why Dockercon?
Space is hard!
All factors drive:
• Cost
• Reliability
• Low Memory (~16MB)
• No virtual memory
• 32 bit CPU (~100MHz)
• Process
• Testing. And more testing
Environment and constraints: Vacuum, Radiation, Thermal, Power, Mass, Single shot, Extreme distances (and timelines), Infrequent communication
New Horizons - JHUAPL
There are no space mechanics (yet), and turning it off and on again is NOT cool
What are we trying to solve?
• Hardware Scarcity
• Testbeds cost > $300K
• Configuration management is
painful
• Every developer/subsystem
wants one
• What is the holy grail?
• Hardware emulation!
• Develop in software land
• Test on real hardware
• CD to other teams/real
spacecraft
Enablers for DART
• NASA Core Flight Executive (Hardware/OS Abstraction)
• Atlassian Bamboo (CI/CD)
• Network architecture (SpaceWire)
• COSMOS (Ground System)
• Docker (Containers!)
Dev Setup
Container setup
• 4 repos - Flight SW, Testbed SW, COSMOS, Docker_Env
• 4 containers – Flight SW, Testbed SW, COSMOS, VNC
• Run-time voluming of source code; containers are stateless
• Provides outside dev environment with docker build capabilities
• Keeps cross-compile toolchains standardized
Directory Structure with
Submoduled repos
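As a rough illustration, the four-container layout above could be wired together with a docker-compose file along these lines. This is a hedged sketch only; the directory layout, service names, and mount points are assumptions, not the actual DART configuration.

    version: "3"
    services:
      fsw:                            # Flight SW container
        build: ./docker_env/fsw       # image carries the cross-compile toolchain
        volumes:
          - ./flight_sw:/opt/fsw      # source volumed in at run time; container stays stateless
      tbsw:                           # Testbed SW container
        build: ./docker_env/tbsw
        volumes:
          - ./testbed_sw:/opt/tbsw
      cosmos:                         # COSMOS ground system container
        build: ./docker_env/cosmos
        volumes:
          - ./cosmos:/opt/cosmos
      x11server:                      # VNC / X11 container (covered on the VNC slide)
        build: ./docker_env/x11server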
Network setup
• One instance comprises 4
containers (docker-compose)
• UDP SpaceWire abstraction
between FSW and TBSW
• TCP radio abstraction between
Ground and TBSW
• X forwarding from Ground through the X11 Server to VNC
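Extending the sketch above, the two link abstractions reduce to ordinary container networking. A hedged fragment follows; the network name, port numbers, and environment variables are hypothetical stand-ins for whatever the real software reads.

    networks:
      testbed:                        # isolated bridge network for one instance
    services:
      fsw:
        networks: [testbed]
        environment:
          - SPACEWIRE_UDP=tbsw:5010   # hypothetical UDP endpoint for the SpaceWire abstraction
      tbsw:
        networks: [testbed]
      cosmos:
        networks: [testbed]
        environment:
          - RADIO_TCP=tbsw:5020       # hypothetical TCP endpoint for the radio abstraction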
VNC window
• Shameless plug for the creator – thanks Jan!
• https://github.com/suchja/x11client
• https://github.com/suchja/x11server
• X11Server focuses on VNC and X setup
• X11Client focuses on the application (COSMOS)
• Brought up with compose; the xauth cookie is shared through voluming
• Runs X virtual frame buffer with Openbox
• Contains the X security issues to the containers (we think)
Diagram: VNC Viewer (dev machine) connects to X11Server (container), which displays the COSMOS application (container); the Xauth cookie is shared between the containers
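A hedged compose fragment of that VNC path is sketched below. The socket path, DISPLAY value, and port are assumptions; the suchja repos linked above document the real interface.

    x11server:
      build: ./docker_env/x11server   # Xvfb + Openbox + VNC, patterned on suchja/x11server
      ports:
        - "5900:5900"                 # the dev machine's VNC viewer connects here
      volumes:
        - x11:/tmp/.X11-unix          # X socket and xauth cookie shared through a named volume
    cosmos:
      build: ./docker_env/cosmos      # COSMOS application, patterned on suchja/x11client
      environment:
        - DISPLAY=:0                  # render into the X11Server's virtual framebuffer
      volumes:
        - x11:/tmp/.X11-unix
    volumes:
      x11: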
Eclipse and Debugging
• Eclipse Integration using Docker Tooling (Linux Tools project)
• CDT Build within Docker Container (including cross compiling)
• Run/Debug FSW (x86 Linux) in Docker Container
• Visually Debug FSW (LEON3 RTEMS) on Custom Flight Hardware
• Run Multi-Container App and System Tests (Docker Compose)
Demo
CI/CD for Space
Software CI
• Goal: Parallel software testing of our software sim
• Limitation: We only had one server to prototype on
• Execution:
• Bamboo with multiple agents on single server
• Runs same setup as dev except for X11Server
• Binaries/workbooks are passed through the chain, not containers
• Re-tagged each Docker image so different branches don't clobber each other
• docker-compose run with -p to provide uniquely keyed containers (see the sketch below)
Diagram: one server hosting Agent 1, Agent 2, Agent 3, Agent 4
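A hedged sketch of what one agent's script step might look like under this scheme. BRANCH and AGENT_ID are placeholders supplied by the CI system; the image and project names are illustrative, not the actual DART plan.

    # Re-tag the image per branch so parallel plans never clobber each other
    docker build -t "fsw:${BRANCH}" flight_sw/
    # -p keys containers, networks, and volumes to this agent and branch
    docker-compose -p "ci_${AGENT_ID}_${BRANCH}" up --abort-on-container-exit
    docker-compose -p "ci_${AGENT_ID}_${BRANCH}" down --volumes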
Hardware CI/CD
Five complete sets of hardware (Testbeds)
Three flows, similar steps:
• Binary cross-compiled inside container
• Loaded to single board computer via GRMON
• Serial output piped back via SSH/Screen
• L3 InControl used as Ground System
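A minimal sketch of those steps, assuming illustrative image, host, device, and script names; the GRMON arguments in particular are assumptions, not the actual testbed flow.

    # Cross-compile inside the toolchain container
    docker run --rm -v "$PWD/flight_sw:/opt/fsw" fsw-toolchain \
      make -C /opt/fsw TARGET=leon3-rtems
    # Load the binary onto the single-board computer through GRMON on the testbed host
    ssh testbed1 "grmon -uart /dev/ttyUSB0 -c load_and_run.grmon"
    # Reattach to the serial console session so output lands in the test log
    ssh -t testbed1 "screen -r fsw_console"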
Was it hard?
Lessons Learned
• If (Dev_env == CI_env); then debuggable = true;
• Permissions can be problematic
• When editing a volume from outside, specify your user when running the container
• Static IPs cause endless headaches
• IP address spaces were not getting garbage collected, requiring a daemon restart
• Docker networks can't handle overlapping IPs/subnets
• Bamboo assumes sandboxed code, Docker is global
• Two layers of dependencies, jobs in a plan and branches in a plan
• docker-compose -p is magical (example after this list)
• Our server can only handle 4 instances of our setup
• Docker abstracts NEARLY* everything, but not everything
• Linux Message Queues appear abstracted but are globally held in the
kernel
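Two of those lessons in command form; the image and project names are illustrative, but the flags (--user, -v, -p) are standard Docker/Compose options.

    # Run as your own UID/GID so files written into the mounted source tree
    # stay editable on the host
    docker run --rm --user "$(id -u):$(id -g)" -v "$PWD:/src" fsw-dev make -C /src
    # Give each checkout its own compose project name so containers, networks,
    # and volumes never collide
    docker-compose -p fsw_branch_a up -d
    docker-compose -p fsw_branch_b up -d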
Lessons Learned
• Bamboo latches the git commit once started
• This is great for consistency, but causes problems when tagging containers by commit hash ("DETACHED HEAD")
• Lock your Dockerfile FROM version (sketched after this list)
• ubuntu:latest can change under the hood - lock a working version
• Signal handling must go all the way down the rabbit hole
• When using start scripts, signals must be propagated to the end
application, particularly for graceful shutdown
• Log Buffering – use “unbuffer” for better timestamps
• Background process output gets buffered, causing timestamp bunching
• Not everything HAS to get containerized
• Docker bridge networks can be sniffed by host wireshark
• Permissions and display forwarding proved more pain than they were worth
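A hedged sketch of the pinned-base-image and signal-propagation lessons; the base tag, script name, and application path are illustrative, not the DART images.

    # Dockerfile: pin a known-good base tag instead of relying on ubuntu:latest
    FROM ubuntu:16.04
    COPY start_fsw.sh /start_fsw.sh
    ENTRYPOINT ["/start_fsw.sh"]

    # start_fsw.sh
    #!/bin/sh
    # exec replaces the shell, so SIGTERM from "docker stop" reaches the
    # application directly and it can shut down gracefully; unbuffer (from
    # the expect package) keeps log timestamps from bunching
    exec unbuffer /opt/fsw/core_app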
What is next?
• Move past single server: Docker Registry
• Hardware stack trace
• Clean up tagging scheme (possibly obsolete with -p)
• Release manager/artifact handler?
• Any brilliant ideas picked up at Dockercon
Recap
• DART
• Why is space hard?
• Voluming source code can be super helpful in development
• VNC finally provides an easy window into containers
The Final Moments…
Goddard Space Flight Center
Johnson Space Center
Langley Research Center
Glenn Research Center
Marshall Space Flight Center
Planetary Defense Coordination Office
Questions?
