Subject: DEVOPS B.Tech III year II sem.
Class: CSM 3-2
Unit-1 Notes
Introducing DevOps
DevOps is, by definition, a field that spans several disciplines. It is a field that is very practical
and hands-on, but at the same time, you must understand both the technical background and the
nontechnical cultural aspects. This book covers both the practical and soft skills required for a
best-of-breed DevOps implementation in your organization.
The word "DevOps" is a combination of the words "development" and "operations". This
wordplay already serves to give us a hint of the basic nature of the idea behind DevOps. It is a
practice where collaboration between different disciplines of software development is
encouraged.
The origin of the word DevOps and the early days of the DevOps movement can be traced rather
precisely: Patrick Debois is a software developer and consultant with experience in many fields within IT.
He was frustrated with the divide between developers and operations personnel. He tried getting people
interested in the problem at conferences, but there wasn't much interest initially.
The DevOps movement has its roots in Agile software development principles. The Agile Manifesto was
written in 2001 by a number of individuals wanting to improve the then current status quo of system
development and find new ways of working in the software development industry.
DevOps can be said to relate to the first value of the Agile Manifesto, "Individuals and interactions over processes and tools."
Another core goal of DevOps is automation and Continuous Delivery. Simply put, automating repetitive
and tedious tasks leaves more time for human interaction, where true value can be created.
Our definition of DevOps focuses on the goals, rather than
the means.
DevOps is a set of practices intended to reduce the time between
committing a change to a system and the change being placed into
normal production, while ensuring high quality.
DevOps and Agile
One of the characterizations of DevOps emphasizes the relationship of DevOps practices to agile
practices. In this section, we overlay the DevOps practices on IBM’s Disciplined Agile Delivery.
Our focus is on what is added by DevOps, not an explanation of Disciplined Agile Delivery. For
that, see Disciplined Agile Delivery: A Practitioner’s Guide. As shown in Figure 1.2, Disciplined
Agile Delivery has three phases—inception, construction, and transition. In the DevOps context,
we interpret transition as deployment.
FIGURE 1.2 Disciplined Agile Delivery phases for each release. (Adapted
from Disciplined Agile Delivery: A Practitioner’s Guide by Ambler and
Lines)
DevOps practices impact all three phases.
1. Inception phase. During the inception phase, release planning and initial requirements
specification are done.
a. Considerations of Ops will add some requirements for the developers. We will see these in
more detail later in this book, but maintaining backward compatibility between releases and
having features be software switchable are two of these requirements. The form and content of
operational log messages impact the ability of Ops to troubleshoot a problem.
b. Release planning includes feature prioritization but it also includes coordination with
operations personnel about the scheduling of the release and determining what training the
operations personnel require to support the new release. Release planning also includes ensuring
compatibility with other packages in the environment and a recovery plan if the release fails.
DevOps practices make incorporation of many of the coordination-related topics in release
planning unnecessary, whereas other aspects become highly automated.
2. Construction phase. During the construction phase, key elements of the DevOps practices are
the management of the code branches, the use of continuous integration and continuous
deployment, and incorporation of test cases for automated testing. These are also agile practices
but form an important portion of the ability to automate the deployment pipeline. A new
element is the integrated and automated connection between construction and transition
activities.
3. Transition phase. In the transition phase, the solution is deployed and the development team is
responsible for the deployment, monitoring the process of the deployment, deciding whether to
roll back and when, and monitoring the execution after deployment. The development team has
a role of “reliability engineer,” who is responsible for monitoring and troubleshooting problems
during deployment and subsequent execution.
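The "software switchable" features mentioned under the inception phase can be illustrated with a small sketch. This is only one possible shape, assuming a simple in-process dictionary of switches; the names FEATURE_SWITCHES, is_enabled, and render_login_page are hypothetical and not taken from any particular tool.

```python
# Hypothetical feature switch registry; the names are illustrative only.
FEATURE_SWITCHES = {"new_role_screen": False}


def is_enabled(feature: str) -> bool:
    """Return True if the named feature is switched on (unknown features are off)."""
    return FEATURE_SWITCHES.get(feature, False)


def render_login_page() -> str:
    # The new code ships with the release but stays dormant until the switch is flipped.
    if is_enabled("new_role_screen"):
        return "login page with role selection"
    return "classic login page"
```

With a switch like this, a release can be deployed with the new feature turned off, and the feature can later be enabled or disabled without another deployment.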
DevOps and ITIL
ITIL, which was formerly known as Information Technology Infrastructure Library, is a practice
used by many large and mature organizations.
ITIL is a large framework that formalizes many aspects of the software life cycle. While DevOps
and Continuous Delivery hold the view that the changesets we deliver to production should be
small and happen often, at first glance, ITIL would appear to hold the opposite view. It should
be noted that this isn't really true. Legacy systems are quite often monolithic, and in these cases,
you need a process such as ITIL to manage the complex changes often associated with large
monolithic systems.
If you are working in a large organization, the likelihood that you are working with such large
monolithic legacy systems is very high.
In any case, many of the practices described in ITIL translate directly into corresponding DevOps
practices. ITIL prescribes a configuration management system and a configuration management
database. These types of systems are also integral to DevOps.
The DevOps process and Continuous Delivery – an overview
An example of a Continuous Delivery pipeline in a large organization is introduced in the
following image:
The basic outline of this image holds true surprisingly often, regardless of the organization.
There are, of course, differences, depending on the size of the organization and the complexity of
the products that are being developed.
The early parts of the chain, that is, the developer environments and the Continuous Integration
environment, are normally very similar.
The number and types of testing environments vary greatly. The production environments also
vary greatly.
In the following sections, we will discuss the different parts of the Continuous Delivery pipeline.
The developers
The developers (on the far left in the figure) work on their workstations. They develop code and
need many tools to be efficient.
The following detail from the previous larger Continuous Delivery pipeline overview illustrates
the development team.
Ideally, they would each have production-like environments available to work with locally on
their workstations or laptops. Depending on the type of software that is being developed, this
might actually be possible, but it's more common to simulate, or rather, mock, the parts of the
production environments that are hard to replicate. This might, for example, be the case for
dependencies such as external payment systems or phone hardware.
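As a rough illustration of mocking a hard-to-replicate dependency, the sketch below replaces an external payment system with a stand-in object from Python's standard unittest.mock module. The checkout function and the gateway's charge method are assumptions made for the example, not a real payment API.

```python
from unittest.mock import Mock


def checkout(cart_total: float, payment_gateway) -> str:
    """Charge the cart total through whatever gateway object is passed in."""
    response = payment_gateway.charge(amount=cart_total)
    return "paid" if response["status"] == "approved" else "declined"


# Stand-in for the real external payment system (hypothetical interface).
fake_gateway = Mock()
fake_gateway.charge.return_value = {"status": "approved"}

result = checkout(42.50, fake_gateway)

# Verify that the code under test called the dependency as expected.
fake_gateway.charge.assert_called_once_with(amount=42.50)
```

Because the gateway is passed in as a parameter, the same checkout code can run against the mock on a workstation and against the real system in production.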
When you work with DevOps, you might, depending on which of its two constituents you
emphasized in your original background, pay more or less attention to this part of the
Continuous Delivery pipeline. If you have a strong developer background, you appreciate the
convenience of a prepackaged developer environment, for example, and work a lot with those.
This is a sound practice, since otherwise developers might spend a lot of time creating their
development environments. Such a prepackaged environment might, for instance, include a
specific version of the Java Development Kit and an integrated development environment, such
as Eclipse. If you work with Python, you might package a specific Python version, and so on.
Keep in mind that we essentially need two or more separately maintained environments. The
preceding developer environment consists of all the development tools we need. These will not
be installed on the test or production systems. Further, the developers also need some way to
deploy their code in a production-like way. This can be a virtual machine provisioned with
Vagrant running on the developer's machine, a cloud instance running on AWS, or a Docker
container: there are many ways to solve this problem.
The revision control system
The revision control system is often the heart of the development environment. The code that
forms the organization's software products is stored here. It is also common to store the
configurations that form the infrastructure here. If you are working with hardware development,
the designs might also be stored in the revision control system.
The following image shows the systems dealing with code, Continuous Integration, and artifact
storage in the Continuous Delivery pipeline in greater detail:
For such a vital part of the organization's infrastructure, there is surprisingly little variation in
the choice of product. These days, many use Git or are switching to it, especially those using
proprietary systems reaching end-of-life.
Regardless of the revision control system you use in your organization, the choice of product is
only one aspect of the larger picture.
You need to decide on directory structure conventions and which branching strategy to use.
If you have a great deal of independent components, you might decide to use a separate
repository for each of them.
The build server
The build server is conceptually simple. It might be seen as a glorified egg timer that builds your
source code at regular intervals or on different triggers.
The most common usage pattern is to have the build server listen to changes in the revision
control system. When a change is noticed, the build server updates its local copy of the source
from the revision control system. Then, it builds the source and performs optional tests to verify
the quality of the changes. This process is called Continuous Integration.
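The usage pattern just described (notice a change, update, build, test) can be sketched as follows. This is a toy model, not how any specific build server is implemented: the BuildServer class and the three callables passed to it are hypothetical placeholders for the revision control lookup, the build step, and the test step.

```python
from typing import Callable, Optional


class BuildServer:
    """Toy build server: rebuilds whenever the observed revision changes."""

    def __init__(self, get_head_revision: Callable[[], str],
                 build: Callable[[str], bool],
                 run_tests: Callable[[str], bool]) -> None:
        self.get_head_revision = get_head_revision
        self.build = build
        self.run_tests = run_tests
        self.last_built: Optional[str] = None

    def poll_once(self) -> Optional[bool]:
        """Return None if nothing changed, otherwise whether build and tests passed."""
        head = self.get_head_revision()
        if head == self.last_built:
            return None
        self.last_built = head
        return self.build(head) and self.run_tests(head)


# Simulated history: one commit, no change, then a commit whose tests fail.
revisions = iter(["abc123", "abc123", "def456"])
server = BuildServer(
    get_head_revision=lambda: next(revisions),
    build=lambda rev: True,
    run_tests=lambda rev: rev != "def456",
)
results = [server.poll_once() for _ in range(3)]
```

In a real installation, the polling call would typically be replaced or complemented by a trigger sent from the revision control system when a change is pushed.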
The artifact repository
When the build server has verified the quality of the code and compiled it into deliverables, it is
useful to store the compiled binary artifacts in a repository. This is normally not the same as the
revision control system.
In essence, these binary code repositories are filesystems that are accessible over the HTTP
protocol. Normally, they provide features for searching and indexing as well as storing
metadata, such as various type identifiers and version information about the artifacts.
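To make the idea of stored metadata concrete, here is a minimal in-memory sketch of the kind of index such a repository might keep. The ArtifactRecord and ArtifactIndex names, the path layout, and the choice of a SHA-256 checksum are assumptions for illustration; real products define their own formats.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ArtifactRecord:
    name: str
    version: str
    kind: str      # type identifier, for example "jar" or "rpm"
    sha256: str    # checksum of the stored bytes
    path: str      # location under the repository root (illustrative layout)


class ArtifactIndex:
    """In-memory stand-in for a binary repository's metadata index."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], ArtifactRecord] = {}

    def publish(self, name: str, version: str, kind: str, content: bytes) -> ArtifactRecord:
        """Record a new artifact version together with its metadata."""
        record = ArtifactRecord(
            name=name,
            version=version,
            kind=kind,
            sha256=hashlib.sha256(content).hexdigest(),
            path=f"{name}/{version}/{name}-{version}.{kind}",
        )
        self._records[(name, version)] = record
        return record

    def search(self, name: str) -> List[ArtifactRecord]:
        """Return every stored version of the named artifact."""
        return [rec for (n, _), rec in self._records.items() if n == name]
```

The path field shows why such a repository can be viewed as a filesystem served over HTTP: each artifact version maps to a predictable location.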
In the Java world, a pretty common choice is Sonatype Nexus. Nexus is not limited to Java
artifacts, such as Jars or Ears, but can also store artifacts of the operating system type, such as
RPMs, artifacts suitable for JavaScript development, and so on.
Amazon S3 is a key-value datastore that can be used to store binary artifacts. Some build
systems, such as Atlassian Bamboo, can use Amazon S3 to store artifacts. The S3 protocol is
open, and there are open source implementations that can be deployed inside your own
network. One such possibility is the Ceph distributed filesystem, which provides an S3-
compatible object store.
Package managers
Linux servers usually employ systems for deployment that are similar in principle but have
some differences in practice.
Red Hat-like systems use a package format called RPM. Debian-like systems use the .deb format,
which is a different package format with similar abilities. The deliverables can then be installed
on servers with a command that fetches them from a binary repository. These commands are
called package managers.
On Red Hat systems, the command is called yum, or, more recently, dnf. On Debian-like
systems, it is called apt (or aptitude), which builds on the lower-level dpkg tool.
The great benefit of these package management systems is that it is easy to install and upgrade a
package; dependencies are installed automatically.
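The automatic handling of dependencies can be pictured as ordering packages so that each one is installed after everything it depends on. The sketch below uses the standard library's graphlib module (Python 3.9 or later) on a made-up dependency table; the package names and the install_order helper are hypothetical, and real package managers also handle version constraints and conflicts, which are ignored here.

```python
from graphlib import TopologicalSorter

# Hypothetical package metadata: each package maps to the packages it depends on.
DEPENDENCIES = {
    "webapp": {"python3", "libssl"},
    "python3": {"libc"},
    "libssl": {"libc"},
    "libc": set(),
}


def install_order(package: str) -> list:
    """Return the package and its transitive dependencies, dependencies first."""
    needed = {}
    stack = [package]
    while stack:
        current = stack.pop()
        if current not in needed:
            needed[current] = DEPENDENCIES[current]
            stack.extend(DEPENDENCIES[current])
    # static_order() yields each node only after all of its dependencies.
    return list(TopologicalSorter(needed).static_order())
```

Calling install_order("webapp") yields a list that starts with libc and ends with webapp, which is the order in which the packages could be installed safely.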
Test environments
After the build server has stored the artifacts in the binary repository, they can be installed from
there into test environments.
The following figure shows the test systems in greater detail:
Test environments should normally attempt to be as production-like as is feasible. Therefore, it is
desirable that they be installed and configured with the same methods as production servers.
Staging/production
Staging environments are the last line of test environments. They are interchangeable with
production environments. You install your new releases on the staging servers, check that
everything works, and then swap out your old production servers and replace them with the
staging servers, which will then become the new production servers. This is sometimes called the
blue-green deployment strategy.
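The swap at the heart of this strategy can be sketched in a few lines. The BlueGreenRouter class below is a hypothetical stand-in for whatever load balancer or router directs live traffic; it only models which pool is currently production and which is staging.

```python
class BlueGreenRouter:
    """Toy load balancer that points live traffic at one of two server pools."""

    def __init__(self, production: str, staging: str) -> None:
        self.production = production
        self.staging = staging
        self.versions = {}  # pool name -> installed software version

    def deploy_to_staging(self, version: str) -> None:
        """Install a new release on the pool that is not serving live traffic."""
        self.versions[self.staging] = version

    def swap(self, staging_checks_passed: bool) -> bool:
        """Promote staging to production only if its checks passed."""
        if not staging_checks_passed:
            return False
        self.production, self.staging = self.staging, self.production
        return True
```

After a successful swap, the former production pool becomes the staging pool, so it remains available as a fallback and as the target for the next release.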
The exact details of how to perform this style of deployment depend on the product being
deployed. Sometimes, it is not possible to have several production systems running in parallel,
usually because production systems are very expensive.
At the other end of the spectrum, we might have many hundreds of production systems in a
pool. We can then gradually roll out new releases in the pool. Logged-in users stay with the
version running on the server they are logged in to. New users log in to servers running new
versions of the software.
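A gradual rollout of this kind can be modelled as follows. This is a simplified sketch: the ServerPool class is hypothetical, versions are assumed to be dotted numbers, and the sketch refuses to upgrade a server that still has logged-in users, which is one possible policy rather than a rule stated in these notes.

```python
def version_key(version: str) -> tuple:
    """Turn "1.10.0" into (1, 10, 0) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


class ServerPool:
    """Toy pool: existing sessions stay put, new logins go to the newest version."""

    def __init__(self, servers: dict) -> None:
        self.servers = dict(servers)  # server name -> software version
        self.sessions = {}            # user -> server name

    def upgrade(self, server: str, version: str) -> None:
        """Upgrade one server, but only when nobody is logged in to it."""
        if server in self.sessions.values():
            raise RuntimeError("server still has logged-in users")
        self.servers[server] = version

    def route(self, user: str) -> str:
        """Return the server handling this user's requests."""
        if user not in self.sessions:
            # New login: pick a server running the newest version in the pool.
            newest = max(self.servers.values(), key=version_key)
            self.sessions[user] = next(
                name for name, v in sorted(self.servers.items()) if v == newest
            )
        return self.sessions[user]
```

A user who logged in before an upgrade keeps being routed to the same server, while a user who logs in afterwards lands on a server running the new version.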
The following detail from the larger Continuous Delivery image shows the final systems and
roles involved:
Release management
We have so far assumed that the release process is mostly automatic. This is the dream scenario
for people working with DevOps.
This dream scenario is a challenge to achieve in the real world. One reason for this is that it is
usually hard to reach the level of test automation needed in order to have complete confidence in
automated deploys. Another reason is simply that the cadence of business development doesn't
always match the cadence of technical development. Therefore, it is necessary to enable human
intervention in the release process.
How this is done in practice varies, but deployment systems usually provide a way to describe
which software versions to use in different environments.
The integration test environments can then be set to use the latest versions that have been
deployed to the binary artifact repository. The staging and production servers have particular
versions that have been tested by the quality assurance team.
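One way to picture such a description is a table that maps each environment to the version it should run. The sketch below is illustrative only: the component name, version numbers, and the use of the word "latest" as a marker are assumptions, not the syntax of any particular deployment system.

```python
# Hypothetical environment descriptions: "latest" tracks the newest artifact,
# while staging and production are pinned to versions approved by QA.
ENVIRONMENTS = {
    "integration-test": {"auth-service": "latest"},
    "staging": {"auth-service": "1.4.0"},
    "production": {"auth-service": "1.3.2"},
}

# Versions currently in the binary artifact repository, oldest first (illustrative).
PUBLISHED = {"auth-service": ["1.3.2", "1.4.0", "1.5.0"]}


def resolve_version(environment: str, component: str) -> str:
    """Return the concrete version that should be installed in an environment."""
    wanted = ENVIRONMENTS[environment][component]
    if wanted == "latest":
        return PUBLISHED[component][-1]
    return wanted
```

Human intervention then amounts to editing a pinned version: promoting a release to production means changing one entry in the table rather than performing the installation by hand.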
Scrum, Kanban, and the delivery pipeline
Scrum focuses on sprint cycles, which can occur biweekly or monthly. Kanban can be said to
focus more on shorter cycles, which can occur daily.
The philosophical differences between Scrum and Kanban are a bit deeper, although not
mutually exclusive. Many organizations use both Kanban and Scrum together.
From a software-deployment viewpoint, both Scrum and Kanban are similar. Both require
frequent hassle-free deployments. From a DevOps perspective, a change starts propagating
through the Continuous Delivery pipeline toward test systems and beyond when it is deemed
ready enough to start that journey. This might be judged on subjective measurements or
objective ones, such as "all unit tests are green."
Our pipeline can manage both the following types of scenarios:
• The build server supports the generation of the objective code quality metrics that we
need in order to make decisions. These decisions can either be made automatically or be the basis
for manual decisions.
• The deployment pipeline can also be directed manually. This can be handled with an
issue management system, via configuration code commits, or both.
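The two scenarios above, automatic decisions based on metrics and manual direction, can be combined in a small gate function. This is a sketch under stated assumptions: the BuildMetrics fields and the 80 percent coverage threshold are invented for the example and would differ from one organization to another.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuildMetrics:
    tests_run: int
    tests_failed: int
    coverage_percent: float


def may_promote(metrics: BuildMetrics,
                manual_override: Optional[bool] = None,
                min_coverage: float = 80.0) -> bool:
    """Decide whether a change may move on to the next pipeline stage."""
    # A manual decision, when present, takes precedence over the metrics.
    if manual_override is not None:
        return manual_override
    # Automatic decision: "all unit tests are green" plus an assumed coverage bar.
    all_green = metrics.tests_run > 0 and metrics.tests_failed == 0
    return all_green and metrics.coverage_percent >= min_coverage
```

Leaving manual_override as None gives the fully automatic behavior; passing True or False models a person approving or blocking the change regardless of the metrics.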
So, again, from a DevOps perspective, it doesn't really matter if we use Scrum, Scaled Agile
Framework, Kanban, or another method within the lean or Agile frameworks. Even a traditional
Waterfall process can be successfully managed—DevOps serves all!
Wrapping up – a complete example
To make this clearer, let's have a look at what happens to a concrete change as it propagates
through the systems, using an example:
• The development team has been given the responsibility to develop a change to the
organization's system. The change revolves around adding new roles to the authentication
system. This seemingly simple task is hard in reality because many different systems will be
affected by the change.
• To make life easier, it is decided that the change will be broken down into several smaller
changes, which will be tested independently and mostly automatically by automated regression
tests.
• The first change, the addition of a new role to the authentication system, is developed
locally on developer machines and given best-effort local testing. To really know if it works, the
developer needs access to systems not available in his or her local environment; in this case, an
LDAP server containing user information and roles.
• If test-driven development is used, a failing test is written even before any actual code is
written. After the failing test is written, new code that makes the test pass is written.
• The developer checks in the change to the organization's revision control system,
a Git repository.
• The build server picks up the change and initiates the build process. After unit
testing, the change is deemed fit enough to be deployed to the binary repository, which is
a Nexus installation.
• The configuration management system, Puppet, notices that there is a new
version of the authentication component available. The integration test server is
described as requiring the latest version to be installed, so Puppet goes ahead and installs
the new component.
• The installation of the new component now triggers automated regression tests.
When these have been finished successfully, manual tests by the quality assurance team
commence.
• The quality assurance team gives the change its seal of approval. The change
moves on to the staging server, where final acceptance testing commences.
• After the acceptance test phase is completed, the staging server is swapped into
production, and the production server becomes the new staging server. This last step is
managed by the organization's load-balancing server.
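The test-driven step in the walkthrough above can be sketched as follows. The example assumes an in-memory stand-in for the role storage instead of the LDAP server mentioned in the text; the RoleStore class and its methods are hypothetical names chosen for illustration.

```python
# Step 1: the test is written first. If it is run before RoleStore exists,
# it fails, which is the expected starting point in test-driven development.
def test_new_role_can_be_assigned():
    store = RoleStore()
    store.add_role("auditor")
    store.assign("alice", "auditor")
    assert store.roles_of("alice") == {"auditor"}


# Step 2: just enough code is written to make the test pass.
class RoleStore:
    """In-memory stand-in for the authentication system's role storage."""

    def __init__(self) -> None:
        self._roles = set()
        self._assignments = {}

    def add_role(self, role: str) -> None:
        self._roles.add(role)

    def assign(self, user: str, role: str) -> None:
        if role not in self._roles:
            raise ValueError(f"unknown role: {role}")
        self._assignments.setdefault(user, set()).add(role)

    def roles_of(self, user: str) -> set:
        return set(self._assignments.get(user, set()))
```

A test like this is what the build server would run after the change is checked in, before the component is published to the binary repository.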
Identifying bottlenecks
As is apparent from the previous example, there is a lot going on for any change that propagates
through the pipeline from development to production. It is important for this process to be
efficient.
As with all Agile work, keep track of what you are doing, and try to identify problem areas.
When everything is working as it should, a commit to the code repository should result in the
change being deployed to integration test servers within a 15-minute time span.
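Keeping track of this can start with something as simple as comparing commit and deployment timestamps against the 15-minute target. In the sketch below, the event tuples and the slow_changes helper are hypothetical; in practice the timestamps would come from the revision control system and the deployment tooling.

```python
from datetime import datetime, timedelta

# Target taken from the text: commit to integration test deploy within 15 minutes.
TARGET = timedelta(minutes=15)


def lead_time(committed_at: datetime, deployed_at: datetime) -> timedelta:
    """Time elapsed between the commit and the deployment of a change."""
    return deployed_at - committed_at


def slow_changes(events: list) -> list:
    """Return the ids of changes whose commit-to-deploy time exceeded the target.

    Each event is a (change_id, committed_at, deployed_at) tuple.
    """
    return [change_id for change_id, committed, deployed in events
            if lead_time(committed, deployed) > TARGET]
```

Changes flagged by a report like this are natural starting points when looking for the bottlenecks listed next.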
When things are not working well, a deploy can take days of unexpected hassles. Here are some
possible causes:
• Database schema changes.
• Test data doesn't match expectations.
• Deploys are person dependent, and the person wasn't available.
• There is unnecessary red tape associated with propagating changes.
• Your changes aren't small and therefore require a lot of work to deploy safely. This might
be because your architecture is basically a monolith.