FP Complete builds on cutting-edge open-source devops
technologies, providing devops solutions and consulting to a number
of companies in life sciences & health, financial services, and
secure Internet services. This exposure has given us a chance to
work with some of the best engineering practices in devops.
As we bring more companies forward into the world of devops, we
will continue to share lessons learned and best practices with the
IT community. Today’s best practice: immutability.
What is immutable?
As a software engineering concept, immutability means that once
you assign a value or configuration to some entity, you never
modify it in place -- if you want it to change, you make a new one
and (optionally) tear down the old one.
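To make the idea concrete at the smallest scale, here is a minimal sketch in Python. The ServerConfig class and its fields are hypothetical, chosen only for illustration. A frozen dataclass refuses in-place modification, so the only way to "change" it is to build a new value from the old one.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class ServerConfig:
    """A hypothetical server description; frozen means fields cannot be reassigned."""
    image: str
    port: int

old = ServerConfig(image="app:1.0", port=8080)

# Modifying in place is rejected...
try:
    old.port = 9090
    mutated = True
except FrozenInstanceError:
    mutated = False

# ...so a change means making a new value; the old one is left as it was.
new = replace(old, image="app:1.1")
```

After this runs, old still describes app:1.0, and new is a separate value that can be put into service or thrown away.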
As we know from functional programming (FP) and our work with
Haskell, immutability boosts the reliability and predictability of
system behavior -- preventing bugs and downtime, and increasing the
speed and reproducibility of software development and deployment. A
variable, program, or server is immutable if we guarantee, once
it’s set up, that we won’t modify it in place. This is practical only
when replacements and deployments are cheap.
Devops has made it so much cheaper to build and deploy new
software services, and even whole servers and clusters, that some
companies are now taking advantage of immutability -- leading to
much more reliable online services with less downtime and more
frequent, reliable, repeatable updates.
Old-fashioned servers are mutable
Mutability is how most traditional software development, and most
traditional server operations, are done. Once you create something,
you avoid the terrible cost of creating another one by continuing to
modify the one you already put in place.
Software patches, configuration file changes, even changing the
value of a variable that’s already in use -- all these are examples
of mutability. It’s always been a bit risky, but can be economical
when (1) the cost of creating an entity is very high, and (2) the
cost of bugs and mistakes is very low.
Unfortunately, mutability is a key cause of bugs and
mistakes.
Consider this with a person instead of a computer. If I break my
arm (an accidental mutation of my state), we could try to fix the
problem using an immutable method: make a new Aaron, identical to
the old one, but with a non-broken arm -- then tear down the old
Aaron who is no longer needed. Obviously we just don’t have the
technology to do this -- it’s beyond unaffordable -- so we are
forced to use mutability. We patch up my arm, and wait for it to
heal.
That’s way better than giving up, but now our managed service
(Aaron) is in an unprecedented, irreproducible state. For weeks my
arm is offline, in repair mode, while the rest of me runs. And for
the rest of my life I may have to keep track of the fact that this
arm is a little weaker, and there are some things I cannot do. My
boss now has to remember: Aaron has this special flag that says he
can’t lift some kinds of heavy boxes. What a pain. At least humans
are flexible, so my colleagues won't just break with an "Error 504"
when they try to shake my hand.
If only I could reconstruct the arm in its original state -- or
even a whole new Aaron -- life would be so much easier. We may
never have that for humans, certainly not in my lifetime. But
thanks to modern devops technologies, we do have that option for
servers. They don’t need to be modified in place, and they don’t
need to run in unprecedented, irreproducible configurations that
lead to many of today’s sysadmin emergencies, security breaches,
and downtime.
How do we make servers immutable?
Our FP Deploy approach to devops is based on heavy use of
containers (notably Docker), virtual machines, and where feasible,
virtual networking, cloud computing (AWS, Azure, etc.), and virtual
private clouds (VPCs). Every one of these technologies has
something in common: it allows us to abstract away the work of
creating a running online service. Configurations can be written
declaratively, put under source control (say, in a Git repository),
and run at any time (using, say, Kubernetes).
You want another server? Just run the deploy command again. You
want another whole cluster (a “device” made of multiple servers and
associated networking and data connections)? Just run that deploy
command again.
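As a rough illustration of "declarative configuration plus a repeatable deploy command," here is a toy model in Python. It is not FP Deploy, Docker, or Kubernetes: ServerSpec, Server, and deploy are hypothetical names, and deploy is only a stand-in for a real deployment tool. The point is that the spec is plain data, and running the same deploy twice yields two servers with identical configuration.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class ServerSpec:
    """Declarative description of a server (hypothetical fields)."""
    image: str
    cpus: int
    env: tuple  # tuple of (name, value) pairs, kept immutable and hashable

@dataclass(frozen=True)
class Server:
    """An instance created from a spec; never modified after creation."""
    instance_id: int
    spec: ServerSpec

_ids = itertools.count(1)

def deploy(spec: ServerSpec) -> Server:
    """Stand-in for a real deploy command: create a fresh server from the spec."""
    return Server(instance_id=next(_ids), spec=spec)

# In practice this description would live in a file under source control.
WEB = ServerSpec(image="web:2.3", cpus=2, env=(("LOG_LEVEL", "info"),))

first = deploy(WEB)
second = deploy(WEB)  # "just run the deploy command again"
```

The two servers differ only in identity (instance_id); everything that determines their behavior comes from the one shared spec.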
By slashing the cost of deployment, we make it possible to
create a whole new server painlessly, cheaply, and reproducibly.
Developers just delivered a bug fix? Don’t patch the application
server! Bring up a new instance based on the new software build.
Once you’re happy that it’s running properly, bring down the old,
less-good instance.
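The replace-don't-patch routine can be sketched as a small function. This is a simplified model under our own assumptions: Instance, roll_out, and the healthy callback are hypothetical, and a real rollout would start actual machines or containers rather than build Python objects.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class Instance:
    name: str
    build: str

def roll_out(running: List[Instance], new_build: str,
             healthy: Callable[[Instance], bool]) -> List[Instance]:
    """Return the list of instances that should be running after an upgrade attempt.

    The old instances are never edited. We create replacements, check them,
    and only then drop the old ones. If any replacement fails its check,
    the old instances stay in service untouched.
    """
    replacements = [Instance(name=i.name, build=new_build) for i in running]
    if all(healthy(r) for r in replacements):
        return replacements  # old instances can now be torn down
    return running           # keep the known-good instances

old = [Instance("app-1", "build-41"), Instance("app-2", "build-41")]
upgraded = roll_out(old, "build-42", healthy=lambda inst: True)
rejected = roll_out(old, "build-43", healthy=lambda inst: False)
```

Because nothing is modified in place, a failed upgrade costs nothing: the previous instances are exactly as they were.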
(In a future post we’ll talk about blue-green deployments
and canary deployments -- cost-effective, easy techniques
for making this transition cautiously and gracefully.)
An immutable server has a known, well-understood,
source-controlled, reproducible configuration. No footnotes. We can
be confident that the new production servers are the same as the
engineering test servers, because they were created by running
exactly the same deployment files -- not by a series of manual
admin tweaks that could be incorrect or have latent side
effects.
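One way to picture "created by running exactly the same deployment files" is to fingerprint the deployment description and compare environments. The sketch below is an illustration, not a feature of any particular tool: the fingerprint function and the example dictionaries are hypothetical, and a real check would hash the actual deployment files.

```python
import hashlib
import json

def fingerprint(deployment: dict) -> str:
    """Hash a deployment description in a canonical form.

    sort_keys makes the hash independent of key order, so two descriptions
    with the same content always produce the same fingerprint.
    """
    canonical = json.dumps(deployment, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

test_env = {"image": "app:1.4", "replicas": 3, "packages": ["openssl", "libpq"]}
prod_env = {"replicas": 3, "packages": ["openssl", "libpq"], "image": "app:1.4"}

# A hand-edited server: someone installed an extra package in place.
tweaked = dict(prod_env, packages=["openssl", "libpq", "hotfix-tool"])
```

Here test_env and prod_env produce the same fingerprint, while the manually tweaked description does not, which is exactly the kind of footnote an immutable server never acquires.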
This also makes it easy to recover from disaster, or scale up
for increased load, by redeploying a new server from the same
deployment files.
We can afford to do this only because modern devops makes it so
cheap to create new servers. When doing it right is automated,
predictable, repeatable, and inexpensive, there’s no longer any
reason to do it wrong.
Can whole clusters, or distributed devices, be immutable?
It’s easy to assert that an application server can be made
immutable, because well-architected web app servers are fairly
self-contained and fairly stateless. But what if you are making
larger changes to a whole distributed device, consisting of many
servers and network connections? What if, for example, you have
made matching changes in both your front-end server and your
back-end server? Or in any group of services in a service-oriented
architecture (SOA)?
Then you step up to immutable devices, immutable clusters or
VPCs or distributed systems, using exactly the same methodology. At
FP Complete we routinely create whole devices of 10+ virtual machines
on command, even just for testing purposes, because we’ve automated
it with FP Deploy. Again, why not do it right? Why not be confident
that the whole distributed system is in a known, reproducible
state? We can be much more assured that what worked in test is
going to work in production.
However, sometimes we don’t have that luxury, for example if we
are retrofitting modern devops onto parts of an older system that
has not been uniformly upgraded -- or in any case where the service
we are upgrading is far cheaper to redeploy than another, perhaps
more stateful, service in the same distributed system. That’s OK:
we can have immutable servers as parts of a mutable device. The
administrator of the distributed system now has to track which
services have been replaced with newer versions, but at least for
any given service the servers can be immutable.
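A small sketch of that arrangement, with hypothetical names (Server, Device, replace_service): the device is a mutable collection, but each server inside it is immutable and is only ever swapped out whole.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class Server:
    service: str
    version: str

class Device:
    """A mutable collection of immutable servers (hypothetical model)."""

    def __init__(self) -> None:
        self._servers: Dict[str, Server] = {}

    def replace_service(self, service: str, version: str) -> None:
        # The device changes, but only by swapping in a brand-new Server.
        self._servers[service] = Server(service, version)

    def versions(self) -> Dict[str, str]:
        """What the administrator has to track: the version each service runs."""
        return {name: s.version for name, s in self._servers.items()}

device = Device()
device.replace_service("frontend", "1.0")
device.replace_service("legacy-billing", "0.9")
frontend_before = device._servers["frontend"]
device.replace_service("frontend", "1.1")  # legacy-billing is left alone
```

The versions() table is the bookkeeping cost of a mutable device; the frozen Server values are what keep each individual service predictable.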
What about the database server?
Once we have made it cheap to rebuild and replace servers at any
time, immutability can be a reality, and reliability and
reproducibility go way up. However, not all servers are so easily
replaced. In fact, servers providing key input and output channels
may be outside the scope of our control -- so all we can do is
treat them as external to the immutable device, and understand that
hooking up the inputs and outputs needs to be part of the
declarative script used to bring up any new version of our
device.
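As a sketch of what "hooking up the inputs and outputs" inside the declarative description might look like, consider the model below. All names here (External, DeviceSpec, connection_env) and the example URL are hypothetical. The external service is not deployed by us; it is only referenced, and every new version of the device is wired to it in the same way.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class External:
    """A service we connect to but do not deploy or control."""
    name: str
    url: str

@dataclass(frozen=True)
class DeviceSpec:
    version: str
    externals: Tuple[External, ...]

def connection_env(spec: DeviceSpec) -> Dict[str, str]:
    """Derive the environment variables that wire a new device to its externals."""
    return {f"{e.name.upper()}_URL": e.url for e in spec.externals}

orders_db = External("orders_db", "postgres://db.internal.example:5432/orders")
v1 = DeviceSpec("1.0", (orders_db,))
v2 = DeviceSpec("1.1", (orders_db,))  # new device, same external database
```

Both versions resolve to the same connection settings, so replacing the device does not depend on anyone remembering to reattach it by hand.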
The strongest example of this may be an enterprise database
server. We typically have no intention of building a whole new
database as part of our application build-and-deploy process.
Databases are typically long-running and, by the nature of their
purpose, extremely stateful and extremely mutable. Cloud
services such as RDS make it easy to spin up new database
instances, but often the contents of an enterprise database are too
large or fast-changing to want to rehost -- or we just don’t have
permission to do so. Instead, we leave it in place and accept that
its contents are very mutable.
So even when we use immutability to make our application server
clusters easier to upgrade and less prone to errors, we need to
understand that they will almost always be connected up to other
servers that lack this golden property. Automated deployment with
modern devops, ideally with truly immutable servers, can ensure
that your system looks just like the system that worked in test --
eliminating a lot of surprise downtime and deployment failures. But
even at companies with modern devops, failures still happen -- and
when they do, it’s often because the test system was not exposed to the
same kinds of inputs and outputs, and the same database state, as
the production system. In a future blog post, we’ll look at some
best practices for testing and quality control.
Have a look at your own software deployment practices. Are there
too many (more than zero) manual steps involved in bringing up a
new or updated server? Do you allow, or even require, sysadmins to
make changes to servers that are already up and running? Maybe it's
time to use automated, reproducible deployment, and make the move
to immutable servers.
For more assistance
In addition to our customizable FP Deploy devops solution, FP
Complete offers consulting services, from advice to hands-on
engineering, to help companies migrate to modern devops.