Application development has changed drastically in the past decade.
Client-side applications used to be just that: client-side. Desktop
applications used to be self-contained. They sat in sharp contrast to
web applications, which inherently required some hybrid of server- and
client-side programming. But the world has changed. Whether it's for
automated cloud backup, more advanced server-side computation,
telemetry, or collaboration, desktop applications are increasingly
dependent on a server-side component to help them out.
As mentioned, web applications already blurred this line significantly.
Many of the features of web development are now sought in desktop apps,
such as rolling updates. As mobile apps have begun to dominate the
landscape, they have jumped into this hybrid world from the beginning.
Unfortunately, while client- and server-side development are both
programming, there are large differences in the details. For someone
familiar with desktop, mobile, or even frontend web development, it's
easy to get tripped up on the server, at both the software and
infrastructure levels.
This post is going to provide an overview of what is involved in a
cloud software deployment pipeline that provides:
- High availability
- Fault tolerance
- Scalability
- Auto recovery
Are you interested in more posts on related topics? Please let us know what you'd like to hear about.
Cloud versus desktop
When you write a desktop application, you have several goals in mind.
You may be trying to provide an intuitive user interface that works
across different screen sizes and handles both mouse-based and
touch-based interaction. You may need to support different versions of
an operating system or entirely different operating systems. You may
want to reduce the downloaded executable size and simplify the
installation process.
Server-side, cloud software deployments typically face radically different
constraints. Let's see some examples.
Side note: I'm talking about desktop versus cloud here, but many of
the same concepts carry over to mobile and frontend web development as
well. The desktop provides for the starkest comparison, which is why
I've focused on it. Similarly, though we're talking about the cloud,
on-prem deployments fit most of the same models. But again, the cloud
really emphasizes the differences.
Code size
On the server, application size is typically close to irrelevant. Sure, it makes a
difference if your application is 10MB vs. 100GB. But ephemeral cloud
storage is cheap, and bandwidth within a cloud is plentiful. Servers
don't typically care too much about that difference.
On the other hand, minimizing startup time is crucial (we'll see why in
a bit). An installer that spends 30 seconds unzipping a compressed
payload would be better served by skipping compression and using more
bandwidth, for instance.
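To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The payload sizes, the 100 MB/s link speed, and the `startup_seconds` helper are assumptions chosen for illustration; only the 30-second unpack time comes from the example above.

```python
def startup_seconds(payload_mb: float, bandwidth_mb_per_s: float,
                    unpack_s: float = 0.0) -> float:
    """Time to download a payload and unpack it before the service can start."""
    return payload_mb / bandwidth_mb_per_s + unpack_s


# Hypothetical payload: 1000 MB uncompressed, 300 MB compressed,
# fetched over an assumed 100 MB/s link inside the cloud.
compressed = startup_seconds(payload_mb=300, bandwidth_mb_per_s=100, unpack_s=30)
uncompressed = startup_seconds(payload_mb=1000, bandwidth_mb_per_s=100)

print(f"compressed:   {compressed:.0f}s")
print(f"uncompressed: {uncompressed:.0f}s")
```

With these assumed numbers, the larger uncompressed payload is ready in roughly a third of the time. On a slow link the result would flip, so measure your own environment before deciding.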
Compatibility
You don't need to support multiple versions of operating systems. You
have full control of where your application is deployed. (And as we'll
see later, containers make this even more flexible.) Want Ubuntu 18.04?
Windows Server 2016 with a specific set of libraries pre-installed? No
problem!
But unlike a desktop application, you're not going to have a human being
performing installation. A failed or unreliable installation step won't
result in a few frustrating emails. It will result in downtime for
potentially every user of your service.
Hardware
On the desktop, you have no control over hardware reliability. A user's
machine can crash. It can have a complete hard drive failure. Users may
accidentally delete data. The user may have a power outage. All of these
represent frustrations for a user but are outside of your
responsibilities.
Not so on the server. You are fully responsible for setting up and
running your server infrastructure. The cloud has excellent fault
tolerance properties and boasts the ability to scale up and down based
on demand easily. But that doesn't come for free. You need to set up
your servers to follow automated deployment techniques, leverage
autoscaling features, set up reasonable metrics to guide them, and span
multiple Availability Zones to protect you from cloud failures.
What is the cloud?
People throw around the term "cloud" often. But what exactly
distinguishes the cloud? Rentable hardware, such as dedicated hosts and
Virtual Private Servers (VPSs), has been around for a long time. In
some ways, you can look at the cloud as an extension of those existing
technologies. But in many ways, the cloud represents a deeper shift.
This shift focuses on two innovations in the cloud development space:
- Automated infrastructure provisioning, where machine creation and
  destruction are a normal part of business. With previous rentable
  hardware, most people tended to create, manually provision, and
  manually maintain machines. The cloud encourages fundamentally
  different approaches.
- A wealth of cloud services to provide functionality and reliability
  you couldn't easily provide yourself. The best example I can think
  of here is cloud blob storage. Providing highly available data
  storage with near-100% durability guarantees is a challenging
  problem, especially with rented virtual machines. Cloud providers
  step in and offer ready-made solutions to these kinds of challenging
  problems.
The concepts of cloud computing have become ubiquitous in server
development and management these days. Existing data center providers
have begun offering more cloud-like functionality. Virtual clouds for
internal data centers are becoming commonplace. In other
words, these techniques are valuable, they work, and they apply in many
places.
A solid cloud software deployment
There are many moving parts to perfecting a cloud deployment. The
techniques vary depending on variables like the development language,
geographic distribution goals, regulations (like data privacy), and
trade-offs between uptime guarantees and hardware costs.
That said, at FP Complete, we've found that there are a few common
themes that underlie almost all our cloud deployments. Let's see what
those are.
Automated provisioning
Setting up a machine should be a fully automated process. Operators used
to spend significant time manually installing base operating systems,
installing packages, running updates, configuring firewalls, and more.
But in a world of machines that are created and destroyed at will, this
isn't an option.
Our recommendation is to focus on the techniques of immutable
infrastructure. You should have scripts that automate the
installation of all dependencies. These scripts may take significant
time to run. If you run these installation steps while creating new
machines, you delay the point at which you have a working machine.
Instead, it's best to run these steps as part of a Continuous
Integration (CI) system and capture the result as a Docker
container image or, in some cases, a virtual machine image.
By pushing provisioning to an earlier step, you can test this process
more easily, spin up new machines more quickly, and risk fewer surprises
in production. Which brings us to...
Autoscaling
Most server traffic ends up following a bursty workload pattern. It's
unusual to have the same level of traffic throughout the day. Instead,
traffic may spike to ten times its norm during a few hours in the middle
of the workday. Weekends may show almost no traffic.
In a pre-cloud world, the standard approach to this is to provision
enough machine capacity at all times to handle the peak traffic you'll
ever see. This approach avoids downtime and latency but increases
hardware costs. It also demands knowing in advance what your peak
workload will be, which may not be possible.
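A rough calculation shows where the extra hardware cost comes from. The traffic profile, the capacity of 250 requests per second per machine, and the helper name are all hypothetical. The sketch compares a fleet sized for the peak around the clock with one sized to demand each hour.

```python
import math


def machines_needed(requests_per_s: float, capacity_per_machine: float,
                    minimum: int = 1) -> int:
    """Smallest machine count that can serve the given load."""
    return max(minimum, math.ceil(requests_per_s / capacity_per_machine))


# Hypothetical weekday: 100 req/s baseline with a 4-hour spike to ten times that.
hourly_load = [100] * 10 + [1000] * 4 + [100] * 10
CAPACITY = 250  # assumed requests per second one machine can handle

per_hour = [machines_needed(load, CAPACITY) for load in hourly_load]

# Peak provisioning pays for the busiest hour all day long.
static_machine_hours = max(per_hour) * len(hourly_load)
# A demand-sized fleet pays only for what each hour needs.
elastic_machine_hours = sum(per_hour)

print(static_machine_hours, elastic_machine_hours)
```

Under these assumptions, peak provisioning uses 96 machine-hours per day against 36 for the demand-sized fleet. A real deployment would also keep a minimum number of machines for redundancy, which narrows the gap.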
In the cloud, it is common to use services like Auto-scaling Groups
(ASGs) to create an elastic number of machines. The configuration for
such a service will typically look something like:
- I want a minimum of three machines, running in three different
  availability zones, to provide resilience against cloud failures.
- I want to run a specific machine image on each of these machines.
  That machine image should be preconfigured with our application and
  all its dependencies by a CI pipeline.
- I want to run a health check by making an HTTP request to the
  service. If the health check fails, destroy the machine, and then
  replace it.
- I want to monitor CPU load across all machines. If it stays above 80%
  for five minutes, add another machine to the group. If it drops
  below 60%, remove a machine. (Monitoring may also depend on things
  like request latency or other metrics.)
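The scaling rule in the last item can be sketched as a small decision function. This is an illustration of the logic, not any cloud provider's API. The `ScalingPolicy` and `desired_count` names are hypothetical, as is the cap of ten machines. The sketch also assumes one CPU sample per minute and applies the five-minute window to scale-down as well, which the list above does not specify.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScalingPolicy:
    min_machines: int = 3        # one per availability zone
    max_machines: int = 10       # assumed cost cap, not from the list above
    scale_up_cpu: float = 80.0   # percent
    scale_down_cpu: float = 60.0 # percent
    window: int = 5              # samples, assuming one sample per minute


def desired_count(current: int, cpu_samples: Sequence[float],
                  policy: ScalingPolicy = ScalingPolicy()) -> int:
    """Return how many machines the group should have after this evaluation."""
    recent = list(cpu_samples)[-policy.window:]
    target = current
    if len(recent) == policy.window:
        if all(sample > policy.scale_up_cpu for sample in recent):
            target = current + 1   # sustained high load: add a machine
        elif all(sample < policy.scale_down_cpu for sample in recent):
            target = current - 1   # sustained low load: remove a machine
    # Never leave the configured bounds, whatever the metrics say.
    return max(policy.min_machines, min(target, policy.max_machines))


print(desired_count(3, [85, 90, 88, 92, 86]))  # sustained load above 80%
print(desired_count(4, [40, 45, 50, 42, 38]))  # sustained load below 60%
```

Requiring the whole window to cross a threshold keeps a single noisy sample from adding or removing machines. The gap between 60% and 80% serves the same purpose by stopping the group from flapping.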
This kind of setup provides autoscaling, high availability,
auto-healing, and machine failure resilience. Then we put a load
balancer in front of the group of nodes, and clients automatically
connect to a healthy machine.
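To show how a load balancer and health checks fit together, here is a minimal round-robin sketch. The class and function names, the machine addresses, and the two-second timeout are assumptions for illustration. Real load balancers cache health results and do far more than this.

```python
import urllib.request
from typing import Callable, List


def http_health_check(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


class RoundRobinBalancer:
    """Hands out machines in turn, skipping any that fail the health check."""

    def __init__(self, machines: List[str],
                 is_healthy: Callable[[str], bool]) -> None:
        self._machines = list(machines)
        self._is_healthy = is_healthy
        self._next = 0

    def pick(self) -> str:
        # Try each machine at most once per call.
        for _ in range(len(self._machines)):
            machine = self._machines[self._next]
            self._next = (self._next + 1) % len(self._machines)
            if self._is_healthy(machine):
                return machine
        raise RuntimeError("no healthy machines available")


# Usage with a stubbed health check, so the example runs without a network.
down = {"10.0.1.11"}
balancer = RoundRobinBalancer(
    ["10.0.0.10", "10.0.1.11", "10.0.2.12"],
    is_healthy=lambda machine: machine not in down,
)
picks = [balancer.pick() for _ in range(4)]
print(picks)
```

In a real setup you would pass something like `http_health_check` with each machine's health URL instead of the stub. Clients never learn that `10.0.1.11` is down, because requests simply go to the remaining machines.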
Containers and orchestration
Creating machine images and deploying brand new machines can be
time-intensive and difficult to test on a local machine. And for many
workloads (like microservices architectures), having a separate machine
for each service is overkill. Instead, we'd like to use techniques that
more easily pack multiple services onto a single machine and allow for
local development and running without a full virtual machine.
Containers and container orchestration have emerged as a standard
approach, with Docker and Kubernetes as the de facto standard solutions.
Tools like these not only reduce hardware costs with more efficient
service packing but also provide great features like blue/green
deployments out of the box.
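For readers who haven't met blue/green deployments, the idea is to keep two environments, roll the new version onto the idle one, and switch traffic only once it passes its health check. The sketch below models just that switch. `BlueGreenRouter` and its methods are hypothetical names, and this is not how Kubernetes or any other orchestrator implements it.

```python
from typing import Callable, Dict


class BlueGreenRouter:
    """Tracks two environments and which one receives live traffic."""

    def __init__(self, version: str) -> None:
        # Both environments start on the same version; "blue" is live.
        self.versions: Dict[str, str] = {"blue": version, "green": version}
        self.live = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    @property
    def live_version(self) -> str:
        return self.versions[self.live]

    def deploy(self, new_version: str,
               is_healthy: Callable[[str], bool]) -> bool:
        """Install on the idle environment; switch traffic only if healthy."""
        target = self.idle
        previous = self.versions[target]
        self.versions[target] = new_version
        if not is_healthy(new_version):
            # Live traffic was never touched; restore the idle environment.
            self.versions[target] = previous
            return False
        self.live = target
        return True


router = BlueGreenRouter("v1")
router.deploy("v2", is_healthy=lambda version: True)   # switches to green
router.deploy("v3", is_healthy=lambda version: False)  # rejected, stays on v2
print(router.live, router.live_version)
```

Because the old environment keeps running until the switch, a bad release never receives traffic. Rolling back is just pointing traffic at the other environment again.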
We're big believers in this approach to server management at FP
Complete. Our flagship product, Kube360, focuses on accelerating your
adoption of this technology in your organization.
Continuous Deployment
Automation underlies most of what makes cloud deployments bulletproof.
Continuous Integration focuses on building and testing your software in
an automated fashion, producing binary artifacts that can be deployed.
Continuous Deployment, by contrast, focuses on automating the deployment
of these artifacts onto servers.
Most cloud deployments involve multiple cloud services, such as blob
storage, DNS, virtual machines, load balancers, and relational
databases. We recommend following declarative infrastructure techniques,
where you specify the desired state of the system in a config format and
allow a tool to provision your cloud resources. We'll often use
Terraform for this at the infrastructure level. In addition, Kubernetes
manifest files are another form of declarative infrastructure.
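The core of the declarative approach is a comparison between the state you want and the state that exists, which yields a list of changes to apply. The sketch below is a toy version of that comparison. The `plan` function and the resource names are hypothetical, and real tools such as Terraform also handle dependencies, ordering, and much more.

```python
from typing import Dict, List, Tuple

Resources = Dict[str, dict]


def plan(desired: Resources, actual: Resources) -> List[Tuple[str, str]]:
    """Compute the actions that turn the actual state into the desired state."""
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("delete", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
        # Resources that already match need no action.
    return actions


# Hypothetical desired state, as it might be written in a config file.
desired = {
    "bucket": {"versioning": True},
    "dns": {"record": "app.example.com"},
    "load_balancer": {"port": 443},
}
# Hypothetical state currently running in the cloud account.
actual = {
    "bucket": {"versioning": True},
    "load_balancer": {"port": 80},
    "old_vm": {"size": "small"},
}

for action, name in plan(desired, actual):
    print(action, name)
```

Running the plan twice after applying it yields no actions, which is the property that makes declarative tooling safe to run repeatedly from a deployment pipeline.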
Getting it wrong
There are lots of moving pieces to get right. Many of us cut our teeth
deploying server software the old way. This typically involves manually
configuring a machine, building an executable on your local machine,
FTPing the binary over to the server, and running it in a terminal
session. While simpler in many ways, following this kind of approach
opens you up to many failure modes:
- It's easy to make an operator mistake
- It's difficult to roll back to a known good version of the server
- It's difficult to get a zero-downtime upgrade
- You're not protected against machine failure
- Your service will not gracefully handle increases in traffic
I come from a developer, not an operator, background. I used to deploy
all my software this way. I've come around to the idea that spending
time on a proper cloud deployment strategy is a worthwhile investment.
Conclusion
While many ideas carry over from client-side to server-side development,
there is still a lot to master. Cloud-based deployment is a
science unto itself, with a new set of tools and requirements, and a
regularly shifting landscape of best practices.
Our focus at FP Complete is leveraging our experience to help companies
integrate best practices in their deployments, employ best-in-class
DevOps practices, and leverage best-in-class tools and programming
languages for developing reliable, scalable, and secure server-side
software.
Need help with a server problem? We'd love to chat. You can also check out our DevOps offerings.