Understanding cloud auth

The topics of authentication and authorization usually appear simple but turn out to hide significant complexity. That's because, at its core, auth is all about answering two questions:

Who are you
What are you allowed to do

However, the devil is in the details. Seasoned IT professionals, software developers, and even typical end users are fairly accustomed at this point to many of the most common requirements and pain points around auth.

Cloud authentication and authorization is not drastically different from non-cloud systems, at least in principle. However, there are a few things about the cloud and its common use cases that introduce some curve balls:

As with most auth systems, cloud providers each have their own idiosyncracies
Cloud auth systems have almost always been designed from the outset to work API first, and interact with popular web technologies
Security is usually taken very seriously in cloud, leading to workflows arguably more complex than other systems
Cloud services themselves typically need some method to authenticate to the cloud, e.g. a virtual machine gaining access to private blob storage
Many modern DevOps tools are commonly deployed to cloud systems, and introduce extra layers of complexity and indirection

This blog post series is going to focus on the full picture of authentication and authorization, focusing on a cloud mindset. There is significant overlap with non-cloud systems in this, but we'll be covering those details as well to give a complete picture. Once we have those concepts and terms in place, we'll be ready to tackle the quirks of individual cloud providers and commonly used tooling.

Goals of authentication

We're going to define authentication as proving your identity to a service provider. A service provider can be anything from a cloud provider offering virtual machines, to your webmail system, to a bouncer at a bartender who has your name on a list. The identity is an equally flexible concept, and could be "my email address" or "my user ID in a database" or "my full name."

To help motivate the concepts we'll be introducing, let's understand what goals we're trying to achieve with typical authentication systems.

Allow a user to prove who he/she is
Minimize the number of passwords a user has to memorize
Minimize the amount of work IT administrator have to do to create new user accounts, maintain them, and ultimately shut them down
- That last point is especially important; no one wants the engineer who was just fired to still be able to authenticate to one of the systems
Provide security against common attack vectors, like compromised passwords or lost devices
Provide a relatively easy-to-use method for user authentication
Allow a computer program/application/service (lets call these all apps) to prove what it is
Provide a simple way to allocate, securely transmit, and store credentials necessary for those proofs
Ensure that credentials can be revoked when someone leaves a company or an app is no longer desired (or is compromised)

Goals of authorization

Once we know the identity of something or someone, the next question is: what are they allowed to do? That's where authorization comes into play. A good authorization provides these kinds of features:

Fine grained control, when necessary, of who can do what
Ability to grant common sets of permissions as a bundle, avoiding tedium and mistakes
A centralized collection of authorization rules
Ability to revoke a permission, and see that change propagated quickly to multiple systems
Ability to delegate permissions from one identity to another
- For example: if I'm allowed to read a file on some cloud storage server, it would be nice if I could let my mail client do that too, without the mail program pretending it's me
To avoid mistakes, it would be nice to assume a smaller set of permissions when performing some operations
- For example: as a super user/global admin/root user, I'd like to be able to say "I don't want to accidentally delete systems files right now"

In simple systems, the two concepts of authentication and authorization is straightforward. For example, on a single-user computer system, my username would be my identity, I would authenticate using my password, and as that user I would be authorized to do anything on the computer system.

However, most modern systems end up with many additional layers of complexity. Let's step through what some of these concepts are.

Users and policies

A basic concept of authentication would be a user. This typically would refer to a real human being accessing some service. Depending on the system, they may use identifiers like usernames or email addresses. User accounts are often times given to non-users, like automated processes or Continuous Integration (CI) jobs. However, most modern systems would recommend using a service account (discussed below) or similar instead.

Sometimes, the user is the end of the story. When I log into my personal Gmail account, I'm allowed to read and write emails in that account. However, when dealing with multiuser shared systems, some form of permissions management comes along as well. Most cloud providers have a robust and sophisticated set of policies, where you can specify fine-grained individual permissions within a policy.

As an example, with AWS, the S3 file storage service provides an array of individual actions from the obvious (read, write, and delete an object) to the more obscure (like setting retention policies on an object). You can also specify which files can be affected by these permissions, allowing a user to, for example, have read and write access in one directory, but read-only access in another.

Managing all of these individual permissions each time for each user is tedious and error prone. It makes it difficult to understand what a user can actually do. Common practice is to create a few policies across your organization, and assign them appropriately to each user, trying to minimize the amount of permissions granted out.

Groups

Within the world of authorization, groups are a natural extensions of users and policies. Odds are you'll have multiple users and multiple policies. And odds are that you're likely to have groups of users who need to have similar sets of policy documents. You could create a large master policy that encompasses the smaller policies, but that could be difficult to maintain. You could also apply each individual policy document to each user, but that's difficult to keep track of.

Instead, with groups, you can assign multiple policies to a group, and multiple groups to a user. If you have a billing team that needs access to the billing dashboard, plus the list of all users in the system, you may have a BillingDashboard policy as well as a ListUsers policy, and assign both policies to a BillingTeam group. You may then also assign the ListUsers policy to the Operators group.

Roles

There's a downside with this policies and groups setup described above. Even if I'm a superadmin on my cloud account, I may not want to have the responsibility of all those powers at all times. It's far too easy to accidentally destroy vital resources like a database server. Often, we would like to artificially limit our permissions while operating with a service.

Roles allow us to do this. With roles, we create a named role for some set of operations, assign a set of policies to it, and provide some way for users to assume that role. When you assume that role, you can perform actions using that set of permissions, but audit trails will still be able to trace back to the original user who performed the actions.

Arguably a cloud best practice is to grant users only enough permissions to assume various roles, and otherwise unable to perform any meaningful actions. This forces a higher level of stated intent when interacting with cloud APIs.

Service accounts

Some cloud providers and tools support the concept of a service account. While users can be used for both real human beings and services, there is often a mismatch. For example, we typically want to enable multi-factor authentication on real user accounts, but alternative authentication schemes on services.

One approach to this is service accounts. Service accounts vary among different providers, but typically allow defining some kind of service, receiving some secure token or password, and assigning either roles or policies to that service account.

In some cases, such as Amazon's EC2, you can assign roles directly to cloud machines, allowing programs running on those machines to easily and securely assume those roles, without needing to store any kinds of tokens or secrets. This concept nicely ties in with roles for users, making role-based management of both users and services and emerging best practice in industry.

RBAC vs ACL

The system described above is known as Role Based Access Control, or RBAC. Many people are likely familiar with the related concept known as Access Control Lists, or ACL. With ACLs, administrators typically have more work to do, specifically managing large numbers of resources and assigning users to each of those per-resource lists. Using groups or roles significantly simplifies the job of the operator, and reduces the likelihood of misapplied permissions.

Single sign-on

Most modern DevOps platforms have multiple systems, each requiring separate authentication. For example, in a modern Kubernetes-based deployment, you're likely to have:

The underlying cloud vendor
- Both command line and web based access
Kubernetes itself
- Both command line access and the Kubernetes Dashboard
A monitoring dashboard
A log aggregation system
Other company-specific services

That's in addition to maintaining a company's standard directory, such as Active Directory or G Suite. Maintaining this level of duplication among user accounts is time consuming, costly, and dangerous. Furthermore, while it's reasonable to securely lock down a single account via MFA and other mechanisms, expecting users to maintain such information for all of these systems securely is unreasonable. And some of these systems don't even provide such security mechanisms.

Instead, single sign-on provides a standards-based, secure, and simple method for authenticating to these various systems. In some cases, user accounts still need to be created in each individual system. In those cases, automated user provisioning is ideal. We'll talk about some of that in later posts. In other cases, like AWS's identity provider mechanism, it's possible for temporary identifiers to be generated on-the-fly for each SSO-based login, with roles assigned.

Deeper questions arise about where permissions management is handled. Should the central directory, like Active Directory, maintain permissions information for all systems? Should a single role in the directory represent permissions information in all of the associated systems? Should a separate set of role mappings be maintained for each service?

Typically, organizations end up including some of each, depending on the functionality available in the underlying tooling, and organizational discretion on how much information to include in a directory.

Going deeper

What we've covered here sets the stage for understanding many cloud-specific authentication and authorization schemes. Going forward, we're going to cover a look into common auth protocols, followed by a review of specific cloud providers and tools, specifically AWS, Azure, and Kubernetes.

Subscribe to our blog via email
Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts, and can unsubscribe any time.

Do you like this blog post and need help with Next Generation Software Engineering, Platform Engineering or Blockchain & Smart Contracts? Contact us.