Years ago, Travis CI introduced a method for passing secret values from your repository into the Travis CI system. This method relies on encryption to ensure that anyone can provide a new secret, but only the CI system itself can read those secrets. I've always thought that the Travis approach to secrets was one of the best around, and was disappointed that other CI tools continued to use the more standard "set and update secrets in a web interface" approach. (We'll get into the advantages of the encrypted-secrets approach a bit later.)
Fast-forward to earlier this year, and for running Kube360 deployment jobs, we found that the secrets-in-CI-web-interface approach simply wasn't scaling. So I hacked together a quick script that used GPG and symmetric key encryption to encrypt a secrets.sh file containing the relevant secrets for CI (or, really, CD in this case). This worked, but had some downsides.
A few weeks ago, I finally bit the bullet and rewrote this ugly script. Instead of using GPG and symmetric key encryption, I used sodiumoxide and public key encryption. This addressed essentially all the pain points I had with our CD setup. However, this tool was very much custom-built for Kube360.
Over the weekend, I extracted the general-purpose components of this tool into a new open source repository. This blog post is announcing the first public release of Amber, a tool geared at CI/CD systems for better management of secret data over time. There's basic information in that repo to describe how to use the tool. This blog post is intended to go into more detail on why I believe encrypted-secrets is a better approach than web-interface-of-secrets.
The pain points
There are two primary issues with the standard CI secrets management approach:
- It can be tedious to manage a large number of values inside a web interface. I've personally made mistakes copy-pasting values. And if you ever need to run a script locally for testing purposes, copying all the values out each time is an even bigger pain. (More on that below.)
- It's completely reasonable for secret values to change over time. However, there's no evidence of this in the source repository feeding into the CI system. Instead, the changes happen opaquely: you can never see after the fact that a value changed, and you can never faithfully reproduce an old build with the original values. (This is pretty similar to why we believe your CI build process should be in your code repository.)
With encrypted values within a repository, both of these things change. Adding new encrypted values is now a command line call, which for many of us is less tedious and more foolproof than web interfaces. The encrypted secrets are stored in the Git repository itself, so as values change over time, the files provide evidence of that fact. And checking out an old commit from the repository will allow you to rerun a build with exactly the same secrets as when the commit was made.
Why public key
One of the important changes I made from the GPG script mentioned above was public key, instead of symmetric key, encryption. With symmetric key encryption, you use the same key to encrypt and decrypt data. That means that all people who want to encrypt a value into the repository need access to a piece of secret data. While encrypting new secret values isn't that common an activity, requiring access to that secret data is best avoided.
Instead, with public key encryption, we generate a secret key and public key. The public key lives inside the repository, in the same file as the secrets themselves. With that in place, anyone with access to the repo can encrypt new values, without any ability to read existing values.
Further, since the public key is available in the repository, Amber is able to perform sanity checks to ensure that its secret key matches up with the public key in the repository. While the encryption algorithms we use provide the ability to ensure message integrity, this self-check provides for nicer diagnostics, clearly distinguishing "message corrupted" from "looks like you're using the wrong secret key for this repository."
Minimizing deltas
Amber is optimized for the Git repository case. This includes wanting to minimize the deltas when updating secrets. This resulted in three design decisions:
- The config file format is YAML. Its whitespace-sensitive formatting makes it a great choice to minimize the number of lines affected when updating a secret. While other formats (like TOML) would have been great choices too, I stuck with YAML as, anecdotally, it seems to have stronger overall language support for people wishing to write companion tools.
- In addition to storing the secret name and encrypted value (the ciphertext), Amber additionally includes a SHA256 digest of the secret. This means that, if you encrypt the same value twice, Amber can detect this and avoid generating a new ciphertext. This has the additional benefit of letting users check if they know the secret value without being able to decrypt the file.
- The most natural representation of this data would be a YAML mapping, something like:
secrets:
  NAME1:
    sha256: deadbeef
    cipher: abc123
However, in most languages, the ordering of keys in a mapping is arbitrary. This makes it harder to read these files, and means that arbitrary minor changes may result in large deltas. Instead, Amber stores secrets in an array:
secrets:
  - name: NAME1
    sha256: deadbeef
    cipher: abc123
This all works together to achieve what for me is the goal of secrets in a repository: you can trivially see in a git diff which secret values were added, removed, or updated.
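To make the digest and array ideas concrete, here's a minimal Python sketch. To be clear, this is not Amber's actual code (Amber is written in Rust), and the names `sha256_hex`, `upsert_secret`, and the `encrypt` callback are hypothetical stand-ins I made up for illustration. The sketch assumes that encrypting the same plaintext twice yields different ciphertexts, which is why a digest is needed to notice "nothing changed." It also appends new secrets to the end of the list; whether Amber appends or sorts is not something this post specifies.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Hex SHA256 digest of a secret value (UTF-8 encoded)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def upsert_secret(secrets, name, value, encrypt):
    """Add or update one secret in an ordered list of entries.

    `secrets` is a list of dicts with keys name/sha256/cipher, mirroring
    the YAML array layout. `encrypt` is a hypothetical callback that
    turns a plaintext into a ciphertext string.

    Returns True if the list was modified, False if nothing changed.
    """
    digest = sha256_hex(value)
    for entry in secrets:
        if entry["name"] == name:
            if entry["sha256"] == digest:
                # Same plaintext as before: keep the old ciphertext so
                # the file, and therefore the git diff, stays untouched.
                return False
            # Value changed: only this one entry is rewritten in place.
            entry["sha256"] = digest
            entry["cipher"] = encrypt(value)
            return True
    # New secret: existing entries keep their position.
    secrets.append({"name": name, "sha256": digest, "cipher": encrypt(value)})
    return True


def knows_secret(secrets, name, guess):
    """Check a guessed value against the stored digest, without decrypting."""
    return any(
        entry["name"] == name and entry["sha256"] == sha256_hex(guess)
        for entry in secrets
    )
```

Because entries live in a list and are only ever modified in place or appended, updating one secret changes only that entry's lines, which is exactly what keeps the diff small.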
Local running
Ideally production deployments are only ever run from the official CI/CD system designated for that. However:
- Sometimes during development it's much easier to iterate by doing non-production deployments from your local system.
- As a realist, I have to admit that even the best run DevOps teams may occasionally need to bend the rules for expediency or better debugging of a production issue.
For Kube360, it wasn't unreasonable to have about a dozen secret values for a standard deployment. Copy/pasting all of those to your local machine each time you want to debug an issue wasn't feasible. This encouraged some worst practices, such as keeping the secret values in a plain-text shell script file locally. For a development cluster, that's not the worst thing in the world. But lax security practices in dev tend to bleed into prod too easily.
With Amber, the only value you need to copy is the secret key. Copying a single secret value from CI secrets or a team password manager is a completely different story. It takes 30 seconds at the beginning of a debug session, and I have no objection to doing so.
Even this may be something we can bypass with cloud secrets managers, which I'll mention below.
What's with the name?
As we all know, there are two hard problems in computer science:
- Cache invalidation
- Naming things
- Off-by-one errors
I named this tool Amber based on Jurassic Park, and the idea of some highly important data (dinosaur DNA) being trapped in amber under layers of sediment. This fit in nicely with my image of storing encrypted secrets inside the commits of a Git repository. But since I just finished playing "Legend of Zelda: Skyward Sword," a more appropriate image seems to be:
Implementation
I wrote this tool in Rust. It's a pretty small codebase currently, clocking in at only 445 SLOC of Rust code. It's also a pretty simple overall implementation, if anyone is interested in a first project to contribute to.
Future enhancements
Future enhancements will be driven by internal and customer needs at FP Complete, as well as feedback we receive on the issue tracker and pull requests. I have a few ideas ranging from concrete to nebulous for enhancements:
- Masking values. Currently, amber exec will simply run the child process without modifying its output at all. A standard CI system feature is to mask secret values from output. Implementing such a change in Amber should be straightforward. (Issue #1)
- Tie-ins with cloud secrets management systems. Currently, Amber's only source of the secret key is via environment variables. There are many use cases where grabbing the data from a secrets manager, such as AWS Secrets Manager or Azure Key Vault, would be a better choice. In particular, during deployments, this could allow delegating access to secrets to existing cloud-native permissions mechanisms. See issue #2 and pull request #4 for some more information. One possible approach here is to follow a pattern of naming the secret based on the public key, leading to a zero-config approach to discovering the secret key (since the public key is already in the repository).
- Additional platform support. Currently, we're building executables for x86-64 on Linux (static via musl), Windows, and Mac. Cross compilation support from Rust is great, and one of the reasons I prefer writing CI tools like this in Rust. However, the sodiumoxide library depends on libsodium, so additional GitHub Actions setup will be necessary to get these builds working.
- Auto-generation of passwords. In our Kube360 work, a common need is to generate a temporary password to be used by different components in the system (e.g., an OpenID Connect client secret used by both the Identity Provider and Service Provider). A simple amber gen-password CLIENT_SECRET subcommand may be nice.
- I haven't released this code to crates.io, but if there's interest I'd be happy to do so.
- Support for encrypted files in addition to encrypted environment variables. I haven't really thought through what the interface for this may look like.
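As a rough illustration of the masking idea from the first bullet, here's a small Python sketch. Amber doesn't do this today, and this is not how it would necessarily be implemented there. The function names `mask_secrets` and `run_masked` and the `******` placeholder are my own inventions. The sketch also captures all of the child's output before masking it, which is a simplification: a real implementation would want to stream output and handle secrets that straddle buffer boundaries.

```python
import os
import re
import subprocess
import sys


def mask_secrets(text, secret_values, placeholder="******"):
    """Replace every occurrence of any secret value with a placeholder."""
    # Longest first, so a secret that contains another is masked whole.
    values = sorted({v for v in secret_values if v}, key=len, reverse=True)
    if not values:
        return text
    pattern = re.compile("|".join(re.escape(v) for v in values))
    return pattern.sub(lambda _match: placeholder, text)


def run_masked(argv, secrets):
    """Run a child process with secrets in its environment, masking its output.

    `secrets` maps environment variable names to plaintext values.
    Returns (exit code, masked stdout, masked stderr).
    """
    env = {**os.environ, **secrets}
    proc = subprocess.run(argv, env=env, capture_output=True, text=True)
    values = list(secrets.values())
    return (
        proc.returncode,
        mask_secrets(proc.stdout, values),
        mask_secrets(proc.stderr, values),
    )


if __name__ == "__main__":
    # Hypothetical secret and child command, purely for demonstration.
    child = [sys.executable, "-c", "import os; print('token=' + os.environ['API_TOKEN'])"]
    code, out, err = run_masked(child, {"API_TOKEN": "s3cr3t-value"})
    print(out, end="")
```

Masking is a safety net against accidental leaks in logs, not a security boundary: a child process that transforms a secret (say, by base64-encoding it) before printing would slip right past it.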
Get started
There are instructions in the repo for getting started with Amber. The basic steps are:
- Download the executable from the release page or build it yourself
- Use amber init to create an amber.yaml file and a secret key
- Store the secret key somewhere safe, like your password manager, and additionally within your CI system's secrets
  - In theory, this is the last value you'll ever store there!
- Add your secrets with amber encrypt
- Commit amber.yaml to your repository
- Modify your CI scripts to download the Amber executable and use amber exec to run commands that need secrets
More from FP Complete
FP Complete is an IT consulting firm specializing in server-side development, DevOps, Rust, and Haskell. A large part of our consulting involves improving and automating build and deployment pipelines. If you're interested in additional help from FP Complete in one of these domains, please contact us.
Interested in working with a team of DevOps, Rust, and Haskell engineers to solve real world problems? We're actively hiring senior and lead DevOps engineers.