During automatic infrastructure deployment on AWS, a common question is: what is the best way to deliver sensitive information to EC2 instances, or, more precisely, to the applications running on them? There are numerous solutions, such as placing the information into the user-data initialization script or simply SFTPing it onto the instance. Although these are perfectly viable solutions, they have well-known drawbacks, such as the size limitation of the former and the open SSH port required by the latter. There are also comprehensive solutions, such as HashiCorp Vault with Consul, that can do a lot more than just deliver credentials, but those can be overkill for common, simple scenarios.
Introduction
There is a way to solve secret management using only resources provided by AWS, together with a neat tool called credstash. You will find a nice guide on how to use the tool and a description of how it works if you follow the link, but the basic idea behind credstash is that it stores key/value data in DynamoDB while encrypting values with KMS (Key Management Service). As a result, only a user or resource that has read access to the DynamoDB table and permission to use the KMS Master key can access that data. The encrypted data can then be accessed through a very simple command line interface. In the simplest case the process looks like this:
On your laptop:
$ credstash put my-secret high-entropy-password
my-secret has been stored
On the EC2 instance:
$ credstash get my-secret
high-entropy-password
Boom, you transferred the password across the internet in a totally secure fashion using nothing but AWS services. At a high level, here is what happened in the above example. During the put operation:
- a new random data encryption key was generated using KMS
- the value high-entropy-password was encrypted with that key
- the data encryption key was itself encrypted with the KMS Master key, while its plaintext form was discarded
- the encrypted data, together with the encrypted data encryption key, was stored in a DynamoDB table under the name my-secret.
The KMS Master key used by credstash is, by default, the one with the alias alias/credstash, while the default DynamoDB table name is credential-store. This whole key-wrapping technique is necessary because the KMS Master key can only encrypt up to 4KiB of data at a time.
During the get operation the process is inverted:
- pull the blob with the name my-secret out of the DynamoDB table
- decrypt the data key by using the KMS API and the KMS Master key
- decrypt the actual secret using the decrypted data key.
As you probably suspect, access to the secret data can be controlled on two levels, namely through access to the DynamoDB table and to the KMS Master key. More on that later.
There are a number of tools that are used for automatic deployment on AWS. Terraform, being one of them, stands out as an amazing tool that allows you to describe your infrastructure as code, and it works not just with AWS but with many other providers. In this post we'll use nothing but terraform, so if you are already familiar with it, read on; otherwise, the Getting Started tutorial could be beneficial if you want to try things out while moving along.
Initial Setup
Installing terraform is pretty straightforward: since it is written in Go, you can just download a binary for your operating system from the terraform downloads page.
Credstash, on the other hand, is written in Python and as such can be installed with pip. It does have a few non-Python dependencies that need to be installed beforehand. Here is how you'd get it on Ubuntu:
$ sudo apt-get update
$ sudo apt-get install -y libssl-dev libffi-dev python-dev python-pip
$ sudo -H pip install --upgrade pip
$ sudo -H pip install credstash
If you'd rather not install anything globally, you can use Python virtual environments, or even download an implementation of credstash ported to a different language, for instance gcredstash, written in Go, which, just like terraform, can be downloaded as a static executable and is fully compatible with credstash. Implementations in other languages are listed in the README.
Minimal
Naturally, the example from the Introduction will not work just out of the box, so prior to using credstash, a database table and an encryption key must be created. Going through the credstash documentation reveals that a DynamoDB table with the default name credential-store can be created by running credstash setup, while the KMS Master key has to be created manually:
$ credstash -t test-table setup
Creating table...
Waiting for table to be created...
Table has been created. Go read the README about how to create your KMS key
Well, that's no fun; we ought to be able to automate the whole process. The credstash-setup terraform module will do just that, taking care of the initial setup for us. Remember, we need to do this only once, and we must make sure not to run terraform destroy, unless you really want your secret data to be permanently deleted.
Create a main.tf file:
module "credstash" {
source = "github.com/fpco/fpco-terraform-aws/tf-modules/credstash-setup"
}
Then run terraform in the folder containing that file:
$ terraform get
$ terraform apply
Once applied, terraform will create a DynamoDB table with the name credential-store and a KMS key with the alias alias/credstash. After deployment is complete, you can go ahead and start using credstash on your local machine.
Remote state
Although, it is not strictly required, I would highly recommend
using terraform's remote state feature in order to
later simplify getting the values created by this setup. We even
have a terraform module that can help you with getting the s3-remote-state bucket that was created.
terraform {
backend "s3" {
encrypt = "true"
region = "us-west-1"
bucket = "remote-tfstate"
key = "credstash/terraform.tfstate"
}
}
module "credstash" {
source = "github.com/fpco/fpco-terraform-aws/tf-modules/credstash-setup"
}
The main benefit of using remote terraform state is that the credstash-related resources can be created just once, and their reuse can be automated during infrastructure deployment by all team members. Another, more involved way would be to manually copy and paste the outputs of this module into others as input variables, and that just sounds like too much work.
Roles and Grants
So far, usage of credstash was limited to users that have implicit access to all KMS keys and DynamoDB tables, i.e. admins, power users and the like. Running credstash on an EC2 instance will simply result in a permission error, but that is exactly where it is most useful. The best way to allow an EC2 instance access to the secrets is to:
- create an IAM profile with an IAM role, and attach that profile to the EC2 instance we are deploying
- create IAM policies that allow reading and writing from/to the database table credential-store, and attach those policies to the above-mentioned role
- create KMS grants for the Master key that give permission for encryption and/or decryption with that key to a grantee, which will also be the above-mentioned IAM role.
The first two steps we can easily automate with terraform, but the last step has to be done with aws-cli or directly through an API with some SDK. But wait, I said that we won't be using anything besides terraform, and that is indeed so: aws-cli is an implicit dependency that has to be installed, even though we will not be interacting with it directly.
Let's start with creating the IAM policies first, as they can be reused as many times as we'd like.
... # also remote state, just as above
module "credstash" {
source = "github.com/fpco/fpco-terraform-aws/tf-modules/credstash-setup"
enable_key_rotation = true
create_reader_policy = true
create_writer_policy = true
}
output "kms_key_arn" {
value = "${module.credstash.kms_key_arn}"
}
output "reader_policy_arn" {
value = "${module.credstash.reader_policy_arn}"
}
output "writer_policy_arn" {
value = "${module.credstash.writer_policy_arn}"
}
output "install_snippet" {
value = "${module.credstash.install_snippet}"
}
output "get_cmd" {
value = "${module.credstash.get_cmd}"
}
output "put_cmd" {
value = "${module.credstash.put_cmd}"
}
One of the greatest features of terraform, in my opinion, is that it knows exactly what needs to be done in order to reach the desired state. If you already ran terraform apply in the previous example, it will figure out everything that needs to be changed and apply only those changes, without touching resources that need no modification.
When you run terraform apply you should see something along these lines:
Outputs:
get_cmd = /usr/local/bin/credstash -r us-east-1 -t credential-store get
install_snippet = { apt-get update;
apt-get install -y build-essential libssl-dev libffi-dev python-dev python-pip;
pip install --upgrade pip;
pip install credstash; }
kms_key_arn = arn:aws:kms:us-east-1:123456789012:key/87b3526c-8100-11e7-9de5-4bff2f10d02a
put_cmd = /usr/local/bin/credstash -r us-east-1 -t credential-store put -k alias/credstash
reader_policy_arn = arn:aws:iam::123456789012:policy/credential-store-reader
writer_policy_arn = arn:aws:iam::123456789012:policy/credential-store-writer
At this point credstash is set up and we can verify that it works. The helper snippets are targeted at Ubuntu-based systems, but can easily be adapted to other operating systems.
Let's install credstash on a local machine, store a test value, and pull it back out of credential-store afterwards:
$ sudo -H bash -c "$(terraform output install_snippet)"
...
$ $(terraform output put_cmd) test-key test-value
test-key has been stored
$ $(terraform output get_cmd) test-key
test-value
We can also set a new value for the key, while auto-incrementing its version, by passing the -a flag:
$ $(terraform output put_cmd) -a test-key new-test-value2
test-key has been stored
$ $(terraform output get_cmd) test-key
new-test-value2
There are a few other useful credstash features that don't have helper snippets the way get_cmd and put_cmd do, since they are less likely to be used in automated scripts, but they can still be easily constructed using the terraform outputs. It's worth noting that all previously stored versions of a value remain available unless deleted manually:
$ credstash -r us-east-1 -t credential-store get test-key -v 0000000000000000000
test-value
$ credstash -r us-east-1 -t credential-store list
test-key -- version 0000000000000000000
test-key -- version 0000000000000000001
$ credstash -r us-east-1 -t credential-store delete test-key
Deleting test-key -- version 0000000000000000000
Deleting test-key -- version 0000000000000000001
Deploy EC2
Using credstash directly is extremely simple, but setting everything up for it to work on EC2 instances can be a bit daunting, and that is what this section and the credstash-grant terraform module are about.
The simplest example that comes to mind, and one that is actually pretty common in practice, is deploying an EC2 instance with an nginx webserver serving a web page (or working as a reverse proxy), while protecting it with Basic Authentication. We will use credstash to automatically retrieve credentials that we store prior to deploying the EC2 instance:
$ $(terraform output put_cmd) nginx-username admin
nginx-username has been stored
$ $(terraform output put_cmd) nginx-password foobar
nginx-password has been stored
The full example can be found in this gist, but here are the parts that are of most interest to us.
Using the credstash-grant module will effectively allow read access to the DynamoDB table by attaching the reader policy to an IAM role and creating a KMS grant, thus allowing that IAM role to use the KMS Master key for decryption. This grant will automatically be revoked upon destruction, so there is no need to worry about dangling settings that need to be cleaned up.
# lookup credstash remote state
data "terraform_remote_state" "credstash" {
backend = "s3"
config {
region = "us-west-1"
bucket = "remote-tfstate"
key = "credstash/terraform.tfstate"
}
}
module "credstash-grant" {
source = "github.com/fpco/fpco-terraform-aws/tf-modules/credstash-grant"
kms_key_arn = "${data.terraform_remote_state.credstash.kms_key_arn}"
reader_policy_arn = "${data.terraform_remote_state.credstash.reader_policy_arn}"
roles_count = 1
roles_arns = ["${aws_iam_role.credstash-role.arn}"]
roles_names = ["${aws_iam_role.credstash-role.name}"]
}
You might notice that we created a writer policy during the credstash-setup stage, but didn't supply its ARN to the module. This ensures that we give the EC2 instance read-only access to the secret store. If the ability to store secrets from within EC2 is desired, supplying writer_policy_arn to the module is all that is necessary for that to work.
This is the part where credstash is called on the EC2 side:
resource "aws_instance" "webserver" {
...
associate_public_ip_address = true
iam_instance_profile = "${aws_iam_instance_profile.credstash-profile.id}"
user_data = <<USER_DATA
#!/bin/bash
${data.terraform_remote_state.credstash.install_snippet}
apt-get install -y nginx
BASIC_AUTH_USERNAME="$(${data.terraform_remote_state.credstash.get_cmd} nginx-username)"
BASIC_AUTH_PASSWORD="$(${data.terraform_remote_state.credstash.get_cmd} nginx-password)"
echo -n "$BASIC_AUTH_USERNAME:" > /etc/nginx/.htpasswd
openssl passwd -apr1 "$BASIC_AUTH_PASSWORD" >> /etc/nginx/.htpasswd
...
You are not required to use the helper snippets if you don't want to, but they can be very helpful in the long run, especially if you later choose to customize the KMS key name or the DynamoDB table, or simply try to use credstash in another AWS region. The get_cmd and put_cmd snippets encapsulate this information, so we won't have to chase down all the places where we used credstash in order to update these values.
Applying terraform will deploy our webserver. After it is fully initialized, we can verify that it works as expected:
$ curl -s https://$(terraform output instance_ip) | grep title
<head><title>401 Authorization Required</title></head>
$ curl -s https://admin:foobar@$(terraform output instance_ip) | grep title
<title>Welcome to nginx!</title>
Contexts
In a setup where we are using credstash on only one EC2 instance, we have nothing else to worry about. Nowadays, though, that is not such a common scenario: you might have a database cluster running on a few instances, a webserver managed by an auto scaling group, a message broker running on other instances, and so forth. Each one of those services requires its own set of credentials, TLS certificates, or what have you. In these kinds of scenarios we need to make sure that instances running the webserver will not have access to secrets that are meant only for instances running the database. To complicate things even more, we often have dev, test, stage, and prod environments, which we would ideally like to isolate from each other as much as possible.
KMS Encryption Context is a straightforward solution to this problem.
By itself, a context doesn't give any extra level of protection, but when combined with the constraints specified during KMS grant creation, it turns into a powerful protection and isolation mechanism that significantly increases overall security.
Here is how encryption contexts work in a nutshell: whenever you run credstash put name secret foo=bar, the key/value pair {"foo": "bar"}, called the context, becomes cryptographically bound to the ciphertext that is stored in the database, and therefore the same key/value pair will be required in order to decrypt the secret. Keep in mind that this pair does not have to be anything complicated; in fact, it must not contain any sensitive information, as it will be visible in CloudTrail logs.
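The binding can be imitated locally to build intuition. In this toy sketch (my own construction, not what KMS actually does internally), the encryption key is derived from a master secret plus the context string, so supplying a different context yields a different key and the ciphertext does not decrypt back to the secret:

```shell
# Toy illustration of context binding: derive the encryption key from a
# master secret plus the context, so decryption only works with the
# exact same context. Not KMS's real construction.
set -e
workdir=$(mktemp -d)
master="not-a-real-kms-key"

derive() {  # HMAC(master, context) stands in for KMS binding the context
  printf '%s' "$1" | openssl dgst -sha256 -hmac "$master" -r | cut -d' ' -f1
}

derive "foo=bar" > "$workdir/key.ok"
printf 'secret-value' |
  openssl enc -aes-256-cbc -pbkdf2 -pass "file:$workdir/key.ok" \
    -out "$workdir/blob"

# Matching context: the derived key is identical, decryption succeeds.
good=$(openssl enc -d -aes-256-cbc -pbkdf2 \
  -pass "file:$workdir/key.ok" -in "$workdir/blob")

# Mismatched context: a different key comes out, and decryption either
# errors out or produces garbage instead of the secret.
derive "foo=baz" > "$workdir/key.bad"
bad=$(openssl enc -d -aes-256-cbc -pbkdf2 \
  -pass "file:$workdir/key.bad" -in "$workdir/blob" 2>/dev/null || true)

echo "$good"
rm -r "$workdir"
```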
During grant creation, which the credstash-grant module performs for us, we can supply reader_context and writer_context, which will prevent credstash from running the get and put commands, respectively, unless exactly the same context is passed as extra arguments. If I create a grant with reader context env=test service=database, there is no way for an instance with that IAM role to read secrets that were encrypted with the env=prod service=database or env=test service=webserver contexts, or with no context at all for that matter. It has to match exactly.
When analyzing security, it is important to look at what happens when the system does get compromised. If an attacker acquires root access, then all bets are off and secrets can easily be extracted from memory or file storage. If privilege escalation did not occur, however, one possible concern is that credentials, even those stored with a specific context, remain accessible long after the deployment is complete. In a simple single-machine setup this can be alleviated by revoking the KMS grant once initialization is complete, thus preventing long-term access to the KMS Master key. But if an EC2 instance is deployed as part of an Auto Scaling Group, that approach will not work: access to the secrets is needed at all times, since EC2 instances can come and go at any time.
As a side note: there is a way to control access to KMS keys through an IAM policy, just as is done with the DynamoDB table, but because encryption context constraints are not available at the policy level, we resort to explicit grant creation, precisely for the isolation reasons described in this section.
Other use cases
Besides the obvious use case of passing credentials to EC2 instances, credstash can be a potential solution in other areas. The practical size limit for the data being encrypted and stored with credstash is on the order of ~100KiB, so you can store things much larger than a short passphrase, making it perfect for storing things like TLS certificates and SSH keys. For example, we were able to successfully supply all of the TLS certificates used for mutual authentication in Elastic's Beats protocol during a deployment of Logstash, while automatically generating the certificates using certstrap.
There is no reason the information stored with credstash has to be sensitive in the first place. Ordinary configuration files can be stored and retrieved just as well, taking that heavy burden away from the initialization script. In fact, this idea could be taken up a notch, and a credstash pull command could be set up in crontab. This way, an application running on a server can be configured to periodically reload its configuration, giving an administrator the ability to update the configuration dynamically without the use of any other provisioning tools.
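As a sketch, such a crontab entry might look like this; the secret name app-config, the file path, and the service name are all hypothetical:

```shell
# Refresh the application's config from credstash every 5 minutes.
*/5 * * * * /usr/local/bin/credstash -r us-east-1 -t credential-store get app-config > /etc/myapp/config.yml && systemctl reload myapp
```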
AWS Lambda also uses IAM roles, so credstash, being an open source Python tool, could in theory be used there as well. In fact, it is so easy to use credstash that I have actually started managing my own credentials with it.
A couple of examples for some of these use cases, plus more documentation on the credstash-related terraform modules, can be found in the fpco/fpco-terraform-aws repository.
Conclusion
The overall benefits of using credstash should be pretty obvious at this point. Sensitive information is encrypted, stored securely in DynamoDB, and available at all times. Furthermore, we have the ability to fine-tune which parts of it can be accessed, and by which parties, through the use of IAM roles and KMS grants. There are no more worries about manually encrypting your secrets and finding the best way to move them across the wire. You are no longer limited by the tiny 14KiB size limit of user-data. There is no more need to set up SSH connections just to pass over a couple of TLS certificates. By keeping credentials in a central remote location, you are less likely to forget to remove them before pushing code to your repository, or to leave them in unencrypted form in the terraform state. More importantly, it gives you a unified, programmatic way to manage credentials, bringing more structure and order to your DevOps.