TL;DR: Generating and distributing service account keys poses severe security risks to your organization. They are long-lived credentials that are not automatically rotated. These keys can be leaked accidentally or maliciously, allowing attackers to gain access to your sensitive GCP resources. Additionally, actions taken with these keys cannot be attributed back to a human. You don’t actually have to download these long-lived keys. There’s a better way!
Service Accounts, OAuth2 and You
For some background, almost every change you want to make in Google Cloud, from creating a GKE cluster to reading from a GCS bucket, is handled using an API. This API is authenticated using the OAuth2 protocol, which basically means there’s a short-lived (1 hour by default) access token attached to every authenticated request.
If you’re familiar with the whole “Sign in with Google” popup, that’s OAuth2 hard at work authenticating you with your Google credentials. Once you’re authenticated, an access token is attached to all your API requests whether you’re using gcloud, terraform, SDKs, or the console.
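To make "an access token is attached" a bit more concrete, here is a minimal sketch of what that looks like for a raw request to the Cloud Storage JSON API. The bucket name and token value are placeholders I made up, and the helper name is hypothetical; the request is only built, never sent.

```python
import urllib.request


def build_list_objects_request(bucket: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GCS JSON API request that lists a bucket's objects."""
    url = f"https://storage.googleapis.com/storage/v1/b/{bucket}/o"
    request = urllib.request.Request(url, method="GET")
    # The OAuth2 access token travels as a bearer token in the Authorization header.
    request.add_header("Authorization", f"Bearer {access_token}")
    return request


if __name__ == "__main__":
    # Placeholder values: substitute a real bucket and a real short-lived token.
    req = build_list_objects_request("my-example-bucket", "placeholder-token")
    print(req.get_method(), req.full_url)
    print("Authorization:", req.get_header("Authorization"))
```

Tools like gcloud and the SDKs do the equivalent of this for you on every call.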
In Google Cloud, we use a lot of automation and web services which similarly need those tokens, but robots aren’t very good at opening browsers and typing in passwords so they need some sort of verifiable identity. Enter Service Accounts.
Service accounts allow automated users to prove their identity using a public/private key pair, with the private key distributed as a JSON file. Like users and groups, a service account can be bound to IAM roles to do things in GCP. To make an API request, a service account signs a JWT with its private key, and the Google authentication system verifies that signature with the public key and grants an access token. This basic (and oversimplified) concept is important for later parts of this post. If you’d like to read more about this flow, check out RFC 7523.
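For the curious, the JWT in that flow is roughly shaped like the sketch below. This only builds the header and claim set and joins them into the string that would be signed; the RS256 signature itself needs the private key and a crypto library, so I've left it out. The service account email is a made-up placeholder and the function names are hypothetical.

```python
import base64
import json
import time

# Google's OAuth2 token endpoint, used as the JWT audience.
TOKEN_URI = "https://oauth2.googleapis.com/token"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def encode_segment(obj: dict) -> str:
    """Serialize one JWT segment (header or claims) to compact JSON, then base64url."""
    return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def build_signing_input(sa_email: str, scope: str, now=None, lifetime: int = 3600) -> str:
    """Return '<header>.<claims>', the string the private key would sign."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": sa_email,              # the service account asserting its identity
        "scope": scope,               # what the resulting access token may do
        "aud": TOKEN_URI,             # who the assertion is intended for
        "iat": issued_at,
        "exp": issued_at + lifetime,  # the assertion itself is short-lived
    }
    return f"{encode_segment(header)}.{encode_segment(claims)}"


if __name__ == "__main__":
    print(build_signing_input(
        "my-sa@my-project.iam.gserviceaccount.com",  # placeholder email
        "https://www.googleapis.com/auth/cloud-platform",
    ))
```

The signed result is then posted to the token endpoint with the grant type urn:ietf:params:oauth:grant-type:jwt-bearer, and an access token comes back.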
Service accounts are very easy to use within Google Cloud. Most, if not all, compute resources (e.g. GCE instances, GKE Pods, Cloud Functions) support the ability to attach a service account. This allows these resources to act as the service account and call Google SDKs and APIs within the bounds of the permissions granted to it. You should never need to generate and download a service account key to use a service account within Google Cloud infrastructure. The risk emerges when developers think they need a service account to accomplish a task from their own machine, so they generate and download a key.
I cannot tell you how often I see documentation or tutorials instruct folks to download these service account keys and use them indefinitely or, worse, store them in their source code working directory. Doing this, you’re literally one missing line in a .gitignore away from committing this highly sensitive secret to GitHub and getting breached.
Short-lived tokens FTW!
Remember how I said that if you have the service account’s private key, you can sign a JWT and be granted an API access token? Well, there is a way to get that token without ever needing to download the key.
Let’s say I have a service account that is used for GKE, so it has the role roles/container.developer. We’ll call this service account firstname.lastname@example.org. Let’s further say that my user, email@example.com, isn’t allowed to download the key for this service account and doesn’t have direct permissions to mess with GKE. What I do have is the magic role roles/iam.serviceAccountTokenCreator, which lets me request short-lived tokens for the service account.
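One way to put that role to work is gcloud's global --impersonate-service-account flag. Below is a minimal single-file sketch, assuming gcloud is installed and you are logged in as your own user. The short name "gke", the service account email, and the function names are all hypothetical placeholders.

```python
import subprocess

# Hypothetical short names mapped to hypothetical service account emails.
# Add one entry per service account you want to impersonate.
SERVICE_ACCOUNTS = {
    "gke": "gke-deployer@my-project.iam.gserviceaccount.com",
}


def impersonated_gcloud_args(short_name: str, gcloud_args: list) -> list:
    """Build a gcloud command line that runs as the named service account."""
    sa_email = SERVICE_ACCOUNTS[short_name]
    # gcloud uses your own credentials to mint a short-lived token for sa_email,
    # so no key file is ever downloaded.
    return ["gcloud", *gcloud_args, f"--impersonate-service-account={sa_email}"]


def run_as(short_name: str, gcloud_args: list) -> int:
    """Run gcloud as the service account and return its exit code."""
    return subprocess.run(impersonated_gcloud_args(short_name, gcloud_args)).returncode


if __name__ == "__main__":
    # Show the command that would run, without executing it.
    print(" ".join(impersonated_gcloud_args("gke", ["container", "clusters", "list"])))
```

Calling run_as("gke", ["container", "clusters", "list"]) would then list clusters with the service account's permissions rather than my own.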
You could make this more robust by reading from a config file if you like, but I think a single-file script gets the point across. Now for every new service account you want to use, simply add it and a short name for it to this script and you’re off.
What about Terraform?
We’ve spent all this time talking about a better way to consume service accounts through gcloud, but what about the very common use case of Terraform? For development purposes, we need to test our infrastructure as code somehow. Thankfully, the Google Terraform provider supports directly passing an OAuth2 token as an environment variable.
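As a rough sketch of how that can be wired up: ask gcloud for an impersonated access token, then hand it to Terraform through the provider's GOOGLE_OAUTH_ACCESS_TOKEN environment variable. This assumes both gcloud and terraform are on your PATH; the function names are my own and the service account email you pass in is up to you.

```python
import os
import subprocess


def fetch_impersonated_token(sa_email: str) -> str:
    """Ask gcloud for a short-lived access token for the given service account."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token",
         f"--impersonate-service-account={sa_email}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def terraform_env(access_token: str, base_env=None) -> dict:
    """Return a copy of the environment with the token set for the Google provider."""
    env = dict(os.environ if base_env is None else base_env)
    env["GOOGLE_OAUTH_ACCESS_TOKEN"] = access_token
    return env


def terraform_plan_as(sa_email: str) -> int:
    """Run `terraform plan` using a token that only exists in the child's environment."""
    env = terraform_env(fetch_impersonated_token(sa_email))
    return subprocess.run(["terraform", "plan"], env=env).returncode
```

Because the token is only placed in the child process's environment, nothing is written to disk.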
You could further simplify this by wrapping it in a Makefile, but regardless, now you have a token that will only live in your environment for 1 hour and be useless to an attacker after that!
Attribution and Logging
The SecOps folks may be thinking, how do I attribute and audit actions taken by a user impersonating a service account? In Cloud Logging, every API call executed by a service account that has been impersonated records the identity of the impersonating user alongside the service account.
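To give a feel for how that attribution can be pulled back out, here is a small sketch that reads the delegation info from an audit log entry that has already been parsed into a dict. The field names follow the Cloud Audit Logs authenticationInfo structure; the sample entry is hand-written for illustration (not real log output), and the emails and function name are placeholders.

```python
def impersonating_principals(log_entry: dict) -> list:
    """Return the emails of the principals that impersonated the service account."""
    auth_info = log_entry.get("protoPayload", {}).get("authenticationInfo", {})
    principals = []
    for delegation in auth_info.get("serviceAccountDelegationInfo", []):
        email = delegation.get("firstPartyPrincipal", {}).get("principalEmail")
        if email:
            principals.append(email)
    return principals


# Hand-written, trimmed-down example entry (illustrative only).
SAMPLE_ENTRY = {
    "protoPayload": {
        "authenticationInfo": {
            # The identity the API call ran as.
            "principalEmail": "gke-deployer@my-project.iam.gserviceaccount.com",
            # The human (or other principal) who impersonated it.
            "serviceAccountDelegationInfo": [
                {"firstPartyPrincipal": {"principalEmail": "dev@example.com"}}
            ],
        }
    }
}

if __name__ == "__main__":
    print(impersonating_principals(SAMPLE_ENTRY))
```

An entry with no delegation info simply yields an empty list, which is what you would see for calls made with a downloaded key.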
I’ll also point out that this level of attribution is impossible if you allow users to download service account keys, since they are effectively using shared credentials (assuming more than one person has access to download the same key). If I’m a forensic analyst or an auditor, there’s no way I could figure out definitively which human executed a given API request unless each user has their own service account or key and that mapping is recorded somewhere. Even then, it is very difficult to trace back.
There are some use cases where downloading a service account key to your workstation is necessary, but they are not the norm. Keeping secrets like this short-lived locally should be the goal, the same way we should enable MFA and not use hunter2 as our password. For the obvious cases where service account keys must be downloaded to use GCP resources from your datacenter or another cloud, I’d recommend taking a look at HashiCorp Vault, which has a plugin to check out short-lived service account keys.