Cat Cai is currently the Director of Platform Engineering at Fair. When not coding, you can find her writing about coding or powerlifting.

Role-based Access Control (RBAC) on Kubernetes seems totally sensible on paper. It’s obvious: of course an organization would want to enforce user and application access policies on a cluster. The official Kubernetes documentation provides a lot of guidance on how the RBAC API objects work, but there’s little on best practices for deploying it in a way that actually functions for an organization. The developer’s tried-and-true Google-fu method, searching for “Kubernetes best practices,” turns up the same lack of information wrapped up in listicles of security mantras (separation of duties and all that jazz).

Managing RBAC in a way that suits the size of your company can be confusing and overwhelming. Before rushing to implement policy, it’s worth figuring out what problems RBAC is actually trying to solve.

You’ll typically find that implementing RBAC, like all things security, is a game of seesaw between limiting access and operational ease.

There’s the principle of least privilege, which is something your security team has been bugging you about. But what does it actually mean? Both the applications running in the cluster and the developers accessing it should only get access to the resources that they need. Your applications should only have access to read their own secrets and configmaps. Pretty simple. Your developers are a little tougher to figure out, since there are a multitude of developer roles and they’re prone to shift over time. For example, a mobile developer may only need read-only access to your Kubernetes clusters, whereas a lead platform developer will need admin access.

On the other end, you’re trying to balance operational ease. As a cluster administrator, you want to be able to quickly grant a new user a single role that gives them all the access they need (instead of the insanity of granting individual privileges). This makes auditing easier, since you’ll know exactly who has access to what resources through the role that they have. You also want to be able to spell out roles that don’t become a huge operational headache over time.

I’ve spelled out three realistic approaches to Kubernetes RBAC. I know these because I’ve done them.

Approach #1 – Cluster Admin for Everyone

If you’ve been doing Kubernetes since ye olde days of <1.8 clusters, then you might already be grandfathered into a “cluster admin for all” (which I’ll lovingly dub CAFA) setup. The easiest way to ensure that users and applications retained access and functioned the way they did prior to RBAC was to just grant them all something akin to cluster-admin.
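For the curious, a CAFA setup can be as small as one binding. This is a minimal sketch, not a record of any particular cluster: the binding name is made up, while `cluster-admin`, `system:authenticated`, and `system:serviceaccounts` are built-in Kubernetes names.

```yaml
# Cluster admin for everyone: every authenticated user and every
# service account gets the built-in cluster-admin ClusterRole.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cafa-binding  # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  # All authenticated users
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:authenticated
  # All service accounts in all namespaces
  - kind: Group
    apiGroup: rbac.authorization.k8s.io
    name: system:serviceaccounts
```

If something like this exists in your cluster, it’s the single object to look at when you decide to tighten things up.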

If you’re a small startup strapped for engineering resources, that’s also likely the way things have stayed. Business velocity often beats out security, and that’s okay. It’s just worth knowing what your risks are.

RBAC implemented: ✅

Principle of least privilege: 🙈 You’ve just built in your security engineers’ worst nightmare. Every developer has access to everything (yes, your organization’s state secrets!) Your applications have the ability to run API commands against the Kubernetes cluster. If the application is compromised, then it’s safe to assume under this implementation that everything is compromised.

Operational ease: 🤷🏻‍♀️ This implementation is easy upfront, since your developers either have access to everything in the cluster, or don’t have access at all. The thing to note is that you’ll incur organizational risks and headaches as you grow. Your developers may unintentionally delete your configmaps and secrets. Oops.

Verdict: It certainly gets the job done. It’s probably a great approach for small startups with a high level of trust, little time, and few resources to completely build out (or need) a stricter RBAC policy. It’s not such a great approach for companies pushing an engineering headcount of 30 or more.

Approach #2 – RBAC Baby Steps

After your organization’s Nth incident requiring manual intervention due to a stray keystroke that wipes out all cluster configurations and/or secrets, you’ve probably outgrown the CAFA solution. Between these incidents and aggressive prodding from an exasperated security team, you’ve been forced to carve out RBAC roles for both your applications and users. What does this even look like?

All of your applications should get a service account that isn’t just the default one.

```yaml
# Specify a service account name for your application
apiVersion: v1
kind: ServiceAccount
metadata:
  name: data-engineering-app
  namespace: data-engineering
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-engineering-app
  namespace: data-engineering
spec:
  replicas: 2
  # apps/v1 Deployments require a selector that matches the pod template labels
  selector:
    matchLabels:
      app: data-engineering-app
  template:
    metadata:
      labels:
        app: data-engineering-app
    spec:
      # Reference the service account in your deployment
      serviceAccountName: data-engineering-app
      containers:
        - name: data-engineering-app
          image: nginx:latest
          ports:
            - containerPort: 80
          env:
            - name: AWS_REGION
              value: "us-west-2"
            - name: WHATS_THIS_EVEN_NEED_CONFIGS_FOR
              valueFrom:
                configMapKeyRef:
                  name: data-engineering-app
                  key: totally.real.configs
          resources:
            requests:
              cpu: 10m
              memory: 40Mi
            limits:
              cpu: 10m
              memory: 40Mi
```

Then, you’ll specify a Role and a corresponding RoleBinding that ensures that your Service Account only gets access to the K8s API resources it needs. In this case, the example app only needs to read its own configs.

```yaml
# Create a role that allows the deployment to read configs
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  # Remember that roles are scoped to a namespace
  namespace: data-engineering
  name: umbrella:data-engineering-app
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    # Must match the name of the ConfigMap the deployment references
    resourceNames: ["data-engineering-app"]
    verbs: ["get"]
---
# Create a rolebinding to bind the role to the service account
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  # The binding must live in the same namespace as the role it references
  namespace: data-engineering
  name: umbrella:data-engineering-app
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: umbrella:data-engineering-app
subjects:
  - kind: ServiceAccount
    name: data-engineering-app
    namespace: data-engineering
```

Enforcing RBAC on the user side is similar in concept. You can create RoleBindings for individual users, but this is not the recommended path as there’s a high risk of operator insanity.
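To make the per-user route concrete, here is roughly what one such binding looks like. It is only a sketch: the user name and binding name are invented for illustration, and `view` is the built-in read-only ClusterRole.

```yaml
# One binding, for one user, in one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: data-engineering
  name: jane-view-data-engineering  # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view  # built-in read-only ClusterRole
subjects:
  - kind: User
    apiGroup: rbac.authorization.k8s.io
    name: jane@example.com  # hypothetical user
```

Multiply that by every developer and every namespace they touch, and the insanity part becomes clear.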

The better approach for sane RBAC is to create groups that your users map to; how this mapping is done depends on your cluster’s authenticator (e.g. the aws-iam-authenticator for EKS uses mapRoles to map a role ARN to a set of groups).

Groups and the APIs they have access to are ultimately determined by an organization’s needs, but generic reader (for new engineers just getting the hang of things), writer (for your engineers), and admin (for you) roles are a good start. (Hey, it’s better than admin for everyone.)
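On EKS, that mapping lives in the `aws-auth` ConfigMap in `kube-system`. The sketch below assumes a placeholder account ID and IAM role names; only the ConfigMap name, namespace, and the `mapRoles` layout come from the authenticator itself.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # Anyone who assumes this IAM role lands in the engineering group
    - rolearn: arn:aws:iam::111122223333:role/engineering  # placeholder ARN
      username: engineering:{{SessionName}}
      groups:
        - umbrella:engineering
    # Same idea for the operations group
    - rolearn: arn:aws:iam::111122223333:role/operations   # placeholder ARN
      username: operations:{{SessionName}}
      groups:
        - umbrella:operations
```

The group names are what your RoleBindings and ClusterRoleBindings reference as `kind: Group` subjects, so access changes happen in one place instead of per user.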

```yaml
---
# An example reader ClusterRole. It's a ClusterRole so you're not worried about
# namespaces at this time. Remember, we're talking generic reader/writer/admin roles.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: reader
rules:
  - apiGroups: ["*"]
    resources:
      - deployments
      - configmaps
      - pods
      - secrets
      - services
    verbs:
      - get
      - list
      - watch
---
# An example reader ClusterRoleBinding that gives read permissions to
# the engineering and operations groups
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: reader-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: reader
subjects:
  - kind: Group
    name: umbrella:engineering
  - kind: Group
    name: umbrella:operations
---
# An example writer ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: writer
rules:
  - apiGroups: ["*"]
    resources:
      - deployments
      - configmaps
      - pods
      - secrets
      - services
    verbs:
      - create
      - delete
      - patch
      - update
---
# An example writer ClusterRoleBinding that gives write permissions to
# the operations group
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  # Named and bound to writer (not reader) so it actually grants write access
  name: writer-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: writer
subjects:
  - kind: Group
    name: umbrella:operations
```
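That covers the reader and writer roles. For the admin role, one option is to skip writing a new ClusterRole and bind the built-in `cluster-admin` to a small group. This is an assumption about how you might round out the trio, and the `umbrella:platform` group name is hypothetical.

```yaml
# An example admin ClusterRoleBinding that reuses the built-in
# cluster-admin ClusterRole for a (hypothetical) platform group
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: Group
    name: umbrella:platform  # hypothetical group name
```

Keeping that group short is the whole point; everyone else lives in reader or writer.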

RBAC implemented: ✅✅ (RBAC’d so hard, double checks)

Principle of Least Privilege: 🤷🏻‍♀️ For the most part, yes. There are discrete reader, writer, and admin roles. Your applications all get specific access. Time to pat ourselves on the back. Job well done…

Operational ease: Sure, it’s not quite as easy as giving everyone and everything God powers, but the setup laid out here isn’t too bad overall. Except, you notice over time that you’re being consulted more and more by different teams on RBAC policies. Nobody outside of your organization’s Platform/DevSecOps/Infrastructure/Tools team can be arsed to figure out what RBAC for Kubernetes even is. You find yourself often having to update your policies to recognize new custom resource definitions for the cool Kubernetes integrations your data engineers keep spinning up. Depending on the type of authenticator you’re using, you’re also likely manually provisioning developers into the group(s) that they belong in to get the correct access. You’re beginning to feel like a glorified YAML dev.

Approach #3 – Automation
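One way to take some of the sting out of the CRD churn is ClusterRole aggregation, a built-in Kubernetes feature that folds any labeled ClusterRole into a parent role. The sketch below is one possible layout, not a description of the setup above: the label key, the role names, and the `crontabs` resource in `stable.example.com` are all placeholders.

```yaml
# Parent role: its rules are filled in by the control plane from every
# ClusterRole carrying the matching label.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: reader-aggregated  # hypothetical name
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        rbac.example.com/aggregate-to-reader: "true"  # placeholder label
rules: []  # managed automatically; manual edits here are overwritten
---
# Shipped alongside a new CRD by whichever team owns it
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: reader-crontabs  # hypothetical name
  labels:
    rbac.example.com/aggregate-to-reader: "true"
rules:
  - apiGroups: ["stable.example.com"]  # placeholder CRD group
    resources: ["crontabs"]
    verbs: ["get", "list", "watch"]
```

Because the parent’s rules are managed for you, the core read rules would need to live in their own labeled ClusterRole too. The payoff is that teams adding CRDs ship a small labeled role instead of asking you to edit the central one.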

This last approach… isn’t really an approach. It’s more a series of guidelines to get you on a path to RBAC success. You’ll naturally adopt a lot of these measures over time after feeling the pain from #1 and #2.

For your applications, you’ll likely want to adopt a similar approach to the generic reader/writer/admin approach from #2. Most applications are unlikely to make heavy use of the Kubernetes API, other than reading their own configs and secrets.

For CI/CD-related applications, you can be a little more lax on API groups. Creating a knowledge base of general RBAC templates and guidelines for the rest of the company to use is a great first step. If they’re easy to use and find, your developers will end up just copy/pasting them (which is pretty much what you want).
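As a starting point for that knowledge base, a CI/CD template might look something like the following. It is a sketch under assumptions: the `ci-deployer` service account, the `ci` namespace it runs in, and the exact resource list are illustrative and should be trimmed to what your pipeline really touches.

```yaml
# A namespace-scoped deployer role: broader than an application role,
# but still nowhere near cluster-admin.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: data-engineering
  name: umbrella:ci-deployer  # hypothetical name
rules:
  # Core API group: objects a typical rollout creates or updates
  - apiGroups: [""]
    resources: ["configmaps", "secrets", "services"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # Workload controllers
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # Batch workloads
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: data-engineering
  name: umbrella:ci-deployer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: umbrella:ci-deployer
subjects:
  - kind: ServiceAccount
    name: ci-deployer  # hypothetical service account
    namespace: ci      # hypothetical namespace where the CI system runs
```

Copying this per namespace keeps the blast radius of a compromised pipeline to the namespaces it deploys into.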

Depending on your authentication method in Kubernetes, user provisioning may be one of the more painful parts of handling RBAC. strongDM has a Kubernetes integration that standardizes and simplifies granting a user role-based access to a cluster.

Rather than creating direct user mappings, strongDM’s solution relies on generating roles all within Kubernetes and populating strongDM with a client certificate and key. Then, users can be provisioned access to the cluster in the same standardized way as all data sources.

Over time, as your organization grows, the generic reader/writer/admin approach doesn’t scale. (A solution that a random internet stranger suggested isn’t the salve to all your problems? Surprise.) You’ll need more granularity for each of your roles, meaning you need to create more roles, which becomes harder to mentally juggle. As usual, open source solutions come to the rescue to make this easier to manage. RBAC Manager makes it easier to manage users, services, and role bindings over time and across namespaces via labels. rakkess and rbac-lookup both provide easy visibility of service account and user roles, which, for reasons unknown, is hard to determine using kubectl alone. (It’s almost like Kubernetes intentionally makes it hard for you to understand RBAC.) Popeye, a general Kubernetes scanner for enforcing best practices, is useful for detecting unused RBAC rules that build up over time from updating and deleting roles.
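To give a feel for the label-driven style, here is roughly what an RBAC Manager definition looks like. Treat it as a sketch from memory of the project’s documented `RBACDefinition` resource; check the API version and field names against the release you install. The binding name, user, and `team` label are made up.

```yaml
apiVersion: rbacmanager.reactiveops.io/v1beta1
kind: RBACDefinition
metadata:
  name: data-engineering-access  # hypothetical name
rbacBindings:
  - name: data-engineers
    subjects:
      - kind: User
        name: jane@example.com  # hypothetical user
    roleBindings:
      # Bind the built-in edit ClusterRole in every namespace
      # labeled team=data-engineering
      - clusterRole: edit
        namespaceSelector:
          matchLabels:
            team: data-engineering
```

The operator then creates and cleans up the individual RoleBindings as namespaces gain or lose the label, which is the part that gets painful to do by hand.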

RBAC implemented: ✅✅ You already got these two checks, so hopefully you didn’t regress and lose RBAC implementation.

Principle of Least Privilege: Yes!

Operational ease: Still 🤷🏻‍♀️. At this point, you’ve likely realized that implementing RBAC isn’t an exact science and is prone to shift over time, depending on the growth trajectory of your org. Hopefully, with a cocktail of off-the-shelf and open-source solutions, you’ll be able to cobble together a solution that works for you and doesn’t paint you into a corner. The engineer’s dream.
