Using IAM Roles for ServiceAccounts on kOps

This feature has been implemented and available for some time; see the official docs. Note that the feature flag mentioned below has been replaced with the cluster spec field spec.iam.useServiceAccountExternalPermissions: true

Until recently, the only way for a Pod to use the AWS API was to either provision static credentials or assign additional IAM Policies to the Nodes the Pods were running on. kOps addons rely on the latter, which has several issues:

  • All other Pods running on the same Node would have the same permissions.
  • EC2 Instances cannot enforce IMDSv2 with http-put-response-hop-limit: 1.

kOps mitigates these concerns by letting addons run on the Control Plane (CP) Nodes. Unfortunately, out of the box, kOps only protects the CP Nodes with Taints, and any cluster user can add Tolerations to Pods and schedule them on the CP Nodes.

The solution to this is to create dedicated IAM Roles for each of the addon Pods, and reduce the privileges given to the IAM Roles assigned to the EC2 instances.

kOps 1.21 introduces a set of features that together enable IAM Roles for ServiceAccounts (IRSA).

Let us have a look at how to enable support for IRSA.

ServiceAccount Issuer Discovery

The first feature needed to support IRSA is what Kubernetes refers to as Service Account Issuer Discovery. Essentially, it means publishing the OIDC issuer discovery metadata, which contains things like the public keys used to verify ServiceAccount tokens. By default, the Kubernetes API Server publishes this metadata itself, but this doesn't work out of the box on kOps clusters. AWS also requires the documents to be published in a globally readable location. It is technically possible to expose the API Server on a public IP and allow anonymous access to the OIDC discovery metadata, but many would be uncomfortable doing so. When this feature is configured, kOps will instead publish these documents to a VFS path.

VFS path is a Virtual File System path that kOps also uses for storing configurations, secrets, and keys, e.g. the path pointing to the kOps state store is a VFS path.

Right now, only S3 is supported, as we need to implement support for converting a VFS path to the corresponding HTTPS endpoint, e.g. from s3://<bucket>/<path> to https://<bucket>.s3.<region><path>.
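The conversion described above can be sketched in a few lines of Python. This is only an illustration, not the kOps implementation: the function name s3_vfs_to_https is hypothetical, and the sketch assumes the standard virtual-hosted-style S3 endpoint (https://<bucket>.s3.<region>.amazonaws.com) in the default AWS partition.

```python
from urllib.parse import urlparse


def s3_vfs_to_https(vfs_path: str, region: str) -> str:
    """Map s3://<bucket>/<path> to a virtual-hosted-style HTTPS URL."""
    parsed = urlparse(vfs_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 VFS path: {vfs_path!r}")
    # Assumption: standard "aws" partition endpoint naming.
    host = f"{parsed.netloc}.s3.{region}.amazonaws.com"
    # Drop a trailing slash so the resulting URL has one canonical form.
    return f"https://{host}{parsed.path.rstrip('/')}"
```

Other VFS backends would each need their own mapping of this kind, which is why the sketch rejects anything that is not an s3:// path.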

In order to enable this feature, you only need to add the following to the cluster spec:

    serviceAccountIssuerDiscovery:
      discoveryStore: s3://<my bucket>

If you want to use this with AWS, take care that there is no policy preventing public access to the objects stored therein.

Once you have OIDC discovery metadata published, you can configure any OIDC consumer that supports OIDC issuer discovery to establish trust with your ServiceAccounts. This is not limited to AWS; it also works if you want your ServiceAccounts to authenticate natively to HashiCorp Vault or any other OIDC consumer that supports OIDC issuer discovery.
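To make the consumer side concrete, here is a rough Python sketch of the two documents a consumer fetches, starting from nothing but the issuer URL. The /.well-known/openid-configuration location comes from the OIDC discovery specification. The openid/v1/jwks path and the exact set of fields mirror what the Kubernetes API Server serves and are assumptions about the published files; discovery_documents is a hypothetical helper name.

```python
import json


def discovery_documents(issuer: str) -> dict:
    """Return {url: document} for the metadata an OIDC consumer looks up."""
    issuer = issuer.rstrip("/")
    # Assumption: same JWKS path the Kubernetes API Server uses.
    jwks_uri = f"{issuer}/openid/v1/jwks"
    config = {
        "issuer": issuer,
        "jwks_uri": jwks_uri,
        "response_types_supported": ["id_token"],
        "subject_types_supported": ["public"],
        "id_token_signing_alg_values_supported": ["RS256"],
    }
    return {
        f"{issuer}/.well-known/openid-configuration": config,
        # The real JWKS lists the public token signing keys; left empty here.
        jwks_uri: {"keys": []},
    }


def render(issuer: str) -> str:
    """Pretty-print the documents, e.g. for inspection."""
    return json.dumps(discovery_documents(issuer), indent=2)
```

A consumer verifies a ServiceAccount token by checking that its iss claim equals the issuer and that its signature matches one of the keys behind jwks_uri.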

AWS OIDC Provider

The purpose of this feature is to make AWS trust the Kubernetes ServiceAccounts so that the ServiceAccounts can assume AWS IAM Roles. kOps will do this for you if you add the following to the spec:

    serviceAccountIssuerDiscovery:
      enableAWSOIDCProvider: true

Using IAM Roles for ServiceAccounts belonging to kOps addons

All addons that require access to the AWS API currently run on the Control Plane (CP) Nodes and assume the instance role in order to access AWS services. This is problematic because any other Pod running on CP Nodes can assume the instance role as well. And we cannot use IMDSv2 with http-put-response-hop-limit: 1 as that would block addons, too.

With the features above in place, each addon will be ported to use IRSA instead. Each addon will get a dedicated role it can assume, with exactly the privileges it needs. kOps will then automatically configure the addon Pods to use IRSA, so enabling IRSA for kOps addons is entirely transparent. The corresponding privileges are also removed from the CP Nodes.

At the moment, using IRSA for kOps addons requires the UseServiceAccountIAM feature flag to be enabled, as we feel we have not tested the functionality enough. We are also missing the ability to override/augment the IAM Policy that the ServiceAccount uses, which can be necessary, e.g. if you want to use cert-manager DNS validation for your own domains.

Creating IAM Roles for your own workloads

Provision the IAM Roles

kOps can provision IAM Roles for your workloads (Deployments, StatefulSets, Jobs, etc.). This includes the IAM Policy Statement that allows the workload's ServiceAccount to assume the IAM Role, as well as the policies that grant the role the privileges you want.

You can attach existing policies to the role, or you can define the policy inline. Both variants go under spec in the cluster spec:

    iam:
      serviceAccountExternalPermissions:
      - name: someServiceAccount
        namespace: someNamespace
        aws:
          policyARNs:
          - arn:aws:iam::000000000000:policy/somePolicy
      - name: anotherServiceAccount
        namespace: anotherNamespace
        aws:
          inlinePolicy: |-
            [
              {
                "Effect": "Allow",
                "Action": "s3:ListAllMyBuckets",
                "Resource": "*"
              }
            ]
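For each such entry, the role also needs a trust policy that lets exactly one ServiceAccount assume it. The Python sketch below builds the usual shape of such a policy for web identity federation. It is not a dump of what kOps generates: irsa_trust_policy is a hypothetical helper, and the account ID and issuer URL are placeholders you would supply yourself.

```python
def irsa_trust_policy(account_id, issuer_url, namespace, service_account):
    """Build a trust policy allowing one ServiceAccount to assume a role."""
    # IAM identifies the OIDC provider by the issuer URL without its scheme.
    provider = issuer_url.split("://", 1)[-1].rstrip("/")
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                # Only tokens issued for this exact ServiceAccount match.
                "Condition": {"StringEquals": {f"{provider}:sub": subject}},
            }
        ],
    }
```

The condition on the sub claim is what keeps other ServiceAccounts in the same cluster from assuming the role, even though they are signed by the same issuer.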

Configuring Pods to use IRSA

One thing to bear in mind is that kOps will not "own" ServiceAccounts the same way EKS does when using IRSA, so you have to modify your workloads appropriately yourself.

Typically, you will use environment variables to configure the AWS SDK to use IRSA. The following shows the changes you have to make to the Pod spec:

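    spec:
      containers:
      - env:
        - name: AWS_DEFAULT_REGION
          value: <region>
        - name: AWS_REGION
          value: <region>
        - name: AWS_ROLE_ARN
          value: "arn:aws:iam::<account number>:role/<role>"
        # Must be the full path of the projected token file:
        # the mountPath plus the path set below.
        - name: AWS_WEB_IDENTITY_TOKEN_FILE
          value: "/var/run/secrets/"
        - name: AWS_STS_REGIONAL_ENDPOINTS
          value: "regional"
        volumeMounts:
        - mountPath: "/var/run/secrets/"
          name: aws-token
      volumes:
      - name: aws-token
        projected:
          sources:
          - serviceAccountToken:
              # Must match the audience registered with the AWS OIDC provider.
              audience: ""
              expirationSeconds: 86400
              path: token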
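To show what the SDK does with these variables, here is a simplified Python sketch of the credential exchange: it reads the projected token and prepares an STS AssumeRoleWithWebIdentity call. Real SDKs do this for you, and also handle token refresh and response parsing. web_identity_request and the session name are hypothetical; the sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode


def web_identity_request(env, session_name="irsa-sketch"):
    """Return (endpoint, form body) for an AssumeRoleWithWebIdentity call."""
    # The kubelet keeps this file up to date with a fresh ServiceAccount token.
    with open(env["AWS_WEB_IDENTITY_TOKEN_FILE"], encoding="utf-8") as f:
        token = f.read().strip()
    if env.get("AWS_STS_REGIONAL_ENDPOINTS") == "regional":
        endpoint = f"https://sts.{env['AWS_REGION']}.amazonaws.com/"
    else:
        endpoint = "https://sts.amazonaws.com/"
    params = {
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": env["AWS_ROLE_ARN"],
        "RoleSessionName": session_name,
        "WebIdentityToken": token,
    }
    return endpoint, urlencode(params)
```

Inside a Pod configured as above, you would pass os.environ as env. The call itself needs no AWS credentials, because the signed token is the proof of identity.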

If you prefer, you could create ServiceAccounts with these details and use the EKS identity webhook, but I don't see kOps supporting that webhook as a native addon.

Zero-configuration IRSA

You don't have to care about anything in order for kOps addons to use IRSA. I would really like this to be the case for your own workloads as well.

Since you define the relationship between AWS IAM and ServiceAccount in the Cluster spec, and the changes you have to make to your Pod spec just mirror that relationship, something could automatically read the Cluster spec and configure workloads for you.
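As a thought experiment, the core of such a component could look like the Python sketch below. Everything in it is hypothetical: inject_irsa is not an existing kOps or Kubernetes API, role_arns stands in for a mapping derived from the Cluster spec, and the region, mount path, and audience are passed in by the caller rather than guessed.

```python
import copy


def inject_irsa(pod, role_arns, region, mount_path, audience):
    """Return a copy of `pod` wired for IRSA; unmapped Pods are returned as-is."""
    namespace = pod.get("metadata", {}).get("namespace", "default")
    account = pod.get("spec", {}).get("serviceAccountName", "default")
    role_arn = role_arns.get((namespace, account))
    if role_arn is None:
        return pod
    patched = copy.deepcopy(pod)
    spec = patched.setdefault("spec", {})
    wanted = {
        "AWS_DEFAULT_REGION": region,
        "AWS_REGION": region,
        "AWS_ROLE_ARN": role_arn,
        "AWS_WEB_IDENTITY_TOKEN_FILE": mount_path.rstrip("/") + "/token",
        "AWS_STS_REGIONAL_ENDPOINTS": "regional",
    }
    for container in spec.get("containers", []):
        env = container.setdefault("env", [])
        present = {item["name"] for item in env}
        # Never override variables the workload author set explicitly.
        env.extend({"name": k, "value": v} for k, v in wanted.items() if k not in present)
        container.setdefault("volumeMounts", []).append(
            {"name": "aws-token", "mountPath": mount_path, "readOnly": True}
        )
    spec.setdefault("volumes", []).append({
        "name": "aws-token",
        "projected": {"sources": [{"serviceAccountToken": {
            "audience": audience, "expirationSeconds": 86400, "path": "token"}}]},
    })
    return patched
```

A webhook would apply this to each Pod at admission time, while a controller would apply it to the Pod templates of the workloads it watches.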

This would have to be an addon that either provides a webhook similar to the EKS identity webhook, or acts as a controller that watches all workloads in the cluster. It is debatable whether such an addon should be part of the kOps project or a standalone project.

I would really love to hear how you would want this to behave. If you have any ideas, comment here or reach out in #kops-users on the Kubernetes Slack.