Cross-Runner Token Exposure through SSM Parameter Store

High
npalm published GHSA-8rp4-w85f-5qh2 Jul 1, 2024

Package

terraform philips-labs/terraform-aws-github-runner (Terraform)

Affected versions

< 5.11.1

Patched versions

v5.11.1

Description

Summary

An overly broad IAM policy statement grants all EC2 runner instances unrestricted read access to GitHub runner registration tokens and JIT config stored in the AWS SSM Parameter Store. This misconfiguration permits any runner instance to access tokens intended for other runners, potentially compromising the integrity of workflows and the confidentiality of GitHub Secrets accessible to jobs. Refinement of the IAM policy assigned to EC2 runners is necessary to ensure strict, least privilege access controls.

Details

The infrastructure designed for self-hosted runners of GitHub Actions leverages a series of AWS Lambda functions to dynamically scale runners in response to GitHub Action events. It utilizes SQS for message passing between Lambdas and an SSM Parameter Store for secure distribution of configuration values, eliminating the need to directly share secrets of varying sensitivity across instances. This setup underpins robust isolation between runners and workflows. However, an issue arises with the policy associated with the EC2 runners' role, which inadvertently grants access to GitHub runner registration tokens and JIT configurations meant for other runners.

The following statement, part of the role policy assigned to EC2 runners, allows runners assigned to this role to read parameters under the ${arn_ssm_parameters_path_tokens}* path of the parameter store.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:DeleteParameter",
        "ssm:GetParameters",
        "ssm:GetParameter"
      ],
      "Resource": "${arn_ssm_parameters_path_tokens}*"
    },

Unlike some role policies in this module, this one is attached unconditionally to the "${var.prefix}-runner-role" role:

resource "aws_iam_role_policy" "ssm_parameters" {
  name = "runner-ssm-parameters"
  role = aws_iam_role.runner.name
  policy = templatefile("${path.module}/policies/instance-ssm-parameters-policy.json",
    {
      arn_ssm_parameters_path_tokens = "arn:${var.aws_partition}:ssm:${var.aws_region}:${data.aws_caller_identity.current.account_id}:parameter${var.ssm_paths.root}/${var.ssm_paths.tokens}"
      arn_ssm_parameters_path_config = local.arn_ssm_parameters_path_config
    }
  )
}

The ${arn_ssm_parameters_path_tokens} variable passed to the policy template file is built from the ssm_paths variable passed to the module, together with the partition, region, and account ID. As shown in the snippet above, the parameter path is constructed from var.ssm_paths.root and var.ssm_paths.tokens:

arn_ssm_parameters_path_tokens = "arn:${var.aws_partition}:ssm:${var.aws_region}:${data.aws_caller_identity.current.account_id}:parameter${var.ssm_paths.root}/${var.ssm_paths.tokens}"

(Note: This path is slightly different for the multi-runner module. For simplicity, in this description, we will focus on the single runner type and describe the modified risk for multiple runner types later.)

The default values for these vars are passed to the runners child module by the root module, as seen in the snippets below:

//https://github.com/philips-labs/terraform-aws-github-runner/blob/main/variables.tf
variable "ssm_paths" {
  description = "The root path used in SSM to store configuration and secrets."
  type = object({
    root       = optional(string, "github-action-runners")
    app        = optional(string, "app")
    runners    = optional(string, "runners")
    webhook    = optional(string, "webhook")
    use_prefix = optional(bool, true)
  })
  default = {}
}
  ssm_root_path = var.ssm_paths.use_prefix ? "/${var.ssm_paths.root}/${var.prefix}" : "/${var.ssm_paths.root}"
(...)
ssm_paths = {
    root   = local.ssm_root_path
    tokens = "${var.ssm_paths.runners}/tokens"
    config = "${var.ssm_paths.runners}/config"
  }
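
The path resolution above can be traced with a small Python sketch. It is not module code: the function name token_parameter_arn, the region, and the account ID are assumptions chosen for illustration, and the prefix gh-runners is taken from the example path used later in this description.

```python
def token_parameter_arn(prefix, region, account_id, partition="aws",
                        root="github-action-runners", runners="runners",
                        use_prefix=True):
    """Mirror the Terraform interpolation for the tokens path (illustrative only)."""
    # Root module: ssm_root_path includes the prefix when use_prefix is true.
    ssm_root_path = f"/{root}/{prefix}" if use_prefix else f"/{root}"
    # Root module: tokens = "${var.ssm_paths.runners}/tokens"
    tokens = f"{runners}/tokens"
    # Runners module: "arn:...:parameter${var.ssm_paths.root}/${var.ssm_paths.tokens}"
    return f"arn:{partition}:ssm:{region}:{account_id}:parameter{ssm_root_path}/{tokens}"


# Hypothetical region and account ID, for illustration only.
arn = token_parameter_arn(prefix="gh-runners", region="us-east-2",
                          account_id="123456789012")
print(arn)
```

The policy template then appends * to this value, so the resulting Resource covers everything that starts with the tokens path.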

So for the default variable values the IAM policy statement will look like:

{
	"Action": [
		"ssm:DeleteParameter",
		"ssm:GetParameters",
		"ssm:GetParameter"
	],
	"Effect": "Allow",
		"Resource": "arn:aws:ssm:$region:$account_id:parameter/github-action-runners/runners/tokens*"
},

The allowed resource path is therefore parameter/github-action-runners/runners/tokens*. Note the trailing *: it matches every parameter whose name begins with this path.
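
To get a feel for how much the trailing * covers, the following Python sketch approximates IAM resource matching with fnmatch. This is a simplification for illustration, not the IAM evaluation engine. The account ID, the instance IDs, and the config parameter name are made up.

```python
from fnmatch import fnmatchcase

# Resource pattern from the statement above, with a hypothetical region and account ID.
resource = ("arn:aws:ssm:us-east-2:123456789012:"
            "parameter/github-action-runners/runners/tokens*")
base = "arn:aws:ssm:us-east-2:123456789012:parameter/github-action-runners/runners"

candidates = [
    f"{base}/tokens/i-0aaaaaaaaaaaaaaa1",  # this instance's own parameter
    f"{base}/tokens/i-0bbbbbbbbbbbbbbb2",  # a different instance's parameter
    f"{base}/config/example-setting",      # outside the tokens path
]

# In this approximation "*" matches any run of characters, including "/".
matched = [arn for arn in candidates if fnmatchcase(arn, resource)]
for arn in matched:
    print("allowed:", arn)
```

Both token parameters match, regardless of which instance they were written for. Only the parameter outside the tokens path is excluded.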

When the scale_up lambda processes the SQS messages coming from the webhook lambda, it adds the runner config or the JIT config in the parameter store under the same path that the IAM policy statement allows runners to read from.

Once an instance has been created, scale_up creates the runner's config. Depending on the configuration, it calls either createJitConfig or createRegistrationTokenConfig. In both cases, the lambda writes the config to the SSM Parameter Store path:

async function createStartRunnerConfig(
  githubRunnerConfig: CreateGitHubRunnerConfig,
  instances: string[],
  ghClient: Octokit,
) {
  if (githubRunnerConfig.enableJitConfig && githubRunnerConfig.ephemeral) {
    await createJitConfig(githubRunnerConfig, instances, ghClient);
  } else {
    await createRegistrationTokenConfig(githubRunnerConfig, instances, ghClient);
  }
}

createRegistrationTokenConfig:

async function createRegistrationTokenConfig(
(...)
	await putParameter(`${githubRunnerConfig.ssmTokenPath}/${instance}`, runnerServiceConfig.join(' '), true);

createJitConfig:

async function createJitConfig(githubRunnerConfig: CreateGitHubRunnerConfig, instances: string[], ghClient: Octokit) {
(...)
await putParameter(`${githubRunnerConfig.ssmTokenPath}/${instance}`, runnerConfig.data.encoded_jit_config, true);

scale_up places each instance's token under the ${githubRunnerConfig.ssmTokenPath}/ path but also adds the instance ID at the end of the path: ${githubRunnerConfig.ssmTokenPath}/${instance}.

As established earlier, all of these instances have permission to read every parameter under ${githubRunnerConfig.ssmTokenPath}/* (remember the *), but they do not have permission to list parameters under that path. Therefore, instances cannot discover the tokens meant for other instances by querying the parameter store.

However, each instance does have access to its own instance ID through the AWS metadata service. As a result, an instance can query the metadata service to get its own instance ID and then use it to retrieve its token from the parameter store. This is also how the install-runner scripts work:

token=$(curl -f -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 180" || true)
(...)
instance_id=$(curl -f -H "X-aws-ec2-metadata-token: $token" -v http://169.254.169.254/latest/meta-data/instance-id)
(...)
config=$(aws ssm get-parameter --name "$token_path"/"$instance_id" --with-decryption --region "$region" | jq -r ".Parameter | .Value")

The problem lies in the assumption that an instance does not know the instance IDs of the other instances running in the same environment.

Another IAM policy attached to the same role, describe_tags, allows instances to read the EC2 tags of all EC2 instances:

resource "aws_iam_role_policy" "describe_tags" {
  name   = "runner-describe-tags"
  role   = aws_iam_role.runner.name
  policy = file("${path.module}/policies/instance-describe-tags-policy.json")
}

Policy file instance-describe-tags-policy.json:

{
  "Version": "2012-10-17",
  "Statement": [
      {          
          "Effect": "Allow",
          "Action": "ec2:DescribeTags",
          "Resource": "*"          
      }
  ]
}

The above policy allows principals to call the EC2 DescribeTags API. Part of this API's response is the tagSet, which contains TagDescription objects. As described in the API documentation, each of these objects includes the resourceId of the resource that the tag is applied to. For example:

$ aws ec2 describe-tags --region "us-east-2" --filter "Name=resource-type,Values=instance"

{
    "Tags": [
        {
            "Key": "Name",
            "ResourceId": "i-07dXXXXXX231eab",
            "ResourceType": "instance",
            "Value": "gh-runners-xxxxxx"
        },
(...)

Since this policy is attached to the instances' role, an instance can describe tags, collect the ResourceIds of other instances, and then use those IDs to request the tokens meant for those instances from the SSM Parameter Store.

The path to be accessed looks like /github-action-runners/gh-runners/runners/tokens/i-07dXXXXXX231eab, and, as shown above, instances are allowed to get every parameter under /github-action-runners/gh-runners/runners/tokens/*.
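
The link between the two policies comes down to string handling, which the following Python sketch illustrates. It only parses a response shaped like the describe-tags output above and derives the corresponding parameter names. It makes no AWS calls. The function name token_parameter_names and all IDs, keys, and values are placeholders invented for this example.

```python
import json

# Shape follows the describe-tags output shown above; contents are placeholders.
response = json.loads("""
{
  "Tags": [
    {"Key": "Name", "ResourceId": "i-0aaaaaaaaaaaaaaa1", "ResourceType": "instance", "Value": "gh-runners-a"},
    {"Key": "Owner", "ResourceId": "i-0aaaaaaaaaaaaaaa1", "ResourceType": "instance", "Value": "team-x"},
    {"Key": "Name", "ResourceId": "i-0bbbbbbbbbbbbbbb2", "ResourceType": "instance", "Value": "gh-runners-b"},
    {"Key": "Name", "ResourceId": "vol-0ccccccccccccccc3", "ResourceType": "volume", "Value": "data"}
  ]
}
""")


def token_parameter_names(tags_response, token_path):
    """Derive one parameter name per distinct instance ID in the response."""
    instance_ids = sorted({
        tag["ResourceId"]
        for tag in tags_response["Tags"]
        if tag["ResourceType"] == "instance"
    })
    return [f"{token_path}/{instance_id}" for instance_id in instance_ids]


names = token_parameter_names(
    response, "/github-action-runners/gh-runners/runners/tokens")
for name in names:
    print(name)
```

Every name produced this way falls under the wildcard resource of the SSM policy, which is why the two individually modest permissions combine into cross-runner read access.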

There are a few time constraints to exploiting this. For JIT config, the instance executing a malicious workflow will have to use the config before the legitimate instance since these are meant to be valid for one-time use only. For runner registration tokens, the instance executing a malicious workflow will have to read the token before the legitimate instance deletes it from the parameter store after it's read.
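
The ordering constraint for registration tokens can be pictured with a toy in-memory model. This is a sketch under simplifying assumptions: ToyParameterStore and legitimate_startup are hypothetical names, the store is a plain dictionary rather than SSM, and the JIT case is not modelled because its one-time-use property is enforced outside the parameter store.

```python
class ToyParameterStore:
    """In-memory stand-in for the parameter store (illustration only)."""

    def __init__(self):
        self._params = {}

    def put(self, name, value):
        self._params[name] = value

    def get(self, name):
        # Returns None when the parameter does not exist (anymore).
        return self._params.get(name)

    def delete(self, name):
        self._params.pop(name, None)


def legitimate_startup(store, name):
    """Read the registration config, then delete it, as described above."""
    value = store.get(name)
    store.delete(name)
    return value


name = "/github-action-runners/gh-runners/runners/tokens/i-0aaaaaaaaaaaaaaa1"
store = ToyParameterStore()
store.put(name, "placeholder-registration-config")

early_read = store.get(name)                     # another reader, before startup
legit_value = legitimate_startup(store, name)    # owner reads, then deletes
late_read = store.get(name)                      # another reader, after startup

print(early_read is not None, legit_value is not None, late_read is None)
```

In this model a read by another instance succeeds only inside the window between the parameter being written and the owning instance deleting it.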

With that token exposed, arbitrary runners can register themselves to receive jobs from the relevant repo or org.

Impact

The successful exploitation of the overly permissive IAM policies undermines the segregation between runners operating in the same environment.

For exploitation to occur, the adversary must at least be able to control or otherwise successfully modify a workflow in a repo or org that employs this infrastructure for self-hosted actions. This ability could come in a multitude of ways:

  • Misconfigured public repos that allow workflows from fork PRs to run without approval
  • Careless maintainers approving modified workflow runs
  • Adversaries who first pose as legitimate contributors in order to exploit the GitHub setting that skips workflow approval for fork PRs from previous contributors
  • Leaked PATs and fine-grained GitHub tokens
  • Malicious contributors trying to move laterally in the supply chain

The outcome of successful exploitation heavily depends on how the infrastructure is utilized. The compromised segregation between runners can lead to a breach of confidentiality, particularly in cases where GitHub secrets are used, and also impact the integrity of other workflows operating within the same infrastructure.

The CVSS calculation below assumes that a maintainer has to approve the workflow, hence User Interaction (UI) is Required, and that a successful attack affects components outside the vulnerable component, such as other repositories or runner groups, hence Scope is Changed.

Severity

High

CVSS v3 base metrics

Attack vector: Network
Attack complexity: High
Privileges required: Low
User interaction: Required
Scope: Changed
Confidentiality: High
Integrity: High
Availability: None

CVSS:3.0/AV:N/AC:H/PR:L/UI:R/S:C/C:H/I:H/A:N
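
For readers who want to relate the vector string to the High rating, the following Python sketch computes a base score from it. It is a minimal implementation of the CVSS v3.0 base score equations as I understand them from the specification, written for this example and not taken from the advisory. The function names are my own.

```python
import math

# Metric weights from the CVSS v3.0 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged (U) or Changed (C).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """CVSS v3.0 Roundup: smallest one-decimal number >= value."""
    return math.ceil(value * 10) / 10


def base_score(vector):
    # Drop the "CVSS:3.0" prefix and map metric names to their values.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope = metrics["S"]
    iss = 1 - ((1 - WEIGHTS["C"][metrics["C"]])
               * (1 - WEIGHTS["I"][metrics["I"]])
               * (1 - WEIGHTS["A"][metrics["A"]]))
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (8.22 * WEIGHTS["AV"][metrics["AV"]]
                      * WEIGHTS["AC"][metrics["AC"]]
                      * PR_WEIGHTS[scope][metrics["PR"]]
                      * WEIGHTS["UI"][metrics["UI"]])
    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


score = base_score("CVSS:3.0/AV:N/AC:H/PR:L/UI:R/S:C/C:H/I:H/A:N")
print(score)
```

Under these equations the vector above evaluates to 7.7, which falls in the High band (7.0 to 8.9) and is consistent with the severity stated in this advisory.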

CVE ID

No known CVE
