Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tag/name images with branch name #92

Open
wants to merge 25 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 17 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions .github/workflows/build.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@ on:
upload_artifacts:
required: true
type: boolean
branch_name:
required: true
type: string

permissions:
id-token: write
Expand All @@ -16,6 +19,13 @@ defaults:
shell: bash

jobs:
test:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Run tests
run: ci/image-name/tests.sh
compute-constants:
runs-on: ubuntu-latest
outputs:
Expand Down Expand Up @@ -55,6 +65,7 @@ jobs:
run: ci/compute-variables.sh
env:
ARCH: ${{ matrix.ARCH }}
BRANCH_NAME: ${{ inputs.branch_name }}
DRIVER_VERSION: ${{ matrix.DRIVER_VERSION }}
DRIVER_FLAVOR: ${{ matrix.DRIVER_FLAVOR }}
OS: ${{ matrix.OS }}
Expand All @@ -74,6 +85,7 @@ jobs:
-only="*${OS}-${RUNNER_ENV}*" \
-var "arch=${ARCH}" \
-var "backup_aws_regions=${BACKUP_AWS_REGIONS}" \
-var "branch_name=${BRANCH_NAME}" \
-var "default_aws_region=${DEFAULT_AWS_REGION}" \
-var "driver_version=${DRIVER_VERSION}" \
-var "driver_flavor=${DRIVER_FLAVOR}" \
Expand All @@ -90,6 +102,7 @@ jobs:
env:
ARCH: ${{ matrix.ARCH }}
BACKUP_AWS_REGIONS: ${{ needs.compute-constants.outputs.BACKUP_AWS_REGIONS }}
BRANCH_NAME: ${{ inputs.branch_name }}
DEFAULT_AWS_REGION: ${{ needs.compute-constants.outputs.DEFAULT_AWS_REGION }}
DRIVER_VERSION: ${{ matrix.DRIVER_VERSION }}
DRIVER_FLAVOR: ${{ matrix.DRIVER_FLAVOR }}
Expand Down
1 change: 1 addition & 0 deletions .github/workflows/gc.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -50,3 +50,4 @@ jobs:
run: python main.py --dry-run="${DRY_RUN}"
env:
DRY_RUN: ${{ github.event_name == 'schedule' && 'false' || inputs.dry_run }}
REPOSITORY: ${{ github.repository }}
2 changes: 2 additions & 0 deletions .github/workflows/main.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ on:
push:
branches:
- main
- test/**
KyleFromNVIDIA marked this conversation as resolved.
Show resolved Hide resolved

permissions:
id-token: write
Expand All @@ -18,4 +19,5 @@ jobs:
uses: ./.github/workflows/build.yaml
with:
upload_artifacts: true
branch_name: ${{ github.ref_name }}
secrets: inherit
1 change: 1 addition & 0 deletions .github/workflows/pr.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -41,4 +41,5 @@ jobs:
uses: ./.github/workflows/build.yaml
with:
upload_artifacts: false
branch_name: ${{ github.ref_name }}
secrets: inherit
5 changes: 3 additions & 2 deletions ci/compute-image-name.sh
Original file line number Diff line number Diff line change
Expand Up @@ -15,14 +15,15 @@ IMAGE_NAME=$(
--arg DRIVER_VERSION "${DRIVER_VERSION}" \
--arg DRIVER_FLAVOR "${DRIVER_FLAVOR}" \
--arg ARCH "${ARCH}" \
--arg RUNNER_VERSION "${RUNNER_VERSION}" \
--arg RUNNER_ENV "${RUNNER_ENV}" \
--arg BRANCH_NAME "${BRANCH_NAME}" \
'[
$OS,
$VARIANT,
$DRIVER_VERSION,
$DRIVER_FLAVOR,
$ARCH,
$RUNNER_VERSION
if $RUNNER_ENV != "aws" then ($BRANCH_NAME | sub("/"; "-")) else empty end
KyleFromNVIDIA marked this conversation as resolved.
Show resolved Hide resolved
] | map(select(length > 0)) | join("-")'
)

Expand Down
29 changes: 29 additions & 0 deletions ci/image-name/deserialize.jq
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
def pick_from_list($values; $required_field):
.[0] as $value |
if $values | any(. == $value) then
{"value": $value, "rest": .[1:]}
elif $required_field != "" then
"Error: \($required_field) field is required\n" | halt_error
else
{"value": null, "rest": .}
end;

def deserialize_image_name($matrix):
split("-") |
pick_from_list($matrix.OS; "OS") | .value as $os | .rest |
pick_from_list(["cpu", "gpu"]; "variant") | .value as $variant | .rest |
pick_from_list($matrix.DRIVER_VERSION; "") | .value as $driver_version | .rest |
pick_from_list($matrix.DRIVER_FLAVOR; "") | .value as $driver_flavor | .rest |
pick_from_list($matrix.ARCH; "arch") | .value as $arch | .rest |
(if any then join("-") else null end) as $branch_name |
{
"os": $os,
"variant": $variant,
"driver_version": $driver_version,
"driver_flavor": $driver_flavor,
"arch": $arch,
"branch_name": $branch_name
};

def deserialize_image_name:
. as $input | env.MATRIX | fromjson as $matrix | $input | deserialize_image_name($matrix);
7 changes: 7 additions & 0 deletions ci/image-name/deserialize.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
#!/bin/bash

set -euo pipefail

cd "$(dirname "$0")/../.."

MATRIX="$(yq -o json matrix.yaml)" IMAGE_NAME="$1" jq -c -n 'include "ci/image-name/deserialize"; env.IMAGE_NAME | deserialize_image_name'
8 changes: 8 additions & 0 deletions ci/image-name/deserialize.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
include "ci/image-name/deserialize"; .[] | deserialize_image_name
["linux-gpu-550-open-arm64-main", "windows-cpu-amd64-pr-1234", "linux-gpu-550-open-arm64"]
{"os": "linux", "variant": "gpu", "driver_version": "550", "driver_flavor": "open", "arch": "arm64", "branch_name": "main"}
{"os": "windows", "variant": "cpu", "driver_version": null, "driver_flavor": null, "arch": "amd64", "branch_name": "pr-1234"}
{"os": "linux", "variant": "gpu", "driver_version": "550", "driver_flavor": "open", "arch": "arm64", "branch_name": null}

include "ci/image-name/deserialize"; .[] | deserialize_image_name
["linux-550-arm64-main", "gpu-550-arm64-main", "linux-gpu-550-main"]
ajschmidt8 marked this conversation as resolved.
Show resolved Hide resolved
8 changes: 8 additions & 0 deletions ci/image-name/tests.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
#!/bin/bash

set -euo pipefail

MATRIX="$(yq -o json matrix.yaml)"
MATRIX="${MATRIX}" jq --run-tests ci/image-name/deserialize.test

ci/image-name/deserialize.sh linux-gpu-550-open-amd64-pr-1234 | jq -e '. == {"os": "linux", "variant": "gpu", "driver_version": "550", "driver_flavor": "open", "arch": "amd64", "branch_name": "pr-1234"}'
16 changes: 9 additions & 7 deletions gc/collectors/amis.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,10 +9,10 @@


class AMIGarbageCollector(gc.GarbageCollector):
def __init__(self, current_images: list[str], region: str, dry_run: bool):
def __init__(self, current_branches: list[str], region: str, dry_run: bool):
super().__init__("AMI Collector", region, dry_run)
self.ec2_client = boto3.client("ec2", region_name=region)
self.current_images = current_images
self.current_branches = current_branches
self.search_tag_name = "vm-images"
self.search_tag_value = "true"

Expand Down Expand Up @@ -51,19 +51,21 @@ def _find_expired_amis(self, amis: list[ImageTypeDef]) -> list[ImageTypeDef]:
img_tags = ami["Tags"]
if img_tags:
for tag in img_tags:
if tag["Key"] == "image-name":
img_name = tag["Value"]
ami_groups[img_name].append(ami)
if tag["Key"] == "branch-name":
branch_name = tag["Value"]
ami_groups[branch_name].append(ami)
break
else:
expired_amis.append(ami)

# Sort AMIs by creation date.
# If image is currently supported, keep only the newest AMI. Expire the rest.
# If image is not currently supported, expire all AMIs.
for img_name, amis in ami_groups.items():
for branch_name, amis in ami_groups.items():
amis = sorted(
amis, key=lambda x: parser.parse(x["CreationDate"]), reverse=True
)
if img_name in self.current_images:
if branch_name in self.current_branches:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the logic here is correct.

Branches publish many AMIs. If we only save a single AMI from a current branch (as done in the next line), we'd discard many legitimate AMIs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given the GC logic another go based on our discussion. Please take a look.

expired_amis.extend(amis[1:])
else:
expired_amis.extend(amis)
Expand Down
11 changes: 7 additions & 4 deletions gc/collectors/ecr.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,13 +4,14 @@
from mypy_boto3_ecr_public.paginator import (
DescribeImagesPaginator as ECRPublicDescribeImagesPaginator,
)
from .image_name import deserialize_image_name


class ECRGarbageCollector(gc.GarbageCollector):
def __init__(self, current_images: list[str], region: str, dry_run: bool):
def __init__(self, current_branches: list[str], region: str, dry_run: bool):
super().__init__("ECR Collector", region, dry_run)
self.ecr_client = boto3.client("ecr-public", region_name=region)
self.current_images = current_images
self.current_branches = current_branches
self.repository_name = "kubevirt-images"

def _run(self) -> None:
Expand All @@ -34,19 +35,21 @@ def _get_ecr_images(self) -> list[ImageDetailTypeDef]:
def _find_expired_ecr_images(
self, images: list[ImageDetailTypeDef]
) -> list[ImageDetailTypeDef]:
branches = {b.replace("/", "-") for b in self.current_branches}
expired_images = []

for image in images:
image_tags = image.get("imageTags")
hasSupportedTags = False
if image_tags:
for tag in image_tags:
if tag in self.current_images:
parsed_image_name = deserialize_image_name(tag)
if parsed_image_name and parsed_image_name.branch_name in branches:
hasSupportedTags = True
break

# Remove images that don't have any tags or don't have any supported tags
if not image_tags or not hasSupportedTags:
if not hasSupportedTags:
expired_images.append(image)
continue
return expired_images
Expand Down
35 changes: 35 additions & 0 deletions gc/collectors/image_name.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
import dataclasses
import json
import os.path
import subprocess
from typing import Optional


DESERIALIZE_PATH = os.path.join(
os.path.dirname(__file__), "..", "..", "ci", "image-name", "deserialize.sh"
)


@dataclasses.dataclass
class ImageName:
os: str
variant: str
driver_version: Optional[str]
driver_flavor: Optional[str]
arch: str
branch_name: Optional[str]


def deserialize_image_name(image_name: str) -> Optional[ImageName]:
result = subprocess.run(
[DESERIALIZE_PATH, image_name], stdout=subprocess.PIPE, text=True, check=True
)
parsed_json = json.loads(result.stdout)
return ImageName(
os=parsed_json["os"],
variant=parsed_json["variant"],
driver_version=parsed_json["driver_version"],
driver_flavor=parsed_json["driver_flavor"],
arch=parsed_json["arch"],
branch_name=parsed_json["branch_name"],
)
42 changes: 13 additions & 29 deletions gc/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,37 +2,21 @@
import subprocess
import json
import yaml
from os import path
from os import path, getenv
from collectors.ecr import ECRGarbageCollector
from collectors.amis import AMIGarbageCollector
from collectors.gc import GarbageCollector
from github import Github


def load_current_images() -> list[str]:
def load_current_branches() -> list[str]:
"""
Loads the currently supported images from the matrix and returns them as a list.
Loads the currently existing branches from the repository and returns them as a list.
"""
compute_matrix_path = path.join(
path.dirname(__file__), "..", "ci", "compute-matrix.sh"
)
compute_image_name_path = path.join(
path.dirname(__file__), "..", "ci", "compute-image-name.sh"
)
result = subprocess.run(
compute_matrix_path, cwd="..", capture_output=True, check=True
)
matrix = json.loads(result.stdout.decode("utf-8"))["include"]
images = []
for entry in matrix:
result = subprocess.run(
compute_image_name_path,
cwd="..",
capture_output=True,
env=entry,
check=True,
)
images.append(result.stdout.decode("utf-8").strip())
return images
client = Github()
repository = getenv("REPOSITORY")
assert repository is not None
return [b.name for b in client.get_repo(repository).get_branches()]


if __name__ == "__main__":
Expand All @@ -46,10 +30,10 @@ def load_current_images() -> list[str]:
)
args = parser.parse_args()
dry_run = args.dry_run != "false"
current_images = load_current_images()
if not current_images:
current_branches = load_current_branches()
if not current_branches:
print()
print("No current images found. Something's not right.")
print("No current branches found. Something's not right.")
print("Exiting to prevent all images from being deleted from AWS.")
exit(1)

Expand All @@ -59,14 +43,14 @@ def load_current_images() -> list[str]:

ecr_collectors = [
ECRGarbageCollector(
current_images=current_images,
current_branches=current_branches,
region=regions["public_ecr_region"],
dry_run=dry_run,
)
]
ami_collectors = [
AMIGarbageCollector(
current_images=current_images,
current_branches=current_branches,
region=region,
dry_run=dry_run,
)
Expand Down
1 change: 1 addition & 0 deletions gc/requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,4 @@ black
mypy
pyyaml
types-PyYAML
PyGithub
2 changes: 1 addition & 1 deletion locals.pkr.hcl
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ locals {
"driver-version" = var.driver_version
"driver-flavor" = var.driver_flavor
"os" = var.os
"runner-version" = var.runner_version
"branch-name" = var.branch_name
"variant" = local.variant
"Name" = local.image_id
} : k => v if v != ""
Expand Down
3 changes: 2 additions & 1 deletion matrix.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,8 @@ ARCH:

ENV:
- aws
- qemu
# TODO: Uncomment this before merging
#- qemu

exclude:
# only use amd64 for windows
Expand Down
5 changes: 5 additions & 0 deletions variables.pkr.hcl
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,11 @@ variable "backup_aws_regions" {
description = "A comma-separated list of the AWS regions to copy the AMIs to."
}

variable "branch_name" {
type = string
description = "The name of the GitHub branch the image originated from."
}

variable "default_aws_region" {
type = string
description = "The AWS region to provision EC2 instances in."
Expand Down