Add the maintenance procedure #187

Merged on Oct 28, 2024 (9 commits)
40 changes: 40 additions & 0 deletions .github/workflows/main.yaml
@@ -49,13 +49,53 @@ jobs:
export SLACK_APP_TOKEN=${{ secrets.SLACK_APP_TOKEN }}
export SLACK_BOT_TOKEN=${{ secrets.SLACK_BOT_TOKEN }}
EOF
- name: Clean up unnecessary files
run: |
docker image prune -af
sudo rm -rf /home/linuxbrew || true
sudo rm -rf /opt/az || true
sudo rm -rf /opt/microsoft || true
sudo rm -rf /opt/pipx || true
sudo rm -rf /opt/google || true
sudo rm -rf /usr/share/dotnet || true
sudo rm -rf /usr/local/lib/android || true
sudo rm -rf /usr/local/share/boost || true
sudo rm -rf /usr/lib/jvm || true
sudo rm -rf /usr/share/swift || true
sudo rm -rf /usr/local/julia* || true
sudo rm -rf /usr/local/n || true
sudo rm -rf /usr/share/kotlinc || true
sudo rm -rf /usr/local/share/edge_driver || true
sudo rm -rf /usr/local/share/chromedriver-linux64 || true
sudo rm -rf /usr/local/share/gecko_driver || true
sudo rm -rf /usr/share/miniconda || true
sudo rm -rf /usr/local/share/phantomjs* || true
sudo rm -rf /usr/share/sbt || true
sudo rm -rf /usr/local/aws-cli || true
sudo rm -rf /usr/local/aws-sam-cli || true
sudo rm -rf /usr/local/sqlpackage || true
sudo rm -rf /usr/local/bin/minikube || true
sudo rm -rf /usr/local/bin/kustomize || true
sudo rm -rf /usr/local/bin/kubectl || true
sudo rm -rf /usr/local/bin/kind || true
sudo rm -rf /usr/local/bin/helm || true
- run: make setup KINDTEST_K8S_VERSION=${{ matrix.k8s-version }}
- run: make -C kindtest start KINDTEST_K8S_VERSION=${{ matrix.k8s-version }}
- run: make -C kindtest test
env:
GIT_SSH_COMMAND: "ssh -i /tmp/deploy-key.pem"
- run: make -C kindtest logs
if: always()

- name: Check disk usage and docker images
run: |
df -h
docker images
if: always()
- name: Check directory sizes
run: sudo du -sh /usr/local/* /home/* /opt/* /tmp/* /usr/* /var/* | sort -hr
if: always()

- uses: actions/upload-artifact@v4
if: always()
with:
5 changes: 2 additions & 3 deletions CHANGELOG.md
@@ -25,8 +25,8 @@ This project adheres to [Semantic Versioning](http://semver.org/).
We migrated the image repositories of meows to `ghcr.io`.
From meows v0.14.0, please use the following images.

- https://github.com/cybozu-go/meows/pkgs/container/meows-controller
- https://github.com/cybozu-go/meows/pkgs/container/meows-runner
- <https://github.com/cybozu-go/meows/pkgs/container/meows-controller>
- <https://github.com/cybozu-go/meows/pkgs/container/meows-runner>

The images on Quay.io ([meows-controller](https://quay.io/repository/cybozu/meows-controller), [meows-runner](https://quay.io/repository/cybozu/meows-runner)) will not be updated in the future.

@@ -41,7 +41,6 @@ The images on Quay.io ([meows-controller](https://quay.io/repository/cybozu/meow
- Support Kubernetes 1.27 ([#178](https://github.com/cybozu-go/meows/pull/178))
- Build with go 1.21 ([#178](https://github.com/cybozu-go/meows/pull/178))


## [0.12.0] - 2023-07-05

### Changed
9 changes: 5 additions & 4 deletions Dockerfile
@@ -1,10 +1,10 @@
FROM ghcr.io/cybozu/golang:1.23-jammy as builder
FROM ghcr.io/cybozu/golang:1.23-jammy AS builder

WORKDIR /workspace
COPY . .
RUN make build

FROM ghcr.io/cybozu/ubuntu:22.04 as controller
FROM ghcr.io/cybozu/ubuntu:22.04 AS controller
LABEL org.opencontainers.image.source="https://github.com/cybozu-go/meows"

COPY --from=builder /workspace/tmp/bin/controller /usr/local/bin
@@ -14,14 +14,15 @@ COPY --from=builder /workspace/tmp/bin/meows /usr/local/bin
USER 10000:10000
ENTRYPOINT ["controller"]

FROM ghcr.io/cybozu/ubuntu:22.04 as runner
FROM ghcr.io/cybozu/ubuntu:22.04 AS runner
LABEL org.opencontainers.image.source="https://github.com/cybozu-go/meows"

# Even if the version of the runner is out of date, it will self-update at job execution time. So there is no problem to update it when you notice.
# TODO: Until https://github.com/cybozu-go/meows/issues/137 is fixed, update it manually.
ARG RUNNER_VERSION=2.319.1
ARG RUNNER_VERSION=2.320.0

ENV DEBIAN_FRONTEND=noninteractive
# hadolint ignore=DL3015
RUN apt-get update -y \
&& apt-get install -y software-properties-common \
&& add-apt-repository -y ppa:git-core/ppa \
1 change: 1 addition & 0 deletions README.md
@@ -42,6 +42,7 @@ You can run jobs in your GitHub Actions workflows on your Kubernetes cluster wit
## Docker images

Docker images are available on [ghcr.io](https://github.com/orgs/cybozu-go/packages?repo_name=meows)

- [Controller](https://github.com/cybozu-go/meows/pkgs/container/meows-controller)
- [Runner](https://github.com/cybozu-go/meows/pkgs/container/meows-runner)

53 changes: 26 additions & 27 deletions RELEASE.md
@@ -1,15 +1,12 @@
Release procedure
=================
# Release procedure

This document describes how to release a new version of meows.

Versioning
----------
## Versioning

Follow [semantic versioning 2.0.0][semver] to choose the new version number.

Prepare change log entries
--------------------------
## Prepare change log entries

Add notable changes since the last release to [CHANGELOG.md](CHANGELOG.md).
It should look like:
@@ -19,65 +16,67 @@ It should look like:
## [Unreleased]

### Added

- Implement ... (#35)

### Changed

- Fix a bug in ... (#33)

### Removed

- Deprecated `-option` is removed ... (#39)

(snip)
```

Bump version
------------
## Bump version

1. Determine a new version number. Then set `VERSION` variable.

```console
```bash
# Set VERSION and confirm it. It should not have "v" prefix.
$ VERSION=x.y.z
$ echo $VERSION
VERSION=x.y.z
echo $VERSION
```

2. Make a branch to release

```console
$ git neco dev "bump-$VERSION"
```bash
git switch -c "bump-$VERSION"
```

3. Edit `CHANGELOG.md` for the new version ([example][]).
4. Bump image version.

```console
$ sed -i -E "s/(.*newTag: ).*/\1${VERSION}/" config/controller/kustomization.yaml config/agent/kustomization.yaml
$ sed -i -E "s/(.*Version = ).*/\1\"${VERSION}\"/" constants.go
```bash
sed -i -E "s/(.*newTag: ).*/\1${VERSION}/" config/controller/kustomization.yaml config/agent/kustomization.yaml
sed -i -E "s/(.*Version = ).*/\1\"${VERSION}\"/" constants.go
```

5. Commit the change and push it.

```console
$ git commit -a -m "Bump version to $VERSION"
$ git neco review
```bash
git commit -a -m "Bump version to $VERSION"
git push origin "bump-$VERSION"
```

6. Merge this branch.
7. Add a git tag to the main HEAD, then push it.

```console
```bash
# Set VERSION again.
$ VERSION=x.y.z
$ echo $VERSION
VERSION=x.y.z
echo $VERSION

$ git checkout main
$ git pull
$ git tag -a -m "Release v$VERSION" "v$VERSION"
git checkout main
git pull
git tag -a -m "Release v$VERSION" "v$VERSION"

# Make sure the release tag exists.
$ git tag -ln | grep $VERSION
git tag -ln | grep $VERSION

$ git push origin "v$VERSION"
git push origin "v$VERSION"
```

GitHub actions will build and push artifacts such as container images and
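
The two `sed` commands in step 4 of the RELEASE.md procedure above can be tried out safely before touching the repository. The following is a minimal sketch, not part of this PR: the file contents are hypothetical stand-ins (the real `kustomization.yaml` and `constants.go` are not shown in this diff), `1.2.3` is a placeholder version, and GNU `sed` is assumed, since the procedure itself uses `-i` without a backup suffix.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Placeholder version for illustration; no "v" prefix, as the procedure requires.
VERSION=1.2.3

# Work in a throwaway directory so the real repository files are untouched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical stand-in for config/controller/kustomization.yaml.
cat > "$workdir/kustomization.yaml" <<'EOF'
images:
  - name: ghcr.io/cybozu-go/meows-controller
    newTag: 0.0.1
EOF

# Hypothetical stand-in for constants.go.
cat > "$workdir/constants.go" <<'EOF'
package meows

const Version = "0.0.1"
EOF

# Same substitutions as in RELEASE.md, pointed at the temporary copies.
# Group 1 captures everything up to and including the key, so indentation is kept.
sed -i -E "s/(.*newTag: ).*/\1${VERSION}/" "$workdir/kustomization.yaml"
sed -i -E "s/(.*Version = ).*/\1\"${VERSION}\"/" "$workdir/constants.go"

# Show the rewritten lines: the newTag entry and the Version constant.
grep 'newTag' "$workdir/kustomization.yaml"
grep 'Version' "$workdir/constants.go"
```

Because `sed` back-references are single-digit, `\1` followed directly by the version digits is still read as group 1 plus the literal version string, which is why the commands work without a separator.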
16 changes: 5 additions & 11 deletions docs/commands.md
@@ -1,12 +1,10 @@
CLI options
===========
# CLI options

`controller`
-----------
## `controller`

The CLI allows you to use the following options:

```bash
```console
$ controller -h
Kubernetes controller for GitHub Actions self-hosted runner

@@ -41,9 +39,7 @@ Flags:
--zap-stacktrace-level level Zap Level at and above which stacktraces are captured (one of 'info', 'error', 'panic').
```


`slack-agent`
-------------
## `slack-agent`

The Slack agent is a server program.
This notifies CI results and accepts requests for extending Pods' lifecycles
@@ -68,9 +64,7 @@ Flags:
-v, --verbose Verbose.
```


`meows`
------
## `meows`

This is a tool command to do some operations.
It enables to send requests to the slack-agent, or to control the GitHub runners.
10 changes: 5 additions & 5 deletions docs/design.md
@@ -135,7 +135,7 @@ Runner has the `status` and `busy` state as written [here](https://docs.github.c
If the `--ephemeral` option is given to `config.sh` does not repeat the
long polling again, and never gets `online` after the assigned job is done.
This behavior is useful for ensuring to make a clean environment for each job.
ref: https://docs.github.com/en/actions/hosting-your-own-runners/autoscaling-with-self-hosted-runners#using-ephemeral-runners-for-autoscaling
ref: <https://docs.github.com/en/actions/hosting-your-own-runners/autoscaling-with-self-hosted-runners#using-ephemeral-runners-for-autoscaling>

#### A job is scheduled only on a `online` runner

@@ -189,7 +189,7 @@ meows sets the namespaced name of a `RunnerPool` as a custom label.
command when the job is failed. The `if: failure()` syntax allows users
to run the step only when one of previous steps exit with non-zero code.
1. Publish the timestamp of when to delete this pod in the `/deletion_time` endpoint.
If the job is succeeded or canceled, the `Pod` publishes the current time for
If the job is succeeded or canceled, the `Pod` publishes the current time for
delete itself. If the job is failed, the `Pod` publishes the future time for
delete itself, for example 20 min later.
1. The Slack agent notifies the result of the job on a Slack channel.
@@ -208,15 +208,15 @@ A Runner `Pod` has the following state as a GitHub Actions job runner.
for example, booting a couple of VMs needed in a job before the job is assigned.
- `running`: `Pod` is running. Registered in GitHub Actions.
- `debugging`: The job has finished with failure and Users can enter `Pod` to debug.
- `stale`: The environment in the `Pod` is dirty. If a runner restarts before completing a job,
- `stale`: The environment in the `Pod` is dirty. If a runner restarts before completing a job,
the environment in the `Pod` may be dirty. This state means waiting for the Pod
to be removed to prevent Job execution with that stale Pod.

In addition, it has the following states as the exit state of the execution result of `Runner.Listener`.

- `retryable_error`: If execution fails due to a factor other than a job, restart `Runner.Listener`.
- `updating`: When a new `Runner.Listener` is released, it updates itself and restarts` Runner.Listener`.
- `undefined`: When the exit code of `Runner.Listener` is undefined. It restarts` Runner.Listener`.
- `updating`: When a new `Runner.Listener` is released, it updates itself and restarts `Runner.Listener`.
- `undefined`: When the exit code of `Runner.Listener` is undefined. It restarts `Runner.Listener`.

The above states are exposed from `/metrics` endpoint as Prometheus metrics. See [metrics.md](metrics.md).
