Below is the example on how to install KCM for development purposes and create a managed cluster on AWS with k0s for testing. The kind cluster acts as management in this example.
git clone https://github.com/K0rdent/kcm.git && cd kcm
make help # Check available commands
Run:
make cli-install
Follow the instruction to configure AWS Provider: AWS Provider Setup
The following env variables must be set in order to deploy dev cluster on AWS:
AWS_ACCESS_KEY_ID
: The access key ID for authenticating with AWS.AWS_SECRET_ACCESS_KEY
: The secret access key for authenticating with AWS.
The following environment variables are optional but can enhance functionality:
AWS_SESSION_TOKEN
: Required only if using temporary AWS credentials.AWS_REGION
: Specifies the AWS region in which to deploy resources. If not provided, defaults tous-east-2
.
Follow the instruction on how to configure Azure Provider.
Additionally to deploy dev cluster on Azure the following env variables should be set before running deployment:
AZURE_SUBSCRIPTION_ID
- Subscription IDAZURE_TENANT_ID
- Service principal tenant IDAZURE_CLIENT_ID
- Service principal App IDAZURE_CLIENT_SECRET
- Service principal password
More detailed description of these parameters can be found here.
Follow the instruction on how to configure vSphere Provider.
To properly deploy dev cluster you need to have the following variables set:
VSPHERE_USER
VSPHERE_PASSWORD
VSPHERE_SERVER
VSPHERE_THUMBPRINT
VSPHERE_DATACENTER
VSPHERE_DATASTORE
VSPHERE_RESOURCEPOOL
VSPHERE_FOLDER
VSPHERE_CONTROL_PLANE_ENDPOINT
VSPHERE_VM_TEMPLATE
VSPHERE_NETWORK
VSPHERE_SSH_KEY
Naming of the variables duplicates parameters in ManagementCluster
. To get
full explanation for each parameter visit
vSphere cluster parameters and
vSphere machine parameters.
To properly deploy dev cluster you need to have the following variable set:
DEV_PROVIDER
- should be "eks"
To deploy a development cluster on OpenStack, first set:
DEV_PROVIDER
- should be "openstack"
We recommend using OpenStack Application Credentials as it enhances security by allowing applications to authenticate with limited, specific permissions without exposing the user's password.
OS_AUTH_URL
OS_AUTH_TYPE
OS_APPLICATION_CREDENTIAL_ID
OS_APPLICATION_CREDENTIAL_SECRET
OS_REGION_NAME
OS_INTERFACE
OS_IDENTITY_API_VERSION
You will also need to specify additional parameters related to machine sizes and images:
OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR
OPENSTACK_NODE_MACHINE_FLAVOR
OPENSTACK_IMAGE_NAME
Note
The recommended minimum vCPU value for the control plane flavor is 2, while for the worker node flavor, it is 1. For detailed information, refer to the machine-flavor CAPI docs.
To "adopt" an existing cluster first obtain the kubeconfig file for the cluster.
Then set the DEV_PROVIDER
to "adopted". Export the kubeconfig file as a variable by running the following:
export KUBECONFIG_DATA=$(cat kubeconfig | base64 -w 0)
The rest of the deployment procedure is same for all providers.
Default provider which will be used to deploy cluster is AWS, if you want to use
another provider change DEV_PROVIDER
variable with the name of provider before
running make (e.g. export DEV_PROVIDER=azure
).
-
Configure your cluster parameters in provider specific file (for example
config/dev/aws-clusterdeployment.yaml
in case of AWS):- Configure the
name
of the ClusterDeployment - Change instance type or size for control plane and worker machines
- Specify the number of control plane and worker machines, etc
- Configure the
-
Run
make dev-apply
to deploy and configure management cluster. -
Wait a couple of minutes for management components to be up and running.
-
Apply credentials for your provider by executing
make dev-creds-apply
. -
Run
make dev-mcluster-apply
to deploy managed cluster on provider of your choice with default configuration. -
Wait for infrastructure to be provisioned and the cluster to be deployed. You may watch the process with the
./bin/clusterctl describe
command. Example:export KUBECONFIG=~/.kube/config ./bin/clusterctl describe cluster <clusterdeployment-name> -n kcm-system --show-conditions all
Note
If you encounter any errors in the output of clusterctl describe cluster
inspect the logs of the
capa-controller-manager
with:
kubectl logs -n kcm-system deploy/capa-controller-manager
This may help identify any potential issues with deployment of the AWS infrastructure.
-
Retrieve the
kubeconfig
of your managed cluster:kubectl --kubeconfig ~/.kube/config get secret -n kcm-system <clusterdeployment-name>-kubeconfig -o=jsonpath={.data.value} | base64 -d > kubeconfig
E2E tests can be ran locally via the make test-e2e
target. In order to have
CI properly deploy a non-local registry will need to be used and the Helm charts
and kcm-controller image will need to exist on the registry, for example, using
GHCR:
IMG="ghcr.io/k0rdent/kcm/controller-ci:v0.0.1-179-ga5bdf29" \
VERSION=v0.0.1-179-ga5bdf29 \
REGISTRY_REPO="oci://ghcr.io/k0rdent/kcm/charts-ci" \
make test-e2e
Optionally, the NO_CLEANUP=1
env var can be used to disable After
nodes from
running within some specs, this will allow users to debug tests by re-running
them without the need to wait a while for an infrastructure deployment to occur.
For subsequent runs the CLUSTER_DEPLOYMENT_NAME=<cluster name>
env var should be
passed to tell the test what cluster name to use so that it does not try to
generate a new name and deploy a new cluster.
Tests that run locally use autogenerated names like 12345678-e2e-test
while
tests that run in CI use names such as ci-1234567890-e2e-test
. You can always
pass CLUSTER_DEPLOYMENT_NAME=
from the get-go to customize the name used by the
test.
Provider tests are broken into two types, onprem
and cloud
. For CI,
provider:onprem
tests run on self-hosted runners provided by Mirantis.
provider:cloud
tests run on GitHub actions runners and interact with cloud
infrastructure providers such as AWS or Azure.
Each specific provider test also has a label, for example, provider:aws
can be
used to run only AWS tests. To utilize these filters with the make test-e2e
target pass the GINKGO_LABEL_FILTER
env var, for example:
GINKGO_LABEL_FILTER="provider:cloud" make test-e2e
would run all cloud provider tests. To see a list of all available labels run:
ginkgo labels ./test/e2e
In CI we run make dev-aws-nuke
and make dev-azure-nuke
to cleanup test
resources within AWS and Azure CI environments. These targets are not run
locally, but can be run manually if needed. For example, to cleanup AWS
manually use:
CLUSTER_NAME=example-e2e-test make dev-aws-nuke
The release.yml
is triggered via a release, it builds and packages the
Helm charts images and airgap package and uploads them. Images and charts are
uploaded to ghcr.io/k0rdent/kcm
and the airgap package is uploaded to the
binary2a.mirantis.com
S3 bucket.
The build_test.yml
workflow is broken into three phases:
- Build and Unit Test
- E2E Tests
- Cleanup
The Build and Unit Test phase is comprised of one job, Build and Unit Test
.
This job runs the controller unit tests, linters and other checks.
It then builds and packages the Helm charts and controller images uploading them
to ghcr.io/k0rdent/kcm
so that the jobs within the E2E phase have access to
them.
- CI charts are uploaded to
ghcr.io/k0rdent/kcm/charts-ci
- KCM Controller image is uploaded to
ghcr.io/k0rdent/kcm/controller-ci
All other tests within the workflow require this job to pass before they are scheduled to run.
In order for the other jobs to be in sync the Get outputs
step is used to
capture a shared clustername
variable and version
variable used across
tests.
The E2E Tests phase is comprised of three jobs. Each of the jobs other than the
E2E Controller
job are conditional on the test e2e
label being present on
the PR which triggers the workflow.
Once the Build and Unit Test
job completes successfully all E2E jobs are
scheduled and run concurrently.
E2E Controller
- Runs the e2e tests for the controller viaGINKGO_LABEL_FILTER="controller"
The built controller is deployed in a kind cluster and the tests are run against it. These tests always run even if thetest e2e
label is not present.E2E Cloud
- Runs the AWS and Azure provider tests viaGINKGO_LABEL_FILTER="provider:cloud"
E2E Onprem
- Runs the VMWare VSphere provider tests viaGINKGO_LABEL_FILTER="provider:onprem"
this job runs on a self-hosted runner provided by Mirantis and utilizes Mirantis' internal vSphere infrastructure.
If any failures are encountered in the E2E tests the Archive test results
step
will archive test logs and other artifacts for troubleshooting.
The Cleanup phase is comprised of one job, Cleanup
. This job is conditional
on the test e2e
label being present and the E2E Cloud
job running. The job
will always run no matter the result of the E2E Cloud
job and conducts cleanup
of AWS and Azure test resources. It is intended to run no matter the result of
the E2E Cloud
job so that resources are cleaned up even if the tests fail.
The Cleanup
job runs the make dev-aws-nuke
and make dev-azure-nuke
targets.
At this time other providers do not have a mechanism for cleanup and if tests fail to delete the resources they create they will need to be manually cleaned.
The following is the notes on provider specific CCM credentials delivery process
Azure CCM/CSI controllers expect well-known azure.json
to be provided though
Secret or by placing it on host file system.
The 2A controller will create Secret named azure-cloud-provider
in the
kube-system
namespace (where all controllers reside). The name is passed to
controllers via helm values.
The azure.json
parameters are documented in detail in the
official docs
Most parameters are obtained from CAPZ objects. Rest parameters are either omitted or set to sane defaults.
cloud-provider-vsphere expects configuration to be passed in ConfigMap. The credentials are located in the secret which is referenced in the configuration.
The config itself is a yaml file and it's not very well documented (the spec docs haven't been updated for years).
Most options however has similar names and could be inferred.
All optional parameters are omitted in the configuration created by 2A controller.
Some options are hardcoded (since values are hard/impossible to get from CAPV objects). For example:
insecureFlag
is set totrue
to omit certificate management parameters. This is also a default in the official charts since most vcenters are using self-signed or signed by internal authority certificates.port
is set to443
(HTTPS)- Multi-vcenter
labels are set to default values of region and zone (
k8s-region
andk8s-zone
)
CSI expects single Secret with configuration in ini
format
(documented here).
Options are similar to CCM and same defaults/considerations are applicable.
CAPO relies on a clouds.yaml file in order to manage the OpenStack resources. This should be supplied as a Kubernetes Secret.
clouds:
my-openstack-cloud:
auth:
auth_url: <your_auth_url>
application_credential_id: <your_credential_id>
application_credential_secret: <your_credential_secret>
region_name: <your_region>
interface: <public|internal|admin>
identity_api_version: 3
auth_type: v3applicationcredential
One would typically create a Secret (for example, openstack-cloud-config) in the kcm-system namespace with the clouds.yaml. Credential object references the secret and the CAPO controllers references this Credential to provision resources.
When you deploy a new cluster, KCM automatically parses the previously created Kubernetes Secret’s data to build a cloud.conf. This cloud-config is mounted inside the CCM and/or CSI pods enabling them to manage load balancers, floating IPs, etc. Refer to configuring OpenStack CCM for more details.
Here's an example of the generated cloud.conf:
[Global]
auth-url=<your_auth_url>
application-credential-id=<your_credential_id>
application-credential-secret=<your_credential_secret>
region=<your_region>
domain-name=<your_domain_name>
[LoadBalancer]
floating-network-id=<your_floating_network_id>
[Network]
public-network-name=<your_network_name>
Use the make airgap-package
target to manually generate the airgap bundle,
to ensure the correctly tagged KCM controller image is present in the bundle
prefix the IMG
env var with the desired image, for example:
IMG="ghcr.io/k0rdent/kcm:0.0.4" make airgap-package
Not setting an IMG
var will use the default image name/tag generated by the
Makefile.