diff --git a/docs/06-config-as-data.md b/docs/06-config-as-data.md new file mode 100644 index 00000000..d44920c2 --- /dev/null +++ b/docs/06-config-as-data.md @@ -0,0 +1,165 @@ +# Configuration as Data + +* Author(s): Martin Maly, @martinmaly +* Approver: @bgrant0607 + +## Why + +This document provides background context for Package Orchestration, which is +further elaborated in a dedicated [document](07-package-orchestration.md). + +## Configuration as Data + +*Configuration as Data* is an approach to management of configuration (incl. +configuration of infrastructure, policy, services, applications, etc.) which: + +* makes configuration data the source of truth, stored separately from the live + state +* uses a uniform, serializable data model to represent configuration +* separates code that acts on the configuration from the data and from packages + / bundles of the data +* abstracts configuration file structure and storage from operations that act + upon the configuration data; clients manipulating configuration data don’t + need to directly interact with storage (git, container images) + +![CaD Overview](./CaD%20Overview.svg) + +## Key Principles + +A system based on CaD *should* observe the following key principles: + +* secrets should be stored separately, in a secret-focused storage system + ([example](https://cloud.google.com/secret-manager)) +* stores a versioned history of configuration changes by change sets to bundles + of related configuration data +* relies on uniformity and consistency of the configuration format, including + type metadata, to enable pattern-based operations on the configuration data, + along the lines of duck typing +* separates schemas for the configuration data from the data, and relies on + schema information for strongly typed operations and to disambiguate data + structures and other variations within the model +* decouples abstractions of configuration from collections of configuration data +* represents abstractions of 
configuration generators as data with schemas, like + other configuration data +* finds, filters / queries / selects, and/or validates configuration data that + can be operated on by given code (functions) +* finds and/or filters / queries / selects code (functions) that can operate on + resource types contained within a body of configuration data +* *actuation* (reconciliation of configuration data with live state) is separate + from transformation of configuration data, and is driven by the declarative + data model +* transformations, particularly value propagation, are preferable to wholesale + configuration generation except when the expansion is dramatic (say, >10x) +* transformation input generation should usually be decoupled from propagation +* deployment context inputs should be taken from well defined “provider context” + objects +* identifiers and references should be declarative +* live state should be linked back to sources of truth (configuration) + +## KRM CaD + +Our implementation of the Configuration as Data approach ( +[kpt](https://kpt.dev), +[Config Sync](https://cloud.google.com/anthos-config-management/docs/config-sync-overview), +and [Package Orchestration](https://github.com/GoogleContainerTools/kpt/tree/main/porch)) +builds on the foundation of the +[Kubernetes Resource Model](https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/resource-management.md) +(KRM). + +**Note**: Even though KRM is not a requirement of Config as Data (just like +Python or Go templates or Jinja are not specifically requirements for +[IaC](https://en.wikipedia.org/wiki/Infrastructure_as_code)), the choice of +another foundational config representation format would necessitate +implementing adapters for all types of infrastructure and applications +configured, including Kubernetes, CRDs, GCP resources and more. 
Likewise, choice +of another configuration format would require redesign of a number of the +configuration management mechanisms that have already been designed for KRM, +such as 3-way merge, structural merge patch, schema descriptions, resource +metadata, references, status conventions, etc. + +**KRM CaD** is therefore a specific approach to implementing *Configuration as +Data* which: +* uses [KRM](https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/resource-management.md) + as the configuration serialization data model +* uses [Kptfile](https://kpt.dev/reference/schema/kptfile/) to store package + metadata +* uses [ResourceList](https://kpt.dev/reference/schema/resource-list/) as a + serialized package wire-format +* uses a function `ResourceList → ResultList` (`kpt` function) as the + foundational, composable unit of package-manipulation code (note that other + forms of code can manipulate packages as well, i.e. UIs, custom algorithms + not necessarily packaged and used as kpt functions) + +and provides the following basic functionality: + +* load a serialized package from a repository (as `ResourceList`) (examples of + repository may be one or more of: local HDD, Git repository, OCI, Cloud + Storage, etc.) 
+* save a serialized package (as `ResourceList`) to a package repository +* evaluate a function on a serialized package (`ResourceList`) +* [render](https://kpt.dev/book/04-using-functions/01-declarative-function-execution) + a package (evaluate functions declared within the package itself) +* create a new (empty) package +* fork (or clone) an existing package from one package repository (called + upstream) to another (called downstream) +* delete a package from a repository +* associate a version with the package; guarantee immutability of packages with + an assigned version +* incorporate changes from the new version of an upstream package into a new + version of a downstream package +* revert to a prior version of a package + +## Value + +The Config as Data approach provides benefits that other configuration +management approaches offer only to a lesser extent, or do not offer +at all. + +The *CaD* approach enables: + +* simplified authoring of configuration using a variety of methods and sources +* WYSIWYG interaction with configuration using a simple data serialization + format rather than a code-like format +* layering of interoperable interface surfaces (notably GUI) over declarative + configuration mechanisms rather than forcing choices between exclusive + alternatives (exclusively UI/CLI or IaC initially followed by exclusively + UI/CLI or exclusively IaC) +* the ability to apply UX techniques to simplify configuration authoring and + viewing +* compared to imperative tools (e.g., UI, CLI) that directly modify the live + state via APIs, CaD enables versioning, undo, audits of configuration history, + review/approval, pre-deployment preview, validation, safety checks, + constraint-based policy enforcement, and disaster recovery +* bulk changes to configuration data in their sources of truth +* injection of configuration to address horizontal concerns +* merging of multiple sources of truth +* state export to reusable blueprints without manual 
templatization +* cooperative editing of configuration by humans and automation, such as for + security remediation (which is usually implemented against live-state APIs) +* reusability of configuration transformation code across multiple bodies of + configuration data containing the same resource types, amortizing the effort + of writing, testing, documenting the code +* combination of independent configuration transformations +* implementation of config transformations using the languages of choice, + including both programming and scripting approaches +* reducing the frequency of changes to existing transformation code +* separation of roles between developer and non-developer configuration users +* defragmenting the configuration transformation ecosystem +* admission control and invariant enforcement on sources of truth +* maintaining variants of configuration blueprints without one-size-fits-all + full struct-constructor-style parameterization and without manually + constructing and maintaining patches +* drift detection and remediation for most of the desired state via continuous + reconciliation using apply and/or for specific attributes via targeted + mutation of the sources of truth + +## Related Articles + +For more information about Configuration as Data and Kubernetes Resource Model, +visit the following links: + +* [Rationale for kpt](https://kpt.dev/guides/rationale) +* [Understanding Configuration as Data](https://cloud.google.com/blog/products/containers-kubernetes/understanding-configuration-as-data-in-kubernetes) + blog post. 
+* [Kubernetes Resource Model](https://cloud.google.com/blog/topics/developers-practitioners/build-platform-krm-part-1-whats-platform) + blog post series diff --git a/docs/07-package-orchestration.md b/docs/07-package-orchestration.md new file mode 100644 index 00000000..3c0d4132 --- /dev/null +++ b/docs/07-package-orchestration.md @@ -0,0 +1,510 @@ +# Package Orchestration + +* Author(s): Martin Maly, @martinmaly +* Approver: @mortent + +## Why + +Customers who want to take advantage of the benefits of [Configuration as Data +](./06-config-as-data.md) can do so today using a [kpt](https://kpt.dev) CLI and +kpt function ecosystem, including [functions catalog](https://catalog.kpt.dev/). +Package authoring is possible using a variety of editors with +[YAML](https://yaml.org/) support. That said, a delightful UI experience +of WYSIWYG package authoring which supports broader package lifecycle, including +package authoring with *guardrails*, approval workflow, package deployment, and +more, is not yet available. + +*Package Orchestration* service is part of the implementation of the +Configuration as Data approach, and enables building the delightful UI +experience supporting the configuration lifecycle. + +## Core Concepts + +This section briefly describes core concepts of package orchestration: + +***Package***: Package is a collection of related configuration files containing +configuration of [KRM][krm] **resources**. Specifically, configuration +packages are [kpt packages](https://kpt.dev/). + +***Repository***: Repositories store packages or [functions][]. +For example [git][] or [OCI](#oci). Functions may be associated with +repositories to enforce constraints or invariants on packages (guardrails). +([more details](#repositories)) + +Packages are sequentially ***versioned***; multiple versions of the same package +may exist in a repository. 
([more details](#package-versioning)) + +A package may have a link (URL) to an ***upstream package*** (a specific +version) from which it was cloned. ([more details](#package-relationships)) + +A package may be in one of several lifecycle stages: +* ***Draft*** - package is being created or edited. The package contents can be + modified but package is not ready to be used (i.e. deployed) +* ***Proposed*** - author of the package proposed that the package be published +* ***Published*** - the changes to the package have been approved and the + package is ready to be used. Published packages can be deployed or cloned + +***Function*** (specifically, [KRM functions][krm functions]) can be applied to +packages to mutate or validate resources within them. Functions can be applied +to a package to create a specific package mutation while editing a package draft, +added to the package's Kptfile [pipeline][], or associated with a +repository to be applied to all packages on changes. +([more details](#functions)) + +A repository can be designated as a ***deployment repository***. *Published* +packages in a deployment repository are considered deployment-ready. 
+([more details](#deployment)) + + +[krm]: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/architecture/resource-management.md +[functions]: https://kpt.dev/book/02-concepts/03-functions +[krm functions]: https://github.com/kubernetes-sigs/kustomize/blob/master/cmd/config/docs/api-conventions/functions-spec.md +[pipeline]: https://kpt.dev/book/04-using-functions/01-declarative-function-execution +[Config Sync]: https://cloud.google.com/anthos-config-management/docs/config-sync-overview +[kpt]: https://kpt.dev/ +[git]: https://git-scm.org/ +[optimistic-concurrency]: https://en.wikipedia.org/wiki/Optimistic_concurrency_control +[apiserver]: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/ +[representation]: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#differing-representations +[crds]: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/ + +## Core Components of Configuration as Data Implementation + +The Core implementation of Configuration as Data, *CaD Core*, is a set of +components and APIs which collectively enable: + +* Registration of repositories (Git, OCI) containing kpt packages or functions, + and discovery of packages and functions +* Porcelain package lifecycle, including authoring, versioning, deletion, + creation and mutations of a package draft, process of proposing the package + draft, and publishing of the approved package. +* Package lifecycle operations such as: + * assisted or automated rollout of package upgrade when a new version + of the upstream package version becomes available + * rollback of a package to previous version +* Deployment of packages from deployment repositories and observability of their + deployment status. 
+* Permission model that allows role-based access control + +### High-Level Architecture + +At the high level, the Core CaD functionality comprises: + +* a generic (i.e. not task-specific) package orchestration service implementing + * package repository management + * package discovery, authoring and lifecycle management +* [kpt][] - a Git-native, schema-aware, extensible client-side tool for + managing KRM packages +* a GitOps-based deployment mechanism (for example [Config Sync][]), which + distributes and deploys configuration, and provides observability of the + status of deployed resources +* a task-specific UI supporting repository management, package discovery, + authoring, and lifecycle + +![CaD Core Architecture](./CaD%20Core%20Architecture.svg) + +## CaD Concepts Elaborated + +Concepts briefly introduced above are elaborated in more detail in this section. + +### Repositories + +[kpt][] and [Config Sync][] currently integrate with [git][] repositories, and +there is an existing design to add [OCI support](./02-oci-support.md) to kpt. +Initially, the Package Orchestration service will prioritize integration with +[git][], and support for additional repository types may be added in the future +as required. + +Requirements applicable to all repositories include: ability to store packages, +their versions, and sufficient metadata associated with package to capture: + +* package dependency relationships (upstream - downstream) +* package lifecycle state (draft, proposed, published) +* package purpose (base package) +* (optionally) even customer-defined attributes + +At repository registration, customers must be able to specify details needed to +store packages in appropriate locations in the repository. For example, +registration of a Git repository must accept a branch and a directory. 
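
To illustrate the registration details described above, here is a minimal Python sketch of a Git repository registration that requires a branch and a directory. The class and field names (`GitRegistration`, `repo`, `branch`, `directory`) and the example URL are hypothetical and are not taken from the Porch API; the sketch only shows the kind of information a registration would need to carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GitRegistration:
    """Hypothetical registration record; not the actual Porch schema."""

    repo: str       # URL of the Git repository
    branch: str     # branch in which packages are stored
    directory: str  # directory within the repository that contains packages

    def __post_init__(self) -> None:
        # A registration must carry enough detail to locate packages.
        if not self.repo:
            raise ValueError("repo is required")
        if not self.branch:
            raise ValueError("branch is required")
        if not self.directory.startswith("/"):
            raise ValueError("directory must start with '/'")

    def package_path(self, package: str) -> str:
        """Return the location of a package inside the repository."""
        return f"{self.directory.rstrip('/')}/{package}"


reg = GitRegistration(
    repo="https://example.com/blueprints.git",  # placeholder URL
    branch="main",
    directory="/packages",
)
print(reg.package_path("basens"))
```

Validating at construction time means that an incomplete registration is rejected before any package is stored against it.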
+ +Repositories may have associated guardrails - mutation and validation functions +that ensure and enforce requirements of all packages in the repository, +including gating promotion of a package to a *published* lifecycle stage. + +_Note_: A user role with sufficient permissions can register a package or +function repository, including repositories containing functions authored by +the customer or other providers. Since the functions in the registered +repositories become discoverable, customers must be aware of the implications of +registering function repositories and trust the contents thereof. + +### Package Versioning + +Packages are sequentially versioned. The important requirements are: + +* ability to compare any two versions of a package and determine whether one is + newer than, equal to, or older than the other +* ability to support automatic assignment of versions +* ability to support [optimistic concurrency][optimistic-concurrency] of package + changes via version numbers +* simple model which easily supports automation + +We plan to use a simple integer sequence to represent package versions. + +### Package Relationships + +Kpt packages support the concept of ***upstream***. When a package is cloned +from another, the new package (called the ***downstream*** package) maintains an +upstream link to the specific version of the package from which it was cloned. +If a new version of the upstream package becomes available, the upstream link +can be used to [update](https://kpt.dev/book/03-packages/05-updating-a-package) +the downstream package. + +### Deployment + +The deployment mechanism is responsible for deploying configuration packages +from a repository and affecting the live state. Because the configuration +is stored in standard repositories (Git, and in the future OCI), the deployment +component is pluggable. By default, [Config Sync][] is the deployment mechanism +used by the CaD Core implementation, but others can be used as well. 
+ +Here we highlight some key attributes of the deployment mechanism and its +integration within the CaD Core: + +* _Published_ packages in a deployment repository are considered ready to be + deployed +* Config Sync supports deploying individual packages and whole repositories. + For Git specifically that translates to a requirement to be able to specify + repository, branch/tag/ref, and directory when instructing Config Sync to + deploy a package. +* _Draft_ packages need to be identified in such a way that Config Sync can + easily avoid deploying them. +* Config Sync needs to be able to pin to specific versions of deployable + packages in order to orchestrate rollouts and rollbacks. This means it must + be possible to GET a specific version of a package. +* Config Sync needs to be able to discover when new versions are available for + deployment. + +### Functions + +Functions, specifically [KRM functions][krm functions], are used in the CaD Core +to manipulate resources within packages. + +* Similar to packages, functions are stored in repositories. Some repositories + (such as OCI) are more suitable for storing functions than others (such as + Git). +* Function discovery will be aided by metadata associated with the function + by which the function can advertise which resources it acts on, whether the + function is idempotent or not, whether it is a mutator or validator, etc. +* Function repositories can be registered; subsequently, users can discover + functions from the registered repositories and use them as follows: + +A function can be: + +* applied imperatively to a package draft to perform a specific mutation to the + package's resources or meta-resources (`Kptfile` etc.) +* registered in the package's `Kptfile` function pipeline as a *mutator* or + *validator* in order to be automatically run as part of package rendering +* registered at the repository level as a *mutator* or *validator*. 
Such function + then applies to all packages in the repository and is evaluated whenever a + change to a package in the repository occurs. + +## Package Orchestration - Porch + +Having established the context of the CaD Core components and the overall +architecture, the remainder of the document will focus on **Porch** - Package +Orchestration service. + +To reiterate the role of Package Orchestration service among the CaD Core +components, it is: + +* [Repository Management](#repository-management) +* [Package Discovery](#package-discovery) +* [Package Authoring](#package-authoring) and Lifecycle + +In the following section we'll expand more on each of these areas. The term +_client_ used in these sections can be either a person interacting with the UI +such as a web application or a command-line tool, or an automated agent or +process. + +### Repository Management + +The repository management functionality of Package Orchestration service enables +the client to: + +* register, unregister, update registration of repositories, and discover + registered repositories. Git repository integration will be available first, + with OCI and possibly more delivered in the subsequent releases. +* manage repository-wide upstream/downstream relationships, i.e. designate + default upstream repository from which packages will be cloned. 
+* annotate repository with metadata such as whether repository contains + deployment ready packages or not; metadata can be application or customer + specific +* define and enforce package invariants (guardrails) at the repository level, by + registering mutator and/or validator functions with the repository; those + registered functions will be applied to packages in the repository to enforce + invariants + +### Package Discovery + +The package discovery functionality of Package Orchestration service enables +the client to: + +* browse packages in a repository +* discover configuration packages in registered repositories and sort/filter + based on the repository containing the package, package metadata, version, + package lifecycle stage (draft, proposed, published) +* retrieve resources and metadata of an individual package, including latest + version or any specific version or draft of a package, for the purpose of + introspection of a single package or for comparison of contents of multiple + versions of a package, or related packages +* enumerate _upstream_ packages available for creating (cloning) a _downstream_ + package +* identify downstream packages that need to be upgraded after a change is made + to an upstream package +* identify all deployment-ready packages in a deployment repository that are + ready to be synced to a deployment target by Config Sync +* identify new versions of packages in a deployment repository that can be + rolled out to a deployment target by Config Sync +* discover functions in registered repositories based on filtering criteria + including containing repository, applicability of a function to a specific + package or specific resource type(s), function metadata (mutator/validator), + idempotency (function is idempotent/not), etc. 
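
The discovery operations listed above can be sketched as simple filters over package revision metadata. The following Python sketch is illustrative only: `PackageRevisionInfo`, `discover`, and `latest_published` are hypothetical names, and the in-memory list stands in for whatever the service would actually return.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class PackageRevisionInfo:
    """Hypothetical summary of a package revision used for discovery."""

    repository: str
    package: str
    revision: int   # sequential integer version
    lifecycle: str  # "Draft", "Proposed", or "Published"


def discover(
    revisions: Iterable[PackageRevisionInfo],
    repository: Optional[str] = None,
    lifecycle: Optional[str] = None,
) -> List[PackageRevisionInfo]:
    """Filter by repository and/or lifecycle stage, then sort the matches."""
    matches = [
        r
        for r in revisions
        if (repository is None or r.repository == repository)
        and (lifecycle is None or r.lifecycle == lifecycle)
    ]
    return sorted(matches, key=lambda r: (r.repository, r.package, r.revision))


def latest_published(
    revisions: Iterable[PackageRevisionInfo], package: str
) -> Optional[PackageRevisionInfo]:
    """Return the highest published revision of a package, if any exists."""
    published = [
        r for r in revisions if r.package == package and r.lifecycle == "Published"
    ]
    return max(published, key=lambda r: r.revision, default=None)


revisions = [
    PackageRevisionInfo("blueprints", "basens", 1, "Published"),
    PackageRevisionInfo("blueprints", "basens", 2, "Published"),
    PackageRevisionInfo("blueprints", "basens", 3, "Draft"),
    PackageRevisionInfo("deployments", "istions", 1, "Proposed"),
]

drafts = discover(revisions, lifecycle="Draft")
newest = latest_published(revisions, "basens")
```

Because versions are plain integers, finding the newest published revision reduces to taking a maximum, which is one reason a simple sequential scheme is easy to automate.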
+ +### Package Authoring + +The package authoring and lifecycle functionality of the package Orchestration +service enables the client to: + +* Create a package _draft_ via one of the following means: + * an empty draft 'from scratch' (equivalent to + [kpt pkg init](https://kpt.dev/reference/cli/pkg/init/)) + * clone of an upstream package (equivalent to + [kpt pkg get](https://kpt.dev/reference/cli/pkg/get/)) from either a + registered upstream repository or from another accessible, unregistered, + repository + * edit an existing package (similar to the CLI command(s) + [kpt fn source](https://kpt.dev/reference/cli/fn/source/) or + [kpt pkg pull](https://github.com/GoogleContainerTools/kpt/issues/2557)) + * roll back / restore a package to any of its previous versions + ([kpt pkg pull](https://github.com/GoogleContainerTools/kpt/issues/2557) + of a previous version) +* Apply changes to a package _draft_. In general, mutations include + adding/modifying/deleting any part of the package's contents. Some specific + examples include: + * add/change/delete package metadata (i.e. 
some properties in the `Kptfile`) + * add/change/delete resources in the package + * add function mutators/validators to the package's [pipeline][] + * invoke a function imperatively on the package draft to perform a desired + mutation + * add/change/delete sub-package + * retrieve the contents of the package for arbitrary client-side mutations + (equivalent to [kpt fn source](https://kpt.dev/reference/cli/fn/source/)) + * update/replace the package contents with new contents, for example results + of client-side mutations by a UI (equivalent to + [kpt fn sink](https://kpt.dev/reference/cli/fn/sink/)) +* Rebase a package onto another upstream base package + ([detail](https://github.com/GoogleContainerTools/kpt/issues/2548)) or onto + a newer version of the same package (to aid with conflict resolution during + the process of publishing a draft package) +* Get feedback during package authoring, and assistance in recovery from: + * merge conflicts, invalid package changes, guardrail violations + * compliance of the drafted package with repository-wide invariants and + guardrails +* Propose that a _draft_ package be _published_. +* Apply arbitrary decision criteria and, by a manual or automated action, + approve (or reject) the proposal of a _draft_ package to be _published_. +* Perform bulk operations such as: + * Assisted/automated update (upgrade, rollback) of groups of packages matching + specific criteria (e.g. base package has new version or specific base + package version has a vulnerability and should be rolled back) + * Proposed change validation (pre-validating a change that adds a validator + function to a base package or a repository) +* Delete an existing package. + +#### Authoring & Latency + +An important goal of the Package Orchestration service is to support building +of task-specific UIs. 
In order to deliver low latency user experience acceptable +to UI interactions, the innermost authoring loop (depicted below) will require: + +* high performance access to the package store (load/save package) w/ caching +* low latency execution of mutations and transformations on the package contents +* low latency [KRM function][krm functions] evaluation and package rendering + (evaluation of package's function pipelines) + +![Inner Loop](./Porch%20Inner%20Loop.svg) + +#### Authoring & Access Control + +A client can assign actors (persons, service accounts) to roles that determine +which operations they are allowed to perform in order to satisfy requirements +of the basic roles. For example, only permitted roles can: + +* manipulate repository registration, enforcement of repository-wide + invariants and guardrails +* create a draft of a package and propose the draft be published +* approve (or reject) the proposal to publish a draft package +* clone a package from a specific upstream repository +* perform bulk operations such as rollout upgrade of downstream packages, + including rollouts across multiple downstream repositories +* etc. + +### Porch Architecture + +The Package Orchestration service, **Porch** is designed to be hosted in a +[Kubernetes](https://kubernetes.io/) cluster. + +The overall architecture is shown below, and includes also existing components +(k8s apiserver and Config Sync). + +![](./Porch%20Architecture.svg) + +In addition to satisfying requirements highlighted above, the focus of the +architecture was to: + +* establish clear components and interfaces +* support a low-latency package authoring experience required by the UIs + +The Porch components are: + +#### Porch Server + +The Porch server is implemented as [Kubernetes extension API server][apiserver]. 
+The benefits of using a Kubernetes extension API server are: + +* well-defined and familiar API style +* availability of generated clients +* integration with existing Kubernetes ecosystem and tools such as `kubectl` + CLI, [RBAC](https://kubernetes.io/docs/reference/access-authn-authz/rbac/) +* avoids requirement to open another network port to access a separate endpoint + running inside k8s cluster (this is a distinct advantage over gRPC, which we + considered as an alternative approach) + +Resources implemented by Porch include: + +* `PackageRevision` - represents the _metadata_ of the configuration package + revision stored in a _package_ repository. +* `PackageRevisionResources` - represents the _contents_ of the package revision +* `Function` - represents a [KRM function][krm functions] discovered in + a registered _function_ repository. + +Note that each configuration package revision is represented by a _pair_ of +resources which each present a different view (or [representation][]) of the same +underlying package revision. + +Repository registration is supported by a `Repository` [custom resource][crds]. + +**Porch server** itself comprises several key components, including: + +* The *Porch aggregated apiserver* which implements the integration into the + main Kubernetes apiserver, and directly serves API requests for the + `PackageRevision`, `PackageRevisionResources` and `Function` resources. +* Package orchestration *engine* which implements the package lifecycle + operations, and package mutation workflows +* *CaD Library* which implements specific package manipulation algorithms such + as package rendering (evaluation of package's function *pipeline*), + initialization of a new package, etc. The CaD Library is shared with `kpt` + where it likewise provides the core package manipulation algorithms. 
+* *Package cache* which enables both local caching and abstract + manipulation of packages and their contents irrespective of the underlying + storage mechanism (Git or OCI) +* *Repository adapters* for Git and OCI which implement the specific logic of + interacting with those types of package repositories. +* *Function runtime* which implements support for evaluating + [kpt functions][functions] and a multi-tier cache of functions to support + low latency function evaluation + +#### Function Runner + +The **function runner** is a separate service responsible for evaluating +[kpt functions][functions]. The function runner exposes a [gRPC](https://grpc.io/) +endpoint which enables evaluating a kpt function on the provided configuration +package. + +The gRPC technology was chosen for the function runner service because the +[requirements](#grpc-api) that informed the choice of a KRM API for the Package +Orchestration service do not apply. The function runner is an internal +microservice, an implementation detail not exposed to external callers. This +makes gRPC perfectly suitable. + +The function runner also maintains a cache of functions to support low latency +function evaluation. 
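
To make "evaluating a kpt function on the provided configuration package" more concrete, here is a minimal Python sketch of a function that takes a `ResourceList` and returns a transformed copy. It is an in-process illustration only and does not show the function runner's gRPC interface. The `set_labels` function and the sample resources are hypothetical; the `apiVersion`, `kind`, `items`, and `functionConfig` fields follow the KRM functions specification as we understand it.

```python
import copy
from typing import Any, Dict

ResourceList = Dict[str, Any]


def set_labels(resource_list: ResourceList) -> ResourceList:
    """Toy mutator: add the labels in functionConfig.data to every item."""
    result = copy.deepcopy(resource_list)  # leave the caller's input untouched
    labels = result.get("functionConfig", {}).get("data", {})
    for item in result.get("items", []):
        item.setdefault("metadata", {}).setdefault("labels", {}).update(labels)
    return result


package: ResourceList = {
    "apiVersion": "config.kubernetes.io/v1",
    "kind": "ResourceList",
    "functionConfig": {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "data": {"env": "prod"},
    },
    "items": [
        {
            "apiVersion": "v1",
            "kind": "ConfigMap",
            "metadata": {"name": "settings", "labels": {"app": "web"}},
        },
        {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": "web"}},
    ],
}

mutated = set_labels(package)
```

Returning a new `ResourceList` rather than editing the input in place keeps the function composable: the output of one function can be fed directly to the next.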
+ +#### CaD Library + +The [kpt](https://kpt.dev/) CLI already implements foundational package +manipulation algorithms in order to provide the command line user experience, +including: + +* [kpt pkg init](https://kpt.dev/reference/cli/pkg/init/) - create an empty, + valid, KRM package +* [kpt pkg get](https://kpt.dev/reference/cli/pkg/get/) - create a downstream + package by cloning an upstream package; set up the upstream reference of the + downstream package +* [kpt pkg update](https://kpt.dev/reference/cli/pkg/update/) - update the + downstream package with changes from new version of upstream, 3-way merge +* [kpt fn eval](https://kpt.dev/reference/cli/fn/eval/) - evaluate a kpt + function on a package +* [kpt fn render](https://kpt.dev/reference/cli/fn/render/) - render the package + by executing the function pipeline of the package and its nested packages +* [kpt fn source](https://kpt.dev/reference/cli/fn/source/) and + [kpt fn sink](https://kpt.dev/reference/cli/fn/sink/) - read package from + local disk as a `ResourceList` and write package represented as + `ResourceList` to local disk + +The same set of primitives forms the foundational building blocks of the package +orchestration service. Further, the package orchestration service combines these +primitives into higher-level operations (for example, package orchestrator +renders packages automatically on changes, future versions will support bulk +operations such as upgrade of multiple packages, etc). + +The implementation of the package manipulation primitives in kpt was refactored +(with initial refactoring completed, and more to be performed as needed) in +order to: + +* create a reusable CaD library, usable by both kpt CLI and Package + Orchestration service +* create abstractions for dependencies which differ between CLI and Porch, + most notably the dependency on Docker for function evaluation and the + dependency on the local file system for package rendering. 
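
The idea of abstracting the function evaluation dependency can be sketched as a small interface that the rendering code depends on, with interchangeable implementations behind it. This Python sketch is an assumption about the shape of such an abstraction, not the CaD Library's actual API: `FunctionEvaluator`, `InProcessEvaluator`, and `render` are hypothetical names, and only an in-process evaluator is shown.

```python
from typing import Any, Callable, Dict, List, Protocol

Resource = Dict[str, Any]
Transform = Callable[[List[Resource]], List[Resource]]


class FunctionEvaluator(Protocol):
    """Anything that can run a named function over a list of resources."""

    def evaluate(self, function: str, resources: List[Resource]) -> List[Resource]:
        ...


class InProcessEvaluator:
    """Runs trusted functions that are registered as plain Python callables."""

    def __init__(self, functions: Dict[str, Transform]) -> None:
        self._functions = functions

    def evaluate(self, function: str, resources: List[Resource]) -> List[Resource]:
        return self._functions[function](resources)


def render(
    pipeline: List[str], resources: List[Resource], evaluator: FunctionEvaluator
) -> List[Resource]:
    """Apply each function of the pipeline in order via the given evaluator."""
    for function in pipeline:
        resources = evaluator.evaluate(function, resources)
    return resources


def set_namespace(resources: List[Resource]) -> List[Resource]:
    return [
        {**r, "metadata": {**r.get("metadata", {}), "namespace": "prod"}}
        for r in resources
    ]


def add_label(resources: List[Resource]) -> List[Resource]:
    return [
        {**r, "metadata": {**r.get("metadata", {}), "labels": {"tier": "web"}}}
        for r in resources
    ]


evaluator = InProcessEvaluator({"set-namespace": set_namespace, "add-label": add_label})
rendered = render(
    ["set-namespace", "add-label"],
    [{"kind": "ConfigMap", "metadata": {"name": "cfg"}}],
    evaluator,
)
```

Since `render` only relies on the `evaluate` method, a container-based or RPC-based evaluator could be substituted without changing the rendering loop.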
+ +Over time, the CaD Library will provide the package manipulation primitives: + +* create a valid empty package (init) +* update package upstream pointers (get) +* perform 3-way merge (update) +* render - core package rendering algorithm using a pluggable function evaluator + to support: + * function evaluation via Docker (used by kpt CLI) + * function evaluation via an RPC to a service or appropriate function sandbox + * high-performance evaluation of trusted, built-in functions without a sandbox +* heal configuration (restore comments after lossy transformation) + +and both kpt CLI and Porch will consume the library. This approach will allow +leveraging the investment already made into the high-quality package +manipulation primitives, and enable functional parity between the kpt CLI and +the Package Orchestration service. + +## User Guide + +Find the Porch User Guide in a dedicated [document](../../site/guides/porch-user-guide.md). + +## Open Issues/Questions + +### Deployment Rollouts & Orchestration + +__Not Yet Resolved__ + +Cross-cluster rollouts and orchestration of deployment activity. For example, +a package is deployed by Config Sync in cluster A and, only on success, the same +(or a different) package is deployed by Config Sync in cluster B. + +## Alternatives Considered + +### gRPC API + +We considered the use of [gRPC]() for the Porch API.
The primary advantages of +implementing Porch as an extension Kubernetes apiserver are: +* customers won't have to open another port to their Kubernetes cluster and can + reuse their existing infrastructure +* customers can likewise reuse existing, familiar, Kubernetes tooling ecosystem diff --git a/docs/08-package-variant.md b/docs/08-package-variant.md new file mode 100644 index 00000000..1ebd5290 --- /dev/null +++ b/docs/08-package-variant.md @@ -0,0 +1,1373 @@ +# Package Variant Controller + +* Author(s): @johnbelamaric, @natasha41575 +* Approver: @mortent + +## Why + +When deploying workloads across large fleets of clusters, it is often necessary +to modify the workload configuration for a specific cluster. Additionally, those +workloads may evolve over time with security or other patches that require +updates. [Configuration as Data](06-config-as-data.md) in general and [Package +Orchestration](07-package-orchestration.md) in particular can assist in this. +However, they are still centered around manual, one-by-one hydration and +configuration of a workload. + +This proposal introduces concepts and a set of resources for automating the +creation and lifecycle management of package variants. 
These are designed to +address several different dimensions of scalability: +- Number of different workloads for a given cluster +- Number of clusters across which those workloads are deployed +- Different types or characteristics of those clusters +- Complexity of the organizations deploying those workloads +- Changes to those workloads over time + +## See Also +- [Package Orchestration](07-package-orchestration.md) +- [#3347](https://github.com/GoogleContainerTools/kpt/issues/3347) Bulk package + creation +- [#3243](https://github.com/GoogleContainerTools/kpt/issues/3243) Support bulk + package upgrades +- [#3488](https://github.com/GoogleContainerTools/kpt/issues/3488) Porch: + BaseRevision controller aka Fan Out controller - but more +- [Managing Package + Revisions](https://docs.google.com/document/d/1EzUUDxLm5jlEG9d47AQOxA2W6HmSWVjL1zqyIFkqV1I/edit?usp=sharing) +- [Porch UpstreamPolicy Resource + API](https://docs.google.com/document/d/1OxNon_1ri4YOqNtEQivBgeRzIPuX9sOyu-nYukjwN1Q/edit?usp=sharing&resourcekey=0-2nDYYH5Kw58IwCatA4uDQw) + +## Core Concepts + +For this solution, "workloads" are represented by packages. "Package" is a more +general concept, being an arbitrary bundle of resources, and therefore is +sufficient to solve the originally stated problem. + +The basic idea here is to introduce a PackageVariant resource that manages the +derivation of a variant of a package from the original source package, and to +manage the evolution of that variant over time. This effectively automates the +human-centered process for variant creation one might use with `kpt`: +1. Clone an upstream package locally +1. Make changes to the local package, setting values in resources and + executing KRM functions +1. Push the package to a new repository and tag it as a new version + +Similarly, PackageVariant can manage the process of updating a package when a +new version of the upstream package is published. 
In the human-centered +workflow, a user would use `kpt pkg update` to pull in changes to their +derivative package. When using a PackageVariant resource, the change would be +made to the upstream specification in the resource, and the controller would +propose a new Draft package reflecting the outcome of `kpt pkg update`. + +By automating this process, we open up the possibility of performing systematic +changes that tie back to our different dimensions of scalability. We can use +data about the specific variant we are creating to look up additional context in +the Porch cluster, and copy that information into the variant. That context is a +well-structured resource, not simply key/value pairs. KRM functions within the +package can interpret the resource, modifying other resources in the package +accordingly. The context can come from multiple sources that vary differently +along those dimensions of scalability. For example, one piece of information may +vary by region, another by individual site, another by cloud provider, and yet +another based on whether we are deploying to development, staging, or production. +By utilizing resources in the Porch cluster as our input model, we can represent +this complexity in a manageable model that is reused across many packages, +rather than scattered in package-specific templates or key/value pairs without +any structure. KRM functions, also reused across packages but configured as +needed for the specific package, are used to interpret the resources within the +package. This decouples authoring of the packages, creation of the input model, +and deploy-time use of that input model within the packages, allowing those +activities to be performed by different teams or organizations. + +We refer to the mechanism described above as *configuration injection*. It +enables dynamic, context-aware creation of variants. Another way to think about +it is as a continuous reconciliation, much like other Kubernetes controllers.
In +this case, the inputs are a parent package `P` and a context `C` (which may be a +collection of many independent resources), with the output being the derived +package `D`. When a new version of `C` is created by updates to in-cluster +resources, we get a new revision of `D`, customized based upon the updated +context. Similarly, the user (or an automation) can monitor for new versions of +`P`; when one arrives, the PackageVariant can be updated to point to that new +version, resulting in a newly proposed Draft of `D`, updated to reflect the +upstream changes. This will be explained in more detail below. + +This proposal also introduces a way to "fan out", or create multiple +PackageVariant resources declaratively based upon a list or selector, with the +PackageVariantSet resource. This is combined with the injection mechanism to +enable generation of large sets of variants that are specialized to a particular +target repository, cluster, or other resource. + +## Basic Package Cloning + +The PackageVariant resource controls the creation and lifecycle of a variant +of a package. That is, it defines the original (upstream) package, the new +(downstream) package, and the changes (mutations) that need to be made to +transform the upstream into the downstream. It also allows the user to specify +policies around adoption, deletion, and update of package revisions that are +under the control of the package variant controller. + +The simple clone operation is shown in *Figure 1*. + +| ![Figure 1: Basic Package Cloning](packagevariant-clone.png) | ![Legend](packagevariant-legend.png) | +| :---: | :---: | +| *Figure 1: Basic Package Cloning* | *Legend* | + + +Note that *proposal* and *approval* are not handled by the package variant +controller. Those are left to humans or other controllers. The exception is the +proposal of deletion (there is no concept of a "Draft" deletion), which the +package variant controller will do, depending upon the specified deletion policy.
+ +### PackageRevision Metadata + +The package variant controller utilizes Porch APIs. This means that it is not +just doing a `clone` operation, but in fact creating a Porch PackageRevision +resource. In particular, that resource can contain Kubernetes metadata that is +not part of the package as stored in the repository. + +Some of that metadata is necessary for the management of the PackageRevision +by the package variant controller - for example, the owner reference indicating +which PackageVariant created the PackageRevision. These are not under the user's +control. However, the PackageVariant resource does make the annotations and +labels of the PackageRevision available as values that may be controlled +during the creation of the PackageRevision. This can assist in additional +automation workflows. + +## Introducing Variance +Just cloning is not that interesting, so the PackageVariant resource also +allows you to control various ways of mutating the original package to create +the variant. + +### Package Context[^porch17] +Every kpt package that is fetched with `--for-deployment` will contain a +ConfigMap called `kptfile.kpt.dev`. Analogously, when Porch creates a package +in a deployment repository, it will create this ConfigMap, if it does not +already exist. Kpt (or Porch) will automatically add a key `name` to the +ConfigMap data, with the value of the package name. This ConfigMap can then +be used as input to functions in the kpt function pipeline. + +This process holds true for package revisions created via the package variant +controller as well. Additionally, the author of the PackageVariant resource +can specify additional key-value pairs to insert into the package +context, as shown in *Figure 2*. + +| ![Figure 2: Package Context Mutation](packagevariant-context.png) | +| :---: | +| *Figure 2: Package Context Mutation* | + +While this is convenient, it can be easily abused, leading to +over-parameterization.
The preferred approach is configuration injection, as +described below, since it allows inputs to adhere to a well-defined, reusable +schema, rather than simple key/value pairs. + +### Kptfile Function Pipeline Editing[^porch18] +In the manual workflow, one of the ways we edit packages is by running KRM +functions imperatively. PackageVariant offers a similar capability, by +allowing the user to add functions to the beginning of the downstream package +`Kptfile` mutators pipeline. These functions will then execute before the +functions present in the upstream pipeline. It is not exactly the same as +running functions imperatively, because they will also be run in every +subsequent execution of the downstream package function pipeline. But it can +achieve the same goals. + +For example, consider an upstream package that includes a Namespace resource. +In many organizations, the deployer of the workload may not have the permissions +to provision cluster-scoped resources like namespaces. This means that they +would not be able to use this upstream package without removing the Namespace +resource (assuming that they only have access to a pipeline that deploys with +constrained permissions). By adding a function that removes Namespace resources, +and a call to `set-namespace`, they can take advantage of the upstream package. + +Similarly, the Kptfile pipeline editing feature provides an easy mechanism for +the deployer to create and set the namespace if their downstream package +application pipeline allows it, as seen in *Figure 3*.[^setns] + +| ![Figure 3: Kptfile Function Pipeline Editing](packagevariant-function.png) | +| :---: | +| *Figure 3: Kptfile Function Pipeline Editing* | + +### Configuration Injection[^porch18] + +Adding values to the package context or functions to the pipeline works +for configuration that is under the control of the creator of the PackageVariant +resource.
However, in more advanced use cases, we may need to specialize the +package based upon other contextual information. This particularly comes into +play when the user deploying the workload does not have direct control over the +context in which it is being deployed. For example, one part of the organization +may manage the infrastructure - that is, the cluster in which we are deploying +the workload - and another part the actual workload. We would like to be able to +pull in inputs specified by the infrastructure team automatically, based on the +cluster to which we are deploying the workload, or perhaps the region in which +that cluster is deployed. + +To facilitate this, the package variant controller can "inject" configuration +directly into the package. This means it will use information specific to this +instance of the package to look up a resource in the Porch cluster, and copy that +information into the package. Of course, the package has to be ready to receive +this information. So, there is a protocol for facilitating this dance: +- Packages may contain resources annotated with `kpt.dev/config-injection` +- Often, these will also be `config.kubernetes.io/local-config` resources, as + they are likely just used by local functions as input. But this is not + mandatory. +- The package variant controller will look for any resource in the Kubernetes + cluster matching the Group, Version, and Kind of the package resource, and + satisfying the *injection selector*. +- The package variant controller will copy the `spec` field from the matching + in-cluster resource to the in-package resource, or the `data` field in the + case of a ConfigMap. + +| ![Figure 4: Configuration Injection](packagevariant-config-injection.png) | +| :---: | +| *Figure 4: Configuration Injection* | + + +Note that because we are injecting data *from the Kubernetes cluster*, we can +also monitor that data for changes.
For each resource we inject, the package +variant controller will establish a Kubernetes "watch" on the resource (or +perhaps on the collection of such resources). A change to that resource will +result in a new Draft package with the updated configuration injected. + +There are a number of additional details that will be described in the detailed +design below, along with the specific API definition. + +## Lifecycle Management + +### Upstream Changes +The package variant controller allows you to specify a specific upstream +package revision to clone, or you can specify a floating tag[^notimplemented]. + +If you specify a specific upstream revision, then the downstream will not be +changed unless the PackageVariant resource itself is modified to point to a new +revision. That is, the user must edit the PackageVariant, and change the +upstream package reference. When that is done, the package variant controller +will update any existing Draft package under its ownership by doing the +equivalent of a `kpt pkg update` to update the downstream to be based upon +the new upstream revision. If a Draft does not exist, then the package variant +controller will create a new Draft based on the current published downstream, +and apply the `kpt pkg update`. This updated Draft must then be proposed and +approved like any other package change. + +If a floating tag is used, then explicit modification of the PackageVariant is +not needed. Rather, when the floating tag is moved to a new tagged revision of +the upstream package, the package variant controller will notice and +automatically propose an update to that revision. For example, the upstream +package author may designate three floating tags: stable, beta, and alpha. The +upstream package author can move these tags to specific revisions, and any +PackageVariant resource tracking them will propose updates to their downstream +packages.
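The pinned-revision flow described above can be summarized as a small decision function. This is a minimal sketch, not the controller's actual code: `planUpstreamUpdate` and its parameters are hypothetical names, and the returned strings merely describe the steps.

```go
package main

import "fmt"

// planUpstreamUpdate lists the steps taken when the PackageVariant points at
// wantRevision while the downstream is based on currentRevision.
// All names here are hypothetical; the steps are described as plain strings.
func planUpstreamUpdate(wantRevision, currentRevision string, hasDraft bool) []string {
	if wantRevision == currentRevision {
		// The downstream is already based on the requested upstream revision.
		return nil
	}
	var steps []string
	if !hasDraft {
		// No Draft under the controller's ownership: start from the current
		// published downstream.
		steps = append(steps, "create Draft from the current published downstream")
	}
	steps = append(steps, "apply the equivalent of `kpt pkg update` to "+wantRevision)
	return steps
}

func main() {
	for _, step := range planUpstreamUpdate("v2", "v1", false) {
		fmt.Println(step)
	}
}
```

In both branches the result is a Draft; proposing and approving it is left to humans or other controllers.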
+ +### Adoption and Deletion Policies +When a PackageVariant resource is created, it will have a particular +repository and package name as the downstream. The adoption policy controls +whether the package variant controller takes over an existing package with that +name, in that repository. + +Analogously, when a PackageVariant resource is deleted, a decision must be +made about whether or not to delete the downstream package. This is controlled +by the deletion policy. + +## Fan Out of Variant Generation[^pvsimpl] + +When used with a single package, the package variant controller mostly helps us +handle the time dimension - producing new versions of a package as the upstream +changes, or as injected resources are updated. It can also be useful for +automating common, systematic changes made when bringing an external package +into an organization, or an organizational package into a team repository. + +That is useful, but not extremely compelling by itself. More interesting is when +we use PackageVariant as a primitive for automations that act on other +dimensions of scale. That means writing controllers that emit PackageVariant +resources. For example, we can create a controller that instantiates a +PackageVariant for each developer in our organization, or we can create +a controller to manage PackageVariants across environments. The ability to not +only clone a package, but make systematic changes to that package enables +flexible automation. + +Workload controllers in Kubernetes are a useful analogy. In Kubernetes, we have +different workload controllers such as Deployment, StatefulSet, and DaemonSet. +Ultimately, all of these result in Pods; however, the decisions about what Pods +to create, how to schedule them across Nodes, how to configure those Pods, and +how to manage those Pods as changes happen are very different with each workload +controller. 
Similarly, we can build different controllers to handle different +ways in which we want to generate PackageRevisions. The PackageVariant +resource provides a convenient primitive for all of those controllers, allowing +them to leverage a range of well-defined operations to mutate the packages as +needed. + +A common need is the ability to generate many variants of a package based on +a simple list of some entity. Some examples include generating package variants +to spin up development environments for each developer in an organization; +instantiating the same package, with slight configuration changes, across a +fleet of clusters; or instantiating some package per customer. + +The package variant set controller is designed to fill this common need. This +controller consumes PackageVariantSet resources, and outputs PackageVariant +resources. The PackageVariantSet defines: +- the upstream package +- targeting criteria +- a template for generating one PackageVariant per target + +Three types of targeting are supported: +- An explicit list of repositories and package names +- A label selector for Repository objects +- An arbitrary object selector + +Rules for generating a PackageVariant are associated with a list of targets +using a template. That template can have explicit values for various +PackageVariant fields, or it can use [Common Expression Language +(CEL)](https://github.com/google/cel-go) expressions to specify the field +values. + +*Figure 5* shows an example of creating PackageVariant resources based upon an +explicit list of repositories. In this example, for the `cluster-01` and +`cluster-02` repositories, no template is defined for the resulting PackageVariants; +they simply take the defaults. However, for `cluster-03`, a template is defined +to change the downstream package name to `bar`.
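The fan-out in that example can be sketched in Go as follows. The `target` and `variant` types and the `fanOut` function are hypothetical simplifications, not the PackageVariantSet API. The upstream package name `foo` is an illustrative choice, and the sketch assumes that the default downstream package name is the upstream package name.

```go
package main

import "fmt"

// target is a hypothetical, simplified targeting entry: a downstream
// repository plus an optional override for the downstream package name.
type target struct {
	Repo        string
	PackageName string // empty means "take the default"
}

// variant stands in for a generated PackageVariant resource.
type variant struct {
	UpstreamPackage   string
	DownstreamRepo    string
	DownstreamPackage string
}

// fanOut emits one variant per target. Assumption: when no override is
// given, the downstream package takes the upstream package name.
func fanOut(upstreamPackage string, targets []target) []variant {
	out := make([]variant, 0, len(targets))
	for _, t := range targets {
		name := t.PackageName
		if name == "" {
			name = upstreamPackage
		}
		out = append(out, variant{
			UpstreamPackage:   upstreamPackage,
			DownstreamRepo:    t.Repo,
			DownstreamPackage: name,
		})
	}
	return out
}

func main() {
	targets := []target{
		{Repo: "cluster-01"},
		{Repo: "cluster-02"},
		{Repo: "cluster-03", PackageName: "bar"},
	}
	for _, v := range fanOut("foo", targets) {
		fmt.Printf("%s -> %s/%s\n", v.UpstreamPackage, v.DownstreamRepo, v.DownstreamPackage)
	}
}
```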
+ +| ![Figure 5: PackageVariantSet with Repository List](packagevariantset-target-list.png) | +| :---: | +| *Figure 5: PackageVariantSet with Repository List* | + +It is also possible to target the same package to a repository more than once, +using different names. This is useful, for example, if the package is used to +provision namespaces and you would like to provision many namespaces in the same +cluster. It is also useful if a repository is shared across multiple clusters. +In *Figure 6*, two PackageVariant resources for creating the `foo` package in +the repository `cluster-01` are generated, one for each listed package name. +Since no `packageNames` field is listed for `cluster-02`, only one instance is +created for that repository. + +| ![Figure 6: PackageVariantSet with Package List](packagevariantset-target-list-with-packages.png) | +| :---: | +| *Figure 6: PackageVariantSet with Package List* | + +*Figure 7* shows an example that combines a repository label selector with +configuration injection that varies based upon the target. The template for the +PackageVariant includes a CEL expression for one of the injectors, so that +the injection varies systematically based upon attributes of the target. + +| ![Figure 7: PackageVariantSet with Repository Selector](packagevariantset-target-repo-selector.png) | +| :---: | +| *Figure 7: PackageVariantSet with Repository Selector* | + +## Detailed Design + +### PackageVariant API + +The Go types below define the `PackageVariantSpec`.
+ +```go +type PackageVariantSpec struct { + Upstream *Upstream `json:"upstream,omitempty"` + Downstream *Downstream `json:"downstream,omitempty"` + + AdoptionPolicy AdoptionPolicy `json:"adoptionPolicy,omitempty"` + DeletionPolicy DeletionPolicy `json:"deletionPolicy,omitempty"` + + Labels map[string]string `json:"labels,omitempty"` + Annotations map[string]string `json:"annotations,omitempty"` + + PackageContext *PackageContext `json:"packageContext,omitempty"` + Pipeline *kptfilev1.Pipeline `json:"pipeline,omitempty"` + Injectors []InjectionSelector `json:"injectors,omitempty"` +} + +type Upstream struct { + Repo string `json:"repo,omitempty"` + Package string `json:"package,omitempty"` + Revision string `json:"revision,omitempty"` +} + +type Downstream struct { + Repo string `json:"repo,omitempty"` + Package string `json:"package,omitempty"` +} + +type PackageContext struct { + Data map[string]string `json:"data,omitempty"` + RemoveKeys []string `json:"removeKeys,omitempty"` +} + +type InjectionSelector struct { + Group *string `json:"group,omitempty"` + Version *string `json:"version,omitempty"` + Kind *string `json:"kind,omitempty"` + Name string `json:"name"` +} + +``` + +#### Basic Spec Fields + +The `Upstream` and `Downstream` fields specify the source package and +destination repository and package name. The `Repo` fields refer to the names of +Porch Repository resources in the same namespace as the PackageVariant resource. +The `Downstream` does not contain a revision, because the package variant +controller will only create Draft packages. The `Revision` of the eventual +PackageRevision resource will be determined by Porch at the time of approval. + +The `Labels` and `Annotations` fields list metadata to include on the created +PackageRevision. These values are set *only* at the time a Draft package is +created. They are ignored for subsequent operations, even if the PackageVariant +itself has been modified.
This means users are free to change these values on +the PackageRevision; the package variant controller will not touch them again. + +`AdoptionPolicy` controls how the package variant controller behaves if it finds +an existing PackageRevision Draft matching the `Downstream`. If the +`AdoptionPolicy` is `adoptExisting`, then the package variant controller will +take ownership of the Draft, associating it with this PackageVariant. This means +that it will begin to reconcile the Draft, just as if it had created it in the +first place. An `AdoptionPolicy` of `adoptNone` (the default) will simply ignore +any matching Drafts that were not created by the controller. + +`DeletionPolicy` controls how the package variant controller behaves with +respect to PackageRevisions that it has created when the PackageVariant resource +itself is deleted. A value of `delete` (the default) will delete the +PackageRevision (potentially removing it from a running cluster, if the +downstream package has been deployed). A value of `orphan` will remove the owner +references and leave the PackageRevisions in place. + +#### Package Context Injection + +PackageVariant resource authors may specify key-value pairs in the +`spec.packageContext.data` field of the resource. These key-value pairs will be +automatically added to the `data` of the `kptfile.kpt.dev` ConfigMap, if it +exists. + +Specifying the key `name` is invalid and must fail validation of the +PackageVariant. This key is reserved for kpt or Porch to set to the package +name. Similarly, `package-path` is reserved and will result in an error. + +The `spec.packageContext.removeKeys` field can also be used to specify a list of +keys that the package variant controller should remove from the `data` field of +the `kptfile.kpt.dev` ConfigMap. 
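The package context rules above can be sketched as a pure function over the ConfigMap `data`. This is an illustration under stated assumptions rather than the controller's implementation: `applyPackageContext` and `reservedKeys` are hypothetical names, only `Data` is checked for reserved keys, and letting removal win when a key appears in both `data` and `removeKeys` is an arbitrary choice for the sketch.

```go
package main

import "fmt"

// PackageContext mirrors the type shown in the PackageVariant API.
type PackageContext struct {
	Data       map[string]string
	RemoveKeys []string
}

// reservedKeys are set by kpt or Porch and may not be supplied by the user.
var reservedKeys = []string{"name", "package-path"}

// applyPackageContext validates pc and returns an updated copy of the
// kptfile.kpt.dev ConfigMap data. Hypothetical helper for illustration.
func applyPackageContext(cmData map[string]string, pc PackageContext) (map[string]string, error) {
	for _, k := range reservedKeys {
		if _, ok := pc.Data[k]; ok {
			return nil, fmt.Errorf("key %q is reserved and may not be specified", k)
		}
	}
	out := make(map[string]string, len(cmData)+len(pc.Data))
	for k, v := range cmData {
		out[k] = v // keep existing keys, including the package name
	}
	for k, v := range pc.Data {
		out[k] = v // add or overwrite user-specified pairs
	}
	for _, k := range pc.RemoveKeys {
		delete(out, k) // assumption: removal wins if a key is in both lists
	}
	return out, nil
}

func main() {
	data, err := applyPackageContext(
		map[string]string{"name": "my-pkg", "old-key": "x"},
		PackageContext{
			Data:       map[string]string{"region": "us-east1"},
			RemoveKeys: []string{"old-key"},
		},
	)
	fmt.Println(data, err)
}
```

Note that the function is stateless: a key that was added earlier and is no longer listed in `Data` stays in place unless it is named in `RemoveKeys`.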
+ +When creating or updating a package, the package variant controller will ensure +that: +- The `kptfile.kpt.dev` ConfigMap exists, failing if not +- All of the key-value pairs in `spec.packageContext.data` exist in the `data` + field of the ConfigMap. +- None of the keys listed in `spec.packageContext.removeKeys` exist in the + ConfigMap. + +Note that if a user adds a key via PackageVariant, then changes the +PackageVariant to no longer add that key, it will NOT be removed automatically, +unless the user also lists the key in the `removeKeys` list. This avoids the +need to track which keys were added by PackageVariant. + +Similarly, if a user manually adds a key in the downstream that is also listed +in the `removeKeys` field, the package variant controller will remove that key +the next time it needs to update the downstream package. There will be no +attempt to coordinate "ownership" of these keys. + +If the controller is unable to modify the ConfigMap for some reason, this is +considered an error and should prevent generation of the Draft. This will result +in the condition `Ready` being set to `False`. + +#### Kptfile Function Pipeline Editing + +PackageVariant resource creators may specify a list of KRM functions to add to +the beginning of the Kptfile's pipeline. These functions are listed in the field +`spec.pipeline`, which is a +[Pipeline](https://github.com/GoogleContainerTools/kpt/blob/cf1f326486214f6b4469d8432287a2fa705b48f5/pkg/api/kptfile/v1/types.go#L236), +just as in the Kptfile. The user can therefore prepend both `validators` and +`mutators`. + +Functions added in this way are always added to the *beginning* of the Kptfile +pipeline. In order to enable management of the list on subsequent +reconciliations, functions added by the package variant controller will use the +`Name` field of the +[Function](https://github.com/GoogleContainerTools/kpt/blob/cf1f326486214f6b4469d8432287a2fa705b48f5/pkg/api/kptfile/v1/types.go#L283). 
+In the Kptfile, each function will be named as the dot-delimited concatenation +of `PackageVariant`, the name of the PackageVariant resource, the function name +as specified in the pipeline of the PackageVariant resource (if present), and +the positional location of the function in the array. + +For example, if the PackageVariant resource contains: + +```yaml +apiVersion: config.porch.kpt.dev/v1alpha1 +kind: PackageVariant +metadata: + namespace: default + name: my-pv +spec: + ... + pipeline: + mutators: + - image: gcr.io/kpt-fn/set-namespace:v0.1 + configMap: + namespace: my-ns + name: my-func + - image: gcr.io/kpt-fn/set-labels:v0.1 + configMap: + app: foo +``` + +Then the resulting Kptfile will have these two entries prepended to its +`mutators` list: + +```yaml + pipeline: + mutators: + - image: gcr.io/kpt-fn/set-namespace:v0.1 + configMap: + namespace: my-ns + name: PackageVariant.my-pv.my-func.0 + - image: gcr.io/kpt-fn/set-labels:v0.1 + configMap: + app: foo + name: PackageVariant.my-pv..1 +``` + +During subsequent reconciliations, this allows the controller to identify the +functions within its control, remove them all, and re-add them based on its +updated content. By including the PackageVariant name, we enable chains of +PackageVariants to add functions, so long as the user is careful about their +choice of resource names and avoids conflicts. + +If the controller is unable to modify the Pipeline for some reason, this is +considered an error and should prevent generation of the Draft. This will result +in the condition `Ready` being set to `False`. + +#### Configuration Injection Details + +As described [above](#configuration-injection), configuration injection is a +process whereby in-package resources are matched to in-cluster resources, and +the `spec` of the in-cluster resources is copied to the in-package resource. 
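The matching and copy step summarized above can be sketched in Go. The `resource` type and the `inject` function are hypothetical simplifications of KRM objects, not Porch code. The sketch matches on GVK and a selector name only, and it shares the `spec` map rather than deep-copying it.

```go
package main

import "fmt"

// resource is a hypothetical, heavily simplified KRM object.
type resource struct {
	Group, Version, Kind string
	Name                 string
	Spec                 map[string]interface{}
	Data                 map[string]string // only meaningful for ConfigMaps
}

// inject looks for an in-cluster object with the same GVK as the injection
// point and the name given by the selector. On a match it overwrites the
// point's data (ConfigMap) or spec (everything else) and reports success.
func inject(point *resource, selectorName string, cluster []resource) bool {
	for _, obj := range cluster {
		if obj.Name != selectorName {
			continue
		}
		if obj.Group != point.Group || obj.Version != point.Version || obj.Kind != point.Kind {
			continue
		}
		if point.Group == "" && point.Kind == "ConfigMap" {
			data := make(map[string]string, len(obj.Data))
			for k, v := range obj.Data {
				data[k] = v
			}
			point.Data = data
		} else {
			// A real implementation would deep-copy; the sketch shares the map.
			point.Spec = obj.Spec
		}
		return true
	}
	return false
}

func main() {
	point := &resource{Version: "v1", Kind: "ConfigMap", Name: "service-endpoints",
		Data: map[string]string{"dns": "default.example"}}
	cluster := []resource{{Version: "v1", Kind: "ConfigMap", Name: "useast1-service-endpoints",
		Data: map[string]string{"dns": "useast1.example"}}}
	ok := inject(point, "useast1-service-endpoints", cluster)
	fmt.Println(ok, point.Name, point.Data["dns"])
}
```

The in-package name is left untouched; only the content is replaced, so functions in the package can keep referring to the resource by its original name.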
+ +Configuration injection is controlled by a combination of in-package resources +with annotations, and *injectors* (also known as *injection selectors*) defined +on the PackageVariant resource. Package authors control the injection points +they allow in their packages, by flagging specific resources as *injection +points* with an annotation. Creators of the PackageVariant resource specify how +to map in-cluster resources to those injection points using the injection +selectors. Injection selectors are defined in the `spec.injectors` field of the +PackageVariant. This field is an ordered array of structs containing a GVK +(group, version, kind) tuple as separate fields, and name. Only the name is +required. To identify a match, all fields present must match the in-cluster +object, and all *GVK* fields present must match the in-package resource. In +general the name will not match the in-package resource; this is discussed in +more detail below. + +The annotations, along with the GVK of the annotated resource, allow a package +to "advertise" the injections it can accept and understand. These injection +points effectively form a configuration API for the package, and the injection +selectors provide a way for the PackageVariant author to specify the inputs +for those APIs from the possible values in the management cluster. If we define +those APIs carefully, they can be used across many packages; since they are +KRM resources, we can apply versioning and schema validation to them as well. +This creates a more maintainable, automatable set of APIs for package +customization than simple key/value pairs. + +As an example, we may define a GVK that contains service endpoints that many +applications use. In each application package, we would then include an instance +of that resource, say called "service-endpoints", and configure a function to +propagate the values from that resource to others within our package. 
As those +endpoints may vary by region, in our Porch cluster we can create an instance of +this GVK for each region: "useast1-service-endpoints", +"useast2-service-endpoints", "uswest1-service-endpoints", etc. When we +instantiate the PackageVariant for a cluster, we want to inject the resource +corresponding to the region in which the cluster exists. Thus, for each cluster +we will create a PackageVariant resource pointing to the upstream package, but +with injection selector name values that are specific to the region for that +cluster. + +It is important to realize that the name of the in-package resource and the +in-cluster resource need not match. In fact, it would be an unusual coincidence if +they did match. The names in the package are the same across PackageVariants +using that upstream, but we want to inject different resources for each such +PackageVariant. We also do not want to change the name in the package, because +it likely has meaning within the package and will be used by functions in the +package. Also, different owners control the names of the in-package and +in-cluster resources. The names in the package are in the control of the package +author. The names in the cluster are in the control of whoever populates the +cluster (for example, some infrastructure team). The selector is the glue +between them, and is in the control of the PackageVariant resource creator. + +The GVK, on the other hand, has to be the same for the in-package resource and +the in-cluster resource, because it tells us the API schema for the resource. +Also, the namespace of the in-cluster object needs to be the same as that of the +PackageVariant resource, or we could leak resources from namespaces to which +our PackageVariant user does not have access. + +With that understanding, the injection process works as follows: + +1.
The controller will examine all in-package resources, looking for those with + an annotation named `kpt.dev/config-injection`, with one of the following + values: `required` or `optional`. We will call these "injection points". It + is the responsibility of the package author to define these injection points, + and to specify which are required and which are optional. Optional injection + points are a way of specifying default values. +1. For each injection point, a condition will be created *in the + downstream PackageRevision*, with ConditionType set to the dot-delimited + concatenation of `config.injection`, with the in-package resource kind and + name, and the value set to `False`. Note that since the package author + controls the name of the resource, kind and name are sufficient to + disambiguate the injection point. We will call this ConditionType the + "injection point ConditionType". +1. For each required injection point, the injection point ConditionType will + be added to the PackageRevision `readinessGates` by the package variant + controller. Optional injection points' ConditionTypes must not be added to + the `readinessGates` by the package variant controller, but humans or other + actors may do so at a later date, and the package variant controller should + not remove them on subsequent reconciliations. Also, this relies upon + `readinessGates` gating publishing the package to a *deployment* repository, + but not gating publishing to a blueprint repository. +1. The injection processing will proceed as follows. For each injection point: + - The controller will identify all in-cluster objects in the same + namespace as the PackageVariant resource, with GVK matching the injection + point (the in-package resource). 
If the controller is unable to load these + objects (e.g., there are none and the CRD is not installed), the injection + point ConditionType will be set to `False`, with a message indicating + the error, and processing proceeds to the next injection point. Note that + for `optional` injection this may be an acceptable outcome, so it does not + interfere with overall generation of the Draft. + - The controller will look through the list of injection selectors in + order, checking whether any of the in-cluster objects match the selector. If + so, that in-cluster object is selected, and processing of the list of + injection selectors stops. Note that the namespace is set based upon the + PackageVariant resource, the GVK is set based upon the in-package resource, + and all selectors require name. Thus, at most one match is possible for any + given selector. Also note that *all fields present in the selector* must + match the in-cluster resource, and only the *GVK fields present in the + selector* must match the in-package resource. + - If no in-cluster object is selected, the injection point ConditionType will + be set to `False` with a message that no matching in-cluster resource was + found, and processing proceeds to the next injection point. + - If a matching in-cluster object is selected, then it is injected as + follows: + - For ConfigMap resources, the `data` field from the in-cluster resource is + copied to the `data` field of the in-package resource (the injection + point), overwriting it. + - For other resource types, the `spec` field from the in-cluster resource + is copied to the `spec` field of the in-package resource (the injection + point), overwriting it. + - An annotation with name `kpt.dev/injected-resource-name` and value set to + the name of the in-cluster resource is added (or overwritten) in the + in-package resource.
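The selection rules above can be sketched in a few lines of Go. This is a minimal illustration, not the package variant controller's code: the `GVK`, `Selector`, and `Object` types and the `selectObject` and `conditionType` helpers are hypothetical stand-ins, and the in-cluster objects are assumed to have been filtered to the PackageVariant's namespace already.

```go
package main

import "fmt"

// GVK identifies a resource type (simplified stand-in, not the Porch type).
type GVK struct{ Group, Version, Kind string }

// Selector models one entry of spec.injectors: Name is required, the GVK
// fields are optional (nil means "not present in the selector").
type Selector struct {
	Group, Version, Kind *string
	Name                 string
}

// Object is a simplified in-cluster object, assumed to live in the same
// namespace as the PackageVariant.
type Object struct {
	GVK  GVK
	Name string
}

func str(s string) *string { return &s }

// fieldOK reports whether an optional selector field is absent or equal to got.
func fieldOK(want *string, got string) bool { return want == nil || *want == got }

// selectObject walks the selectors in order and returns the first in-cluster
// object that matches, for an injection point with the given GVK.
func selectObject(point GVK, selectors []Selector, objects []Object) (Object, bool) {
	for _, s := range selectors {
		// GVK fields present in the selector must match the in-package resource.
		if !fieldOK(s.Group, point.Group) || !fieldOK(s.Version, point.Version) || !fieldOK(s.Kind, point.Kind) {
			continue
		}
		for _, o := range objects {
			// Candidates share the injection point's GVK; the name comes from the selector.
			if o.GVK == point && o.Name == s.Name {
				return o, true
			}
		}
	}
	return Object{}, false
}

// conditionType builds the dot-delimited injection point ConditionType.
func conditionType(kind, name string) string {
	return "config.injection." + kind + "." + name
}

func main() {
	one := GVK{"example.com", "v1", "KindOne"}
	two := GVK{"example.com", "v1", "KindTwo"}
	objects := []Object{{one, "foo"}, {one, "bar"}, {two, "foo"}, {two, "bar"}}
	selectors := []Selector{{Kind: str("KindOne"), Name: "foo"}, {Kind: str("KindTwo"), Name: "bar"}}
	for _, point := range []GVK{one, two} {
		o, ok := selectObject(point, selectors, objects)
		fmt.Println(conditionType(point.Kind, "endpoints"), ok, o.Name)
	}
}
```

Because every selector carries a name and the namespace and GVK are fixed by context, the inner loop can find at most one object per selector, which is why the first match ends the search.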
+ +If the overall injection cannot be completed for some reason, or if any of +the below problems exist in the upstream package, it is considered an error and +should prevent generation of the Draft: + - There is a resource annotated as an injection point but having an invalid + annotation value (i.e., other than `required` or `optional`). + - There are ambiguous condition types due to conflicting GVK and name + values. These must be disambiguated in the upstream package, if so. + +This will result in the condition `Ready` being set to `False`. + +Note that whether or not all `required` injection points are fulfilled does not +affect the *PackageVariant* conditions, only the *PackageRevision* conditions. + +**A Further Note on Selectors** + +Note that by allowing the use of GVK, not just name, in the selector, more +precision in selection is enabled. This is a way to constrain the injections +that will be done. That is, if the package has 10 different objects with +`config-injection` annotation, the PackageVariant could say it only wants to +replace certain GVKs, allowing better control. + +Consider, for example, if the cluster contains these resources: + +- GVK1 foo +- GVK1 bar +- GVK2 foo +- GVK2 bar + +If we could only define injection selectors based upon name, it would be +impossible to ever inject one GVK with `foo` and another with `bar`. Instead, +by using GVK, we can accomplish this with a list of selectors like: + + - GVK1 foo + - GVK2 bar + +That said, often name will be sufficiently unique when combined with the +in-package resource GVK, and so making the selector GVK optional is more +convenient. This allows a single injector to apply to multiple injection points +with different GVKs. + +#### Order of Mutations + +During creation, the first thing the controller does is clone the upstream +package to create the downstream package.
+ +For update, first note that changes to the downstream PackageRevision can be +triggered for several different reasons: + +1. The PackageVariant resource is updated, which could change any of the options + for introducing variance, or could also change the upstream package revision + referenced. +1. A new revision of the upstream package has been selected due to a floating + tag change, or due to a force retagging of the upstream. +1. An injected in-cluster object is updated. + +The downstream PackageRevision may have been updated by humans or other +automation actors since creation, so we cannot simply recreate the downstream +PackageRevision from scratch when one of these changes happens. Instead, the +controller must maintain the later edits by doing the equivalent of a `kpt pkg +update`, in the case of changes to the upstream for any reason. Any other +changes require reapplication of the PackageVariant functionality. With that +understanding, we can see that the controller will perform mutations on the +downstream package in this order, for both creation and update: + +1. Create (via Clone) or Update (via `kpt pkg update` equivalent) + - This is done by the Porch server, not by the package variant controller + directly. + - This means that Porch will run the Kptfile pipeline after clone or + update. +1. Package variant controller applies configured mutations + - Package Context Injections + - Kptfile KRM Function Pipeline Additions/Changes + - Config Injection +1. Package variant controller saves the PackageRevision and + PackageRevisionResources. + - Porch server executes the Kptfile pipeline + +The package variant controller mutations edit resources (including the Kptfile), +based on the contents of the PackageVariant and the injected in-cluster +resources, but cannot affect one another. The results of those mutations +throughout the rest of the package are materialized by the execution of the +Kptfile pipeline during the save operation.
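As a rough illustration of the ordering described above, the sketch below lists the steps of one reconciliation as plain strings. The function name `mutationOrder` and the step labels are made up for this example; they only restate the sequence from the text and do not correspond to real controller or Porch server APIs.

```go
package main

import "fmt"

// mutationOrder returns the ordered steps of one reconciliation. The only
// difference between creation and update is the first step.
func mutationOrder(downstreamExists bool) []string {
	first := "porch: clone upstream"
	if downstreamExists {
		first = "porch: update from upstream (kpt pkg update equivalent)"
	}
	return []string{
		first,
		"porch: run Kptfile pipeline",
		"controller: package context injection",
		"controller: Kptfile pipeline additions/changes",
		"controller: config injection",
		"controller: save PackageRevision and PackageRevisionResources",
		"porch: run Kptfile pipeline",
	}
}

func main() {
	for i, step := range mutationOrder(false) {
		fmt.Printf("%d. %s\n", i+1, step)
	}
}
```

The final pipeline run is what propagates the controller's edits through the rest of the package, which is why the three controller mutations do not need to see each other's results.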
+ +#### PackageVariant Status + +PackageVariant sets the following status conditions: + - `Stalled` is set to True if there has been a failure that most likely + requires user intervention. + - `Ready` is set to True if the last reconciliation successfully produced an + up-to-date Draft. + +The PackageVariant resource will also contain a `DownstreamTargets` field, +containing a list of downstream `Draft` and `Proposed` PackageRevisions owned by +this PackageVariant resource, or the latest `Published` PackageRevision if there +are none in `Draft` or `Proposed` state. Typically, there is only a single +Draft, but use of the `adopt` value for `AdoptionPolicy` could result in +multiple Drafts being owned by the same PackageVariant. + +### PackageVariantSet API[^pvsimpl] + +The Go types below define the `PackageVariantSetSpec`. + +```go +// PackageVariantSetSpec defines the desired state of PackageVariantSet +type PackageVariantSetSpec struct { + Upstream *pkgvarapi.Upstream `json:"upstream,omitempty"` + Targets []Target `json:"targets,omitempty"` +} + +type Target struct { + // Exactly one of Repositories, RepositorySelector, and ObjectSelector must be + // populated + // option 1: an explicit list of repositories and package names + Repositories []RepositoryTarget `json:"repositories,omitempty"` + + // option 2: a label selector against a set of repositories + RepositorySelector *metav1.LabelSelector `json:"repositorySelector,omitempty"` + + // option 3: a selector against a set of arbitrary objects + ObjectSelector *ObjectSelector `json:"objectSelector,omitempty"` + + // Template specifies how to generate a PackageVariant from a target + Template *PackageVariantTemplate `json:"template,omitempty"` +} +``` + +At the highest level, a PackageVariantSet is just an upstream, and a list of +targets. For each target, there is a set of criteria for generating a list, and +a set of rules (a template) for creating a PackageVariant from each list entry.
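The "exactly one of" rule on `Target` lends itself to a small validation check. The sketch below uses simplified, self-contained stand-ins for the API types (the real `RepositorySelector` is a `metav1.LabelSelector`), and `validateTarget` is a hypothetical helper. It also assumes that an empty `Repositories` slice counts as "not populated".

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the API types shown above.
type LabelSelector struct{ MatchLabels map[string]string }

type ObjectSelector struct {
	APIVersion, Kind string
	MatchLabels      map[string]string
}

type RepositoryTarget struct {
	Name         string
	PackageNames []string
}

type Target struct {
	Repositories       []RepositoryTarget
	RepositorySelector *LabelSelector
	ObjectSelector     *ObjectSelector
}

// validateTarget enforces that exactly one targeting option is populated.
func validateTarget(t Target) error {
	populated := 0
	if len(t.Repositories) > 0 {
		populated++
	}
	if t.RepositorySelector != nil {
		populated++
	}
	if t.ObjectSelector != nil {
		populated++
	}
	if populated != 1 {
		return errors.New("exactly one of repositories, repositorySelector, and objectSelector must be populated")
	}
	return nil
}

func main() {
	good := Target{Repositories: []RepositoryTarget{{Name: "cluster-01"}}}
	empty := Target{}
	fmt.Println(validateTarget(good), validateTarget(empty))
}
```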
+ +Since `template` is optional, let's start by describing the different types of +targets, and how the criteria in each are used to generate a list that seeds the +PackageVariant resources. + +The `Target` structure must include exactly one of three different ways of +generating the list. The first is a simple list of repositories and package +names for each of those repositories[^repo-pkg-expr]. The package name list is +needed for use cases in which you want to repeatedly instantiate the same +package in a single repository. For example, if a repository represents the +contents of a cluster, you may want to instantiate a namespace package once for +each namespace, with a name matching the namespace. + +This example shows using the `repositories` field: + +```yaml +apiVersion: config.porch.kpt.dev/v1alpha2 +kind: PackageVariantSet +metadata: + namespace: default + name: example +spec: + upstream: + repo: example-repo + package: foo + revision: v1 + targets: + - repositories: + - name: cluster-01 + - name: cluster-02 + - name: cluster-03 + packageNames: + - foo-a + - foo-b + - foo-c + - name: cluster-04 + packageNames: + - foo-a + - foo-b +``` + +In this case, PackageVariant resources are created for each of these pairs of +downstream repositories and package names: + +| Repository | Package Name | +| ---------- | ------------ | +| cluster-01 | foo | +| cluster-02 | foo | +| cluster-03 | foo-a | +| cluster-03 | foo-b | +| cluster-03 | foo-c | +| cluster-04 | foo-a | +| cluster-04 | foo-b | + +All of those PackageVariants have the same upstream. + +The second means of targeting is via a label selector against Porch Repository +objects, along with a list of package names. Those packages will be instantiated +in each matching repository. Just like in the first example, not listing a +package name defaults to one package, with the same name as the upstream +package.
Suppose, for example, we have these four repositories defined in our +Porch cluster: + +| Repository | Labels | +| ---------- | ------------------------------------- | +| cluster-01 | region=useast1, env=prod, org=hr | +| cluster-02 | region=uswest1, env=prod, org=finance | +| cluster-03 | region=useast2, env=prod, org=hr | +| cluster-04 | region=uswest1, env=prod, org=hr | + +If we create a PackageVariantSet with the following `spec`: + +```yaml +spec: + upstream: + repo: example-repo + package: foo + revision: v1 + targets: + - repositorySelector: + matchLabels: + env: prod + org: hr + - repositorySelector: + matchLabels: + region: uswest1 + packageNames: + - foo-a + - foo-b + - foo-c +``` + +then PackageVariant resources will be created with these repository and package +names: + +| Repository | Package Name | +| ---------- | ------------ | +| cluster-01 | foo | +| cluster-03 | foo | +| cluster-04 | foo | +| cluster-02 | foo-a | +| cluster-02 | foo-b | +| cluster-02 | foo-c | +| cluster-04 | foo-a | +| cluster-04 | foo-b | +| cluster-04 | foo-c | + +Finally, the third possibility allows the use of *arbitrary* resources in the +Porch cluster as targeting criteria. The `objectSelector` looks like this: + +```yaml +spec: + upstream: + repo: example-repo + package: foo + revision: v1 + targets: + - objectSelector: + apiVersion: krm-platform.bigco.com/v1 + kind: Team + matchLabels: + org: hr + role: dev +``` + +It works exactly like the repository selector - in fact the repository selector +is equivalent to the object selector with the `apiVersion` and `kind` values set +to point to Porch Repository resources. That is, the repository name comes from +the object name, and the package names come from the listed package names. In +the description of the template, we will see how to derive different repository +names from the objects. 
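The repository selector fan-out can be sketched as a simple expansion over repositories and package names. The code below is an illustration under stated assumptions: `Repo`, `SelectorTarget`, `Pair`, and `expand` are hypothetical names, only `matchLabels` style equality matching is modeled, and the output order follows the input order rather than any guarantee made by the controller.

```go
package main

import "fmt"

// Repo is a simplified Porch Repository: just a name and labels.
type Repo struct {
	Name   string
	Labels map[string]string
}

// SelectorTarget models a repositorySelector target with optional package names.
type SelectorTarget struct {
	MatchLabels  map[string]string
	PackageNames []string
}

// Pair is one (downstream repository, downstream package) list entry.
type Pair struct{ Repo, Package string }

// matches reports whether labels contains every key/value pair in want.
func matches(labels, want map[string]string) bool {
	for k, v := range want {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// expand produces one Pair per matching repository and package name. With no
// package names listed, the upstream package name is used as the default.
func expand(upstreamPkg string, repos []Repo, targets []SelectorTarget) []Pair {
	var out []Pair
	for _, t := range targets {
		names := t.PackageNames
		if len(names) == 0 {
			names = []string{upstreamPkg}
		}
		for _, r := range repos {
			if !matches(r.Labels, t.MatchLabels) {
				continue
			}
			for _, n := range names {
				out = append(out, Pair{Repo: r.Name, Package: n})
			}
		}
	}
	return out
}

func main() {
	repos := []Repo{
		{"cluster-01", map[string]string{"region": "useast1", "env": "prod", "org": "hr"}},
		{"cluster-02", map[string]string{"region": "uswest1", "env": "prod", "org": "finance"}},
		{"cluster-03", map[string]string{"region": "useast2", "env": "prod", "org": "hr"}},
		{"cluster-04", map[string]string{"region": "uswest1", "env": "prod", "org": "hr"}},
	}
	targets := []SelectorTarget{
		{MatchLabels: map[string]string{"env": "prod", "org": "hr"}},
		{MatchLabels: map[string]string{"region": "uswest1"}, PackageNames: []string{"foo-a", "foo-b", "foo-c"}},
	}
	for _, p := range expand("foo", repos, targets) {
		fmt.Println(p.Repo, p.Package)
	}
}
```

Run against the four repositories from the example, this yields the same nine repository and package pairs as the table above. An object selector would follow the same shape, with the list of repositories replaced by the selected objects.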
+ +#### PackageVariant Template + +As previously discussed, the list entries generated by the target criteria +result in PackageVariant entries. If no template is specified, then +PackageVariant defaults are used, along with the downstream repository name and +package name as described in the previous section. The template allows the user +to have control over all of the values in the resulting PackageVariant. The +template API is shown below. + +```go +type PackageVariantTemplate struct { + // Downstream allows overriding the default downstream package and repository name + // +optional + Downstream *DownstreamTemplate `json:"downstream,omitempty"` + + // AdoptionPolicy allows overriding the PackageVariant adoption policy + // +optional + AdoptionPolicy *pkgvarapi.AdoptionPolicy `json:"adoptionPolicy,omitempty"` + + // DeletionPolicy allows overriding the PackageVariant deletion policy + // +optional + DeletionPolicy *pkgvarapi.DeletionPolicy `json:"deletionPolicy,omitempty"` + + // Labels allows specifying the spec.Labels field of the generated PackageVariant + // +optional + Labels map[string]string `json:"labels,omitempty"` + + // LabelExprs allows specifying the spec.Labels field of the generated PackageVariant + // using CEL to dynamically create the keys and values. Entries in this field take precedence over + // those with the same keys that are present in Labels. + // +optional + LabelExprs []MapExpr `json:"labelExprs,omitempty"` + + // Annotations allows specifying the spec.Annotations field of the generated PackageVariant + // +optional + Annotations map[string]string `json:"annotations,omitempty"` + + // AnnotationExprs allows specifying the spec.Annotations field of the generated PackageVariant + // using CEL to dynamically create the keys and values. Entries in this field take precedence over + // those with the same keys that are present in Annotations.
+ // +optional + AnnotationExprs []MapExpr `json:"annotationExprs,omitempty"` + + // PackageContext allows specifying the spec.PackageContext field of the generated PackageVariant + // +optional + PackageContext *PackageContextTemplate `json:"packageContext,omitempty"` + + // Pipeline allows specifying the spec.Pipeline field of the generated PackageVariant + // +optional + Pipeline *PipelineTemplate `json:"pipeline,omitempty"` + + // Injectors allows specifying the spec.Injectors field of the generated PackageVariant + // +optional + Injectors []InjectionSelectorTemplate `json:"injectors,omitempty"` +} + +// DownstreamTemplate is used to calculate the downstream field of the resulting +// package variants. Only one of Repo and RepoExpr may be specified; +// similarly only one of Package and PackageExpr may be specified. +type DownstreamTemplate struct { + Repo *string `json:"repo,omitempty"` + Package *string `json:"package,omitempty"` + RepoExpr *string `json:"repoExpr,omitempty"` + PackageExpr *string `json:"packageExpr,omitempty"` +} + +// PackageContextTemplate is used to calculate the packageContext field of the +// resulting package variants. The plain fields and Exprs fields will be +// merged, with the Exprs fields taking precedence. +type PackageContextTemplate struct { + Data map[string]string `json:"data,omitempty"` + RemoveKeys []string `json:"removeKeys,omitempty"` + DataExprs []MapExpr `json:"dataExprs,omitempty"` + RemoveKeyExprs []string `json:"removeKeyExprs,omitempty"` +} + +// InjectionSelectorTemplate is used to calculate the injectors field of the +// resulting package variants. Exactly one of the Name and NameExpr fields must +// be specified. The other fields are optional. 
+type InjectionSelectorTemplate struct { + Group *string `json:"group,omitempty"` + Version *string `json:"version,omitempty"` + Kind *string `json:"kind,omitempty"` + Name *string `json:"name,omitempty"` + + NameExpr *string `json:"nameExpr,omitempty"` +} + +// MapExpr is used for various fields to calculate map entries. Only one of +// Key and KeyExpr may be specified; similarly only one of Value and ValueExpr +// may be specified. +type MapExpr struct { + Key *string `json:"key,omitempty"` + Value *string `json:"value,omitempty"` + KeyExpr *string `json:"keyExpr,omitempty"` + ValueExpr *string `json:"valueExpr,omitempty"` +} + +// PipelineTemplate is used to calculate the pipeline field of the resulting +// package variants. +type PipelineTemplate struct { + // Validators is used to calculate the pipeline.validators field of the + // resulting package variants. + // +optional + Validators []FunctionTemplate `json:"validators,omitempty"` + + // Mutators is used to calculate the pipeline.mutators field of the + // resulting package variants. + // +optional + Mutators []FunctionTemplate `json:"mutators,omitempty"` +} + +// FunctionTemplate is used in generating KRM function pipeline entries; that +// is, it is used to generate Kptfile Function objects. +type FunctionTemplate struct { + kptfilev1.Function `json:",inline"` + + // ConfigMapExprs allows use of CEL to dynamically create the keys and values in the + // function config ConfigMap. Entries in this field take precedence over those with + // the same keys that are present in ConfigMap. + // +optional + ConfigMapExprs []MapExpr `json:"configMapExprs,omitempty"` +} +``` + +This is a pretty complicated structure. To make it more understandable, the +first thing to notice is that many fields have a plain version, and an `Expr` +version. The plain version is used when the value is static across all the +PackageVariants; the `Expr` version is used when the value needs to vary across +PackageVariants.
+ +Let's consider a simple example. Suppose we have a package for provisioning +namespaces called "base-ns". We want to instantiate this several times in the +`cluster-01` repository. We could do this with this PackageVariantSet: + +```yaml +apiVersion: config.porch.kpt.dev/v1alpha2 +kind: PackageVariantSet +metadata: + namespace: default + name: example +spec: + upstream: + repo: platform-catalog + package: base-ns + revision: v1 + targets: + - repositories: + - name: cluster-01 + packageNames: + - ns-1 + - ns-2 + - ns-3 +``` + +That will produce three PackageVariant resources with the same upstream, all +with the same downstream repo, and each with a different downstream package +name. If we also want to set some labels identically across the packages, we can +do that with the `template.labels` field: + +```yaml +apiVersion: config.porch.kpt.dev/v1alpha2 +kind: PackageVariantSet +metadata: + namespace: default + name: example +spec: + upstream: + repo: platform-catalog + package: base-ns + revision: v1 + targets: + - repositories: + - name: cluster-01 + packageNames: + - ns-1 + - ns-2 + - ns-3 + template: + labels: + package-type: namespace + org: hr +``` + +The resulting PackageVariant resources will include `labels` in their `spec`, +and will be identical other than their names and the `downstream.package`: + +```yaml +apiVersion: config.porch.kpt.dev/v1alpha1 +kind: PackageVariant +metadata: + namespace: default + name: example-aaaa +spec: + upstream: + repo: platform-catalog + package: base-ns + revision: v1 + downstream: + repo: cluster-01 + package: ns-1 + labels: + package-type: namespace + org: hr +--- +apiVersion: config.porch.kpt.dev/v1alpha1 +kind: PackageVariant +metadata: + namespace: default + name: example-aaab +spec: + upstream: + repo: platform-catalog + package: base-ns + revision: v1 + downstream: + repo: cluster-01 + package: ns-2 + labels: + package-type: namespace + org: hr +--- + +apiVersion: config.porch.kpt.dev/v1alpha1 +kind: 
PackageVariant +metadata: + namespace: default + name: example-aaac +spec: + upstream: + repo: platform-catalog + package: base-ns + revision: v1 + downstream: + repo: cluster-01 + package: ns-3 + labels: + package-type: namespace + org: hr +``` + +When using other targeting means, the use of the `Expr` fields becomes more +likely, because we have more possible sources for different field values. The +`Expr` values are all [Common Expression Language (CEL)](https://github.com/google/cel-go) +expressions, rather than static values. This allows the user to construct values +based upon various fields of the targets. Consider again the +`repositorySelector` example, where we have these repositories in the cluster. + +| Repository | Labels | +| ---------- | ------------------------------------- | +| cluster-01 | region=useast1, env=prod, org=hr | +| cluster-02 | region=uswest1, env=prod, org=finance | +| cluster-03 | region=useast2, env=prod, org=hr | +| cluster-04 | region=uswest1, env=prod, org=hr | + +If we create a PackageVariantSet with the following `spec`, we can use the +`Expr` fields to add labels to the PackageVariantSpecs (and thus to the +resulting PackageRevisions later) that vary based on cluster. We can also use +this to vary the `injectors` defined for each PackageVariant, resulting in each +PackageRevision having different resources injected. This `spec`: + +```yaml +spec: + upstream: + repo: example-repo + package: foo + revision: v1 + targets: + - repositorySelector: + matchLabels: + env: prod + org: hr + template: + labelExprs: + - key: org + valueExpr: "repository.labels['org']" + injectors: + - nameExpr: "repository.labels['region'] + '-endpoints'" +``` + +will result in three PackageVariant resources, one for each Repository with the +labels env=prod and org=hr.
The `labels` and `injectors` fields of the +PackageVariantSpec will be different for each of these PackageVariants, as +determined by the use of the `Expr` fields in the template, as shown here: + +```yaml +apiVersion: config.porch.kpt.dev/v1alpha1 +kind: PackageVariant +metadata: + namespace: default + name: example-aaaa +spec: + upstream: + repo: example-repo + package: foo + revision: v1 + downstream: + repo: cluster-01 + package: foo + labels: + org: hr + injectors: + - name: useast1-endpoints +--- +apiVersion: config.porch.kpt.dev/v1alpha1 +kind: PackageVariant +metadata: + namespace: default + name: example-aaab +spec: + upstream: + repo: example-repo + package: foo + revision: v1 + downstream: + repo: cluster-03 + package: foo + labels: + org: hr + injectors: + - name: useast2-endpoints +--- +apiVersion: config.porch.kpt.dev/v1alpha1 +kind: PackageVariant +metadata: + namespace: default + name: example-aaac +spec: + upstream: + repo: example-repo + package: foo + revision: v1 + downstream: + repo: cluster-04 + package: foo + labels: + org: hr + injectors: + - name: uswest1-endpoints +``` + +Since the injectors are different for each PackageVariant, the resulting +PackageRevisions will each have different resources injected. + +When CEL expressions are evaluated, they have an environment associated with +them. That is, there are certain objects that are accessible within the CEL +expression. For CEL expressions used in the PackageVariantSet `template` field, +the following variables are available: + +| CEL Variable | Variable Contents | +| -------------- | ------------------------------------------------------------ | +| repoDefault | The default repository name based on the targeting criteria. | +| packageDefault | The default package name based on the targeting criteria. | +| upstream | The upstream PackageRevision. | +| repository | The downstream Repository. | +| target | The target object (details vary; see below).
| + +There is one expression that is an exception to the table above. Since the +`repository` value corresponds to the Repository of the downstream, we must +first evaluate the `downstream.repoExpr` expression to *find* that +repository. Thus, for that expression only, `repository` is not a valid +variable. + +There is one more variable available across all CEL expressions: the `target` +variable. This variable has a meaning that varies depending on the type of +target, as follows: + +| Target Type | `target` Variable Contents | +| ------------------- | -------------------------- | +| Repo/Package List | A struct with two fields: `repo` and `package`, the same as the `repoDefault` and `packageDefault` values. | +| Repository Selector | The Repository selected by the selector. Although not recommended, this could be different than the `repository` value, which can be altered with `downstream.repo` or `downstream.repoExpr`. | +| Object Selector | The Object selected by the selector. | + +For the various resource variables - `upstream`, `repository`, and `target` - +arbitrary access to all fields of the object could lead to security concerns. +Therefore, only a subset of the data is available for use in CEL expressions. +Specifically, the following fields: `name`, `namespace`, `labels`, and +`annotations`. + +Given the slight quirk with the `repoExpr`, it may be helpful to state the +processing flow for the template evaluation: + +1. The upstream PackageRevision is loaded. It must be in the same namespace as + the PackageVariantSet[^multi-ns-reg]. +1. The targets are determined. +1. For each target: + 1. The CEL environment is prepared with `repoDefault`, `packageDefault`, + `upstream`, and `target` variables. + 1. The downstream repository is determined and loaded, as follows: + - If present, `downstream.repoExpr` is evaluated using the CEL + environment, and the result used as the downstream repository name. 
+ - Otherwise, if `downstream.repo` is set, that is used as the downstream + repository name. + - If neither is present, the default repository name based on the target is + used (i.e., the same value as the `repoDefault` variable). + - The resulting downstream repository name is used to load the corresponding + Repository object in the same namespace as the PackageVariantSet. + 1. The downstream Repository is added to the CEL environment. + 1. All other CEL expressions are evaluated. +1. Note that if any of the resources (e.g., the upstream PackageRevision, or the + downstream Repository) are not found or otherwise fail to load, processing + stops and a failure condition is raised. Similarly, if a CEL expression + cannot be properly evaluated due to syntax or other reasons, processing stops + and a failure condition is raised. + +#### Other Considerations +It would appear convenient to automatically inject the PackageVariantSet +targeting resource. However, it is better to require that the package advertise +the ways it accepts injections (i.e., the GVKs it understands), and only inject +those. This keeps the separation of concerns cleaner; the package does not +build in an awareness of the context in which it expects to be deployed. For +example, a package should not accept a Porch Repository resource just because +that happens to be the targeting mechanism. That would make the package unusable +in other contexts. + +#### PackageVariantSet Status + +The PackageVariantSet status uses these conditions: + - `Stalled` is set to True if there has been a failure that most likely + requires user intervention. + - `Ready` is set to True if the last reconciliation successfully reconciled + all targeted PackageVariant resources. + +## Future Considerations +- As an alternative to the floating tag proposal, we may instead want to have + a separate tag tracking controller that can update PV and PVS resources to + tweak their upstream as the tag moves.
+- Installing a collection of packages across a set of clusters, or performing + the same mutations to each package in a collection, is only supported by + creating multiple PackageVariant / PackageVariantSet resources. Options to + consider for these use cases: + - `upstreams` listing multiple packages. + - Label selector against PackageRevisions. This does not seem that useful, as + PackageRevisions are highly re-usable and would likely be composed in many + different ways. + - A PackageRevisionSet resource that simply contained a list of Upstream + structures and could be used as an Upstream. This is functionally equivalent + to the `upstreams` option, but that list is reusable across resources. + - Listing multiple PackageRevisionSets in the upstream would be nice as well. + - Any or all of these could be implemented in PackageVariant, + PackageVariantSet, or both. + +## Footnotes +[^porch17]: Implemented in Porch v0.0.17. +[^porch18]: Coming in Porch v0.0.18. +[^notimplemented]: Proposed here but not yet implemented as of Porch v0.0.18. +[^setns]: As of this writing, the `set-namespace` function does not have a + `create` option. This should be added to avoid the user needing to also use + the `upsert-resource` function. Such a common operation should be simple for + users. +[^pvsimpl]: This document describes PackageVariantSet `v1alpha2`, which will be + available starting with Porch v0.0.18. In Porch v0.0.16 and v0.0.17, the `v1alpha1` + implementation is available, but it is a somewhat different API, without + support for CEL or any injection. It is focused only on fan out targeting, + and uses a [slightly different targeting + API](https://github.com/GoogleContainerTools/kpt/blob/main/porch/controllers/packagevariantsets/api/v1alpha1/packagevariantset_types.go). +[^repo-pkg-expr]: This is not exactly correct.
As we will see later in the + `template` discussion, the repository and package names listed are actually + just defaults for the template; they can be further manipulated in the + template to reference different downstream repositories and package names. + The same is true for the repositories selected via the `repositorySelector` + option. However, this can be ignored for now. +[^multi-ns-reg]: Note that the same upstream repository can be registered in + multiple namespaces without a problem. This simplifies access controls, + avoiding the need for cross-namespace relationships between Repositories and + other Porch resources. diff --git a/docs/CaD Core Architecture.svg b/docs/CaD Core Architecture.svg new file mode 100644 index 00000000..2650dbb2 --- /dev/null +++ b/docs/CaD Core Architecture.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/CaD Overview.svg b/docs/CaD Overview.svg new file mode 100644 index 00000000..7ced71bc --- /dev/null +++ b/docs/CaD Overview.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/Porch Architecture.svg b/docs/Porch Architecture.svg new file mode 100644 index 00000000..926bff1a --- /dev/null +++ b/docs/Porch Architecture.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/Porch Inner Loop.svg b/docs/Porch Inner Loop.svg new file mode 100644 index 00000000..829db8f1 --- /dev/null +++ b/docs/Porch Inner Loop.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 00000000..7c04825e --- /dev/null +++ b/docs/README.md @@ -0,0 +1,4 @@ +# Docs + +These are rough docs, brought over from the [kpt](https://github.com/kptdev/kpt) +repository when Porch moved here. They need some work.
diff --git a/docs/packagevariant-clone.png b/docs/packagevariant-clone.png new file mode 100644 index 00000000..4f60fd9f Binary files /dev/null and b/docs/packagevariant-clone.png differ diff --git a/docs/packagevariant-config-injection.png b/docs/packagevariant-config-injection.png new file mode 100644 index 00000000..14fe0a35 Binary files /dev/null and b/docs/packagevariant-config-injection.png differ diff --git a/docs/packagevariant-context.png b/docs/packagevariant-context.png new file mode 100644 index 00000000..6fd7e0d5 Binary files /dev/null and b/docs/packagevariant-context.png differ diff --git a/docs/packagevariant-function.png b/docs/packagevariant-function.png new file mode 100644 index 00000000..d50cc7c7 Binary files /dev/null and b/docs/packagevariant-function.png differ diff --git a/docs/packagevariant-legend.png b/docs/packagevariant-legend.png new file mode 100644 index 00000000..83f3729a Binary files /dev/null and b/docs/packagevariant-legend.png differ diff --git a/docs/packagevariantset-target-list-with-packages.png b/docs/packagevariantset-target-list-with-packages.png new file mode 100644 index 00000000..891c01cf Binary files /dev/null and b/docs/packagevariantset-target-list-with-packages.png differ diff --git a/docs/packagevariantset-target-list.png b/docs/packagevariantset-target-list.png new file mode 100644 index 00000000..3c3931fe Binary files /dev/null and b/docs/packagevariantset-target-list.png differ diff --git a/docs/packagevariantset-target-repo-selector.png b/docs/packagevariantset-target-repo-selector.png new file mode 100644 index 00000000..8c07d268 Binary files /dev/null and b/docs/packagevariantset-target-repo-selector.png differ