kfp-operators upgrades from 2.0.0-alpha.7 -> 2.0.3 for CKF 1.8 release #362

Merged
DnPlas merged 15 commits into main from 1.8-updates-dev-branch
Nov 20, 2023

@DnPlas DnPlas commented Oct 26, 2023

This PR merges 1.8-updates-dev-branch into main, which includes changes for (but not limited to):

Testing instructions

Testing environment:

  • Microk8s 1.25-strict/stable

Make sure you run microk8s config > ~/.kube/config before continuing. Some kfp-operators components
use the in-cluster kubeconfig.

  • juju 3.1/stable
  1. Deploy the bundle provided in this PR: juju deploy ./ckf-1.8-dev.yaml --trust
  2. Refresh all charms to the latest upload of this PR:
juju refresh kfp-api                    --channel latest/edge/pr-362 --resource oci-image=gcr.io/ml-pipeline/api-server:2.0.3
juju refresh kfp-db                     --channel 8.0/stable
juju refresh kfp-persistence            --channel latest/edge/pr-362 --resource oci-image=gcr.io/ml-pipeline/persistenceagent:2.0.3
juju refresh kfp-profile-controller     --channel latest/edge/pr-362 --resource oci-image=python:3.7
juju refresh kfp-schedwf                --channel latest/edge/pr-362 --resource oci-image=gcr.io/ml-pipeline/scheduledworkflow:2.0.3
juju refresh kfp-ui                     --channel latest/edge/pr-362 --resource ml-pipeline-ui=gcr.io/ml-pipeline/frontend:2.0.3
juju refresh kfp-viewer                 --channel latest/edge/pr-362 --resource kfp-viewer-image=gcr.io/ml-pipeline/viewer-crd-controller:2.0.3
juju refresh kfp-viz                    --channel latest/edge/pr-362 --resource oci-image=gcr.io/ml-pipeline/visualization-server:2.0.3
  3. Ensure the apiserver is up and ready
$ kubectl logs -nkubeflow kfp-api-0 -c apiserver -f

# The last messages from the apiserver should be something like this:
2023-11-17T08:42:23.667Z [pebble] Service "apiserver" starting: bash -c 'sleep 1.1 && /bin/apiserver --config=/config --sampleconfig=/config/sample_config.json -logtostderr=true '
2023-11-17T08:42:24.811Z [apiserver] I1117 08:42:24.811447     104 client_manager.go:170] Initializing client manager
2023-11-17T08:42:24.811Z [apiserver] I1117 08:42:24.811679     104 config.go:57] Config DBConfig.MySQLConfig.ExtraParams not specified, skipping
2023-11-17T08:42:25.036Z [apiserver] I1117 08:42:25.036461     104 client_manager.go:511] Successfully created bucket mlpipeline
2023-11-17T08:42:25.036Z [apiserver] I1117 08:42:25.036823     104 swf.go:64] (Expected when in cluster) Failed to create scheduled workflow client by out of cluster kubeconfig. Error: stat /root/.kube/config: no such file or directory
2023-11-17T08:42:25.036Z [apiserver] I1117 08:42:25.036853     104 swf.go:66] Starting to create scheduled workflow client by in cluster config.
2023-11-17T08:42:25.037Z [apiserver] I1117 08:42:25.037592     104 client_manager.go:214] Client manager initialized successfully
2023-11-17T08:42:27.138Z [apiserver] I1117 08:42:27.138649     104 main.go:284] All samples are loaded.
2023-11-17T08:42:27.138Z [apiserver] I1117 08:42:27.138709     104 main.go:143] Starting Http Proxy
2023-11-17T08:42:27.139Z [apiserver] I1117 08:42:27.139794     104 main.go:96] Starting RPC server
  4. Log in to the Kubeflow dashboard
  5. Create an experiment
  6. Create a run from the "Data passing..." sample pipeline.
  7. Wait for completion
  8. Create a recurring run and wait for completion
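
For anyone who wants to script the "apiserver is up and ready" check rather than reading the log by eye, here is a minimal sketch. It assumes that the four messages at the end of the sample log above are a good enough readiness signal; that is my reading of the sample output, not something the apiserver documents. The names READY_MARKERS and apiserver_ready are made up for this example.

```python
# Messages copied from the sample apiserver log in the testing instructions.
# Treating them as readiness markers is an assumption, not an upstream contract.
READY_MARKERS = (
    "Client manager initialized successfully",
    "All samples are loaded.",
    "Starting Http Proxy",
    "Starting RPC server",
)


def apiserver_ready(log_text: str) -> bool:
    """Return True when every assumed readiness marker appears in the log text."""
    return all(marker in log_text for marker in READY_MARKERS)
```

The function only inspects text, so the log can come from anywhere, for example the saved output of the kubectl logs command shown above (without -f, so that the command terminates).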

Please note that both the CI tests and the manual tests must pass before we merge this PR.

Test bundle

ckf-1.8-dev.txt

* feat: add sa token component to kfp-persistence

The kfp-persistence charm needs a ServiceAccount token
inside the workload container to be able to start the service.
The sa token component generates the required token and saves it
in a file, from which the charm can later push it
to the workload container.

Fixes #343

* tests: use mysql 8.0/stable for integration tests
NohaIhab and others added 11 commits November 20, 2023 15:11
* fix: apply authorization policy with relaxed rule
* refactor: refactor env variables for kfp-api to match v2.0.2

For a few releases the KFP API server component has been defining
certain configurations using env variables instead of a config file.
This change, together with adding and removing vars, ensures the charmed kfp-api
stays consistent with the upstream project.
* chore: update kfp-operators 2.0.0-alpha.7 -> 2.0.2

Update all resources according to the pipelines 2.0.2 upstream project.

Signed-off-by: Phoevos Kalemkeris <[email protected]>
Co-authored-by: Daniela Plascencia <[email protected]>
…nv var (#361)

The METADATA_GRPC_SERVICE_HOST env var has to include the name
and namespace of the metadata grpc server, otherwise pipeline runs
end in errors when trying to write in the metadata SQLite DB.

This commit reverts a change introduced by canonical/kfp-operators#3312.
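
To illustrate the point about METADATA_GRPC_SERVICE_HOST, here is a small sketch of how such a value could be assembled. The dotted name.namespace form is an assumption on my part (the commit message only says the value must include both parts), and the service name and namespace used below are placeholders rather than values taken from the charm.

```python
def metadata_grpc_service_host(service_name: str, namespace: str) -> str:
    """Join the service name and namespace into a single host value.

    The "<name>.<namespace>" layout is assumed for illustration only.
    """
    if not service_name or not namespace:
        raise ValueError("both the service name and the namespace are required")
    return f"{service_name}.{namespace}"


# Placeholder values, not read from the charm or from a relation.
env = {
    "METADATA_GRPC_SERVICE_HOST": metadata_grpc_service_host(
        "metadata-grpc-service", "kubeflow"
    )
}
```

Raising on an empty part is a deliberate choice in this sketch: a host with a missing namespace is the failure mode described in the commit message, so it seems better to fail early than to render a broken value.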
We implement a sync.py file from upstream which we lightly modify. This commit reformats that sync.py to be as similar as possible to upstream's sync.py for easier comparison and maintenance. #331 previously reformatted sync.py to meet our style guide, but that made it harder to track changes upstream and import them into our version.

This should have no functional change on the charm; it only changes things to make maintenance easier.
* update kfp-api's apiserver configuration

This:
* removes the deprecated `DBCONFIG_USER`, etc., environment variables (they have been replaced by variables named `DBCONFIG_[driver]CONFIG_*`)
* adds `OBJECTSTORECONFIG_HOST`, `_PORT`, and `_REGION`, which previously were required. They currently seem to be ignored due to kubeflow/pipelines#9689, but in theory they may matter again; the exact scope of that issue is unclear.
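
As a rough illustration of the naming scheme described in the two bullets above, the sketch below builds both groups of variables. The helper names, the setting keys (user, host, port) and all values are hypothetical; only the `DBCONFIG_[driver]CONFIG_*` pattern and the three `OBJECTSTORECONFIG_*` names come from this commit message.

```python
from typing import Dict


def db_config_env(driver: str, settings: Dict[str, str]) -> Dict[str, str]:
    """Render settings as DBCONFIG_[driver]CONFIG_* environment variables."""
    prefix = f"DBCONFIG_{driver.upper()}CONFIG_"
    return {prefix + key.upper(): value for key, value in settings.items()}


def object_store_env(host: str, port: str, region: str) -> Dict[str, str]:
    """Render the three object store variables named in this commit."""
    return {
        "OBJECTSTORECONFIG_HOST": host,
        "OBJECTSTORECONFIG_PORT": port,
        "OBJECTSTORECONFIG_REGION": region,
    }


# Hypothetical values, for illustration only.
environment = {
    **db_config_env("mysql", {"user": "mysql", "host": "kfp-db", "port": "3306"}),
    **object_store_env("minio", "9000", ""),
}
```

With the placeholder input above, the database settings come out as DBCONFIG_MYSQLCONFIG_USER, DBCONFIG_MYSQLCONFIG_HOST and DBCONFIG_MYSQLCONFIG_PORT, which makes it easy to compare against whatever the upstream manifests define.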
…sts (#364)

* chore: updates the test bundle definitions and use latest/edge for tests

This commit updates the bundle definitions in tests/integration/bundles/ to match the
latest changes to the repository.
It also changes the bundle used for running repository-level integration tests from
1.7/stable to latest/edge, so that kfp-operators and all related charms are tested with the
latest changes.
* tests: enable bundle functional tests for v2 compiled pipelines

The recent upgrades in Kubeflow Pipelines introduced a new version of the SDK that is
compatible with the newer backend (pipelines 2.0.3). This commit ensures that the CI
tests both a v1 and a v2 pipeline for good coverage.
orfeas-k previously approved these changes Nov 20, 2023

@orfeas-k orfeas-k left a comment

All changes have been reviewed in their corresponding PR. Also, the commit we're merging to main has been tested as part of PR #383.
A slight note: I think we could add a cleaner name to the PR.

@DnPlas DnPlas changed the title 1.8 updates dev branch merge 1.8-updates-dev-branch into main Nov 20, 2023
* fix: reduce flakiness of kfp-api integration tests

use mysql-k8s from 8.0/stable branch instead of latest/edge
@DnPlas DnPlas changed the title merge 1.8-updates-dev-branch into main kfp-operators upgrades from 2.0.0-alpha.7 -> 2.0.3 for CKF 1.8 release Nov 20, 2023
@ca-scribner ca-scribner left a comment

everything was also tested as it was landed in the dev branch. lgtm!

@DnPlas DnPlas merged commit 9386488 into main Nov 20, 2023
46 checks passed
@orfeas-k orfeas-k deleted the 1.8-updates-dev-branch branch November 28, 2024 12:53
5 participants