Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syncing latest changes from upstream main for ramen #163

Merged
merged 6 commits into from
Dec 13, 2023
Merged

Conversation

df-build-team
Copy link

PR containing the latest commits from upstream main branch

nirs added 6 commits December 12, 2023 19:33
In cluster replacement flow, we may need to disable DR when a drcluster
is not available, which breaks disable DR flow in many places.

To allow cleanup to proceed on the hub and on the remaining managed
cluster, the user will have to delete the drcluster for the failed
cluster.

Signed-off-by: Nir Soffer <[email protected]>
When getting or updating vrgs from managed clusters, we passed the
drpolicy, and used it's drcluster names for fetching the vrg. To handle
unavailable drcluster, we need the DRCluster type instead of it name,
but we don't have the required context and client to get it in
getVRGsFromManagedClusters().

Fixed by getting the drclusters in the caller when we have the context
and client, and pass down the drcluster list instead of the drpolicy.

Signed-off-by: Nir Soffer <[email protected]>
When drcluster is unreachable the ManagedClusterView may continue to
report the last status for the VRG and no error. This fails disable DR
when we wait until all VRGs are deleted.

Fixed by ignoring VRGs on deleted drclusters. Once the VRG on the
available drcluster is deleted, VRG count reach zero and we can complete
drpc deletion.

Since this change is in the getVRGsFromManagedClusters() is also affects
creatingDRPCInstance and maintainance mode. I'm not sure about these
changes yet, but it seems a good idea not to handle VRG on a deleted
drcluster.

Signed-off-by: Nir Soffer <[email protected]>
When creating s3 profile names, include only the s3 profiles from
available drclusters.

This affects generating a VRG for manifestwork. When validating failover
prerequisites we already use only the failover clsuter, and this cluster
cannot be deleted.

Another case of drcluster deploy/undeploy flow, which still use all
drclusters and may require more work.

Signed-off-by: Nir Soffer <[email protected]>
- More consistent logging - "Creating ManifestWork" or "Updating
  ManifestWork".
- Extract key variable to clean up the Client.Get() calls
- Eliminate unneeded temporary err and retryErr variables
- Remove commented code

Signed-off-by: Nir Soffer <[email protected]>
When a drcluster is deleted, the VRG s3 profiles list changes, so we
need to update the manifestwork.

In ensureVRGManifestWork() we skipped updating the manifestwork if the
VRG exists, now we always update the manifestwork. This requires little
bit more work for every reconcile, but should be efficient enough since
the vrg and manifestwork are cached.

Issues:

- It is not clear if ensureVRGManifestWork() is safe only for primary
  vrg, and it callers are using it correctly. But if we had wrong
  callers before, this change increases the impact of a wrong call,
  updating existing VRG which was not updated before. Add a TODO to look
  at this later, so we can test replace cluster flow now.

- The VRG is updated few minutes after detecting a deleted drcluster,
  since we don't watch drcluster delete events, and we filter drclsuter
  events only for drpc failing over to updated clusters.

Signed-off-by: Nir Soffer <[email protected]>
@df-build-team df-build-team requested a review from a team December 13, 2023 08:06
Copy link

openshift-ci bot commented Dec 13, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: df-build-team

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ShyamsundarR ShyamsundarR merged commit 0289d8c into main Dec 13, 2023
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants