
Syncing latest changes from upstream main for ramen #423

Merged: 13 commits into main on Dec 11, 2024
Conversation

df-build-team

PR containing the latest commits from upstream main branch

ELENAGER and others added 13 commits December 8, 2024 16:13
- Log when we start to wait for readiness
- Replace INFO logs with more detailed error on timeout
- Replace checkDRPCConditions() with more generic conditionMet()
- Use meta.FindStatusCondition to find the conditions
- Use ramen.Condition* constants

Signed-off-by: Nir Soffer <[email protected]>
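The refactored wait described above could look roughly like this self-contained sketch; the `Condition` type and the `conditionMet` signature are simplified stand-ins for metav1.Condition and the real helper in ramen, not the actual code:

```go
package main

import "fmt"

// Condition mirrors the metav1.Condition fields the check needs.
type Condition struct {
	Type               string
	Status             string
	Reason             string
	ObservedGeneration int64
}

// findCondition is a stand-in for k8s.io/apimachinery's meta.FindStatusCondition.
func findCondition(conditions []Condition, condType string) *Condition {
	for i := range conditions {
		if conditions[i].Type == condType {
			return &conditions[i]
		}
	}
	return nil
}

// conditionMet reports whether the named condition is True for the given
// generation. On failure it returns a descriptive error, so a timeout can
// explain exactly what was still missing instead of a bare INFO log.
func conditionMet(conditions []Condition, condType string, generation int64) (bool, error) {
	cond := findCondition(conditions, condType)
	if cond == nil {
		return false, fmt.Errorf("condition %q not found", condType)
	}
	if cond.ObservedGeneration != generation {
		return false, fmt.Errorf("condition %q is stale: observed generation %d, want %d",
			condType, cond.ObservedGeneration, generation)
	}
	if cond.Status != "True" {
		return false, fmt.Errorf("condition %q is %s (reason: %s)",
			condType, cond.Status, cond.Reason)
	}
	return true, nil
}

func main() {
	conditions := []Condition{
		{Type: "Available", Status: "True", Reason: "Ready", ObservedGeneration: 2},
		{Type: "PeerReady", Status: "False", Reason: "Cleaning", ObservedGeneration: 2},
	}
	ok, err := conditionMet(conditions, "PeerReady", 2)
	fmt.Println(ok, err)
}
```

A caller polling for readiness can keep the last error from `conditionMet` and include it in the timeout message.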
For managed apps we waited for Relocated or FailedOver state. For
discovered apps we waited for Completed progression. Unify both to wait
for the state.

Signed-off-by: Nir Soffer <[email protected]>
Inline waitDRPC at its 2 call sites, to make it clear that we wait for
the drpc state and then for readiness. Adding a trivial wrapper with an
unclear name is not helpful.

Signed-off-by: Nir Soffer <[email protected]>
We have different versions of Ubuntu in use across the GitHub
workflows. Change to the 24.04 version throughout. GitHub is
changing the `latest` tag to 24.04 by 2025-01-17.

Signed-off-by: Raghavendra Talur <[email protected]>
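A minimal example of the change; the job name here is illustrative, not one of the actual ramen workflow files:

```yaml
jobs:
  test:
    # Before: runs-on: ubuntu-latest (which GitHub is moving to 24.04)
    # Pinning the version keeps every workflow on the same image.
    runs-on: ubuntu-24.04
```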
When a group is specified in the restore sequence, it uses the name of
the group to find the backup to restore the resources. All the resources
in the backup will be restored in this step of restore.

However, if one wishes to restore a subset of resources from a backup,
then they can create a different group. This group can specify the name
of the backup that has the resources by setting the BackupRef field.

This commit makes specifying the BackupRef optional. If the BackupRef
is not set, a backup with the same name as the group in the restore
sequence will be used. If no such backup is found, that is treated as
an error.

Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
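The lookup order described above can be sketched as follows; the `Group` fields and the `resolveBackupName` helper are illustrative names, not the actual ramen API:

```go
package main

import "fmt"

// Group is a simplified stand-in for a group entry in a restore sequence.
type Group struct {
	Name      string
	BackupRef string // optional: explicit name of the backup to restore from
}

// resolveBackupName picks the backup a group restores from: the explicit
// BackupRef if set, otherwise a backup with the same name as the group.
// A missing fallback backup is treated as an error.
func resolveBackupName(g Group, existingBackups map[string]bool) (string, error) {
	if g.BackupRef != "" {
		return g.BackupRef, nil
	}
	if existingBackups[g.Name] {
		return g.Name, nil
	}
	return "", fmt.Errorf("no BackupRef set and no backup named %q found", g.Name)
}

func main() {
	backups := map[string]bool{"full-app": true}

	// The group name doubles as the backup name.
	name, _ := resolveBackupName(Group{Name: "full-app"}, backups)
	fmt.Println(name)

	// An explicit BackupRef restores a subset of resources from another backup.
	name, _ = resolveBackupName(Group{Name: "pvc-only", BackupRef: "full-app"}, backups)
	fmt.Println(name)
}
```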
Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
We returned false, swallowing the error silently.

The drpc remained in this state:

    - lastTransitionTime: "2024-12-10T12:45:26Z"
      message: unable to start failover, spec.FailoverCluster (dr2) is not a valid
        Secondary target

This gave no clue why dr2 is not a valid Secondary cluster. The new
log should explain the reason.

Signed-off-by: Nir Soffer <[email protected]>
Since we added a VRG on the secondary cluster we have a random failure
when deleting the DRPC after relocate. When this happens, we find the
PVC in terminating state on the secondary cluster, and the VR and VRG
are never deleted.

This change avoids this issue by deleting the secondary VRG first, and
deleting the primary VRG only after the secondary VRG was deleted.

When we wait for the deletion of a VRG, we must requeue the request. In
a real system we may be able to return success and detect the deletion
when a watched resource changes, but unit tests break if we don't
requeue the request. The only way to do this with the current code is to
return an error. This logs many errors during deletion, like:

    ERROR   Secondary VRG manifestwork deletion in progress
    very noisy
        stacktrace...

    ERROR Primary VRG manifestwork deletion in progress
    very noisy
        stacktrace...

More work is needed to silence the expected errors.

Fixes: RamenDR#1659
Signed-off-by: Nir Soffer <[email protected]>
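The ordering and requeue behavior can be modeled with this toy sketch; `vrgState` and `deleteVRGs` are illustrative names, and the model collapses the asynchronous ManifestWork delete into an immediate one:

```go
package main

import "fmt"

// vrgState is a toy model of the two VRG manifestworks.
type vrgState struct {
	secondaryExists bool
	primaryExists   bool
}

// deleteVRGs sketches the ordering this change introduces: delete the
// secondary VRG first and requeue until it is gone, and only then delete
// the primary. requeue=true asks the reconciler to run again later.
func deleteVRGs(s *vrgState) (requeue bool) {
	if s.secondaryExists {
		s.secondaryExists = false // toy model: the delete takes effect immediately
		return true               // real code requeues until the secondary VRG is gone
	}
	if s.primaryExists {
		s.primaryExists = false
		return true // requeue until the primary VRG is gone
	}
	return false // both VRGs gone, deletion complete
}

func main() {
	s := &vrgState{secondaryExists: true, primaryExists: true}
	for i := 1; deleteVRGs(s); i++ {
		fmt.Println("reconcile", i, "requeued")
	}
	fmt.Println("deletion complete")
}
```

Deleting the secondary first means the PVC on the secondary cluster is no longer held by a VR when the primary cleanup runs, which is the race the commit describes.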

openshift-ci bot commented Dec 11, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: df-build-team

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ShyamsundarR ShyamsundarR merged commit a80adfb into main Dec 11, 2024
41 of 43 checks passed
5 participants