Retry on etcd too many requests error #132

omertuc · 2024-04-25T10:42:09Z

tl;dr

A fix for a rare error:

Error: finalizing

Caused by:
    0: commiting etcd cache to actual etcd
    1: grpc request error: status: Unknown, message: "etcdserver: too many requests", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }

Background

When committing our in-memory etcd representation to the actual etcd, we send all delete requests concurrently, and there are many of them.

Issue

Sometimes etcd responds to this burst of requests with the error "etcdserver: too many requests". Recert treated this as a hard error and exited.

Solution

Compare the error message to this exact phrasing (there doesn't seem to be a more robust error code we can check; the status code just says Unknown), and if it matches, send the same request again. The expectation is that all requests eventually go through.
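
To illustrate the idea, here is a minimal Python sketch of the retry-on-message approach. It is not the recert code: `EtcdRequestError`, `delete_with_retry`, `commit_deletes` and `FlakyEtcd` are hypothetical names, and the short `delay` between attempts and the optional `max_attempts` cap are assumptions that the description above does not mention.

```python
import asyncio

# The exact phrasing etcd returns; the gRPC status is only "Unknown",
# so the message text is the only thing available to match on.
TOO_MANY_REQUESTS = "etcdserver: too many requests"


class EtcdRequestError(Exception):
    """Hypothetical stand-in for the gRPC error raised by an etcd client."""


def is_too_many_requests(err: Exception) -> bool:
    # Match on the message text, since there is no dedicated error code.
    return TOO_MANY_REQUESTS in str(err)


async def delete_with_retry(delete, key, delay=0.01, max_attempts=None):
    """Call `delete(key)`, repeating it while etcd says "too many requests".

    `delay` and `max_attempts` are assumptions added for this sketch;
    with max_attempts=None the request is repeated until it succeeds.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return await delete(key)
        except EtcdRequestError as err:
            if not is_too_many_requests(err):
                raise  # any other error is still a hard error
            if max_attempts is not None and attempt >= max_attempts:
                raise
            await asyncio.sleep(delay)


async def commit_deletes(delete, keys):
    # All delete requests are still sent concurrently; each retries on its own.
    await asyncio.gather(*(delete_with_retry(delete, key) for key in keys))


class FlakyEtcd:
    """Fake server for demonstration: rejects the first N attempts per key."""

    def __init__(self, failures_per_key=2):
        self.failures_per_key = failures_per_key
        self.attempts = {}
        self.deleted = set()

    async def delete(self, key):
        count = self.attempts.get(key, 0) + 1
        self.attempts[key] = count
        if count <= self.failures_per_key:
            raise EtcdRequestError(TOO_MANY_REQUESTS)
        self.deleted.add(key)


if __name__ == "__main__":
    etcd = FlakyEtcd()
    asyncio.run(commit_deletes(etcd.delete, ["/a", "/b", "/c"]))
    print(sorted(etcd.deleted))
```

Matching on a message string is brittle if etcd ever rewords it, which is why the check is isolated in one small function that can be swapped out if a proper error code becomes available.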


openshift-ci · 2024-04-25T10:42:28Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: omertuc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [omertuc]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

omertuc · 2024-04-25T15:49:15Z

/retest

omertuc · 2024-04-25T18:02:40Z

/retest

mresvanis · 2024-04-26T07:28:48Z

/test baremetalds-sno-recert-cluster-rename

mresvanis · 2024-04-26T07:30:04Z

/lgtm

omertuc · 2024-04-26T12:26:44Z

/retest

mresvanis · 2024-04-26T14:55:16Z

/retest

mresvanis · 2024-04-29T09:21:37Z

/retest

omertuc · 2024-04-29T11:27:36Z

I'm starting to think it's actually broken

omertuc · 2024-04-29T11:45:21Z

/retest

omertuc · 2024-04-29T11:46:26Z

Checking clean CI in #134

omertuc · 2024-05-02T21:06:15Z

/retest

omertuc · 2024-05-03T14:38:54Z

/retest

omertuc · 2024-05-15T13:26:05Z

/retest

mresvanis · 2024-05-21T09:45:36Z

/test baremetalds-sno-recert-cluster-rename

mresvanis · 2024-05-21T12:13:07Z

/lgtm

omertuc · 2024-05-21T12:17:20Z

/hold not sure if works

omertuc · 2024-05-21T12:17:26Z

/retest

omertuc · 2024-05-21T13:23:35Z

/test e2e-aws-ovn-single-node-recert-serial

eranco74 · 2024-06-09T10:57:12Z

/test e2e-aws-ovn-single-node-recert-serial

omertuc · 2024-06-11T08:10:19Z

/unhold

mresvanis · 2024-06-11T13:21:07Z

/test e2e-aws-ovn-single-node-recert-serial

mresvanis · 2024-06-12T09:56:05Z

/override ci/prow/e2e-aws-ovn-single-node-recert-parallel

openshift-ci · 2024-06-12T09:56:10Z

@mresvanis: Overrode contexts on behalf of mresvanis: ci/prow/e2e-aws-ovn-single-node-recert-parallel

In response to this:

/override ci/prow/e2e-aws-ovn-single-node-recert-parallel

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci bot requested a review from mresvanis April 25, 2024 10:42

openshift-ci bot added the approved label Apr 25, 2024

openshift-ci bot assigned mresvanis Apr 26, 2024

openshift-ci bot added the lgtm label Apr 26, 2024

openshift-ci bot added the do-not-merge/hold label May 21, 2024

openshift-ci bot removed the do-not-merge/hold label Jun 11, 2024

openshift-merge-bot bot merged commit 3b58208 into rh-ecosystem-edge:main Jun 12, 2024
12 checks passed
