Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect interpretation of x-ratelimit-reset for Okta #76

Open
6 tasks done
morganfogg opened this issue Sep 2, 2024 · 1 comment · May be fixed by openfga/sdk-generator#493
Open
6 tasks done

Incorrect interpretation of x-ratelimit-reset for Okta #76

morganfogg opened this issue Sep 2, 2024 · 1 comment · May be fixed by openfga/sdk-generator#493
Assignees
Labels
bug Something isn't working

Comments

@morganfogg
Copy link

morganfogg commented Sep 2, 2024

Checklist

  • I have looked into the README and have not found a suitable solution or answer.
  • I have looked into the documentation and have not found a suitable solution or answer.
  • I have searched the issues and have not found a suitable solution or answer.
  • I have upgraded to the latest version of OpenFGA and the issue still persists.
  • I have searched the Slack community and have not found a suitable solution or answer.
  • I agree to the terms within the OpenFGA Code of Conduct.

Description

Currently the rate-limit handling code interprets the contents of the x-ratelimit-reset header as a number of milliseconds from the current time to wait. (see:

However, the x-ratelimit-reset value being returned by Okta FGA is a Unix timestamp rather than an offset from the current time, meaning hitting the rate limit will theoretically cause the code to hang for ~20 days. Further down it also seems to interpret it as either a number of seconds or minutes from the current time, though this property doesn't seem to be used in the delay. An example of the headers I am receiving are:

X-Ratelimit-Limit: 30
X-Ratelimit-Remaining: 0
X-Ratelimit-Reset: 1725235793

Retry-After is a standard header used by APIs to indicate when the SDK can retry.

The SDK should:

  • Honor this header on 429s
  • Expose this header value in the error when received
  • Fallback to exponential retry when this header is not available (e.g. not sent by the server on 429s or on e.g. 500s)
  • Drop support for retrying based on X-Rate-Limit-Reset (currently only .NET SDK supports that)

Expectation

The SDK interprets the rate limit reset header as a Unix timestamp, or provides an option to do so.

Based on openfga/sdk-generator#495, it should do the following

Retry On

  • For queries that affect state:
    • Retry on 429s, falling back to exponential backoff
    • Retry on 5xxs (except 501 not implemented) only if the Retry-After headers are sent - do not fall back to exponential backoff
  • For all others:
    • All 429s, and >=500 (except 501 not implemented)

Max Allowable Retries

15

Default Number of Retries

SDKs: 3

Retry Parameters

  1. If Retry-After header is found, use it

    1. if it is an integer, treat it as the number of seconds from now to retry, if it is <1 from now or >1800 from now (aka >30 min) - assume it is invalid and continue
    2. if it is a date, parse it but if it is <1 from now or >1800 from now (aka >30 min) - assume it is invalid and continue
  2. If neither header is found, use exponential backoff but we'll add some jitter, so the retry is a random number between

    1. 2^loopCount * 500ms and 2^(loopCount + 1) * 500ms
    2. if the result of (a) is > 120s, cap it at 120s which should happen between the 8th and 9th retry

That means:

  • if retry-after header was returned and is valid, we’ll use it - so if it says in 4 min all good
  • if retry-after header was not returned, we will retry at:
    • 500ms
    • 1s
    • 2s
    • 4s
    • 8s
    • 16s
    • 32s
    • 64s
    • 120s ← at this point is is >4min since initial call
    • 120s
    • 120s
    • 120s
    • 120s
    • 120s

Reproduction

  1. Perform a large number of requests against the endpoint https://api.au1.fga.dev
  2. SDK hangs indefinitely after the rate limit is hit.

OpenFGA SDK version

0.5.0

OpenFGA version

N/A

SDK Configuration

ApiUrl, StoreId, AuthorizationModelId and Credentials are provided. No other configuration.

Logs

No response

References

The header is interpreted as a number of milliseconds from the current time here:

And the delay happens here:

await Task.Delay(waitInMs);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Backlog
Development

Successfully merging a pull request may close this issue.

3 participants
@rhamzeh @morganfogg and others