Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document window staggering behavior #120

Open
twz915 opened this issue Jul 11, 2017 · 9 comments
Open

Document window staggering behavior #120

twz915 opened this issue Jul 11, 2017 · 9 comments
Milestone

Comments

@twz915
Copy link

twz915 commented Jul 11, 2017

https://github.com/jsocol/django-ratelimit/blob/master/ratelimit/utils.py#L79
w = ts - (ts % period) + (zlib.crc32(value) % period)
I've found it's hard to understand the code, could you please explain why add a crc32 constant?

def _get_window(value, period):
    ts = int(time.time())
    if period == 1:
        return ts
    if not isinstance(value, bytes):
        value = value.encode('utf-8')
    w = ts - (ts % period) + (zlib.crc32(value) % period)
    if w < ts:
        return w + period
    return w
@jsocol
Copy link
Owner

jsocol commented Jul 11, 2017

Hmm, I thought there was an issue or at least commit comment at some point. This should be documented—at least the behavior if not the implementation.

_get_window() gets a fixed counting window for the given value that is staggered so not all windows expire at the same time.

If a rate was 200/1h, we don't want everyone to be able to start making requests at exactly the same time (e.g. at 08:00, all ratelimited users are unratelimited again and traffic spikes). So we shift the windows. So my hour window might be from 08:21:12 to 09:21:12 and yours might be from 08:42:09 to 09:42:09 (and 10:42:09, etc). It does this for every unit except seconds.

crc32 is a decent, quick way of turning an arbitrary bytestring into an integer. It's distributed randomly-enough for our purposes here, doing a reasonable job of spreading out the windows.

@jsocol
Copy link
Owner

jsocol commented Jul 11, 2017

Renaming and leaving this open as an issue to document the window staggering behavior.

@jsocol jsocol changed the title _get_windows, could you please explain why add a crc32 constant? Document window staggering behavior Jul 11, 2017
@twz915
Copy link
Author

twz915 commented Jul 11, 2017

@jsocol Thanks for your clear explaination. It's an excellent design on this occasion.

@marfire
Copy link

marfire commented Jul 18, 2017

I believe this is already documented here. It might be better to move it out of the easily-missed "Upgrade Notes" section, though, since it's of interest to new users as well.

@jsocol
Copy link
Owner

jsocol commented Jul 19, 2017

Aha! Thank you, @marfire—I thought it was written down somewhere.

I'll try to move the docs about window behavior (fixed, staggered) into a better, more obvious spot, but it might take me a little bit to get there. If anyone else has time to take a pass or at least suggest a spot they might expect to find that info in the existing docs, let me know!

@marfire
Copy link

marfire commented Jul 19, 2017

You're welcome, and thank you for the package.

I think putting window information on the Rates page would be a natural fit.

@phillbaker
Copy link

I'm wondering if instead of staggering buckets, an algorithm like the token bucket algorithm would be an appropriate approach?

It seems like the token bucket accomplishes the goals of rate limiting while also avoiding the thundering herd problem since bucket capacity is replenished gradually, rather than being immediately refilled at the beginning of each period.

@jsocol
Copy link
Owner

jsocol commented Aug 17, 2018

That is a really interesting approach! I hadn’t heard of it before. The Stripe post was really great, too, just as a resource. They didn’t describe their implementation, but I did find one from ClassDojo.

This algorithm is definitely more complex and I don’t think it’s a good fit for a default because I don’t see how you could avoid the race condition issues with Memcached—or generally without using methods not available through the Django cache interface. That would tightly couple ratelimit to Redis and I don’t think that’s OK. (I am happy to be proven wrong, though, if someone can describe a race-condition-free variant using only Django cache API methods.)

A long time ago there was a concept of pluggable backends that was removed in favor of relying directly and purely on the cache API. I still generally believe that’s the right call for this library, and in particular I’m not sure how the token bucket approach would work with the current @ratelimit decorator API. But I think there’s definitely room for more libraries out there! Or if you want to open another issue with a proposal, please do!

@phillbaker
Copy link

Yup, makes sense. Thanks for the thoughts - I agree this would be hard to do in memcache!

@jsocol jsocol modified the milestones: 1.1, 3.0 Feb 20, 2019
@jsocol jsocol modified the milestones: 3.0, 3.1 Jan 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants