Get packet buffer size from operating system #330
Comments
(Code reference: line 250 in 35005f0)
It's interesting that there's a significant difference with different buffer sizes then; I can't really see why that would be the case, but I can imagine the future will be pretty big due to the 64K buffer, so I'd suggest instead moving the buffer to the heap (i.e. use a …)
Well, you can try the benchmark yourself; it records the previous run and tells you the diff, including whether it's within the noise threshold. I included an example output (though not from the same run). Some of it is probably noise, but it is also not great that we're allocating ~64 KiB on the stack in general, because if we want to run a lot of instances of Quilkin it will add up pretty quickly. We'd be able to get a more fine-grained view of the performance in tokio with #317.
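A minimal sketch (not Quilkin's actual code) of why the comments above suggest moving the buffer to the heap: an `async fn` that keeps a 64K array alive across an await point stores that array inside the future's state machine, so every move of the future copies ~64 KiB, whereas a heap-allocated buffer only contributes a pointer, length, and capacity:

```rust
// Sketch: compare the size of a future that owns its receive buffer
// inline versus one that boxes it on the heap. The function names are
// illustrative, not from the Quilkin codebase.
async fn recv_stack() -> usize {
    let buf = [0u8; 65535];
    // `buf` is live across this await, so it is stored in the future
    // itself, making the future at least 64 KiB large.
    std::future::ready(()).await;
    buf.len()
}

async fn recv_heap() -> usize {
    let buf = vec![0u8; 65535];
    // Only the Vec's pointer/len/capacity live in the future's state.
    std::future::ready(()).await;
    buf.len()
}

fn main() {
    let stack_fut = recv_stack();
    let heap_fut = recv_heap();
    // Neither future is polled; we only inspect the state machine size.
    println!("stack future: {} bytes", std::mem::size_of_val(&stack_fut));
    println!("heap future:  {} bytes", std::mem::size_of_val(&heap_fut));
}
```

The heap version still pays for one allocation per buffer, but the future stays small and cheap to move between tasks and threads.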
Yeah, I wouldn't want to add the OS-specific code to Quilkin itself, not least because it would be generally more useful to other projects. But if someone in the community makes, or has already made, a cross-platform lib that handles all the complexity and we just get the size (similar to …)
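As a rough illustration of what "query the OS for the buffer limit" could look like, here is a hedged sketch. The helper name is hypothetical, and it only knows about Linux, where `net.core.rmem_max` (exposed via procfs) caps the receive buffer a socket may request; note that this is the socket buffer cap, not the maximum datagram size, so the result is clamped to the largest possible IPv4 UDP payload:

```rust
use std::fs;

/// Hypothetical helper (not part of Quilkin): ask the OS for an upper
/// bound on the UDP receive buffer, falling back to the theoretical
/// IPv4 maximum payload (65535 - 20-byte IP header - 8-byte UDP header)
/// on platforms this sketch doesn't recognise.
fn os_udp_buffer_limit() -> usize {
    const FALLBACK: usize = 65535 - 20 - 8; // 65507

    #[cfg(target_os = "linux")]
    {
        // /proc/sys/net/core/rmem_max is the kernel's cap on socket
        // receive buffers; clamp it since a datagram can never exceed
        // the IPv4 payload maximum anyway.
        if let Ok(s) = fs::read_to_string("/proc/sys/net/core/rmem_max") {
            if let Ok(n) = s.trim().parse::<usize>() {
                return n.min(FALLBACK);
            }
        }
    }

    FALLBACK
}

fn main() {
    let limit = os_udp_buffer_limit();
    println!("allocating receive buffers of {limit} bytes");
}
```

A real cross-platform library would also need per-OS paths (e.g. `sysctl` on the BSDs/macOS), which is exactly the complexity the comment suggests keeping out of Quilkin.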
Allocating the 64K buffer on the stack makes the future expensive to move. This allocates the buffer on the heap instead. Noticed some significant perf improvements in load tests via https://github.com/majek/dump/tree/master/how-to-receive-a-million-packets. Tried using the naive benchmark from #321, but that doesn't seem consistent at the moment: I constantly got perf regressions/improvements on reruns even with no code change (I think either running both the proxy and the mock server within the same process/scheduler adds too much noise, or we don't have a large enough unit of work). Relates to #330.
* Move packet buffer to heap
* Initialize buffer

Co-authored-by: Mark Mandel <[email protected]>
Currently Quilkin always allocates a u16::MAX length buffer for every message, which is incorrect for a number of reasons. Firstly, the maximum payload is slightly less than that, since it's really 65535 - (IP + UDP headers). Secondly, it is far too large compared to realistic workloads, which are typically in the 200–500 byte range (to prevent IPv4 fragmentation). Thirdly, it is much bigger than what most OSes set as the limit for UDP payloads; for example, macOS by default has a limit of 9216 bytes.

Just testing #321 with different workloads and buffer sizes, having a lower buffer size has a significant impact on performance. For example, a buffer of 1500 bytes gave a 15% increase in throughput for smaller workloads.

So instead of always using the maximum, we should try to be a bit smarter about how much we allocate. A good default would be to query the OS for the maximum UDP buffer size. I also think it would be worth adding a server-level option for setting the buffer limit manually, allowing you to lower it when you're using a protocol with a smaller MTU.
(Code reference: quilkin/src/proxy/server.rs, lines 240 to 242 in 35005f0)