-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Polygon to cells experimental fuzzer #800
Polygon to cells experimental fuzzer #800
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that maxPolygonToCellsSize
doesn't allocate enough cells for the new modes in some cases. I've seen similar behavior here in benchmarks - this could be an issue that affects the old polygonToCells
as well.
I have a plan for a new maxPolygonToCellsSize
algo that I'll try out, this may fix the fuzzer errors.
Has this polygon been added as a test case to ensure we aren't dropping a lot of cells there? |
I'd rather add some simplified version, as the Antarctica polygon is too large to be easily added to a test file. I can add a new PR with more tests for this. |
Still seeing errors here with the new estimator. Is there an easy-ish way to get the input for the failure cases? |
Here is the hexdump of a new crash case for
|
Nick Rabinowitz seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
I tried adding an additional check today to recompute the expected array size and bail if the function is about to write past the end of the array. That seems to, at the cost of recomputing the size, avoid any remaining crashes. Unfortunately there is a timeout with some input which is currently failing the test. |
@@ -433,7 +433,8 @@ void iterStepPolygonCompact(IterCellsPolygonCompact *iter) { | |||
|
|||
// Target res: Do a fine-grained check | |||
if (cellRes == iter->_res) { | |||
if (mode == CONTAINMENT_CENTER || mode == CONTAINMENT_OVERLAPPING) { | |||
if (mode == CONTAINMENT_CENTER || mode == CONTAINMENT_OVERLAPPING || | |||
mode == CONTAINMENT_OVERLAPPING_BBOX) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. These additional checks might get the fuzzer to pass, but they miss the point of CONTAINMENT_OVERLAPPING_BBOX
, which is to do a much faster check than CONTAINMENT_OVERLAPPING
. The perf impact here is likely significant, since we're going from a fast bbox check to a slow set of polygon-based checks.
Are all of these additional checks needed to make the fuzzer pass? Or can we narrow down to find the check that's actually missed by CONTAINMENT_OVERLAPPING_BBOX
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All but the change around line 511 seem to be necessary to prevent crashes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit that I'm not seeing any significant differences in the benchmarks after this change, assuming I'm running them correctly. I think I'm ok with this to get the new algo out the door, though I'd really like to look through the logic here and understand what's missing in the bbox check - it doesn't make a lot of sense to me conceptually that these would work when the bbox doesn't, so we must be messing up the bbox check in some way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted elsewhere, this seems ok for now, but we should definitely investigate why it's necessary, as the changes to make the fuzzer pass could have significant perf impact.
@@ -692,9 +696,27 @@ void iterDestroyPolygon(IterCellsPolygon *iter) { | |||
H3Error H3_EXPORT(polygonToCellsExperimental)(const GeoPolygon *polygon, | |||
int res, uint32_t flags, | |||
H3Index *out) { | |||
#ifdef H3_POLYGON_TO_CELLS_ASSERT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we comment on how to set this variable? Can we use the TESTONLY
macro here instead?
Do we still need to add a simplified Antartica-like polygon to the tests? I could help with that. |
I'm overriding the CLA check since the only people with commits in this branch are myself and @nrabinowitz (who may have used an old email address, but has certainly signed the CLA before) |
This adds new fuzzers for the
polygonToCellsExperimental
function, based on the existing functions. Since #796 (this PR is based on that branch) adds containment modes, this updates the existing fuzzers to exercise those options too.When I run the
fuzzerPolygonToCellsExperimental
, I see some runtime errors about misaligned addresses, which doesn't cause an issue on Apple M1 architecture but might indicate trouble on other architectures. When I runfuzzerPolygonToCellsExperimentalNoHoles
I get a crash aboutheap-buffer-overflow
aroundpolygonToCellsExperimental polyfill.c:646
(looks like the output exceeds the allocated output buffer in some case?)