Use of memory after it is freed #8
NOLINT some code, as I cannot easily make it pass the clang-tidy checks. Fixes JacksonAllan#8, but will likely need a good review.
Hi! Thanks for working on this. I was able to spend a few hours investigating it today. I hadn't used clang-tidy before, so I hope I'm understanding its output correctly and apologize in advance if I seem uninformed. The tool seems to report a lot of issues related to the

```c
// Instantiating a map template.
#define NAME int_int_map
#define KEY_TY int
#define VAL_TY int
#include <https://raw.githubusercontent.com/JacksonAllan/Verstable/main/verstable.h>

int main(void)
{
  int_int_map our_map;
  int_int_map_init( &our_map );
  int_int_map_insert_raw( &our_map, 1, &(int){ 1 }, false, true );
  int_int_map_cleanup( &our_map );
}
```

Godbolt link with clang-tidy window. Here's the full clang-tidy output:
So clang-tidy is reporting, for example, a (potential?) Given that
It looks like clang-tidy is getting confused by the logic of setting I'm not sure what the best solution here is. How important is it to support clang-tidy if we know that it is repeatedly generating false positives? I suspect that the outcome will be
Hi! Thanks for a very clean API for this implementation of a hashmap. Compared to others I've looked at, it's a joy to work with. Thank you. I agree with your assessment about the false positives, except for the use of memory after it's freed. On line 1299, if we fail to insert an entry, we free newtable.buckets, double bucket_count, and then continue back to line 1286 and attempt to insert a new entry into a bucket we just freed. Firstly, the continue statement is odd because it's already at the end of the for loop - it's a no-op. I think the intent of the continue was to go back to table creation and start over with an increased bucket count. Am I wrong here? Is this another false positive, and if so, can you please explain why?
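The problematic control flow can be boiled down to a minimal shape like the following (a simplified sketch with invented names, not the library's actual code): the buffer is freed on a failed insertion, but the continue at the end of the loop body is a no-op, so the next iteration touches the freed pointer anyway.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Minimal sketch of the buggy shape. Here insert_failed is a stand-in for a
// displacement-limit violation; it is hard-wired to false so the sketch is
// safe to run, but if it were ever true, the free below would execute and the
// following iterations would still dereference the freed buckets pointer.
static bool rehash_sketch( size_t bucket_count )
{
    int *buckets = malloc( bucket_count * sizeof( int ) );
    if( !buckets )
        return false;

    for( size_t i = 0; i < bucket_count; ++i )
    {
        bool insert_failed = false; // Stand-in for a failed reinsertion.
        if( insert_failed )
        {
            free( buckets );   // Buffer freed here...
            bucket_count *= 2;
            continue;          // ...but continue at the end of a for body is a
                               // no-op: the next iteration reads and writes
                               // through the freed buckets pointer.
        }
        buckets[ i ] = 0;
    }

    free( buckets );
    return true;
}
```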
I don't think supporting it directly is fair; other analysis tools can and do emit other false positives, based on personal experience in other projects. However, if others use analysis tools on their code (which is more and more commonplace), at least having a comment somewhere about it would be best, maybe including a light technical reason why it's a false positive. Specifically for clang-tidy, there is no way of saying "ignore all analysis from a header file". They have recently committed an option to exclude a header, but I think it will still analyse macros, which won't help here, so the only way of silencing clang-tidy is to add // NOLINT. I think adding // NOLINT is fine, as it's quite short, doesn't affect readability to any detriment, and is not clang-tidy specific.
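For reference, the suppression style under discussion looks like this in practice (the function below is just an illustrative stand-in, not Verstable code):

```c
#include <stdlib.h>

// A trailing NOLINT comment silences clang-tidy diagnostics on that one line.
// The NOLINT(check-name) form narrows the suppression to a single check,
// which keeps unrelated diagnostics alive; the allocation here is arbitrary.
static int *make_counter( void )
{
    int *counter = malloc( sizeof( int ) ); // NOLINT(clang-analyzer-unix.Malloc)
    if( counter )
        *counter = 0;
    return counter;
}
```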
Thanks, glad you're finding it useful! There's a work-in-progress draft of an article benchmarking the library against other hash table libraries here. The data currently shown there is old, but newer data is available here.
You’re absolutely right! Sorry, I should have examined your PR more closely before jumping to conclusions. It seems that this bug was able to go undetected because of its rarity. This code handles the case that, upon doubling the table's bucket count, one of the existing keys cannot be reinserted within 211 quadratic jumps (i.e. what I call “quadratic displacement” in the in-code comments) of its home bucket. Under normal usage, I think this should basically never happen. (Edit: By the way, this bug also affects CC, whose hash-table implementation is based on Verstable.) So what I’d like to do now is release a patch within the next day or two fixing this bug and the unnecessary cast-away-const in the endian-checking functions (#6).
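To make the "quadratic displacement" idea concrete, here is a toy probe loop with a displacement limit (my own simplification with invented names, not Verstable's exact probing scheme). If no free bucket is found within the limit, the caller must grow the table, and because a bounded probe sequence visits only a limited set of buckets, a violation is theoretically possible at any load factor.

```c
#include <stdbool.h>
#include <stddef.h>

// Toy illustration of a bounded quadratic probe sequence. The limit of 211
// jumps matches the description above, but the layout and names are invented
// for this sketch.
#define DISPLACEMENT_LIMIT 211

static bool find_free_bucket(
    const unsigned char *occupied, // 1 = occupied, 0 = free.
    size_t bucket_count,           // Must be a power of two.
    size_t home,                   // The key's home bucket.
    size_t *out
)
{
    for( size_t jump = 0; jump <= DISPLACEMENT_LIMIT; ++jump )
    {
        size_t bucket = ( home + jump * jump ) & ( bucket_count - 1 );
        if( !occupied[ bucket ] )
        {
            *out = bucket;
            return true;
        }
    }

    return false; // Displacement limit exceeded: the caller must rehash into
                  // a larger table, which is exactly the code path at issue.
}
```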
OK, that's fair. I have to disable the clang-tidy checks for memcpy_s and friends because our build target is Alpine Linux, which uses musl libc, which does not support those. Sadly, I cannot use
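On the musl point: a portable way to handle the missing bounds-checked functions is to gate on the C11 Annex K feature-test macro, since musl never defines __STDC_LIB_EXT1__. A sketch, with a helper name of my own invention:

```c
// __STDC_WANT_LIB_EXT1__ must be defined before including <string.h> for the
// Annex K declarations to appear on libraries that provide them.
#define __STDC_WANT_LIB_EXT1__ 1
#include <stddef.h>
#include <string.h>

// Use the bounds-checked memcpy_s only where the C library advertises C11
// Annex K support; fall back to a manual bounds check plus plain memcpy
// elsewhere (musl, and in practice glibc too, take the fallback path).
static int checked_copy( void *dst, size_t dst_size, const void *src, size_t n )
{
#ifdef __STDC_LIB_EXT1__
    return (int)memcpy_s( dst, dst_size, src, n );
#else
    if( n > dst_size )
        return -1;

    memcpy( dst, src, n );
    return 0;
#endif
}
```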
I've uploaded a preliminary fix to the dev branch. It's slightly different to the one you proposed. I think there should either be a

```diff
 bool VT_CAT( NAME, _rehash )( NAME *table, size_t bucket_count )
 {
   // The attempt to resize the bucket array and rehash the keys must occur inside a loop that incrementally doubles the
   // target bucket count because a failure could theoretically occur at any load factor due to the displacement limit.
   while( true )
   {
     NAME new_table = {
       0,
       bucket_count - 1,
       NULL,
       NULL
       #ifdef CTX_TY
       , table->ctx
       #endif
     };

     void *allocation = MALLOC_FN(
       VT_CAT( NAME, _total_alloc_size )( &new_table )
       #ifdef CTX_TY
       , &new_table.ctx
       #endif
     );

     if( VT_UNLIKELY( !allocation ) )
       return false;

     new_table.buckets = (VT_CAT( NAME, _bucket ) *)allocation;
     new_table.metadata = (uint16_t *)( (unsigned char *)allocation + VT_CAT( NAME, _metadata_offset )( &new_table ) );
     memset( new_table.metadata, 0x00, ( bucket_count + 4 ) * sizeof( uint16_t ) );

     // Iteration stopper at the end of the actual metadata array (i.e. the first of the four excess metadata).
     new_table.metadata[ bucket_count ] = 0x01;

     for( size_t bucket = 0; bucket < VT_CAT( NAME, _bucket_count )( table ); ++bucket )
       if( table->metadata[ bucket ] != VT_EMPTY )
       {
         VT_CAT( NAME, _itr ) itr = VT_CAT( NAME, _insert_raw )(
           &new_table,
           table->buckets[ bucket ].key,
           #ifdef VAL_TY
           &table->buckets[ bucket ].val,
           #endif
           true,
           false
         );

         if( VT_UNLIKELY( VT_CAT( NAME, _is_end )( itr ) ) )
-        {
-          FREE_FN(
-            new_table.buckets,
-            VT_CAT( NAME, _total_alloc_size )( &new_table )
-            #ifdef CTX_TY
-            , &new_table.ctx
-            #endif
-          );
-
-          bucket_count *= 2;
-          continue;
-        }
+          break;
       }

+    // If a key could not be reinserted due to the displacement limit, double the bucket count and retry.
+    if( new_table.key_count < table->key_count )
+    {
+      FREE_FN(
+        new_table.buckets,
+        VT_CAT( NAME, _total_alloc_size )( &new_table )
+        #ifdef CTX_TY
+        , &new_table.ctx
+        #endif
+      );
+
+      bucket_count *= 2;
+      continue;
+    }
+
     if( table->buckets_mask )
       FREE_FN(
         table->buckets,
         VT_CAT( NAME, _total_alloc_size )( table )
         #ifdef CTX_TY
         , &table->ctx
         #endif
       );

     *table = new_table;
     return true;
   }
 }
```

The main issue that I'm having at the moment is that I haven't yet managed to craft a dataset and/or hash function that tests this code path (and therefore triggers the bug). What is needed is a dataset/hash function that causes a displacement limit violation when the bucket count is x but not when it is 2x (in which case the bug could presumably be triggered via

Also note that the dev branch now contains some other changes that have been pending for a while. Most are insignificant (e.g. corrections to typos in the code comments), but the default integer hash function (was murmur3) has been changed to a slightly weaker one (fast-hash) that proved to perform better in my benchmarks.
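For test purposes, one option might be a degenerate hash that sends every key to the same home bucket, so that insertions pile up on one probe sequence until the displacement limit is violated (a hypothetical sketch; constant_hash is my name, and it would be supplied via the library's hash-function template parameter):

```c
#include <stdint.h>

// Degenerate hash for adversarial testing: every key shares home bucket 0,
// so quadratic displacement grows with each insertion until the limit is
// exceeded and the rehash-retry path under discussion is exercised.
static uint64_t constant_hash( int key )
{
    (void)key;
    return 0;
}
```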
Change looks good to me! I can't run it through my linter to verify until next week though, as I'm on holiday.
Hope you're enjoying your break! I’ve just released v2.1.0, which includes the above fix. The new version is no longer generating the use-after-free warning in Clang-Tidy, as far as I can tell. I also finally managed to trigger the bug. For the sake of organization, I think I will close this issue soon and open a new one regarding proper Clang-Tidy support.
New issue regarding proper Clang-Tidy support: #11.
Reproducer: