Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with negative indexes #13

Open
MatthewCaseres opened this issue Mar 28, 2024 · 4 comments
Open

Issue with negative indexes #13

MatthewCaseres opened this issue Mar 28, 2024 · 4 comments

Comments

@MatthewCaseres
Copy link
Collaborator

From another issue

I've found an issue with the indexing in Tables - if you have multiple keys, an integer key going from 18-90 (say), and you look up 2, then you can get a false positive - it will find an earlier index. (2 would find the value for 90-16 for the prior key)

possible solution:
Can resolve by bounds checking. Possibly use optional bounds checking, like Table(safe=False) and I think it is best that safe=True by default. Safe=True performs any important validations that take time and safe=False doesn't?

@lewisfogden
Copy link
Owner

Agree - I almost think rectify() should be automatic, and bounds checking on by default.

need to test this:

if safe:
    if not np.all((min_value <= key) & (key <= max_value)):
       raise KeyError('blah')

Some options analysed here.
https://stackoverflow.com/questions/10542240/easy-way-to-test-if-each-element-in-an-numpy-array-lies-between-two-values

@lewisfogden
Copy link
Owner

Added in 923009c checks suggest a bit of a performance hit (can be disabled if you are confident in your data), I've also added rectify as an option.

@MatthewCaseres
Copy link
Collaborator Author

np.take with mode clip will prevent the wraparound. Could that be useful?

https://numpy.org/doc/stable/reference/generated/numpy.take.html

@lewisfogden
Copy link
Owner

Not sure how I would use it (some of the np docs can't be a bit cryptic for me!). Does that do the same thing as np.clip? Wondering here whether we should just always clip (i.e. bound), could always add a min - 1 index and max + 1 index both with values np.nan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants