-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: Match only within relative path when indexer is validating #859
Conversation
Codecov Report
@@ Coverage Diff @@
## master #859 +/- ##
==========================================
- Coverage 86.29% 86.03% -0.27%
==========================================
Files 35 35
Lines 3954 3973 +19
Branches 1026 1031 +5
==========================================
+ Hits 3412 3418 +6
- Misses 353 363 +10
- Partials 189 192 +3
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks sensible. Need to look at this CI failure...
It'd be nice to write some unit tests, but I'm not sure I have the time :S |
What about: def test__check_path_matches_patterns():
hidden_ds = "/home/user/.cache/datasets/myds"
failing_path = f"{hidden_ds}/myfile"
sensible_exclusions = [re.compile(r"/.*")]
assert _check_path_matches_patterns(failing_path, sensible_exclusions) is False
assert _check_path_matches_patterns(failing_path, sensible_exclusions, root=hidden_ds) is True |
Also please rebase on master to fix tests. |
606c091
to
13ab11c
Compare
Seems to be breaking expectations. |
bids/layout/index.py
Outdated
if isinstance(patt, (str, Path)): | ||
if str(patt) in path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@effigies I hope I did not misinterpreted these two lines.
Here is the string matching pattern:
- Shouldn't we allow strings, not just Paths?
- Shouldn't we just check the presence of the substring, rather than the exact match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally, the changes were much more involved - probably this doesn't have much relevance now.
Okay, I seem to have opened a can of worms. The implementation of path matching and validation was pretty brittle, almost tailored to the few tests we have here. Although I haven't done an exhaustive revision of the truth table of possible combinations of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't had time to test this out myself but looks reasonable.
Taking a step back force_index
seems like a bit a mistake of the API, as it complicates things quite a bit. I also don't like how rules for what is indexed or excluded by default is in the code, rather than driven the BIDS schema-- but that will be a future effort.
Either way those comments don't have bearing on this PR and I think in the meantime this is good, and I'm glad we have more coverage on this.
I totally second this thought 👍 |
I've been thinking about @adelavega's feelings against That should go in a different PR, though. @effigies care to give this a final check? |
@oesteban about such an API breaking change, you should also be aware about #863 We will be eventually replacing At that point, we will allow a fair amount of API breaking changes so that would be a good time for that. The plan is to also having the current |
For now, it could be useful to:
|
That works for me |
Will merge if tests pass. Thanks for your patience. |
Make sure bids-standard/pybids#859 is available.
One major problem of the current behavior is, if one wants to exclude dot-folders and dot-files then a pattern like
re.compile(r"/\.")
will also match patterns above the root folder. E.g., the path/home/username/.cache/dataset/sub-01
would match, as well as/home/username/.cache/dataset/.datalad
. However, we only the latter to be filtered out.Resolves: #858.