Rework ExtractionFilter to adapt to boolean values #423
I would suggest also updating the default value for the only_complete attribute in the crawl() method of crawler.py
Hmm, what would be the reasoning behind this? Maybe I'm missing something, but the behavior of |
From what I see, the current default value is |
Yeah, that's right, but neither Update: Wait, maybe there is a misunderstanding here. |
OK, I think I see my mistake. So just to be sure: the expected behavior of |
No, |
OK, now I think I got the idea. I guess the name confused me a bit, perhaps we could change it to something along the lines of |
That's a good point. I agree, |
Not really, but maybe something like |
👍
With #422, 'free_access' is computed correctly. Because not all publishers set this information correctly, I felt the need to rework the extraction filter again.
Requires, skip_boolean: enables one to skip boolean values for evaluation.
Requires, RequiresAll: RequiresAll is a name wrap for Requires() and propagates the skip_bool parameter to the user.
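To make the intent of the change easier to follow, here is a minimal sketch of how a filter with a boolean-skipping option could behave. It is not the code from this PR: the `passes` method, the dict-based extraction, and the default `skip_boolean=True` are assumptions made for illustration, as is the choice to treat an empty `Requires()` as "check every attribute". Only the names `Requires`, `RequiresAll`, and `skip_boolean` come from the description above.

```python
from typing import Any, Dict


class Requires:
    """Hypothetical stand-in for the filter described in this PR."""

    def __init__(self, *required: str, skip_boolean: bool = True) -> None:
        self.required = required
        self.skip_boolean = skip_boolean

    def passes(self, extraction: Dict[str, Any]) -> bool:
        # Assumption: with no explicit names, every extracted attribute is checked.
        names = self.required or tuple(extraction)
        for name in names:
            value = extraction.get(name)
            # A boolean such as free_access=False is a real value, not a
            # missing one, so it can be excluded from the truthiness check.
            if self.skip_boolean and isinstance(value, bool):
                continue
            if not value:
                return False
        return True


class RequiresAll(Requires):
    """Name wrapper around Requires() that only forwards skip_boolean."""

    def __init__(self, skip_boolean: bool = True) -> None:
        super().__init__(skip_boolean=skip_boolean)


article = {"title": "Headline", "body": "Text", "free_access": False}
lenient = RequiresAll(skip_boolean=True).passes(article)
strict = RequiresAll(skip_boolean=False).passes(article)
```

In this sketch the article with `free_access` set to `False` is kept when booleans are skipped and rejected when they are evaluated like any other attribute, which appears to be the situation the rework is meant to address.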