fix deprecated calls to scrapy.utils.request.request_fingerprint
#50
base: master
@@ -12,7 +12,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: [3.5, 3.6, 3.7, 3.8, 3.9]
+        python-version: ["3.7", "3.8", "3.9", "3.10", "3.11", "3.12", "3.13"]
Python 3.5 and 3.6 don't seem to work anymore, at least in my fork.
@@ -5,7 +5,7 @@

 from scrapy.http import Request
 from scrapy.item import Item
-from scrapy.utils.request import request_fingerprint
+from scrapy.utils.request import fingerprint
On Scrapy 2.7+, the right approach is to use Crawler.request_fingerprinter.fingerprint().
@Gallaecio wouldn't using this require instantiating a Crawler object and storing it as an instance variable?
If I am reading them right, the messages in Scrapy 2.11.2 (https://github.com/scrapy/scrapy/blob/e8cb5a03b382b98f2c8945355076390f708b918d/scrapy/utils/request.py#L86-L136) seem to suggest getting the crawler during instantiation with DeltaFetch.from_crawler.
But what if the DeltaFetch object isn't instantiated that way? What should happen in DeltaFetch._get_key?
DeltaFetch is a spider middleware, instantiated by Scrapy.
Most Scrapy components are instantiated with the create_instance (Scrapy 2.11-) / build_from_crawler (Scrapy 2.12+) functions, which call from_crawler if defined. Spider middlewares are definitely one of those components. So __init__ should never be called without first calling from_crawler.
It does happen in tests, which use self.mwcls() and would need to change to either use from_crawler or, better yet, use create_instance / build_from_crawler.
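A minimal sketch of the pattern described above, under loud assumptions: DeltaFetchSketch, StubFingerprinter and the SimpleNamespace objects are hypothetical stand-ins so the snippet runs without Scrapy installed, and the crawler attribute is assumed to be request_fingerprinter as documented for Scrapy 2.7+. It is not the plugin's actual code, only an illustration of receiving the fingerprinter through from_crawler and using it in _get_key.

```python
import hashlib
from types import SimpleNamespace


class StubFingerprinter:
    """Hypothetical stand-in for crawler.request_fingerprinter."""

    def fingerprint(self, request):
        # Illustration only: hash method and URL and return raw bytes.
        data = f"{request.method} {request.url}".encode()
        return hashlib.sha1(data).digest()


class DeltaFetchSketch:
    """Simplified middleware showing only how the key is built."""

    def __init__(self, fingerprinter):
        self.fingerprinter = fingerprinter

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls from_crawler when it builds the component, so the
        # fingerprinter can be taken from the crawler it passes in.
        return cls(crawler.request_fingerprinter)

    def _get_key(self, request):
        # Assumed behaviour: an explicit deltafetch_key in meta wins,
        # otherwise fall back to the request fingerprint.
        key = request.meta.get("deltafetch_key")
        if key is None:
            key = self.fingerprinter.fingerprint(request)
        return key.encode() if isinstance(key, str) else key


# Tests would build the middleware through from_crawler instead of mwcls().
crawler = SimpleNamespace(request_fingerprinter=StubFingerprinter())
mw = DeltaFetchSketch.from_crawler(crawler)
request = SimpleNamespace(method="GET", url="https://example.com/page", meta={})
key = mw._get_key(request)
```

One thing to watch when migrating: the removed request_fingerprint returned a hex string, while the fingerprinter returns bytes, so keys already stored by an older version may not match unless the new value is hex-encoded first.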
Currently this plugin is broken on Scrapy 2.12: scrapy.utils.request.request_fingerprint was removed in Scrapy 2.12 (https://docs.scrapy.org/en/2.12/news.html#deprecation-removals).
If accepted, a version bump would be helpful 🙂