[2020-resolver] Pip downloads lots of different versions of the same package #9922
Yup, can concur:
Here's the dockerfile for that build for reference: https://github.com/linuxserver/docker-sickchill/blob/master/Dockerfile
Here's the dependency map of the package above (sickchill): a dependency package (subliminal) listed […]
There's an entire section in pip's documentation about this. Please read https://pip.pypa.io/en/stable/user_guide/#dependency-resolution-backtracking. Since GitHub hid the closing notes on #8713, I'll link to them directly as well: #8713 (comment). You can scan the discussion above that, where we've described the specific tracking issues for the various things.
@pradyunsg thanks, but I think you misunderstood my message. I did read the docs and the thread you linked before I posted my last message here, and they don't really explain the issue I brought up. The issue is that I'm trying to install a single package, sickchill.

With the above info and the listings, as you can see in the build logs above, pip instead downloads all versions of pytz. Is that the intended behavior? If so, what is the reasoning? Because I can't come up with one; there is literally no reason I can think of for pip to download all those versions. Also, if a package lists just […]

I'm not trying to be difficult. It's just that this new behavior causes many other issues. If you look at the build log I linked above, you'll see that the downloads from PyPI get slower and slower, and eventually halt (perhaps throttled?). The build I linked to was cancelled by me after 4 hours, and the one before that was going on 15 hours (I thought the builder had crashed and cancelled that one), whereas the same build would complete with pip 21.0.1 in just 15 minutes on a slow arm32v7 device, including building py-cryptography with rust/cargo, as seen here: https://ci.linuxserver.io/blue/organizations/jenkins/Docker-Pipeline-Builders%2Fdocker-sickchill/detail/master/427/pipeline/124

I get that your group's official stance is […]

Thanks
Gah. Apologies. I should've spent more time on my comment, to be a bit more elaborate:
That's not the "official stance". Quoting myself from the specific comment I linked to already:
The reality is that there are costs to having multiple issues that effectively serve as a blanket for all kinds of weird things that the dependency resolver might do (that would result in aggressive backtracking). They're usually not used for anything except folks to say "me too!" on, especially if we've broken out the discussion to other places. It adds to maintenance overhead for this issue tracker and is not really useful from my PoV.

Your report is excellent, and significantly clearer than many of the ones we've received in the past, and it does seem to have very clear instructions on how to reproduce it. I've added it to my pile of existing excellent reports for where the resolver is just being stupid, and will likely test against it when we make improvements, to validate that they actually improve things.
It's trying to be exhaustive and, for some reason, the specific package structure that you have is making it backtrack on a bad choice for the requirement to backtrack on. Honestly, there are significantly better things pip's resolver could do, the easiest examples being "CDCL" or "tree pruning". While the resolver is operating on incomplete information and all that, it is also not remembering some of the useful bits of information that it could infer. The reality of it is that, well, it could be smarter and isn't. pip's maintainers know that.

OTOH, I'm gonna take 5 minutes now and write about the "costs" of reopening such blanket-scope issues: instead of sitting down and making progress toward actually fixing these shortfalls, I've spent well over an hour drafting this comment, to make sure that I'm not contradicting something we've said already, because you used the phrase "official stance" and now I feel the need to be careful about what I say. And this whole thing has already demotivated me enough that I won't be working on pip's resolver stuff this weekend; I'll likely go do something else with my free time instead.
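The "not remembering information it could infer" point can be sketched in a few lines. Everything in this toy model is invented (hypothetical packages "a" and "b" with ten versions each, where no version of "b" can ever satisfy the constraints — nothing here is pip's actual implementation): a naive exhaustive search re-proves the same conflict once per outer candidate, while a resolver that records the learned conflict ("no-good" recording, the idea behind CDCL-style solvers) proves it once and skips the dead subtrees.

```python
# Invented data: 10 versions each of hypothetical packages "a" and "b".
VERSIONS = {"a": [f"1.{i}" for i in range(10)],
            "b": [f"2.{i}" for i in range(10)]}

explored = {"naive": 0, "learning": 0}

def naive():
    # Re-checks every version of "b" under every choice of "a",
    # even though the outcome never depends on which "a" was picked.
    for _av in VERSIONS["a"]:
        for _bv in VERSIONS["b"]:
            explored["naive"] += 1   # each check = metadata examined

def learning():
    no_good = set()                  # requirements proven unsatisfiable
    for _av in VERSIONS["a"]:
        if "b" in no_good:
            continue                 # skip the whole dead subtree
        for _bv in VERSIONS["b"]:
            explored["learning"] += 1
        no_good.add("b")             # learned: no version of "b" works

naive()
learning()
print(explored)  # {'naive': 100, 'learning': 10}
```

The absolute numbers are meaningless; the point is the multiplicative blow-up a resolver pays when it forgets conflicts it has already proven.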
As a related point, I have a side project where I'm working on a program that, when given a set of requirements, generates a report of the dependency graph in a form that makes it easier to diagnose these issues. Of course, that involves reproducing a big chunk of pip's logic (luckily there are libraries for a reasonable proportion of this), and it's not guaranteed that it will be much quicker than just running pip itself (maybe even slower, since to get the full graph I can't prune the tree). Another diagnostic tool I want to write is something that runs resolvelib (the core of pip's resolver) on a dependency tree, to do quicker "offline analysis" of problems like this. But both of those projects take time to develop, and I only have limited free time, so progress is slow.

In your case, you've provided that info, thank you for that. For many reports, we don't have that level of detail. One point to note, though: you have sections in that report saying things like […]
This is over-simplified, in that twilio 6.56.0 could have different dependencies than 6.55.0, and pip has to check at that level of detail as well. Doing so is usually redundant (it is for twilio, I believe) but not always, and it's one reason that even a detailed analysis can miss a problem (suppose twilio 6.57.0 had a dependency bug causing a conflict: we could try a lot of options before concluding that 6.57.0 was a lost cause and backtracking to 6.56.0).
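The mechanism described above — per-version dependency metadata plus depth-first backtracking — can be sketched with a toy resolver. Everything in this model is invented for illustration (the version numbers and constraints are not real twilio/pytz metadata), and version strings are compared lexically as a simplification; each requirement is just a predicate on a version string.

```python
# Invented index: "twilio" 6.57.0 carries a pytz pin that conflicts
# with the top-level requirement, mimicking the hypothetical bug above.
INDEX = {
    "app":    {"1.0": [("twilio", lambda v: True),
                       ("pytz",   lambda v: v >= "2020.5")]},
    "twilio": {"6.57.0": [("pytz", lambda v: v == "2020.4")],
               "6.56.0": [("pytz", lambda v: True)]},
    "pytz":   {"2021.1": [], "2020.5": [], "2020.4": []},
}

downloads = []  # every candidate whose metadata had to be fetched

def resolve(todo, pinned):
    """Depth-first search over candidate versions, newest first."""
    if not todo:
        return pinned                      # all requirements satisfied
    (pkg, req), rest = todo[0], todo[1:]
    if pkg in pinned:                      # already chosen: just re-check
        return resolve(rest, pinned) if req(pinned[pkg]) else None
    for ver in sorted(INDEX[pkg], reverse=True):
        if not req(ver):
            continue
        downloads.append((pkg, ver))       # fetching metadata = a download
        found = resolve(rest + INDEX[pkg][ver], {**pinned, pkg: ver})
        if found is not None:
            return found
    return None                            # dead end: backtrack to caller

result = resolve(INDEX["app"]["1.0"], {"app": "1.0"})
print(result)     # twilio 6.57.0 is rejected, 6.56.0 chosen
print(downloads)  # note pytz 2021.1 appears twice
```

Note that pytz 2021.1 is fetched twice: the search revisits the same candidate under a different twilio pin. That is one way a resolver ends up handling the same package repeatedly (real pip has an HTTP cache, but the search itself still re-examines candidates).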
FWIW, I did some more digging on this, and the problem is that twilio requires […]. And yes, I don't know why specifically pip is downloading many copies of pytz. Desperation, probably 😉

The problem here is that if we have incompatible requirements, then pip will, given enough time, download every version of every package involved in the installation, simply to check that there isn't some combination with different requirements that is resolvable. So even though you and I know that there's no chance a different version of pytz will fix the problem, pip can't know that.

This is typical of most "pip doesn't finish" problems: there's a conflict that pip can't resolve, so how long do we keep trying before we give up? We can't give useful diagnostics if we don't try everything, so stopping quickly (which is something we're considering) will just mean we get more people complaining "pip didn't tell me what was wrong when it gave up".
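The give-up tradeoff described above can be sketched as follows. The candidate list is invented (it is not the real set of pytz releases); the point is only that an early stop trades a definite "unsatisfiable" verdict for speed, which is exactly the diagnostics problem mentioned.

```python
def search(candidates, budget=None):
    """Try candidates until exhausted, or until the budget runs out."""
    tried = 0
    for _cand in candidates:
        if budget is not None and tried >= budget:
            return ("gave up", tried)      # cannot name the real conflict
        tried += 1
        # in this pathological case, no candidate ever satisfies the pin
    return ("definitely unsatisfiable", tried)

# Hypothetical version list: 12 "years" x 4 "releases" = 48 candidates.
all_pytz = [f"20{y}.{m}" for y in range(10, 22) for m in range(1, 5)]
print(search(all_pytz))            # exhaustive: a definite answer
print(search(all_pytz, budget=5))  # early stop: fast but inconclusive
```

Only the exhaustive run can prove there is no working combination; the budgeted run is fast but can only say it stopped looking.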
Ah, that makes a lot of sense. I'll look into that. Thanks |
Description
Sorry to reopen issue #8713, but
after upgrading pip, our Docker image takes much longer to build, even with strictly pinned versions. It looks like a bug, not a feature. The workaround is to use
--use-deprecated=legacy-resolver
and downgrade the pip version.

Expected behavior
I'm expecting a lightweight version comparison, not an all-packages-against-all-packages comparison, which can take a very long time.
pip version
20.3.1
Python version
3.8
OS
Mac OS 10.15.7
How to Reproduce
Just install any of your existing projects using both pip versions, i.e. with and without
--use-deprecated=legacy-resolver
Output
No response