-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FBref] Handle missing Scores & Fixtures page on the Big 5 European Leagues Stats page #284
[FBref] Handle missing Scores & Fixtures page on the Big 5 European Leagues Stats page #284
Conversation
EDIT: found the bug causing the failures. Pushing a fix in a bit. |
…ined" is not present
@probberechts is there a way we can get this and #286 merged and released any time soon? I'd really appreciate that as those bugs are blocking me in my personal project :) |
Hi @lorenzodb1 I'll have to take a closer look at these and I'll try to make some time during the weekend. What's blocking me from merging them immediately is:
|
|
For dealing with the missing Scores & Fixtures on the Big 5 European Leagues Stats page, I would be ok with two alternative approaches:
|
This was my first approach to fixing that issue. I changed it to the current implementation because it would mean adding the same logic of splitting the Big5 leagues in both methods. Since Out of curiosity, what's the reason behind keeping both |
Unless I am missing something, the only thing that would have to be changed in
These contain relevant data as well, such as the season's champion and top scorer. Additionaly, making them private would be a breaking change. |
If the filter in L196 can be dropped then that'd definitely be a better solution.
I see. Maybe it'd be worth adding them to the API docs then. |
Yes, that makes sense. I've added them in #307. |
@probberechts I applied your suggestion to these changes. I couldn't drop the |
Codecov ReportPatch coverage:
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. Additional details and impacted files@@ Coverage Diff @@
## master #284 +/- ##
==========================================
+ Coverage 61.28% 61.40% +0.12%
==========================================
Files 10 10
Lines 1431 1433 +2
Branches 290 292 +2
==========================================
+ Hits 877 880 +3
Misses 508 508
+ Partials 46 45 -1
☔ View full report in Codecov by Sentry. |
The current implementation would be a breaking change and it still does not work as I would expect. Now it would scrape the data from the "Big 5 European Leagues Combined" page by default. The code below would combine the top-5 leagues, but it should return 5 leagues: fbref = sd.FBref(["ENG-Premier League", "FRA-Ligue 1", "ITA-Serie A", "ESP-La Liga", "GER-Bundesliga"], "2223")
fbref.read_leagues() |
@probberechts I ended up changing it to |
Looks good now. Thanks for all your work on improving the FBref scraper! |
[FBref] Non-data rows in the table body should be removed #277). I added the logic to remove those replicated headers when found.