Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MNT] Raise version bound for scikit-learn 1.6 #2486

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

MatthewMiddlehurst
Copy link
Member

What does this implement/fix? Explain your changes.

Raises the sklearn version bound and fixes broken tests.

This version introduces a new tags system used by sklearn. We have our own tag system, but I have integrated this somewhat so sklearn functions which use tags their should still work as intended.

The classification, regression and clustering base now extend the respective sklearn mixin. Just inheriting these is more maintainable than implementing our own methods for their tags and keeping up with changes to methods which use them IMO ( i.e. one of the breakages was is_classifier which now uses these tags instead of self._estimator_type).

Any other comments?

Currently, there are some expected results failures due to a bug with trees, see scikit-learn/scikit-learn#30557. I probably won't merge this until 1.6.1 so we dont have to swap this again unless someone really needs it.

@aeon-actions-bot aeon-actions-bot bot added classification Classification package clustering Clustering package maintenance Continuous integration, unit testing & package distribution regression Regression package labels Jan 9, 2025
@aeon-actions-bot
Copy link
Contributor

Thank you for contributing to aeon

I have added the following labels to this PR based on the title: [ $\color{#EC843A}{\textsf{maintenance}}$ ].
I have added the following labels to this PR based on the changes made: [ $\color{#BCAE15}{\textsf{classification}}$, $\color{#4011F3}{\textsf{clustering}}$, $\color{#7E0206}{\textsf{regression}}$ ]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Push an empty commit to re-run CI checks

@MatthewMiddlehurst
Copy link
Member Author

@baraline I think other than expected results the only thing is the shapelet visualiser i.e. https://github.com/aeon-toolkit/aeon/actions/runs/12699462793/job/35400155079?pr=2486#step:8:925, not sure what would cause this. Can take a look if you dont have time but could just be a small thing.

@baraline
Copy link
Member

Will check during lunch time, doesnt look to bad

@baraline
Copy link
Member

@MatthewMiddlehurst I don't seem to be able to reproduce the bug locally somehow? I don't see how the changes could cause it. I'm using

    python: 3.10.13 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:24:38) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\antoine\anaconda3\envs\aeon_dev\python.exe
   machine: Windows-10-10.0.22631-SP0
Python dependencies:
         aeon: 1.0.0
          pip: 23.3.1
   setuptools: 68.0.0
 scikit-learn: 1.2.2
        numpy: 1.24.4
        numba: 0.58.1
        scipy: 1.11.4
       pandas: 2.0.3

@MatthewMiddlehurst
Copy link
Member Author

@baraline need scikit-learn 1.6.0

@baraline
Copy link
Member

baraline commented Jan 11, 2025

@MatthewMiddlehurst The issue is that the coef_ argument of RidgeClassifierCV is of shape (n_features) instead of shape (n_targets,n_features) as it should be (it's (1, n_features) according to docs in binary case) after fitting any shapelet classifier using it as classifier, so this could be an sklearn bug ? A simple fix on our side would be to check for at least 2D array on line 716 of _shapelets.py visualization file

TonyBagnall
TonyBagnall previously approved these changes Jan 13, 2025
Copy link
Contributor

@TonyBagnall TonyBagnall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@baraline baraline dismissed stale reviews from TonyBagnall and SebastianSchmidl via f1e6485 February 3, 2025 21:11
@baraline baraline self-requested a review as a code owner February 3, 2025 21:11
@baraline
Copy link
Member

baraline commented Feb 3, 2025

Fixed the shapelet visualisation part (nvm some notebook failure still happens), but it seems there is other failures with expected results for multiple estimators, is 1.6 supposed to change any behaviour ? I don't see any change on highlights or docs between 1.5 and 1.6.

Edit : for some linear models, the coef_ argument changed in 1.6, see scikit-learn/scikit-learn@2f8f9f3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
classification Classification package clustering Clustering package maintenance Continuous integration, unit testing & package distribution regression Regression package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants