Scarliles/defuse partitioner #70

Draft: wants to merge 10 commits into base: submodulev3

SamuelCarliles3
Collaborator

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Defuses Partitioner to prevent the viral spread of concrete implementations, one per Partitioner subtype, into classes that hold a concrete instance.

Any other comments?

asv benchmarks run fine in my Linux dev VM but fail on setup_cache on my M2 MacBook.


github-actions bot commented Jul 6, 2024

❌ Linting issues

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here


ruff

ruff detected issues. Please run ruff check --fix --output-format=full . locally, fix the remaining issues, and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.5.1.


examples/linear_model/plot_tweedie_regression_insurance_claims.py:82:35: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
   |
81 |     # unquote string fields
82 |     for column_name in df.columns[df.dtypes.values == object]:
   |                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
83 |         df[column_name] = df[column_name].str.strip("'")
84 |     return df.iloc[:n_samples]
   |

sklearn/cluster/_optics.py:327:12: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
    |
325 |         """
326 |         dtype = bool if self.metric in PAIRWISE_BOOLEAN_FUNCTIONS else float
327 |         if dtype == bool and X.dtype != bool:
    |            ^^^^^^^^^^^^^ E721
328 |             msg = (
329 |                 "Data will be converted to boolean for"
    |

sklearn/cluster/tests/test_dbscan.py:294:12: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
    |
292 |     obj = DBSCAN()
293 |     s = pickle.dumps(obj)
294 |     assert type(pickle.loads(s)) == obj.__class__
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
    |

sklearn/linear_model/tests/test_ridge.py:1023:12: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
     |
1022 |     assert len(ridge_cv.coef_.shape) == 1
1023 |     assert type(ridge_cv.intercept_) == np.float64
     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
1024 | 
1025 |     cv = KFold(5)
     |

sklearn/linear_model/tests/test_ridge.py:1031:12: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
     |
1030 |     assert len(ridge_cv.coef_.shape) == 1
1031 |     assert type(ridge_cv.intercept_) == np.float64
     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
     |

sklearn/metrics/pairwise.py:2364:12: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
     |
2362 |         dtype = bool if metric in PAIRWISE_BOOLEAN_FUNCTIONS else "infer_float"
2363 | 
2364 |         if dtype == bool and (X.dtype != bool or (Y is not None and Y.dtype != bool)):
     |            ^^^^^^^^^^^^^ E721
2365 |             msg = "Data was converted to boolean for metric %s" % metric
2366 |             warnings.warn(msg, DataConversionWarning)
     |

sklearn/model_selection/_search.py:1100:24: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
     |
1098 |                 arr_dtype = np.dtype(object)
1099 |             else:
1100 |                 if any(np.min_scalar_type(x) == object for x in param_list):
     |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
1101 |                     # `np.result_type` might get thrown off by `.dtype` properties
1102 |                     # (which some estimators have).
     |

sklearn/model_selection/_search.py:1107:52: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
     |
1105 |                     # https://github.com/scikit-learn/scikit-learn/issues/29157
1106 |                     arr_dtype = np.dtype(object)
1107 |             if len(param_list) == n_candidates and arr_dtype != object:
     |                                                    ^^^^^^^^^^^^^^^^^^^ E721
1108 |                 # Exclude `object` else the numpy constructor might infer a list of
1109 |                 # tuples to be a 2d array.
     |

sklearn/model_selection/_split.py:2899:27: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
     |
2897 |                 if value is None and hasattr(self, "cvargs"):
2898 |                     value = self.cvargs.get(key, None)
2899 |             if len(w) and w[0].category == FutureWarning:
     |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
2900 |                 # if the parameter is deprecated, don't show it
2901 |                 continue
     |

sklearn/model_selection/tests/test_validation.py:589:20: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
    |
588 |             # Make sure all the arrays are of np.ndarray type
589 |             assert type(cv_results["test_r2"]) == np.ndarray
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
590 |             assert type(cv_results["test_neg_mean_squared_error"]) == np.ndarray
591 |             assert type(cv_results["fit_time"]) == np.ndarray
    |

sklearn/model_selection/tests/test_validation.py:590:20: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
    |
588 |             # Make sure all the arrays are of np.ndarray type
589 |             assert type(cv_results["test_r2"]) == np.ndarray
590 |             assert type(cv_results["test_neg_mean_squared_error"]) == np.ndarray
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
591 |             assert type(cv_results["fit_time"]) == np.ndarray
592 |             assert type(cv_results["score_time"]) == np.ndarray
    |

sklearn/model_selection/tests/test_validation.py:591:20: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
    |
589 |             assert type(cv_results["test_r2"]) == np.ndarray
590 |             assert type(cv_results["test_neg_mean_squared_error"]) == np.ndarray
591 |             assert type(cv_results["fit_time"]) == np.ndarray
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
592 |             assert type(cv_results["score_time"]) == np.ndarray
    |

sklearn/model_selection/tests/test_validation.py:592:20: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
    |
590 |             assert type(cv_results["test_neg_mean_squared_error"]) == np.ndarray
591 |             assert type(cv_results["fit_time"]) == np.ndarray
592 |             assert type(cv_results["score_time"]) == np.ndarray
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
593 | 
594 |             # Ensure all the times are within sane limits
    |

sklearn/utils/estimator_checks.py:1509:8: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
     |
1508 |     # func can output tuple (e.g. score_samples)
1509 |     if type(result_full) == tuple:
     |        ^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
1510 |         result_full = result_full[0]
1511 |         result_by_batch = list(map(lambda x: x[0], result_by_batch))
     |

sklearn/utils/tests/test_validation.py:1343:12: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
     |
1341 |         )
1342 |     assert str(raised_error.value) == str(err_msg)
1343 |     assert type(raised_error.value) == type(err_msg)
     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ E721
     |

sklearn/utils/validation.py:874:49: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
    |
872 |         if all(isinstance(dtype_iter, np.dtype) for dtype_iter in dtypes_orig):
873 |             dtype_orig = np.result_type(*dtypes_orig)
874 |         elif pandas_requires_conversion and any(d == object for d in dtypes_orig):
    |                                                 ^^^^^^^^^^^ E721
875 |             # Force object if any of the dtypes is an object
876 |             dtype_orig = object
    |

Found 16 errors.
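
Several of the E721 findings above compare a class with `==` (for example `type(result_full) == tuple` or `w[0].category == FutureWarning`). Below is a minimal sketch of the identity-based form ruff asks for, using standard-library stand-ins rather than the flagged scikit-learn code; the helper names `first_if_tuple`, `roundtrip_keeps_class`, and `is_future_warning` are hypothetical. The findings that compare a NumPy dtype with `object` or `bool` are a different case: `np.dtype(bool) == bool` is true while `np.dtype(bool) is bool` is not, so those are left out here and would need a separate decision.

```python
import pickle
import warnings


def first_if_tuple(result):
    # E721-clean form of `if type(result) == tuple:`.
    # Exact type match on purpose: tuple subclasses are not unpacked.
    if type(result) is tuple:
        return result[0]
    return result


def roundtrip_keeps_class(obj):
    # E721-clean form of `type(pickle.loads(s)) == obj.__class__`.
    restored = pickle.loads(pickle.dumps(obj))
    return type(restored) is obj.__class__


def is_future_warning(record):
    # E721-clean form of `record.category == FutureWarning`.
    return record.category is FutureWarning


# Record one warning so the category check has something to inspect.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn("deprecated parameter", FutureWarning)
```

Where subclasses should also match, `isinstance(result, tuple)` or `issubclass(record.category, FutureWarning)` is the looser alternative; which one is right depends on what each flagged line is meant to assert.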

cython-lint

cython-lint detected issues. Please fix them locally and push the changes. Here you can see the detected issues. Note that the installed cython-lint version is cython-lint=0.16.2.


/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pxd:13:1: E265 block comment should start with '# '
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pxd:71:90: W291 trailing whitespace
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:9:40: 'swap' imported but unused
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:16:1: W293 blank line contains whitespace
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:36:1: W293 blank line contains whitespace
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:539:37: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:540:37: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:541:37: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:542:37: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:543:37: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:544:37: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:550:41: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:551:41: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:552:41: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:553:41: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_partitioner.pyx:554:41: E127 continuation line over-indented for visual indent
/home/runner/work/scikit-learn/scikit-learn/sklearn/tree/_sort.pxd:13:5: E128 continuation line under-indented for visual indent
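
The W291 and W293 findings above are mechanical: both go away once trailing spaces and tabs are removed from each line. A rough sketch of that cleanup follows, assuming `\n` line endings; the function name `strip_trailing_whitespace` is hypothetical and is not part of cython-lint or the repository tooling. The E127/E128 indentation findings and the unused `swap` import are not covered and need manual edits.

```python
def strip_trailing_whitespace(source: str) -> str:
    """Remove trailing spaces and tabs from every line, keeping line breaks."""
    # Splitting on "\n" (not splitlines) preserves a final trailing newline.
    cleaned = [line.rstrip(" \t") for line in source.split("\n")]
    return "\n".join(cleaned)
```

Applied to the text of a `.pyx` or `.pxd` file, this turns whitespace-only lines into empty lines and leaves everything else untouched.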

Generated for commit: 09a8ec5. Link to the linter CI: here
