-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TEST-#5903: Add third party unit tests for interoperability #5906
base: master
Are you sure you want to change the base?
TEST-#5903: Add third party unit tests for interoperability #5906
Conversation
Signed-off-by: Labanya Mukhopadhyay <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
Signed-off-by: Labanya Mukhopadhyay <[email protected]>
@labanyamukhopadhyay thanks for the contribution! Could you tell a little more about these tests. Do I understand correctly that these are adapted tests from the libraries themselves, originally written for pandas? Are there any plans to add these tests to the appropriate libraries in the future? In general, what is the support plan? :) |
@anmyachev Yes these are unit tests from the third party libraries that were adapted for Modin dataframes instead of pandas! So these are already in the respective libraries (for ex https://github.com/matplotlib/matplotlib/blob/main/lib/matplotlib/tests/test_axes.py). As Modin becomes more interoperable, I believe we can add to the tests/ remove the pytest skip decorators accordingly. But the currently passing tests should all continue to pass for future Modin versions as a sanity check. |
@labanyamukhopadhyay thanks for the answer. @modin-project/modin-core any objection to do this? |
@labanyamukhopadhyay, do I understand correctly that the tests were just copied from those libraries? Or there were some changes being made in order to adapt them to modin? I they were no changes made to the tests, then maybe there's no much sense to duplicate the tests in the modin's repo? We could have run those tests directly from the third-party test suites if we would need to. |
@dchigarev The tests were changed slightly to be used with Modin dataframes instead of pandas so they are not exactly the same as the ones in the libraries' test suites. |
Signed-off-by: Labanya Mukhopadhyay <[email protected]>
Signed-off-by: Labanya Mukhopadhyay <[email protected]>
So I'm wondering whether we should make this a separate repository (maybe a sub-repo of the modin itself) as the tests seem not to be related to the modin internals but rather to the libraries that take modin dataframes. Also, what's the estimated time to run them all? I'm wondering whether our current CI is capable to do it. cc @modin-project/modin-core , need your opinions |
I'm also wondering who is going to be responsible for maintaining the tests (if they have to be updated due to dependencies or functionality changes or if they suddenly started to fail for unknown reasons, who will be in charge to investigate this?). If this will be a separate repo it could have its own list of maintainers that may include some devs from modin side as well as other interested parties (maybe developers of those third-party libraries). |
+1 for creating sub-repo |
@modin-project/modin-core Is there any objection if I start creating a separate repository for such tests? (under |
I like the idea of a separate repo. We can take https://github.com/data-apis/dataframe-interchange-tests as a reference BTW. |
What do these changes do?
Adds unit tests from third party libraries - Matplotlib, Plotly, Scikit-learn, and XGBoost to test interoperability with Modin dataframe.
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst
is up-to-date