Discussion: Don't run integration tests in PRs #4768

Draft
wants to merge 1 commit into base: develop
Conversation

@kratman (Contributor) commented Jan 16, 2025

Description

This change would remove the integration tests from PRs, but leave them running in the scheduled nightly test runs.
While not done here, I think this is part of #3662: we should get alerts when integration tests fail in the scheduled runs, but not run them on each PR.

Reasoning:

  • Integration tests run for ~15 minutes per test job (240 total CI minutes per push)
  • I have rarely seen integration tests fail
  • Integration tests don't count towards coverage
  • Some of the integration tests do not even assert anything, so they are only looking for crashes
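
For concreteness, a minimal sketch of what the trigger change could look like (the workflow layout, cron time, and nox session name are illustrative, not this PR's actual diff):

```yaml
# Hypothetical standalone workflow: integration tests run nightly and on
# manual dispatch, but no longer on pull_request events.
name: Scheduled integration tests

on:
  schedule:
    - cron: "0 3 * * *" # nightly at 03:00 UTC
  workflow_dispatch:    # allow manual runs from the Actions tab

jobs:
  integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install nox
      # Substitute the project's actual integration-test session here.
      - run: nox -s tests
```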

Type of change

  • Optimization (back-end change that speeds up the code)

Key checklist:

  • No style issues: $ pre-commit run (or $ nox -s pre-commit) (see CONTRIBUTING.md for how to set this up to run automatically when committing locally, in just two lines of code)
  • All tests pass: $ python -m pytest (or $ nox -s tests)
  • The documentation builds: $ python -m pytest --doctest-plus src (or $ nox -s doctests)

You can run integration tests, unit tests, and doctests all at once using $ nox -s quick.

Further checks:

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

@kratman (Contributor, Author) commented Jan 16, 2025

I am interested in hearing the thoughts of the other @pybamm-team/maintainers on this topic. This is really a question of how much value the much longer tests are adding. I know that major interface changes will cause them to fail.

kratman marked this pull request as draft on January 16, 2025 at 17:00.
@kratman (Contributor, Author) commented Jan 16, 2025

Original runtime for some tests:
[screenshot: CI job timings before this change]

With this change:
[screenshot: CI job timings after this change]

@valentinsulzer (Member) commented

I've experienced a couple of cases in the past where the integration tests caught edge cases that weren't caught by the unit tests and then required fairly significant effort to fix. So the risk would be that such changes get through and then require major effort from maintainers to fix.

@kratman (Contributor, Author) commented Jan 16, 2025

> I've experienced a couple of cases in the past where the integration tests caught edge cases that weren't caught by the unit tests and then required fairly significant effort to fix. So the risk would be that such changes get through and then require major effort from maintainers to fix.

Yeah, that is why I think it is also important to have the scheduled run create issues on failures, so anything that does get through is caught quickly.

The other potential improvement could just be disabling them on draft PRs, but that still requires some discipline from devs about marking PRs as drafts.
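
For reference, skipping a job while a PR is still a draft is a one-line condition in GHA (the job layout here is a sketch, not this PR's diff):

```yaml
jobs:
  integration:
    # Run integration tests only once the PR is marked "ready for review".
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    steps:
      - run: echo "integration test steps go here"
```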

None of the options are perfect, but our CI does run for an extremely long time. I basically set up this PR for discussion, because I needed the timings.

kratman changed the title from "Don't run integration tests in PRs" to "Discussion: Don't run integration tests in PRs" on Jan 16, 2025.
@aabills (Contributor) commented Jan 16, 2025

I'd expect a whole lot of shape errors on develop if this were implemented because the unit tests don't test setting parameters etc. for various model combinations. Those are typically not easy to debug.

codecov bot commented Jan 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.66%. Comparing base (1ac422a) to head (dd7c0d9).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4768   +/-   ##
========================================
  Coverage    98.66%   98.66%           
========================================
  Files          303      303           
  Lines        23227    23227           
========================================
  Hits         22916    22916           
  Misses         311      311           


@aabills (Contributor) commented Jan 18, 2025

Just a thought: while I think we should run integration tests on PRs, we probably don't need to run the full matrix. I think most of the OS- and Python-version-specific issues will likely get caught in the unit tests.

@Saransh-cpp (Member) commented Jan 18, 2025

Adding onto @aabills' comment, we should only run the tests (ideally both unit and integration) on Ubuntu (with all the Python versions, since breaking deprecations are often introduced) because pybamm now has pure-Python wheels (platform-independent). The platform-dependent things should be (and are being) tested in pybammsolvers' CI.
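
A sketch of what the reduced matrix could look like (the Python versions and nox session are illustrative):

```yaml
jobs:
  integration:
    # Ubuntu only: the wheels are pure Python, so platform coverage
    # is delegated to the unit tests and to pybammsolvers' CI.
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Keep all supported Python versions to catch deprecations
        # introduced by new interpreter releases.
        python-version: ["3.9", "3.10", "3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install nox
      - run: nox -s tests
```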

@agriyakhetarpal (Member) commented

I would propose not reducing to Ubuntu-only testing for now, because platform-specific issues can still lurk outside of pybammsolvers: something as simple as differences in character encodings across platforms (which we have come across before), or issues coming from our dependencies, most of which are not pure Python (CasADi, SciPy, NumPy).

However, not running integration tests on PRs sounds good to me if we keep these points in mind:

  • that they should run as part of the nightly tests, with an issue opened if they fail (as suggested above)
  • that there should be a way for users to still run them on PRs if they want – this is quite easy to add with GHA in just ~3 lines of code

Most of our PRs are unlikely to break the integration tests, but the second point would ensure anyone can trigger them, by adding [integration] to the commit message, if they feel their changes could break things. Another option is to use labels, such as "run integration tests", which a workflow can respond to (the downside being that only collaborators can add labels).
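
As a sketch of the label-based variant (the label name and job layout are hypothetical):

```yaml
on:
  pull_request:
    # "labeled" is needed so that adding the label re-triggers the workflow.
    types: [opened, synchronize, reopened, labeled]

jobs:
  integration:
    # Run only when the "run integration tests" label is on the PR.
    if: contains(github.event.pull_request.labels.*.name, 'run integration tests')
    runs-on: ubuntu-latest
    steps:
      - run: echo "integration test steps go here"
```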

For point 1, I've used https://github.com/xarray-contrib/issue-from-pytest-log to create issues from failing CI jobs in other projects; we can use it here too.
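
For reference, wiring that action up looks roughly like this (an untested sketch based on its README; it consumes the report log written by pytest's pytest-reportlog plugin):

```yaml
steps:
  - run: python -m pip install pytest pytest-reportlog
  # Write a machine-readable log of test results for the action to parse.
  - run: python -m pytest --report-log pytest-log.jsonl
  # Open (or update) a GitHub issue whenever the job fails.
  - uses: xarray-contrib/issue-from-pytest-log@v1
    if: failure()
    with:
      log-path: pytest-log.jsonl
```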

@agriyakhetarpal (Member) commented

Or, we could request NumFOCUS to pay for larger runners 😛 They have some budgets allocated for this: https://github.com/numfocus/infrastructure/

@kratman (Contributor, Author) commented Jan 18, 2025

I think limiting the test matrix for integration tests sounds like a good compromise.

I know there are issues with our dependencies on Windows. I was running into some of them while trying to upgrade to Python 3.13 and NumPy 2.0 in the fall. So I think testing on all OSes is still a good idea, at least for the unit tests.

@valentinsulzer (Member) commented

Switching to IDAKLU as the solver should also speed up a lot of the integration tests.

@brosaplanella (Member) commented

I have also found some cases where integration tests caught bugs, so I find them useful. Also, in my experience, when an integration test fails it usually fails across platforms, so reducing the matrix sounds like a good option to me. Removing them from draft PRs also sounds good: even if only half of the PRs are marked as drafts, it significantly reduces the overall compute time.
