Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Staggered invert test cleanup + expanded staggered gtest support #1421

Merged
merged 61 commits into from
Jan 10, 2024

Conversation

weinbe2
Copy link
Contributor

@weinbe2 weinbe2 commented Dec 7, 2023

This PR is a massive cleanup and refactor of various staggered test executables: staggered_dslash_(c)test, staggered_invert_test, staggered_eigensolve_test, and hisq_stencil_test. Motivators:

  • Homogenize these different tests as much as possible (as is appropriate)
  • Include gtest support in all of them
  • Make the tests as stylistically similar to the Wilson versions (dslash_(c)test, invert_test, and eigensolve_test) as possible
  • Add support for testing the Laplace operator when it is enabled at compile time (-DQUDA_LAPLACE)

Some cleanup happened along the way:

  • BiCGStab cleanup:
  1. Switching it to explicitly use ColorSpinorField objects as opposed to pointers thereof
  2. More robustly improve the final convergence (in some cases the final solution may have been slightly above the tolerance, i.e., ~1.01e-6 when the requested tolerance was 1e-6). The addition of staggered_invert_test ctests uncovered this!
  • Move some of the is_[...]_[solve/solution/etc] routines out of the invert test helpers and into the more general host_utils files.

A few gotchas of note:

  • The default behavior of staggered_invert_test is now to solve the preconditioned operator with conjugate gradient, since that's arguably the most typical use case. The previous default was solving the full operator with bicgstab.
  • I removed the --test [#] flags from staggered_invert_test to more closely match the behavior of invert_test. I created a new flag --legacy-test-info that will print the appropriate command line arguments to reproduce each old test.

I did make a few changes in preparation of adding support for the "local" staggered operators, which can be used for Schwarz preconditioning. This includes copying and pasting some relevant bits of code from Wilson-type files into the staggered-type files, commenting the code out, and explicitly leaving an additional comment that this is for future reference. I know this is generally not a best practice, but hopefully it's understandable in this case.

As it stands, a staggered-only build does indeed pass ctest -v.

There are some small bits of outstanding work, but nothing I see as holding up PR review right now. They are:

  • Confirming ctest -v for Wilson-type fermions still works (this PR nearly exclusively effects staggered tests, but I did introduce a few abstractions that maybe went awry for Wilson)
  • Formally confirming deflation still works in staggered_invert_test
  • Formally confirming MG still works in staggered_invert_test
  • Checking workflows that use bicgstab
  • doxygen
  • clang-format

…merated test types, set new defaults, improved command line arg documentation
Copy link
Member

@maddyscientist maddyscientist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some comments based on visual code review.

Beyond that though, my main concern is how long the tests take to run, noting that the tests are failing to complete on CSCS. In particular:

  • Staggered eigensolves take much longer than their Wilson brethren. Anything to be done here?
  • HISQ stencil test looks like it's taking longer than expected: is there unnecessary setup being redone between partitions?

lib/spinor_dilute.in.cu Outdated Show resolved Hide resolved
tests/host_reference/staggered_dslash_reference.cpp Outdated Show resolved Hide resolved
tests/utils/set_params.cpp Show resolved Hide resolved
tests/utils/host_utils.h Show resolved Hide resolved
tests/staggered_invert_test.cpp Outdated Show resolved Hide resolved
tests/staggered_eigensolve_test_gtest.hpp Outdated Show resolved Hide resolved
lib/inv_bicgstab_quda.cpp Outdated Show resolved Hide resolved
@weinbe2
Copy link
Contributor Author

weinbe2 commented Jan 3, 2024

  • Staggered eigensolves take much longer than their Wilson brethren. Anything to be done here?
  • HISQ stencil test looks like it's taking longer than expected: is there unnecessary setup being redone between partitions?

Based on offline conversations, the time spent on staggered eigensolves has been addressed in a1303bd , plus some additional opportunities to speed up the unit tests run on CSCS in 3252b6c . The HISQ stencil test time was decided to be reasonable.

tests/CMakeLists.txt Outdated Show resolved Hide resolved
@weinbe2
Copy link
Contributor Author

weinbe2 commented Jan 4, 2024

@maddyscientist this should be good for a last review + clang-format

Got a few things to fix in BiCGstab with regards to MG (near-null vector generation and pre-smoothing)

Copy link
Member

@maddyscientist maddyscientist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this PR, subject to the HISQ stencil tests being merged into a single test (as opposed to separate test/ctest driver files #1430). Great work @weinbe2.

@cpviolator before we merge this, want to take a quick look at the Eigensolver changes made in this PR?

lib/eig_trlm.cpp Outdated Show resolved Hide resolved
Copy link
Member

@cpviolator cpviolator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice adjustment @weinbe2 ! Perfectly happy to see convergence and tolerance adherence improving.

@weinbe2 weinbe2 merged commit b87195b into develop Jan 10, 2024
12 checks passed
@weinbe2 weinbe2 deleted the feature/stag-cleanup branch January 10, 2024 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants