Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ifpack2: BlockTriDi fix for large blocks #13792

Merged
merged 2 commits into from
Feb 7, 2025

Conversation

brian-kelley
Copy link
Contributor

@brian-kelley brian-kelley commented Feb 6, 2025

In BlockTriDi's ExtractAndFactorize kernels that use scratch, fall back to level 1 when there isn't sufficient level 0.

@trilinos/ifpack2

Motivation

Ifpack2's block TriDi and block Jacobi should work for block sizes to and including 32. For relatively large block sizes like 27, many GPUs didn't have enough shared memory per block for the numeric setup (extract+factorize). When there isn't enough shared, this PR falls back to level 1 scratch.

The level 0 and level 1 cases are instantiated separately to make sure there is no performance hit to the original level 0 case.

Related Issues

No issue for this, was reported by email

Stakeholder Feedback

Issue reported directly by @jewatkins for SPARC.

Testing

Added testing with the maximum block size of 32. Verified that the level 1 scratch code path is taken on nvidia and amd gpus in this test.

In ExtractAndFactorize kernels that use scratch,
fall back to level 1 when there isn't sufficient level 0.

Signed-off-by: Brian Kelley <[email protected]>
Use max block size of 32 and a small grid. Test
standard BTD, BTD with Schur line splitting, and Block Jacobi.

Signed-off-by: Brian Kelley <[email protected]>
@brian-kelley brian-kelley added type: bug The primary issue is a bug in Trilinos code or tests pkg: Ifpack2 client: SPARC Issues related to or needed more specifically by the ATDM SPARC code labels Feb 6, 2025
@brian-kelley brian-kelley self-assigned this Feb 6, 2025
@brian-kelley brian-kelley requested a review from a team as a code owner February 6, 2025 16:41
@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 1088
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_gcc

  • Build Num: 1138
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
BLOCKING_BUILD false
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_gcc-openmpi_debug

  • Build Num: 1139
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_clang

  • Build Num: 1137
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-clang-11.0.1-openmpi-4.0.5-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_cuda

  • Build Num: 1136
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8-gpu
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_intel

  • Build Num: 1057
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-intel-2021.3-sems-openmpi-4.1.6_release-debug_shared_no-kokkos-arch_no-asan_no-complex_fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_cuda-uvm

  • Build Num: 1136
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Using Repos:

Repo: TRILINOS (brian-kelley/Trilinos)
  • Branch: FixLargeBlocks2
  • SHA: 0e724f4
  • Mode: TEST_REPO

Pull Request Author: brian-kelley

@jewatkins
Copy link
Contributor

lgtm, thanks! I'll give it a try today.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: PR_gcc-openmpi-openmp

  • Build Num: 1088
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_gcc

  • Build Num: 1138
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
BLOCKING_BUILD false
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_gcc-openmpi_debug

  • Build Num: 1139
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-gnu-8.5.0-openmpi-4.1.6-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_clang

  • Build Num: 1137
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-clang-11.0.1-openmpi-4.0.5-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_cuda

  • Build Num: 1136
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8-gpu
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_intel

  • Build Num: 1057
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-intel-2021.3-sems-openmpi-4.1.6_release-debug_shared_no-kokkos-arch_no-asan_no-complex_fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4

Build Information

Test Name: PR_cuda-uvm

  • Build Num: 1136
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel8_sems-cuda-11.4.2-gnu-10.1.0-openmpi-4.1.6_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables
PR_LABELS type: bug;pkg: Ifpack2;client: SPARC
PULLREQUESTNUM 13792
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL rhel8
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA 0e724f4
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA ff721c4


CDash Test Results for PR# 13792.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO REVIEWS HAVE BEEN PERFORMED ON THIS PULL REQUEST!

@trilinos-autotester
Copy link
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

Copy link
Contributor

@jewatkins jewatkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confirmed issue is fixed, thanks!

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ jewatkins ]!

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@brian-kelley brian-kelley merged commit 95b49d7 into trilinos:develop Feb 7, 2025
17 checks passed
@brian-kelley brian-kelley deleted the FixLargeBlocks2 branch February 7, 2025 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client: SPARC Issues related to or needed more specifically by the ATDM SPARC code pkg: Ifpack2 type: bug The primary issue is a bug in Trilinos code or tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants