Check the runtime version of PMIx #2

Merged
hppritcha merged 1 commit into open-mpi:master on Jul 11, 2024

Conversation

hppritcha
Member

It has been reported (and confirmed) that building against one version of PMIx and then running with another version will cause PRRTE to segfault. This isn't a universal rule. For example, one can switch v5.0 and master without a problem. However, switching v5.0 and v4.2 is a definite segfault.

The root cause of the problem is a change in the layout of the base pmix_object_t definition. This renders all PMIx objects binary incompatible when crossing between the v5 and v4 (and below) series.
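
To illustrate why a layout change in a base object breaks every derived object, here is a minimal sketch. The structs below are hypothetical stand-ins, not the real `pmix_object_t` definitions; the extra `allocator` field is only an assumed example of the kind of member a newer series might add.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old series" base object: class pointer plus refcount. */
struct base_old {
    void   *obj_class;
    int32_t refcount;
};

/* Hypothetical "new series" base object: one extra member (assumed). */
struct base_new {
    void   *obj_class;
    int32_t refcount;
    void   *allocator;
};

/* Derived objects embed the base as their first member, so every field
 * after it shifts when the base grows. */
struct item_old {
    struct base_old super;
    int value;
};

struct item_new {
    struct base_new super;
    int value;
};

/* Offset of the payload field as seen by code built against each layout. */
static size_t value_offset_old(void)
{
    return offsetof(struct item_old, value);
}

static size_t value_offset_new(void)
{
    return offsetof(struct item_new, value);
}
```

Code compiled against one layout reads `value` at one offset while the library built with the other layout stores it at a different one, which is the kind of mismatch that can lead to a segfault.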

Changing the v5 definition back to match v4 is an overly complex task. The changes were required to accommodate the new shared memory support that was introduced in v5.

So instead, we check the runtime version of PMIx against the build version. If the runtime version is incompatible with the build version, then we print an explanatory error message and error out.
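
A minimal sketch of such a check follows. It is not the code from this PR. In a real build the major version compiled against would come from `PMIX_VERSION_MAJOR` and the runtime string from `PMIx_Get_version()`; here both are passed in as parameters so the sketch stands alone. The helper names, the assumption that the first number in the runtime string is the major version, and the rule "compatible only when both versions sit on the same side of the v5 boundary" are all assumptions drawn from the description above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return the first integer found in a version
 * string (assumed to be the major version), or -1 if there is none. */
static int parse_major(const char *version)
{
    if (NULL == version) {
        return -1;
    }
    while ('\0' != *version && !isdigit((unsigned char) *version)) {
        version++;
    }
    if ('\0' == *version) {
        return -1;
    }
    return (int) strtol(version, NULL, 10);
}

/* Assumed rule: the object layout changed at v5, so two versions are
 * compatible only if both are >= 5 or both are < 5. */
static bool pmix_abi_compatible(int build_major, const char *runtime_version)
{
    int runtime_major = parse_major(runtime_version);
    if (runtime_major < 0) {
        return false; /* unparseable string: refuse rather than guess */
    }
    return (build_major >= 5) == (runtime_major >= 5);
}

/* Returns 0 when it is safe to continue, -1 after printing an
 * explanatory message when the versions are incompatible. */
static int check_pmix_version(int build_major, const char *runtime_version)
{
    assert(build_major > 0);
    if (!pmix_abi_compatible(build_major, runtime_version)) {
        fprintf(stderr,
                "PMIx version mismatch: built against major version %d, "
                "but the runtime library reports \"%s\".\n"
                "These versions are not binary compatible.\n",
                build_major,
                NULL == runtime_version ? "(null)" : runtime_version);
        return -1;
    }
    return 0;
}
```

A caller would run `check_pmix_version()` once at startup, before any PMIx object is constructed, and exit if it returns non-zero.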

Signed-off-by: Ralph Castain [email protected]

dd

Signed-off-by: Ralph Castain [email protected]
(cherry picked from commit d02ad07)

@hppritcha hppritcha requested a review from jsquyres June 21, 2024 21:34

Hello! The Git Commit Checker CI bot found a few problems with this PR:

0fa074e: Check the runtime version of PMIx

  • check_cherry_pick: contains a cherry pick message that refers to a commit that exists, but is in an as-yet unmerged pull request: d02ad07

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!


Hello! The Git Commit Checker CI bot found a few problems with this PR:

0ec08e3: Check the runtime version of PMIx

  • check_cherry_pick: contains a cherry pick message that refers to a commit that exists, but is in an as-yet unmerged pull request: d02ad07

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

@rhc54

rhc54 commented Jun 21, 2024

You probably need to add 2ac45f3 to disable the MacOS builds as Homebrew has broken them.

@jsquyres
Member

If PR openpmix#1987 is accepted upstream, it fixes the MacOS github action tests. We could pull the commit from that PR here.

@jsquyres
Member

@hppritcha The upstream PR was accepted: you can cherry-pick commit 4a682ef670d2582ffda2e990428446138f7c10d7


Hello! The Git Commit Checker CI bot found a few problems with this PR:

5033ba0: Check the runtime version of PMIx

  • check_cherry_pick: contains a cherry pick message that refers to a commit that exists, but is in an as-yet unmerged pull request: d02ad07

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

It has been reported (and confirmed) that building against
one version of PMIx and then running with another version
will cause PRRTE to segfault. This isn't a universal rule.
For example, one can switch v5.0 and master without a
problem. However, switching v5.0 and v4.2 is a definite
segfault.

The root cause of the problem is a change in the layout
of the base pmix_object_t definition. This renders all
PMIx objects binary incompatible when crossing between
the v5 and v4 (and below) series.

Changing the v5 definition back to match v4 is an
overly complex task. The changes were required to
accommodate the new shared memory support that
was introduced in v5.

So instead, we check the runtime version of PMIx against
the build version. If the runtime version is incompatible
with the build version, then we print an explanatory
error message and error out.

Signed-off-by: Ralph Castain <[email protected]>

bot:notacherrypick

dd

Signed-off-by: Ralph Castain <[email protected]>
@hppritcha
Member Author

@jsquyres ready for a "re" review

@hppritcha hppritcha merged commit 225468c into open-mpi:master Jul 11, 2024
11 checks passed