WIP: support JSM ecosystem on coral system #90
a035804
to
dee6f50
Compare
Codecov Report
@@ Coverage Diff @@
## main #90 +/- ##
=======================================
Coverage 78.13% 78.13%
=======================================
Files 12 12
Lines 1413 1413
=======================================
Hits 1104 1104
Misses 309 309
I dropped the WIP from the title. This addresses some problems that have nothing to do with coral. As for coral status: jsrun booting flux has been demonstrated, but we have yet to find a pmix server package that we can properly link with, or, it turns out, an hwloc library. Flux may end up being packaged as a module in /usr/tce on this system. At that point we'll see if any issues remain.
ccae174
to
9ef5c28
Compare
LGTM!
I've got an alternate solution to the shmem debacle working, so I'm going to split this PR into parts and resubmit. Sorry for the flailing around!
OK, the flux-pmix tests in the CI build against pmix 3.1.2 are failing the same way as noted in #85, e.g
It's hard to piece together what is going on in pmix/ompi land, but it seems like not linking the MCA DSOs against libpmix.so may have been an oversight in that old version. This issue describes the problem as one with static builds (not applicable here): openpmix/openpmix#1188. So maybe the installed 3.1.2 on lassen is just unusable?
Well, 3.1.2 may not even be in use on coral. It just happens to be the newest packaged version that includes the server headers. The version that jsm is built with is 3.1.4, so pushing out a new package for 3.1.4 may be the logical thing to do there. I'll try changing the minimum pmix version and the CI build to 3.1.4 and see how that goes.
Problem: some versions of pmix don't have a .pc file, and some packaged versions have a broken one.

Add a configure option, --with-pmix[=PREFIX]. If it is specified without a PREFIX, default system paths are assumed. If PREFIX is specified, PMIX_CFLAGS and PMIX_LIBS are set based on it. No checks are performed to ensure that PREFIX refers to a working pmix install of the minimum version.

This is intended as an override mechanism for exceptional situations. The default pkg-config method is preferred when it works.
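For anyone skimming, here is a rough Python sketch of the decision the commit message describes. It is not the actual configure code (that lives in the autoconf macros). The function name `resolve_pmix_flags` is hypothetical, and the `include`/`lib` subdirectories, the `-lpmix` link flag, and the "yes"/"no" values for a bare or negated option are assumptions based on common autoconf conventions, not on anything stated in this PR.

```python
from typing import Optional, Tuple


def resolve_pmix_flags(with_pmix: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (PMIX_CFLAGS, PMIX_LIBS), or None to fall back to pkg-config.

    Hypothetical helper: mirrors the behavior described for
    --with-pmix[=PREFIX]. No check is made that PREFIX is a working install.
    """
    # Option not given (or explicitly disabled): use the default pkg-config path.
    if with_pmix is None or with_pmix == "no":
        return None
    # Option given without a PREFIX: assume default system search paths.
    # ASSUMPTION: a bare option arrives as "yes", as autoconf usually does it.
    if with_pmix in ("", "yes"):
        return ("", "-lpmix")
    # PREFIX given: derive flags from it.
    # ASSUMPTION: headers under PREFIX/include, library under PREFIX/lib.
    prefix = with_pmix.rstrip("/")
    return (f"-I{prefix}/include", f"-L{prefix}/lib -lpmix")
```

With these assumptions, passing a prefix such as `/opt/pmix` yields flags pointing into that tree, a bare option yields only the link flag, and omitting the option defers to pkg-config.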
d0b630d
to
1b50093
Compare
Lots of failures in flux-pmix unit tests with 3.1.4. Sigh.
Rumor has it that this and modern flux are running on the sierra LSF/JSM system, so closing this old work.
This is a WIP to collect fixes needed to get flux-pmix working on the LLNL lassen system, as proposed in #85.