WeeklyTelcon_20200707
- Dialup Info: (Do not post to public mailing list or public wiki)
- Jeff Squyres (Cisco)
- Artem Polyakov (nVidia/Mellanox)
- Aurelien Bouteiller (UTK)
- Austen Lauria (IBM)
- Barrett, Brian (AWS)
- Brendan Cunningham (Intel)
- Christoph Niethammer (HLRS)
- Edgar Gabriel (UH)
- Geoffrey Paulsen (IBM)
- George Bosilca (UTK)
- Howard Pritchard (LANL)
- Joseph Schuchart
- Josh Hursey (IBM)
- Joshua Ladd (nVidia/Mellanox)
- Matthew Dosanjh (Sandia)
- Noah Evans (Sandia)
- Ralph Castain (Intel)
- Naughton III, Thomas (ORNL)
- Todd Kordenbrock (Sandia)
- Tomislav Janjusic
- William Zhang (AWS)
- Akshay Venkatesh (NVIDIA)
- Brandon Yates (Intel)
- Charles Shereda (LLNL)
- David Bernhold (ORNL)
- Erik Zeiske
- Geoffroy Vallee (ARM)
- Harumi Kuno (HPE)
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Michael Heinz (Intel)
- Nathan Hjelm (Google)
- Scott Breyer (Sandia?)
- Shintaro Iwasaki
- Xin Zhao (nVidia/Mellanox)
- mohan (AWS)
- Sessions is now in.
- Partitioned communication voted in.
- OPAL TSD create - takes a thread-local storage key that is tracked locally in OPAL.
- But when we go to delete it, it's not being deleted.
- But we want the flexibility to destroy keys automatically or explicitly.
- George thinks that in the mode we have today, all keys are tracked and released by the main thread.
- George thinks Artem's approach is the correct approach.
- Would have to change the way that keys are USED, and different components use them in different ways.
- Something similar should be done in different places.
- If you do it just for UCX, then others can see how you did it and check their own code.
- So we think the current PR is good, but it leaves both the old API and the new API.
- But it might be better to remove the OLD way and make broken components do SOMETHING to update their code.
- Should be easy for components to add explicit cleanup calls (see the sketch below).
- Master branch only.
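
A minimal sketch of the explicit-cleanup pattern discussed above, written with plain pthreads; the names here are illustrative and are not the actual OPAL TSD API:

```c
/* Illustrative sketch (plain pthreads, not the OPAL TSD API): a component
 * creates its own thread-specific key and explicitly deletes it during its
 * own cleanup, instead of relying on the main thread to track and release
 * every key. */
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t my_key;

/* Runs at each thread's exit for that thread's stored value. */
static void my_destructor(void *value) {
    free(value);
}

static int component_init(void) {
    return pthread_key_create(&my_key, my_destructor);
}

static void component_finalize(void) {
    /* Explicit cleanup call: the component destroys its own key. */
    pthread_key_delete(my_key);
}
```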
- RHEL6 has a weird issue.
- If we want this new feature for new systems only, it'd be okay.
- Hardware has AVX2, but the assembler doesn't understand the code.
- George needs some input on the PR.
- We're assuming that we don't need `_Atomic`; in most cases we just need `volatile` (see the sketch below) - patch linked to the issue, PR 7914.
- TBD if master-only - probably more than that.
- We're not breaking things; we just get a lot of valid complaints from the Intel compiler.
- STDOUT of `make` is ~16 MB due to all the Intel compiler warnings without this fix.
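
A hedged illustration of the `volatile` vs `_Atomic` distinction being discussed (not the actual Open MPI code): `volatile` only prevents the compiler from caching a value, which suffices for cases like signal-handler flags, while genuine cross-thread read-modify-write still needs `_Atomic`:

```c
#include <signal.h>
#include <stdatomic.h>

/* volatile is enough here: written by a signal handler, read by the same
 * thread; no cross-thread read-modify-write is involved. */
static volatile sig_atomic_t got_signal = 0;

/* _Atomic is required here: multiple threads do read-modify-write. */
static _Atomic int refcount = 0;

void handler(int sig) { (void)sig; got_signal = 1; }

void retain(void)  { atomic_fetch_add(&refcount, 1); }
int  release(void) { return atomic_fetch_sub(&refcount, 1) - 1; }
```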
- Schizo SLURM binding detection - Might not need a solution on v4.0.x
- Summary of the issue: if running on new SLURM version, we do the wrong thing.
- Even worse: a customer sets the binding they want in SLURM, but Open MPI then binds incorrectly.
- We may need a better way to tell the OMPI schizo component to "do nothing".
- This won't happen for the v4.0.5 release.
- Issue 7511 - opened against v4.0.x - Some history here on why we set binding on direct-launch jobs.
- People were complaining that if they ran with mpirun they got one level of performance, but with direct launch through SLURM (or others) they got different behavior.
- If they run direct-launch - they might
- Our solution was: if we don't see that the direct launcher has attempted to bind, then we bind for them.
- A better solution would be for the direct-launch resource manager to inform us that it is not binding.
- But once again, srun has changed the way it does things.
- Seems unexpected to be binding when not requested.
- This has been in there for over a decade.
- Should we be doing this at all? (Binding when using a direct launch)
- We won't be doing this at all in v5.0.
- Consensus - they need to fix this in the direct launcher.
- Do we make changes in v4.0.x, or v4.1.x, or leave it?
- Proposal to add an item to the README: the auto-binding behavior under SLURM only works before version X, and will be removed.
- What if we just remove all of the ompi-binds code in SLURM?
- If we can find the SLURM version string in the environment, we could skip binding above a certain SLURM version (see the sketch after this list).
- Is this only SLURM or all direct-launch?
- Well, there are only SLURM, Cray, and JSM. We think Cray and JSM don't do anything, so it's probably just a SLURM schizo issue.
- SchedMD only supports versions going back 2 years; 201802 is the oldest they support.
- If that older version has this issue, then it's a bug, and we shouldn't do it.
- If it regressed in the 2019 version, then base whether we do it on the SLURM version.
- We SHOULD do the SAME thing in v4.0.x and v4.1.x
- May take 2 weeks to do this, need to install SLURM versions.
- Could see if
- Do we want to block v4.0.5 and v4.1.0 for this?
- Yes probably a blocker.
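
A hedged sketch of the version-gating idea from the list above. How the SLURM version string is actually discovered is exactly the open question; the environment variable name and the version cutoff below are placeholders, not real knobs:

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a SLURM-style "YY.MM[.patch]" version string into a comparable
 * integer, e.g. "20.02.3" -> 2002.  Returns -1 if unknown. */
static int slurm_version_num(const char *s) {
    int major = 0, minor = 0;
    if (NULL == s || sscanf(s, "%d.%d", &major, &minor) < 2) {
        return -1;
    }
    return major * 100 + minor;
}

int should_auto_bind(void) {
    /* Hypothetical variable name; finding the version is the hard part. */
    const char *v = getenv("SLURM_VERSION_STRING");
    int num = slurm_version_num(v);
    if (num < 0) {
        return 1;       /* unknown version: preserve today's behavior */
    }
    return num < 1905;  /* placeholder cutoff: newer SLURM binds itself */
}
```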
- We have a band-aid for v4.1.0 collective tuning.
- Consensus - going to merge incremental improvement.
- Need to push for people to test HAN.
- Consensus question: should we make HAN the default on master?
- To get better testing; we want it for v5.0.
- Wait until we have a better understanding.
- AWS CI - seeing timeouts every couple of jobs, across ALL CI.
- ring or hello-world, about 10% of the time.
- Yesterday the IBM test suite got a random small percentage of failures with 2 ranks over 2 hosts.
- Incomplete corefiles indicate it's inside init - still investigating.
Review All Open Blockers
Review v4.1.x Milestones: v4.1.0
- Schedule: Want to release end-of-July.
- Posted a v4.1.0 rc1 to go through the mechanisms to ensure we can release.
- Release Engineers: Brian (AWS), Jeff Squyres (Cisco).
- Still want: George's collectives tunings for coll/tuned; AVX; UCX PRs awaiting review.
- Past: We've come to consensus for a v4.1.0 release.
- Need include/exclude selection, worried about consistent selection.
- A lot of PRs outstanding, but can't merge until
- Patch for OFI stuff messed up v4.1.x branch.
- Howard has a fix PR; Jeff is looking at it.
- Howard changed new OFI BTL parameters to be consistent with MTL
- Not breaking ABI or backwards compatibility.
- v4.1.x branch, branched from v4.0.4 tag.
- NOT touching runtime!!!
- Not going to be pulling in a new PMIx version.
- All MTT is online on the v4.1.x branch.
- Not compiling under SLURM EFA test. (OFI BTL issue)
Review v4.0.x Milestones: v4.0.4
- Discussed Open MPI binding when direct-launched (see above).
- v4.0.5 schedule: End of July.
- PR7898 - We need resolution on this on master
- 7893 - master release
- Two potential drivers for a quick v4.0.5 turn-around.
- OSC RDMA Bug - May drive a v4.0.5 release.
- Program Aborts on detach.
- OSC pt2pt: we have it on v4.0.x.
- Fragmented puts: the counting is not correct for a particular user request.
- Non-contiguous rputs.
- Also needed in v4.0.5.
- How urgent is the ROMIO fix?
- Good to have in v4.0.5, but hard to make a testcase that hits it.
- usNIC failing almost all multi-node tests on v4.0.x.
- Jeff meant to look at this last week, but didn't get to it.
- v4.0.x WAS working, and we're seeing master failing.
- ACTION - check back next week.
- iWarp support: Issue 7861.
- How are we supposed to run iWarp in Open-MPI v4.0.x?
- How much do we care about iWarp?
- At a minimum need to update FAQ.
Review v5.0.0 Milestones: v5.0.0
- No update this week other than the master discussion.
- Need to put OSC pt2pt
- OSC RDMA requires a single BTL that can contact every single process.
- This didn't use to be the case. (There is a comment in the code.)
- We can't use the OSC pt2pt.
- It is not thread safe and doesn't conform to the MPI-4 standard. Not safe.
- This is just a testing fallacy. We could add tests to show this, but we'd still be in the same boat.
- Either product A or B is broken, and we need to fix it.
- RDMA one-sided should fall back to "my atomics" because TCP will never have RDMA atomics.
- The idea was to put the atomics into the BTL base, which could do all of the one-sided atomics under the covers (see the sketch below).
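
A hedged sketch of that idea: when a BTL (e.g. TCP) has no hardware RDMA atomics, a base layer can emulate them by performing a plain CPU atomic on the target side and returning the old value. The function and names here are illustrative, not the actual BTL interface:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Target-side handler (e.g. invoked from an active-message callback):
 * perform the compare-and-swap with a local CPU atomic and hand the old
 * value back to the origin process. */
uint64_t emulated_cswap(_Atomic uint64_t *target,
                        uint64_t compare, uint64_t newval) {
    uint64_t expected = compare;
    atomic_compare_exchange_strong(target, &expected, newval);
    return expected;   /* old value; the transport sends this back */
}
```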
- Jeff will close the PR, and Nathan will do fetch, get, and compare-and-swap.
- Two new PRs for MPI 4.0 error handling from Aurelien Bouteiller.
- Does UCX support iWarp?
- Does libfabric support iWarp via the verbs provider?
- https://github.com/openucx/ucx/issues/2507 suggests it doesn't.
- Brian thinks that libfabric/OFI can support iWarp; you just need to specify the provider in the include list (see the sketch below).
- The person asking is a partner, not a customer.
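
A hedged sketch of provider selection at the libfabric level, which is roughly what an OFI component's provider include list does on your behalf (the Open MPI MCA parameter itself is not quoted here):

```c
#include <stdio.h>
#include <string.h>
#include <rdma/fabric.h>

int main(void) {
    struct fi_info *hints = fi_allocinfo(), *info = NULL;

    /* Restrict discovery to the verbs provider (which covers iWarp NICs). */
    hints->fabric_attr->prov_name = strdup("verbs");

    int ret = fi_getinfo(FI_VERSION(1, 9), NULL, NULL, 0, hints, &info);
    if (0 == ret && NULL != info) {
        printf("verbs provider is available\n");
        fi_freeinfo(info);
    }
    fi_freeinfo(hints);
    return ret;
}
```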
- PMIx
- Working on PMIx v4.0.0 which is what Open MPI v5.0 will use.
- Sessions needs something from PMIx v4
- ULFM - not sure if it needs PMIx, think it needs PRRTE changes.
- PPN scaling issue - simple algorithmic issue in this function
- PMIX talked about it. Artem might know someone who might be interested in working on it.
- The algorithm behind one of the interfaces doesn't scale well.
- Not a regression. Above ~4K nodes it becomes quadratic (see the illustration below).
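
A hedged illustration of that scaling pattern; the notes don't identify the actual PMIx function, so this generic lookup stands in for it. If each of n processes performs an O(n) scan like this, the job-wide cost is O(n^2):

```c
#include <string.h>

typedef struct { const char *hostname; int rank; } proc_t;

/* O(n) per call; called once per process, this is O(n^2) job-wide.
 * Building a host -> rank index once would make each lookup O(1). */
int rank_on_host(const proc_t *procs, int n, const char *host) {
    for (int i = 0; i < n; i++) {
        if (0 == strcmp(procs[i].hostname, host)) {
            return procs[i].rank;
        }
    }
    return -1;
}
```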
- PRRTE
- Nothing's happening there.
- Mostly discussed above.
- Many companies are not allowing face-to-face travel until 2021 due to COVID-19.
- Instead, let's do a series of virtual face-to-face meetings?
- Yes - this summer, to discuss plans for v5.0.
- Maybe we can do it by topic?
- Maybe not 4 or 8 hour things.
- Different topics on different days.
- Do a Doodle poll of the least-bad days in late July/August.
- Start a list of topics.
- George and Jeff will help plan and come to community.
- May not have the Supercomputing conference at ALL this year.
- Many other projects are doing a virtual state-of-the-union type meeting to try to cover what they'd usually do in a Birds of a Feather session.
- If this works pretty well, we could do it a couple of times a year.
- Not constrained to Supercomputing.
- Scale testing: PRs have to opt into it.