WeeklyTelcon_20180116
Geoffrey Paulsen edited this page Jan 15, 2019
- Dialup Info: (Do not post to public mailing list or public wiki)
--- Will fill out as meeting starts
- Geoff Paulsen
- Brian
- David Bernholdt
- Edgar Gabriel
- Geoffroy Vallee
- Jeff Squyres
- Howard
- Matthew Dosanjh
- Mohan
- Ralph
- Todd Kordenbrock
- Joshua Ladd
- Josh Hursey
Review All Open Blockers
Review v2.x Milestones v2.1.2
- Delayed until next week.
- No one has an URGENT need, but we would like to get this out.
Review v3.0.x Milestones v3.0
- Schedule: RC2
- On 3.x series trying to cut RCs on nightly tarballs.
- Didn't get RC last week
- Will get RC today.
- No Blockers on v3.0.x (one we JUST merged)
- Will Pull in PR4715
- Will Pull in PR4716
- Issue 4563 - not seeing on little arm boxes here, Jenkins uses --disable-builtin-atomics.
- Comm Spawn - Documentation PR ready or pulled
- Issue 4509
- We believe this is closed. Asked Nathan to close.
Review v3.1.x Milestones v3.1
- SCHEDULE:
- Will shoot for getting a Release Candidate out Friday.
- BLOCKER:
- OSC monitoring fix (doesn't build with Portals 4)
- PMIx 2.1 PR4605
- Ralph - there is cleanup issue with PMIx 2.1, but we have cleanup issues today
- UCX one sided violating PR4688
- Issue 4303
- Probably just need to build a patch.
Review Master Pull Requests
- Issue PR4686
- Jeff tried to reproduce and failed.
- Thought HCOLL was an issue, Artem took out, and put back.
- Something going on in there. Possibly atomic related.
- Might need Nathan's attention.
- Someone could try reverting the one change to atomics to see if that caused it.
- Mellanox will try to reproduce after reverting atomic change. Timing issue.
- Dynamic operations, a TON of segfaults. All in opal_progress, during ompi_sync_wait multi-credit.
- Something is wrong with atomics. Intercomm_create or Spawn.
- Cisco is tickling the most, and will look at.
- PR4697 seems to have stalled.
- Opal Progress change looks good for most interconnects.
- TCP performance regression.
- Pointer solution seems reasonable.
- Mellanox will try to implement pointer.
- Regex creation.
- PR4710
- Someone created a test and put it in make-check rather than MTT.
- Then made the component static so that we don't have to do make install.
- Don't think we should be adding tests to make-check.
- Question - Is there a Regex library we could use? Reg-ex is hard.
- This is working pretty well, but did add Framework to allow for future components.
- When your PR has been accepted into a release branch, please go to the issue, and remove the target of the release branch that it was just merged into. Attempting to automate this in the future.
- New Topic - We currently can't write unit tests against components.
- Some way to say "this unit test is against this component".
- Intel went through and did this internally for orte. Already hosted in public domain.
- Ralph will send link to Brian to take a look.
- Python Client can't report back to database.
- https://github.com/open-mpi/mtt/issues/614
- Josh Hursey will look at.
Review Master MTT testing
- Probably looking at March or early April
- San Jose or Dallas
- Geoff will send out two Doodles for date and time.
- Discuss abandoning openib btl.
- LNLL - is no longer paying anyone to maintain openib btl.
- Nathan has a UCX BTL
- ETA on GPU in UCX - basic minus CUDA IPC in test now.
- Any warning message if on iWarp
- What's the roadmap for this? 3.x or 4.x?
- Pushed date to late Feb or March.
- Mellanox, Sandia, Intel
- LANL, Houston, IBM, Fujitsu
- Amazon
- Cisco, ORNL, UTK, NVIDIA