WeeklyTelcon_20180130
Geoffrey Paulsen edited this page Jan 15, 2019 · 1 revision
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoff Paulsen
- Brian
- Edgar Gabriel
- Geoffroy Vallee
- Howard
- Matthew Dosanjh
- Josh Ladd
- Mohan
- A number of the usual attendees are not here today:
- Jeff Squyres
- akvenkatesh
- Artem
- Josh Hursey
- Todd Kordenbrock
- Nathan
- News: Ralph will not be able to work on Open MPI anymore. He will continue to work on PMIx, but will not handle even the Open MPI PMIx merge.
- We need a v3.1 release engineer to help. Brian will send email to devel-core.
- The MPI Forum meeting is in Portland, more than a month away.
- Nothing for OMPI Community at this time.
- Face-to-Face meeting:
- Ralph offered to have a brain dump day. Email Brian if interested.
- Current thought is to co-locate the PMIx and OMPI meetings in Dallas.
- Consensus is the week of March 20-22; setting it for that week lets us co-locate PMIx and OMPI.
- Assuming about half a day of ORTE.
Review All Open Blockers
Review v2.x Milestones v2.1.3
- No chance to look at.
- Due to lack of schedule and interest, this is pushed back to March 1st.
Review v3.0.x Milestones v3.0.1
- Schedule: RC2 posted.
- Looking for feedback, thumbs up or down.
- Will have an RC3 after a few more PRs are pulled in.
- Target v3.0.x in PR4715
- Review required.
- Will pull in PR4716
- Issue 4563
- Not seeing this on the small ARM boxes here; Jenkins uses --disable-builtin-atomics.
- Issue 4563
- Comm Spawn - Documentation PR ready or pulled
- Issue 4509
- We believe this is closed. Asked Nathan to close.
- Issue: hwloc can't handle CUDA from a different location
- On master, specifically disabling hwloc CUDA.
- External component does NOT disable build, since
- 4677 - hwloc2 WIP, may need help with.
Review v3.1.x Milestones v3.1.0
- SCHEDULE:
- RC2 posted last week.
- PMIx v2.1.0 - merge in rc2 or wait for final?
- Ralph says PMIx v2.1.0 final is this week.
- Please download and test OMPI v3.1.x RC2.
- Blocker on v3.1.x
- Waiting on PMIx v2.1.0 final to pull in:
- PR4605 in there.
- PR4516
- OSC monitoring fix (doesn't build with Portals 4)
- PR4523
- waiting review.
- PMIx 2.1 PR4605
- PR4746
- Ralph - there is cleanup issue with PMIx 2.1, but we have cleanup issues today
- Mellanox will help work on this.
- UCX one sided violating PR4688
- Issue 4303
- Probably just need to build a patch.
Review Master Pull Requests
- Issue 4686
- Something going on in there. Possibly atomic related.
- Might need Nathan's attention.
- Mellanox will try to reproduce after reverting atomic change. Timing issue.
- Dynamic operations: a ton of segfaults, all in opal_progress, during ompi_sync_wait multi-credit.
- Something is wrong with atomics. Intercomm_create or Spawn.
- Cisco is triggering it the most, and will look at it.
- Delayed.
- PR4697 got resolved and merged to master.
- The opal_progress change looks good for most interconnects.
- TCP performance regression was resolved and merged to master.
- Going to PR this into v3.1.x.
- George has some thoughts on this.
- Don't have any non-OS wrappers for TLS.
- Master now checks for C11. Can we make it the default?
- Mac Sierra may or may not work, even with _Thread_local.
- Would be nice if we could require C11 for v4.0.
- Regular expression creation.
- PR4710
- Someone created a test and put it in make-check rather than MTT.
- Then made the component static so that make install isn't needed.
- Don't think we should be adding tests to make-check.
- Question: is there a regex library we could use? Regex is hard.
- This is working pretty well, but it did add a framework to allow for future components.
- Change behavior of opal_check_package
- Brian will send email to devel
- Make it more explicit when it finds issues
- Issue 4423
- When your PR has been accepted into a release branch, please go to the issue, and remove the target of the release branch that it was just merged into. Attempting to automate this in the future.
- New Topic - We currently can't write unit tests against components.
- Some way to say "this unit test is against this component".
- Intel went through and did this internally for orte. Already hosted in public domain.
- Ralph will send link to Brian to take a look.
- Python Client can't report back to database.
- https://github.com/open-mpi/mtt/issues/614
- Josh Hursey will look at.
Review Master MTT testing
- Discuss abandoning openib btl.
- LNLL - is no longer paying anyone to maintain openib btl.
- Nathan has a UCX BTL
- ETA on GPU in UCX - basic minus CUDA IPC in test now.
- Any warning message if on iWarp
- What's the roadmap for this? 3.x or 4.x?
- Pushed date to late February or March.
- Mellanox, Sandia, Intel
- LANL, Houston, IBM, Fujitsu
- Amazon
- Cisco, ORNL, UTK, NVIDIA