WeeklyTelcon_20180522
Geoffrey Paulsen edited this page Jan 15, 2019
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoff Paulsen
- Edgar Gabriel
- Joshua Ladd
- Todd Kordenbrock
- Xin Zhao
- Jeff Squyres
- Geoffroy Vallee
- Thomas Naughton
- Brian
- David Bernholdt
- Howard Pritchard
- Matthew Dosanjh
- Dan Topa (LANL)
- Akvenkatesh
- Nathan Hjelm
- Ralph
- Josh Hursey
Review All Open Blockers
Review v2.x Milestones v2.1.4
- v2.1.4 - Targeting Oct 15th.
- Lower priority than v3.0 and v3.1
- No new news on v2.1.x
Review v3.0.x Milestones v3.0.2
- Schedule:
- Hope to post v3.0.2 later this week
- Ready to ship v3.0.2
- Need to sort out shared library version numbers.
- No surprises here, just some Fortran bits. After some testing (unclear whether this was for 3.0.2 or 3.1.1), decided these were code changes, not interface additions.
- Done and ready to go.
Review v3.1.x Milestones v3.1.0
- Merged in most of outstanding changes on v3.1.x
- PR4397 - UCX
- Schedule
- Shooting for early June.
- Next week, will cut a release candidate.
- Long list of outstanding PRs for the v3.1.x branch.
- 4 or 5 need review. One is tagged for Geoff to review (done).
- Will hold off about a week in case we need to do a quick-turnaround "oops" release.
- Mellanox v3.1.x MTT has many, many failures. Josh will look at it.
- Boris was looking at this, something with new PMIx. Should be fixed.
- IBM MTT hasn't run since April, and is now running again.
- RESOLVED UCX xpmem issue
- Looking pretty good. Howard brought up some issues on a single node with xpmem.
- xpmem can be disabled via env var.
- Create an issue for v3.1.1 - gpaulsen
- Issue with Connect-X3 atomic support. UCX limitation.
- For v3.1.1: some want fallbacks or errors, but it must not segv.
- For v4.0
- Mellanox is planning to do emulation on the CPU if the IB card can't do HCA atomics.
- Still need a check in OMPI, in case they're running with an old UCX.
- Schedule: mid-July branch, mid-Sept release.
- Start meeting weekly.
- iWarp: have a person to contact.
- Unclear if UCT BTL will work on Connect-X3 or Broadcom RoCE.
- At the developer meeting we discussed removing the old use-mpi Fortran module.
- Can't remove since RHEL 7.x is using gfortran 4.8.5
- UCX community has committed to doing emulation in UCX.
- UCX + Connect-X3 will work for pt2pt and collectives, but not RMA
- Will emulate for v4.0.0
- It would be nice to have a doc listing the supported hardware and which drivers to use.
- Removal in Open MPI v4.0 of items removed from the MPI standard
- Nathan sent out an email about PR 5127, to remove all MPI2.x standard items.
- A little weird to be able to pull back MPI1 removed items.
- Let's remove these too at the same time.
- Delete all of this in OMPI 5.0
- C++ bindings are a separate pull request, PR 5128. Goal is to have these removed as well.
- Nathan has a PR to put deprecation warnings on ALL MPI1 stuff.
- Let's turn off more building by default.
- The Forum didn't REMOVE everything that was deprecated in the MPI v3.0 standard.
- Go over v4.0
- I'll ping George on the first two.
- Does the UCT BTL include pt2pt? Add in support for send/recv methods.
- hwloc v2.0.x
- Jeff or Geoff
- SPC - Jeff had some comments before pulling.
- Need to check status of MPI 3.2 standard.
- Fujitsu - PMIX persistent collectives... look at mailing list.
- libevent and hwloc - Jeff will look at configury for making external preferred
- OSHMEM
- Mellanox is making good progress.
- Do not build OSHMEM if a viable SPML cannot be built - Brian posted PR.
- Update ROMIO - Gilles might be a good candidate for that (experienced).
- Not going to remove Fortran MPI TKR module - will
- Improved performance for single-threaded and MT - Nathan
- MPMD Support for SLURM 17.11 - Howard.
- Feature might be somewhat buggy.
- Nasty way to launch daemons if using srun.
- If you use PMI2 instead of PMIx
- Won't be a PMIx v3.0 for v4.0 timeframe.
- Howard is driving to make it at least build, but not use new features.
- iWarp - has a rewrite to rename openib
- Will test on new cards this/next week.
- Connect-X3 UCX uses pt2pt, from this perspective good to drop openib btl.
- Add to v4.0 list:
- Edgar: Vulcan component - waiting for one more commit from student.
- Add support for CUDA buffers in OMPI-IO
- A couple of updates for the Lustre component, but not sure if it will become the default. time we could switch
Review Master Pull Requests
- PR5180 - Remove MXM MTL, action item from developer's meeting.
- Mellanox approved.
- Last week: OSHMEM v1.4 - not sure if we have to drop the deprecated APIs; curious since OMPI is dropping deprecated APIs...
- Only remove things that were removed from the OSHMEM standard, not things that are merely deprecated: "deprecated" means it will be removed from a future version of the standard. If some APIs were removed from the standard, then ask the oshmem email list for their thoughts.
- Xin should be able to push first version of OSHMEM v1.4 changes to master next week or so.
- Xin should be pushing today or tomorrow... It's been passing some simple tests.
- Edgar has a new component with a weird name, we need to
- As a heads up ULFM support may require PMIx v3.0
- All Tarballs in S3
- Set an end date for web-mirrors... end of June.
- Got compiler licenses for NAG compiler, and Absoft
- Both Fortran
- No progress.
- Get copy of perl JSON, and put it on MTT.
- DONE
Review Master MTT testing
- Mellanox, Sandia, Intel
- LANL, Houston, IBM, Fujitsu
- Amazon
- Cisco, ORNL, UTK, NVIDIA