Reorganize for MPI / NCCL transport layers #109
Conversation
I haven't looked into the code in detail, but a few notes:
Thank you for the work on this big PR; things are progressing nicely!
I was debating this myself - I was thinking about the MPI interop angle. I think it makes sense for us to actually provide stable-ish MPI wrappers since that is one of the motivations for this library and what our current users want from it.
Will do
This doesn't actually implement anything except MPI, so no danger here 😄 . We could implement Serial (or not) in the future as needed.
Good idea.
@dssgabriel I'm proposing some modifications to how the comm mode is used in 9126092; this could potentially be a separate PR.
I fear that if we provide direct MPI wrappers people will just start using those instead of our "core" API, as it may be easier for them to adopt. This won't add any value to their code, and they may as well keep the MPI interop they already have in place. Some other questions/remarks:
Let's discuss this PR during next week's meeting, we may want to split it up even further. I'll give a better overview of what NCCL offers and how it should interact with this PR. 👍
It's done with structs in Kokkos because you can't do partial function specialization - e.g. you CAN'T do

```cpp
template <typename View, typename CommunicationSpace>
void send(View v);

// KokkosComm implementation for MPI in here
template <typename View>
void send<View, Mpi>(View v) {...};
```

but you CAN do

```cpp
template <typename View, typename CommunicationSpace>
struct send;

// KokkosComm implementation for MPI in here
template <typename View>
struct send<View, Mpi> {...};
```
The value of MPI wrappers for the users is in two places:
There is demonstrated interest in these. They can be realized in the near term, and provide value to the overall effort:
Like you (?), I think the core API should be more thoughtful and novel. However, if what we come up with is such a small step forward that people find MPI-Kokkos interop more compelling, then they weren't going to use the new thing anyway.
@dssgabriel how are we feeling about this? We have an external party asking about writing a backend to support their research, plus our own NCCL work.
I think this is on the right track! A few points of discussion that we probably need to figure out before merging:
I will review the code once again tomorrow so that we can discuss it on next Monday's meeting and quickly merge it afterwards. 👍
I tried reading the NCCL docs but I couldn't figure out what the semantics are if I do two ncclSend/ncclRecv in a row to the same destination. Do you know if they are non-overtaking like MPI? I think there needs to be some way to distinguish multiple in-flight messages with the same source and dest, but keeping messages ordered may be enough. I'll drop the tags for now and we can discuss.
I think just keep them for MPI since they are basically an MPI thing.
I think you're right about this, let me take another stab at it.
I prefer this too, I was just following Kokkos Core and Kernels. Let me see if I can make this change. Then we'll really be touching every file 😄
As far as I understand, NCCL enqueues communication operations on a CUDA stream. If two successive NCCL calls are on the same stream, they are non-overtaking. However, I think two communications enqueued on different streams may be able to overtake each other? I believe some folks at CEA know the lead developer of NCCL, so we'll try to set up a meeting with them and get answers to all our semantics-related questions.
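To make the ordering question concrete, here is a minimal sketch (not from this PR; the buffers, peer, communicator, and stream are illustrative function parameters) of two back-to-back sends enqueued on one stream with NCCL's point-to-point API:

```cpp
#include <cstddef>
#include <nccl.h>

// Illustrative sketch only: two sends to the same peer, enqueued on a single
// CUDA stream. Within one stream, NCCL executes the operations in the order
// they were enqueued; across different streams there is no such guarantee.
void send_twice(const float* buf_a, const float* buf_b, std::size_t count, int peer,
                ncclComm_t comm, cudaStream_t stream) {
  ncclGroupStart();
  ncclSend(buf_a, count, ncclFloat, peer, comm, stream);
  ncclSend(buf_b, count, ncclFloat, peer, comm, stream);
  ncclGroupEnd();
}
```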
LGTM 👍
I agree. Maybe we can automatically enable MPI if users compiled their code with
Great! I'm looking forward to reviewing every file 😁 Thank you for all the work on this PR!
I ended up moving all the perf tests to an
@dssgabriel I think all future work is waiting behind this PR, how are we feeling?
I am not particularly in favor of adding a public `KokkosComm::mpi` namespace. Can we put it in an `Experimental` namespace?
Else, I am in favor of merging this PR as it really improves code architecture, thanks a lot!
I am not sure naming this file "mpi.hpp" is a good idea.
```cpp
#include <type_traits>

#include "../concepts.hpp"
```
Try to avoid relative paths in "public" includes.
Do we want to go ahead and just merge this? We could open small issues and gradually fix the various remarks we have brought up here.
Let's merge this, and then I can open up issues for the comments here. I need an approving review though 😄
Let's merge this and open issues for the relevant remarks we've made in the PR.
Main points I have in mind at the moment are:
- re-organize the files to match ISO C++ library design
- move the `mpi` namespace behind `Experimental`
First part of #100
Introduce `Transport` / `CommunicationSpace` concept

This is all modeled after Kokkos Core execution spaces. Each will have its own subdirectory under src (like Kokkos' backends). The interface that each comm space needs to implement is a struct. For example, `Irecv`, from `KokkosComm_fwd.hpp`:
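The code block that followed in the original description isn't preserved in this capture; as a hedged sketch (template parameter names are illustrative, not necessarily the PR's), the forward declaration is of this shape:

```cpp
// Hypothetical sketch: the per-operation interface is a struct template that is
// only declared here; each CommunicationSpace supplies a partial specialization.
namespace KokkosComm::Impl {

template <typename RecvView, typename ExecSpace, typename CommSpace>
struct Irecv;  // no primary definition: implemented per communication space

}  // namespace KokkosComm::Impl
```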
Why a `struct`? Because in practice what this means is that the comm space will have a partial specialization (not allowed for functions) that looks like this (from `mpi/KokkosComm_mpi_irecv.hpp`):
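That snippet is also missing from this capture; a hedged sketch of the shape such a partial specialization takes (member details are illustrative):

```cpp
// Hypothetical sketch: the MPI backend's partial specialization of the Irecv
// interface struct. Partial specialization is allowed for class templates,
// which is why the interface is a struct rather than a free function template.
namespace KokkosComm {

struct Mpi;  // communication-space tag type

namespace Impl {

template <typename RecvView, typename ExecSpace, typename CommSpace>
struct Irecv;  // primary template, declared in KokkosComm_fwd.hpp

template <typename RecvView, typename ExecSpace>
struct Irecv<RecvView, ExecSpace, Mpi> {
  // An MPI_Irecv-based implementation would live here, e.g. a static member
  // function taking the view, source rank, tag, communicator, and request.
};

}  // namespace Impl
}  // namespace KokkosComm
```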
Where `Mpi` is a `struct` in the `KokkosComm` namespace, analogous to e.g. `Kokkos::Cuda`. A Serial transport would have a corresponding specialization, NCCL would have its own, and so on.
To support this, a lot of the include structure needs to be refined and adjusted to more closely match how Kokkos Core does it.
Future work:
Introduce a `KokkosComm::Handle`

This hides the CommunicationSpace details from the KokkosComm interface. It's specialized on each CommunicationSpace. For MPI, this holds a Kokkos execution space instance and an MPI communicator.
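A hedged sketch of what an MPI-specialized handle along those lines could look like (the PR's actual class layout may differ):

```cpp
#include <mpi.h>
#include <Kokkos_Core.hpp>

// Hypothetical sketch: the Handle pairs a Kokkos execution space instance with
// the transport-specific state (here, an MPI communicator), so the main
// KokkosComm interface never touches CommunicationSpace details directly.
namespace KokkosComm {

struct Mpi;  // communication-space tag (illustrative)

template <typename ExecSpace, typename CommSpace>
class Handle;

template <typename ExecSpace>
class Handle<ExecSpace, Mpi> {
 public:
  Handle(const ExecSpace &space, MPI_Comm comm) : space_(space), comm_(comm) {}

  const ExecSpace &space() const { return space_; }
  MPI_Comm mpi_comm() const { return comm_; }

 private:
  ExecSpace space_;
  MPI_Comm comm_;
};

}  // namespace KokkosComm
```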
Reduce main interface to 3 functions:
Move MPI-specific wrappers to `mpi` namespace

This is so we can still have useful MPI interop stuff, but it's not part of the "blessed" interface.
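For illustration only (not the PR's actual signatures), an MPI-interop wrapper kept out of the core API might look roughly like this, assuming a contiguous 1-D view:

```cpp
#include <mpi.h>
#include <Kokkos_Core.hpp>

// Hypothetical sketch: an MPI-specific convenience wrapper living in the mpi
// namespace rather than the core ("blessed") interface. It sends a contiguous
// 1-D Kokkos view with plain MPI, byte-for-byte.
namespace KokkosComm::mpi {

template <typename View>
void send(const View &v, int dest, int tag, MPI_Comm comm) {
  static_assert(View::rank == 1, "sketch handles 1-D contiguous views only");
  const int nbytes = static_cast<int>(v.extent(0) * sizeof(typename View::value_type));
  MPI_Send(v.data(), nbytes, MPI_BYTE, dest, tag, comm);
}

}  // namespace KokkosComm::mpi
```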
Update unit tests and perf tests for new layout
Many unit tests are now MPI-specific. All perf_tests are only built if MPI is built (this could be revisited when other Transports are done).
Move `CommMode` and related interfaces to `mpi` namespace

This is MPI-specific stuff. I also made the CommMode stuff more canonically C++:

- `ReadyCommMode` -> `CommModeReady`
Add some additional support for non-contiguous views along existing lines
A `Req<Mpi>` knows how to keep track of some lambda functions it can call after MPI_Wait (e.g. to unpack the result of an MPI_Irecv).
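A hedged sketch of the idea (names are illustrative; the free `wait` mirrors the `KokkosComm::wait(req)` change mentioned in the grab-bag below):

```cpp
#include <mpi.h>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch: the MPI specialization of the request type remembers
// callbacks to run after MPI_Wait completes, e.g. to unpack the result of an
// MPI_Irecv that was staged through a packed buffer for a non-contiguous view.
namespace KokkosComm {

struct Mpi;  // communication-space tag

template <typename CommSpace>
class Req;

template <>
class Req<Mpi> {
 public:
  MPI_Request &mpi_request() { return req_; }

  // Register work to run once the request has completed.
  void call_after_wait(std::function<void()> f) { callbacks_.push_back(std::move(f)); }

 private:
  MPI_Request req_ = MPI_REQUEST_NULL;
  std::vector<std::function<void()>> callbacks_;

  friend void wait(Req<Mpi> &r);
};

// Free function replacing req.wait(): completes the request, then runs callbacks.
inline void wait(Req<Mpi> &r) {
  MPI_Wait(&r.req_, MPI_STATUS_IGNORE);
  for (auto &f : r.callbacks_) f();
  r.callbacks_.clear();
}

}  // namespace KokkosComm
```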
Grab-bag

- `req.wait` -> `KokkosComm::wait(req)`
- `wait_any` and `wait_all`