Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix Op <-> User_function mapping #31

Open
jeffhammond opened this issue Mar 30, 2024 · 2 comments
Open

fix Op <-> User_function mapping #31

jeffhammond opened this issue Mar 30, 2024 · 2 comments
Assignees

Comments

@jeffhammond
Copy link
Owner

If we are going to store the mapping forever, we need to make Op_free a no-op, to avoid the ABA problem (thanks to Jed Brown for noticing this) if we get the same Op value back from subsequent calls Op_create.

The other method is to associate the Op used in an Iallreduce with the Request and mark it as free-able when the Iallreduce completes. This is quite messy and I don't want to do it because, unlike Ialltoallw, people actually use Iallreduce, and I don't want to add to that code path more than necessary.

Option 1 is easier. We just need to make sure to free all the Ops that are mapped in Finalize so that e.g. MPICH doesn't complain about dangling handles.

The biggest problem here is that MPI implementations are expected
to ref-count everything, so it is legal to call Op_create,
Iallreduce and Op_free before calling Wait, which means
that the op-callback mapping will be destroyed while the reduction
is in-flight.
Our first few designs were bad. We used a heap object for the attribute,
which is freed during Op_free. We can fix this and just
attach the function pointer directly to the datatype
so no heap allocation is necessary. However, we store the mapping
between the MPI_Op and MPI_User_function(_c) forever, which
has trivial space and time overheads in all but the most pathological
use cases. (If you call MPI_Op_create millions of times in your program,
I don't know what to tell you.)

@jeffhammond jeffhammond self-assigned this Mar 30, 2024
@dalcinl
Copy link
Collaborator

dalcinl commented Mar 30, 2024

(If you call MPI_Op_create millions of times in your program,
I don't know what to tell you.)

The runtime cost of op create+free compared to the actual reduction is likely negligible. Therefore, users my find more convenient to op_create -> reduce -> op_free in an inner function that may be called many many times (one per time step during millons of timesteps?) rather than storing the op in global state for reutilization. This is particularly true when you want to write pure or reentrant functions, because using global state may require mutexes, or otherwise messing with initialization/finalization logic.

@jeffhammond
Copy link
Owner Author

yes, the issue is i'm sticking all the MPI_Op in a std::map so if you create a million of them, that uses 24MB of memory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants