Clean up README, improve docs theme
cwpearson committed Mar 13, 2024
1 parent c8fc55e commit 40b2e09
Showing 10 changed files with 97 additions and 105 deletions.
85 changes: 3 additions & 82 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,91 +1,12 @@
# kokkos-mpi

> [!WARNING]
> UNOFFICIAL MPI interfaces for [Kokkos](https://github.com/kokkos/kokkos) C++ Performance Portability Programming EcoSystem
> UNOFFICIAL MPI interfaces for the [Kokkos](https://github.com/kokkos/kokkos) C++ Performance Portability Programming EcoSystem
## Getting Started

[cwpearson.github.io/kokkos-mpi/](https://cwpearson.github.io/kokkos-mpi/)

### macOS

The standard macOS toolchain has incomplete `std::mdspan` support: `std::layout_stride` is not implemented as of the Xcode 15.3 release candidates.

At Sandia, with MPICH and the VPN enabled, you may need to do this before running any tests:
```bash
export FI_PROVIDER=tcp
```


```
mkdir -p build && cd build
cmake .. \
-DCMAKE_CXX_COMPILER=mpicxx \
-DKokkosComm_ENABLE_MDSPAN=ON \
-DKokkosComm_USE_KOKKOS_MDSPAN=ON \
-DKokkos_DIR=/path/to/kokkos-install/lib/cmake/Kokkos
make
ctest
```
See [getting started](https://cwpearson.github.io/kokkos-mpi/usage/getting_started.html) in the documentation.

## Documentation



## Design

| | `Kokkos::View` | `mdspan` |
|-|-|-|
| MPI_Isend | x | x |
| MPI_Recv | x | x |
| MPI_Send | x | |
| MPI_Reduce | | |

- [ ] Grab and reuse Kokkos Core configuration
- [ ] MPI Communicator wrapper

- [ ] Packing
- [x] Tentative `Packer` interface.
- [x] Packer::MpiDatatype which just constructs an MPI Datatype matching the mdspan and hands it off to MPI to deal with the non-contiguous data
- [x] Packer::DeepCopy uses `Kokkos::deep_copy` to handle packing and unpacking of non-contiguous `Kokkos::View`
- [ ] second pass would be to somehow associate a Kokkos memory space with the `mdspan` so we know how to allocate intermediate packing buffers
- When non-contiguous views are passed to an MPI function, a temporary contiguous view of matching extent is allocated, and `Kokkos::deep_copy` is used to pack the data.
- [x] "Immediate" functions (e.g. `isend`) return a `KokkosComm::Req`, which can be `wait()`-ed to block until the input view can be reused. `Req` also manages the lifetimes of any intermediate views needed for packing the data, releasing those views when `wait()` is complete.
- [x] `KokkosComm::Traits` is specialized for `Kokkos::View` and `mdspan`
- whether `View` needs to be packed or not
- what `pack` does for `View`
- what `unpack` does for `View`
- spans (distance between beginning of first byte and end of last byte)
- [ ] Future work
- [x] host data `mdspan`
- [ ] device data `mdspan`

## Considerations

- Xcode 15.3 on macOS doesn't support `std::mdspan`, so we use `kokkos/mdspan`.
- Pluggable packing strategies
- This would probably be a template parameter on the interface, which would be specialized to actually implement the various MPI operations
- Constructing matching MPI datatype and sending
- Packing into a contiguous buffer and sending
- How to discriminate between mdspans of host or device data, for packing
- Alternatively, construct an MPI datatype matching the non-contiguous sp
- MPI threaded-ness and Kokkos backends (Serial with multiple instances, Threads, etc)
- Are there circumstances in which we can fuse packing into another kernel?
- A better pack/unpack interface
- Maybe a `PackTraits<View>` where users can specialize `PackTraits` for any types they want to handle
- Also, could introduce a runtime packing argument to the various functions, like a pack tag
- More convenient collective wrappers
- Outer dimension has destination rank?
- Custom reductions?

## Performance Tests

* `test_2dhalo.cpp`: a 2d halo exchange
* `test_sendrecv.cpp`: ping-pong between ranks 0 and 1

## Contributing

```bash
shopt -s globstar
clang-format-8 -i {src,unit_tests,perf_tests}/**/*.[ch]pp
```
[cwpearson.github.io/kokkos-mpi/](https://cwpearson.github.io/kokkos-mpi/)
24 changes: 24 additions & 0 deletions docs/api/core.rst
@@ -1,6 +1,30 @@
Core
====

.. list-table:: MPI API Support
   :widths: 40 30 15 15
   :header-rows: 1

   * - MPI
     - KokkosComm::
     - Kokkos::View
     - mdspan
   * - MPI_Send
     - send
     - ✓
     - ✓
   * - MPI_Recv
     - recv
     - ✓
     - ✓
   * - MPI_Isend
     - isend
     - ✓
     - ✓
   * - MPI_Reduce
     - reduce
     - ✓
     - ✓

Point-to-point
--------------
7 changes: 3 additions & 4 deletions docs/conf.py
@@ -6,8 +6,8 @@
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = 'KokkosComm'
copyright = '2024, NTESS'
project = 'Kokkos Comm'
copyright = '2024 NTESS'
author = 'Carl Pearson'
release = '0.0.2'

@@ -23,6 +23,5 @@

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'alabaster'
html_theme = 'sizzle'
html_static_path = ['_static']
23 changes: 23 additions & 0 deletions docs/design.rst
@@ -0,0 +1,23 @@
Design
======

Asynchronous MPI operations and view lifetimes
----------------------------------------------

"Immediate" functions (e.g. ``isend``) return a ``KokkosComm::Req``, which can be ``wait()``-ed to block until the input view can be reused. ``Req`` also manages the lifetimes of any intermediate views needed for packing the data, releasing those views when ``wait()`` is complete.
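
The lifetime rule above can be illustrated with a minimal, standard-library-only sketch. It is a model of the idea, not the ``KokkosComm::Req`` implementation: ``SketchReq``, ``keep_until_wait``, ``set_wait`` and ``sketch_isend`` are hypothetical names invented for this example, and a ``std::vector`` stands in for an intermediate view.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a request handle (NOT the KokkosComm::Req API).
class SketchReq {
 public:
  // Take shared ownership of an intermediate buffer until wait() completes.
  void keep_until_wait(std::shared_ptr<void> buffer) {
    kept_.push_back(std::move(buffer));
  }

  // Register the blocking step; real code would block on the MPI request here.
  void set_wait(std::function<void()> fn) { wait_fn_ = std::move(fn); }

  // Block until the operation is done, then release the intermediate buffers.
  void wait() {
    if (wait_fn_) wait_fn_();
    kept_.clear();
  }

  std::size_t kept_count() const { return kept_.size(); }

 private:
  std::function<void()> wait_fn_;
  std::vector<std::shared_ptr<void>> kept_;
};

// Hypothetical "immediate" send: copies the input into a temporary buffer
// and hands ownership of that buffer to the returned request.
SketchReq sketch_isend(const std::vector<double>& src) {
  SketchReq req;
  auto packed = std::make_shared<std::vector<double>>(src);
  req.keep_until_wait(packed);
  return req;
}
```

In this model the caller never sees the temporary buffer; it stays alive exactly as long as the request has not been waited on.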

Non-contiguous Data
-------------------

- ``Packer::MpiDatatype`` constructs an MPI datatype matching the mdspan and hands it off to MPI, which then deals with the non-contiguous data.
- ``Packer::DeepCopy`` uses ``Kokkos::deep_copy`` to handle packing and unpacking of non-contiguous ``Kokkos::View``. This requires an intermediate allocation, which only works for Kokkos Views; see `Device Data`_.
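
As a rough illustration of the deep-copy strategy, the sketch below packs a strided 2D region into a contiguous buffer and unpacks it again. It uses ``std::vector`` instead of ``Kokkos::View`` and plain loops instead of ``Kokkos::deep_copy``; ``Strided2D``, ``pack_strided`` and ``unpack_strided`` are hypothetical names, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of a non-contiguous 2D layout with unit inner stride.
struct Strided2D {
  std::size_t rows;
  std::size_t cols;
  std::size_t row_stride;  // elements between the starts of consecutive rows
};

// Copy rows x cols elements out of strided storage into a contiguous buffer.
std::vector<double> pack_strided(const std::vector<double>& src,
                                 const Strided2D& layout) {
  std::vector<double> packed;
  packed.reserve(layout.rows * layout.cols);
  for (std::size_t r = 0; r < layout.rows; ++r) {
    for (std::size_t c = 0; c < layout.cols; ++c) {
      packed.push_back(src[r * layout.row_stride + c]);
    }
  }
  return packed;
}

// Inverse of pack_strided: scatter the contiguous buffer back into strided
// storage, leaving the padding elements of dst untouched.
void unpack_strided(const std::vector<double>& packed, std::vector<double>& dst,
                    const Strided2D& layout) {
  for (std::size_t r = 0; r < layout.rows; ++r) {
    for (std::size_t c = 0; c < layout.cols; ++c) {
      dst[r * layout.row_stride + c] = packed[r * layout.cols + c];
    }
  }
}
```

The contiguous buffer returned by ``pack_strided`` plays the role of the intermediate allocation mentioned above.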

Device Data
-----------

Contiguous device data is handed to MPI as-is.

For a non-contiguous ``Kokkos::View`` in a memory space other than ``Kokkos::HostSpace``, any temporary buffers are allocated in the same memory space as the view being operated on.

For a non-contiguous mdspan, there is no standards-compliant way to get an allocator that can produce the same kind of allocation as the mdspan, so a matching temporary buffer cannot be allocated.
In that case, the ``Packer::MpiDatatype`` packer must be used: a datatype describing the mdspan is created (without accessing any of the mdspan's data) and then handed off to the MPI implementation.
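
To make "describing the layout without accessing the data" concrete, the sketch below derives the three integer arguments that ``MPI_Type_vector`` takes (count, block length, stride) from extents and strides alone. It does not call MPI, and whether the library builds its datatypes with ``MPI_Type_vector`` is an assumption made for this example; ``VectorTypeArgs``, ``describe_rows`` and ``span_elements`` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Arguments one might pass to MPI_Type_vector(count, blocklength, stride, ...).
// This struct is illustrative only; MPI itself is not called here.
struct VectorTypeArgs {
  int count;        // number of blocks (rows)
  int blocklength;  // elements per block (columns)
  int stride;       // elements between the starts of consecutive blocks
};

// Describe a 2D layout with unit inner stride using only its extents and
// row stride; no element of the underlying data is read.
VectorTypeArgs describe_rows(std::size_t rows, std::size_t cols,
                             std::size_t row_stride) {
  return {static_cast<int>(rows), static_cast<int>(cols),
          static_cast<int>(row_stride)};
}

// Number of elements from the first element through the last element covered
// by the layout, including any padding in between.
std::size_t span_elements(std::size_t rows, std::size_t cols,
                          std::size_t row_stride) {
  if (rows == 0 || cols == 0) return 0;
  return (rows - 1) * row_stride + cols;
}
```

Because only the shape is inspected, the same description works whether the data lives in host or device memory.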
12 changes: 12 additions & 0 deletions docs/dev/contributing.rst
@@ -0,0 +1,12 @@
Contributing
============

Code Formatting
---------------

All code should be formatted by clang-format 8:

.. code-block:: bash

    shopt -s globstar
    clang-format-8 -i {src,unit_tests,perf_tests}/**/*.[ch]pp
7 changes: 4 additions & 3 deletions docs/dev/docs.rst
@@ -1,8 +1,8 @@
Extending the Documentation
===========================

reStructuredText
----------------
Using reStructuredText
----------------------

* `Basics of rST <https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html>`_
* `Documenting C++ with rST <https://www.sphinx-doc.org/en/master/usage/domains/cpp.html>`_
@@ -24,4 +24,5 @@ Building a local copy of the docs
# builds the docs
make -C docs html
# open docs/_build/html/index.html in your favorite browser
# open docs/_build/html/index.html in your favorite browser
29 changes: 17 additions & 12 deletions docs/index.rst
@@ -1,35 +1,40 @@
.. KokkosComm documentation master file, created by
sphinx-quickstart on Tue Dec 19 11:32:06 2023.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Kokkos MPI Documentation

KokkosComm documentation!
=========================

Usage
-----

.. toctree::
   :maxdepth: 2

   usage/getting_started
   usage/performance_tests

API Reference
-------------

.. toctree::
   :maxdepth: 2
   :caption: API:

   api/core
   api/traits
   api/packing

Usage
-----
Design
------

.. toctree::
   :maxdepth: 2
   :caption: Usage:

   usage/getting_started
   design

Developer
---------

.. toctree::
   :maxdepth: 2
   :caption: For Developers:

   dev/contributing
   dev/docs

Indices and tables
3 changes: 2 additions & 1 deletion docs/requirements.txt
@@ -1 +1,2 @@
sphinx==7.2.6
sphinx==7.2.6
sphinx-sizzle-theme
7 changes: 4 additions & 3 deletions docs/usage/getting_started.rst
@@ -48,6 +48,7 @@ Known Quirks
------------

At Sandia, with the VPN enabled while using MPICH, you may have to do the following:
```bash
export FI_PROVIDER=tcp
```

.. code-block:: bash

    export FI_PROVIDER=tcp
5 changes: 5 additions & 0 deletions docs/usage/performance_tests.rst
@@ -0,0 +1,5 @@
Performance Tests
=================

* ``test_2dhalo.cpp``: a 2D halo exchange
* ``test_sendrecv.cpp``: ping-pong between ranks 0 and 1
