TCP configuration for dask example #654

Open: wants to merge 1 commit into base: branch-0.17

2 changes: 2 additions & 0 deletions docs/source/configuration.rst
@@ -74,6 +74,8 @@ Values:
- ``get_zcopy``
- ``auto`` (default)

.. _tcp-config:

UCX_TCP_RX_SEG_SIZE
```````````````````

8 changes: 8 additions & 0 deletions docs/source/dask.rst
@@ -4,6 +4,14 @@ Using with Dask
``UCX/UCX-Py`` can be used with `Dask <https://dask.org/>`_ as a drop-in replacement for the communication protocol between workers. Below we show how to use UCX-Py with both helper utilities such as `dask-cuda <https://github.com/rapidsai/dask-cuda>`_
and manually starting a dask cluster with UCX enabled. Additionally, we demonstrate using UCX with a `cuDF Example`_ and `CuPy Example`_.


.. note::

If using TCP without NVLink or Infiniband, TCP alone may require additional configuration
@pentschev (Member), Oct 22, 2020

That's why I think we should be less aggressive in trying to solve everyone's problems; "require" is definitely not the right word. It will still work without setting any of that, and we don't know whether limitations may exist when increasing the segment size for TCP depending on hardware, network stability, etc. It has worked for us and improved performance, but that doesn't mean it will in every case, and we don't extensively test for that. Furthermore, there may be a reason for segment sizes to be that small by default; if I had to guess, I would say it has to do with robustness, even for networks that are less stable and more susceptible to packet loss.

With the above said, I don't mind either having this or not, but I think we should weigh whether we really want to propose solutions we don't know much about and don't really test for.

Member

What about just saying something like "one may consider"?

Member

That's ok too. I'm just saying that by making this kind of statement we're implicitly saying that we support those configurations somehow, which isn't really the case.

Member

Is it? When we started using UCX, there were a lot of details that were unknown to us and we spent a lot of time figuring things out and writing them down. As I see it, us writing this down is just for the benefit of others so they need not complete the same exercise. IOW we are just giving users guidance and it is up to them to do what they will with it 🙂

Member

I would agree with that if we really knew what the potential side effects of those configurations are. But IMO, we know of one and only one case in which it gave better performance, and I'm not confident giving advice based on a single observation. With that said, I would personally prefer that users refer to the official UCX docs for those; after all, I have never observed UCX being faster than Python sockets when NVLink or IB is not available.

settings. Please consult the :ref:`TCP Configuration<tcp-config>` reference
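
For readers following the thread, here is a minimal sketch of what "additional configuration" could look like in practice: setting ``UCX_TCP_RX_SEG_SIZE`` in the environment of a process before UCX is initialized. The helper name ``with_tcp_seg_size`` and the value ``8M`` are assumptions chosen for illustration, not settings proposed or tested in this PR, and as the review comments point out, a larger segment size may not help on every network.

```python
import os


def with_tcp_seg_size(seg_size, base_env=None):
    """Return a copy of an environment with UCX_TCP_RX_SEG_SIZE set.

    Hypothetical helper for illustration only. An existing value is
    left untouched so that an explicit user setting always wins.
    """
    env = dict(os.environ if base_env is None else base_env)
    # UCX reads its configuration from the environment at initialization,
    # so the variable has to be in place before any worker process starts.
    env.setdefault("UCX_TCP_RX_SEG_SIZE", seg_size)
    return env


# "8M" is an assumed example value, not a tested recommendation.
env = with_tcp_seg_size("8M", base_env={})
print(env["UCX_TCP_RX_SEG_SIZE"])
```

The resulting dictionary could then be passed as the ``env`` argument of ``subprocess.Popen`` when launching a scheduler or worker, which keeps the change scoped to that process instead of the whole shell session.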



Starting with Dask-cuda
-----------------------
