refactor: backport bitcoin#24356, extend -socketevents to Sock, implement Sock::WaitMany{Epoll, KQueue} #6018

Draft
wants to merge 8 commits into
base: develop

@kwvg kwvg (Collaborator) commented May 9, 2024

Additional information

  • Dependent on backport: merge bitcoin#21167, #22782, #21943, #22829, #24079, #24108, #24157, #25109 (network backports: part 5) #6004

  • Dependent on refactor: move {epoll, kqueue} (de)init logic and wakeup pipes logic out of CConnman and into EdgeTriggeredEvents and WakeupPipes #6007

  • Dependent on fix: actually use -socketevents #6027

  • Deviations from upstream

    • Bitcoin: EventsPerSock is an unordered map of shared_ptrs of Sock wrappers and Events.
      Dash: EventsPerSock is an unordered map of raw socket file descriptors (SOCKET) and Events.
      Reason: Dash implements a "wakeup select pipe", which is constructed and destroyed using WakeupPipes, an entity outside Sock's control.

      Dash needs to be able to insert the read pipe's raw socket into the equivalent of the recv socket set and query for it later on.

      It would be technically possible, though cumbersome, to wrap the read pipe's raw socket in a Sock and override the destructor, were it not for the support of edge-triggered modes, which have an event-socket relationship, as opposed to level-triggered modes, which have a socket-event relationship.

    • Bitcoin: Sockets passed in an EventsPerSock map will always return with event data for every corresponding entry.
      Dash: Sockets passed in an EventsPerSock map may return with event data for their corresponding entries.
      Reason: The behaviour defined for Bitcoin also applies in Dash if the socket events mode (SEM) is poll or select. Otherwise, it behaves as described for Dash.

      This is due to the inversion of the socket-event relationship in edge-triggered modes (epoll and kqueue), as alluded to earlier. As edge-triggered modes return events and their corresponding sockets (sockets registered through EdgeTriggeredEvents::RegisterEntity() and friends), the EventsPerSock map, should there be events reported, will have its contents completely discarded and substituted with the results of {epoll, kqueue}.

    • Bitcoin: You must have a Sock entity to call Sock::WaitMany().
      Dash: You can directly access Sock::WaitMany()'s underlying logic by calling Sock::IWaitMany() (and access any specific event mode's implementation) without a Sock entity.
      Reason: Bitcoin's behaviour was to call WaitMany by seeking to the first element of the map and calling it through that element. This was possible because the unordered map consisted of Sock entities. As that isn't the case for Dash, and WaitMany doesn't truly rely on instance-specific member values of a particular Sock instance (the values it relies on should remain constant throughout program runtime), it can safely be made a static function, and that is exactly what was done.

      It has been named IWaitMany() because one of Sock's purposes is mockability, and WaitMany() (simply a passthrough to IWaitMany() that leverages member values) has been defined as a virtual function.

      In the interest of preventing future conflicts in backports, the characteristics of WaitMany() haven't been changed; new function names are used for Dash-specific functionality instead.

    • Bitcoin: Sock's usage of platform-specific APIs is decided exclusively at compile-time.
      Dash: Sock's usage of platform-specific APIs is determined by what is supported at compile-time and decided at runtime (mostly).
      Reason: Before this pull request, the only usage of Sock::Wait() (which is transformed into Sock::WaitMany() in this pull request) was in I2P code (source); it supported only poll and select (source) and behaved as described for Bitcoin.

      The described behaviour for Dash was only applicable to CConnman::SocketEvents(). But, as SocketEvents() is being replaced wholesale with WaitMany(), WaitMany() needed to be adapted to mirror SocketEvents() behaviour.

      This has resulted in changes to Sock that now also require knowledge of the expected runtime SEM and file descriptor (if using an edge-triggered mode).

      Note: Some portions of the codebase do not possess this knowledge and will default to using select as their SEM.

    • Bitcoin: Sock::Wait() and Sock::WaitMany() behave identically.
      Dash: Sock::Wait() will respect the SEM selection argument if it is level-triggered but will fall back to poll or select (determined at compile-time) if the SEM selection is edge-triggered.
      Reason: Due to the event-socket relationship of edge-triggered modes, they are unsuitable for querying the state of a particular socket (which is necessary if socket creation is asynchronous, see here and here).

      Because of that and a) the unlikelihood of the probed socket being registered with EdgeTriggeredEvents::RegisterEntity() and b) the overhead involved in fetching a list, filtering it for the particular socket we care about and flagging the result, it is more practical to use an LT-SEM instead.

      Forcing LT-SEM is possible by calling IWaitMany with lt_only=true.
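
The difference in how results come back for level-triggered and edge-triggered modes can be illustrated with a small sketch. Everything in it is a stand-in chosen for illustration (`RawSocket`, `EventMask`, `Events`, `ApplyLevelTriggered`, `ApplyEdgeTriggered`); these are not the declarations from `util/sock.h`, and no real `poll`/`epoll`/`kqueue` call is made. The "kernel results" are simply passed in as a vector.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only.
using RawSocket = int;       // plays the role of SOCKET
using EventMask = uint8_t;
constexpr EventMask RECV = 0x01;
constexpr EventMask SEND = 0x02;

struct Events {
    EventMask requested;
    EventMask occurred;
};
using EventsPerSock = std::unordered_map<RawSocket, Events>;
using Reported = std::vector<std::pair<RawSocket, EventMask>>;

// Level-triggered (select/poll): the caller lists the sockets it cares
// about, and every entry it passed in is kept and gets its result filled in.
void ApplyLevelTriggered(EventsPerSock& map, const Reported& ready)
{
    for (auto& entry : map) entry.second.occurred = 0;
    for (const auto& r : ready) {
        auto it = map.find(r.first);
        if (it != map.end()) it->second.occurred = r.second;
    }
}

// Edge-triggered (epoll/kqueue): the kernel reports events together with the
// socket they belong to. If anything was reported, the caller's map is
// discarded and rebuilt from those results.
void ApplyEdgeTriggered(EventsPerSock& map, const Reported& reported)
{
    if (reported.empty()) return; // nothing reported: map left untouched
    map.clear();
    for (const auto& r : reported) map[r.first].occurred |= r.second;
}
```

With a map holding sockets 3 and 4, the level-triggered path always returns two entries, while the edge-triggered path may return entries for entirely different sockets (for example, a registered wakeup pipe), which is why callers cannot assume that an entry they inserted is still present afterwards.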

Breaking Changes

None expected. Behaviour should remain unchanged.

Checklist:

Go over all the following points, and put an x in all the boxes that apply.

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@kwvg kwvg added this to the 21 milestone May 9, 2024
@kwvg kwvg force-pushed the net_processing_6 branch from 442f3c8 to ca1d635 Compare May 9, 2024 18:16
@kwvg kwvg changed the title refactor: bitcoin#21879, #23604, #24357, #24356, extend -socketevents to Sock, implement Sock::WaitMany{Epoll, KQueue} refactor: backport bitcoin#21879, #23604, #24357, #24356, extend -socketevents to Sock, implement Sock::WaitMany{Epoll, KQueue} May 9, 2024
PastaPastaPasta added a commit that referenced this pull request May 10, 2024
, bitcoin#22829, bitcoin#24079, bitcoin#24108, bitcoin#24157, bitcoin#25109 (network backports: part 5)

5dde8e7 merge bitcoin#25109: Strengthen AssertLockNotHeld assertions (Kittywhiskers Van Gogh)
a1f005e merge bitcoin#24157: Replace RecursiveMutex cs_totalBytesSent with Mutex and rename it (Kittywhiskers Van Gogh)
de4b4bf merge bitcoin#24108: Replace RecursiveMutex cs_addrLocal with Mutex, and rename it (Kittywhiskers Van Gogh)
2f7a138 merge bitcoin#24079: replace RecursiveMutex cs_SubVer with Mutex (and rename) (Kittywhiskers Van Gogh)
23b152c merge bitcoin#22829: various RecursiveMutex replacements in CConnman (Kittywhiskers Van Gogh)
362e310 merge bitcoin#21943: Dedup and RAII-fy the creation of a copy of CConnman::vNodes (Kittywhiskers Van Gogh)
bf98ad6 merge bitcoin#22782: Remove unused MaybeSetAddrName (Kittywhiskers Van Gogh)
2b65526 merge bitcoin#21167: make CNode::m_inbound_onion public, initialize explicitly (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  * Dependent on #6001
  * Dependency for #6018
  * Partially reverts ff69e0d from #5336 due to `Span<CNode*>`'s incompatibility with `CConnman::NodesSnapshot::Snap()` (returning `const std::vector<CNode*>&`)

    ```
    masternode/sync.cpp:147:18: error: no matching member function for call to 'RequestGovernanceObjectVotes'
            m_govman.RequestGovernanceObjectVotes(snap.Nodes(), connman);
            ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ./governance/governance.h:360:9: note: candidate function not viable: no known conversion from 'const
    std::vector<CNode *>' to 'CNode &' for 1st argument
        int RequestGovernanceObjectVotes(CNode& peer, CConnman& connman) const;
          ^
    ./governance/governance.h:361:9: note: candidate function not viable: no known conversion from 'const std::vector<CNode *>' to 'Span<CNode *>' for 1st argument
        int RequestGovernanceObjectVotes(Span<CNode*> vNodesCopy, CConnman& connman) const;
          ^
    1 error generated.
    ```
  * Dash already implements its own `CNode*` iteration logic in [dash#1382](#1382) and implemented additional capabilities in [dash#1575](#1575), which meant backporting [bitcoin#21943](bitcoin#21943) involved migrating Dash-specific code to upstream logic that needed to be modified to implement expected functionality.

  * Unlike Bitcoin, Dash maintains a map from every raw `SOCKET` to a pointer to its `CNode` instance and uses it to translate socket sets to their corresponding `CNode*` sets. This is done to accommodate edge-triggered modes, which have an event-socket relationship, as opposed to level-triggered modes, which have a socket-event relationship.

    This means that `CConnman::SocketHandlerConnected()` doesn't require access to a vector of all `CNode` pointers and therefore, the argument `nodes` has been omitted.
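
A minimal sketch of the socket-to-node translation described above, under stated assumptions: `RawSocket` stands in for `SOCKET`, `Node` stands in for `CNode`, and `NodesForSockets` is a hypothetical helper name, not a function from the Dash codebase.

```cpp
#include <cassert>
#include <set>
#include <unordered_map>
#include <vector>

using RawSocket = int;     // stand-in for SOCKET
struct Node { int id; };   // stand-in for CNode

// Translate a set of ready sockets into the nodes that own them.
// Sockets with no associated node (for example a listening socket or a
// wakeup pipe's read end) are simply skipped.
std::vector<Node*> NodesForSockets(const std::unordered_map<RawSocket, Node*>& by_socket,
                                   const std::set<RawSocket>& ready)
{
    std::vector<Node*> out;
    out.reserve(ready.size());
    for (RawSocket s : ready) {
        auto it = by_socket.find(s);
        if (it != by_socket.end()) out.push_back(it->second);
    }
    return out;
}
```

Because the lookup starts from the reported sockets rather than from a list of all nodes, the handler only needs the map, which is consistent with dropping the `nodes` argument.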

  ## Checklist:

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)**
  - [x] I have added or updated relevant unit/integration/functional/e2e tests **(note: N/A)**
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  PastaPastaPasta:
    utACK 5dde8e7

Tree-SHA512: 5685d8ebb4fa1f10d018e60d9b0efc3100ea13ac437e7892a09ad3f86d6ac6756e4b5a08ebe70de2eabb27740678e10b975d319f2d553ae5b27dafa71dba0a9f
PastaPastaPasta added a commit that referenced this pull request May 14, 2024
…keup pipes logic out of `CConnman` and into `EdgeTriggeredEvents` and `WakeupPipes`

bd8b5d4 net: add more details to log information in ETE and `WakeupPipes` (Kittywhiskers Van Gogh)
ec99294 net: restrict access `EdgeTriggerEvents` members (Kittywhiskers Van Gogh)
f24520a net: log `close` failures in `EdgeTriggerEvents` and `WakeupPipe` (Kittywhiskers Van Gogh)
b8c3b48 refactor: introduce `WakeupPipe`, move wakeup select pipe logic there (Kittywhiskers Van Gogh)
ed7d976 refactor: move wakeup pipe (de)registration to ETE (Kittywhiskers Van Gogh)
f50c710 refactor: move `CConnman::`(`Un`)`registerEvents` to ETE (Kittywhiskers Van Gogh)
3a9f386 refactor: move `SOCKET` addition/removal from interest list to ETE (Kittywhiskers Van Gogh)
212df06 refactor: introduce `EdgeTriggeredEvents`, move {epoll, kqueue} fd there (Kittywhiskers Van Gogh)
3b11ef9 refactor: move `CConnman::SocketEventsMode` to `util/sock.h` (Kittywhiskers Van Gogh)

Pull request description:

  ## Motivation

  `CConnman` is an entity that contains a lot of platform-specific implementation logic, both inherited from upstream and added upon by Dash (support for edge-triggered socket events modes like `epoll` on Linux and `kqueue` on FreeBSD/Darwin).

  Bitcoin has since moved to strip down `CConnman` by moving peer-related logic to the `Peer` struct in `net_processing` (portions of which are backported in #5982 and friends, tracking efforts from bitcoin#19398) and moving socket-related logic to `Sock` (portions of which are aimed to be backported in #6004, tracking efforts from bitcoin#21878).

  Given the direction being taken and the difference in how edge-triggered events modes operate (utilizing interest lists and events instead of iterating over each socket) compared to level-triggered modes (which are inherited from upstream), it would therefore be reasonable to isolate Dash-specific code into its own entities and minimize the information `CConnman` has about their internal workings.

  One visible benefit of this approach can be seen by comparing `develop` (as of this writing, d44b0d5) with this pull request in how wakeup pipes logic and {`epoll`, `kqueue`} logic interact.

  This is what construction looks like:

  https://github.com/dashpay/dash/blob/d44b0d5dcb9b54821d582b267a8b92264be2da1b/src/net.cpp#L3358-L3397

  But if we separate wakeup pipes logic (which works on any platform with POSIX APIs, and so excludes Windows) from {`epoll`, `kqueue`} logic (calling the latter `EdgeTriggeredEvents`), construction looks different:

  https://github.com/dashpay/dash/blob/907a3515170abed4ce9018115ed591e6ca9f4800/src/util/wpipe.cpp#L12-L38

  Now wakeup pipes logic doesn't need to know which socket events mode is being used, nor are the implementation details of (de)registering the pipe its concern; that is now `EdgeTriggeredEvents`' problem.
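
For readers unfamiliar with the pattern, here is a minimal sketch of a wakeup pipe on a POSIX platform. It is not the code from `src/util/wpipe.cpp`; the class name `WakeupPipeSketch` and the helper `Readable` are hypothetical, and only standard POSIX calls (`pipe`, `fcntl`, `read`, `write`, `poll`, `close`) are used.

```cpp
#include <cassert>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

// Illustrative only: owns a non-blocking pipe whose read end can be waited
// on alongside sockets, and whose write end is used to wake the waiter.
class WakeupPipeSketch
{
public:
    WakeupPipeSketch()
    {
        if (pipe(m_fds) != 0) {
            m_fds[0] = m_fds[1] = -1;
            return;
        }
        for (int fd : m_fds) {
            const int flags = fcntl(fd, F_GETFL, 0);
            if (flags != -1) fcntl(fd, F_SETFL, flags | O_NONBLOCK);
        }
    }
    ~WakeupPipeSketch()
    {
        for (int fd : m_fds) {
            if (fd != -1) close(fd);
        }
    }
    WakeupPipeSketch(const WakeupPipeSketch&) = delete;
    WakeupPipeSketch& operator=(const WakeupPipeSketch&) = delete;

    bool IsValid() const { return m_fds[0] != -1; }
    int ReadFd() const { return m_fds[0]; }

    // Wake whoever is waiting on ReadFd() by making it readable.
    void Write()
    {
        const char byte = 0;
        const auto n = write(m_fds[1], &byte, 1);
        (void)n; // a full pipe already guarantees a pending wakeup
    }

    // Consume pending bytes so the read end stops reporting as readable.
    void Drain()
    {
        char buf[128];
        while (read(m_fds[0], buf, sizeof(buf)) > 0) {}
    }

private:
    int m_fds[2]{-1, -1};
};

// Non-blocking check of whether fd is readable right now.
bool Readable(int fd)
{
    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;
    return poll(&p, 1, 0) == 1 && (p.revents & POLLIN) != 0;
}
```

Note that the sketch knows nothing about `epoll` or `kqueue`: it only exposes a read descriptor. Whoever owns the interest list is responsible for registering and unregistering that descriptor, which is the separation of concerns described above.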

  ## Additional Information

  * This pull request will need testing on macOS (FreeBSD isn't a tier-one target) to ensure there is no breakage in `kqueue`-specific logic.

  ## Breaking Changes

  * Dependency for #6018
  * More logging has been introduced and existing log messages have been made more exhaustive. If there is parsing that relies on a particular template, they will have to be updated.
  * If `EdgeTriggeredEvents` or `WakeupPipes` fail to initialize, or are incorrectly initialized and not destroyed immediately, any further attempt to call any of their functions will result in an `assert`-induced crash. Earlier behavior may have allowed for silent failure, but separating this logic from `CConnman` means the newly created instances must only exist if the circumstances needed for them to initialize correctly are present.

    This is to ensure that `CConnman` doesn't have to concern itself with the internal workings of either entity.

  ## Checklist:

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [x] I have added or updated relevant unit/integration/functional/e2e tests **(note: N/A)**
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  PastaPastaPasta:
    utACK bd8b5d4

Tree-SHA512: 8f793d4b4f2d8091e05bb9cc108013e924bbfbf19081290d9c0dfd91b0f2c80652ccf853f1596562942b4433509149c526e111396937988db605707ae1fe7366

This pull request has conflicts, please rebase.

@kwvg kwvg force-pushed the net_processing_6 branch from ca1d635 to 6384f42 Compare May 15, 2024 09:45
@kwvg kwvg force-pushed the net_processing_6 branch from 6384f42 to ac867bb Compare May 19, 2024 07:44

@UdjinM6 UdjinM6 left a comment

few suggestions. also, kqueue still doesn't work :/

src/net.cpp (resolved)
src/net.cpp (outdated, resolved)
src/util/sock.cpp (outdated, resolved)
src/util/sock.cpp (outdated, resolved)
src/net.cpp (outdated, resolved)

This pull request has conflicts, please rebase.

@UdjinM6 UdjinM6 left a comment

few more suggestions

src/net.cpp (outdated, resolved)
src/net.cpp Outdated
// Check for the readiness of the already connected sockets and the
// listening sockets in one call ("readiness" as in poll(2) or
// select(2)). If none are ready, wait for a short while and return
// empty sets.
SocketEvents(snap.Nodes(), recv_set, send_set, error_set, only_poll);
events_per_sock = GenerateWaitSockets(snap.Nodes());
if (events_per_sock.empty() || !Sock::IWaitMany(socketEventsMode, GetModeFileDescriptor(),

events_per_sock.empty() should prevent IWaitMany execution for select and poll only; this check should be done inside the corresponding handlers

Collaborator Author

I'm not comfortable doing this. I think the responsibility for ensuring the set is populated for the appropriate mode lies with the caller (more so now that the wait code has been moved out of CConnman). Instead, I've made the events_per_sock.empty() check conditional on being in an LT SEM since, in an ET SEM, we'd be discarding the set anyway.

#ifdef USE_POLL
bool Sock::WaitManyPoll(wrap_fn wrap_func, std::chrono::milliseconds timeout, EventsPerSock& events_per_sock)
{
std::vector<pollfd> pfds;

Suggested change
- std::vector<pollfd> pfds;
+ if (events_per_sock.empty()) return false;
+ std::vector<pollfd> pfds;


bool Sock::WaitManySelect(wrap_fn wrap_func, std::chrono::milliseconds timeout, EventsPerSock& events_per_sock)
{
fd_set recv;

Suggested change
- fd_set recv;
+ if (events_per_sock.empty()) return false;
+ fd_set recv;

src/net.cpp Outdated
if ((socketEventsMode == SocketEventsMode::Poll && !only_poll) ||
socketEventsMode == SocketEventsMode::Select)
{
interruptNet.sleep_for(timeout);

I think this should actually always be SELECT_TIMEOUT_MILLISECONDS, should not depend on only_poll

Suggested change
- interruptNet.sleep_for(timeout);
+ interruptNet.sleep_for(std::chrono::milliseconds(SELECT_TIMEOUT_MILLISECONDS));

Collaborator Author

The !only_poll condition was originally added to make sure the timeout would always be SELECT_TIMEOUT_MILLISECONDS, by ensuring that we don't use timeout when it would be 0 (that is, when only_poll is true) (source).

It wasn't originally meant to mirror the !only_poll conditional sleep in SocketEventsPoll since the GenerateSelectSet was renamed to GenerateWaitSockets and we don't run Sock::IWaitMany at all if the set's empty, removing the need for a conditional timeout.

The original code did use the wait in every mode (with the above condition to avoid a sleep of 0), mirroring upstream (source). With that in mind, I've changed the condition to only do the sleep if we're in an LT SEM (mirroring upstream behaviour), skipping it for ET SEMs.
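
One reading of the resulting control flow, written as a small sketch. The enumerator names `EPoll` and `KQueue`, and the helpers `IsLevelTriggered`, `ShouldCallWait` and `ShouldSleep`, are assumptions made for illustration; only `SocketEventsMode::Poll` and `SocketEventsMode::Select` appear in the snippets quoted in this thread.

```cpp
#include <cassert>

// Assumed shape of the mode enum; only Select and Poll are confirmed above.
enum class SocketEventsMode { Select, Poll, EPoll, KQueue };

constexpr bool IsLevelTriggered(SocketEventsMode mode)
{
    return mode == SocketEventsMode::Select || mode == SocketEventsMode::Poll;
}

// In an LT SEM an empty set gives select/poll nothing to wait on, so the
// wait call is skipped. In an ET SEM the set passed in would be discarded
// anyway, so the wait call is always made.
constexpr bool ShouldCallWait(SocketEventsMode mode, bool set_empty)
{
    return !(IsLevelTriggered(mode) && set_empty);
}

// The fallback sleep only applies to LT SEMs, and only when the wait was
// skipped or failed. ET SEMs never take the sleep.
constexpr bool ShouldSleep(SocketEventsMode mode, bool waited_ok)
{
    return IsLevelTriggered(mode) && !waited_ok;
}
```

Here `waited_ok` is meant as "the wait call was made and returned success", so an LT SEM with an empty set results in a sleep, while an ET SEM proceeds straight to handling whatever the interest list reports.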

src/net.cpp (outdated, resolved)
@kwvg kwvg force-pushed the net_processing_6 branch from e3f2ae7 to 63c093d Compare June 6, 2024 10:09
@kwvg kwvg requested a review from UdjinM6 June 6, 2024 16:48

This pull request has conflicts, please rebase.

PastaPastaPasta added a commit that referenced this pull request Jun 12, 2024
, bitcoin#25426, bitcoin#24378 (sockets backports)

c24804c merge bitcoin#24378: make bind() and listen() mockable/testable (Kittywhiskers Van Gogh)
be19868 merge bitcoin#25426: add new method Sock::GetSockName() that wraps getsockname() and use it in GetBindAddress() (Kittywhiskers Van Gogh)
6b159f1 merge bitcoin#24357: make setsockopt() and SetSocketNoDelay() mockable/testable (Kittywhiskers Van Gogh)
9c751ef merge bitcoin#23604: Use Sock in CNode (Kittywhiskers Van Gogh)
508044c merge bitcoin#21879: wrap accept() and extend usage of Sock (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  * Dependency for #6018

  ## Breaking Changes

  None expected.

  ## Checklist:

  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)**
  - [x] I have added or updated relevant unit/integration/functional/e2e tests
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_

ACKs for top commit:
  PastaPastaPasta:
    utACK c24804c

Tree-SHA512: 5149de0f1983bb56517c30b31d137b33b8a49b0e695be2dada71ff3e3bb22908556db343391b7df7e3c7c2ed60ae1fc11a4f4af4f47e35a2a1d3ce7463c03d41
@kwvg kwvg force-pushed the net_processing_6 branch from 63c093d to 6519581 Compare June 12, 2024 15:38
@kwvg kwvg changed the title refactor: backport bitcoin#21879, #23604, #24357, #24356, extend -socketevents to Sock, implement Sock::WaitMany{Epoll, KQueue} refactor: backport bitcoin#24356, extend -socketevents to Sock, implement Sock::WaitMany{Epoll, KQueue} Jun 12, 2024
@UdjinM6 UdjinM6 removed this from the 21 milestone Jul 26, 2024

github-actions bot commented Sep 4, 2024

This pull request has conflicts, please rebase.

This pull request has conflicts, please rebase.
