Optimization: Remove constraints on arrow firing #378

nathanwbrei · 2024-10-29T17:22:36Z

Previously, the scheduler assigned arrows to workers with only a hint as to whether they could actually fire. The firing logic was complex: it attempted to pop some number of events from each input queue and to reserve space for a predetermined number of events on each output queue, and if fewer events proved poppable or reservable than desired, it reverted the operation. This led to two problems:

1. Locks needed to be acquired for each input and output queue both before and after firing, whether firing was successful or not.
2. If the preconditions were not met, the worker would retry firing, but with a delay. Since the worker had no indication of whether the arrow was about to become fireable, this led to unnecessary waiting in places where no work was available while work was available elsewhere. Conversely, when an event source was delayed because no data was ready on a socket, or because of a barrier event, the scheduler rapidly hammered the event source when a more appropriate action would have been to wait.
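
To make problem (1) concrete, here is a minimal sketch of the old pop/reserve/revert pattern. It is not the removed JANA2 code: `Event`, `LockedQueue`, and `try_fire_old` are hypothetical names, the attempt is simplified to all-or-nothing, and the lock counter exists only so the cost of one attempt can be observed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical stand-in for an event; not JANA2's JEvent.
struct Event { int id; };

// Hypothetical bounded queue guarded by one mutex. It counts lock
// acquisitions made by the mutating operations.
class LockedQueue {
public:
    explicit LockedQueue(std::size_t capacity) : m_capacity(capacity) {}

    // Pop up to `count` events into `dest`; returns how many were popped.
    std::size_t pop(std::vector<Event>& dest, std::size_t count) {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_lock_count;
        std::size_t popped = 0;
        while (popped < count && !m_items.empty()) {
            dest.push_back(m_items.front());
            m_items.pop_front();
            ++popped;
        }
        return popped;
    }

    // Reserve up to `count` free slots; returns how many were granted.
    std::size_t reserve(std::size_t count) {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_lock_count;
        std::size_t used = m_items.size() + m_reserved;
        std::size_t free_slots = m_capacity > used ? m_capacity - used : 0;
        std::size_t granted = std::min(count, free_slots);
        m_reserved += granted;
        return granted;
    }

    // Append events and release `reserved` previously reserved slots.
    void push(const std::vector<Event>& events, std::size_t reserved) {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_lock_count;
        assert(reserved <= m_reserved);
        m_items.insert(m_items.end(), events.begin(), events.end());
        m_reserved -= reserved;
    }

    // Revert a pop: put the events back at the front in their original order.
    void unpop(const std::vector<Event>& events) {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_lock_count;
        m_items.insert(m_items.begin(), events.begin(), events.end());
    }

    // Inspection helpers; these do not increment the lock counter.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_items.size();
    }
    std::size_t lock_count() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_lock_count;
    }

private:
    mutable std::mutex m_mutex;
    std::deque<Event> m_items;
    std::size_t m_capacity;
    std::size_t m_reserved = 0;
    std::size_t m_lock_count = 0;
};

// One firing attempt in the old style (simplified to all-or-nothing).
bool try_fire_old(LockedQueue& input, LockedQueue& output, std::size_t want) {
    std::vector<Event> events;
    std::size_t popped = input.pop(events, want);   // lock input
    std::size_t reserved = output.reserve(want);    // lock output
    if (popped < want || reserved < want) {
        input.unpop(events);                        // lock input again
        output.push({}, reserved);                  // lock output again
        return false;                               // caller retries later
    }
    // ... event processing would happen here ...
    output.push(events, reserved);                  // lock output again
    return true;
}
```

In this sketch a failed attempt still locks each queue twice, once to try and once to revert, and it tells the worker nothing about when a retry might succeed.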

This PR refactors JArrow in order to address issue (1) and clear the way for addressing issue (2). It does the following:

- Making reservations on output queues is no longer necessary, as queues are now sized to always hold max_inflight_events items.
- Arrows have been refactored to support a much simpler (and easier to achieve) trigger. To execute successfully, each arrow needs exactly one event from one predetermined queue (plus an optional delay parameter for polling sockets and barrier events). This should decrease the number of spurious retries in more complex processing topologies, and make debugging, inspection, visualization, and timeout logic much cleaner.
- The very general JArrow::Place and JArrow::Data machinery (which was introduced to keep the now-removed pull/reserve/revert logic tractable) has been replaced by a much simpler JArrow::Port, which removes a few virtual function calls and makes it possible to generically wire arbitrary JArrows together via a config parameter.
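
The simplified trigger can be sketched roughly as follows. This is an illustration under assumptions, not the actual JArrow/JTriggeredArrow code: `EventQueue`, `Port`, `TriggeredArrow`, `attach`, and `fire` are hypothetical names, the optional delay parameter is omitted, and each queue is simply assumed to be sized to max_inflight_events so that a push can never overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event; not JANA2's JEvent.
struct Event { int id; };

// Hypothetical queue sized to hold every in-flight event, so pushing
// never needs a prior reservation.
class EventQueue {
public:
    explicit EventQueue(std::size_t max_inflight_events)
        : m_capacity(max_inflight_events) {}

    // Returns nullptr when the queue is empty.
    Event* pop() {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_items.empty()) return nullptr;
        Event* event = m_items.front();
        m_items.pop_front();
        return event;
    }

    void push(Event* event) {
        std::lock_guard<std::mutex> lock(m_mutex);
        assert(m_items.size() < m_capacity);  // holds if sized to max in-flight
        m_items.push_back(event);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_items.size();
    }

private:
    mutable std::mutex m_mutex;
    std::deque<Event*> m_items;
    std::size_t m_capacity;
};

// Hypothetical port: a plain slot that a queue can be attached to.
struct Port { EventQueue* queue = nullptr; };

// Hypothetical arrow that fires on exactly one event from one
// predetermined port. `process` returns the index of the output port.
class TriggeredArrow {
public:
    using Process = std::function<std::size_t(Event&)>;

    TriggeredArrow(std::size_t port_count, std::size_t trigger_port, Process process)
        : m_ports(port_count), m_trigger_port(trigger_port),
          m_process(std::move(process)) {}

    // Generic wiring: any queue can be attached to any port by index.
    void attach(std::size_t port_index, EventQueue* queue) {
        m_ports.at(port_index).queue = queue;
    }

    // One firing attempt: one pop, then one push. Nothing to revert.
    bool fire() {
        EventQueue* input = m_ports[m_trigger_port].queue;
        assert(input != nullptr);
        Event* event = input->pop();
        if (event == nullptr) return false;   // trigger not satisfied
        std::size_t output_port = m_process(*event);
        EventQueue* output = m_ports.at(output_port).queue;
        assert(output != nullptr);
        output->push(event);                  // cannot overflow by assumption
        return true;
    }

private:
    std::vector<Port> m_ports;
    std::size_t m_trigger_port;
    Process m_process;
};
```

Because the only precondition is a single event on a single queue, a failed attempt in this sketch costs one lock and leaves no state to undo.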

nathanwbrei added 22 commits October 29, 2024 12:50

Always constrain the number of in-flight events

40d7d57

Clean up JEventPool

201dbed

Resolves confusion between max_inflight_events and event_pool_capacity when the number of locations != 1

Arrows, queues, and pools pass around JEvent*

582d03e

Disable queue reservations

977da32

This is two fewer locks on every attempt to execute an arrow

WIP: Refactor JArrow::execute

a30449f

Remove dead JJunctionArrow

593fd81

Remove queue reservations

97ade2d

Simplest example of new triggered arrow concept

9320920

Migrate JEventTapArrow

9debf01

Migrate all JPipelineArrows to JTriggeredArrows

032764a

Remove JPipelineArrow

3a0bef5

JTriggeredArrow supports no-input-event and timed triggers

d2cb904

JTriggeredArrow handles 'rejected' events correctly

abb5ba5

Migrate JEventSourceArrow to JTriggeredArrow

65e5246

Migrate JUnfoldArrow to JTriggeredArrow

8454a43

Migrate JFoldArrow to JTriggeredArrow

48d97bf

Rough cut of JEventFolder

4448282

Remove obsolete JArrow Place machinery

ff77aa0

Fix JFoldArrow

bd8dc2d

Replace JArrow::Place with JArrow::Port

014a666

Make arrow attachment mechanism consistent

8e5d097

Tiny fix

79e48a9

nathanwbrei merged commit a347e9b into master Oct 29, 2024
9 checks passed

nathanwbrei deleted the nbrei_optimization branch October 29, 2024 18:55
