Create a worker pool that supports job streaming #1

Closed
4 of 6 tasks
plastikfan opened this issue Aug 6, 2023 · 0 comments
Assignees
Labels
feature New feature or request

plastikfan commented Aug 6, 2023

So many of the examples, documentation and other Go packages do not support the model where jobs can be issued to a worker pool in bursts, at any time, up to and including some known end point. Most of the examples discovered assume the client knows the full job stream up front; ie we see examples where a client creates a slice containing all the jobs, dispatches them to the pool, and immediately closes the channel. These are really noddy examples that don't reflect the complexity of the real world. So we need to roll our own.

The worker pool must have the following features/properties:

  • must contain a means of reporting errors, via an error channel
  • must have an output channel off which the client can receive results
  • must be observable compatible; this is because this needs to be able to be wrapped inside an observable to support the reactive model
  • must be able to support reactive operators (only a small set of operators, ie only the ones required by extendio)
  • must be able to support reactive options
  • appropriate channels must be based upon generics rather than reflection/interface{}. But we also need a specific type, let's call this the envelope, which defines fixed properties for internal requirements and a client-defined generic parameter, let's call this the payload.

Channels in play:
🔆 jobs (input)
🔆 results (output)
🔆 errors (output)
🔆 cancel (signal)
🔆 done (signals no more new work)

▶️ ProducerGR(observable):
- writes to job channel

▶️ PoolGR(workers):
- reads from job channel

▶️ ConsumerGR(observer):
- reads from results channel
- reads from errors channel

Both the Producer and the Consumer should be started up immediately as
separate GRs, distinct from the main GR.

* ProducerGR(observable) --> owns the job channel and should be free to close it
when no more work is available.

* PoolGR(workers) --> the pool owns the output channels

So, the next question is: how does the pool know when to close the output channels?
In theory, this should be when the jobs queue is empty and the current pool of
workers is empty. This realisation now makes us discover what the worker is. The
worker is effectively a handle to the goroutine, which is stored in a scoped collection.
This collection should probably be a map whose key is a uniquely generated ID
(see "github.com/google/uuid"). When the map is empty, we know there are no more
workers active to send to the outputs, therefore we can close them.

TODO:
