[FM-751] Add system task offset evaluation strategy #179

jaro0149 · 2024-11-11T05:44:07Z

FM-751 Add system task offset evaluation strategy

Pull Request type

Bugfix
Feature
Refactoring (no functional changes, no api changes)
Build related changes (Please run ./gradlew generateLock saveLock to refresh dependencies)
WHOSUSING.md
Other (please describe):

Changes in this PR

- Added an option to customize, per task type, the strategy used
  to compute the offset of a postponed system task:

  conductor.app.system-task-offset-evaluation.[task-type]=[strategy]

  [task-type] - type of the task, e.g. join, simple, ...
  [strategy] - strategy used for computation of the system task
    offset; currently supported options are:
    a. 'constant_default_offset'
    b. 'backoff_to_default_offset'
    c. 'scaled_by_queue_size'
    d. 'scaled_by_task_duration'

- 'constant_default_offset' - uses the constant value of the
  'systemTaskWorkerCallbackDuration' configuration property;
  by default, it is used by all system tasks except 'join'
- 'backoff_to_default_offset' - scales the offset exponentially
  (2^n) with the task poll-count, up to the value of the
  'systemTaskWorkerCallbackDuration' configuration property;
  by default, it is used by the 'join' system task
- 'scaled_by_queue_size' - scales the offset exponentially (2^n)
  with the task poll-count, up to a limit that depends on the
  actual queue size:
  a. 'backoff_to_default_offset', if queue size == 0
  b. 'backoff_to_default_offset'*'queue_size' otherwise
  this strategy is not used in the default configuration
- 'scaled_by_task_duration' - computes the evaluation offset for
  a postponed task based on the task's duration and settings that
  define the offset for different levels of task durations
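A minimal Java sketch of how the first three strategies could be computed, written only from the description above. The class and method names (OffsetSketch, backoffToDefaultOffset, ...) are hypothetical and are not the classes added by this PR. It also assumes that offsets are expressed in seconds and that the limit for 'scaled_by_queue_size' is the default offset multiplied by the queue size.

```java
// Hypothetical helper for illustration only; not the PR's implementation.
final class OffsetSketch {
    private OffsetSketch() {}

    // 2^pollCount, with the exponent clamped so the shift cannot overflow a long.
    static long exponential(int pollCount) {
        int exponent = Math.max(0, Math.min(pollCount, 62));
        return 1L << exponent;
    }

    // 'constant_default_offset': always the configured default offset.
    static long constantDefaultOffset(long defaultOffsetSeconds) {
        return defaultOffsetSeconds;
    }

    // 'backoff_to_default_offset': grows as 2^pollCount, capped at the default offset.
    static long backoffToDefaultOffset(int pollCount, long defaultOffsetSeconds) {
        return Math.min(exponential(pollCount), defaultOffsetSeconds);
    }

    // 'scaled_by_queue_size': same growth, but the cap is scaled by the queue size
    // (assumption: cap = default offset when the queue is empty,
    // default offset * queue size otherwise).
    static long scaledByQueueSize(int pollCount, int queueSize, long defaultOffsetSeconds) {
        long cap = queueSize == 0
                ? defaultOffsetSeconds
                : defaultOffsetSeconds * (long) queueSize;
        return Math.min(exponential(pollCount), cap);
    }
}
```

Under these assumptions, a join task in a queue of 100 tasks with a default offset of 30 keeps backing off past 30 until it reaches 3000, whereas 'backoff_to_default_offset' stops growing at 30.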

Reasoning:
- New strategies were implemented primarily to solve performance
  issues on join queues that contain a large number of join tasks
  blocked by wait/human actions in some forks for several
  days/weeks.
- Implemented strategies can easily be extended in the future
  while preserving backwards compatibility.
- Improved configurability of the task offset evaluation.

Alternatives considered

https://nitish1503.medium.com/decoding-challenges-with-netflix-conductor-6a623b47291f - this approach would require overly large changes to the core architecture of Conductor

- Added option to customize strategy used for computation
  of a postponed system task per task type:

  conductor.app.system-task-offset-evaluation.[task-type]=[strategy]

  [task-type] - type of the task, e.g. join, simple, ...
  [strategy] - strategy used for computation of the system task
    offset; currently supported options are:
    a. 'constant_default_offset'
    b. 'backoff_to_default_offset'
    c. 'scaled_by_queue_size'

- 'constant_default_offset' - uses constant value of set
  'systemTaskWorkerCallbackDuration' configuration property;
  by default, it is used by all but 'join' system tasks
- 'backoff_to_default_offset' - scales offset based on task
  poll-count in exponential way (2^n) up to value of the
  'systemTaskWorkerCallbackDuration' configuration property;
  by default, it is used by 'join' system task
- 'scaled_by_queue_size' - scales offset based on task poll-count
  and actual queue size in exponential way (2^n) up to value of:
  a. 'backoff_to_default_offset', if queue size == 0
  b. 'backoff_to_default_offset'*'queue_size' otherwise
  this strategy is not used in the default configuration
- Implemented new 'scaled_by_queue_size' strategy is appropriate
  for relatively big queues (100-1000s tasks) that contain
  long-running tasks (days-weeks) with high number of poll-counts.

Reasoning:
- New strategy was implemented primarily to solve performance
  issues on join queues that contain a large number of join tasks
  blocked by wait/human actions in some forks for several
  days/weeks.
- Implemented strategies can easily be extended in the future
  while preserving backwards compatibility.
- Improved configurability of the task offset evaluation.

- from BACKOFF_TO_DEFAULT_OFFSET - to SCALED_BY_QUEUE_SIZE

- goal: cleaner goals, separated configuration and implementation
  aspects
- we can directly inject ConductorProperties into implementations
  of strategies that are represented by Spring components
- introduction of TaskOffsetEvaluationSelector that allows other
  components to load the implementation of a specific strategy
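The selector idea can be sketched in plain Java as a lookup from a configured strategy name to its implementation. This is only an illustration: the names OffsetStrategy, OffsetEvaluation, ConstantOffsetSketch and OffsetEvaluationSelectorSketch are hypothetical, the method signature is an assumption, and the Spring wiring mentioned above (components injected with ConductorProperties) is replaced here by a plain constructor argument.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical strategy identifiers mirroring the configuration values.
enum OffsetStrategy {
    CONSTANT_DEFAULT_OFFSET,
    BACKOFF_TO_DEFAULT_OFFSET,
    SCALED_BY_QUEUE_SIZE,
    SCALED_BY_TASK_DURATION
}

// Hypothetical contract that each strategy implementation would fulfil.
interface OffsetEvaluation {
    OffsetStrategy type();

    long computeOffsetSeconds(int pollCount, int queueSize);
}

// Example implementation: ignores its inputs and returns the default offset.
final class ConstantOffsetSketch implements OffsetEvaluation {
    private final long defaultOffsetSeconds;

    ConstantOffsetSketch(long defaultOffsetSeconds) {
        this.defaultOffsetSeconds = defaultOffsetSeconds;
    }

    @Override
    public OffsetStrategy type() {
        return OffsetStrategy.CONSTANT_DEFAULT_OFFSET;
    }

    @Override
    public long computeOffsetSeconds(int pollCount, int queueSize) {
        return defaultOffsetSeconds;
    }
}

// Indexes all known implementations by type so callers can look one up by name.
final class OffsetEvaluationSelectorSketch {
    private final Map<OffsetStrategy, OffsetEvaluation> byType =
            new EnumMap<>(OffsetStrategy.class);

    OffsetEvaluationSelectorSketch(Iterable<OffsetEvaluation> evaluations) {
        for (OffsetEvaluation evaluation : evaluations) {
            byType.put(evaluation.type(), evaluation);
        }
    }

    // Accepts the lower-case configuration value, e.g. "constant_default_offset".
    OffsetEvaluation select(String configuredName) {
        OffsetStrategy strategy =
                OffsetStrategy.valueOf(configuredName.trim().toUpperCase(Locale.ROOT));
        OffsetEvaluation evaluation = byType.get(strategy);
        if (evaluation == null) {
            throw new IllegalStateException("No implementation registered for " + strategy);
        }
        return evaluation;
    }
}
```

Keeping the lookup in one place means a caller only needs the configured strategy name for a task type, and a new strategy can be added by registering one more implementation.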

- Computes the evaluation offset for a postponed task based on the
  task's duration and settings that define the offset for different
  levels of task durations.
- In this strategy the offset increases in steps, based on settings
  that define the offset for different levels of task durations.
  Task duration is derived from TaskModel#getScheduledTime() and
  the current time.
- This strategy is appropriate for tasks that have a wide range of
  durations, where the offset should be scaled based on the task's
  duration.
- The keys defined in the settings compose the duration intervals
  for which the offset will be set to the corresponding value:
  <0, d1) = 0, <d1, d2) = d1, <d2, d3) = d2.
- The order of the keys is not important, as the map is sorted by
  key before the evaluation.
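The stepwise lookup described above can be sketched with a sorted map. This is an illustration under stated assumptions, not the PR's code: the class name DurationOffsetSketch and its method are hypothetical, the settings are assumed to map a duration threshold (in seconds) to an offset (in seconds), and a duration in the interval <d1, d2) is assumed to receive the offset configured under key d1, with 0 used below the smallest key.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only.
final class DurationOffsetSketch {
    private DurationOffsetSketch() {}

    static long offsetSeconds(Map<Long, Long> offsetByDurationSeconds,
                              long scheduledTimeMillis,
                              long nowMillis) {
        // Task duration = time elapsed since the task was scheduled.
        long durationSeconds = Math.max(0L, (nowMillis - scheduledTimeMillis) / 1000L);

        // Sort by key, so the order of the keys in the settings does not matter.
        TreeMap<Long, Long> sorted = new TreeMap<>(offsetByDurationSeconds);

        // Largest configured threshold that is <= the task duration.
        Map.Entry<Long, Long> step = sorted.floorEntry(durationSeconds);

        // Below the smallest threshold, i.e. in <0, d1), the offset is 0.
        return step == null ? 0L : step.getValue();
    }
}
```

With hypothetical settings {10 -> 1, 60 -> 5, 600 -> 30}, a task scheduled 59 seconds ago gets offset 1, a task scheduled 60 seconds ago gets 5, and a task scheduled an hour ago gets 30.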

jaro0149 added the enhancement New feature or request label Nov 11, 2024

jaro0149 requested review from Jozefiel and MatejGlemba November 11, 2024 05:51

MatejGlemba approved these changes Nov 11, 2024

Jozefiel approved these changes Nov 11, 2024

jaro0149 and others added 5 commits November 11, 2024 14:58

FM-751 Change default offset strategy of join task

4f4478c

- from BACKOFF_TO_DEFAULT_OFFSET - to SCALED_BY_QUEUE_SIZE

FM-751 Add config property for SCALED_BY_TASK_DURATION strategy

49e2a98

FM-751 Revert offset settings to default value

05137a3

jaro0149 merged commit 54a66d3 into master Jan 8, 2025
2 checks passed

jaro0149 deleted the fm-751 branch January 8, 2025 09:38
