Improve busy loop stability #90

douglas-raillard-arm · 2019-10-09T17:15:17Z

The current implementation of the busy loop seems to produce somewhat non-reproducible calibration values, and potentially different duty cycles when the same workload is executed twice. Old versions of rt-app seem to behave well, but for some reason building a recent version with the same toolchain leads to these issues.

In order to solve that:

* Shield the busy loop against compiler optimizations, since the function is "useless". It's cheap to do and should avoid future pain.
* Use a simpler loop body that avoids branches to other functions. Hopefully, the behaviour of such a body should stay the same in the future.

This PR will, however, change the behavior of rt-app on asymmetric systems with so-called CPU PELT invariance. The invariance described by CPU capacities only holds for a given mix of instructions. Since the CPU capacities have typically been established using a benchmark X (supposedly Dhrystone), the duty cycle of any other periodic workload will scale differently when it is moved between CPUs. Changing the rt-app loop body will therefore change the utilization of the task when running on a little CPU. This can be accounted for when creating the JSON if the task is pinned to a given class of CPUs, but there is no real solution when the task is free to move to any CPU.

Since there is no way of actually ensuring that rt-app calibration values will be inversely proportional to CPU capacities, that is a lost battle, so IMO we should aim at getting reproducible results instead. People interested in reproducing very accurate util signals should update the in-kernel capacities of their CPUs based on rt-app calibration values before running their tests.
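As a rough illustration of deriving capacities from calibration values, here is a minimal sketch. It assumes the calibration value is a per-iteration cost (larger means slower CPU), that the fastest CPU gets the usual capacity scale of 1024, and that all calibration values are non-zero. The function name `capacities_from_calibration` is hypothetical and not part of rt-app; how the resulting values are applied to the kernel is out of scope here.

```c
#include <stddef.h>
#include <stdint.h>

/* Capacity assigned to the fastest CPU (assumed scale of 1024). */
#define CAPACITY_SCALE 1024u

/*
 * Hypothetical helper: fill capacity[i] so that it is inversely
 * proportional to calib[i], with the fastest CPU (smallest calibration
 * value) mapped to CAPACITY_SCALE. Assumes every calib[i] is non-zero.
 */
void capacities_from_calibration(const uint32_t *calib, uint32_t *capacity,
                                 size_t n)
{
    if (n == 0)
        return;

    /* The smallest per-iteration cost identifies the fastest CPU. */
    uint32_t fastest = calib[0];
    for (size_t i = 1; i < n; i++) {
        if (calib[i] < fastest)
            fastest = calib[i];
    }

    for (size_t i = 0; i < n; i++) {
        /* Rounded integer division, done in 64 bits to avoid overflow. */
        uint64_t scaled = (uint64_t)CAPACITY_SCALE * fastest + calib[i] / 2;
        capacity[i] = (uint32_t)(scaled / calib[i]);
    }
}
```

With calibration values of 100 on a big CPU and 250 on a little CPU, this gives capacities of 1024 and 410 respectively.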

Fixes #89

Since the function is pure, the compiler is free to do anything it wants to that function, including removing all its call sites. To avoid any such issues:

* disable optimizations for that function
* forbid inlining
* add some no-op statement that is guaranteed to be treated as a side-effect by the compiler.

Signed-off-by: Douglas RAILLARD <[email protected]>
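The three measures above can be sketched as follows. This is not the code from the commit, only a minimal illustration assuming GCC or Clang; the names `busy_loop` and `BUSY_LOOP_ATTR` are made up for the example.

```c
#include <stdint.h>

/*
 * Disable optimizations and forbid inlining for one function.
 * GCC spells the first one optimize("O0"), Clang spells it optnone
 * (which must be combined with noinline).
 */
#if defined(__clang__)
#define BUSY_LOOP_ATTR __attribute__((noinline, optnone))
#else
#define BUSY_LOOP_ATTR __attribute__((noinline, optimize("O0")))
#endif

/* Hypothetical busy loop: spins for the requested number of iterations. */
BUSY_LOOP_ATTR
uint64_t busy_loop(uint64_t iterations)
{
    uint64_t done = 0;

    for (uint64_t i = 0; i < iterations; i++) {
        /*
         * Empty volatile asm with a "memory" clobber: it emits no
         * instruction, but the compiler must treat it as a side effect,
         * so the loop cannot be removed or collapsed.
         */
        __asm__ volatile("" ::: "memory");
        done++;
    }

    return done;
}
```

Each measure covers a different failure mode: the attribute keeps the body as written, `noinline` keeps a real call at every call site, and the asm statement keeps the loop alive even if the attributes are ignored by a given compiler.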

The previous implementation based on ldexp() was not always giving consistent results from one run to another. Using more basic operations without extra branches makes the execution time of the body much more predictable, leading to more stable calibration values and precise reproduction of duty cycles.

Signed-off-by: Douglas RAILLARD <[email protected]>
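For illustration, a loop body built only from basic operations could look like the sketch below. The actual body used in the commit is not shown here, so the arithmetic (a linear congruential step) and the name `waste_cycles` are assumptions chosen for the example. The point is only that the body contains no call into libm or any other function.

```c
#include <stdint.h>

/*
 * Hypothetical busy loop body: one multiply and one add per iteration
 * on a volatile accumulator, with no function calls. The volatile
 * qualifier forces a load and a store on every iteration.
 */
uint32_t waste_cycles(uint64_t iterations)
{
    volatile uint32_t acc = 1;

    for (uint64_t i = 0; i < iterations; i++) {
        /* Linear congruential step; unsigned overflow wraps by design. */
        acc = acc * 1664525u + 1013904223u;
    }

    return acc;
}
```

Because the only branch is the loop condition itself, the cost per iteration depends far less on library implementation details than a call such as ldexp() would.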

douglas-raillard-arm changed the title ~~[RFC] Improve busy loop stability~~ Improve busy loop stability Oct 9, 2019

douglas-raillard-arm mentioned this pull request Nov 4, 2019

rtapp execution gives unreliable actual duty cycle #89

Open

douglas-raillard-arm added 2 commits March 12, 2021 18:16

douglas-raillard-arm force-pushed the fix_busy_loop branch from ca88d46 to 884775a Compare March 12, 2021 18:16

credp requested review from jlelli and vingu-linaro October 14, 2021 10:57
