
nzxt-kraken3: PWM setting fixes; status reporting optimization #34

Merged · 5 commits into liquidctl:master from kraken3-misc-fixes · Jan 10, 2023

Conversation

aleksamagicka (Member)

This should resolve the issues noted in #7 and optimize Z series status reporting.

  • Values that convert to less than PUMP_DUTY_MIN are now silently clamped to it instead of returning an error code
  • When setting the PWM duty directly, the same value that was set is reported until the next sensor report interrupt comes through, so that userspace tools are happy (or at least happier)
  • priv->duty_input is removed in favor of recording the value per channel in kraken3_channel_info
  • Z series devices don't send sensor reports on their own; instead of requesting one every time something is read, one is requested only after the validity interval has passed (see the sketch below)
  • Reformatted the code

Seems to work when using fan2go; here's my config: fan2go.txt. I had to set maxRpmDiffForSettledFan: 20 because the X53 varied between 14 and 20 as far as I've seen, and the default value is 10. I haven't yet tested with pwmconfig and fancontrol.
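A minimal sketch of the write and read paths described above, in kernel C. The helper names (kraken3_send_duty, kraken3_request_status_report), the is_z_series and updated fields, and the STATUS_VALIDITY constant are illustrative assumptions, not necessarily the driver's exact API:

```c
#include <linux/jiffies.h>
#include <linux/math.h>

/* Sketch only: clamp low duties and cache what was set (names hypothetical). */
static int kraken3_write_pwm(struct kraken3_data *priv, int channel, long val)
{
	u8 duty_percent = DIV_ROUND_CLOSEST(val * 100, 255);
	int ret;

	/* Silently clamp values below the device minimum instead of erroring. */
	if (duty_percent < PUMP_DUTY_MIN)
		duty_percent = PUMP_DUTY_MIN;

	ret = kraken3_send_duty(priv, channel, duty_percent); /* hypothetical helper */
	if (ret < 0)
		return ret;

	/* Report the value that was just set until the next sensor report. */
	priv->channel_info[channel].reported_duty = val;
	return 0;
}

static int kraken3_read_pwm(struct kraken3_data *priv, int channel, long *val)
{
	/* Z series: only request a fresh report once the cached one has expired. */
	if (priv->is_z_series &&
	    time_after(jiffies, priv->updated + msecs_to_jiffies(STATUS_VALIDITY)))
		kraken3_request_status_report(priv); /* hypothetical helper */

	*val = priv->channel_info[channel].reported_duty;
	return 0;
}
```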

stalkerg (Contributor) commented Jan 9, 2023

Thanks a lot! I will check today or tomorrow!

nzxt-kraken3.c Outdated
*/
spin_lock_bh(&pwm_write_lock);
priv->channel_info[channel].reported_duty = val;
spin_unlock_bh(&pwm_write_lock);
amezin (Member)

I can't find any other place where this spin lock is used.
Isn't the assignment atomic here?

aleksamagicka (Member Author)

Hm, might be. I'll investigate later; this was a first shot at resolving it.

stalkerg (Contributor)

Yes, reported_duty is a u16; it should be trivial to make it atomic.

jonasmalacofilho (Member) commented Jan 9, 2023

I also don't understand this spinlock: unless it's also taken when reading, it doesn't do much.


@amezin,

Regarding 16-bit stores being atomic: they aren't, at least not in the context of either the C or the Linux memory models (well, the one documented in tools/memory-model/Documentation/). By those definitions, either store to reported_duty can cause a data race with the other store or with the load in kraken3_read.

(For completeness: in the context of the CPU, 16-bit stores might be compiled into a single/simple store, in which case they would be atomic on x86. Same thing for the load, and looking at the surrounding code, this is probably the case here. And I suppose there is almost no one connecting these coolers to an ARM machine, even if just for external monitoring of an x86 system.)

But it seems that, at least in the HWMON subsystem (and, to be fair, other parts of the kernel), plain accesses can be used even if they theoretically can cause data races, as long as the compiler (currently) produces sensible code (reminder: a data race is UB), and that the impact of a perhaps stale or garbage value isn't relevant or can't be observed in practice.

I say this based on a discussion I had with Guenter on the mailing list. There's a bit of noise due to some communication issues, but you can find the entire thread here.

TL;DR: @aleksamagicka, removing the spinlock and just keeping the plain store and load is probably fine (like we already do for the rest of the structure). But please test it well.

(Foreshadowing: the "this is fine" meme.)


@stalkerg,

Yes, making the store and load atomic, or even just "marked" (i.e. using the kernel's READ_ONCE/WRITE_ONCE macros, which essentially do volatile accesses), would also be possible.

Still, I think the (usually) cheaper plain accesses are preferred if no issues can be observed when using them.
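For reference, the "marked" alternative mentioned above would look roughly like this (a sketch only; the reported_duty field is taken from the diff hunk, the rest of the surrounding code is assumed):

```c
#include <linux/compiler.h>	/* READ_ONCE()/WRITE_ONCE() */

/* Write side, e.g. when setting the PWM duty: */
WRITE_ONCE(priv->channel_info[channel].reported_duty, val);

/* Read side, e.g. in kraken3_read: */
*val = READ_ONCE(priv->channel_info[channel].reported_duty);
```

These compile to volatile accesses, so they prevent the compiler from tearing, fusing, or reordering the access, without emitting any barriers or locked instructions.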

aleksamagicka (Member Author) commented Jan 9, 2023

The spinlock was me copying the nzxt-smart code; thinking about it, it probably doesn't do anything here. I'll test (although I'm no expert on kernel concurrency).

jonasmalacofilho (Member)

> But, judging from Guenter's emails (thanks for the link!), it's not terribly important for this to be synchronized because it's for display/info only. (Do correct me if I got that wrong.)

That's more or less what I understood from his comments too.

> The spinlock might not have done much locking, but as the function names suggest it does disable bottom half routines - it remains to be seen/tested if that applies to kraken3_raw_event() as the actual processing function for interrupts.

In the past, I remember looking into which context the raw event handler runs in. Assuming this old comment of mine is correct, the handler runs in hard IRQ context. And IIRC, that means that spin_lock_irq[save] would be necessary to disable it.
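If the handler really does run in hard IRQ context and the lock were kept, the IRQ-safe variant would look roughly like this (a sketch; to be useful, the same lock would also have to be taken in kraken3_raw_event() and around the load in kraken3_read):

```c
#include <linux/spinlock.h>

unsigned long flags;

/* Disables local interrupts, so the raw event handler can't run concurrently
 * on this CPU while the value is being updated. */
spin_lock_irqsave(&pwm_write_lock, flags);
priv->channel_info[channel].reported_duty = val;
spin_unlock_irqrestore(&pwm_write_lock, flags);
```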

stalkerg (Contributor)

@jonasmalacofilho I know, I've worked with atomics a lot, especially on PowerPC. Just in case: a spinlock internally works based on an atomic operation (CAS or TAS), which means that if atomics can be used directly, they will never be more costly than a spinlock. But as Guenter mentions, in such a case we shouldn't worry about guarding reported_duty at all.
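For illustration, using the kernel's atomic API directly would look roughly like this (a sketch; it assumes changing reported_duty from u16 to atomic_t):

```c
#include <linux/atomic.h>

/* struct kraken3_channel_info { ...; atomic_t reported_duty; }; (assumed) */

/* Writer: */
atomic_set(&priv->channel_info[channel].reported_duty, val);

/* Reader: */
*val = atomic_read(&priv->channel_info[channel].reported_duty);
```

On most architectures atomic_set()/atomic_read() also compile to plain word-sized accesses, so this mostly buys documentation and type safety rather than extra synchronization.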

amezin (Member)

@jonasmalacofilho OK, I should've said "effectively atomic", i.e. another thread will read either the old value or the new one, not garbage. Even on ARM. Not in generic C, but in the Linux kernel with supported compilers and flags. Or is this incorrect too?

jonasmalacofilho (Member) commented Jan 10, 2023

@amezin,

> Even on ARM.

You're right, thanks for the correction. (I had previously implied that a simple 16-bit load/store would only be atomic, at the CPU level, on x86).

> Not in generic C, but in the Linux kernel with supported compilers and flags. Or is this incorrect too?

According to the LKMM [1]:

> if it [the program] does [contain race candidates which may execute concurrently] then the LKMM says there is a potential data race and makes no predictions about the program's outcome.

Regarding compiler flags used by the kernel to make compilers work more conservatively, I'm only aware of a few of them, related to other types of optimizations. So I can't answer that part.


By the way, there's an interesting (and reasonably recent) article on LWN about this: Who's afraid of a big bad optimizing compiler?

P.S. I have the impression that there are two groups within the kernel: some tend to approach these problems from the point of view of what the compiler is allowed to do; some from the point of view of whether stricter accesses or synchronization can be shown to be necessary.

P.P.S. Personally, I don't find performance arguments against READ_ONCE/WRITE_ONCE compelling when they are a viable way to prevent a clear potential data race: if those marked accesses prevent optimizations, that usually suggests they are indeed necessary; and if they don't, they are free now, and a safeguard against future compiler versions producing different and potentially broken code.

Footnotes

[1] https://github.com/torvalds/linux/blob/1fe4fd6f5cad/tools/memory-model/Documentation/explanation.txt#L2065-L2068

aleksamagicka (Member Author)

@jonasmalacofilho Thanks, if it runs in hard IRQ context then those bottom-half spinlock calls really do nothing.

stalkerg (Contributor) commented Jan 9, 2023

It seems to be working as expected; I also have no significant issues with fan2go. Thanks again!


amezin (Member) commented Jan 10, 2023

@aleksamagicka @stalkerg By the way, have you tested the driver with lm-sensors (i.e. pwmconfig + fancontrol)? These utilities sometimes fail in surprising ways. As far as I remember, they treat almost any sysfs read error as "oh, the device is completely broken, let's die".

stalkerg (Contributor) commented Jan 10, 2023

@amezin I didn't test it with fancontrol, but fan2go has exactly the same behavior for read errors (and it's OK).

jonasmalacofilho merged commit 4e91a7a into liquidctl:master on Jan 10, 2023
aleksamagicka deleted the kraken3-misc-fixes branch on January 10, 2023 at 07:18