zed locks up machine periodically #16982

Open
SimonGreenaway opened this issue Jan 23, 2025 · 4 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@SimonGreenaway

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Fedora Core |
| Distribution Version | 41 |
| Kernel Version | 6.12.9-200 |
| Architecture | x86 |
| OpenZFS Version | 2.3.0-1 |

Describe the problem you're observing

For the last 3 months, the machine has locked up completely every day or so. There is nothing obvious in the kernel logs, but I just caught 'zed' locking up the machine. The load average rises by about 1 per second and all cores go to 100% until the machine locks. Memory use increased only slightly, with lots free. I saw 10-12 z_* threads running just before zed went to 100%.

Describe how to reproduce the problem

The problem occurs at random during normal system use, with nothing too disk- or CPU-heavy running (e.g. web browsing).

Include any warning/errors/back-traces from the system logs

Nothing found in the logs; I only caught the issue because 'top' was open.
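Since the lockup leaves nothing in the logs, one way to capture what 'top' showed is to record it continuously. The following is a minimal sketch, not a tested diagnostic: it assumes a Linux /proc layout, and the log path, the one-second interval and all function names are hypothetical. It writes the load average and any running (R) or uninterruptible (D) zed or z_* tasks to a file, syncing after every sample so the last lines have a chance of surviving a hard lockup.

```python
import os
import time


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_comm(stat_line):
    """Extract the command name from a /proc/<pid>/stat line.

    The name sits between the first '(' and the last ')' and may itself
    contain spaces or parentheses.
    """
    return stat_line[stat_line.index("(") + 1:stat_line.rindex(")")]


def parse_state(stat_line):
    """Extract the one-letter task state (R, S, D, ...) from a stat line."""
    return stat_line[stat_line.rindex(")") + 2]


def snapshot(proc_root="/proc"):
    """Collect load averages plus zed and z_* tasks in state R or D."""
    with open(os.path.join(proc_root, "loadavg")) as f:
        load = parse_loadavg(f.read())
    busy = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                line = f.read()
        except OSError:
            continue  # the task exited while we were scanning
        name, state = parse_comm(line), parse_state(line)
        if (name == "zed" or name.startswith("z_")) and state in ("R", "D"):
            busy.append(f"{name}[{entry}]:{state}")
    return load, busy


def format_record(now, load, busy):
    """Format one log line: timestamp, 1-minute load, busy ZFS tasks."""
    return f"{now:.0f} load={load[0]:.2f} busy={','.join(busy) or '-'}\n"


def run(log_path="/var/tmp/zed-watch.log", interval=1.0):
    """Append one record per interval, forcing each one to disk."""
    with open(log_path, "a") as log:
        while True:
            load, busy = snapshot()
            log.write(format_record(time.time(), load, busy))
            log.flush()
            os.fsync(log.fileno())  # push the record out before a possible hang
            time.sleep(interval)
```

Calling `run()` starts the logger. If the log file lives on a ZFS dataset, the `fsync` call may itself block once the hang begins, so a path on a non-ZFS filesystem is probably the safer choice.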

@SimonGreenaway SimonGreenaway added the Type: Defect Incorrect behavior (e.g. crash, hang) label Jan 23, 2025
@SimonGreenaway
Author

SimonGreenaway commented Jan 23, 2025

I've modified '/usr/lib/systemd/system/zfs-zed.service' to run zed with -v (verbose output) and with -s, so that zed state is saved to a file.

Will post output when problem re-occurs.

@justinpryzby

Have you been running the 2.3 branch or Git HEAD for the last 3 months?
Did the problem start at a certain update? From 2.2, or something else?

@SimonGreenaway
Author

I remember it started when kernel 6.10 came out (July '24), which is why I assumed it was a kernel issue. 6.10 had the issue, but 6.9 didn't. Annoyingly, I upgraded to FC41 and no longer had 6.9 available, and the problem came back.

I upgrade ZFS as it becomes available in Fedora updates (which is usually a couple of days after release), so I guess I was running 2.2.4 when the problem first appeared.

I've been chasing it as a kernel issue, but now that I've caught zed, maybe it's a kernel-ZFS interaction issue?

@SimonGreenaway
Author

[Two images attached]
