Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for late resume #139

Open
wants to merge 6 commits into
base: dev
Choose a base branch
from
Open

Support for late resume #139

wants to merge 6 commits into from

Conversation

truffle0
Copy link
Contributor

Adds a new late_resume parameter, which moves the trigger point from init_early to init_premount.
This allows cryptsetup and LVM to initialize before the system attempts to resume, making encrypted or otherwise non-trival-to-access swap partitions accessible.
In addition it will use allow using resume=UUID=* to specify the resume device, as any partitions that aren't directly on disk don't have a PARTUUID.

Which this option is safe in theory and I have been using it without issue for a few months, it is important to note that configurations that cause writes to the device before triggering resume run the risk of system instability and data loss.
I have purposefully used init_premount section to trigger before filesystems are mounted as the kernel shouldn't make any metadata changes (save for possibly LVM access times) before mounting filesystems.

I have also made sure to include warnings in the configuration docs and the output of ugrd when this option is enabled to make sure users understand the implications (including a link to the kernel docs that detail the risks).
I'm not sure whether the warning is overkill, as I believe this option is safe to use under normal encrypted swap and lvm setups, but to cater to potentially non-standard setups I think it's important to have.

controlled by the 'late_resume' parameter, will move the trigger point
for resume to after devices have been decrypted or discovered
allows resume from encrypted swap or devices requiring extra handling be visible to the kernel
@truffle0 truffle0 force-pushed the dev branch 2 times, most recently from 872fe17 to 6067efc Compare December 11, 2024 08:50
currently it's more of a rearrangment of the original code, with some
added safety/integrity checks
`resume` is now a separate function, the plan is to have the code for
actually resuming within this function and code for locating the device
outside.
in the more general case, if `late_resume` is enabled, the initram will
check at both the `init_early` and again at the `init_premount` stage
for the presence of the resume device. Once host-only is implemented
then it'll be able to determine at build time which state it'll need to
check at.
For safety checks, currently it checks whether the device is a
valid block device, whether resume has already been attempted previous
(based on the content of `/sys/power/resume`) & whether there are are
any other mounted block devices.
@desultory desultory force-pushed the dev branch 2 times, most recently from b86a171 to 80f9d3c Compare December 22, 2024 22:07
@desultory desultory force-pushed the dev branch 6 times, most recently from 4a904da to a1cf835 Compare December 31, 2024 03:25
@spflaumer
Copy link

spflaumer commented Jan 2, 2025

hello everyone,
i'm having an issue using these proposed changes, on my current setup, since i'm storing a detached LUKS header and a GPG encrypted key on a detachable drive, which would need to be mounted to unlock the LUKS partition where my swapfile containing the resume information is contained to be able to resume. however, that mounted partition then would only ever get unmounted after handle_late_resume ran, which causes the entire boot process to fail.

as such, i would like to propose to move the umount_fstab function from init_mount_late to init_late, which fits right in between the crypt_init function within init_main and handle_late_resume within init_premount.
this is the patch (works flawlessly with my config, which is basically a combination of the detached_header.toml and gpg_keyfile.toml examples):

--- a/src/ugrd/fs/mounts.toml
+++ b/src/ugrd/fs/mounts.toml
@@ -41,11 +41,14 @@
 [imports.init_main]
 "ugrd.fs.mounts" = [ "mount_fstab" ]

+[imports.init_late]
+"ugrd.fs.mounts" = [ "umount_fstab" ]
+
 [imports.functions]
 "ugrd.fs.mounts" = [ "mount_root" ]

 [imports.init_mount_late]
-"ugrd.fs.mounts" = [ "mount_late", "umount_fstab" ]
+"ugrd.fs.mounts" = [ "mount_late" ]

 [custom_parameters]
 mounts = "dict"  # Add the mounts property, used to define the mounts to be made in the fstab

or, perhaps crypt_init, mount_base and mount_fstab could be moved into init_pre?
if that's not possible, maybe an option could be added to try and automatically unmount the partitions with a warning message instead of outright failing?
as a side note: maybe i'm misunderstanding how the mechanism behind hooks and imports works exactly, but i feel like the addition of dependencies or requirements between individual functions, akin to how it's done by openrc services, would be an extremely powerful way to order the functions with a potential reduction in the amount of stages necessary (e.g.: functions within init_main that require the mount_fstab function could do an after = [ "mount_fstab" ], eliminating the usage of init_pre, but keeping the init script the same. similarly, handle_late_resume could do an after = [ "umount_fstab" ], as well as before = [ "mount_cmdline_root" ] or even before_hook = ["init_mount"])

@desultory
Copy link
Owner

hello everyone, i'm having an issue using these proposed changes, on my current setup, since i'm storing a detached LUKS header and a GPG encrypted key on a detachable drive, which would need to be mounted to unlock the LUKS partition where my swapfile containing the resume information is contained to be able to resume. however, that mounted partition then would only ever get unmounted after handle_late_resume ran, which causes the entire boot process to fail.

as such, i would like to propose to move the umount_fstab function from init_mount_late to init_late, which fits right in between the crypt_init function within init_main and handle_late_resume within init_premount. this is the patch (works flawlessly with my config, which is basically a combination of the detached_header.toml and gpg_keyfile.toml examples):

This situation is a bit tricky, I need to look into it a bit more, but I'm not 100% certain it's safe to mount storage like that. I may be overthinking it, but mounting or using any storage which will be resumed to could potentially cause issues.

@desultory
Copy link
Owner

desultory commented Jan 2, 2025

or, perhaps crypt_init, mount_base and mount_fstab could be moved into init_pre? if that's not possible, maybe an option could be added to try and automatically unmount the partitions with a warning message instead of outright failing? as a side note: maybe i'm misunderstanding how the mechanism behind hooks and imports works exactly, but i feel like the addition of dependencies or requirements between individual functions, akin to how it's done by openrc services, would be an extremely powerful way to order the functions with a potential reduction in the amount of stages necessary (e.g.: functions within init_main that require the mount_fstab function could do an after = [ "mount_fstab" ], eliminating the usage of init_pre, but keeping the init script the same. similarly, handle_late_resume could do an after = [ "umount_fstab" ], as well as before = [ "mount_cmdline_root" ] or even before_hook = ["init_mount"])

I've considered this, and am considering a "v2" (likely meriting a project major version bump) module format that supports this, possibly with other changes.

Right now this can sorta be hacked in by letting a module directly modify lists in a runlevel. If it knows it needs to run before something which is already in the list, it can insert itself before that.

Currently, I've tried to split init levels up enough that these issues are simply avoided by using an appropriate runlevel.

For a v2 module type, im somewhat considering forcing the module info to be defined in a python class, instead of toml. I like that the modules use the exact same parser as the user config, but this can make module loading/usage a bit more complicated.

This would help some because users could easily set config as properties in classes so it can dynamically resolve the value of things, for example.

This could also theoretically be implemented with some config like "before" or "after" which can be processed somewhere like here, to reorder stuff within a runlevel before actually making the init file: https://github.com/desultory/ugrd/blob/main/src/ugrd/initramfs_generator.py#L141
Doing this, I think the main consideration would be anything directly running a func, without being part of hook-runner. (this could also be an issue with masks). I would consider this an unsupported method though.

It's also a bit tricky because this type of info would be best parsed right before it's used. It's a minor thing but I like the config saying what it will do as soon as possible. If you set something to run "before" something else, and that other thing wasn't processed yet, what happens if it just adds itself but later the thing it says it should be "before" wants to run before something even earlier? I don't think it can be as simple as having things check they are at the right place as they are added.

@truffle0
Copy link
Contributor Author

truffle0 commented Jan 5, 2025

hello everyone, i'm having an issue using these proposed changes, on my current setup, since i'm storing a detached LUKS header and a GPG encrypted key on a detachable drive, which would need to be mounted to unlock the LUKS partition where my swapfile containing the resume information is contained to be able to resume. however, that mounted partition then would only ever get unmounted after handle_late_resume ran, which causes the entire boot process to fail.

as such, i would like to propose to move the umount_fstab function from init_mount_late to init_late, which fits right in between the crypt_init function within init_main and handle_late_resume within init_premount. this is the patch (works flawlessly with my config, which is basically a combination of the detached_header.toml and gpg_keyfile.toml examples):

I haven't experimented much with this configuration to be honest, aside from adding in a safety check for mounted block devices at the time of resume (which doesn't exactly solve the issue).

In theory it is perfectly safety to mount storage as long as you can ensure there is absolutely no writes (or it's just not currently mounted in the hibernated kernel, but that can't be check and isn't safe to assume).

The issue I've run into previously is that there isn't a universally safe way to ensure mounts are read-only. Mounting with mount -o ro should work fine if the particular filesystem driver respects it, however the kernel docs don't commit to this as there are exceptions where some still write metadata (would be nice and simple if it wasn't allowed).

Another mechanism is using the block layer, using blockdev --setro, buuuut this has a similar disclaimer. While it's more assurance than purely using mount -o ro and will propagate through most device mappers, dm_crypt and other layers on top of the raw block device, there are a few layers that just don't respect it.
LVM is one of these annoying. Lets say you start with a read-only block device, add an LVM layer on top, then mount a filesystem from one of the logical volumes, you will actually be able to write to that filesystem as it pretty much just ignores the read-only property once it hits LVM.

Currently the safety check I added is pretty much fail if it sees any physical devices being used. There is a way (using blkid) of tracing which devices are set to read-only though. I'll try to get a check working that allows actually read-only devices to pass the test (and un-mount before resume just to be extra cautious). Might have to add in extra hooks to run blockdev --setro on entry and blockdev --setrw on exit (if the system doesn't resume).

@desultory desultory force-pushed the dev branch 10 times, most recently from 0229cd6 to 5127ef8 Compare January 12, 2025 23:41
@desultory desultory force-pushed the dev branch 2 times, most recently from f9e6c74 to 0de17f7 Compare January 12, 2025 23:50
@desultory desultory force-pushed the dev branch 11 times, most recently from 727803b to acfc085 Compare January 21, 2025 03:32
@desultory desultory force-pushed the dev branch 11 times, most recently from 38cdecd to d1e0e0d Compare January 26, 2025 00:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants