
Workaround for Failed to open system bus: No such file or directory #14

Closed
UweSauter opened this issue Jan 2, 2024 · 6 comments

@UweSauter

Again for archzfs/archzfs/issues/521.

As described in this Arch Linux bug tracker thread, one workaround is to bind-mount /run/dbus/system_bus_socket into the container.
To do that, the worker definition inside docker-compose.yml needs to be extended with:

        volumes:
            - /run/dbus/system_bus_socket:/run/dbus/system_bus_socket
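For context, a sketch of where that key sits in docker-compose.yml (the service name `worker` comes from this setup; the other keys are placeholders, not this repo's actual definition):

```yaml
services:
    worker:
        # image and other keys elided; only the bind mount matters here
        volumes:
            # expose the host's D-Bus system bus socket inside the container
            - /run/dbus/system_bus_socket:/run/dbus/system_bus_socket
```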

This gets rid of the Failed to open system bus: No such file or directory message, but brings up a new one:

cd "/worker/all/build/packages/_utils/zfs-utils" && ccm64 s 
Output: 
----> Attempting to build package...
==> Synchronizing chroot copy [/scratch/.buildroot/root] -> [buildbot]...done
Failed to create /../../devtools.slice/devtools-buildbot.slice/arch-nspawn-2501.scope/payload subcgroup: Not a directory
==> Making package: zfs-utils 2.2.2-1 (Tue Jan  2 18:11:41 2024)
==> Retrieving sources...
  -> Downloading zfs-2.2.2.tar.gz...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0 32.2M    0  2747    0     0   3400      0  2:45:46 --:--:--  2:45:46  3400
100 32.2M  100 32.2M    0     0  22.7M      0  0:00:01  0:00:01 --:--:-- 52.6M
  -> Found zfs-utils.initcpio.install
  -> Found zfs-utils.initcpio.hook
  -> Found zfs-utils.initcpio.zfsencryptssh.install
==> Validating source files with sha256sums...
    zfs-2.2.2.tar.gz ... Passed
    zfs-utils.initcpio.install ... Passed
    zfs-utils.initcpio.hook ... Passed
    zfs-utils.initcpio.zfsencryptssh.install ... Passed
Failed to create /../../devtools.slice/devtools-buildbot.slice/arch-nspawn-3334.scope/payload subcgroup: Not a directory
==> ERROR: Build failed, check /scratch/.buildroot/buildbot/build
Command returned: 1

Searching the net turns up several hits of varying age:

FFY00/build-arch-package#8
systemd/systemd#14247
moby/moby#44402
https://serverfault.com/questions/1053187/systemd-fails-to-run-in-a-docker-container-when-using-cgroupv2-cgroupns-priva

If I configure Docker to use user namespaces and remapping, yet another error occurs…

I'll try to continue tomorrow…

@minextu
Member

minextu commented Jan 2, 2024

Thank you! That is already further than I got when I last checked.

@techmunk
Contributor

techmunk commented Jan 7, 2024

This is my current understanding of how this all works (or does not work). I might be incorrect on some points, but I do have a solution at the end.

Bind-mounting /run/dbus/system_bus_socket:/run/dbus/system_bus_socket will likely never work, as this is the host's D-Bus socket. The host's systemd creates devtools.slice under the host's /sys/fs/cgroup, but inside the container that slice does not exist.

There are a few options here.

  1. Mount the host's cgroup hierarchy into the guest. This is less than ideal and not a path I'd recommend; it feels insecure, as the container could mess up the host.
  2. Build a container that boots systemd (as in the whole init). This can work, but seems a bit heavy.
  3. Force systemd-nspawn to work in the same "slice" we're already running in. This is the best option, as we're already running in a private cgroup (both Docker and Podman do this by default, I believe).

If we create a wrapper systemd-nspawn script with the contents below and put it in the PATH before the default one, everything just "works".

#!/bin/bash

# --keep-unit makes nspawn register the container with the unit it was
# invoked in, instead of creating a new transient scope of its own.
exec /usr/bin/systemd-nspawn --keep-unit "$@"
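To make "put it in the PATH before the default one" concrete, here is a runnable sketch. The use of mktemp is an assumption so the sketch runs anywhere; in a real container image you would more likely install the wrapper to /usr/local/bin, which precedes /usr/bin in the default PATH.

```shell
#!/bin/bash
set -eu

# Create a directory that will shadow /usr/bin via PATH ordering.
wrapdir=$(mktemp -d)

# Write the wrapper: --keep-unit keeps the nspawn payload in the unit we
# were started in, instead of creating a new transient scope of its own.
cat > "$wrapdir/systemd-nspawn" <<'EOF'
#!/bin/bash
exec /usr/bin/systemd-nspawn --keep-unit "$@"
EOF
chmod +x "$wrapdir/systemd-nspawn"

# Prepend the directory so command lookup finds the wrapper first.
export PATH="$wrapdir:$PATH"

command -v systemd-nspawn   # now resolves to the wrapper, not /usr/bin
```

Anything that shells out to `systemd-nspawn` by name (such as the devtools build scripts) will then pick up the wrapper transparently.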

I've done my testing in podman, but this should all equally apply to docker. My setup can be seen at https://gist.github.com/techmunk/26e75c44745baf343b6c1d5b8e3c1576

start.sh kicks it all off; systemd-nspawn and build are in a directory called scripts next to start.sh.

@techmunk
Contributor

techmunk commented Jan 7, 2024

This devtools issue from 2017 is relevant. Took me a while to find the issue again. https://bugs.archlinux.org/task/55082

@UweSauter
Author

> This devtools issue from 2017 is relevant. Took me a while to find the issue again. https://bugs.archlinux.org/task/55082

This issue is what led me to bind-mount /run/dbus/system_bus_socket into the container.

I think your second point (booting the container with systemd instead of just running a process inside) is the cleanest approach. I'm still trying to figure out the Buildbot configuration in #15, but my setup with three "booted" systemd-nspawn Arch Linux containers shows no issues regarding cgroups.

Regarding your point that this approach is on the heavy side: I'm not entirely sure what you mean by that, but then again, using Docker is heavy as well compared to just building systemd-nspawn containers.

(And I get a PostgreSQL 16 container instead of the old PostgreSQL 9.6 Debian container that is currently used.)

@techmunk
Contributor

techmunk commented Jan 8, 2024

>> This devtools issue from 2017 is relevant. Took me a while to find the issue again. https://bugs.archlinux.org/task/55082

> This issue is what led me to bind-mount /run/dbus/system_bus_socket into the container.

The OP there still had issues when using the host-bus bind mount and suggested the fix might be as simple as adding the --keep-unit argument, which does in fact seem to work. I'll accept that that issue was fixing a different error, which does now seem to be resolved.

I feel systemd is heavy for a container: in general, a full init system is not really designed to run inside a container, and if there's a clean way to get things working without it, I think that would be preferable from both a maintenance and a resource-usage perspective. My opinion, of course.

In playing around with this repo, I was unable to get it to work correctly. When the worker started, my host would be hosed (because of systemd being run inside the container), and I'd have to reboot to regain control (there might be another way; I did not look into it). I suspect this is because of how a Docker container shares certain cgroup/bus resources with the host. Either way, I tried tricks I had used in the past, such as setting the $container environment variable to something like docker, but I could not get it to work. I suspect I'd have to edit the build.sh script in the main repo.

I've made a pull request at #16 which at least on my system runs a build to completion.

If running a full systemd init is desired, then a different approach would have to be taken.

@UweSauter
Author

As the build environment is working again, I'll close this one.
