Argus libraries have many incompatibilities with nixos-unstable #243

Open
hacker1024 opened this issue Sep 4, 2024 · 1 comment

@hacker1024
Contributor

hacker1024 commented Sep 4, 2024

I am trying to use GStreamer to access an IMX477 camera over CSI, but this is proving difficult due to several library incompatibilities.

For reference, my relevant NixOS configuration:

{ config, pkgs, ... }:
{
  # Installs multiple important L4T drivers.
  hardware.graphics.enable = true;

  # Configure device trees for an Orin Nano on a developer kit with dual IMX477s.
  # Real hardware is probably not required to replicate this issue, though.
  hardware.deviceTree = {
    enable = true;
    name = "tegra234-p3767-0003-p3768-0000-a0.dtb";
    dtboBuildExtraIncludePaths = [ (config.hardware.deviceTree.kernelPackage.src + "/nvidia/soc/tegra/kernel-include") ];
    overlays = [
      rec {
        name = "tegra234-p3767-camera-p3768-imx477-dual";
        dtsFile = pkgs.substitute {
          src = config.hardware.deviceTree.kernelPackage.src + "/nvidia/platform/t23x/p3768/kernel-dts/${name}.dts";
          substitutions = [ "--replace-warn" "JETSON_COMPATIBLE_P3768" ''"nvidia,p3768-0000+p3767-0003"'' ]; # https://github.com/NixOS/nixpkgs/issues/339205
        };
      }
    ];
  };

  # Enable the Argus daemon
  services.nvargus-daemon.enable = true;
}

Issue 1: GStreamer

It appears that the plugins in the l4t-gstreamer package are built against older versions of GStreamer and GLib. GStreamer maintains backwards ABI compatibility, but GLib apparently does not: loading the plugins with GStreamer and GLib from the latest nixos-unstable results in symbol errors, seemingly caused by something that uses a fairly new API.

To solve this, it may be worth building the Argus GStreamer plugins from source; the code is provided by NVIDIA in the BSP sources package.

In the meantime, a shell with an older Nixpkgs revision can be used. Note that l4t-gstreamer too will need to be built against this older revision.

$ nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/4dd072b68c5c146981b61634b58aa13a8f2d7ba2.tar.gz -p gst_all_1.{gstreamer,gst-plugins-{base,good,bad,ugly}}
$ GST_PLUGIN_SYSTEM_PATH_1_0="$GST_PLUGIN_SYSTEM_PATH_1_0:/path/to/l4t-gstreamer/lib/gstreamer-1.0" gst-launch-1.0 nvarguscamerasrc ! fakesink
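
When GST_PLUGIN_SYSTEM_PATH_1_0 is unset, the command above produces a value with an empty leading element. A minimal sketch of a helper that avoids this is shown below; the function name append_search_path is made up for illustration, and /path/to/l4t-gstreamer is the same placeholder as above.

```shell
# Append a directory to a colon-separated search path without leaving an
# empty leading element when the existing value is unset or empty.
# append_search_path is a hypothetical helper, not part of GStreamer or Nix.
append_search_path() {
  existing=$1
  dir=$2
  if [ -n "$existing" ]; then
    printf '%s:%s\n' "$existing" "$dir"
  else
    printf '%s\n' "$dir"
  fi
}

# Placeholder path, as in the command above.
GST_PLUGIN_SYSTEM_PATH_1_0=$(append_search_path \
  "${GST_PLUGIN_SYSTEM_PATH_1_0:-}" \
  "/path/to/l4t-gstreamer/lib/gstreamer-1.0")
export GST_PLUGIN_SYSTEM_PATH_1_0
```

After this, gst-launch-1.0 nvarguscamerasrc ! fakesink can be run from the same shell as before.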

Issue 2: nvargus-daemon

Once GStreamer is running, it is apparent that nvargus-daemon itself is broken as well. It crashes with the following output when running the GStreamer command written above:

=== NVIDIA Libargus Camera Service (0.99.33)=== Listening for connections...=== gst-launch-1.0[114353]: Connection established (FFFFBAE5B8C0)SCF: Error NotSupported: Failed to load EGL library (in src/services/gl/GLService.cpp, function initializeEGLExportFunctions(), line 190)
SCF: Error NotSupported:  (propagating from src/services/gl/GLService.cpp, function initialize(), line 147)
SCF: Error NotSupported:  (propagating from src/services/gl/GLService.cpp, function startService(), line 46)
SCF: Error NotSupported:  (propagating from src/components/ServiceHost.cpp, function startServices(), line 142)
SCF: Error NotSupported:  (propagating from src/api/CameraDriver.cpp, function initialize(), line 178)
SCF: Error InvalidState: Services are already stopped (in src/components/ServiceHost.cpp, function stopServicesInternal(), line 193)
SCF: Error NotSupported:  (propagating from src/api/CameraDriver.cpp, function getCameraDriver(), line 117)
(Argus) Error NotSupported:  (propagating from src/api/GlobalProcessState.cpp, function createCameraProvider(), line 210)
=== gst-launch-1.0[114353]: CameraProvider initialized (0xffffb40d09c0)=== gst-launch-1.0[114353]: CameraProvider destroyed (0xffffb40d09c0)Segmentation fault (core dumped)

Using LD_DEBUG=libs, the issue is apparent:

find library=libnvidia-eglcore.so.35.4.1 [0]; searching
 search path=/nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib           (RUNPATH from file /nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib/libEGL_nvidia.so.0)
  trying file=/nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib/libnvidia-eglcore.so.35.4.1

calling init: /nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib/libnvidia-eglcore.so.35.4.1

/nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib/libnvidia-eglcore.so.35.4.1: error: symbol lookup error: undefined symbol: ErrorF (fatal)
/nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib/libnvidia-eglcore.so.35.4.1: error: symbol lookup error: undefined symbol: __malloc_hook (fatal)
/nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib/libnvidia-eglcore.so.35.4.1: error: symbol lookup error: undefined symbol: __realloc_hook (fatal)
/nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib/libnvidia-eglcore.so.35.4.1: error: symbol lookup error: undefined symbol: __free_hook (fatal)
/nix/store/zfgs6bfwk460fmqglxhm3lj51w03h3ls-nvidia-l4t-3d-core-35.4.1-20230801124926/lib/libnvidia-eglcore.so.35.4.1: error: symbol lookup error: undefined symbol: __memalign_hook (fatal)

Most of these symbols were removed in glibc 2.34. I'm not sure what the best way to fix this is - we may need to force an older glibc to load somehow instead. L4T uses 2.31.
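
To make the list of missing symbols easier to read, the LD_DEBUG output can be reduced to just the symbol names. This is only a sketch: the function name undefined_symbols is made up for illustration, and it assumes the messages keep the "undefined symbol: NAME" shape shown above.

```shell
# Read loader diagnostics on stdin and print each distinct symbol named in
# an "undefined symbol: NAME" message, one per line.
# undefined_symbols is a hypothetical helper name.
undefined_symbols() {
  sed -n 's/.*undefined symbol: \([A-Za-z0-9_]*\).*/\1/p' | LC_ALL=C sort -u
}

# Usage (not run here; LD_DEBUG writes to stderr by default):
#   LD_DEBUG=libs nvargus-daemon 2>&1 | undefined_symbols
```

Fed the five error lines above, this would print ErrorF, __free_hook, __malloc_hook, __memalign_hook and __realloc_hook.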

@danielfullmer
Collaborator

Thanks for the well-documented issue! Here are a couple of comments based on my initial read of this:

For gstreamer:

To solve this, it may be worth building the Argus GStreamer plugins from source

I definitely agree. If we can build from source we should.


For nvargus-daemon:

Given that the symbol lookup errors are happening while loading libnvidia-eglcore.so.35.4.1, I'd expect that it's not just nvargus-daemon that is broken against nixos-unstable.

Most of these symbols were removed in glibc 2.34. I'm not sure what the best way to fix this is - we may need to force an older glibc to load somehow instead. L4T uses 2.31.

Here's some text from the glibc release notes link you referenced:

The malloc debugging DSO libc_malloc_debug.so currently supports hooks and can be preloaded to get this functionality back for older programs. However this is a transitional measure and may be removed in a future release of the GNU C Library.

Hopefully this would suffice for now. If possible, I'd try to avoid mixing different versions of glibc as it can lead to diamond dependency-type problems.
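
A rough, untested sketch of what preloading that DSO might look like follows. The helper name malloc_debug_preload is made up, the file name libc_malloc_debug.so.0 and the directory passed in are assumptions to check against the glibc actually in use, and it is not confirmed that this resolves the lookups in libnvidia-eglcore. ErrorF is not one of the malloc hooks, so this would not cover it.

```shell
# Print an LD_PRELOAD value that puts libc_malloc_debug.so.0 first, keeping
# any libraries that are already listed in LD_PRELOAD.
# $1: directory expected to contain libc_malloc_debug.so.0 (assumption: the
#     lib directory of the glibc in use).
# malloc_debug_preload is a hypothetical helper name.
malloc_debug_preload() {
  lib="$1/libc_malloc_debug.so.0"
  if [ ! -e "$lib" ]; then
    echo "not found: $lib" >&2
    return 1
  fi
  printf '%s\n' "$lib${LD_PRELOAD:+:$LD_PRELOAD}"
}

# Usage (not run here; /path/to/glibc/lib is a placeholder):
#   LD_PRELOAD=$(malloc_debug_preload /path/to/glibc/lib) nvargus-daemon
```

Preloading keeps a single glibc in the process, which fits the goal of not mixing glibc versions.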


As an aside, most of the testing we currently do is based on the latest stable NixOS release, so we often don't find these issues with nixpkgs-unstable until you report them (which we appreciate). It is on my TODO list to start running nightly tests against master as well.
