Update to Jetpack 5.1.3 / L4T 35.5.0 #198

Open
wants to merge 7 commits into
base: master

Conversation

Princemachiavelli
Contributor

@Princemachiavelli Princemachiavelli commented Mar 28, 2024

Description of changes

Jetpack release notes: https://docs.nvidia.com/jetson/archives/jetpack-archived/jetpack-513/release-notes/index.html
The headline items in the release notes are "Double-Bit ECC Error Detection for the Jetson AGX Orin Industrial" and "Fixes for known security vulnerabilities".

However, a major undocumented change is that edk2-nvidia has moved from OpenSSL 1.1.1 to 3.0. This PR removes vendoredOpenSSL and instead follows upstream nixpkgs' approach of using buildPackages.openssl.src.

Notes

  • The upstream edk2-nvidia PR that fixes the Eqos TX clock name (fix(eqos): Use correct TX clock name, NVIDIA/edk2-nvidia#76) appears to be unresolved, so the patch is vendored here until it is fixed upstream.
  • Upstream linux-tegra broke DMI_SYSFS by disabling DMI itself (OE4T/linux-tegra-5.10@bc94634) because of an unspecified bug; this PR reverts that change to re-enable DMI and DMI_SYSFS.
  • Upstream now uses chipsku in addition to boardsku to differentiate Orin NX from Orin Nano, so chipsku needs to be tracked under the firmware variants.

Updating

  • Update l4tVersion, jetpackVersion, and cudaVersion in default.nix
  • Update branch/revision/sha256s in:
    • default.nix
    • kernel/default.nix
    • uefi-firmware.nix
    • Grep for "sha256 = " and "hash = ", see if there is anything else not covered
  • Update gitrepos.json using sourceinfo/gitrepos-update.py and result/source_sync.sh from bspSrc.
  • Update the kernel version in kernel/default.nix if it changed.
  • Grep for the previous version strings, e.g. "35.4.1"
  • Compare files from unpackedDebs before and after
  • Grep for NvOsLibraryLoad in libraries from debs to see if any new packages not already handled in l4t use the function
  • Ensure the soc variants in modules/flash-script.nix match those in jetson_board_spec.cfg from BSP
  • Ensure logic in ota-utils/ota_helpers.func matches nvidia-l4t-init/opt/nvidia/nv-l4t-bootloader-config.sh
  • Run nix build .#genL4tJson and copy output to pkgs/containers/l4t.json
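
Two of the checklist items above (grepping for leftover hash attributes or version strings, and comparing the unpackedDebs file listings before and after) can be scripted. The following is a minimal sketch, not part of this PR; the function names `find_strings` and `compare_trees` are hypothetical, and it assumes the two unpackedDebs outputs are available as ordinary directories.

```python
from pathlib import Path


def find_strings(root, needles, suffixes=(".nix", ".json")):
    """Return (relative path, line number, stripped line) for every line
    under `root` that contains any of the `needles`.

    Only files whose suffix is in `suffixes` are scanned (an assumption
    made to keep the scan away from binaries).
    """
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                rel = path.relative_to(root).as_posix()
                hits.append((rel, lineno, line.strip()))
    return hits


def compare_trees(before, after):
    """Return (added, removed): sorted relative file paths that exist only
    in `after` or only in `before`. File contents are not compared."""

    def listing(root):
        root = Path(root)
        return {
            p.relative_to(root).as_posix()
            for p in root.rglob("*")
            if p.is_file()
        }

    old, new = listing(before), listing(after)
    return sorted(new - old), sorted(old - new)
```

For example, `find_strings(".", ["sha256 = ", "hash = ", "35.4.1"])` lists candidate lines to review, and `compare_trees(old_debs, new_debs)` shows which files appeared or disappeared between two unpacked trees (`old_debs` and `new_debs` being placeholder paths).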

Testing

  • Run nix flake check
  • Build installer ISO
  • Flash all variants
  • Boot all variants
  • Test UEFI Capsule update (35.3.1 -> 35.5.0)
  • Run our (Anduril's) internal automated device tests

Collaborator

@danielfullmer danielfullmer left a comment

Thanks for your work on this so far!

modules/flash-script.nix (outdated, resolved)
pkgs/samples/default.nix (outdated, resolved)
pkgs/uefi-firmware/default.nix (outdated, resolved)
@Princemachiavelli
Contributor Author

Princemachiavelli commented Jun 3, 2024

Should be ready for review again now.
Besides resolving all of the issues identified, there are a few changes.

  1. Added chip_sku to hardware.nvidia-jetpack.firmware.variants. It functions just like boardsku when flashing, except that the flash script now needs chip_sku to be set externally in order to differentiate correctly between Orin NX and Orin Nano.
  2. nvfancontrol: this service would sometimes start too early and fail, so it now sets Restart=on-failure.
  3. optee-gen-ekb: Since regenerating the EKB is necessary for fused boards, I've provided this tool as a standalone package.
  4. optee-ftpm-manufacturer: Not strictly necessary for this PR but needed for using the new fTPM feature.
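
To illustrate item 1, here is a minimal sketch of why matching on boardsku alone can become ambiguous once the chip SKU also matters. It is not the module's actual option schema or the flash script's logic: the `FirmwareVariant` class, the `select_variant` helper, and every SKU value below are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FirmwareVariant:
    name: str
    boardsku: str
    # None means "this variant does not care about the chip SKU".
    chipsku: Optional[str] = None


def select_variant(
    variants: Sequence[FirmwareVariant],
    boardsku: str,
    chipsku: Optional[str],
) -> FirmwareVariant:
    """Return the single variant matching both SKUs, or raise LookupError."""
    matches = [
        v
        for v in variants
        if v.boardsku == boardsku and v.chipsku in (None, chipsku)
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one variant for boardsku={boardsku!r}, "
            f"chipsku={chipsku!r}; found {len(matches)}"
        )
    return matches[0]


# Placeholder data: two variants that a boardsku-only match cannot separate.
VARIANTS = [
    FirmwareVariant("module-a", boardsku="SKU-1", chipsku="CHIP-X"),
    FirmwareVariant("module-b", boardsku="SKU-1", chipsku="CHIP-Y"),
]
```

With this placeholder data, `select_variant(VARIANTS, "SKU-1", "CHIP-Y")` returns the `module-b` entry, while leaving the chip SKU unset raises `LookupError`. That mirrors the requirement that chip_sku be supplied externally.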

There is one unresolved bug. When updating from previous firmware (e.g. 35.4.1) to 35.5.0 via UEFI capsule update, nvbootctrl dump-slots-info shows an incorrect capsule update status: instead of reporting a successful capsule update, the tool shows 0, as if no update had occurred. I've opened an issue on Nvidia's forums [1], but I don't think this minor issue should block the entire 35.5.0 update.

[1] https://forums.developer.nvidia.com/t/nvbootctrl-shows-incorrect-status-after-uefi-capsule-update/295041

@danielfullmer
Collaborator

Could you fix the formatting CI failures? After that I'd be ready to merge. Thanks!

* Jetpack 5.1.2 -> 5.1.3
* l4t 35.4.1 -> 35.5.0
* kernel: 5.10.120(-rt70) -> 5.10.192(-rt96)
* * OE4T: oe4t-patches-l4t-r35.4.ga (2023-09-27) -> oe4t-patches-l4t-35.5.0 (2024-03-08)
* * Remove BTF patches fixed in OE4T/linux-tegra-5.10@c5006ab
* * Remove gcc13-synchronize-bond patch now in upstream
* * Remove crng_ready patch now in upstream
* nvidia-display-driver: fix sourceRoot
* board-automation: remove python 2-> 3 patch (upstream now uses python3)
* flash-tools: update flash-tools.patch
* jetson-benchmarks: 43892b9 -> c029c7d
* multimedia-samples: enable separateDebugInfo
* edk2: update to match l4tVersion
* edk2-(platforms,non-osi,nvidia-non-osi): update to match l4tVersion
* edk2-nvidia: r35.4.1-updates (2023-08-07) -> r35.5.0-updates (2024-03-08)
* * remove obsolete fix-disabled-serial.patch (Upstream PR#68 merged)
* * vendor Eqos TX clock name patch (Upstream PR#76 still open)
* * update edk2-uefi-dtb patch
* edk2-jetson: update to match l4tVersion
* * Switch to nixpkgs OpenSSL (1.1.1t -> 3.0.x)
* * remove obsolete edk2 openssl patches

Re-enable DMI so that DMI_SYSFS is re-enabled as well and the firmware
version can be checked from userspace via sysfs.

In 35.5.0 some of the logic that generates the BUP now depends on CHIP_SKU
instead of BOARD_SKU. See the relevant Nvidia forum thread [1] and the
corresponding change in OE4T/meta-tegra [2].

Also correct flash-tools-flashcmd which seems to have been
incorrectly using chiprev as CHIP_SKU.

[1] https://forums.developer.nvidia.com/t/failed-to-turn-isp-power-on-error-at-orinnano-8gb-l4t-35-5-0/289788/3
[2] OE4T/meta-tegra@a9e7e19
@Princemachiavelli
Contributor Author

@danielfullmer This is ready for review again. It passes our internal device tests. The CI failure appears to be a GitHub timeout issue because it succeeds locally for me and nixpkgs-fmt --check also passes.

3 participants