[Mellanox] Fix the issue with ASIC detection on the SN4280 platform (#20397) #20621

Open
wants to merge 1 commit into
base: 202405

Commits on Oct 25, 2024

  1. [Mellanox] Fix the issue with ASIC detection on the SN4280 platform (s…

    …onic-net#20397)
    
    - Why I did it
    Fix the issue with ASIC detection on the SN4280 platform.
    
    The root cause is a race condition in the PCI subsystem. When Dark Mode is enabled, the following actions run in parallel at system start:
    
    The dpuctl service starts and powers down the DPUs, which removes the DPU PCI devices.
    At the same time, the syncd service starts. It launches the mlnx-fw-upgrade.sh script, which queries the PCI subsystem for the available ASIC devices using the lspci command.
    For a short period after a DPU PCI device is removed, the Linux PCI subsystem remains in an inconsistent state, and the lspci command may return an error. Such an error can cause mlnx-fw-upgrade.sh to fail, which interrupts the start of the syncd container.
    
    - How I did it
    Add a retry mechanism for the lspci command. Cache lspci output to reduce the number of command executions.
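
The retry-and-cache idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function name `load_pci_devices`, the `LSPCI_*` variables, the default `lspci -n` invocation, the retry count, and the delay are all assumptions chosen for the example. Treating empty output as a failure is also an assumption.

```shell
#!/bin/sh
# Hypothetical sketch of "retry lspci, then cache its output".
# None of these names come from mlnx-fw-upgrade.sh; they are illustrative.

LSPCI_CMD="${LSPCI_CMD:-lspci -n}"      # command to run (assumed flags)
LSPCI_RETRIES="${LSPCI_RETRIES:-5}"     # maximum number of attempts (assumed)
LSPCI_DELAY="${LSPCI_DELAY:-1}"         # seconds to wait between attempts (assumed)
LSPCI_CACHE=""                          # holds the output of the first successful run

# Fill LSPCI_CACHE with the lspci output, retrying on failure.
# Returns 0 once the cache is populated, 1 if every attempt failed.
load_pci_devices() {
    # Reuse the output of an earlier successful call.
    if [ -n "$LSPCI_CACHE" ]; then
        return 0
    fi

    _attempt=1
    while [ "$_attempt" -le "$LSPCI_RETRIES" ]; do
        # A non-zero exit status or empty output counts as a failed attempt.
        if _out=$($LSPCI_CMD 2>/dev/null) && [ -n "$_out" ]; then
            LSPCI_CACHE=$_out
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$_attempt" -lt "$LSPCI_RETRIES" ]; then
            sleep "$LSPCI_DELAY"
        fi
        _attempt=$((_attempt + 1))
    done
    return 1
}
```

In this sketch the caller would run `load_pci_devices` directly and then read `$LSPCI_CACHE`, rather than capturing the function with `$(...)`. Command substitution runs in a subshell, so a cache written there would be lost and every lookup would execute lspci again.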
    
    - How to verify it
    Run the regression tests.
    oleksandrivantsiv committed Oct 25, 2024
    72bcc3b