Cannot program the FPGA #96

ravicorning · 2021-03-11T00:42:21Z

Hi

Using this API to program the binary in the FPGA:
kubectl rsu program -f <signed_RTL_image> -n <hostname> -d <RSU_PCI_bus_function_id>

The documentation doesn't specify the arguments clearly, so I assume signed_RTL_image is the name of the .bin file. As for the hostname, for some reason the API doesn't seem to work with the node name the K8s controller sees, so I have to use the IP address. The PCI bus ID is what I get from "lspci | grep acc". Running this, I see that the fpga-opae... container is stuck in a Pending state and complains about a mismatch in the node selector. Is there a corresponding .yml I can modify to remove this filtering?

[root@corningopenness opt]# kubectl describe pods fpga-opae-10.12.87.80-0b30-xbrx2
Name: fpga-opae-10.12.87.80-0b30-xbrx2
Namespace: default
Priority: 0
Node:
Labels: controller-uid=8a798e1c-5a1e-43d9-bb37-a9049458d61f
job-name=fpga-opae-10.12.87.80-0b30
Annotations:
Status: Pending
IP:
IPs:
Controlled By: Job/fpga-opae-10.12.87.80-0b30
Containers:
fpga-opae:
Image: fpga-opae-pacn3000:1.0
Port:
Host Port:
Command:
sudo
-E
/bin/bash
-c
--
Args:
./check_if_modules_loaded.sh && fpgasupdate /root/images/20ww27.5-2x2x25G-5GLDPC-v1.6.1-3.0.0-unsigned.bin 0b30 && rsu bmcimg 0b30
Environment:
PYTHONIOENCODING: utf-8
Mounts:
/root/images from image-dir (rw)
/sys/devices from class (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-fzdsz (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
class:
Type: HostPath (bare host directory volume)
Path: /sys/devices
HostPathType:
image-dir:
Type: HostPath (bare host directory volume)
Path: /temp/vran_images
HostPathType:
default-token-fzdsz:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-fzdsz
Optional: false
QoS Class: BestEffort
Node-Selectors: kubernetes.io/hostname=10.12.87.80
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message

Warning FailedScheduling 58m default-scheduler 0/2 nodes are available: 2 node(s) didn't match node selector.
Warning FailedScheduling 58m default-scheduler 0/2 nodes are available: 2 node(s) didn't match node selector

aniket-intel · 2021-03-11T09:14:43Z

The node selector values do not match because you ran the program command with the IP address. Please try running it with the node hostname. This will be the same as the one in /etc/hostname on the node.

As for the kubectl rsu command, please run kubectl rsu discover before the program command, and copy the signed or unsigned image name and the device ID from its output. Then paste those values into the kubectl rsu program command.
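The sequence described above can be sketched as a small shell helper. This is only a sketch under assumptions: the `-f`, `-n`, and `-d` flags are the ones quoted in this thread, `rsu_program_cmds` is a hypothetical helper name, and the example node name, image name, and device ID are taken from other comments in this thread (it is assumed that `opennesswkn-1` is the node holding the FPGA). The helper only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: print the discover/program sequence for one node.
# The flags (-f, -n, -d) are as quoted in this thread; check them against
# the kubectl rsu plugin you actually have installed.
#
# To see which hostname label each node carries (the value the node
# selector has to match), you can run:
#   kubectl get nodes -L kubernetes.io/hostname
rsu_program_cmds() {
    node="$1"    # node hostname as Kubernetes sees it, not the IP address
    image="$2"   # image file name copied from the discover output
    device="$3"  # device ID copied from the discover output
    printf 'kubectl rsu discover -n %s\n' "$node"
    printf 'kubectl rsu program -f %s -n %s -d %s\n' "$image" "$node" "$device"
}

# Example values taken from this thread (assumed to belong to the same node).
rsu_program_cmds opennesswkn-1 \
    20ww27.5-2x2x25G-5GLDPC-v1.6.1-3.0.0-unsigned.bin \
    54:00.0
```

Printing rather than executing keeps the sketch safe to run anywhere; pipe the output to `sh` only after checking the values.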

ravicorning · 2021-03-11T17:54:09Z

OK, sure. Can you tell me which one is the device ID here: 8086:0b30 or 54:00.0?
[root@corningopenness ravi]# kubectl rsu discover -n 10.12.87.80

Available RTL images:
[email protected]'s password:

Mar 10 44M 20ww27.5-2x2x25G-5GLDPC-v1.6.1-3.0.0-unsigned.bin

FPGA devices installed:

[email protected]'s password:
54:00.0 Processing accelerators [1200]: Intel Corporation Device [8086:0b30]
Subsystem: Intel Corporation Device [8086:0000]
Kernel driver in use: intel-fpga-pci
Kernel modules: intel_fpga_pci

ravicorning · 2021-03-11T19:37:38Z

I was able to download the image to the FPGA and configure the VFs in the FPGA, but I cannot see it in the available resources. Any idea what the problem might be? It says the resources should map to ConfigMap.yml for the device plugin, but where is that correlated with the bb_config Helm chart provisioning?

[root@corningopenness helm-charts]# kubectl get node opennesswkn-1 -o json | jq '.status.allocatable'
{
"cpu": "46",
"devices.kubevirt.io/kvm": "110",
"devices.kubevirt.io/tun": "110",
"devices.kubevirt.io/vhost-net": "110",
"ephemeral-storage": "96589578081",
"hugepages-1Gi": "20Gi",
"intel.com/intel_sriov_netdevice": "12",
"memory": "110455600Ki",
"pods": "110"
}

[root@corningopenness helm-charts]# kubectl logs intel-fpga-cfg-opennesswkn-1-jwlpn
ERROR: Section (FLR) or name (flr_time_out) is not valid.
FEC FPGA RTL v3.0
UL.DL Weights = 3.3
UL.DL Load Balance = 128.128
Queue-PF/VF Mapping Table = READY
Ring Descriptor Size = 256 bytes

--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
        | PF  | VF0 | VF1 | VF2 | VF3 | VF4 | VF5 | VF6 | VF7 |
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
UL-Q'00 |     |  X  |     |     |     |     |     |     |     |
UL-Q'01 |     |  X  |     |     |     |     |     |     |     |
UL-Q'02 |     |  X  |     |     |     |     |     |     |     |
UL-Q'03 |     |  X  |     |     |     |     |     |     |     |
UL-Q'04 |     |  X  |     |     |     |     |     |     |     |
UL-Q'05 |     |  X  |     |     |     |     |     |     |     |
UL-Q'06 |     |  X  |     |     |     |     |     |     |     |
UL-Q'07 |     |  X  |     |     |     |     |     |     |     |
UL-Q'08 |     |  X  |     |     |     |     |     |     |     |
UL-Q'09 |     |  X  |     |     |     |     |     |     |     |
UL-Q'10 |     |  X  |     |     |     |     |     |     |     |
UL-Q'11 |     |  X  |     |     |     |     |     |     |     |
UL-Q'12 |     |  X  |     |     |     |     |     |     |     |
UL-Q'13 |     |  X  |     |     |     |     |     |     |     |
UL-Q'14 |     |  X  |     |     |     |     |     |     |     |
UL-Q'15 |     |  X  |     |     |     |     |     |     |     |
UL-Q'16 |     |     |  X  |     |     |     |     |     |     |
UL-Q'17 |     |     |  X  |     |     |     |     |     |     |
UL-Q'18 |     |     |  X  |     |     |

ravicorning · 2021-03-11T20:09:33Z

Also posted this on GitHub; mailing too, expecting a faster response: I was able to download the image to the FPGA and, I guess, configure the VFs in the FPGA, but I cannot see it in the available resources. Any idea what the problem might be? The documentation says the resources should map to ConfigMap.yml for the device plugin, but where is that correlated with the bb_config Helm chart provisioning?

***@***.*** helm-charts]# kubectl get node opennesswkn-1 -o json | jq '.status.allocatable'
{
"cpu": "46",
"devices.kubevirt.io/kvm": "110",
"devices.kubevirt.io/tun": "110",
"devices.kubevirt.io/vhost-net": "110",
"ephemeral-storage": "96589578081",
"hugepages-1Gi": "20Gi",
"intel.com/intel_sriov_netdevice": "12",
"memory": "110455600Ki",
"pods": "110"
}

***@***.*** helm-charts]# kubectl logs intel-fpga-cfg-opennesswkn-1-jtg5f
ERROR: Section (FLR) or name (flr_time_out) is not valid.
FEC FPGA RTL v3.0
UL.DL Weights = 3.3
UL.DL Load Balance = 128.128
Queue-PF/VF Mapping Table = READY
Ring Descriptor Size = 256 bytes

…

--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
        | PF  | VF0 | VF1 | VF2 | VF3 | VF4 | VF5 | VF6 | VF7 |
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
UL-Q'00 |     |  X  |     |     |     |     |     |     |     |
UL-Q'01 |     |  X  |     |     |     |     |     |     |     |
UL-Q'02 |     |  X  |     |     |     |     |     |     |     |
UL-Q'03 |     |  X  |     |     |     |     |     |     |     |
UL-Q'04 |     |  X  |     |     |     |     |     |     |     |
UL-Q'05 |     |  X  |     |     |     |     |     |     |     |
UL-Q'06 |     |  X  |     |     |     |     |     |     |     |
UL-Q'07 |     |  X  |     |     |     |     |     |     |     |
UL-Q'08 |     |  X  |     |     |     |     |     |     |     |
UL-Q'09 |     |  X  |     |     |     |     |     |     |     |
UL-Q'10 |     |  X  |     |     |     |     |     |     |     |
UL-Q'11 |     |  X  |     |     |     |     |     |     |     |
UL-Q'12 |     |  X  |     |     |     |     |     |     |     |
UL-Q'13 |     |  X  |     |     |     |     |     |     |     |
UL-Q'14 |     |  X  |     |     |     |     |     |     |     |
UL-Q'15 |     |  X  |     |     |     |     |     |     |     |
UL-Q'16 |     |     |  X  |     |     |     |     |     |     |
UL-Q'17 |     |     |  X  |     |     |     |     |     |     |
UL-Q'18 |     |     |  X  |     |     |     |     |     |     |
UL-Q'19 |     |     |  X  |     |     |     |     |     |     |
UL-Q'20 |     |     |  X  |     |     |     |     |     |     |
UL-Q'21 |     |     |  X  |     |     |     |     |     |     |
UL-Q'22 |     |     |  X  |     |     |     |     |     |     |
UL-Q'23 |     |     |  X  |     |     |     |     |     |     |
UL-Q'24 |     |     |  X  |     |     |     |     |     |     |
UL-Q'25 |     |     |  X  |     |     |     |     |     |     |
UL-Q'26 |     |     |  X  |     |     |     |     |     |     |
UL-Q'27 |     |     |  X  |     |     |     |     |     |     |
UL-Q'28 |     |     |  X  |     |     |     |     |     |     |
UL-Q'29 |     |     |  X  |     |     |     |     |     |     |
UL-Q'30 |     |     |  X  |     |     |     |     |     |     |
UL-Q'31 |     |     |  X  |     |     |     |     |     |     |
DL-Q'32 |     |  X  |     |     |     |     |     |     |     |
DL-Q'33 |     |  X  |     |     |     |     |     |     |     |
DL-Q'34 |     |  X  |     |     |     |     |     |     |     |
DL-Q'35 |     |  X  |     |     |     |     |     |     |     |
DL-Q'36 |     |  X  |     |     |     |     |     |     |     |
DL-Q'37 |     |  X  |     |     |     |     |     |     |     |
DL-Q'38 |     |  X  |     |     |     |     |     |     |     |
DL-Q'39 |     |  X  |     |     |     |     |     |     |     |
DL-Q'40 |     |  X  |     |     |     |     |     |     |     |
DL-Q'41 |     |  X  |     |     |     |     |     |     |     |
DL-Q'42 |     |  X  |     |     |     |     |     |     |     |
DL-Q'43 |     |  X  |     |     |     |     |     |     |     |
DL-Q'44 |     |  X  |     |     |     |     |     |     |     |
DL-Q'45 |     |  X  |     |     |     |     |     |     |     |
DL-Q'46 |     |  X  |     |     |     |     |     |     |     |
DL-Q'47 |     |  X  |     |     |     |     |     |     |     |
DL-Q'48 |     |     |  X  |     |     |     |     |     |     |
DL-Q'49 |     |     |  X  |     |     |     |     |     |     |
DL-Q'50 |     |     |  X  |     |     |     |     |     |     |
DL-Q'51 |     |     |  X  |     |     |     |     |     |     |
DL-Q'52 |     |     |  X  |     |     |     |     |     |     |
DL-Q'53 |     |     |  X  |     |     |     |     |     |     |
DL-Q'54 |     |     |  X  |     |     |     |     |     |     |
DL-Q'55 |     |     |  X  |     |     |     |     |     |     |
DL-Q'56 |     |     |  X  |     |     |     |     |     |     |
DL-Q'57 |     |     |  X  |     |     |     |     |     |     |
DL-Q'58 |     |     |  X  |     |     |     |     |     |     |
DL-Q'59 |     |     |  X  |     |     |     |     |     |     |
DL-Q'60 |     |     |  X  |     |     |     |     |     |     |
DL-Q'61 |     |     |  X  |     |     |     |     |     |     |
DL-Q'62 |     |     |  X  |     |     |     |     |     |     |
DL-Q'63 |     |     |  X  |     |     |     |     |     |     |
--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+
Mode of operation = VF-mode
FPGA_5GNR PF [0000:56:00.0] configuration complete!

aniket-intel · 2021-03-12T03:39:16Z

Run the kubectl rsu discover command with the hostname.

Also, the device ID here is 54:00.0. The FPGA will show up in the list of allocatable resources only once the card is configured properly.
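To make the distinction concrete: in the lspci-style line from the discover output, the leading `54:00.0` is the PCI bus:device.function address (the value referred to as the device ID here), while the bracketed `8086:0b30` is the vendor:device pair. A minimal sketch that pulls the two apart follows; the function names `pci_address` and `pci_vendor_device` are hypothetical and not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# Print the PCI address (bus:device.function), i.e. the first field of the line.
pci_address() {
    printf '%s\n' "$1" | awk '{print $1}'
}

# Print the vendor:device pair, i.e. the bracketed [xxxx:xxxx] hex pair.
# The class code in brackets (e.g. [1200]) has no colon, so it is not matched.
pci_vendor_device() {
    printf '%s\n' "$1" |
        sed -n 's/.*\[\([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\)\].*/\1/p'
}

# Line copied from the discover output earlier in this thread.
line='54:00.0 Processing accelerators [1200]: Intel Corporation Device [8086:0b30]'
pci_address "$line"        # prints 54:00.0
pci_vendor_device "$line"  # prints 8086:0b30
```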

ravicorning · 2021-03-16T00:05:07Z

I got this working after reinstalling the worker node after programming the FPGA.
