Commit
On iface add, call systemctl directly rather than via SYSTEMD_WANTS
Experience shows that the SYSTEMD_WANTS setting can get lost, resulting in a mismatch between systemd's internal state and the state of the ec2net-ifup@$iface.service unit (and consequently the actual state of the interface itself). With SYSTEMD_WANTS, we can induce this state mismatch by simulating repeated plug/unplug events.

A typical active interface will be visible as "plugged" to systemd, and the corresponding ec2net-ifup@eth1.service will be running, as in:

    [root@ip-10-0-0-28 ~]# systemctl status /sys/devices/pci0000:00/0000:00:06.0/net/eth1
    ● sys-devices-pci0000:00-0000:00:06.0-net-eth1.device - Elastic Network Adapter (ENA)
       Loaded: loaded
       Active: active (plugged) since Fri 2022-08-12 18:13:34 UTC; 30s ago
       Device: /sys/devices/pci0000:00/0000:00:06.0/net/eth1

    [root@ip-10-0-0-28 ~]# systemctl status -n0 ec2net-ifup@eth1.service
    ● ec2net-ifup@eth1.service - Enable elastic network interfaces eth1
       Loaded: loaded (/usr/lib/systemd/system/ec2net-ifup@.service; static; vendor preset: disabled)
       Active: active (exited) since Fri 2022-08-12 18:13:34 UTC; 2min 3s ago
      Process: 5046 ExecStart=/usr/sbin/ec2ifup %i (code=exited, status=0/SUCCESS)
     Main PID: 5046 (code=exited, status=0/SUCCESS)
       CGroup: /system.slice/system-ec2net\x2difup.slice/ec2net-ifup@eth1.service
               ├─5172 /sbin/dhclient -q -cf /etc/dhcp/dhclient-eth1.conf -lf /var/lib/dhclient/dhclient--eth1.lease -pf /var/run/dhclient-eth...
               └─5229 /sbin/dhclient -6 -nw -lf /var/lib/dhclient/dhclient6--eth1.lease -pf /var/run/dhclient6-eth1.pid eth1 -H ip-10-0-0-28

    [root@ip-10-0-0-28 ~]# ip link show dev eth1
    9: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP mode DEFAULT group default qlen 1000
        link/ether 02:64:87:35:06:b5 brd ff:ff:ff:ff:ff:ff

We can simulate an ENI detachment of this interface by unbinding the device from the ena driver:

    [root@ip-10-0-0-28 ~]# echo '0000:00:06.0' > /sys/bus/pci/drivers/ena/unbind

After which we see:

    [root@ip-10-0-0-28 ~]# systemctl status /sys/devices/pci0000:00/0000:00:06.0/net/eth1
    ● sys-devices-pci0000:00-0000:00:06.0-net-eth1.device
       Loaded: loaded
       Active: inactive (dead)

    [root@ip-10-0-0-28 ~]# systemctl status -n0 ec2net-ifup@eth1.service
    ● ec2net-ifup@eth1.service - Enable elastic network interfaces eth1
       Loaded: loaded (/usr/lib/systemd/system/ec2net-ifup@.service; static; vendor preset: disabled)
       Active: inactive (dead)

    [root@ip-10-0-0-28 ~]# ip link show dev eth1
    Device "eth1" does not exist.

Re-attaching the interface, by writing the same PCI address to /sys/bus/pci/drivers/ena/bind, makes systemd bring the interface back to its managed configuration using ec2net-ifup@eth1.service, as shown above.

However, if we repeatedly detach and re-attach the interface, we wind up in an unexpected state after a few iterations:

    [root@ip-10-0-0-28 ~]# unbind_eth1(){ echo '0000:00:06.0' > /sys/bus/pci/drivers/ena/unbind; }; \
        bind_eth1(){ echo '0000:00:06.0' > /sys/bus/pci/drivers/ena/bind; }; \
        while true; do \
            rm -f /var/lib/dhclient/*eth1*; unbind_eth1; bind_eth1; sleep 2; \
            ip -o link show | grep -q 'eth1:.*state UP' || break; \
        done

    [root@ip-10-0-0-28 ~]# systemctl status /sys/devices/pci0000:00/0000:00:06.0/net/eth1
    ● sys-devices-pci0000:00-0000:00:06.0-net-eth1.device - Elastic Network Adapter (ENA)
       Loaded: loaded
       Active: active (plugged) since Fri 2022-08-12 18:20:35 UTC; 58s ago
       Device: /sys/devices/pci0000:00/0000:00:06.0/net/eth1

    [root@ip-10-0-0-28 ~]# systemctl status -n0 ec2net-ifup@eth1.service
    ● ec2net-ifup@eth1.service - Enable elastic network interfaces eth1
       Loaded: loaded (/usr/lib/systemd/system/ec2net-ifup@.service; static; vendor preset: disabled)
       Active: inactive (dead) since Fri 2022-08-12 18:20:35 UTC; 1min 1s ago
      Process: 5820 ExecStop=/usr/sbin/ec2ifdown %i (code=exited, status=0/SUCCESS)
      Process: 5470 ExecStart=/usr/sbin/ec2ifup %i (code=exited, status=0/SUCCESS)
     Main PID: 5470 (code=exited, status=0/SUCCESS)

    [root@ip-10-0-0-28 ~]# ip link show dev eth1
    11: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 02:64:87:35:06:b5 brd ff:ff:ff:ff:ff:ff

The device is available according to the kernel and systemd reports the device unit as plugged, but systemd hasn't activated the ec2net-ifup@eth1.service unit and thus the device is never configured.

With this change, ec2net-ifup@eth1.service is reliably started in the above repro sequence.
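The mismatch described above (device unit active, service unit not) can be detected mechanically. The following is a minimal sketch and not part of this commit: the function names is_mismatch and check_iface are hypothetical, and it assumes that systemctl is-active accepts the same sysfs device path that systemctl status accepts in the transcripts above.

```shell
#!/bin/sh
# Hypothetical helpers (not from the commit) for spotting the state mismatch.

# is_mismatch DEVICE_STATE SERVICE_STATE
# Succeeds (exit 0) when the device unit is active but the service unit is not.
is_mismatch() {
    [ "$1" = "active" ] && [ "$2" != "active" ]
}

# check_iface IFACE
# Asks systemd for the state of the device unit and of the per-interface
# service, then prints a message and returns 1 if they disagree.
check_iface() {
    iface=$1
    # Resolve /sys/class/net/IFACE to the full sysfs device path,
    # e.g. /sys/devices/pci0000:00/0000:00:06.0/net/eth1
    dev_path=$(readlink -f "/sys/class/net/${iface}")
    # is-active prints the state; a non-zero exit for inactive units is expected.
    # Assumption: is-active resolves a sysfs path to its .device unit.
    dev_state=$(systemctl is-active "$dev_path" 2>/dev/null)
    svc_state=$(systemctl is-active "ec2net-ifup@${iface}.service" 2>/dev/null)
    if is_mismatch "$dev_state" "$svc_state"; then
        echo "mismatch: ${iface} device is ${dev_state}, service is ${svc_state}"
        return 1
    fi
    return 0
}
```

Replacing the grep test in the repro loop with a call such as check_iface eth1 || break would stop the loop on the systemd-level mismatch rather than on the link state.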