On iface add, call systemctl directly rather than via SYSTEMD_WANTS

Experience shows that the SYSTEMD_WANTS setting can get lost,
resulting in a mismatch between systemd's internal state and the state
of the ec2net-ifup@$iface.service unit (and consequently the actual
state of the interface itself).

With SYSTEMD_WANTS, we can induce a state change mismatch by
simulating repeated plug/unplug events.  A typical active interface
will be visible as "plugged" to systemd, and the corresponding
ec2net-ifup@eth1.service will be running, as in:

[root@ip-10-0-0-28 ~]# systemctl status /sys/devices/pci0000:00/0000:00:06.0/net/eth1
● sys-devices-pci0000:00-0000:00:06.0-net-eth1.device - Elastic Network Adapter (ENA)
   Loaded: loaded
   Active: active (plugged) since Fri 2022-08-12 18:13:34 UTC; 30s ago
   Device: /sys/devices/pci0000:00/0000:00:06.0/net/eth1
[root@ip-10-0-0-28 ~]# systemctl status -n0 ec2net-ifup@eth1.service
● ec2net-ifup@eth1.service - Enable elastic network interfaces eth1
   Loaded: loaded (/usr/lib/systemd/system/ec2net-ifup@.service; static; vendor preset: disabled)
   Active: active (exited) since Fri 2022-08-12 18:13:34 UTC; 2min 3s ago
  Process: 5046 ExecStart=/usr/sbin/ec2ifup %i (code=exited, status=0/SUCCESS)
 Main PID: 5046 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/system-ec2net\x2difup.slice/ec2net-ifup@eth1.service
           ├─5172 /sbin/dhclient -q -cf /etc/dhcp/dhclient-eth1.conf -lf /var/lib/dhclient/dhclient--eth1.lease -pf /var/run/dhclient-eth...
           └─5229 /sbin/dhclient -6 -nw -lf /var/lib/dhclient/dhclient6--eth1.lease -pf /var/run/dhclient6-eth1.pid eth1 -H ip-10-0-0-28
[root@ip-10-0-0-28 ~]# ip link show dev eth1
9: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 02:64:87:35:06:b5 brd ff:ff:ff:ff:ff:ff

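We can simulate an ENI detachment of this interface by unbinding the
device from the ena driver (the same sysfs write that unbind_eth1
performs in the repro loop further down), after which we see:

[root@ip-10-0-0-28 ~]# systemctl status /sys/devices/pci0000:00/0000:00:06.0/net/eth1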
● sys-devices-pci0000:00-0000:00:06.0-net-eth1.device
   Loaded: loaded
   Active: inactive (dead)
[root@ip-10-0-0-28 ~]# systemctl status -n0 ec2net-ifup@eth1.service
● ec2net-ifup@eth1.service - Enable elastic network interfaces eth1
   Loaded: loaded (/usr/lib/systemd/system/ec2net-ifup@.service; static; vendor preset: disabled)
   Active: inactive (dead)
[root@ip-10-0-0-28 ~]# ip link show dev eth1
Device "eth1" does not exist.

If we then simulate re-attachment by binding the device to the driver
again, systemd brings the interface back to its managed configuration
using ec2net-ifup@eth1.service, as shown above.
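
The detach and re-attach steps in this walkthrough can be sketched as two
small shell helpers, built on the same sysfs writes that unbind_eth1 and
bind_eth1 perform in the repro loop. The function names and the PCI_ADDR
and ENA_DRIVER_DIR variables are assumptions introduced for illustration;
the defaults are simply the values seen on this instance.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original commit).
# Defaults match the instance shown in this message; override as needed.
PCI_ADDR="${PCI_ADDR:-0000:00:06.0}"
ENA_DRIVER_DIR="${ENA_DRIVER_DIR:-/sys/bus/pci/drivers/ena}"

# Simulate ENI detachment: unbind the PCI device from the ena driver.
detach_eni() {
    printf '%s\n' "$PCI_ADDR" > "$ENA_DRIVER_DIR/unbind"
}

# Simulate ENI re-attachment: bind the PCI device to the ena driver again.
attach_eni() {
    printf '%s\n' "$PCI_ADDR" > "$ENA_DRIVER_DIR/bind"
}
```

Run as root, detach_eni followed by attach_eni corresponds to one
iteration of the unplug/plug cycle described here.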

However, if we repeatedly detach and re-attach the interface, we wind
up in an unexpected state after a few iterations:

[root@ip-10-0-0-28 ~]# unbind_eth1(){ echo '0000:00:06.0' > /sys/bus/pci/drivers/ena/unbind; }; bind_eth1(){ echo '0000:00:06.0' > /sys/bus/pci/drivers/ena/bind; }; while true; do rm -f /var/lib/dhclient/*eth1*; unbind_eth1; bind_eth1; sleep 2; ip -o link show | grep -q 'eth1:.*state UP' || break; done
[root@ip-10-0-0-28 ~]# systemctl status /sys/devices/pci0000:00/0000:00:06.0/net/eth1
● sys-devices-pci0000:00-0000:00:06.0-net-eth1.device - Elastic Network Adapter (ENA)
   Loaded: loaded
   Active: active (plugged) since Fri 2022-08-12 18:20:35 UTC; 58s ago
   Device: /sys/devices/pci0000:00/0000:00:06.0/net/eth1
[root@ip-10-0-0-28 ~]# systemctl status -n0 ec2net-ifup@eth1.service
● ec2net-ifup@eth1.service - Enable elastic network interfaces eth1
   Loaded: loaded (/usr/lib/systemd/system/ec2net-ifup@.service; static; vendor preset: disabled)
   Active: inactive (dead) since Fri 2022-08-12 18:20:35 UTC; 1min 1s ago
  Process: 5820 ExecStop=/usr/sbin/ec2ifdown %i (code=exited, status=0/SUCCESS)
  Process: 5470 ExecStart=/usr/sbin/ec2ifup %i (code=exited, status=0/SUCCESS)
 Main PID: 5470 (code=exited, status=0/SUCCESS)
[root@ip-10-0-0-28 ~]# ip link show dev eth1
11: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 02:64:87:35:06:b5 brd ff:ff:ff:ff:ff:ff

The device is available according to the kernel, but systemd hasn't
activated the unit and thus the device is never configured.
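
The mismatch described above can be checked for mechanically by comparing
what systemctl is-active reports for the device unit and for the service
unit. This is a minimal sketch under that assumption; is_mismatch and
check_iface are hypothetical helper names, not something shipped with
this package.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original commit).

# Succeeds (returns 0) when the device unit is active but the service is not.
# Both arguments are state strings as printed by `systemctl is-active`.
is_mismatch() {
    device_state="$1"
    service_state="$2"
    [ "$device_state" = "active" ] && [ "$service_state" != "active" ]
}

# check_iface IFACE DEVICE_UNIT
# Queries systemd for both units and reports whether they disagree.
check_iface() {
    iface="$1"
    device_unit="$2"
    dev="$(systemctl is-active "$device_unit" 2>/dev/null)"
    svc="$(systemctl is-active "ec2net-ifup@$iface.service" 2>/dev/null)"
    if is_mismatch "$dev" "$svc"; then
        echo "mismatch: $device_unit is $dev but ec2net-ifup@$iface.service is $svc"
        return 1
    fi
    echo "ok: device=$dev service=$svc"
}
```

For the interface shown here, this could be invoked as
check_iface eth1 'sys-devices-pci0000:00-0000:00:06.0-net-eth1.device'.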

With this change, ec2net-ifup@eth1.service is reliably started in the
above repro sequence.
Noah Meyerhans authored and nmeyerhans committed Aug 17, 2022
1 parent 78cc047 commit 2a33b70
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion 53-ec2-network-interfaces.rules.systemd
@@ -4,5 +4,5 @@
 # for the specific language governing permissions and limitations under
 # the License.

-ACTION=="add", SUBSYSTEM=="net", KERNEL=="eth*", DEVPATH!="/devices/virtual/*", TAG+="systemd", ENV{SYSTEMD_WANTS}+="ec2net-ifup@$env{INTERFACE}"
+ACTION=="add", SUBSYSTEM=="net", KERNEL=="eth*", DEVPATH!="/devices/virtual/*", RUN+="/usr/bin/systemctl start --no-block ec2net-ifup@$env{INTERFACE}"
 ACTION=="remove", SUBSYSTEM=="net", KERNEL=="eth*", RUN+="/usr/sbin/ec2ifdown $env{INTERFACE}"
