[4.19-9.6]: ostree.sync kola test fails on s390x #1720
Comments
This test is failing intermittently on s390x. Let's snooze it for now to unblock the pipeline while we investigate: openshift#1720
Also, for completeness, this test was recently added to coreos-assembler in: coreos/coreos-assembler#3998
I will take it. @marmijo, can you sync this to Jira and assign it to me? I don't have permission to do the sync. Thanks!
/jira
I think you're looking for: @HuijingHei, next time try that yourself too. If you don't have access, we should look into it.
@jlebon: The label(s) In response to this:
Hmm, I think we still have to wire it up for issues. Anyway, added it manually for now.
Run
I thought it was caused by low memory, so I increased the test VM's memory to 8G (default 2G), but no luck.
I wonder if the changes in act differently on s390x... Do we see anything like
in the journal or anything interesting around that output? In your console log I see
Not sure if that is the same sync we are calling.
No, it seems to hang in
Not sure whether it is blocked by dd, sd-sync, or zipl. In my understanding, if it reaches the timeout (DefaultTimeoutStopSec=10s), all processes are killed and the system restarts.
The fact that we set the NIC down is problematic because IIUC we don't get any more logs from the system after that happens. Can we just drop NFS port traffic instead?
diff --git a/mantle/kola/tests/ostree/sync.go b/mantle/kola/tests/ostree/sync.go
index 0fc36d6f6..077fb6392 100644
--- a/mantle/kola/tests/ostree/sync.go
+++ b/mantle/kola/tests/ostree/sync.go
@@ -217,7 +217,9 @@ storage:
func doSyncTest(c cluster.TestCluster, client platform.Machine) {
c.RunCmdSync(client, "sudo touch /var/tmp/data3/test")
- // Continue writing while doing test
+ // I wonder if this would be better if the script itself was just
+ // an infinite loop and gets run by a systemd unit that we can
+ // `systemctl start` here instead of running it in a go func()
go func() {
_, err := c.SSH(client, "sudo sh /usr/local/bin/nfs-random-write.sh")
if err != nil {
@@ -225,18 +227,11 @@ func doSyncTest(c cluster.TestCluster, client platform.Machine) {
}
}()
- // Create a stage deploy using kargs while writing
- c.RunCmdSync(client, "sudo rpm-ostree kargs --append=test=1")
+ // block NFS traffic
+ c.RunCmdSync(client, "sudo iptables <drop NFS port traffic>")
- netdevices := c.MustSSH(client, "ls /sys/class/net | grep -v lo")
- netdevice := string(netdevices)
- if netdevice == "" {
- c.Fatalf("failed to get net device")
- }
- c.Log("Set link down and rebooting.")
- // Skip the error check as it is expected
- cmd := fmt.Sprintf("sudo systemd-run sh -c 'ip link set %s down && sleep 2 && systemctl reboot'", netdevice)
- _, _ = c.SSH(client, cmd)
+ // Create a stage deploy using kargs while writing
+ c.RunCmdSync(client, "sudo systemd-run sh -c 'rpm-ostree kargs --append=test=1 --reboot'")
time.Sleep(5 * time.Second)
err := util.Retry(8, 10*time.Second, func() error {
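For illustration only, here is a minimal sketch of what the `<drop NFS port traffic>` placeholder and the "run the writer as a systemd unit" idea from the patch comment could expand to. It assumes NFSv4 over TCP on the standard port 2049 and that `iptables` is available on the test VM. The helper names `dropNFSCmd` and `writerUnitCmd` and the unit name `nfs-random-write` are hypothetical and not part of the proposed patch.

```go
package main

import (
	"fmt"
	"strings"
)

// nfsPort is the well-known NFS port. Assumption: the test mounts NFSv4
// over TCP, which only needs this single port.
const nfsPort = 2049

// dropNFSCmd builds a shell command that drops outbound TCP traffic to the
// given port. The exact rule is an assumption; the patch leaves it open.
func dropNFSCmd(port int) string {
	return fmt.Sprintf("sudo iptables -I OUTPUT -p tcp --dport %d -j DROP", port)
}

// writerUnitCmd builds a command that starts the write script as a transient
// systemd unit instead of holding an SSH session open in a goroutine.
// The unit name is hypothetical.
func writerUnitCmd(unit, script string) string {
	return fmt.Sprintf("sudo systemd-run --unit=%s sh %s", unit, script)
}

func main() {
	cmds := []string{
		writerUnitCmd("nfs-random-write", "/usr/local/bin/nfs-random-write.sh"),
		dropNFSCmd(nfsPort),
	}
	// Print the commands; in the kola test each would be passed to
	// c.RunCmdSync(client, cmd).
	fmt.Println(strings.Join(cmds, "\n"))
}
```

Because only NFS traffic is dropped, the SSH connection and journal forwarding should stay up, which is the point of the suggestion above. Whether the hang still reproduces this way on s390x would need to be checked.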
On s390x, the ostree.sync kola test is failing in the 4.19-9.6 stream with the following output. This test passed successfully in a recent run, but fails most of the time.
The journal log of one of the recent failed s390x build jobs shows:
journal.txt
console.txt