Revert "Update telemetry role to deploy Kepler" #777

Merged

Conversation

gibizer
Contributor

@gibizer gibizer commented Oct 8, 2024

This reverts commit 0859e4e.

Related-Issue: OSPCIX-502
Related-Issue: OSPCIX-503

@openshift-ci openshift-ci bot requested review from abays and bshephar October 8, 2024 15:59
@openshift-ci openshift-ci bot added the approved label Oct 8, 2024
@gibizer
Contributor Author

gibizer commented Oct 8, 2024

/hold We are trying to find the cause of the VM boot slowness that has been happening since 09.24. Kepler was enabled around that time, so this revert is a trial.

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3eca5f4d8e874360b752ea9dfb27b7a4

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 05m 29s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 14m 35s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 28m 04s
edpm-ansible-tempest-multinode FAILURE in 1h 44m 47s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 5m 58s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 4m 49s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 02s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 7m 14s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 8m 13s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 8m 57s
✔️ edpm-ansible-molecule-edpm_frr SUCCESS in 5m 59s
✔️ edpm-ansible-molecule-edpm_iscsid SUCCESS in 4m 01s
✔️ edpm-ansible-molecule-edpm_ovn_bgp_agent SUCCESS in 6m 59s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 04s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 3m 36s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 6m 08s

@gibizer
Contributor Author

gibizer commented Oct 9, 2024

According to our data, this PR proves that the VM boot slowdown is caused by the Kepler integration. See the last comment in https://issues.redhat.com/browse/OSPCIX-503 for the details.

@jlarriba
Contributor

jlarriba commented Oct 9, 2024

recheck

Contributor

openshift-ci bot commented Oct 9, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bshephar, gibizer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karelyatin
Contributor

/lgtm

@bogdando
Contributor

bogdando commented Oct 9, 2024

/lgtm

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/b7641da9eba34129aed96dc78cd5fd30

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 30m 00s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 15m 54s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 18m 07s
edpm-ansible-tempest-multinode FAILURE in 2h 18m 21s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 7m 04s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 5m 52s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 48s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 9m 07s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 9m 33s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 10m 52s
✔️ edpm-ansible-molecule-edpm_frr SUCCESS in 7m 42s
✔️ edpm-ansible-molecule-edpm_iscsid SUCCESS in 5m 00s
✔️ edpm-ansible-molecule-edpm_ovn_bgp_agent SUCCESS in 7m 41s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 12m 20s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 17s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 5m 49s

@gibizer
Contributor Author

gibizer commented Oct 9, 2024

recheck
Tempest never started because the pod does not fit on the node:

  Warning  FailedScheduling  3s (x37 over 68m)  default-scheduler  0/1 nodes are available: 1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
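
For anyone triaging similar jobs, here is a minimal Python sketch that pulls the node counts and rejection reasons out of a FailedScheduling message like the one above. The function name `parse_failed_scheduling` is hypothetical, and the message layout is assumed only from the single event quoted here; other scheduler versions or multi-reason messages may differ.

```python
import re


def parse_failed_scheduling(message: str) -> dict:
    """Extract node counts and per-reason counts from a FailedScheduling message.

    Assumption: the message looks like
    "0/1 nodes are available: 1 Insufficient memory. preemption: ..."
    as in the event quoted in this thread.
    """
    # Only the filtering part matters here; drop the "preemption:" section.
    filter_part = message.split("preemption:", 1)[0].strip()
    match = re.match(r"(\d+)/(\d+) nodes are available: (.*)", filter_part)
    if match is None:
        raise ValueError("unrecognized FailedScheduling message")
    available, total, reasons_text = match.groups()

    reasons = {}
    # Reasons are assumed to be comma-separated, each prefixed by a node count.
    for item in reasons_text.rstrip(".").split(", "):
        count, _, reason = item.strip().partition(" ")
        reasons[reason] = int(count)

    return {"available": int(available), "total": int(total), "reasons": reasons}


event = (
    "0/1 nodes are available: 1 Insufficient memory. "
    "preemption: 0/1 nodes are available: "
    "1 No preemption victims found for incoming pod."
)
result = parse_failed_scheduling(event)
print(result)
```

Here the single CRC node is rejected for memory, which matches the reading that the tempest pod cannot fit.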

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/0a0acbeac80844ba97452b9070a9c247

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 29m 17s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 16m 21s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 29m 09s
edpm-ansible-tempest-multinode FAILURE in 2h 17m 33s
✔️ edpm-ansible-molecule-edpm_bootstrap SUCCESS in 7m 03s
✔️ edpm-ansible-molecule-edpm_podman SUCCESS in 5m 52s
✔️ edpm-ansible-molecule-edpm_module_load SUCCESS in 4m 39s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 9m 36s
✔️ edpm-ansible-molecule-edpm_libvirt SUCCESS in 10m 03s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 10m 15s
✔️ edpm-ansible-molecule-edpm_frr SUCCESS in 7m 02s
✔️ edpm-ansible-molecule-edpm_iscsid SUCCESS in 4m 28s
✔️ edpm-ansible-molecule-edpm_ovn_bgp_agent SUCCESS in 7m 52s
✔️ edpm-ansible-molecule-edpm_ovs SUCCESS in 11m 23s
✔️ edpm-ansible-molecule-edpm_tripleo_cleanup SUCCESS in 4m 03s
✔️ edpm-ansible-molecule-edpm_tuned SUCCESS in 5m 51s

@gibizer
Contributor Author

gibizer commented Oct 9, 2024

The tempest pod is still unschedulable:

  Warning  FailedScheduling  1s (x33 over 68m)  default-scheduler  0/1 nodes are available: 1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

https://softwarefactory-project.io/zuul/t/rdoproject.org/build/b8035c77e45141de8c4630729aca062f/log/controller/ci-framework-data/logs/openstack-k8s-operators-openstack-must-gather/namespaces/openstack/pods/tempest-tests-28kfd/tempest-tests-28kfd-describe

Either the available RAM in CRC has decreased, or the memory consumption of some part of our control plane has increased, so tempest no longer fits.
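
The two explanations above can be illustrated with a small Python sketch of the fit check. The scheduler compares the pod's memory request with the node's allocatable memory minus the requests of pods already placed there, so either a smaller node or larger requests can produce the same "Insufficient memory" event. The function `pod_fits` and every number below are hypothetical and are not taken from the CI job.

```python
def pod_fits(allocatable_mib: int, existing_requests_mib: list[int], new_pod_mib: int) -> bool:
    """Return True if the new pod's memory request fits on the node.

    Simplified model: only memory requests are considered, not actual usage,
    limits, or other resources.
    """
    free_mib = allocatable_mib - sum(existing_requests_mib)
    return new_pod_mib <= free_mib


# Hypothetical baseline: the tempest pod fits.
baseline = pod_fits(24576, [8192, 8192, 2048], 4096)

# Hypothesis 1: the node has less allocatable memory than before.
smaller_node = pod_fits(20480, [8192, 8192, 2048], 4096)

# Hypothesis 2: control plane requests grew while the node stayed the same.
bigger_control_plane = pod_fits(24576, [8192, 8192, 6144], 4096)

# A third possibility under the same model: the incoming pod's own request grew.
bigger_pod = pod_fits(24576, [8192, 8192, 2048], 8192)

print(baseline, smaller_node, bigger_control_plane, bigger_pod)
```

Comparing the node's allocatable memory and the per-pod requests between a passing and a failing run would show which of these cases applies.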

@karelyatin
Contributor

The failure in edpm-ansible-tempest-multinode is a real one. It was triggered by openstack-k8s-operators/test-operator#222, which doubled the memory requirement to 8 GB. That is too much for CI. I don't know why tempest needs that much memory and CPU.

@gibizer
Contributor Author

gibizer commented Oct 9, 2024

The failure in edpm-ansible-tempest-multinode is a real one. It was triggered by openstack-k8s-operators/test-operator#222, which doubled the memory requirement to 8 GB. That is too much for CI. I don't know why tempest needs that much memory and CPU.

We are discussing a way forward over Slack...

@karelyatin
Contributor

recheck test-operator revert landed

@karelyatin
Contributor

/unhold

@openshift-merge-bot openshift-merge-bot bot merged commit 0d98b8c into openstack-k8s-operators:main Oct 10, 2024
35 checks passed
gibizer pushed a commit to gibizer/edpm-ansible that referenced this pull request Oct 10, 2024
…pler

Revert "Update telemetry role to deploy Kepler"
karelyatin added a commit to karelyatin/data-plane-adoption that referenced this pull request Oct 11, 2024
This reverts commit 8da971c
as the root cause is known and reverted[1].

[1] openstack-k8s-operators/edpm-ansible#777

Related-Issue: OSPCIX-503