boot VMIs from the checkup's setup #217
Closed
RamLavi force-pushed the manually_boot_checup_VMIs branch from a045753 to 88e900a (January 21, 2024 09:42)
We had an offline discussion:
There is no need to pass the tuned-adm-set-marker file to the function. Remove it from the service and hard-code it in the service script on both VM image scripts. Signed-off-by: Ram Lavi <[email protected]>
Mounting the hugepages folder does not work well across multiple reboots. Move to mounting the hugepages folder via the /etc/fstab approach [0] on both VM images. [0] https://www.redhat.com/sysadmin/etc-fstab Signed-off-by: Ram Lavi <[email protected]>
The current approach to getting the VMI ready for the checkup is to run the tuned-adm + reboot commands before the guest-agent runs. The reason is that guest-agent readiness is the signal the checkup uses to know that the VMI has successfully booted. Rebooting before the guest-agent is ready is important to avoid a race where the checkup continues to the test before the tuned-adm kernel args are set (which requires a reboot). This approach is flawed. The scheduling mentioned above applies only to when the service starts, and does not ensure that the service finishes before the guest-agent service starts. This means that the reboot can actually be performed later in the systemd boot sequence, well after the guest-agent service has started. Setting the --force flag in the reboot command in order to hasten the reboot does not change this behavior. This commit removes the reboot from the dpdk-checkup-boot script on both VM images, in favor of a manual reboot done by the checkup itself, which will be introduced in later commits. Signed-off-by: Ram Lavi <[email protected]>
Currently the image adds a service that runs the commands needed to configure the guest for DPDK. This service is scheduled before the guest-agent service, but in reality it takes longer to finish, and by then the guest-agent service is already up and running. Since guest-agent readiness is the criterion for the VMI being booted and the checkup moving to the test execution phase, this service finishing after the guest-agent exposes the checkup to a race where it continues before the guest is properly configured. Hence, there is no point in running this service the way it does. This commit removes the dpdk-checkup-boot.service in favor of adding it to the cloud-init service in future commits. This has the benefit of simplifying the image build process. Signed-off-by: Ram Lavi <[email protected]>
This commit adds the cloud-init script to the VMs, mounts it, and runs it in a new runcmd section of the cloud-init section, using the guest-agent-ping probe as an example [0]. [0] https://kubevirt.io/user-guide/virtual_machines/liveness_and_readiness_probes/ Signed-off-by: Ram Lavi <[email protected]>
This commit adds a new configmap that will be consumed by the vmi-under-test. The unit test is generalized to cover any configmap that is created or deleted. Signed-off-by: Ram Lavi <[email protected]>
This commit adds the cloud-init script to the VMs, mounts it, and runs it in a new runcmd section of the cloud-init section, using the guest-agent-ping probe as an example [0]. [0] https://kubevirt.io/user-guide/virtual_machines/liveness_and_readiness_probes/ Signed-off-by: Ram Lavi <[email protected]>
This commit enables the guest-agent-exec option on the guest, in order to use the probe polling option that will be introduced in future commits. Signed-off-by: Ram Lavi <[email protected]>
Currently the setup function only waits for the VMI to boot, i.e. for the guest-agent condition to be ready. This commit moves the waitForVMIToBoot function to a new function setupVMIWaitReady, in preparation for the next commit, where more actions will be taken on the VMIs. Signed-off-by: Ram Lavi <[email protected]>
Currently the checkup setup only waits for the VMI to finish "booting", i.e. for the guest-agent service to be ready. However, this is not enough to ensure that the VMI has been configured, a procedure currently done by the cloud-init service. When the configuration is complete, the script configuring the guest in the cloud-init service adds a marker file. This commit: - introduces a new waiting mechanism, using a guest-agent-ping probe [0] to poll the guest until the file is present, and only then sets the VMI ready condition to true. - adds a wait for the VMI ready condition to be true. [0] https://kubevirt.io/user-guide/virtual_machines/liveness_and_readiness_probes/#defining-guest-agent-ping-probes Signed-off-by: Ram Lavi <[email protected]>
In order to allow soft reboot of a VirtualMachineInstance object during the checkup's setup phase - add the relevant Role object. Signed-off-by: Ram Lavi <[email protected]>
Signed-off-by: Ram Lavi <[email protected]>
Signed-off-by: Ram Lavi <[email protected]>
The guest-agent-ping poll waits until the marker file, which indicates that the VMI has been properly configured, is present before setting the VMI condition to ready. This commit adds a soft reboot of the VMI after it becomes ready. The reboot is necessary for the tuned-adm command run by the cloud-init service to take effect on the kernel args. Signed-off-by: Ram Lavi <[email protected]>
Currently the setup is performed on the VMIs in serial order. In order to reduce the wait time, run the waitForVMIReady in parallel. Signed-off-by: Ram Lavi <[email protected]>
RamLavi force-pushed the manually_boot_checup_VMIs branch from 88e900a to fdf2ded (January 24, 2024 12:45)
Passes e2e on a CNV4.15 cluster:
logs (verbose false as it has no value for this PR):
RamLavi changed the title from "Manually boot checkup VMIs" to "boot VMIs from the checkuo's setup" (Jan 24, 2024)
RamLavi changed the title from "boot VMIs from the checkuo's setup" to "boot VMIs from the checkup's setup" (Jan 24, 2024)
The checkup's current approach is to wait until the VMI is "booted", i.e. until the guest-agent service is ready. It relies on the assumption that the service running tuned-adm + reboot runs before the guest-agent service starts. This assumption, however, is incorrect.
To make sure that the tuned-adm commands have run before the setup declares the VMIs ready, this PR moves to a new approach: polling for the existence of the marker file, which is added only after tuned-adm is properly configured.
This PR introduces this new approach, using a guest-agent-ping probe to wait for the VMI to be ready.
Additionally, it changes the Setup so that the two VMIs are set up and polled in parallel.