
Testing farm job canceled before configured timeout #209

Open
mcattamoredhat opened this issue Jul 19, 2024 · 8 comments · Fixed by #212 or #226
Assignees
Labels
type: bug Something isn't working

Comments

@mcattamoredhat

Type of issue

Bug Report

Description

We have seen several testing-farm jobs in our downstream CI canceled after 6h 0m, although the configured timeout default value is 480m in the action inputs.

The error log message doesn't provide any details, just the message Request was canceled on user request.

This is an example of the issue: https://github.com/virt-s1/rhel-edge/actions/runs/9963311207/job/27529080681. The edge-rhel-94-x86 job is using the default timeout value of 480m.
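For context, the timeout is passed to the action via its `timeout` input in the calling workflow. The step below is a hedged sketch, not the actual workflow from the failing run: the job name, secret name, and repository URL are assumptions; only the `timeout` input (with its documented 480-minute default) comes from the action's README.

```yaml
jobs:
  edge-test:
    runs-on: ubuntu-latest
    steps:
      - name: Run tests on Testing Farm
        uses: sclorg/testing-farm-as-github-action@v3
        with:
          api_key: ${{ secrets.TF_API_KEY }}        # assumed secret name
          git_url: https://github.com/virt-s1/rhel-edge
          # Timeout for the Testing Farm request, in minutes.
          # 480 is also the documented default.
          timeout: 480
```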

The API request output is available at https://api.testing-farm.io/v0.1/requests/ee761663-f05f-43c2-84d9-673545b0f037

pipeline.log shows some tests failing:

| RHEL-9.4.0-Nightly:x86_64:/tmt/plans/edge-test/edge-x86-simplified-installer | ERROR | guest-setup.pre-artifact-installation | guest setup | https://artifacts.osci.redhat.com/testing-farm/ee761663-f05f-43c2-84d9-673545b0f037/guest-setup-e58d3804-fbd3-4214-aff4-7e12debd843d/guest-setup-output-pre-artifact-installation.txt |
| RHEL-9.4.0-Nightly:x86_64:/tmt/plans/edge-test/edge-x86-simplified-installer | ERROR | guest-setup.post-artifact-installation | guest setup | https://artifacts.osci.redhat.com/testing-farm/ee761663-f05f-43c2-84d9-673545b0f037/guest-setup-e58d3804-fbd3-4214-aff4-7e12debd843d/guest-setup-output-post-artifact-installation.txt |

Nevertheless, the guest pre/post installation logs don't show any failing playbook tasks.

Could you please provide some help?

Reproducer

No response

@github-actions github-actions bot added the type: bug Something isn't working label Jul 19, 2024
@jamacku jamacku self-assigned this Jul 19, 2024
@jamacku
Member

jamacku commented Jul 19, 2024

This is very weird. @mcattamoredhat, could you please reproduce the issue with debug logging enabled?

And I agree the current log message could be better. I'll try to extend it with more information.

@jamacku
Member

jamacku commented Jul 23, 2024

So, this is a limitation of GitHub-hosted runners. From GitHub doc:

Job execution time - Each job in a workflow can run for up to 6 hours of execution time. If a job reaches this limit, the job is terminated and fails to complete.

Also, see this Discussion: https://github.com/orgs/community/discussions/25700#discussioncomment-3248791
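To illustrate the GitHub-side cap: a job on a hosted runner is terminated after 6 hours regardless of what the action's own `timeout` input says. The fragment below is a sketch (the job name is assumed); `timeout-minutes` is the real workflow key, but on GitHub-hosted runners it can only lower the limit, never raise it past 360 minutes.

```yaml
jobs:
  edge-test:
    runs-on: ubuntu-latest
    # GitHub-hosted runners enforce a hard 6-hour (360-minute) cap per job.
    # Setting timeout-minutes higher than 360 has no effect on hosted runners.
    timeout-minutes: 360
```

So a Testing Farm `timeout` of 480 minutes can never be reached on a hosted runner; the runner kills the job first.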

@jamacku jamacku added the type: bug Something isn't working label Jul 23, 2024
@jamacku
Member

jamacku commented Jul 23, 2024

We can check if the execution time is greater than the timeout input and only then cancel the TF request.
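A minimal sketch of that idea, with hypothetical names (this is not the action's actual code): only treat the stop as a user-requested cancellation when the configured timeout input has genuinely elapsed; otherwise the job was killed by the runner's own limit and the Testing Farm request should be reported differently.

```typescript
// Hypothetical sketch of the proposed check; not the action's real implementation.
// Cancel the Testing Farm request only if the configured timeout has elapsed;
// otherwise assume the GitHub runner killed the job at its own limit.
function shouldCancelRequest(elapsedMinutes: number, timeoutMinutes: number): boolean {
  return elapsedMinutes >= timeoutMinutes;
}

// With the default 480-minute timeout, a job stopped at the 6-hour
// (360-minute) runner cap should NOT look like a user cancellation:
console.log(shouldCancelRequest(360, 480)); // false
console.log(shouldCancelRequest(480, 480)); // true
```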

@mcattamoredhat
Author

Hi @jamacku, although I've switched to sclorg/testing-farm-as-github-action v3.1.0, I still see this issue in a few tests, such as https://github.com/virt-s1/rhel-edge/actions/runs/10553424096 (iot-f39-x86).
Is there something I am missing? Could you please provide some guidance? Thanks!

@jamacku
Member

jamacku commented Aug 27, 2024

@mcattamoredhat, I may have missed something. I'll have a look. It should work without any additional configuration from your side.

@jamacku jamacku reopened this Aug 27, 2024
@jamacku
Member

jamacku commented Aug 30, 2024

The problem might be that the job ran for 5h 59min 56s and was then killed by the runner, while we are expecting the full 6h.

I'll adjust the value.
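The adjustment could be sketched like this (all names and the margin value are assumptions, not the action's real constants): since hosted runners stop jobs a few seconds short of the nominal 6-hour cap, the runner-limit check needs a small safety margin so a run of 5h 59min 56s is still recognized as runner-killed.

```typescript
// Hypothetical sketch: recognize jobs killed slightly before the nominal
// 6-hour runner cap (5h 59min 56s was observed in this issue).
const RUNNER_LIMIT_MINUTES = 360;
const SAFETY_MARGIN_MINUTES = 1; // assumed margin, not the action's real value

function killedByRunner(elapsedMinutes: number): boolean {
  return elapsedMinutes >= RUNNER_LIMIT_MINUTES - SAFETY_MARGIN_MINUTES;
}

console.log(killedByRunner(359.93)); // true  (5h 59min 56s ≈ 359.93 min)
console.log(killedByRunner(300));    // false
```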

@mcattamoredhat
Author

Hi @jamacku, our CI has detected some PRs failing due to this issue, even though we have already updated our workflows to use sclorg/[email protected].
Examples of this can be found at edge-rhel-95-x86 and iot-rawhide-x86.
Could you check this and possibly reopen the issue? Am I missing something?

@jamacku
Member

jamacku commented Oct 7, 2024

Hmm, there might still be some bug on our side.

@jamacku jamacku reopened this Oct 7, 2024
This was referenced Oct 9, 2024