Retry uploading build events more often #1935

Merged (3 commits) on Aug 16, 2023

@avdv avdv commented Aug 10, 2023

This change retries build event uploads up to 256 times on Windows (the default is 4).

This works around an issue (buildbuddy-io/buildbuddy#4467) where the build eventually fails because connections to the remote are closed intermittently.

Since we were seeing this mainly for PRs opened from forks (where the BuildBuddy API secret is not set), I deliberately opened this PR from my fork.
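
For illustration only, here is one way the retry count could be raised on Windows runners. This sketch assumes the count is controlled by Bazel's `--experimental_build_event_upload_max_retries` option (documented default: 4). The step name, the build target, and the choice to pass the flag on the command line are assumptions and are not taken from this PR's diff:

```yaml
steps:
  - name: Build (Windows)            # hypothetical step name
    if: runner.os == 'Windows'       # apply the higher retry count on Windows only
    shell: bash
    run: |
      # Assumed flag: raises the number of build event upload retries from 4 to 256.
      bazel build //... --experimental_build_event_upload_max_retries=256
```

The same option could equally live in a `.bazelrc` file; which place the actual change uses is not stated in this conversation.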


Still no dice. Two jobs fail with a timeout error:

ERROR: The Build Event Protocol upload timed out. com.google.common.util.concurrent.TimeoutFuture$TimeoutFutureException: Timed out: NonCancellationPropagatingFuture@57251f74[status=PENDING, info=[delegate=[SettableFuture@12279bf4[status=PENDING]]]]

@avdv avdv requested a review from aherrmann August 10, 2023 11:52
@avdv avdv changed the title from "Retry uploading events more often" to "Retry uploading build events more often" Aug 10, 2023
@avdv avdv force-pushed the windows-build-event-upload branch from b305d8e to bb1c470 Compare August 10, 2023 13:41
@aherrmann

Given

> Still no dice. Two jobs fail with a timeout error:

I guess this is something to still experiment a bit further on before a review is needed?

@avdv avdv force-pushed the windows-build-event-upload branch from bb1c470 to 53ea0d6 Compare August 11, 2023 15:02

avdv commented Aug 11, 2023

> I guess this is something to still experiment a bit further on before a review is needed?

Oh yes, I was too optimistic.


avdv commented Aug 14, 2023

I guess the third time's the charm: the one failing job succeeded on the third try.

At least these changes seem to improve the situation. @aherrmann could you have a look please?

@aherrmann aherrmann left a comment

Thank you!

find "$base" -mindepth 1 -maxdepth 1 -name "java*.log.*" -print0 | xargs -0rI % cp % logs/

- name: Upload Logs
  if: steps.collect_logs.conclusion == 'success'
Member

Just to be sure, does this mean this step is skipped if the condition for collect_logs (i.e. failure()) is false?

Member Author

Yes:
[image]
But I still want to test whether this step also runs when the job has failed, which may not be the case. I'll check that on a different branch and adjust here if it doesn't.
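
The question in this thread turns on how GitHub Actions evaluates `if:` conditions. A step that is skipped because its own condition is false gets the conclusion `skipped`, so a later comparison against `'success'` is false. Separately, GitHub applies an implicit `success()` check to any `if:` expression that contains no status check function, which is why the upload step might not run after a failed job unless such a function is added. A minimal sketch of the pattern follows. The build command, the contents of the `collect_logs` step, and the use of `actions/upload-artifact@v3` are assumptions for illustration and are not taken from this PR:

```yaml
steps:
  - name: Build
    run: bazel build //...           # hypothetical build command

  - name: Collect Logs
    id: collect_logs
    # Runs only when an earlier step in the job has failed.
    if: failure()
    shell: bash
    run: |
      mkdir -p logs
      # "$base" is assumed to be set earlier to the directory holding the java*.log.* files.
      find "$base" -mindepth 1 -maxdepth 1 -name "java*.log.*" -print0 | xargs -0rI % cp % logs/

  - name: Upload Logs
    # always() replaces the implicit success() check, so this condition is
    # still evaluated after a failure. If collect_logs was skipped, its
    # conclusion is 'skipped' and this step is skipped as well.
    if: always() && steps.collect_logs.conclusion == 'success'
    uses: actions/upload-artifact@v3  # assumed action and version
    with:
      name: logs
      path: logs/
```

Using `failure() && steps.collect_logs.conclusion == 'success'` instead of `always()` would be an alternative; it narrows the upload to failed runs only.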

@avdv avdv force-pushed the windows-build-event-upload branch from 0a0e64b to 9ec413a Compare August 15, 2023 07:09
@avdv avdv added the merge-queue ("merge on green CI") label Aug 15, 2023
@avdv avdv force-pushed the windows-build-event-upload branch from 9ec413a to 13312bf Compare August 16, 2023 06:43
@mergify mergify bot merged commit 347ccfe into tweag:master Aug 16, 2023
28 checks passed
@mergify mergify bot removed the merge-queue ("merge on green CI") label Aug 16, 2023
@avdv avdv deleted the windows-build-event-upload branch August 17, 2023 09:14
2 participants