If the reconciliation loop fails the charm can be in blocked idle status without giving the operator a way to resolve that #143

Open
mthaddon opened this issue Oct 12, 2023 · 7 comments

@mthaddon
Contributor

Bug Description

As the title says, if the reconciliation loop fails, the charm can end up in blocked/idle status without giving the operator a way to resolve it.

If we're putting a charm into blocked status, we should give the operator a way to resolve it. This could be running an action, changing configuration, or something else.

To Reproduce

It's hard to give exact instructions, as this depends on a race condition within the reconcile loop. We've seen it when the `wp theme list --format=json` command intermittently returns exit code 255.

Environment

N/A

Relevant log output

N/A

Additional context

No response

@weiiwang01
Collaborator

The charm displays the status line `wordpress-k8s/0 blocked idle 10.0.0.1 Failed to list addons`, which tells the user why it's blocked. In this instance, the message is `Failed to list addons`.

It's difficult for the charm to pinpoint the exact reason for the failure, as it only knows that a command exited with a non-zero code. The user has to check the logs to determine the actual cause.

Perhaps we should include a prompt such as "check logs using juju debug-log" in the error message.
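
For illustration, here is a minimal sketch of that suggestion. It is not the charm's actual code: `BlockedStatus` below is a stand-in dataclass so the snippet runs without the `ops` library (a real charm would use `ops.BlockedStatus`), and `LOG_HINT` and `blocked_with_hint` are hypothetical names.

```python
from dataclasses import dataclass

# Hypothetical hint text; the wording follows the suggestion above.
LOG_HINT = "check logs using juju debug-log"


@dataclass(frozen=True)
class BlockedStatus:
    """Stand-in for ops.BlockedStatus so the sketch runs without ops."""

    message: str


def blocked_with_hint(reason: str) -> BlockedStatus:
    """Build a blocked status whose message tells the operator where to look."""
    return BlockedStatus(f"{reason} ({LOG_HINT})")


status = blocked_with_hint("Failed to list addons")
print(status.message)
```

This only changes what the operator reads in `juju status`; it does not make the blocked state resolvable on its own.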

@mthaddon
Contributor Author

Maybe we should just let the charm error out in this case. If we're not sure what action users should take to fix things, that seems like a better approach. We might also want to consider stopping Apache, so the unit doesn't serve content that doesn't match the configuration the user has specified.

This may need some discussion to determine the right approach.

@weiiwang01
Collaborator

Yes, I think we should let the charm fail if the problem isn't due to user error. Terminating the Apache server on failing units is definitely a good idea too. I will submit a pull request for this, thanks for the idea!
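
A rough sketch of what that policy could look like, under stated assumptions: `UserConfigError`, `reconcile_once`, and the three callables are hypothetical names for illustration, not the charm's real API. The idea is that operator-fixable failures lead to blocked status with a message, while anything else stops Apache and re-raises, so the hook fails and Juju puts the unit into error state.

```python
from typing import Callable


class UserConfigError(Exception):
    """Hypothetical: a failure the operator can fix, e.g. invalid config."""


def reconcile_once(
    reconcile: Callable[[], None],
    stop_apache: Callable[[], None],
    set_blocked: Callable[[str], None],
) -> None:
    """Run one reconciliation pass and apply the policy discussed above."""
    try:
        reconcile()
    except UserConfigError as exc:
        # Operator-fixable: go to blocked and say what needs changing.
        set_blocked(str(exc))
    except Exception:
        # Unknown cause: stop serving, then re-raise so the hook fails
        # and the unit ends up in error state instead of blocked.
        stop_apache()
        raise


# Usage with dummy callables that record what happened.
events: list[str] = []


def flaky_reconcile() -> None:
    raise RuntimeError("wp theme list exited with 255")


try:
    reconcile_once(flaky_reconcile, lambda: events.append("apache stopped"), events.append)
except RuntimeError:
    events.append("hook failed")

print(events)
```

Keeping the decision in one place makes it easier to audit which failures are treated as the operator's to fix.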

@cbartz
Contributor

cbartz commented Nov 5, 2024

@weiiwang01 can you follow up on this and close the issue, please?

@weiiwang01
Collaborator

> @weiiwang01 can you follow up on this and close the issue, please?

Yes, I will follow up on this, thanks!

weiiwang01 reopened this Nov 8, 2024

@amandahla
Contributor

@weiiwang01 I saw that you closed and then re-opened the issue. Should we keep it open?

@weiiwang01
Collaborator

> @weiiwang01 I saw that you closed and then re-opened the issue. Should we keep it open?

Yes, I accidentally closed this.
