AzureStorage: purge corrupted queue messages. #1088

Open
jviau opened this issue May 10, 2024 · 2 comments
Labels
dt.azurestorage DurableTask.AzureStorage

Comments

@jviau
Collaborator

jviau commented May 10, 2024

We should evaluate updating these two code locations to delete corrupted messages (those that fail deserialization) from the queue. Deserialization failures are not expected to be transient, so no amount of retries or time delay will fix these messages, particularly because only framework types (and not user types) are being deserialized here.

Location 1:

messageData = await this.messageManager.DeserializeQueueMessageAsync(
    queueMessage,
    this.storageQueue.Name);

Location 2:

MessageData data = await this.messageManager.DeserializeQueueMessageAsync(
    queueMessage,
    this.storageQueue.Name);

@jviau jviau added the dt.azurestorage DurableTask.AzureStorage label May 10, 2024
@cgillum
Collaborator

cgillum commented May 10, 2024

The one caveat to this policy is that we've seen cases where changes to Newtonsoft.Json settings can cause unintended deserialization failures. This can happen as part of a rollout of a new version of an app, whether due to changes made by the user (though hopefully we've rooted all those possibilities out) or changes made by the DTFx maintainers. Either way, giving time for users to roll back the change, e.g. 24 hours, before permanently deleting their data, might be prudent.
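
To illustrate the grace-period idea, here is a minimal sketch of an age check that could gate the purge. `PoisonMessagePolicy` and `ShouldPurge` are hypothetical names, not part of DurableTask.AzureStorage, and the sketch assumes the caller passes in the insertion time that Azure Storage records for the queue message.

```csharp
using System;

// Hypothetical helper for illustration only; not an existing DurableTask.AzureStorage type.
public static class PoisonMessagePolicy
{
    // Returns true when a message that failed deserialization has been in the
    // queue for at least the grace period and may therefore be purged.
    public static bool ShouldPurge(DateTimeOffset? insertedOn, DateTimeOffset now, TimeSpan gracePeriod)
    {
        // Without an insertion time we cannot show the grace period has elapsed,
        // so err on the side of keeping the message.
        if (!insertedOn.HasValue)
        {
            return false;
        }

        return now - insertedOn.Value >= gracePeriod;
    }
}
```

With a 24-hour grace period, a message inserted one hour ago would be kept and a message inserted 25 hours ago would be eligible for deletion, which leaves a window to roll back a bad deployment.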

@jviau
Collaborator Author

jviau commented May 10, 2024

Yeah, this will need some design. It could be an opt-in setting? Or a callback, where the user gets the exception and returns true/false for purge?

Either way, the framework needs to take action here as it is not something users can self-mitigate (they will be fighting with the workers to dequeue and delete the message!)
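
One possible shape for the opt-in callback, as a rough sketch only: every type and member name below (`CorruptedMessageContext`, `CorruptedMessageHandler`, `HandleAsync`) is hypothetical, and the queue delete is stubbed as a delegate rather than a real storage call.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical types for illustration; none of these exist in DurableTask.AzureStorage.
public sealed class CorruptedMessageContext
{
    public CorruptedMessageContext(string queueName, string messageId, Exception error)
    {
        this.QueueName = queueName;
        this.MessageId = messageId;
        this.Error = error;
    }

    public string QueueName { get; }
    public string MessageId { get; }
    public Exception Error { get; }
}

public sealed class CorruptedMessageHandler
{
    private readonly Func<string, Task> deleteMessage;
    private readonly Func<CorruptedMessageContext, Task<bool>> shouldPurge;

    // deleteMessage stands in for the real queue delete; shouldPurge is the user's opt-in callback.
    public CorruptedMessageHandler(
        Func<string, Task> deleteMessage,
        Func<CorruptedMessageContext, Task<bool>> shouldPurge = null)
    {
        this.deleteMessage = deleteMessage ?? throw new ArgumentNullException(nameof(deleteMessage));
        this.shouldPurge = shouldPurge;
    }

    // Returns true if the message was purged, false if it was left in the queue.
    public async Task<bool> HandleAsync(CorruptedMessageContext context)
    {
        // No callback registered: the user has not opted in, so leave the message alone.
        if (this.shouldPurge == null)
        {
            return false;
        }

        // The user sees the exception and decides whether to purge.
        if (!await this.shouldPurge(context))
        {
            return false;
        }

        await this.deleteMessage(context.MessageId);
        return true;
    }
}
```

Because the delete runs inside the framework after the callback returns, users would not have to race the workers to dequeue and delete the message themselves.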
