
Allow overriding scheduled jobs #109

Closed

Conversation

@pconstantinou (Contributor) commented Dec 21, 2023

Here's a draft proposal for supporting overriding scheduled jobs.

This includes adding optional parameters when scheduling jobs and a TestSuite implementation that allows the development of tests that run against all the backends.

@acaloiaro (Owner) left a comment

Thanks for the proof of concept, Philip!

If you're interested in getting this done, like you suggested, it will need to support Redis as well.

The asynq library provides some functionality that will be of help here: https://pkg.go.dev/github.com/SQYY/Asynq#Inspector.CancelProcessing

With that said, I'm afraid that to implement this feature more elegantly, we need a breaking API change to Enqueue.

Ideally, since Override only changes the enqueue behavior, it is not so much a job-level property as a behavior change to the system. In my mind, the concept is more of a "Job Option", i.e., an option passed to the system at enqueue time to change behavior, which is not necessarily a property of the job itself.

I do actually have some appetite to make a breaking API change to the Neoq interface.

If you're interested in sticking with this PR/feature, I'd like to work it into a breaking API change.
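For illustration, here is a minimal sketch of that "Job Option" idea using Go's functional-options pattern; EnqueueOption, WithOverride, and this Enqueue signature are assumed names for the sketch, not neoq's actual API:

```go
package neoq

import "context"

// Job is a stand-in for neoq's job type; only the fields needed for the
// sketch are shown.
type Job struct {
	Queue   string
	Payload map[string]any
}

// enqueueConfig collects behavior changes requested at enqueue time.
type enqueueConfig struct {
	override bool
}

// EnqueueOption mutates the enqueue-time configuration.
type EnqueueOption func(*enqueueConfig)

// WithOverride asks the backend to replace any pending job with the same
// fingerprint instead of rejecting it as a duplicate.
func WithOverride() EnqueueOption {
	return func(c *enqueueConfig) { c.override = true }
}

// Enqueue accepts variadic options, so callers that pass no options keep
// compiling and behave exactly as before.
func Enqueue(ctx context.Context, job *Job, opts ...EnqueueOption) (jobID string, err error) {
	cfg := enqueueConfig{}
	for _, opt := range opts {
		opt(&cfg)
	}
	// ... hand job and cfg.override to the selected backend ...
	return "", nil
}
```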

Review comments (outdated, resolved): backends/memory/memory_backend.go, backends/postgres/postgres_backend.go
@pconstantinou (Contributor, Author) commented

> Thanks for the proof of concept, Philip!
>
> If you're interested in getting this done, like you suggested, it will need to support Redis as well.
>
> The asynq library provides some functionality that will be of help here: https://pkg.go.dev/github.com/SQYY/Asynq#Inspector.CancelProcessing

I'll take a look at this once we nail down the API.

> With that said, I'm afraid that to implement this feature more elegantly, we need a breaking API change to Enqueue.
>
> Ideally, since Override only changes the enqueue behavior, it is not so much a job-level property as a behavior change to the system. In my mind, the concept is more of a "Job Option", i.e., an option passed to the system at enqueue time to change behavior, which is not necessarily a property of the job itself.
>
> I do actually have some appetite to make a breaking API change to the Neoq interface.
>
> If you're interested in sticking with this PR/feature, I'd like to work it into a breaking API change.

Cool, that makes sense. I updated the proposal to use options like you've done with the queue instantiation. This has the upside of not breaking the API for existing code.
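Continuing the hypothetical sketch from the earlier comment (same assumed names, not the PR's actual API), call sites could look like this; the zero-option form matches existing code, so nothing breaks:

```go
// Same hypothetical package as the earlier sketch: the zero-option call is
// what existing code already does, while override is an explicit opt-in.
func enqueueExamples(ctx context.Context, payload map[string]any) error {
	// Existing callers: no options, behavior unchanged.
	if _, err := Enqueue(ctx, &Job{Queue: "emails", Payload: payload}); err != nil {
		return err
	}

	// New callers opt in to override semantics explicitly.
	_, err := Enqueue(ctx, &Job{Queue: "emails", Payload: payload}, WithOverride())
	return err
}
```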

@pconstantinou (Contributor, Author) commented

@acaloiaro I'm getting in a little over my head on this. As I started working on unit tests, I realized that the cached jobs that are already in memory need to be purged and reloaded. The unit tests are failing because the overwritten job is still run.
I'm not sure how we feel about my schema change, ALTER TABLE neoq_jobs ADD CONSTRAINT neoq_jobs_fingerprint_constraint_idx UNIQUE (fingerprint, status, ran_at); it seems like it should give you what you want, but I see from the changelog that you've played around with this before.

It seems like the memory backend suffers from the same problem as the postgres backend. Even though the Store is updated, there's a channel that needs to be reset.

Let me know what you think about this effort. If it's helpful, I can keep working on it, but I've got a big learning curve on the code base. If the approach doesn't align with the direction you want to go, I can wait.

@acaloiaro (Owner) commented Dec 28, 2023 via email

@acaloiaro (Owner) commented

> @acaloiaro I'm getting in a little over my head on this. As I started working on unit tests, I realized that the cached jobs that are already in memory need to be purged and reloaded. The unit tests are failing because the overwritten job is still run.

Like the postgres implementation, the memory backend will need the jobs atomically replaced with new job metadata. Do you want to continue attempting that work, or would you rather not? If not, we'll have to hold off on merging any changes, since one of the project's goals is not to forgo feature parity where parity is possible.
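For illustration, a minimal sketch (not the PR's code) of what atomically replacing a pending job in an in-memory store might look like; the type and field names here are assumptions:

```go
package memory

import "sync"

// pendingJob is a stand-in for the memory backend's queued-job record.
type pendingJob struct {
	Fingerprint string
	Payload     map[string]any
	RunAfter    int64 // unix timestamp; illustrative only
}

// store keeps pending jobs keyed by fingerprint so an override can replace
// the old record inside one critical section.
type store struct {
	mu   sync.Mutex
	jobs map[string]*pendingJob
}

// upsert atomically replaces any pending job with the same fingerprint, so a
// worker can never observe both the old and the new job at once.
func (s *store) upsert(j *pendingJob) (replaced bool) {
	s.mu.Lock()
	defer s.mu.Unlock()

	_, replaced = s.jobs[j.Fingerprint]
	s.jobs[j.Fingerprint] = j
	return replaced
}
```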

> I'm not sure how we feel about my schema change, ALTER TABLE neoq_jobs ADD CONSTRAINT neoq_jobs_fingerprint_constraint_idx UNIQUE (fingerprint, status, ran_at); it seems like it should give you what you want, but I see from the changelog that you've played around with this before.

I believe this doesn't fully achieve job uniqueness, where "uniqueness" is defined as preventing any two unprocessed jobs with the same payload from being queued at the same time. The constraint proposed in this PR relaxes that definition and allows jobs with identical payloads in both the error state and the new state, since ran_at gets set whether a job fails (the error state) or succeeds. I'm not entirely sure that this shouldn't become neoq's uniqueness behavior, though. Let's keep your change as-is for now and I'll give it more thought.

> It seems like the memory backend suffers from the same problem as the postgres backend. Even though the Store is updated, there's a channel that needs to be reset.

If you have a reproduction of the behavior described here, I'd like to take a look at it.

> Let me know what you think about this effort. If it's helpful, I can keep working on it, but I've got a big learning curve on the code base. If the approach doesn't align with the direction you want to go, I can wait.

The work is helpful and I appreciate the effort. Feel free to keep moving forward. I'm happy to jump in anywhere you feel out of your depth, and even happy to pair synchronously if you like.

Cheers

@pconstantinou (Contributor, Author) commented

@acaloiaro thanks for the suggestions. I've finished a complete draft implementation. When I run the postgres and memory tests one at a time, they work correctly, but I can't get the make test to pass. I'm not clear how that test framework works.

I think it may be beneficial to have a base class for tests, so you can write one test that works on all the backends with the same expected results. I got started on it; if you like the approach, I can do a bit more and perhaps use testify, which may make the tests cleaner.
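For illustration, a minimal sketch of that shared-test idea (not the PR's actual TestSuite): one table-driven test run against every backend through a common interface, using testify; Backend and the factory names are placeholders:

```go
package neoq_test

import (
	"testing"

	"github.com/stretchr/testify/require"
)

// Backend is a placeholder for whatever interface the real backends share; it
// is not neoq's actual type.
type Backend interface {
	Enqueue(payload map[string]any) error
}

// backendFactories would be populated with the real constructors (memory,
// postgres, redis); they are commented out here because they are hypothetical.
var backendFactories = map[string]func(t *testing.T) Backend{
	// "memory":   newMemoryBackend,
	// "postgres": newPostgresBackend,
}

// TestEnqueueAllBackends runs the same assertions against every registered
// backend, so behavior stays in parity across implementations.
func TestEnqueueAllBackends(t *testing.T) {
	for name, newBackend := range backendFactories {
		t.Run(name, func(t *testing.T) {
			b := newBackend(t)
			require.NoError(t, b.Enqueue(map[string]any{"hello": "world"}),
				"%s backend should accept a basic job", name)
			// ... enqueue an override and assert only the overriding job runs ...
		})
	}
}
```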

I'm also unsure if I need to do more cleanup when I overwrite the job. It seems like futureJobs may need to be purged.

I coded up a solution for redis that should work based on the documentation, but I haven't been able to test it.
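For illustration, a minimal sketch of that Redis path (untested, like the PR's), assuming the upstream asynq module at github.com/hibiken/asynq; the Redis address and the way the task ID is obtained are assumptions:

```go
package redisbackend

import "github.com/hibiken/asynq"

// cancelBeforeOverride asks any worker currently processing the task to stop,
// so the overriding job can be enqueued without the old one racing it.
func cancelBeforeOverride(taskID string) error {
	inspector := asynq.NewInspector(asynq.RedisClientOpt{Addr: "127.0.0.1:6379"})
	defer inspector.Close()

	// CancelProcessing signals the handler processing taskID to stop; it does
	// not delete the task, so the caller may also want to remove or archive it.
	return inspector.CancelProcessing(taskID)
}
```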

I'm happy to do some pairing on it and discuss how to do the unique constraints correctly.

No pressure to merge the code until it all works correctly.

@acaloiaro (Owner) commented Jan 12, 2024 via email

@acaloiaro (Owner) commented

FYI @pconstantinou I'm just waiting for a passing test suite before re-reviewing code.
