Implement queueing for /search and /graphql queries #797
base: master
Final decision will be @teward's, but I doubt we're going to be able to run stuff on a second box. We're already heavy on hardware as it is.
Not as heavy as other things I run for clients. A separate box for workers shouldn't be too hard, but a read-only copy of the database would be, because we'd have to set up live DB replication, which is very intensive on the hardware, and we're very limited on SSD availability right now. We'll have to think on this, and on best practices. I'm hoping to get the chance to get a business loan via my LLC for hardware upgrades and purchasing, but need to talk to the banks first. So this remains a draft and 'deferred' for now until I get more time to focus and work on this.
To be clear, a read-only replica isn't a requirement by any means; these changes are still helpful without it. It would just be an additional level of protection against random searches/queries exploding the CPU.
Also, another note: if the backburner workers run on a separate box, we'll need to update the deploy script to deploy on that box as well.
Hesitantly marking as ready for review. With these UI upgrades it is something that could go into prod. Deploy details need to be worked out before we can merge this though, so I'm happy to start that conversation or take feedback on this functionality. Deploy thoughts: since it sounds like a read-only replica is a bit much, I'd like to run the beanstalkd server and a couple of backburner workers on a second box. This means we'll need to modify the deploy script to deploy to both boxes. The boxes will also need network access to each other so that Rails can find the beanstalkd server and the backburner workers can find the DB.
Pretty much what it sounds like: instead of sitting and waiting for a page to load, search and graphql requests will now be queued. Search is queued for everyone; graphql is only queued if you're not core or you don't have an api token.
Currently both endpoints work via JSON and HTML. Both will assign you a "job id", which you then poll until you get your results. Websockets may end up being easier, but have not yet been implemented. Other stuff on the TODO list: 1) rendering the parameters used for a search, 2) search pagination that doesn't require you to manually modify query params in the URL, 3) websockets (maybe, also maybe not).
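For anyone consuming the JSON side, here's a rough sketch of what the client-side polling loop could look like. This is not code from the PR: `poll_job`, `JobTimeout`, and the status hash shape (`:state`, `:results`) are all made up for illustration, and the actual HTTP call is left to the caller's block since the real response format may differ.

```ruby
# Hypothetical client-side polling helper. The names and the status hash
# shape ({ state: ..., results: ... }) are assumptions, not this PR's API.
class JobTimeout < StandardError; end

# Repeatedly yields job_id to the given block, which is expected to fetch
# and return the job's current status hash. Returns the results once the
# state is "complete", raises on "failed", and raises JobTimeout after
# max_attempts polls without a finished job.
def poll_job(job_id, interval: 1.0, max_attempts: 30, sleeper: ->(s) { sleep(s) })
  max_attempts.times do |attempt|
    status = yield(job_id)
    return status[:results] if status[:state] == "complete"
    raise "job #{job_id} failed" if status[:state] == "failed"

    # Wait before the next poll, but not after the final attempt.
    sleeper.call(interval) unless attempt == max_attempts - 1
  end
  raise JobTimeout, "job #{job_id} not finished after #{max_attempts} polls"
end
```

The block would wrap whatever request returns the job status, e.g. `poll_job(id) { |job_id| fetch_status(job_id) }`, where `fetch_status` is a placeholder for the real call. The `sleeper` argument only exists so the loop can be exercised without actually sleeping.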
This PR will have a semi-complex install process. As you can see from the Procfile changes, we now also need separate backburner workers and a beanstalkd queue running in order to complete work. I'd suggest that these run on a separate box to avoid taking down MS with expensive queries. It might also make sense to run a read-only replica of the MySQL database, again to ensure that a heavy search/graphql load can't affect the main application.