-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow user to delete an instance while it is still spawning #334
Conversation
I was able to get the instance list in a weird state
I saw that there were two calls to
After writing all of that and refreshing the page, the dashboard looks like it's in its normal state, so maybe it's a non issue. I'm going to leave that in because it may be useful for issue #213. |
Good find! I had certainly not seen that behavior, but I was indeed able to reproduce it... Something wonky happening here with the state... I'll need to take a closer look tomorrow morning. Can maybe answer some of the oddities above:
|
Oops you're right about the preflight call-so that stuff is probably perfectly normal. I think it might be some sort of race condition. We spawn two asynch jobs for creating instances here. When we delete instances here, we spawn a new job to do the cleanup and add it to the celery job queue. I think when destroyRecord is called on the instance, it ends up calling @Xarthisius may have a better idea of what's going on; if it is a race condition we could fix it by letting the user know we'll delete the instance once it's done spawning. Or we should make sure we kill any instance creation jobs before we attempt to run the Edit: Sidenote: Is that an extra slash in the first error? It may not matter. |
Interesting... I obviously didn't alter the spawning logic here, but I also hadn't noticed the extra slash. In any case, it seems innocuous since the request still makes it to the correct endpoint: $ curl https://girder.dev.wholetale.org/api/v1//instance/5c17d4ca2b0fa10001d8d395 -H "Girder-Token: <valid-token-scraped-from-browser-cookies>"
{"_id": "5c17d4ca2b0fa10001d8d395", "access": {"groups": [], "users": [{"flags": [], "id": "5bdb63506141bc0001cc1b55", "level": 2}]}, "created": "2018-12-17T16:54:34.880000+00:00", "creatorId": "5bdb63506141bc0001cc1b55", "iframe": true, "lastActivity": "2018-12-17T16:54:34.880000+00:00", "name": "LIGO Tutorial", "status": 0, "taleId": "59f0b91584b7920001b46f2e", "updated": "2018-12-17T16:54:42.947000+00:00"} Regarding the Jobs, I still need to dig into the code but I am just realizing that I've actually never seen the Job statuses portion on our Girder dev instance, since I am apparently not yet an admin there... would @Xarthisius or some other kind soul be willing to add me? 😅 Is the "Jobs" view in Girder one that I should be visiting and/or validating regularly? Or is this just used as supplemental information when debugging? EDIT: Strange/unrelated thing that I just noticed, Girder seems to pick up headers, but not cookies? I thought this was accepted both ways, but apparently not:
|
I just made you a site admin. However, the job view doesn't require elevated privileges: https://girder.dev.wholetale.org/#jobs/user/5bdb63506141bc0001cc1b55 ^^ should show you your jobs. If anything goes wrong, it will probably show up there. Answering @ThomasThelen question: yeah, currently it's impossible to cleanly shutdown tasks related to instance spawning (as they depend on each other, are not wrapped in `cancelableTask, etc). We'll need to work on handling it more gracefully on the backend side. |
EDIT: Scratch that, I just found the "My jobs" link on the account dropdown at the top-right... I can't believe that I have literally never seen this before :o |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Per 1/17 discussion with @Xarthisius @bodom0015, the main issue here is that we are missing the event handler for volume creation on the backend and not properly setting the error status. If error status was propagated correctly, then this PR is unnecessary. Sorry Mike. |
Per 3/20 discussion with @bodom0015 and @Xarthisius we're leaving this open until we resolve the underlying problem. |
Problem
Fixes #330
When a user spawns an instance, very rarely a job can fail and the instance spins indefinitely. In this case, it would be helpful to be able to delete the faulty instance.
Approach
We currently hide the delete button from tales in a
LAUNCHING
state, but the API does not prevent operating on these objects. If we always show the delete button, the user can reset their state without needing to contact support.Test Case
/browse