Fix goroutine leak on closing #259

ItalyPaleAle · 2022-06-03T19:57:42Z

It appears that these goroutines do not receive the context, so the scheduler keeps running even after shutdown.

richardpark-msft

We'd need to think about this a bit more. My concern is mostly about the scope of the cancellation.

In StartNonBlocking, the function call will have returned. If someone cancels the context after StartNonBlocking returns they'll end up cancelling their already started processor. That feels like unexpected new behavior. We do provide a way for them to do that properly (with Close()) so I think we'd want to stick with that.
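To make the concern concrete, here is a minimal sketch of a caller pattern that would be affected. It assumes a simplified model rather than the SDK's real types: `processor`, `startNonBlocking`, `setup` and `stoppedAfterSetup` are hypothetical names used only for illustration. The caller scopes a timeout context to the start call and defers `cancel`, which is common practice. If the background work is tied to that context, it stops as soon as `setup` returns.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// processor is a hypothetical stand-in for the event processor host.
type processor struct {
	stopped chan struct{} // closed when the background loop exits
}

// startNonBlocking returns immediately and ties the background loop to ctx,
// which is the behavior being debated in this thread.
func startNonBlocking(ctx context.Context) *processor {
	p := &processor{stopped: make(chan struct{})}
	go func() {
		defer close(p.stopped)
		<-ctx.Done() // the loop lives only as long as the caller's context
	}()
	return p
}

// setup mirrors a common caller pattern: a context scoped to the call,
// with cancel deferred so the context's resources are released.
func setup() *processor {
	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel() // fires as soon as setup returns
	return startNonBlocking(ctx)
}

// stoppedAfterSetup reports whether the background loop exited shortly
// after setup returned.
func stoppedAfterSetup() bool {
	p := setup()
	select {
	case <-p.stopped:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("processor stopped after setup returned:", stoppedAfterSetup())
}
```

In this model the processor is already gone by the time the caller tries to use it, which is the surprise described above.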

I think we could argue that Start() should work properly with a cancellation context, however. Looking at it some more, I think we'd need to do a bit more work to check both the cancellation context and the signalChan, but otherwise it could be fine. That said, I'm still worried that people who assumed they could cancel the context without affecting anything will be surprised, but maybe it wouldn't be so bad there.

@jhendrixMSFT, what do you think about this?

ItalyPaleAle · 2022-08-15T23:54:23Z

> If someone cancels the context after StartNonBlocking returns they'll end up cancelling their already started processor.

Got it, although that's how contexts are supposed to be used, as a way to cancel background processes.

I know that a "track2" of this SDK is in the works so perhaps spending too much time optimizing this isn't even worth it at this point.

richardpark-msft · 2022-08-16T17:09:39Z

> > If someone cancels the context after StartNonBlocking returns they'll end up cancelling their already started processor.
>
> Got it, although that's how contexts are supposed to be used, as a way to cancel background processes.

Yeah, it's a good point. I can see it both ways - do you expect to keep the context to still apply long past the call being "complete" though? It seems like it's supposed to be considered safe to just cancel it after the function call returns, but in this case you'd still end up cancelling the .Run().

> I know that a "track2" of this SDK is in the works so perhaps spending too much time optimizing this isn't even worth it at this point.

This is true, but your point is 100% valid, so I'd like to make sure that, if we have to do something similar, our design makes sense to other Go people.

richardpark-msft · 2022-08-16T17:17:26Z

@ItalyPaleAle, all of what I wrote above only applies to the non-blocking version of the call. The blocking version makes sense to me.

ItalyPaleAle · 2022-08-16T17:57:48Z

> do you expect to keep the context to still apply long past the call being "complete" though?

IMHO, yes. Usually, when a method accepts a context and runs in background, the context is the cancellation signal.

jhendrixMSFT · 2022-08-16T22:03:37Z

I'm also a bit worried that if we change the behavior now it will break somebody that depends on it.

Given that a track 2 replacement is on the way, I'd prefer that we just document this behavior instead. We'll improve the API for track 2. Does that work for you, @ItalyPaleAle?

ItalyPaleAle · 2022-08-16T22:10:10Z

Yes that makes sense.

In the meantime, is there a way to fix the goroutine leak without breaking changes? For example, adding another context that is started inside StartNonBlocking and stopped on Close?
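A rough sketch of that idea, under stated assumptions: `host`, its fields, `running` and `cancelThenClose` are hypothetical and do not reflect the SDK's actual implementation or the change that was eventually merged. The background goroutine is tied to an internal context created inside `StartNonBlocking`, so cancelling the caller's context after the call returns has no effect, while `Close` cancels the internal context and waits for the goroutine to exit.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// host is a hypothetical stand-in for the event processor host.
type host struct {
	cancel context.CancelFunc // cancels the internal context
	exited chan struct{}      // closed when the background goroutine exits
}

// StartNonBlocking ignores later cancellation of the caller's context,
// preserving the existing behavior, and ties the background work to an
// internal context instead.
func (h *host) StartNonBlocking(_ context.Context) {
	runCtx, cancel := context.WithCancel(context.Background())
	exited := make(chan struct{})
	h.cancel, h.exited = cancel, exited
	go func() {
		defer close(exited)
		<-runCtx.Done() // placeholder for the scheduler loop
	}()
}

// Close cancels the internal context and waits for the goroutine to exit.
func (h *host) Close() {
	if h.cancel == nil {
		return // never started
	}
	h.cancel()
	<-h.exited
}

// running reports whether the background goroutine is still alive.
func (h *host) running() bool {
	select {
	case <-h.exited:
		return false
	default:
		return true
	}
}

// cancelThenClose cancels the caller's context first, then calls Close,
// and reports whether the goroutine was running after each step.
func cancelThenClose() (afterCancel, afterClose bool) {
	ctx, cancel := context.WithCancel(context.Background())
	h := &host{}
	h.StartNonBlocking(ctx)
	cancel() // the caller's context is cancelled after the call returned
	time.Sleep(20 * time.Millisecond)
	afterCancel = h.running()
	h.Close()
	afterClose = h.running()
	return afterCancel, afterClose
}

func main() {
	afterCancel, afterClose := cancelThenClose()
	fmt.Println("running after caller cancel:", afterCancel)
	fmt.Println("running after Close:", afterClose)
}
```

The trade-off in this sketch is that `Close` becomes the only way to stop the background work, which is why documenting that `Close()` is required goes hand in hand with it.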

avoid goroutine leak when Close() is called

jhendrixMSFT · 2022-08-16T22:52:45Z

Agreed that should be fixed. I made some changes to your PR. Let me know if it works for you.

ItalyPaleAle · 2022-08-16T23:07:47Z

@jhendrixMSFT thanks for the changes! (To my limited knowledge) LGTM :)

jhendrixMSFT · 2022-08-17T14:34:07Z

If @richardpark-msft approves I'll get it merged and tagged.

richardpark-msft · 2022-08-17T20:22:33Z

eph/scheduler.go

@@ -83,6 +85,9 @@ func (s *scheduler) Run(ctx context.Context) {
 		case <-ctx.Done():
 			s.dlog(ctx, "shutting down scan")
 			return
+		case <-s.close:


In line 188, we actually call the s.done() function, which cancels the context we used to spin up this entire thing. So I think this s.close is not needed.
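The reasoning can be illustrated with a small sketch. It is a simplified model, not the code in eph/scheduler.go: the `scheduler` fields, `Stop` and `runStopsOnDone` are hypothetical, and the only detail taken from the thread is that a `done` function cancels the context that `Run` was started with. If that holds, the existing `ctx.Done()` case already ends the loop and no separate close channel is required.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// scheduler is a hypothetical, reduced model of the scheduler.
type scheduler struct {
	done   context.CancelFunc // cancels the context that Run was started with
	exited chan struct{}      // closed when Run returns
}

// Run loops until ctx is cancelled. There is no separate close channel.
func (s *scheduler) Run(ctx context.Context) {
	defer close(s.exited)
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return // cancellation via s.done() lands here
		case <-ticker.C:
			// a scan would happen here
		}
	}
}

// Stop cancels the context used by Run, as the comment above describes.
func (s *scheduler) Stop() {
	s.done()
}

// runStopsOnDone reports whether Run exits once Stop has called s.done().
func runStopsOnDone() bool {
	ctx, cancel := context.WithCancel(context.Background())
	s := &scheduler{done: cancel, exited: make(chan struct{})}
	go s.Run(ctx)
	s.Stop()
	select {
	case <-s.exited:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("Run exited after done():", runStopsOnDone())
}
```

This only holds if every shutdown path actually calls the cancel function. A path that skipped it would still leak the goroutine.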

ItalyPaleAle and others added 2 commits June 3, 2022 12:56

Fix goroutine leak on closing

8e87ca0

Merge branch 'master' into patch-1

bf5e6d2

richardpark-msft requested changes Aug 15, 2022

doc that calling Close() is required

02450e9

avoid goroutine leak when Close() is called

richardpark-msft reviewed Aug 17, 2022
