Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: SurveyCanceller per survey #2337

Merged
merged 17 commits into from
May 17, 2024

Conversation

matiasgarciaisaia
Copy link
Member

@matiasgarciaisaia matiasgarciaisaia commented May 3, 2024

The old model of a Supervisor + a Canceller that only started Producers and Consumers was not only overly complex, but also buggy (it failed upon app startup - see #2147).

This refactor makes the Supervisor start a single Canceller process per survey that has to be cancelled.

This model should pave the way to turn the Canceller into a SurveyBroker that manages the survey's complete lifecycle in #2331.

Fixes #2147

Copy link
Contributor

@anaPerezGhiglia anaPerezGhiglia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from some small comments it looks good to me! ✨

The only "concern" that I have is that I tested locally, and cancelling a respondent was failing due to a data error (I have a particular state of db right now). The thing is that the failure would break the canceller process, so it would keep on starting the process and failing again and again.
The fail was generated by the SurveyCanceller.cancel_respondent method (particularly the Session.cancel() call)
Perhaps we should add a try-rescue statement to wrap the Session methods or the whole cancel_respondent method, so don't break the process and allow it to keep of cancelling other respondents

lib/ask/survey_respondents_canceller.ex Outdated Show resolved Hide resolved
lib/ask/survey_respondents_canceller.ex Outdated Show resolved Hide resolved
lib/ask/runtime/survey_canceller_supervisor.ex Outdated Show resolved Hide resolved
lib/ask/survey_respondents_canceller.ex Outdated Show resolved Hide resolved
:ignore ->
:ignore
end
def surveys_cancelling() do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method belongs to SurveyCancellerSupervisor - since is the only one using it and the SurveyCanceller its designed to only deal with one survey - IMHO it can be confusing to have this multi-survey method here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure about where to place this.

I hear what you say, but on the other hand I like the fact that the Supervisor doesn't know the DB or the Surveys at all - it asks what has to be done, and makes sure it gets done.

The fact that the Supervisor is difficult to test (evidenced by the simulate_survey_canceller_start() helper) tells there's room for improvement, but I don't think this specific issue is one of those.

lib/ask/survey_respondents_canceller.ex Outdated Show resolved Hide resolved
The Supervisor API expects a set of task descriptions to start, not the
actual tasks already started - as we were sending it.

This is still broken, since the consumers can't subscribe to the
producer if it hasn't started yet - and the app crashes.

See #2147
Instead of using consumers and producers, upon app startup we launch a
single SurveyCanceller process per each survey.

We still have to change the "online" behaviour (when we start cancelling
a survey while the app is already started).

See #2147
Instead of the old consumers and producers.

See #2147
Instead of having a producer and consumers, have a single actor per
Survey.

Still have to fix/refactor the canceller tests.

See #2147
Which may not necessarily be about SurveyCanceller, but this will do for
the time being.

See #2147
The exit_code and reason are already being set by the SurveyAction

See #2337
We've dropped the "respondents" part of the name long ago, and it's part
of the runtime actually.

See #2337
@matiasgarciaisaia matiasgarciaisaia force-pushed the fix/2147-survey-canceller-startup branch from 4c049d5 to c96c29b Compare May 9, 2024 18:18
Looks like the log entries are getting stored in different order in the
CI, which is probably not really an issue.

This commit changes how we check for the last disposition change in the
tests, so we avoid asserting on a "response" entry log instead.

See #2337
Doesn't really have to do with the current PR, but are failing anyway
If cancelling a respondent raises an error, we log the issue in Sentry
and continue cancelling the rest of them.

We keep the survey Cancelling so we wait for human intervention.

See #2337
@matiasgarciaisaia
Copy link
Member Author

The logic for retrying cancelling failing respondents could be improved, but maybe YAGNI so let's start like this first.

Copy link
Contributor

@anaPerezGhiglia anaPerezGhiglia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Niiiiiice ✨ 🙌 ✨

I left some 💅 nitpicking comments, but of course they are not mandatory

|> Repo.update!()
rescue
error ->
Logger.error(error, __STACKTRACE__, "Error cancelling respondent #{respondent.id}'s session")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💅 nitpick

"Error cancelling respondent #{respondent.id}'s session

the 's doesn't make sense there - respondent.id is a number

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to write "cancelling respondent's session", but wanted to specify their ID. So something like Respondent #25's session. Not sure how you write that.

lib/ask/runtime/survey_canceller.ex Outdated Show resolved Hide resolved
test/ask/runtime/survey_broker_test.exs Outdated Show resolved Hide resolved
test/ask/survey_canceller_test.exs Show resolved Hide resolved
matiasgarciaisaia and others added 3 commits May 13, 2024 16:16
The function effectively marks the survey as terminated - the
cancellation is the whole process of managing respondents + the survey.

See #2147
See #2337

Co-authored-by: Ana Pérez Ghiglia <[email protected]>
Since it's shared in at least two different test modules.

See #2147
See #2337
@matiasgarciaisaia matiasgarciaisaia merged commit 827cb10 into main May 17, 2024
2 checks passed
matiasgarciaisaia added a commit that referenced this pull request Jul 25, 2024
The tests were unnecessarily coupled to the Supervisor. We still have
lots of improvements to do, but this should make the tests pass.

See #2337
See #2147

Co-authored-by: Ana Pérez Ghiglia <[email protected]>
matiasgarciaisaia added a commit that referenced this pull request Jul 30, 2024
* Fix unrelated typing warning

Was laying there since #1842

* Refactor API preparing to receive more survey opts

Will need this to duplicate a survey.

See #2351

* Add endpoint to duplicate surveys

I still have to check if it works with already-running surveys, and
panel surveys and other configurations.

See #2351

* Expose survey duplicate feature from UI

See #2351

* Prevent duplicating Panel Surveys

See #2351

* Test duplication of running surveys

See #2351

* Test duplicate surveys don't have exit status

See #2351

* Fix: SurveyCanceller tests avoid the Supervisor

The tests were unnecessarily coupled to the Supervisor. We still have
lots of improvements to do, but this should make the tests pass.

See #2337
See #2147

Co-authored-by: Ana Pérez Ghiglia <[email protected]>

* Fix: add Flow types to new action

See #2351

* Don't over-engineer the double-click prevention

* Test copying standard quex

* Fix running survey duplication test

* Comment how quota_buckets_definitions work

* Extract common survey creation behaviour

See #2351
See #2354

---------

Co-authored-by: Ana Pérez Ghiglia <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SurveyCancellerSupervisor crashes during application init
2 participants