CSV files download should be async #2350

Open
manumoreira opened this issue Jul 2, 2024 · 2 comments

Comments

@manumoreira
Contributor

manumoreira commented Jul 2, 2024

In recent years, CSV downloads have caused several problems for users. On medium to large surveys, they frequently end up crashing the server.
One solution is to decouple file creation from download.
This implies a change in the user interaction.
First the user clicks to create the file; the file is generated in the background, and once it is ready the user sees a button to download it.
We will need mockups for this.
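
To make the proposed flow concrete, here is a minimal sketch of the "request, generate in the background, then download" idea. It is only an illustration under assumptions: the class `CsvExportJob`, its status names, and the use of a thread are all hypothetical and are not taken from the project's code.

```python
import csv
import threading
from pathlib import Path


class CsvExportJob:
    """Hypothetical job that builds a CSV file off the request path."""

    def __init__(self, rows, header, out_dir):
        self.rows = rows
        self.header = header
        self.path = Path(out_dir) / "export.csv"
        # Assumed lifecycle: pending -> generating -> ready (or failed)
        self.status = "pending"
        self._thread = None

    def start(self):
        """Return immediately; the file is written on a worker thread."""
        self.status = "generating"
        self._thread = threading.Thread(target=self._generate)
        self._thread.start()

    def _generate(self):
        try:
            # Write to a temporary name so a half-written file is never served.
            tmp = self.path.with_suffix(".tmp")
            with tmp.open("w", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(self.header)
                writer.writerows(self.rows)
            tmp.replace(self.path)  # publish only once the file is complete
            self.status = "ready"
        except Exception:
            self.status = "failed"

    def wait(self, timeout=None):
        """Block until generation finishes (handy for tests and scripts)."""
        self._thread.join(timeout)

    def download(self):
        """Serve the file only after generation has finished."""
        if self.status != "ready":
            raise RuntimeError("file is not ready yet")
        return self.path.read_bytes()
```

In this sketch the UI would call `start()` when the user clicks "create", poll `status` to decide whether to show a spinner or the download button, and only then call `download()`.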

Image

(source)

Questions:

  • Would it be useful to have an option to trigger file creation for all of the survey's files at once?
@matiasgarciaisaia
Member

Let's note that the file which usually gives issues is the Interactions file (I don't remember having had issues with any other CSV file - but please correct me, @manumoreira).

But it may make a lot of sense to share the same approach between different files.

We could also create the task (to generate the files) and then notify the requester via email.

@manumoreira
Contributor Author

In some cases we've seen issues with the results file in large surveys.
I'd prefer not to add an email service for this, just to keep it simple.
A circular progress bar might be enough.

matiasgarciaisaia pushed a commit that referenced this issue Aug 1, 2024
* Preload channels from survey.respondent_groups

* fix respondent_controller_tests

* preload all respondent_groups channels in the same query
matiasgarciaisaia added a commit that referenced this issue Aug 1, 2024
Respondent files are usually large (Interactions files can grow up to 1M
rows), and the "low" limit in queries made the DB work much more than
needed (we've observed 99% CPU usage in the mysqld process when
generating a 1M-rows interactions file with 1000 rows per query).

Increasing this limit makes the app generate fewer queries to the DB,
effectively driving the CPU usage down to about 30% instead.

There's probably more room for improvement (the generation of the file
is still CPU-bound instead of network-bound), but that's on the app
itself - we should profile the app's code to further improve the
performance.

See #2350
See #2359

Co-authored-by: Gustavo Giráldez <[email protected]>
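
As a rough illustration of why the per-query limit matters, the sketch below pages through a table in batches and counts how many queries are issued. It is an assumption-laden toy: it uses SQLite and keyset paging on `id`, while the app talks to MySQL and its actual paging strategy is not described in the commit message. The table name `interactions` and the function `export_rows` are hypothetical.

```python
import sqlite3


def export_rows(conn, batch_size):
    """Read every row in id order, `batch_size` rows per query.

    Returns (rows_read, queries_issued).
    """
    last_id, rows_read, queries = 0, 0, 0
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM interactions "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        queries += 1
        rows_read += len(rows)
        if len(rows) < batch_size:
            # A short (or empty) batch means the table is exhausted.
            return rows_read, queries
        last_id = rows[-1][0]


def build_sample_db(total_rows):
    """Create an in-memory table with `total_rows` dummy interactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE interactions (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO interactions (id, payload) VALUES (?, ?)",
        ((i, f"row {i}") for i in range(1, total_rows + 1)),
    )
    return conn
```

The number of round trips is roughly the row count divided by the batch size, so a 10x larger limit means about 10x fewer queries for the same file.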
matiasgarciaisaia added a commit that referenced this issue Aug 12, 2024
matiasgarciaisaia added a commit that referenced this issue Aug 13, 2024
matiasgarciaisaia added a commit that referenced this issue Aug 15, 2024
matiasgarciaisaia added a commit that referenced this issue Aug 15, 2024
matiasgarciaisaia added a commit that referenced this issue Aug 15, 2024
We should un-skip them by the end of the PR.

See #2350
matiasgarciaisaia added a commit that referenced this issue Sep 24, 2024
The info about the generated files is still pending.

See #2350
matiasgarciaisaia added a commit that referenced this issue Oct 22, 2024
matiasgarciaisaia added a commit that referenced this issue Oct 22, 2024
This will allow us to check if there already is a file generated or not.

Also, we move the decision of whether to regenerate a file or not to the
user (instead of checking if we should generate the file again or not).

See #2350
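
A minimal sketch of what "check if there already is a file generated" could look like, assuming the generated file lives on disk at a known path. The names `FileStatus` and `file_status` are hypothetical; the commit itself does not show how the status is stored or exposed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class FileStatus:
    exists: bool
    created_at: Optional[datetime]  # None when no file has been generated


def file_status(path: Path) -> FileStatus:
    """Report whether a generated file exists and when it was last written."""
    if not path.exists():
        return FileStatus(exists=False, created_at=None)
    mtime = path.stat().st_mtime
    return FileStatus(
        exists=True,
        created_at=datetime.fromtimestamp(mtime, tz=timezone.utc),
    )
```

With a status like this, the backend only reports what exists, and the UI can show the file's age and let the user choose whether to regenerate it.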
matiasgarciaisaia added a commit that referenced this issue Oct 22, 2024
From the UI, request that the CSV files are generated by the backend.

We are still missing the check for whether the files exist or are
currently being generated.

See #2350
matiasgarciaisaia added a commit that referenced this issue Oct 22, 2024
matiasgarciaisaia added a commit that referenced this issue Nov 5, 2024
Small changes, nothing too relevant.

See #2350
matiasgarciaisaia added a commit that referenced this issue Nov 5, 2024
There's probably still an issue with react-timeago not being properly
ignored yet.

See #2350
matiasgarciaisaia added a commit that referenced this issue Nov 6, 2024
There are no definitions available.

See #2350
matiasgarciaisaia added a commit that referenced this issue Nov 6, 2024
Thanks, eslint!

See #2350
Labels
None yet
Projects
Status: In progress
Development

No branches or pull requests

2 participants