Incorrect CSV cell format errors crash entire upload #361

Closed
mbarton opened this issue Nov 8, 2024 · 1 comment · Fixed by #362

Comments

@mbarton
Member

mbarton commented Nov 8, 2024

Edit dummy_sheet.csv to have two rows of data. Make the following edits:

  • Column: If treatment included insulin pump therapy (i.e. option 3 or 6 selected), was this part of a closed loop system?
  • First row: 1
  • Second row: TRUE

Upload the file. You should get a validation error, but instead the upload crashes:

django-1   |     | Traceback (most recent call last):
django-1   |     |   File "/usr/local/lib/python3.12/site-packages/django/db/models/fields/__init__.py", line 2140, in to_python
django-1   |     |     return int(value)
django-1   |     |            ^^^^^^^^^^
django-1   |     | ValueError: invalid literal for int() with base 10: 'TRUE'
django-1   |     | 
django-1   |     | During handling of the above exception, another exception occurred:
django-1   |     | 
django-1   |     | Traceback (most recent call last):
django-1   |     |   File "/app/project/npda/general_functions/csv_upload.py", line 252, in validate_rows
django-1   |     |     visits = rows.apply(
django-1   |     |              ^^^^^^^^^^^
django-1   |     |   File "/usr/local/lib/python3.12/site-packages/pandas/core/frame.py", line 10374, in apply
django-1   |     |     return op.apply().__finalize__(self, method="apply")
django-1   |     |            ^^^^^^^^^^
django-1   |     |   File "/usr/local/lib/python3.12/site-packages/pandas/core/apply.py", line 916, in apply
django-1   |     |     return self.apply_standard()
django-1   |     |            ^^^^^^^^^^^^^^^^^^^^^
django-1   |     |   File "/usr/local/lib/python3.12/site-packages/pandas/core/apply.py", line 1063, in apply_standard
django-1   |     |     results, res_index = self.apply_series_generator()
django-1   |     |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django-1   |     |   File "/usr/local/lib/python3.12/site-packages/pandas/core/apply.py", line 1081, in apply_series_generator
django-1   |     |     results[i] = self.func(v, *self.args, **self.kwargs)
django-1   |     |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django-1   |     |   File "/app/project/npda/general_functions/csv_upload.py", line 253, in <lambda>
django-1   |     |     lambda row: (validate_visit_using_form(patient_form.instance, row), row["row_index"]),
django-1   |     |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django-1   |     |   File "/app/project/npda/general_functions/csv_upload.py", line 196, in validate_visit_using_form
django-1   |     |     fields = row_to_dict(
django-1   |     |              ^^^^^^^^^^^^
django-1   |     |   File "/app/project/npda/general_functions/csv_upload.py", line 150, in row_to_dict
django-1   |     |     model_field: csv_value_to_model_value(
django-1   |     |                  ^^^^^^^^^^^^^^^^^^^^^^^^^
django-1   |     |   File "/app/project/npda/general_functions/csv_upload.py", line 144, in csv_value_to_model_value
django-1   |     |     return model_field.to_python(value)
django-1   |     |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django-1   |     |   File "/usr/local/lib/python3.12/site-packages/django/db/models/fields/__init__.py", line 2142, in to_python
django-1   |     |     raise exceptions.ValidationError(
django-1   |     | django.core.exceptions.ValidationError: ['“TRUE” value must be an integer.']
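
One way to get a validation error rather than a crash is to catch the conversion failure for each cell and record it against the row. The sketch below is a minimal, standard-library illustration, not the project's code: `convert_row`, `validate_rows` and the `closed_loop` column name are hypothetical, and plain `int` stands in for `model_field.to_python`. In the real code path the exception to catch would presumably be the `django.core.exceptions.ValidationError` shown in the traceback.

```python
import csv
import io


def convert_row(row, converters):
    """Convert each cell, collecting errors instead of raising.

    `converters` maps a column name to a callable (hypothetical stand-in
    for a model field's to_python). Columns without a converter are kept
    as strings.
    """
    values, errors = {}, {}
    for column, raw in row.items():
        convert = converters.get(column, str)
        try:
            values[column] = convert(raw)
        except (TypeError, ValueError) as exc:
            # Record the problem against the column and carry on,
            # so one bad cell does not abort the whole upload.
            errors[column] = str(exc)
    return values, errors


def validate_rows(csv_text, converters):
    """Return a list of (row_index, values, errors), one entry per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (index, *convert_row(row, converters))
        for index, row in enumerate(reader)
    ]


# Two rows, as in the reproduction steps: a valid "1" and an invalid "TRUE".
results = validate_rows("closed_loop\n1\nTRUE\n", {"closed_loop": int})
for index, values, errors in results:
    print(index, values, errors)
```

With this shape, the first row converts cleanly and the second carries an error entry for the offending column, which the caller can turn into a per-row validation message.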

@eatyourpeas
Member

In the PR above, dtype is now set when the dataframe is created. I have not done this yet, as I wanted to discuss it here first, but it would be possible to raise an exception at this early stage if the dtype assignment fails, as it does in this situation. The crash was occurring because the new CSV generation function was randomising booleans for the closed loop measure, rather than choices.

If the CSV contains the wrong types (e.g. True/False in this case, rather than the expected int associated with the choice), should we just reject the whole CSV, perhaps with a useful message asking the user to fix it and try again?
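
As a rough sketch of the "reject the whole CSV with a useful message" option, the check below scans one column before any further processing and builds a single message listing every offending cell. It uses the standard-library `csv` module rather than pandas dtypes to stay self-contained. The function names and the `closed_loop` column name are hypothetical, and treating empty cells as invalid is an assumption that may not match the real rules.

```python
import csv
import io


def find_non_integer_cells(csv_text, column):
    """Return (row_number, raw_value) for each cell that int() cannot parse.

    Row numbers count data rows from 1. Empty cells are reported too,
    which is an assumption made for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    for row_number, row in enumerate(reader, start=1):
        raw = row[column]
        try:
            int(raw)
        except (TypeError, ValueError):
            problems.append((row_number, raw))
    return problems


def rejection_message(column, problems):
    """Build one user-facing message, or return None if the column is clean."""
    if not problems:
        return None
    details = "; ".join(f"row {n}: {raw!r}" for n, raw in problems)
    return (
        f"Column {column!r} must contain integer choices. "
        f"Please fix and try again ({details})."
    )


problems = find_non_integer_cells("closed_loop\n1\nTRUE\n", "closed_loop")
print(rejection_message("closed_loop", problems))
```

Rejecting up front keeps the failure in one place and reports every bad cell at once, at the cost of refusing rows that would otherwise have been valid.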

2 participants