Occasional crasher when importing CSVs #119
Comments
Hi Shacker, I found that a similar error was reported on Stack Overflow a few years back. If this doesn't help, I can check with my data scientist friends. https://stackoverflow.com/questions/5552555/unicodedecodeerror-invalid-continuation-byte
One of the things I like about shacker/django-todo is that it doesn’t crash. Without being able to check the uploaded files, I’d guess one of two things is happening: they’re passing a file path instead of a file object, or the file isn’t UTF-8 encoded. Should we use a chardet-style function to check that the file's encoding is one of the approved types before importing, and show a "this file might need to be converted to utf-8" warning if it isn't?
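A minimal sketch of the pre-import check suggested above, using only the standard library rather than chardet. The function name `utf8_warning` and the sample CSV text are hypothetical and not part of django-todo; the idea is simply to try a strict UTF-8 decode of the uploaded bytes and turn a failure into a warning instead of a crash.

```python
from typing import Optional


def utf8_warning(raw: bytes) -> Optional[str]:
    """Return a warning message if raw is not valid UTF-8, else None.

    Hypothetical helper for illustration; not part of django-todo.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the byte offset where decoding failed;
        # exc.reason is the short explanation from the codec.
        return (
            "This file might need to be converted to UTF-8 "
            f"(problem near byte {exc.start}: {exc.reason})."
        )
    return None


# Hypothetical sample rows: the same text encoded two different ways.
sample = "Title,Note\nLunch,café au lait\n"
print(utf8_warning(sample.encode("utf-8")))    # valid UTF-8, no warning
print(utf8_warning(sample.encode("latin-1")))  # Latin-1 bytes, warning text
```

A check like this only says whether the bytes are valid UTF-8. It does not identify the actual encoding, which is where a detection library such as chardet would come in.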
@datatalking Good theory - could be the file encoding. Maybe I (or one of us) just needs to intentionally save a CSV file with some other encoding and see what happens. We shouldn't need to warn though - if that turns out to be the problem, we could wrap the file opener so that it opens "as" UTF-8. Do you know of a good way to save a CSV as non-UTF-8 that you could test with? (or provide one and I can test it?)
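One possible way to test this theory, sketched under assumptions: encode some CSV text as Latin-1, confirm that a strict UTF-8 decode fails, and then parse it through a decoder that falls back to other encodings. The helper `read_rows`, the candidate encoding list, and the sample data are all hypothetical; this is not how django-todo currently reads uploads.

```python
import csv
import io

# Hypothetical sample data. "é" is a single byte (0xE9) in Latin-1,
# which is not valid UTF-8 when followed by an ASCII character.
SAMPLE = "Title,Note\nLunch,café au lait\n"


def read_rows(raw: bytes, encodings=("utf-8-sig", "cp1252", "latin-1")):
    """Decode raw CSV bytes with the first encoding that works, then parse.

    Hypothetical helper for illustration; the encoding order is an assumption.
    """
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        # Only reachable if the candidate list omits a catch-all like latin-1.
        raise ValueError("could not decode CSV with any candidate encoding")
    return list(csv.DictReader(io.StringIO(text)))


latin1_bytes = SAMPLE.encode("latin-1")

# Try to reproduce the reported failure: Latin-1 bytes decoded as UTF-8.
try:
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    print("reproduced:", exc)

# The fallback decoder parses the same bytes without raising.
print(read_rows(latin1_bytes))
```

Writing `latin1_bytes` to disk with `open(path, "wb")` would give a non-UTF-8 CSV file to upload by hand. One caveat with the fallback approach: Latin-1 accepts any byte sequence, so a wrong guess produces garbled text rather than an error.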
@shacker, I run into file encoding issues with CSV files a lot. I have written a couple of small routines I use routinely to fix this. I will post them here later for you.
Here's the function I wrote:
and how I use it:
It has basically solved all the encoding issues I've encountered with diverse CSV files.
@bernd-wechner Awesome, thanks a bunch! Do you by chance have a CSV that can crash django-todo on import? If so, can you share the file?
Alas no, not I. No need for CSV import yet. I do have some PRs open for you though fixing stuff that I did need ;-)
Slightly awkward - I am the project's main author and filing this bug because I need help. I see occasional crash reports when people import CSVs into the demo site. The tracebacks don't tell me anything useful beyond:
Exception Value: 'utf-8' codec can't decode bytes in position 15-16: invalid continuation byte
I don't have access to the uploaded files because they're InMemory files. I've tried everything I can think of to reproduce the problem but just can't make it crash. If you uploaded a CSV and got it to crash, can you provide details in this thread? Thanks.