Ignoring PYTHONIOENCODING #1277

Closed
alvaro-osvaldo-tm opened this issue Feb 13, 2025 · 1 comment · Fixed by #1278

Comments

@alvaro-osvaldo-tm
Contributor

Issue


I think the implementation of the PYTHONIOENCODING environment variable is incomplete.

If this variable is set, all Python input should be expected in that encoding, yet the --encoding parameter still needs to be defined.

This seems redundant.

For example, both commands below should be equivalent and produce the same output.

$ PYTHONIOENCODING=utf-16 csvstat ./examples/test_utf16_little.csv
$ csvstat --encoding utf-16 ./examples/test_utf16_little.csv

But only the second works; the first responds with:

# Your file is not "utf-8-sig" encoded. Please specify the correct encoding with the -e flag or with the PYTHONIOENCODING environment variable. Use the -v flag to see the complete error.

I understand the phrase "with the -e flag or with the PYTHONIOENCODING environment variable" to mean that the environment variable can replace the '--encoding' parameter.

Expected Behavior

The 'PYTHONIOENCODING' environment variable should replace the '--encoding' parameter.

Actual Behavior

It seems the 'PYTHONIOENCODING' environment variable has no effect.

Steps

Run a 'csvstat' command on a non-UTF-8 input file with PYTHONIOENCODING defined; it will fail.

$ PYTHONIOENCODING=utf-16 csvstat ./examples/test_utf16_little.csv
# Your file is not "utf-8-sig" encoded. Please specify the correct encoding with the -e flag or with the PYTHONIOENCODING environment variable. Use the -v flag to see the complete error.
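
The message above comes down to a decoding mismatch. The following is a minimal sketch in plain Python that does not call csvkit; the sample bytes and the `try_decode` helper are made up for illustration. It shows that UTF-16 data cannot be read with the 'utf-8-sig' default but decodes cleanly once the right encoding is supplied.

```python
# Hypothetical sample: a tiny CSV encoded as UTF-16 (Python prepends a BOM).
sample = "a,b\n1,2\n".encode("utf-16")


def try_decode(data, encoding):
    """Illustrative helper: return the decoded text, or None if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# The UTF-16 BOM bytes (0xFF, 0xFE) are never valid in UTF-8, so this fails.
with_default = try_decode(sample, "utf-8-sig")

# With the matching encoding, the text round-trips and the BOM is stripped.
with_utf16 = try_decode(sample, "utf-16")
```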

Input Data

  • Found in CSVKit source code: ./examples/test_utf16_little.csv

Stack trace

Versions

  • csvstat: 2.0.1
  • Python: 3.12.8
  • Operating System: Linux Debian 12.9

Additional information

  • Installed from source code with 'git clone' and 'pip install -e'
@jpmckinney
Member

jpmckinney commented Feb 13, 2025

This behavior changed in #1038

With #1278 PYTHONIOENCODING will take priority over the default of 'utf-8-sig'

However, --encoding will take priority over PYTHONIOENCODING. This is the typical priority order, e.g. https://docs.gunicorn.org/en/latest/configure.html

in order of least to most authoritative:

  1. Environment Variables
    ...
  2. Command Line
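
The priority order described in this comment can be sketched roughly as follows. This is not csvkit's actual code: `resolve_encoding` and its parameters are hypothetical names used only to illustrate "command line over environment variable over default". It also assumes the documented `encodingname:errorhandler` form of PYTHONIOENCODING, keeping only the encoding name.

```python
import os

# Default taken from the error message quoted in this issue.
DEFAULT_ENCODING = "utf-8-sig"


def resolve_encoding(cli_encoding=None, environ=None):
    """Hypothetical helper: pick an input encoding by precedence.

    Order, from most to least authoritative:
    command-line value, PYTHONIOENCODING, built-in default.
    """
    env = os.environ if environ is None else environ

    # 1. An explicit command-line value always wins.
    if cli_encoding:
        return cli_encoding

    # 2. Otherwise fall back to the environment variable, if set.
    env_value = env.get("PYTHONIOENCODING")
    if env_value:
        # The variable may look like "utf-16:replace"; keep only the name.
        name = env_value.split(":", 1)[0]
        if name:
            return name

    # 3. Nothing specified: use the default.
    return DEFAULT_ENCODING
```

Under this sketch, `resolve_encoding(None, {"PYTHONIOENCODING": "utf-16"})` returns "utf-16", while passing both a command-line value and the variable returns the command-line value.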
