Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback to raw string value, when json field inside geojson isn't decodable #1470

Open
wants to merge 1 commit into
base: maint-1.10
Choose a base branch
from

Conversation

jbylina
Copy link

@jbylina jbylina commented Dec 3, 2024

Expected behavior and actual behavior.

I expected to successfully read the following geojson file by Fiona and leave the handling of mixed types geojson properties to lib user.

{
"type": "FeatureCollection",
"features": [
{"type": "Feature", "properties": { "a": [ ]}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } },
{"type": "Feature", "properties": { "a": ""}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } }
]
}

Actual behavior:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "fiona/ogrext.pyx", line 1996, in fiona.ogrext.Iterator.__next__
  File "fiona/ogrext.pyx", line 668, in fiona.ogrext.FeatureBuilder.build
  File "fiona/ogrext.pyx", line 390, in fiona.ogrext.JSONField.get
  File "/Users/jacek/.pyenv/versions/3.12.7-debug/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jacek/.pyenv/versions/3.12.7-debug/lib/python3.12/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jacek/.pyenv/versions/3.12.7-debug/lib/python3.12/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Steps to reproduce the problem.

Run the following script using the latest Fiona.

from pathlib import Path
import fiona

Path("test.geojson").write_text("""\
{
"type": "FeatureCollection",
"features": [
{"type": "Feature", "properties": { "a": [ ]}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } },
{"type": "Feature", "properties": { "a": ""}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } }
]
}
""")

list(fiona.open("test.geojson", "r"))

Operating system

macOS 15.1.1

Fiona and GDAL version and provenance

Fiona 1.10.1 from pip
GDAL 3.10.0 installed via Homebrew

Comment

When I flip lines, so that [ ] geojson property isn't first, the code doesn't fail.

{
"type": "FeatureCollection",
"features": [
{"type": "Feature", "properties": { "a": ""}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } },
{"type": "Feature", "properties": { "a": [ ]}, "geometry": { "type": "Point", "coordinates": [ [-74, 5]] } }
]
}

Fix

I propose to fall back to raw string value if json field property isn't decodable. When we look at different json values allowed by RFC, and how it's handled by json.loads() the string is the only one that requires different handling.

json.loads("1.1") returns float
json.loads("1") returns int
json.loads("[]") returns list
json.loads("{}") returns dict
json.loads("null") returns None
json.loads("a") raises exception. Extra quotes required

When geojson is processed, at the point of calling JSONField.get() we know that geojson is a valid json so the above cases are the only possible ones. json.loads("[") is impossible, so the decoding exception is possible only in case of the string.

@jbylina jbylina changed the base branch from main to maint-1.10 December 3, 2024 16:38
@jbylina jbylina force-pushed the bugfix/mixed-types-of-geojson-properties branch from 382d2ab to bd4e7c8 Compare December 3, 2024 16:48
@jbylina jbylina changed the base branch from maint-1.10 to maint-1.9 December 3, 2024 16:49
@jbylina jbylina changed the base branch from maint-1.9 to maint-1.10 December 3, 2024 16:49
@jbylina jbylina force-pushed the bugfix/mixed-types-of-geojson-properties branch from bd4e7c8 to 33001a3 Compare December 3, 2024 16:52
Copy link
Member

@sgillies sgillies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbylina Thanks for the suggestion!

How about a log message to say that we are defaulting to different behavior? And a warning so that users can raise an exception instead if they want to be very strict.

Comment on lines +417 to +418
except json.JSONDecodeError:
return val
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except json.JSONDecodeError:
return val
except json.JSONDecodeError as error:
warnings.warn(f"JSON field value not decoded, returning as string: {val=}, {error=}", FeatureWarning)
log.info("JSON field value not decoded, returning as string: val=%r, error=%r", val, error)
return val

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants