Direct parsing of custom types #1044

sdebionne · 2024-09-19T13:52:09Z

About direct parsing, the documentation states that:

The drawback of this approach is that fully custom type representations are not supported, only the library-provided conversions are.

I wonder where this limitation stands?

I have a described struct with an std::chrono::duration member. While I have the proper tag_invoke overloads, since there is no built-in conversion support for std::chrono, I can't use parse_into. Is that correct?

Would it be possible to add new customization points to allow direct parsing of custom types?

Any chance to add a conversion category for date /time / duration (if that make sense)?

The text was updated successfully, but these errors were encountered:

grisumbras · 2024-09-20T14:38:00Z

Direct parsing is implemented using a special event handler that is a tree of dedicated subhandlers for components of the target value, and works by descending to the first subhandlers that accepts a parsing event. The implementation relies on cooperation of parent and child subhandlers.

In order to allow users to add support for their types we would have to make this system a public API and document how all that works. I'm not sure it's a very good idea, because writing those subhandlers is very tricky.

I have an alternative idea, which I'm still not sure I could pull off, but we'll see. The idea is to have representation "proxies". You define a trait that your type is supposed to be represented as some other type, and converter/parser will create an object of that type and will do conversion/parsing into that object instead.

This would provide a significant degree of customisation opportunities, like representing a chrono::duration as a number of ticks, or a chrono::time_point as an ISO 8601 string.

pdimov · 2024-09-21T16:31:49Z

Support for std::chrono::duration should be built-in. time_point is maybe trickier. Or maybe not.

grisumbras · 2024-09-23T05:47:09Z

What should be the representation of duration in your opinion?

sdebionne · 2024-09-23T07:10:08Z

I remembered you mentioned the idea of having "proxies" in your latest blog post but to be honest I did not get it. What would be the proxy for std::chrono::duration, for instance?

I support the idea of idea of having std::chrono::duration built-in. The representation could be either the number of ticks (int) and the period (int) to keep the full precision. Or the floating point representation in second.

grisumbras · 2024-09-23T07:32:14Z

Representing them as a number of ticks has the drawback of all durations being represented in the same way (no way to tell 10 seconds and 10 minutes apart).

Representing them as floating point value of seconds opens up the possibility of rounding errors.

So, both approaches are not ideal.

pdimov · 2024-09-23T08:28:02Z

We have the following options (for e.g. a duration of 42 minutes):

42
"42m"
[42, 60, 1]
[42, "m"]
{"count": 42, "period": [60, 1]}
{"count": 42, "period": "m"}

(1) has the disadvantage as stated above. Even though it works (for a struct member, for instance) because the period is encoded in the source code as the duration type, it would cause issues if two or more durations are put into a variant.

(2) is unnecessarily hard to parse and kind of defeats the point of using a structured format such as JSON.

(5) and (6) are the most human-readable, but overly verbose. For a file with many durations in it, they'll cause a lot of unnecessary overhead.

(4) is more human-readable than (3), but harder to parse.

This leaves me with (3). Interestingly though, we can in principle make several of these work when parsing, because they can be disambiguated (but we do need to choose one when serializing.)

So we can in theory add support for one of these, e.g. (3), today, and then later add support for e.g. (4) if we decide we like that better.

Edit: there's also the option of [42, [60, 1]], which is more regular, but there's no significant difference with (3) based on the criteria above, so that would be something like (3a) in our classification.

pdimov · 2024-09-23T08:35:01Z

On second thought, I like [42, [60, 1]] better than [42, 60, 1] because it has more structure and doesn't match the representation of e.g. std::vector<int> or something like math::point3<float>.

sdebionne · 2024-09-23T09:23:48Z

I would think that, in most applications, both side of the serialization agrees on the period, so the ticks representation would work.

Is it not the responsability of the schema to provide the unit/period?

grisumbras · 2024-09-23T09:47:30Z

As Peter mentioned, every ambiguous representation becomes a problem for variants. I.e. you have variant<seconds, hours>, you save hours(10). You load seconds(10). How common such cases would be is hard to say.

I would think that, in most applications, both side of the serialization agrees on the period, so the ticks representation would work.

I would think, for JSON you usually don't control the format, which is why I prefer representations that don't just technically work, but are actually likely to be used by people. The spec you linked recommends ISO 8601 strings. It's actually not that hard to implement, so maybe we should use that.

grisumbras · 2024-09-23T09:52:23Z

Although, maybe not. The linked ABNF doesn't allow fractions of seconds.

sdebionne · 2024-09-23T11:53:14Z

Indeed ISO 8601 is limited in precision to ms.

Maybe it is not the responsibility of the JSON value itself to carry the unit/period. Instead this info could come from a schema. Something like:

uptime:
  type: integer
  unit: microseconds
  example: 50

as discussed in json-schema-org/json-schema-vocabularies#46.

Taking Boost.Units quantities for example, we would probably represent it as numeric (or whatever the underlying value type matches best) and leave the unit to the schema?

Side note: it is possible (tested to some extend) to generate a schema from a described struct.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Direct parsing of custom types #1044

Direct parsing of custom types #1044

sdebionne commented Sep 19, 2024

grisumbras commented Sep 20, 2024

pdimov commented Sep 21, 2024

grisumbras commented Sep 23, 2024

sdebionne commented Sep 23, 2024

grisumbras commented Sep 23, 2024

pdimov commented Sep 23, 2024 •

edited

Loading

pdimov commented Sep 23, 2024

sdebionne commented Sep 23, 2024

grisumbras commented Sep 23, 2024

grisumbras commented Sep 23, 2024

sdebionne commented Sep 23, 2024 •

edited

Loading

Direct parsing of custom types #1044

Direct parsing of custom types #1044

Comments

sdebionne commented Sep 19, 2024

grisumbras commented Sep 20, 2024

pdimov commented Sep 21, 2024

grisumbras commented Sep 23, 2024

sdebionne commented Sep 23, 2024

grisumbras commented Sep 23, 2024

pdimov commented Sep 23, 2024 • edited Loading

pdimov commented Sep 23, 2024

sdebionne commented Sep 23, 2024

grisumbras commented Sep 23, 2024

grisumbras commented Sep 23, 2024

sdebionne commented Sep 23, 2024 • edited Loading

pdimov commented Sep 23, 2024 •

edited

Loading

sdebionne commented Sep 23, 2024 •

edited

Loading