Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it possible to offer roundtrip guarantees for formats that need it #2778

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

rushmorem
Copy link

@rushmorem rushmorem commented Jul 23, 2024

The Serializer and Deserializer traits have a nice symmetry in their methods. In theory, that means it should be possible to write serialisers and deserialisers that offer round trip guarantees across its entire data model. In practice this is not possible, however, because the Visitor trait does not have a one to one translation with deserialisers. Unlike deserialisers, for example, it makes no distinction between structs and maps.

This PR adds additional methods to the Visitor trait to ensure that symmetry is kept. It also adds default implementations that maintain the current behaviour to ensure backwards compatibility. This allows formats that need it to offer round trip guarantees across the entire Serde data model without affecting existing formats. This behaviour is tested in this PR.

Edit: To give some context, I'm aware of this previous issue which actually advocated for removing Visitor::visit_unit_struct (one of the methods this PR actually adds). We need this method along with the other struct, tuple and enum related methods this PR is adding because serde_content::Value is defined as

enum Value<'a> {
    Unit,
    Bool(bool),
    Number(Number),
    Char(char),
    String(Cow<'a, str>),
    Bytes(Cow<'a, [u8]>),
    Seq(Vec<Value<'a>>),
    Map(Vec<(Value<'a>, Value<'a>)>),
    Option(Option<Box<Value<'a>>>),
    Struct(Box<Struct<'a>>),
    Enum(Box<Enum<'a>>),
    Tuple(Vec<Value<'a>>),
}

Without the methods this PR is adding, ValueVisitor wouldn't have enough context to be able to support the struct, tuple and enum variants. Being able to distinguish between structs and maps and the specific types of structs and enums allows us to return more informative errors. For example, trying to deserialise

struct Foo {
    bar: bool,
    baz: char,
}

into a unit would return something like this

failed to deserialise; expected a unit, found Foo { bar: true, baz: 'a' }

@Mingun
Copy link
Contributor

Mingun commented Jul 23, 2024

To guarantee roundtrips it is also required, that derive macros does not break them. The common problem is #[serde(flatten)] which convert visit_struct to visit_map which looses name and fields. While with current design it is impossible to keep fields (they must be statically known, but #[serde(flatten)] merge fields from different types which knows nothing about each other), it is possible to keep name.

The right approach for #[serde(flatten)] should use a special trait which would construct target type from intermediate buffer (that buffer would be read from data format). With that approach we also get compile-time guaranties that we cannot embed non-flattenable fields (for example, String), which is runtime error today.

While this is another big problem, you could, at least, investigate what could be improved with your new methods.

You also may investigate, how roundtrip would work with skipped fields (although there a known bugs here, which makes this task difficult).

@rushmorem
Copy link
Author

rushmorem commented Jul 23, 2024

Thanks for bringing up #[serde(flatten)] and #[serde(skip)] @Mingun. I plan to add tests for Serde attributes. I will make sure to include those ones as part of the tests. Those were not part of the initial implementation for serde-content because, for our needs, we only need to guarantee roundtrip support for the basic data model types. That includes serialising and deserialising serde_content::Value itself. Those are covered by these roundtrip tests.

I'm aware that for more comprehensive roundtrip support the derive macros will need to be updated as well, for example to call the new visitor methods for structs instead of visit_map but that's outside the scope of this PR. I wanted to keep it small and easy to review.

@rushmorem
Copy link
Author

I added a few attribute tests, just enough to confirm what I already suspected. The attributes have no bearing on serde-content since they rewrite the types before the serialiser is even called. I was able to reproduce the skip issue you mentioned but that's expected too since it's an issue with the attribute itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants