
Deserialization

Carnagion edited this page Jun 13, 2022 · 11 revisions

Deserialization is the process of converting stored data (in this case, XML) into an object instance.

GDSerializer's ISerializer interface defines the deserialization method as follows:

object Deserialize(XmlNode node, Type? type = null);

where node is the XML to deserialize, and type is an optional Type that can be provided, in case the serializer cannot infer the Type to instantiate (more on that later).
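An XmlNode usually comes from the standard System.Xml types rather than from the serializer. As a minimal sketch (the XML string and variable names here are only illustrative), a document can be parsed and its root element used as the node:

```csharp
using System;
using System.Xml;

// Parse an XML string into a document (illustrative content only).
XmlDocument document = new();
document.LoadXml("<IntVector2><X>10</X><Y>20</Y></IntVector2>");

// The root element is an XmlNode and can be handed to a deserializer.
XmlNode xml = document.DocumentElement!;

Console.WriteLine(xml.Name);             // IntVector2
Console.WriteLine(xml["X"]!.InnerText);  // 10
```

XmlDocument.Load can be used instead of LoadXml when the XML lives in a file or stream.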

The following example class will be used throughout this page:

public record IntVector2
{
    public IntVector2(int x, int y)
    {
        this.X = x;
        this.Y = y;
    }
    
    public int X
    {
        get;
    }
    
    public int Y
    {
        get;
    }
    
    public override string ToString()
    {
        return $"({this.X}, {this.Y})";
    }
}

Consider this XML:

<IntVector2>
    <X>10</X>
    <Y>20</Y>
</IntVector2>

To deserialize this, a type that implements ISerializer is needed. The default serializer class is Serializer.
But Serializer cannot deserialize this XML just yet. There are a few problems relating to how IntVector2 is defined.

  • Firstly, IntVector2 has no parameterless constructor.
    Serializer requires a parameterless constructor in order to instantiate objects. A type with no parameterless constructor will cause a SerializationException to be thrown during deserialization.
    This can be fixed by simply adding a parameterless constructor to IntVector2:

    private IntVector2() : this(0, 0)
    {
    }

    Note that the constructor does not have to be public. It can have any access modifier; the important thing is that it exists.

  • Secondly, the X and Y properties in IntVector2 have no set accessor; i.e. they are read-only.
    Normally, this is not a problem in C#, because the values are assigned in the constructor. However, Serializer assigns the values after the object has been constructed, so it requires a set accessor to be present; without one, a SerializationException will be thrown.
    This can be fixed by simply allowing writes to the properties:

    public int X
    {
        get;
        private set;
    }
    
    public int Y
    {
        get;
        private set;
    }

    Again, note that the set accessors do not have to be public. They can have any access modifier; the important thing is that they are not read-only.
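The two requirements above follow from how reflection-based construction generally works in .NET. The sketch below is not GDSerializer's implementation; it only uses standard reflection calls on a hypothetical Point class to show why a non-public parameterless constructor and a non-public set accessor are sufficient, while a get-only property is not writable:

```csharp
using System;
using System.Reflection;

Type type = typeof(Point);

// nonPublic: true lets Activator use the private parameterless constructor.
object instance = Activator.CreateInstance(type, nonPublic: true)!;

PropertyInfo x = type.GetProperty("X")!;
PropertyInfo y = type.GetProperty("Y")!;

// A private set accessor is still reachable through reflection.
x.SetValue(instance, 10);
Console.WriteLine(x.GetValue(instance)); // 10

// A get-only property exposes no set accessor at all.
Console.WriteLine(x.CanWrite); // True
Console.WriteLine(y.CanWrite); // False

// Hypothetical type used only for this illustration.
public class Point
{
    private Point() { }
    public int X { get; private set; }
    public int Y { get; }
}
```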

Now that there are no problems with IntVector2, the XML can actually be deserialized.
To do this, create a Serializer instance and invoke its Deserialize(XmlNode, Type?) method to obtain the result:

Serializer serializer = new();
IntVector2 vector = (IntVector2)serializer.Deserialize(xml); // xml is the XML node <IntVector2>...</IntVector2>
Console.WriteLine(vector);

// The output should be (10, 20)

Note that the Type? parameter was not passed in this case. This is because Serializer can infer the type from the XML node's name (or the Type attribute on the node, such as <Name Type="Namespace.Type">...</Name>).
However, in some special cases, it may be necessary to provide the type, such as when the node's name does not match the type (and the Type attribute does not exist on the node), or when deserializing as a different type.
In this case, the following code would have the same result:

IntVector2 vector = (IntVector2)serializer.Deserialize(xml, typeof(IntVector2));

Configuring deserialization

Sometimes, it may be necessary to ensure that certain fields or properties are always deserialized, or never deserialized. GDSerializer takes care of this using the [Serialize] attribute.

Consider this XML:

<IntVector2>
    <X>10</X>
</IntVector2>

Technically, this is valid XML for deserializing into an IntVector2 instance. The y-value is not specified, so it keeps its default value (0 for int).
However, it may be desirable to require that both the x-value and the y-value are always present in the XML, so that it is clearer to those reading it.

To do this, the [Serialize] attribute needs to be applied as follows:

[Serialize]
public int X
{
    get;
    private set;
}

[Serialize]
public int Y
{
    get;
    private set;
}

This ensures that Serializer checks for the presence of the X and Y nodes every time an XML node is being deserialized into an IntVector2.
If they do not exist in the XML, a SerializationException will be thrown.

Valid XML must now include both nodes, for example:

<IntVector2>
    <X>10</X>
    <Y>0</Y>
</IntVector2>
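The presence check described above can be pictured with plain reflection. This is only a sketch under assumptions: RequiredNodeAttribute is a hypothetical stand-in for [Serialize], and the code reports missing nodes instead of throwing, so it does not reproduce the library's actual behaviour:

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Xml;

XmlDocument document = new();
document.LoadXml("<IntVector2><X>10</X></IntVector2>");
XmlNode node = document.DocumentElement!;

// Collect every marked property that has no matching child node.
string[] missing = typeof(IntVector2)
    .GetProperties(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
    .Where(property => property.GetCustomAttribute<RequiredNodeAttribute>() is not null)
    .Where(property => node[property.Name] is null)
    .Select(property => property.Name)
    .ToArray();

Console.WriteLine(string.Join(", ", missing)); // Y

// Hypothetical attribute standing in for [Serialize].
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public sealed class RequiredNodeAttribute : Attribute { }

public record IntVector2
{
    [RequiredNode] public int X { get; private set; }
    [RequiredNode] public int Y { get; private set; }
}
```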

It is also possible to prevent the assignment of a value to a field or property during deserialization. Consider a private field that is used simply for caching purposes:

public record IntVector2
{
    private string? toString;
    
    // The rest of the code is the same
    
    public override string ToString()
    {
        return this.toString ??= $"({this.X}, {this.Y})";
    }
}

Here, the toString field is used purely to cache the return value of the ToString() method so that it does not have to be re-evaluated each time it is called (since IntVector2 is immutable).
It should not be assignable through XML, as then the ToString() method would return the wrong value. So the following XML should be invalid:

<IntVector2>
    <X>10</X>
    <Y>0</Y>
    <toString>hello</toString>
</IntVector2>

To prevent this, the [Serialize] attribute can be used again as follows:

[Serialize(false)]
private string? toString;

The false value passed to [Serialize] ensures that Serializer does not assign the value of that field or property during deserialization. If an XML node is found for that field or property (such as in the above invalid XML), a SerializationException is thrown.
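The opposite check, rejecting a node for an excluded member, can be sketched in the same spirit. ExcludedNodeAttribute is a hypothetical stand-in for [Serialize(false)], and the sketch prints a message where the library would throw a SerializationException:

```csharp
using System;
using System.Reflection;
using System.Xml;

XmlDocument document = new();
document.LoadXml("<IntVector2><X>10</X><Y>0</Y><toString>hello</toString></IntVector2>");
XmlNode node = document.DocumentElement!;

const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
int violations = 0;

foreach (FieldInfo field in typeof(IntVector2).GetFields(flags))
{
    bool excluded = field.GetCustomAttribute<ExcludedNodeAttribute>() is not null;

    // An excluded field must not have a corresponding child node.
    if (excluded && node[field.Name] is not null)
    {
        violations++;
        Console.WriteLine($"Node <{field.Name}> is not allowed");
    }
}
// Prints: Node <toString> is not allowed

// Hypothetical attribute standing in for [Serialize(false)].
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public sealed class ExcludedNodeAttribute : Attribute { }

public class IntVector2
{
    [ExcludedNode]
    private string? toString;

    public int X { get; private set; }
    public int Y { get; private set; }

    public override string ToString() => this.toString ??= $"({this.X}, {this.Y})";
}
```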

Invoking a method after deserialization

In some cases, code needs to run during object construction (such as setting the value of a field or property, running an action, or checking for invalid values).

For example, assume that IntVector2 only accepts non-negative x- and y-values. Ordinarily, this would be enforced in a constructor:

public IntVector2(int x, int y)
{
    if (x < 0 || y < 0)
    {
        throw new Exception("X and Y values must not be negative");
    }
    this.X = x;
    this.Y = y;
}

However, Serializer only sets these values after object construction takes place, meaning it would not be possible to run such code during construction.

To get around this, the [AfterDeserialization] attribute can be used as follows:

public record IntVector2
{
    // rest of the code is the same

    [AfterDeserialization]
    private void ValidateInput()
    {
        if (this.X < 0 || this.Y < 0)
        {
            throw new Exception("X and Y values must not be negative");
        }
    }
}

Methods marked with this attribute will be called immediately after deserialization (but before the deserialization method ends and the object is returned). The method must take no parameters. It may return a value, but any return value is discarded.

Additionally, the attribute can be used on both static and non-static methods. Instance methods will be called on the object that was just deserialized; static methods will be called without any data passed to them.

Note that the methods do not have to be public. They can have any access modifier, as long as they take no parameters.
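The calling rules above map directly onto standard reflection. The following sketch is not the library's code: HookAttribute is a hypothetical stand-in for [AfterDeserialization], and Sample is an illustrative type. It shows parameterless instance and static methods being found regardless of access modifier, with any return value ignored:

```csharp
using System;
using System.Reflection;

Sample sample = new();
const BindingFlags flags = BindingFlags.Instance | BindingFlags.Static
    | BindingFlags.Public | BindingFlags.NonPublic;

foreach (MethodInfo method in typeof(Sample).GetMethods(flags))
{
    bool marked = method.GetCustomAttribute<HookAttribute>() is not null;
    if (!marked || method.GetParameters().Length != 0)
    {
        continue;
    }

    // Static methods take no target; instance methods run on the new object.
    // The value returned by Invoke is deliberately ignored.
    method.Invoke(method.IsStatic ? null : sample, null);
}

Console.WriteLine(Sample.Calls); // 2

// Hypothetical attribute standing in for [AfterDeserialization].
[AttributeUsage(AttributeTargets.Method)]
public sealed class HookAttribute : Attribute { }

public class Sample
{
    public static int Calls;

    [Hook]
    private void OnLoaded() => Calls++;

    [Hook]
    private static int OnAnyLoaded() => ++Calls;
}
```

Note that an exception thrown inside a method invoked this way surfaces wrapped in a TargetInvocationException; whether Serializer rethrows or wraps exceptions from marked methods is not stated on this page.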

Summary

  • Types that implement ISerializer can be used for deserialization
    • Serializer is the default implementation
    • Use the Deserialize(XmlNode, Type?) method to deserialize XML into an object instance
      • The Type? parameter is optional, but required in some special cases (if in doubt, provide it all the time)
  • Deserialization of a class/struct/record requires it to have a parameterless constructor
    • This need not be public
  • Deserialization of a property requires it to have a set accessor
    • This need not be public
  • Fields and properties that need to be deserialized all the time should have the [Serialize] (i.e. [Serialize(true)]) attribute
  • Fields and properties that should never be deserialized should have the [Serialize(false)] attribute
  • Methods marked with the [AfterDeserialization] attribute will be called immediately after deserialization