Serialization

Indraneel Mahendrakumar edited this page Aug 7, 2022 · 5 revisions

Serialization is the process of converting an object instance into storable data (in this case, XML).

GDSerializer's ISerializer interface defines the serialization method as follows:

XmlNode Serialize(object instance, Type? type);

where instance is the object to serialize, and type is an optional Type that can be provided, in case the actual type of the object is different from the type it should be serialized as (more on that later).

The following example class will be used throughout this page:

public record IntVector2
{
    public IntVector2(int x, int y)
    {
        this.X = x;
        this.Y = y;
    }
    
    public int X
    {
        get;
    }
    
    public int Y
    {
        get;
    }
    
    public override string ToString()
    {
        return $"({this.X}, {this.Y})";
    }
}

Consider this IntVector2 instance:

IntVector2 vector = new(10, 20);

To serialize this, a type that implements ISerializer is needed. The default serializer class is Serializer.
But Serializer cannot serialize this instance just yet, due to an issue with how IntVector2 is defined.

  • The X and Y properties in IntVector2 have no set accessor; i.e. they are read-only.
    At first glance, a set accessor might seem unnecessary for serialization, since the property's value only needs to be read, not written. However, Serializer could not write a value back into a read-only property when deserializing, so serializing one serves no purpose.
    This can be fixed by simply allowing writes to the properties:
    public int X
    {
        get;
        private set;
    }
    
    public int Y
    {
        get;
        private set;
    }
    Note that the set accessors do not have to be public. They can have any access modifier; what matters is that the properties are not read-only.
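
The difference between the two property styles can be seen with plain .NET reflection. The sketch below does not use GDSerializer; ReadOnlyPoint and WritablePoint are hypothetical stand-in types, and it only assumes that a deserializer restores values through reflection, which is not confirmed by this page.

```csharp
using System;
using System.Reflection;

PropertyInfo readOnly = typeof(ReadOnlyPoint).GetProperty(nameof(ReadOnlyPoint.X))!;
PropertyInfo writable = typeof(WritablePoint).GetProperty(nameof(WritablePoint.X))!;

Console.WriteLine(readOnly.CanWrite);  // False: there is no set accessor at all
Console.WriteLine(writable.CanWrite);  // True: a private set accessor still counts

// Reflection can call the private set accessor, so a value read from XML could be restored.
WritablePoint point = new();
writable.SetValue(point, 10);
Console.WriteLine(point.X);  // 10

// Hypothetical stand-in with a get-only property.
public class ReadOnlyPoint
{
    public int X { get; }
}

// Hypothetical stand-in with a privately settable property.
public class WritablePoint
{
    public int X { get; private set; }
}
```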

Now that there are no problems with IntVector2, it can actually be serialized.
To do this, first a Serializer instance is needed, and then the Serialize(object, Type?) method needs to be invoked to obtain the result as follows:

Serializer serializer = new();
XmlNode xml = serializer.Serialize(vector);
Console.WriteLine(xml.OuterXml);

// The output should be <IntVector2><X>10</X><Y>20</Y></IntVector2>

Notice that the Type? parameter was not passed in this case. This is because the vector was being serialized as an IntVector2, which can be inferred by Serializer.
However, in some special cases, it may be necessary to serialize the object as a different type, such as serializing a HashSet<T> as an IEnumerable<T> instead, or serializing a type as one of its base or inherited types.
For the vector above, passing its type explicitly produces the same result as omitting it:

XmlNode xml = serializer.Serialize(vector, typeof(IntVector2));

Configuring serialization

Sometimes, it may be necessary to ensure that certain fields or properties are always serialized, or never serialized. GDSerializer takes care of this using the [Serialize] attribute.

Consider this XML:

<IntVector2>
    <X>10</X>
</IntVector2>

Technically, this is valid XML output after serializing an IntVector2 instance. The y-value is not included, which might indicate that it had the default value (which is 0 for int).
However, it may be desirable to ensure that the y-value (and likewise the x-value) is always included in the XML, either for future deserialization or to make the XML clearer to those reading it.

To do this, the [Serialize] attribute needs to be applied as follows:

[Serialize]
public int X
{
    get;
    private set;
}

[Serialize]
public int Y
{
    get;
    private set;
}

This ensures that Serializer always serializes those properties every time an IntVector2 is being serialized.
Note that Serializer technically treats [Serialize(true)] and the absence of a [Serialize] annotation as the same. So any field or property that is not marked with the [Serialize] attribute at all will still always be serialized. This is purely an implementation detail, although it is unlikely to change.

It is also possible to prevent the serialization of a field or property. Consider a private field that is used simply for caching purposes:

public record IntVector2
{
    private string? toString;
    
    // The rest of the code is the same
    
    public override string ToString()
    {
        return this.toString ??= $"({this.X}, {this.Y})";
    }
}

Here, the toString field is used purely to cache the return value of the ToString() method. Because IntVector2 is immutable, the value does not have to be re-evaluated each time the method is called.
It does not need to be serialized as it holds data that can always be computed again, and more importantly, it should not be deserializable, as then the ToString() method would return the wrong value. So the following XML should be invalid:

<IntVector2>
    <X>10</X>
    <Y>0</Y>
    <toString>hello</toString>
</IntVector2>

To prevent this, the [Serialize] attribute can be used again as follows:

[Serialize(false)]
private string? toString;

The false value passed to [Serialize] ensures that Serializer does not serialize the value of that field or property.
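
One way an opt-out attribute like this could be honoured is to filter members by reflection before writing them out. The sketch below is not GDSerializer's implementation; SketchSerializeAttribute and CachedVector are hypothetical names introduced only for illustration, and treating an unmarked field as opted in mirrors the default described above.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Reflection;

// Collect the fields a reflection-based serializer might write out,
// skipping any field that opts out through the hypothetical attribute.
var names = typeof(CachedVector)
    .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
    .Where(field => field.GetCustomAttribute<SketchSerializeAttribute>()?.Enabled ?? true)
    .Select(field => field.Name);

// X and Y are listed; toString is not.
Console.WriteLine(string.Join(", ", names));

// Hypothetical attribute, not the library's [Serialize].
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property)]
public sealed class SketchSerializeAttribute : Attribute
{
    public SketchSerializeAttribute(bool enabled = true) => Enabled = enabled;

    public bool Enabled { get; }
}

// Hypothetical type with two data fields and one cache field.
public class CachedVector
{
    public int X;
    public int Y;

    [SketchSerialize(false)]
    private string? toString;

    public override string ToString() => toString ??= $"({X}, {Y})";
}
```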

Invoking a method after serialization

It may be desirable to let an object know when it has been serialized, for whatever reason. This could be done in many ways, such as using events, but the [AfterSerialization] attribute provides a simple way to do so:

[AfterSerialization]
private void OnSerialize()
{
    // some code here
}

Methods marked with this attribute will be called immediately after serialization (but before the serialization method ends and the XML is returned). The method must take no parameters, but can return any value (though it will be discarded if not void).

Additionally, the attribute can be used on both static and non-static methods. Instance methods will be called on the object that was just serialized; static methods will be called without any data passed to them.

Note that the methods do not have to be public. They can have any access modifier, as long as they take no parameters.
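
The rules above (no parameters, any access modifier, static or instance, return value discarded) can be sketched with plain reflection. This is an illustration under assumptions, not GDSerializer's code: SketchAfterSerializationAttribute, Hooks, and Tracked are hypothetical names, and skipping methods that take parameters is a choice made for this sketch only.

```csharp
using System;
using System.Reflection;

Tracked tracked = new();
Hooks.InvokeAfterSerialization(tracked);
Console.WriteLine(tracked.TimesSerialized);  // 1

// Hypothetical marker attribute, not the library's [AfterSerialization].
[AttributeUsage(AttributeTargets.Method)]
public sealed class SketchAfterSerializationAttribute : Attribute
{
}

public static class Hooks
{
    // Invokes every parameterless method declared with the marker attribute,
    // regardless of its access modifier.
    public static void InvokeAfterSerialization(object instance)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Static
            | BindingFlags.Public | BindingFlags.NonPublic;

        foreach (MethodInfo method in instance.GetType().GetMethods(flags))
        {
            if (method.GetCustomAttribute<SketchAfterSerializationAttribute>() is null
                || method.GetParameters().Length != 0)
            {
                continue;
            }

            // Static methods receive no target; any return value is discarded.
            method.Invoke(method.IsStatic ? null : instance, null);
        }
    }
}

// Hypothetical type that counts how often its hook was called.
public class Tracked
{
    public int TimesSerialized { get; private set; }

    [SketchAfterSerialization]
    private void OnSerialize()
    {
        TimesSerialized++;
    }
}
```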

Serializing as a more general type

Sometimes, it may be necessary to serialize an instance not as its exact type, but as its base type or interface. Consider this type, which inherits from IntVector2:

public record IntVector3 : IntVector2
{
    public IntVector3(int x, int y, int z) : base(x, y)
    {
        this.Z = z;
    }

    private IntVector3() : this(0, 0, 0)
    {
    }

    public int Z
    {
        get;
        private set;
    }

    public override string ToString()
    {
        return $"({this.X}, {this.Y}, {this.Z})";
    }
}

An instance of this type with X = 10, Y = 20, and Z = 30 would be serialized as follows:

<IntVector3>
    <X>10</X>
    <Y>20</Y>
    <Z>30</Z>
</IntVector3>

If necessary, an IntVector3 instance could be serialized as a more general IntVector2 instance, losing some information (in this case, the Z value) in the process:

IntVector3 vector3 = new(10, 20, 30);
XmlNode node = serializer.Serialize(vector3, typeof(IntVector2));
Console.WriteLine(node.OuterXml);

// The output should be <IntVector2><X>10</X><Y>20</Y></IntVector2>
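
Why the Z value disappears can be illustrated without GDSerializer: if members are enumerated from the requested type rather than from the instance's runtime type, members declared only on the derived type are never visited. The sketch below assumes that approach; Sketch.ToXml, Point2, and Point3 are hypothetical names, and the real Serializer may work differently.

```csharp
using System;
using System.Reflection;
using System.Xml;

Point3 point = new() { X = 10, Y = 20, Z = 30 };

XmlNode asPoint3 = Sketch.ToXml(point, point.GetType());
XmlNode asPoint2 = Sketch.ToXml(point, typeof(Point2));

Console.WriteLine(asPoint3.SelectSingleNode("Z") is not null);  // True
Console.WriteLine(asPoint2.SelectSingleNode("Z") is not null);  // False
Console.WriteLine(asPoint2.Name);  // Point2

public static class Sketch
{
    // Writes one child element per public instance property visible on 'type',
    // which includes inherited properties but not those of more derived types.
    public static XmlNode ToXml(object instance, Type type)
    {
        XmlDocument document = new();
        XmlElement root = document.CreateElement(type.Name);

        foreach (PropertyInfo property in type.GetProperties(BindingFlags.Instance | BindingFlags.Public))
        {
            XmlElement child = document.CreateElement(property.Name);
            child.InnerText = Convert.ToString(property.GetValue(instance)) ?? string.Empty;
            root.AppendChild(child);
        }

        return root;
    }
}

// Hypothetical base type.
public class Point2
{
    public int X { get; set; }
    public int Y { get; set; }
}

// Hypothetical derived type that adds one property.
public class Point3 : Point2
{
    public int Z { get; set; }
}
```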

Summary

  • Types that implement ISerializer can be used for serialization
    • Serializer is the default implementation
    • Use the Serialize(object, Type?) method to serialize an object instance into XML
      • The Type? parameter is optional, but may be used to serialize the object as a different type (such as a base type or inheriting type)
  • Serialization of a property requires it to have a set accessor
    • This need not be public
  • Fields and properties that need to be serialized all the time should have the [Serialize] (i.e. [Serialize(true)]) attribute
    • By default, Serializer treats the lack of a [Serialize] attribute as [Serialize(true)]; this is an implementation detail
  • Fields and properties that should never be serialized should have the [Serialize(false)] attribute
  • Methods marked with the [AfterSerialization] attribute will be called immediately after serialization
  • By passing a type argument to Serializer, an object instance can be serialized as a more general type such as a base type or interface
    • This can result in a loss of some information