-
Notifications
You must be signed in to change notification settings - Fork 85
Serialized Data
Special thanks to noko1111 for (as far as I can tell) being the first to publicly document this serialized format.
Some files and/or portions of files are just encoded data structures following the rules below. This seems to mostly be used in context (not event) related files because it is not particularly well compressed. The data is read byte by byte as a hexadecimal byte stream where each serialized object is preceded by a byte which identifies the object. tructures following the rules below. This seems to mostly be used in context (not event) related files because it is not particularly well compressed. The data is read byte by byte as a hexadecimal byte stream where each serialized object is preceded by a byte which identifies the object. NOTE: All integer values appear to have a bit marking the sign as the LAST bit.
0x0A => 0000 1010
, shift the sign off to get a positive value of 5
A byte string object is of variable length where the first byte (0x0A
below) marks the number of bytes to follow. This count must be shifted right 1 bit to remove the sign indicator that they place in the last bit of all the integers in this format. So the next 5 bytes make up the byte string which can be decoded to the players name "Pille".
02 0a 50 69 6c 6c 65 => "Pille" (when decoded)
Array objects start with a 01 00 followed by an integer byte indicating the number of elements in the array. Each object in the array then follows, one after another. I've highlighted the identifier bytes of each object to make visual parsing easier. I've also inserted pipes between the different elements.
04 01 00 08 | 02 0A 50 69 6C 6C 65 | 06 2A | 06 A6 | 06 8D
In the example above, we see an array with 4 elements, a byte string and three single byte integers.
The key value object starts with a single integer byte indicating how many key value pairs will follow. Then the pairs follow one by one with a single byte for a key integer and another object as the value. In the first example below, we see a 4 pair object, with the keys in italics and the object type indicator in bold. I've also inserted pipes between the different elements.
05 08 | 00 09 04 | 02 07 00 00 53 32 | 04 09 02 | 08 09 f2 bf 50
This object consists of a single byte integer; that is it.
06 4C => 0100 1100, shift of the sign bit to get positive 38
This object consists of a four byte integer and that is it.
07 00 00 53 32, gives a positive (last bit is zero) 10649 (shift off the last bit: 0x2999)
The VLF integer is more difficult to process. Each byte, starting with the first, uses the first bit X000 0000
as a flag to indicate if there is another byte or not. The remaining 7 bits from each byte are combined to form the final integer such that the first byte's bits make up the last 7 of the integer, the second byte's bits make up the 7 before that, etc for all the bytes. To better communicate I have included a snippet of python implementing the algorithm:
result,count,byteString = 0,0,""
#Loop through bytes until the first bit is zero
#build the result by adding new bits to the right
while(True):
num,byte = self.getBigInt(1,byteCode=True)
byteString += byte
if num & 0x80 > 0:
result += (num & 0x7F) << (7*count)
count = count + 1
else:
result += num << (7*count)
break
#The last bit of the result is a sign flag
result = pow(-1,result & 0x1) * (result >> 1)
We'll walk through this process with an example from below.
Example: 09 f2 bf 50
the data bytes in bits are 1111 0010
, 1011 1111
, 0101 0000
the first has its flag set, so we save the last 7 bits 111 0010
and grab another byte
result = 111 0010
the second has its flag set, so we push the last 7 bits 011 1111
to the front and grab another byte
result = 011 1111 111 0010
the third does not have its flag set, so we push the last 7 bits to the front and stop
result = 101 0000 011 1111 111 0010
we now shift off the last sign bit to get our final (positive) integer
result = 1010 0000 1111 1111 1001
or 0x0A 0F F9
or 659449