Skip to content

Havok MOPP Data format

Candoran2 edited this page Mar 17, 2024 · 19 revisions

The MOPP data in a havok block is difficult to capture in the xml format. The reason why should be apparent after reading this article. What follows is a best attempt at describing the data tree and how it is likely used. Note that some of this information may be specific to Oblivion and Skyrim. Also: as far as I can tell, everything is in big-Endian (most significant byte first).

The data setup and parsing

The main function of the MOPP data is to form a bounding box of sorts. It can take in a position and return all the triangles that position can collide with. It does this through a tree, set up from up to 256 different node types. These different types can be broken down into three subcategories:

  1. Nodes with only 1 child (filter/alteration nodes).
  2. Nodes with 2 children (decision nodes).
  3. Nodes with no children (triangle/leaf nodes).

These nodes form a decision tree. These decisions are made based on the input position, and the parser can go down both children of a decision node if they both qualify.

The scale/input

All filter and decision nodes compare the position in some way to the values in those nodes. However, those values are relative to the largest dimension of the bounding box for the entire object that the MOPP describes.

It was already known in the nif format that the Scale field of the bhkMoppBvTreeShape in Oblivion was calculated via scale = 254*256*256 / (size + 0.2), where size is the extent of the largest dimension (x, y or z). Skyrim appears to follow a similar formula, but using 0.01 instead. This correlates with the radius field in their respective havok geometry blocks: in Oblivion, this is typically 0.1, in Skyrim it is typically 0.005.

The offset field has a similar addition: It was known that in Oblivion, the offset field was the minimum of all vertices in the shape along each respective axis, minus 0.1. In Skyrim, it is minus 0.005. It seems safe to assume, then, that both of these simply depend on the radius of the havok geometry block.

These values, I believe, are used to calculate the new (relative) coordinates, for quick comparison with the values in the nodes in the following way:

  • Take the input position of the querying object (havok coordinates).
  • Add the offset field.
  • Multiply every value by the scale field
  • Convert to uint24 (or uint32 and don't look at the first byte, same effect).

These values are then used to determine whether an incoming position (or maybe triangle?) should be used for further collision detection.

Parsing

When looking through the MOPP tree, start at the first byte. The first byte of every node indicated what type of node it is. This dictates how many bytes it takes up, which values to compare, what (if any) child nodes there are and where they are. I have attempted to document every type of node I have encountered in the node types section. As far as I have been able to tell, there is only ever one root node in every MOPP tree.

Output

I think the parsing always returns a uint32. It always has this available, which initializes at 0. In the node type notation below, this will be called O (for Output). This value is remembered when going down nodes, so modifications to this value affect all children (but not siblings).

The meaning of this output can change depending on context. For example, when referring to the shapes in a bhkListShape, output merely reflects the index of the shape in the list. When referring to a bhkNiTriStripsShape with multiple bhkNiTriStripsData, the data is linked like so:

  • 00 00 00 00 - The 0th triangle in the 0th bhkNiTriStripsData
  • 00 00 00 01 - The 1-th triangle in the 0th bhkNiTriStripsData
  • 00 10 00 00 - The 0th triangle in the 1-th bhkNiTriStripsData

When referring to a bhkCompressedMeshShapeData, the data is (typically) linked like so:

  • 00 00 00 00 - The 0th "big triangle"
  • 00 00 00 01 - The 1-th "big triangle"
  • 00 04 00 00 - The 0th triangle in the 1st chunk.
  • 00 06 00 05 - The 5-th triangle in the 1st chunk, and the winding order is clockwise.

Effectively giving big triangles as a group the 0 index, and therefore counting the chunks starting from 1. While this is the typical structure for bhkCompressedMeshShapeData, it is theoretically possible to be different. The only thing certain is that chunk index bits are followed by the anticlockwise (0) vs clockwise (1) bit, followed by the triangle index. How many bits the chunk and the triangle index take, however, is determined by the data in the bhkCompressedMeshShapeData. The Mask Index (which is a mask to obtain the triangle) together with the Mask W Index (which is the same mask, but also includes the winding order bit) can be used to determine it, but so can the Bits Per Index field combined with the Bits Per W Index (the length in bits of their respective masks).

The node types

In order to make the node type notations in the next section easier to understand, I will explain one node in detail, from an example mesh with the following vertices:

  1. [0, 0, 0]
  2. [0, 1, 0]
  3. [2, 0, 0]
  4. [2, 1, 0]

And the following triangles:

  1. [0, 3, 1]
  2. [0, 2, 3]

With a set radius of 0.1. This means the lower corner is at -0.1, -0.1, -0.1, and the largest dimension is x with a size of 2.0 + 2*0.1=2.2.

The example node will be the one marked by the byte 27. It is a 'filter' node, i.e. one which has only one child which follows right after it, and it filters on the y coordinate. This node is formatted as follows: 27 AA BB. Here, AA and BB are the upper and lower bound, respectively. They are in 254ths of the largest dimension, relative to the origin.

Assuming this node is our root node for the example mesh, our y-bounds are [-0.1, 1.1]. The function for calculating the upper bound is then given by:

bound_byte = math.floor(1 + 254 * (bound_location - origin.y)/2.2)

which evaluates to:

bound_byte = math.floor(1 + 254 * (1.1 - -0.1)/2.2)
>>> bound_byte
0x8b

NOTE This function was determined experimentally. It implies that upper bounds use a < comparison. Personally, I think the math.floor(1 + ...) is just a poor implementation of a ceil function. In cases of whole numbers within the function, it raises the bound by 1. It could have been a proper ceil function and use <= comparison.


For the lower bound, the calculation is very similar:

bound_byte = math.floor(254 * (-0.1 - -0.1)/2.2)
>>> bound_byte
0x00

Leading to a node that looks like this: 27 00 8B. There similar nodes for the X and Z direction. Note that, even though the example mesh is a plane, the dimension in the Z direction is not 0 due to the radius. The parser will arrive at this node and read the first byte to recognize it as a 27 node. It will then compare the lower bound (00) with the first byte of the y position as well as the higher bound (8B). If it is within those, it will continue to the next node (which for this node is right after it).

Known nodes

When speaking about the X, Y or Z coordinate, presume it is read from the most significant byte of the scale-adjusted coordinate, fitting as many bytes as is appropriate for that comparison. General convention of structure notation: AA is used for lower bounds, BB for upper bounds, CC and DD for links to other nodes, CC or DD many bytes from the end of this node, and XX, YY and ZZ for referring to the respective coordinates. II, JJ, KK and so on are used for other byte types.

First byte Structure Instructions/processing
01 01 XX YY ZZ Subtract XX, YY and ZZ from the X, Y and Z positions, and shift the reading position 1 bit (multiply by 21). These are the new values for comparisons. This has effect on all subsequent child nodes. It, along with other nodes that perform similar operations, can be applied multiple times.
02 02 XX YY ZZ Presumed to be the same as node 01, but 2-bit shift instead (multiply by 22).
03 03 XX YY ZZ Tested. The same as node 01, but 3-bit shift instead (multiply by 23)
04 04 XX YY ZZ See 02. Not yet tested.
05 05 CC From the end of this node, jump CC bytes forward to arrive at the next node.
06 06 CC CC See 05, but with capacity for larger jumps.
09 09 II Add II to O and proceed to the node right after this one.
0A 0A II II See 09, but with capacity for larger additions.
0B 0B II II II II Set O to II II II II and proceed to the node right after this one. Mostly used for memory saving.
10 10 BB AA CC Compare the X coordinate to BB and AA. If it is smaller than BB, go to the node right after this one. If it is larger or equal to AA, jump CC bytes forward from the end of this node to arrive at the next node. Note that BB is often larger than AA, meaning the parser has to check both branches.
11 11 BB AA CC See 10, but for the Y coordinate.
12 12 BB AA CC See 10, but for the Z coordinate.
13 13 BB AA CC See 10, but for the direction perpendicular to the plane 0=Y+Z. Because the bytes could otherwise exceed FF, I presume this is calculated as coordinate= Y/2 + Z/2
14 14 BB AA CC See 13, but for FE/2 - Y/2 + Z/2
15 15 BB AA CC See 13, but for X/2 + Z/2
16 16 BB AA CC See 13, but for FE/2 + X/2 - Z/2
17 17 BB AA CC See 13, but for X/2 + Y/2
18 18 BB AA CC See 13, but for FE/2 + X/2 - Y/2
19 19 BB AA CC See 13, but for X/3 + Y/3 + Z/3.
1A 1A BB AA CC See 19, but for the direction perpendicular to the plane X + Y - Z. Exact calculation method is not known.
1B 1B BB AA CC See 1A, but for X - Y + Z
1C 1C BB AA CC See 1B, but for -X + Y + Z
20 20 XX CC Check X coordinate. If smaller than XX, go to the next node. If XX or higher, go to the node CC bytes from the end of this node.
21 21 YY CC See 21, but for Y coordinate.
22 22 ZZ CC See 22, but for Z coordinate.
23 23 BB AA CC CC DD DD Check X coordinate. If smaller than BB, go to CC CC bytes after the end of this block for the next. If equal to or greater than AA, go to DD DD bytes after the end of this node for the next. Like other two-value decision nodes, BB can be larger than BAA.
24 24 BB AA CC CC DD DD Like 23, but for the Y coordinate.
25 25 BB AA CC CC DD DD Like 23, but for the Z coordinate.
26 26 AA BB Check the X coordinate. Proceed to the node right after this one if it is greater than or equal to AA and smaller than BB.
27 27 AA BB See 26, but for the Y coordinate.
28 28 AA BB See 26, but for the Z coordinate.
29 29 AA AA AA BB BB BB Like 26, but checking the entire three-byte integer.
2A 2A AA AA AA BB BB BB See 29, but for the Y coordinate.
2B 2B AA AA AA BB BB BB See 29, but for the Z coordinate.
3I 3I Add 00 00 00 0I to O and return.
4I 4I See 3I, except add 00 00 00 1I to it (so 16 more).
50 50 II See 3I, except add 00 00 00 II to it.
51 51 II II See 3I, except add 00 00 II II to it.
52 52 II II II See 3I, except add 00 II II II to it.

Unknown nodes

These nodes have been found in existing MOPP tree, but their function is not yet determined.

Node type Extra information

Missing information

These nodes have been found in existing MOPP trees, and their linking is known, but the function of certain bytes is not yet determined, or their behavior/calculation is missing.

Node type Missing information
02 Length and existence is confirmed, but the function has not yet been tested.
13-18 The functions for the coordinates are likely, but not 100% confirmed.
1A-1C The functions for the coordinates are unknown. Guesses for the coordinates are the following:
  • 1A:FE/3 + X/3 + Y/3 - Z/3
  • 1B:FE/3 + X/3 - Y/3 + Z/3
  • 1C:FE/3 - X/3 + Y/3 + Z/3
I am quite certain of the sign of each coordinate, but not of the exact coefficient and offset combination used. The 19 node coordinate calculation has been confirmed, hence the guesses for these values.
20 I am quite certain of the function of this node. However, its use in farmhouse01walkway.nif is odd, because it is used to separate connected triangles (which should have some overlap in their bounding boxes due to the radius).

Presumed, but not yet found

The existence of the following nodes is assumed, but not yet observed in a MOPP tree:

Fully worked example

An explanation of a relatively simple mesh and its complete node tree will be shown below. The example mesh is the following flat plane:

Mesh Example

with the triangle indices and orientation as indicated (collision mesh found here: DoublePlane ).

This has the following mopp data:

28 00 26 27 00 FF 26 00 92 11 92 6B 0C 18 8D 72 04 52 06 00 05 52 04 00 04 18 57 3B 04 52 04 00 00 52 06 00 01

Broken up into the individual nodes:

28 00 26, 27 00 FF, 26 00 92, 11 92 6B 0C, 18 8D 72 04, 52 06 00 05, 52 04 00 04, 18 57 3B 04, 52 04 00 00, 52 06 00 01.

They are organized into the tree below:

MOPP tree with annotated memory jumps

In the picture, byte jumps are annotated in blue. They are written as they are displayed in the byte (so 0C is a jump of 12 bytes). Of course, there is also the other data: the conditions in each node. These are displayed in the picture below, added in green. The nodes used in this example only look at the most significant byte of the position integers.

MOPP tree with annotated memory jumps and conditions

Let's go through with three example positions, which have the following scaled coordinates (scaled as detailed in the scale/input section, and most significant byte only, hexadecimal):

  • 49, 7F, 30 (middle of the plane, higher z than the bounding box of any triangle)
  • 13, EC, 20 (upper left corner of the plane, within the bounding box, slightly higher than z the triangles)
  • 49, 7F, 20 (middle of the plane, within the bounding box,slightly higher z than the triangles)

position 1

  • Going into the first node, this compares the Z position (30) with the upper bound. Since 30 is higher than 26, exit immediately.

position 2

  • Going into the first node, the z position (20) is in between 0 and 26. Proceed to the next node.
  • Going into the next node, the y position (EC) is in between 0 and FF. Proceed to the next node.
  • Going into the next node, the x position (13) in in between 0 and 92. Proceed to the next node.
  • Going into the next node, the y position (EC) is not lower than 92. It is higher than 62, so only jump 0C bytes forward to arrive at the next node.
  • Going into that node, we have to check FE/2 + X/2 - Y/2 (7F + 09 - 76 = 12). This is lower than 57. It is not higher than 3B, but it is lower than 57. Therefore, only go to the next node.
  • This is a 52 node. Return the value 00 04 00 00 (in this case, triangle 0 in the image). From the last two bytes (00 00) we know the index of the triangle (0) in its chunk. From the second byte (04, in bits 0000 0100) we can tell the chunk and the orientation. The second-to-last bit (the third bit in 0100, i.e. 0) tells us it is an anticlockwise triangle (clockwise would be 0000 0110). The preceding bits (0000 01) tell us it is from the first chunk.

position 3

  • Going into the first node, the z position (20) is in between 0 and 26. Proceed to the next node.
  • Going into the next node, the y position (7F) is in between 0 and FF. Proceed to the next node.
  • Going into the next node, the x position (49) in in between 0 and 92. Proceed to the next node.
  • Going into the next node, the y position (7F) is lower than 92. However, it is also higher than 6B. This means we need to go down both branches. We'll go down the <92 one first. That node is immediately after this one.
    • Going into that node, we have to check FE/2 + X/2 - Y/2 (7F + 24 - 3F = 64). This is lower than 80 and not higher than 72, meaning we only have to go to the node right after this one.
    • Going into that node, this is a 52 node. Like before, we can tell the index (00 05), the chunk (the first one) and the orientation (clockwise, in this case).
  • We still have to go down the other branch. This is 0C bytes after the end of the original node (the 11 node).
    • Going into that node, we have to check FE/2 + X/2 - Y/2 (calculated before at 64). This is higher than 3B, but not lower than 57. This means we have to jump 04 bytes to arrive at the next node.
    • This is a 52 node. Like before, we can tell the index (00 01), the chunk (the first one) and the orientation (clockwise, in this case).