perf(proto): reduce Value's size by one-word #522

muktihari · 2025-02-28T12:18:23Z

In general, I value simplicity over cleverness. However, in this case, I trade some simplicity for a bit of cleverness to gain a considerable performance boost. Here is my justification:

In the previous implementation, we need one more 1-byte field, typ, just for storing a type, resulting one-word penalty for every single Value we have. In 64-bit machine, one-word is 8 bytes, so the size of the Value will be 8 + 8 + (1 + 7 padding byte) = 24 bytes. If we use this SDK to handle a really large fit files, such as from a ultra-long-run or ultra-triathlon race, the one-word penalty will be significantly multiplied as we have way more messages. And imagine when this SDK is used to handle multiple fit files at once.

// Struct before, size 24 bytes.
type Value struct { 
 	_   [0]func()
 	num uint64
 	ptr unsafe.Pointer 
 	typ Type // <- Solely for storing the type
 }

We only have 25 types for Value (extremely unlikely that we ever need to add more types), the nearest bits to hold such value is 5 bits (can hold up to 31 number). It's not even 1 byte (8 bits) yet we "waste" 8 bytes for it due to padding. So I'm thinking if we can squeeze the type identifier into num field instead for a slice and use pointer as an identifier for a numeric value, so we can reduce Value's size from 24 bytes to 16 bytes in 64-bit machine.

I take a measurement from two perspectives to ensure that the changes will be reasonable and safe:

From safety perspective, there are two things to consider:
1. From the FIT spec, the maximum size of value in FIT file is only 255 so we actually need only 1 byte to store slice's len yet we have uint64 (8 bytes) to hold it, slipping 1 byte in it to store a type should be okay.
2. From the language side (Go), there is also a limit of how many item we can store in a slice. Try run this https://go.dev/play/p/0NfG6S8XumE and it will panic (panic: runtime error: makeslice: len out of range). So even if people try to use Value to hold more than 255 bytes value, this can still hold huge number of items before the compiler will stop them from doing it. So again, reducing 1 byte from num field is okay.
From code complexity perspective, the change is considerably minimal, only some bits manipulation over here and there. We use bits manipulation everywhere anyway since we handle encoding.

The result of the new Value struct with a small additional change:

// Struct after, size 16 bytes.
type Value struct { 
 	num uint64
 	ptr unsafe.Pointer 
 }

You might notice that now comparing two value is allowed. My justification is having two same values pointing to the same pointer should be an identical value.

Benchmark

The goal of this change is to enhance performance with the trade-off being more complex than before, so without a significant gain I wouldn't open this PR.

We have fair amount of improvement, especially in memory usage. It seems like there is +6% CPU overhead for encoding protocol messages, however, it doesn't take a consideration of the creation of messages and holding it in memory, so this is actually negligible.

goos: linux; goarch: amd64; pkg: benchfit
cpu: Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz
                           │   old.txt    │               new.txt                │
                           │    sec/op    │    sec/op     vs base                │
Decode/muktihari/fit_raw-4   131.7m ±  4%   122.1m ± 10%        ~ (p=0.105 n=10)
Decode/muktihari/fit-4       107.0m ± 13%   100.2m ± 12%   -6.32% (p=0.029 n=10)
Encode/muktihari/fit_raw-4   46.90m ±  2%   49.83m ±  8%   +6.25% (p=0.000 n=10)
Encode/muktihari/fit-4       126.3m ±  2%   112.4m ±  3%  -10.99% (p=0.000 n=10)
geomean                      95.57m         90.99m         -4.80%

                           │   old.txt    │               new.txt                │
                           │     B/op     │     B/op      vs base                │
Decode/muktihari/fit_raw-4   73.52Mi ± 0%   64.36Mi ± 0%  -12.46% (p=0.000 n=10)
Decode/muktihari/fit-4       41.02Mi ± 0%   41.00Mi ± 0%   -0.05% (p=0.000 n=10)
Encode/muktihari/fit_raw-4   7.525Ki ± 0%   7.525Ki ± 0%        ~ (p=1.000 n=10)
Encode/muktihari/fit-4       41.97Mi ± 0%   32.82Mi ± 0%  -21.81% (p=0.000 n=10)
geomean                      5.523Mi        5.023Mi        -9.05%

                           │   old.txt   │               new.txt                │
                           │  allocs/op  │  allocs/op   vs base                 │
Decode/muktihari/fit_raw-4   100.0k ± 0%   100.0k ± 0%       ~ (p=1.000 n=10)
Decode/muktihari/fit-4       100.2k ± 0%   100.2k ± 0%       ~ (p=0.547 n=10)
Encode/muktihari/fit_raw-4    15.00 ± 0%    15.00 ± 0%       ~ (p=1.000 n=10) ¹
Encode/muktihari/fit-4       100.0k ± 0%   100.0k ± 0%       ~ (p=1.000 n=10) ¹
geomean                      11.07k        11.07k       -0.00%
¹ all samples are equal

codecov-commenter · 2025-02-28T12:19:33Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (eea67ce) to head (63b65f6).

Additional details and impacted files

@@            Coverage Diff            @@
##            master      #522   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           42        42           
  Lines         4641      4645    +4     
=========================================
+ Hits          4641      4645    +4

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

proto/value.go

perf(proto): reduce Value's size by one-word

63b65f6

muktihari added the performance Changes related to performance improvement label Feb 28, 2025

muktihari self-assigned this Feb 28, 2025

github-advanced-security bot found potential problems Feb 28, 2025

View reviewed changes

proto/value.go Dismissed Show dismissed Hide dismissed

muktihari added this to the v0.24.6 milestone Feb 28, 2025

muktihari merged commit 0ccfe28 into master Feb 28, 2025
5 checks passed

muktihari deleted the perf/proto-value-reduce-size-by-one-word branch February 28, 2025 12:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(proto): reduce Value's size by one-word #522

perf(proto): reduce Value's size by one-word #522

muktihari commented Feb 28, 2025

codecov-commenter commented Feb 28, 2025

perf(proto): reduce Value's size by one-word #522

perf(proto): reduce Value's size by one-word #522

Conversation

muktihari commented Feb 28, 2025

Benchmark

codecov-commenter commented Feb 28, 2025

Codecov Report