Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(proto): reduce Value's size by one-word #522

Merged
merged 1 commit into from
Feb 28, 2025

Conversation

muktihari
Copy link
Owner

In general, I value simplicity over cleverness. However, in this case, I trade some simplicity for a bit of cleverness to gain a considerable performance boost. Here is my justification:

In the previous implementation, we need one more 1-byte field, typ, just for storing a type, resulting one-word penalty for every single Value we have. In 64-bit machine, one-word is 8 bytes, so the size of the Value will be 8 + 8 + (1 + 7 padding byte) = 24 bytes. If we use this SDK to handle a really large fit files, such as from a ultra-long-run or ultra-triathlon race, the one-word penalty will be significantly multiplied as we have way more messages. And imagine when this SDK is used to handle multiple fit files at once.

// Struct before, size 24 bytes.
type Value struct { 
 	_   [0]func()
 	num uint64
 	ptr unsafe.Pointer 
 	typ Type // <- Solely for storing the type
 } 

We only have 25 types for Value (extremely unlikely that we ever need to add more types), the nearest bits to hold such value is 5 bits (can hold up to 31 number). It's not even 1 byte (8 bits) yet we "waste" 8 bytes for it due to padding. So I'm thinking if we can squeeze the type identifier into num field instead for a slice and use pointer as an identifier for a numeric value, so we can reduce Value's size from 24 bytes to 16 bytes in 64-bit machine.

I take a measurement from two perspectives to ensure that the changes will be reasonable and safe:

  1. From safety perspective, there are two things to consider:

    1. From the FIT spec, the maximum size of value in FIT file is only 255 so we actually need only 1 byte to store slice's len yet we have uint64 (8 bytes) to hold it, slipping 1 byte in it to store a type should be okay.
    2. From the language side (Go), there is also a limit of how many item we can store in a slice. Try run this https://go.dev/play/p/0NfG6S8XumE and it will panic (panic: runtime error: makeslice: len out of range). So even if people try to use Value to hold more than 255 bytes value, this can still hold huge number of items before the compiler will stop them from doing it. So again, reducing 1 byte from num field is okay.
  2. From code complexity perspective, the change is considerably minimal, only some bits manipulation over here and there. We use bits manipulation everywhere anyway since we handle encoding.

The result of the new Value struct with a small additional change:

// Struct after, size 16 bytes.
type Value struct { 
 	num uint64
 	ptr unsafe.Pointer 
 } 

You might notice that now comparing two value is allowed. My justification is having two same values pointing to the same pointer should be an identical value.

Benchmark

The goal of this change is to enhance performance with the trade-off being more complex than before, so without a significant gain I wouldn't open this PR.

We have fair amount of improvement, especially in memory usage. It seems like there is +6% CPU overhead for encoding protocol messages, however, it doesn't take a consideration of the creation of messages and holding it in memory, so this is actually negligible.

goos: linux; goarch: amd64; pkg: benchfit
cpu: Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz
                              old.txt                   new.txt                
                               sec/op        sec/op     vs base                
Decode/muktihari/fit_raw-4   131.7m ±  4%   122.1m ± 10%        ~ (p=0.105 n=10)
Decode/muktihari/fit-4       107.0m ± 13%   100.2m ± 12%   -6.32% (p=0.029 n=10)
Encode/muktihari/fit_raw-4   46.90m ±  2%   49.83m ±  8%   +6.25% (p=0.000 n=10)
Encode/muktihari/fit-4       126.3m ±  2%   112.4m ±  3%  -10.99% (p=0.000 n=10)
geomean                      95.57m         90.99m         -4.80%

                              old.txt                   new.txt                
                                B/op          B/op      vs base                
Decode/muktihari/fit_raw-4   73.52Mi ± 0%   64.36Mi ± 0%  -12.46% (p=0.000 n=10)
Decode/muktihari/fit-4       41.02Mi ± 0%   41.00Mi ± 0%   -0.05% (p=0.000 n=10)
Encode/muktihari/fit_raw-4   7.525Ki ± 0%   7.525Ki ± 0%        ~ (p=1.000 n=10)
Encode/muktihari/fit-4       41.97Mi ± 0%   32.82Mi ± 0%  -21.81% (p=0.000 n=10)
geomean                      5.523Mi        5.023Mi        -9.05%

                              old.txt                  new.txt                
                             allocs/op    allocs/op   vs base                 
Decode/muktihari/fit_raw-4   100.0k ± 0%   100.0k ± 0%       ~ (p=1.000 n=10)
Decode/muktihari/fit-4       100.2k ± 0%   100.2k ± 0%       ~ (p=0.547 n=10)
Encode/muktihari/fit_raw-4    15.00 ± 0%    15.00 ± 0%       ~ (p=1.000 n=10) ¹
Encode/muktihari/fit-4       100.0k ± 0%   100.0k ± 0%       ~ (p=1.000 n=10) ¹
geomean                      11.07k        11.07k       -0.00%
¹ all samples are equal

@muktihari muktihari added the performance Changes related to performance improvement label Feb 28, 2025
@muktihari muktihari self-assigned this Feb 28, 2025
@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (eea67ce) to head (63b65f6).

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #522   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           42        42           
  Lines         4641      4645    +4     
=========================================
+ Hits          4641      4645    +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@muktihari muktihari added this to the v0.24.6 milestone Feb 28, 2025
@muktihari muktihari merged commit 0ccfe28 into master Feb 28, 2025
5 checks passed
@muktihari muktihari deleted the perf/proto-value-reduce-size-by-one-word branch February 28, 2025 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Changes related to performance improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants