Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tstate] change map[key] from string to fixed size byte array #170

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

hexfusion
Copy link
Contributor

@hexfusion hexfusion commented May 4, 2023

This is an experimentation with replacing the current internal maps from map[string][]byte to map[[65]byte][]byte. In benchmarks I found that using a byte array was much faster than using string as a key. I chose the length based on the largest keys I saw used by transfervm for warp. As you can see by the profiles this cut out a fairly expensive overhead of mappassign_faststr.

ref. https://gitlab.com/pthevenet/go-small-answers/-/blob/29c03407006a269f8f6ba27fa3123fb9742ddf7c/byte-slice-indexed-maps/benchmark_test.go

This reduces the allocations for FetchAndSetScope and provides much cleaner profile (will upload files soon).

goos: linux
goarch: amd64
pkg: github.com/ava-labs/hypersdk/tstate
cpu: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz
BenchmarkFetchAndSetScope/fetch_and_set_scope_10_keys-12                  239923              4242 ns/op            2402 B/op          3 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_100_keys-12                  35722             37077 ns/op           27892 B/op         10 allocs/op
BenchmarkFetchAndSetScope/fetch_and_set_scope_1000_keys-12                  2472            572810 ns/op          432668 B/op         26 allocs/op
ok  	github.com/ava-labs/hypersdk/tstate	4.474s

before
image

after
image

hexfusion added 3 commits May 4, 2023 16:37
Signed-off-by: Sam Batschelet <[email protected]>
Signed-off-by: Sam Batschelet <[email protected]>
@hexfusion hexfusion temporarily deployed to long-ci May 4, 2023 21:56 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 4, 2023 21:56 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 4, 2023 21:56 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 4, 2023 21:56 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 4, 2023 21:56 — with GitHub Actions Inactive
Signed-off-by: Sam Batschelet <[email protected]>
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 00:45 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 00:45 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 00:45 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 00:45 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 00:45 — with GitHub Actions Inactive
Signed-off-by: Sam Batschelet <[email protected]>
chain/processor.go Outdated Show resolved Hide resolved
tstate/tstate_test.go Outdated Show resolved Hide resolved
tstate/tstate_test.go Outdated Show resolved Hide resolved
Signed-off-by: Sam Batschelet <[email protected]>
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 12:41 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 12:41 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 12:41 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 12:41 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 12:41 — with GitHub Actions Inactive
@patrick-ogrady
Copy link
Contributor

patrick-ogrady commented May 5, 2023

Could you change the 65 here to be a config passed in on init? We should then apply a min filter based on what the HyperSDK stores behind the scenes.

The string conversion I was doing definitely was not optimal...this is probably the right approach.

@patrick-ogrady
Copy link
Contributor

You should make a similar change here:

func (p *Processor) Prefetch(ctx context.Context, db Database) {
ctx, span := p.tracer.Start(ctx, "Processor.Prefetch")
p.db = db
sm := p.blk.vm.StateManager()
go func() {
defer span.End()
// Store required keys for each set
alreadyFetched := make(map[string]*fetchData, len(p.blk.GetTxs()))
for _, tx := range p.blk.GetTxs() {
storage := map[string][]byte{}
for _, k := range tx.StateKeys(sm) {
sk := string(k)
if v, ok := alreadyFetched[sk]; ok {
if v.exists {
storage[sk] = v.v
}
continue
}
v, err := db.GetValue(ctx, k)
if errors.Is(err, database.ErrNotFound) {
alreadyFetched[sk] = &fetchData{nil, false}
continue
} else if err != nil {
panic(err)
}
alreadyFetched[sk] = &fetchData{v, true}
storage[sk] = v
}
p.readyTxs <- &txData{tx, storage}
}
// Let caller know all sets have been readied
close(p.readyTxs)
}()
}

Signed-off-by: Sam Batschelet <[email protected]>
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 14:15 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 14:15 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 14:15 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 14:15 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 14:15 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 16:28 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 16:28 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 16:28 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 16:28 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 16:28 — with GitHub Actions Inactive
hexfusion added 2 commits May 5, 2023 15:50
Signed-off-by: Sam Batschelet <[email protected]>
Signed-off-by: Sam Batschelet <[email protected]>
@hexfusion hexfusion force-pushed the hexfusion/tstate-fixed-size branch from 3b773d1 to d8abb92 Compare May 5, 2023 20:17
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:17 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:17 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:17 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:17 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:17 — with GitHub Actions Inactive
Signed-off-by: Sam Batschelet <[email protected]>
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:20 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:20 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:20 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:20 — with GitHub Actions Inactive
@hexfusion hexfusion temporarily deployed to long-ci May 5, 2023 20:20 — with GitHub Actions Inactive
for _, k := range tx.StateKeys(sm) {
sk := string(k)
sk := tstate.ToStateKeyArray(k)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the best middle-ground is just to use the approach in BenchmarkOptimizedStringKeyed?

https://pthevenet.com/posts/programming/go/bytesliceindexedmaps/

We then get most of the benefit while retaining flexibility 🤷 .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah the flexibility loss isn't worth the gain, working on benchmark a few alternatives including yours, hope to have that soon.

@github-actions
Copy link

This PR has become stale because it has been open for 30 days with no activity. Adding the lifecycle/frozen label will exempt this PR from future lifecycle events..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants