[Storage Refactor] Refactor Approvals #6868

Merged
merged 11 commits into master from leo/storage-refactor-approvals
Feb 5, 2025

Conversation

Member

@zhangchiqing zhangchiqing commented Jan 10, 2025

This PR refactors the approvals storage from badger transactions to badger batch updates, in preparation for switching from badger to pebble.

Referrals:
#6381
#6466
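
For readers unfamiliar with the pattern, here is a minimal sketch of what writing storage operations against a batch can look like. It is an illustration under stated assumptions, not the code in this PR: the `Writer` interface, `memDB`, `batch`, `insertApproval`, and the `approval/` key prefix are all hypothetical, and the in-memory map stands in for badger or pebble without modeling their actual semantics.

```go
package main

import (
	"fmt"
	"sync"
)

// Writer is a hypothetical backend-agnostic write interface. Operations that
// depend only on it do not need to know which database sits underneath.
type Writer interface {
	Set(key, value []byte) error
}

// memDB is an in-memory stand-in for the real database.
type memDB struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func newMemDB() *memDB {
	return &memDB{data: make(map[string][]byte)}
}

// Get reads a committed value.
func (db *memDB) Get(key []byte) ([]byte, bool) {
	db.mu.RLock()
	defer db.mu.RUnlock()
	v, ok := db.data[string(key)]
	return v, ok
}

// batch buffers writes and applies them to the database on Commit.
type batch struct {
	db      *memDB
	pending map[string][]byte
}

// NewBatch creates an empty batch bound to this database.
func (db *memDB) NewBatch() *batch {
	return &batch{db: db, pending: make(map[string][]byte)}
}

// Set records a write in the batch; nothing is visible to readers yet.
func (b *batch) Set(key, value []byte) error {
	b.pending[string(key)] = append([]byte(nil), value...)
	return nil
}

// Commit applies all buffered writes while holding the write lock.
func (b *batch) Commit() {
	b.db.mu.Lock()
	defer b.db.mu.Unlock()
	for k, v := range b.pending {
		b.db.data[k] = v
	}
	b.pending = make(map[string][]byte)
}

// insertApproval is a hypothetical storage operation written against Writer.
func insertApproval(w Writer, approvalID string, payload []byte) error {
	return w.Set([]byte("approval/"+approvalID), payload)
}

func main() {
	db := newMemDB()
	b := db.NewBatch()
	if err := insertApproval(b, "abc", []byte("payload")); err != nil {
		panic(err)
	}
	_, visible := db.Get([]byte("approval/abc"))
	fmt.Println("visible before commit:", visible)
	b.Commit()
	_, visible = db.Get([]byte("approval/abc"))
	fmt.Println("visible after commit:", visible)
}
```

The point of the sketch is only that `insertApproval` never touches a database-specific type, which is one way a later backend swap can stay local to the batch implementation.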

@zhangchiqing zhangchiqing changed the base branch from master to leo/db-ops-dbstore January 10, 2025 18:38
@zhangchiqing zhangchiqing force-pushed the leo/storage-refactor-approvals branch from 43bfcd3 to cc06a44 Compare January 10, 2025 18:41
Base automatically changed from leo/db-ops-dbstore to master January 13, 2025 19:55
@zhangchiqing zhangchiqing force-pushed the leo/storage-refactor-approvals branch from 90051e1 to a4bfe1f Compare January 15, 2025 16:24
@zhangchiqing zhangchiqing marked this pull request as ready for review January 15, 2025 16:24
@zhangchiqing zhangchiqing requested a review from a team as a code owner January 15, 2025 16:24

codecov-commenter commented Jan 15, 2025

Codecov Report

Attention: Patch coverage is 67.14286% with 23 lines in your changes missing coverage. Please review.

Project coverage is 41.16%. Comparing base (914f353) to head (5c6f00c).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| storage/store/approvals.go | 77.04% | 10 Missing and 4 partials ⚠️ |
| storage/operation/approvals.go | 0.00% | 8 Missing ⚠️ |
| engine/testutil/nodes.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6868      +/-   ##
==========================================
+ Coverage   41.10%   41.16%   +0.05%     
==========================================
  Files        2127     2129       +2     
  Lines      186321   186390      +69     
==========================================
+ Hits        76586    76725     +139     
+ Misses     103308   103237      -71     
- Partials     6427     6428       +1     
| Flag | Coverage Δ |
|---|---|
| unittests | 41.16% <67.14%> (+0.05%) ⬆️ |

Contributor

@peterargue peterargue left a comment

added a few comments about docs, but otherwise looks good.

return fmt.Errorf("could not lookup result approval ID: %w", err)
}

// no approval found, index the approval
Contributor

should this go after the error conditional completes on line 79?

Member Author

No, line 79 is where we have found an approval already indexed by the chunk. In that case, we need to check whether the stored one is the same as the new one, and we should not store the new one if they differ, because we should not allow a different approval for the same chunk.
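
The rule described in this reply can be sketched roughly as follows. This is a self-contained illustration, not the code from storage/store/approvals.go: `Identifier`, `chunkKey`, `approvalIndex`, and `ErrConflictingApproval` are hypothetical names, and an in-memory map guarded by a mutex stands in for the database index.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Identifier is a hypothetical stand-in for flow.Identifier.
type Identifier [32]byte

// chunkKey identifies a chunk by its result ID and chunk index.
type chunkKey struct {
	resultID   Identifier
	chunkIndex uint64
}

// ErrConflictingApproval is a hypothetical sentinel returned when a different
// approval is already indexed for the same chunk.
var ErrConflictingApproval = errors.New("different approval already indexed for chunk")

// approvalIndex maps each chunk to the ID of its single approval.
type approvalIndex struct {
	mu      sync.Mutex
	byChunk map[chunkKey]Identifier
}

func newApprovalIndex() *approvalIndex {
	return &approvalIndex{byChunk: make(map[chunkKey]Identifier)}
}

// Index stores approvalID for the chunk if nothing is indexed yet. If an
// approval is already indexed, it is compared with the new one: the same ID
// is a no-op, a different ID is rejected and the stored value is kept.
func (a *approvalIndex) Index(resultID Identifier, chunkIndex uint64, approvalID Identifier) error {
	a.mu.Lock()
	defer a.mu.Unlock()

	key := chunkKey{resultID: resultID, chunkIndex: chunkIndex}
	stored, found := a.byChunk[key]
	if !found {
		// no approval found, index the approval
		a.byChunk[key] = approvalID
		return nil
	}
	if stored != approvalID {
		return fmt.Errorf("chunk %d: stored %x, new %x: %w",
			chunkIndex, stored[:4], approvalID[:4], ErrConflictingApproval)
	}
	return nil
}

func main() {
	idx := newApprovalIndex()
	result := Identifier{1}

	fmt.Println("first index:", idx.Index(result, 0, Identifier{0xa}))
	fmt.Println("same approval again:", idx.Index(result, 0, Identifier{0xa}))
	err := idx.Index(result, 0, Identifier{0xb})
	fmt.Println("conflict rejected:", errors.Is(err, ErrConflictingApproval))
}
```

Holding the lock across the lookup and the write is what keeps two concurrent callers from both observing "not found" and indexing different approvals for the same chunk.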


type Cache[K comparable, V any] struct {
metrics module.CacheMetrics
// nolint:unused
Contributor

is this needed? looks like limit is used

"github.com/onflow/flow-go/storage"
)

// nolint:unused
Contributor

do we need all of these nolint directives? seems like they will become out of date quickly.

}

// Get will try to retrieve the resource from cache first, and then from the
// injected. During normal operations, the following error returns are expected:
Contributor

what's the injected referring to here?

}

// Get will try to retrieve the resource from cache first, and then from the
// injected. During normal operations, the following error returns are expected:
Contributor

nit: it's easier to find the error part when it starts on a new line

Suggested change
- // injected. During normal operations, the following error returns are expected:
+ // injected.
+ // During normal operations, the following error returns are expected:

return resource, nil
}

func (c *Cache[K, V]) Remove(key K) {
Contributor

can you add docs for this method too?

Member

@AlexHentschel AlexHentschel left a comment

Thanks Leo, this all looks very good. Essentially all of my suggestions are regarding documentation and code aesthetics.

storage/operation/approvals.go (outdated, resolved)
storage/operation/approvals.go (resolved)
storage/store/approvals.go (resolved)
withRetrieve(retrieve)),
indexing: new(sync.Mutex),
}

Member

Suggested change

or maybe inline the return statement in line 33 for compactness

storage/store/approvals.go (resolved)
// ResultApprovals store is only used within a verification node, where it is
// assumed that there is never more than one approval per chunk.
func (r *ResultApprovals) ByChunk(resultID flow.Identifier, chunkIndex uint64) (*flow.ResultApproval, error) {
return r.byChunk(r.db.Reader(), resultID, chunkIndex)
Member

would suggest to inline the body of the byChunk method here to avoid unnecessary layers of abstraction. This is the only place where byChunk is used

Member

The test coverage is pretty minimal 😅. I know that store.cache is a generalization, using generics, of existing code that has run for a very long time without problems. Still, if you can spare the time, it would be great to improve the test coverage of at least the mission-critical aspects of the cache:

  1. Tests confirming that storage.ErrNotFound is returned in case of cache misses would be good:
    // Get will try to retrieve the resource from cache first, and then from the
    // injected. During normal operations, the following error returns are expected:
    // - `storage.ErrNotFound` if key is unknown.
    func (c *Cache[K, V]) Get(r storage.Reader, key K) (V, error) {
  2. I think it would be good if we verified that unexpected errors from the [storeFunc](https://github.com/onflow/flow-go/blob/dd1a8d5c13a0b4da1a64759c50df5749fd419d0c/storage/store/cache.go#L22-L27) and retrieveFunc functions are passed up and not accidentally wrapped into storage.ErrNotFound.
  3. It would be nice to have tests that check:
    • if an element is already cached, we don't interact with the database (i.e., retrieve is not called)
    • after a cache miss and then getting an element from the DB, the element is cached
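
As a rough illustration of the checks listed above, here is a self-contained sketch against a deliberately simplified read-through cache. It is not the Cache from storage/store/cache.go: this version omits the storage.Reader argument, metrics, and the size limit, and `NewCache`, the retrieve signature, `ErrNotFound` (standing in for storage.ErrNotFound), and `errUnexpected` are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a stand-in for storage.ErrNotFound.
var ErrNotFound = errors.New("key not found")

// errUnexpected simulates a database failure unrelated to a missing key.
var errUnexpected = errors.New("disk failure")

// Cache is a simplified read-through cache without eviction.
type Cache[K comparable, V any] struct {
	entries  map[K]V
	retrieve func(key K) (V, error)
}

// NewCache builds a cache that falls back to retrieve on a miss.
func NewCache[K comparable, V any](retrieve func(key K) (V, error)) *Cache[K, V] {
	return &Cache[K, V]{entries: make(map[K]V), retrieve: retrieve}
}

// Get returns the cached value if present. Otherwise it calls retrieve and
// caches the result on success. Errors are wrapped with %w, so callers can
// still tell ErrNotFound apart from other failures via errors.Is.
func (c *Cache[K, V]) Get(key K) (V, error) {
	if v, ok := c.entries[key]; ok {
		return v, nil
	}
	v, err := c.retrieve(key)
	if err != nil {
		var zero V
		return zero, fmt.Errorf("could not retrieve resource: %w", err)
	}
	c.entries[key] = v
	return v, nil
}

func main() {
	db := map[string]int{"a": 1}
	dbCalls := 0
	failing := false

	cache := NewCache[string, int](func(key string) (int, error) {
		dbCalls++
		if failing {
			return 0, errUnexpected
		}
		v, ok := db[key]
		if !ok {
			return 0, ErrNotFound
		}
		return v, nil
	})

	// Check 1: an unknown key surfaces ErrNotFound.
	_, err := cache.Get("missing")
	fmt.Println("miss is ErrNotFound:", errors.Is(err, ErrNotFound))

	// Check 3: the first Get reads the database and caches the value,
	// the second Get is served without calling retrieve again.
	_, _ = cache.Get("a")
	before := dbCalls
	_, _ = cache.Get("a")
	fmt.Println("cached read skipped the database:", dbCalls == before)

	// Check 2: an unexpected failure is passed up and is not ErrNotFound.
	failing = true
	_, err = cache.Get("b")
	fmt.Println("unexpected error preserved:",
		errors.Is(err, errUnexpected) && !errors.Is(err, ErrNotFound))
}
```

In an actual test file these checks would more naturally be separate test functions with assertions; counting calls to the injected retrieve function is one simple way to verify that a cached element does not touch the database.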

Member

To me, it looks like you moved the file storage/badger/approvals_test.go and modified it? I think it would be good to also retain the tests for storage/badger/approvals.go, since the old badger version will probably still be in use for quite some time and may even be modified.

storage/store/approvals_test.go (outdated, resolved)
storage/operation/approvals.go (resolved)
Member

AlexHentschel commented Feb 4, 2025

Hey, inspired by your PR, I extended the documentation of storage.ResultApprovals 👉 PR #6976. If you could take a look, that would be great; it should be quick (documentation changes only). Thanks

@zhangchiqing zhangchiqing force-pushed the leo/storage-refactor-approvals branch from 8c82715 to 5c6f00c Compare February 5, 2025 18:41
@zhangchiqing zhangchiqing added this pull request to the merge queue Feb 5, 2025
Merged via the queue into master with commit 0aae073 Feb 5, 2025
56 checks passed
@zhangchiqing zhangchiqing deleted the leo/storage-refactor-approvals branch February 5, 2025 19:20
4 participants