Use the IPFS core object to post block data during proposal #178
Conversation
Great start 👍🏼 🚀
types/block.go
Outdated
@@ -259,6 +264,128 @@ func mustPush(rowTree *nmt.NamespacedMerkleTree, id namespace.ID, data []byte) {
	}
}

func (b *Block) PutBlock(ctx context.Context, api ipfsapi.CoreAPI) error {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we pass in a NodeAdder instead of the whole API directly?
It looks like we can even just pass in a pinning NodeAdder: https://github.com/ipfs/interface-go-ipfs-core/blob/b935dfe5375eac7ea3c65b14b3f9a0242861d0b3/dag.go#L12
(which since 0.8.0 seems to use the data store directly instead of coupling it with the DAG - hence it got faster with the last release as far as I understand).
Nice! I didn't notice the Pinning method in the interface, that simplifies things a fair bit. 29244a0
referencing #167 s.t. we know that we are already pinning some of the data.
types/block.go
Outdated
extendedDataSquare, err := rsmt2d.ComputeExtendedDataSquare(shares, rsmt2d.RSGF8, rsmt2d.NewDefaultTree)
if err != nil {
	panic(fmt.Sprintf("unexpected error: %v", err))
}

squareWidth := extendedDataSquare.Width()
originalDataWidth := squareWidth / 2

// add namespaces to erasured shares and chunk into tree sized portions
leaves := make([][][]byte, 2*squareWidth)
for outerIdx := uint(0); outerIdx < squareWidth; outerIdx++ {
	rowLeaves := make([][]byte, squareWidth)
	colLeaves := make([][]byte, squareWidth)
	for innerIdx := uint(0); innerIdx < squareWidth; innerIdx++ {
		if outerIdx < originalDataWidth && innerIdx < originalDataWidth {
			rowShare := namespacedShares[outerIdx*originalDataWidth+innerIdx]
			colShare := namespacedShares[innerIdx*originalDataWidth+outerIdx]
			rowLeaves[innerIdx] = append(rowShare.NamespaceID(), rowShare.Data()...)
			colLeaves[innerIdx] = append(colShare.NamespaceID(), colShare.Data()...)
		} else {
			rowData := extendedDataSquare.Row(outerIdx)
			colData := extendedDataSquare.Column(outerIdx)
			parityCellFromRow := rowData[innerIdx]
			parityCellFromCol := colData[innerIdx]
			rowLeaves[innerIdx] = append(copyOfParityNamespaceID(), parityCellFromRow...)
			colLeaves[innerIdx] = append(copyOfParityNamespaceID(), parityCellFromCol...)
		}
	}
	leaves[outerIdx] = rowLeaves
	leaves[2*outerIdx] = colLeaves
}
Might be orthogonal to this PR but we should either refactor this s.t. this is not duplicated here, or, use only the rsmt2d lib for getting the row/column roots (the latter is preferable if it works) and to collect the ipfs nodes.
Or, we could also just use the NMT + a "node collector" function here directly and skip using DataSquareRowOrColumnRawInputParser.
That is a good idea! I ended up refactoring out the redundant code. 5bfb03f I don't think we can use the rsmt2d libs, because we have to add the namespaces back to the data before generating the roots, but I could be missing something. We could definitely use the node collector in this package instead of DataSquareRowOrColumnRawInputParser, but I didn't for the time being in order not to export nmtNode from the plugin lib.
I don't think we can use the rsmt2d libs, because we have to add the namespaces back to the data before generating the roots,
Alright! Yeah, I knew there was a catch but I couldn't remember what it was. We could add this kind of processing to the rsmt2d lib though: celestiaorg/rsmt2d#12 (comment)
But yeah, this is also completely orthogonal to this PR. Thanks for the refactoring! Looks good so far.
We could definitely use the node collector in this package instead of DataSquarerowOrColumnRawInputParser, but I didn't for the time being in order not to export nmtNode from the plugin lib.
Couldn't we simply export the NodeCollector and keep the nmtNode private?
Good idea. Is this what you had in mind? 47b4125
Yes, that looks good!
func prependNode(newNode node.Node, nodes []node.Node) []node.Node {
	nodes = append(nodes, node.Node(nil))
	copy(nodes[1:], nodes)
	nodes[0] = newNode
	return nodes
}
It is not the most pressing issue but I think @adlerjohn was worried about unnecessary allocations here (voiced in #152 or #155 - can't find the comment rn, there were so many). Now that this is a struct that has the nodes as a field, couldn't this also be a method of NmtNodeCollector, and you could modify that field directly instead of returning the modified node slice here?
This method I believe only avoids allocations if the slice that's being prepended to never has to grow. It looks like n.nodes is reserved with a size of extendedRowOrColumnSize, and should never exceed that, so there should be no allocations. That being said, as @liamsi said it might be a good idea to make this method more tightly linked to NmtNodeCollector to prevent it from being used without this guarantee.
The production code will now use a batch and immediately pass the node object the moment it is created to that batch instead of keeping track of the nodes in a slice (see NmtNodeAdder below). Not sure about the allocs in that batch object but we'd need to pass the nodes there anyway.
I'd be OK with making the NmtNodeCollector a private function again as previously and only expose the batch adder version of it.
I made NmtNodeCollector a private struct c99bc38, but I have kept prependNode the way it is for now until further discussion. The main reason being that we are now only using it for testing purposes, and the second being that I think we should default to the idiomatic approach and optimize later. Given that a slice is kinda a pointer under the hood, my gut is telling me that it (weirdly) might not save on allocations or speed. I'm happy to do a quick fix on this though should we decide it worth it.
The main reason being that we are now only using it for testing purposes, and the second being that I think we should default to the idiomatic approach and optimize later.
Absolutely agree 💯
In case we even go further and rework the entire IPLD representation of NMT nodes, this code will very likely disappear entirely.
P.S. As you can see I am not satisfied with the current solution, but that is ok for now :)
P.S. As you can see I am not satisfied with the current solution, but that is ok for now :)
Can you jot down what you dislike and what you would suggest in an issue? That would be helpful :-)
@@ -32,5 +33,5 @@ var (
	ParitySharesNamespaceID = namespace.ID{0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}

	// change accordingly if another hash.Hash should be used as a base hasher in the NMT:
-	newBaseHashFunc = sha3.New256
+	newBaseHashFunc = sha256.New
ref: #187
types/block.go
Outdated
	format "github.com/ipfs/go-ipld-format"

	"github.com/lazyledger/lazyledger-core/p2p/ipld/plugin/nodes"
From a dependencies perspective, it is probably desirable to not add any networking-related dependencies here to the types package. We can still merge this as is, but we should track this in an issue then.
Note that the main reason to make this thing a method of Block was that we could potentially cache the extended data square as a field and reuse it for putting the block data to ipfs. Maybe we can achieve the same without adding the actual networking logic here. Either way, we can figure this out later.
This is a great point! You're right, PutBlock definitely does not have to be a method of Block. #196
Codecov Report
@@            Coverage Diff             @@
##           master     #178      +/-   ##
==========================================
+ Coverage   60.87%   61.95%   +1.07%
==========================================
  Files         261      258       -3
  Lines       23640    23069     -571
==========================================
- Hits        14392    14292     -100
+ Misses       7759     7291     -468
+ Partials     1489     1486       -3
…d computing the erasure data twice
handle error from posting block data
use the dag instead of the pinningAdder object
don't use the error wrapping directive
review feedback
add testing infrastructure for TestPutBlock
force-pushed from 4d572a6 to 3b52870
fix node collection and commit data to ipfs
remove extra plugin var
unexport node collector
fix missed unexport
update test result, which changed because of the updated hash function
update to the new PutBlock API
linter gods
// TODO(evan): don't hard code context and timeout
if cs.IpfsAPI != nil {
	// longer timeouts result in block proposers failing to propose blocks in time.
	ctx, cancel := context.WithTimeout(context.Background(), time.Millisecond*1500)
There is a special context.TODO() for such cases.
Also, what's the rationale behind having a timeout here?
Nice, I didn't know that. In this case, we need the timeout because PutBlock is synchronous and will take too long, stopping the block proposer from proposing a block in time.
@@ -1100,6 +1103,19 @@ func (cs *State) defaultDecideProposal(height int64, round int32) {
	} else if !cs.replayMode {
		cs.Logger.Error("enterPropose: Error signing proposal", "height", height, "round", round, "err", err)
	}

	// post data to ipfs
	// TODO(evan): don't hard code context and timeout
For this case, we can already pass the context through; having an issue for that seems like overmanagement :)
Co-authored-by: Ismail Khoffi <[email protected]>
I think we need to add the Nmt codec to the cid.Codecs and cid.CodecToStr maps in an init func. We don't have that and only register the MH codec. While this does not break anything for now, it may in the future.
@Wondertan that makes sense. What
@evan-forbes, considering the fact that we have two nodes
It's certainly orthogonal to this PR but it's a small change and if it is a no-brainer, feel free to include it here. That said, we were planning to submit a PR to
The string should probably just be "nmt" or "nmt-node" as that codec is only used to register a block decoder.
@Wondertan yeah, you brought that up in that first LL plugin PR. It's a good point! Feel free to open an issue about it so we have it in our backlog and if you really feel that it would add immediate value, feel free to start working on it (I certainly agree that it feels cleaner - simply not sure if it's worth the effort right now).
Yes. Although, I come from the intention: "if you introduce any codec, pls be kind to register it in your app." :)
I can agree here, but also I can agree that this is a no-brainer. I would compare this with tech debt or following good practices, where one node type = one codec type. We can quickly resolve this in the PR without starting a separate discussion branch; I don't think adding a few lines (two string consts and map puts) is worth that.
@evan-forbes can you please add the
@liamsi, ok, will do
Co-authored-by: Ismail Khoffi <[email protected]>
The data race that shows up in the CI seems unrelated to the changes in this PR.
I'll restart CI now.
I was still trying to debug, and restarted the CI a few times with no luck... Looks like it passed this time! 😬 I'll open an issue regarding the nondeterministic test failures. #216
// iterate through each set of col and row leaves
for _, leafSet := range leaves {
	// create a batch per each leafSet
	batchAdder := nodes.NewNmtNodeAdder(ctx, format.NewBatch(ctx, nodeAdder))
I think it is better to extract a local var from format.NewBatch(ctx, nodeAdder) and use that to commit to that batch.
// register the codecs in the global maps
cid.Codecs[NmtCodecName] = Nmt
cid.CodecToStr[Nmt] = NmtCodecName
Should we panic in case the codec already exists? (e.g. because another project registered 0x7700 for something else)
I left two nits. They can also be addressed in follow-up PRs instead.
Overall, this is amazing work 🚀 Thanks so much @evan-forbes!
Description
Here's a rough draft for using the core IPFS object to post block data while proposing a block. This PR shows some of the difficulties that come with using an IPFS core object. The IPFS object is not initialized until OnStart is called, but we need to get the core object to where the block is being proposed. While hacky, if we simply add a field containing an IPFS core object to the consensus.State struct, then we can fill that field during OnStart and have access to the IPFS object when the block is being proposed.

There is a second problem, which is that the erasure data is not cached. Unless we want to pass the IPFS object all the way to where we first generate the erasure data, we have no way to access it. This PR simply generates the data again, but another option would be to cache the erasured data in the Block struct and use it from there.

As mentioned in #170, there's the option to not use the IPFS coreapi object and instead use some other IPFS client.
closes #185