Outstanding SVM design questions for genesis #108

Closed
Tracked by #92
lrettig opened this issue Jan 5, 2022 · 4 comments

@lrettig
Member

lrettig commented Jan 5, 2022

  • serialization
    • should encoding/decoding ("unpacking") of calldata happen in Wasm ("inside the VM") or via host functions?
      • if in Wasm, wouldn't this mean a big chunk of Wasm code (>7kb) that needs to be incorporated into every smart contract?
    • standard serialization library (e.g., SSZ) vs. custom library (vs. fork of/custom changes to standard library)
      • in particular, how to handle variable-length integers, e.g., gasprice/fee (which will usually be very low but may also sometimes need to be very high)
  • memory allocation: in Wasm or via host functions?
    • can we use, e.g., a simple bump allocator in Wasm?
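As a rough illustration of the bump-allocator question above, here is a minimal sketch. The type and method names are hypothetical and it works on plain offsets rather than real Wasm linear memory. It never frees, and `alloc` contains no loops or recursion.

```rust
/// Minimal bump allocator over a fixed region (hypothetical sketch, not SVM code).
/// Offsets are plain integers; a real guest would treat them as linear-memory addresses.
pub struct BumpAllocator {
    next: usize, // offset of the next free byte
    end: usize,  // one past the last usable byte
}

impl BumpAllocator {
    pub fn new(start: usize, size: usize) -> Self {
        BumpAllocator { next: start, end: start + size }
    }

    /// Returns the offset of a block of `size` bytes aligned to `align`
    /// (which must be a power of two), or `None` if the region is exhausted.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        // Round `next` up to the requested alignment.
        let aligned = self.next.checked_add(align - 1)? & !(align - 1);
        let new_next = aligned.checked_add(size)?;
        if new_next > self.end {
            return None;
        }
        self.next = new_next;
        Some(aligned)
    }
}
```

The trade-off is that memory is only reclaimed when the whole region is discarded, which may be acceptable if the region lives for a single transaction.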
@lrettig
Member Author

lrettig commented Jan 5, 2022

CC @noamnelke @neysofu

@neysofu

neysofu commented Jan 13, 2022

During recent R&D calls, we've discussed significant changes to the SVM for genesis. These changes are viable, but we still haven't assessed in great detail what impact these changes will have on the SVM roadmap. This is what we're talking about:

  1. Use SSZ for transaction encoding. The current custom encoding is also used to store templates' sections and transaction execution receipts, and we would likely want to use SSZ for these use cases as well: this would allow us to completely discard the old encoding format, which is highly desirable.
  2. Use SSZ for template sections and transaction execution receipts. See (1).
  3. Replace our WebAssembly ABI with SSZ. This point is quite delicate. Unlike (1) and (2), the WebAssembly ABI is subject to more constraints and stringent requirements compared to e.g. the transaction encoding. In particular, the encoding that we shall use for the WebAssembly ABI must satisfy the following properties:
    • It must be usable under Fixed-Gas (no recursion and no loops). In some cases, we can get around this by unrolling loops up to a certain depth and disallowing e.g. structures nested more than N levels deep. This most likely requires writing a custom WebAssembly library that can decode SSZ under Fixed-Gas.
    • The code size of said library must not be too large. A large SSZ library results in smart contracts that are hard to audit and discourages the use of said library in user-written templates. [1]
  4. Update the SDK to output "method selectors" integers, rather than endpoints' names. This requires changing examples, documentation, tests, as well as production code.
  5. Change allocation patterns within smart contracts. It seems like we want to remove the svm_alloc host call and we'll leave memory allocation to happen completely within userspace.
  6. Use zero-copy patterns in the WebAssembly host calls' API. This requires changes to the memory addressing modes of several WebAssembly host calls. (I remember Tal mentioning that we could use the upper bits of 32-bit addresses to specify whether a memory address refers to WebAssembly memory or to transaction data. This is not final, but it can work.)
  7. Changes to verify. There are way too many proposed changes to verify to list them all here, but they all require significant changes to the code that we have. We obviously want very thorough tests for verify, so all changes will have to be tested as well. Depends on:
  8. Generalized nonce schemes. See https://community.spacemesh.io/t/nonce-schemes-and-account-unification/202. This must all be implemented from scratch. Depends on:
  9. Expose transaction data to WebAssembly logic. Since transactions are not constant in size, dynamic allocation via svm_alloc can be useful here. See also SVM Raw Transaction Ranges SMIPS#69 for in-depth discussion about possible ways to approach the zero-copy problem. My personal suggestion is to rely on svm_alloc to copy some memory back and forth between the host and Wasm logic.
  10. Update the SDK and all SVM toolchain (e.g. the CLI) to reflect the changes to host calls, ABI encoding, verify, allocation, etc. This is mostly "glue" work and must be addressed incrementally while making all necessary changes.
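To make the addressing idea in (6) a bit more concrete, here is a minimal sketch of tagging a 32-bit address with its region. The layout is an assumption (top bit selects the region, low 31 bits hold the offset), and all names are hypothetical; nothing here is a final design.

```rust
/// Which region a tagged 32-bit address refers to (hypothetical names).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Region {
    WasmMemory,
    TxData,
}

/// Assumed layout: the top bit selects the region, the low 31 bits are the offset.
const TX_DATA_TAG: u32 = 1 << 31;
const OFFSET_MASK: u32 = !TX_DATA_TAG;

/// Builds a tagged address, or returns `None` if the offset needs more than 31 bits.
pub fn tag_address(region: Region, offset: u32) -> Option<u32> {
    if offset & TX_DATA_TAG != 0 {
        return None;
    }
    Some(match region {
        Region::WasmMemory => offset,
        Region::TxData => offset | TX_DATA_TAG,
    })
}

/// Splits a tagged address back into its region and offset.
pub fn untag_address(addr: u32) -> (Region, u32) {
    let region = if addr & TX_DATA_TAG != 0 {
        Region::TxData
    } else {
        Region::WasmMemory
    };
    (region, addr & OFFSET_MASK)
}
```

With this layout each region can address at most 2 GiB, since one of the 32 bits is spent on the tag.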

I will update this comment as more things come to my mind.

Footnotes

  1. At genesis we won't have user-written templates, so this only becomes a concern once we allow custom templates.

@lrettig
Member Author

lrettig commented Feb 24, 2022

Here's my recollection of where Filippo and I left things a few days ago:

Use SSZ for transaction encoding
Replace our WebAssembly ABI with SSZ

See this spreadsheet for proposed tx syntax. There's still some debate about this. The VM itself only needs to read the first two fields (20 byte principal, 1 byte method selector), which are the first few bytes of the tx and are fixed-length. Everything else is handled inside the principal contract, so in theory it could use a custom codec, or none at all (i.e., just fixed-length values). The problem with not having a single, standard encoding and ABI is that it makes creating transactions harder: any application or client that wants to interact with a smart contract would have to know not only its "ABI" (the list of endpoints and their signatures) but would also need a custom codec per contract, which sounds infeasible. And, when we add cross-contract calls, the calling endpoint/contract would also have to know the codec of the callee, which also sounds unworkable.
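For illustration, reading the fixed-length prefix could look roughly like the sketch below. It assumes the principal comes first and the method selector is the byte right after it; the type and function names are hypothetical, and the rest of the tx is left as opaque bytes for the principal contract to interpret.

```rust
pub const PRINCIPAL_LEN: usize = 20;
pub const HEADER_LEN: usize = PRINCIPAL_LEN + 1;

/// The part of a tx the VM itself reads (hypothetical type, assumed field order).
#[derive(Debug, PartialEq, Eq)]
pub struct TxHeader<'a> {
    pub principal: [u8; PRINCIPAL_LEN],
    pub method_selector: u8,
    /// Everything after the header; not interpreted here.
    pub payload: &'a [u8],
}

/// Splits a raw tx into its fixed-length header and the remaining payload.
/// Returns `None` if the tx is shorter than the header.
pub fn parse_header<'a>(tx: &'a [u8]) -> Option<TxHeader<'a>> {
    if tx.len() < HEADER_LEN {
        return None;
    }
    let mut principal = [0u8; PRINCIPAL_LEN];
    principal.copy_from_slice(&tx[..PRINCIPAL_LEN]);
    Some(TxHeader {
        principal,
        method_selector: tx[PRINCIPAL_LEN],
        payload: &tx[HEADER_LEN..],
    })
}
```

Because both fields are fixed-length, this step needs no codec at all; the encoding debate only concerns the payload.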

@noamnelke feel free to disagree/add thoughts.

Use SSZ for template sections and transaction execution receipts

Not much to add here. This is done in a custom way today and it should be relatively straightforward to use SSZ instead.

Update the SDK to output "method selectors" integers, rather than endpoints' names

This also feels relatively straightforward. We have one byte reserved in each tx for a method selector. If we set aside a few of these values for "special" methods (deploy, self spawn, etc.), that still leaves ~240-250 possible endpoints per contract, which is enough. The SVM SDK, when it parses template code, will assign ints to the endpoint symbols. We need to make sure this happens deterministically across all nodes and all platforms. Methods should already be listed in the same order in the Wasm bytecode - we can check how the Wasmer tooling does this and maybe rely on this.
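One way to get deterministic assignment, sketched below, is to sort endpoint names byte-wise and number them after a reserved range. Note that this is a different technique from relying on the order in the Wasm bytecode, and the reserved count of 8 and all names are placeholders.

```rust
use std::collections::BTreeMap;

/// Hypothetical number of selector values held back for special methods.
pub const RESERVED_SELECTORS: u8 = 8;

/// Assigns one-byte selectors to endpoint names, independent of input order.
/// Returns `None` if there are more endpoints than free selector values.
pub fn assign_selectors(endpoints: &[&str]) -> Option<BTreeMap<String, u8>> {
    let mut names: Vec<&str> = endpoints.to_vec();
    // Byte-wise ordering of `str` is the same on every platform.
    names.sort_unstable();
    names.dedup();

    let capacity = 256 - RESERVED_SELECTORS as usize;
    if names.len() > capacity {
        return None;
    }
    Some(
        names
            .into_iter()
            .enumerate()
            .map(|(i, name)| (name.to_string(), RESERVED_SELECTORS + i as u8))
            .collect(),
    )
}
```

A downside of sorting is that adding a new endpoint can shift the selectors of existing ones, so a template upgrade path would need to account for that.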

Change allocation patterns within smart contracts

It seems easier to allow the host to handle memory allocation for now. This does not preclude individual smart contracts from requesting memory from the host and managing it in a custom fashion. This also makes, e.g., a ssz_decode host function easier since the host can alloc the memory, write the result, and just pass back a pointer (otherwise it would require more back and forth).
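A rough sketch of that "host allocates, writes, returns a pointer" shape follows. `GuestMemory` is a hypothetical stand-in for the guest's linear memory as seen from the host, and the decode step itself is out of scope: the function just takes the already-produced result bytes.

```rust
/// Stand-in for a guest's linear memory as seen by the host (hypothetical).
pub struct GuestMemory {
    bytes: Vec<u8>,
    next: usize, // next free offset; the host allocates by bumping it
}

impl GuestMemory {
    pub fn new(size: usize) -> Self {
        GuestMemory { bytes: vec![0; size], next: 0 }
    }

    /// Host-side allocation: reserves `len` bytes and returns their offset.
    pub fn host_alloc(&mut self, len: usize) -> Option<u32> {
        let end = self.next.checked_add(len)?;
        if end > self.bytes.len() || end > u32::MAX as usize {
            return None;
        }
        let ptr = self.next as u32;
        self.next = end;
        Some(ptr)
    }

    /// Shape of a decode-style host call: allocate, write the result,
    /// and hand back `(pointer, length)` in a single round trip.
    pub fn host_write_result(&mut self, result: &[u8]) -> Option<(u32, u32)> {
        let ptr = self.host_alloc(result.len())?;
        let start = ptr as usize;
        self.bytes[start..start + result.len()].copy_from_slice(result);
        Some((ptr, result.len() as u32))
    }

    /// What the guest would do with the returned pointer and length.
    pub fn read(&self, ptr: u32, len: u32) -> Option<&[u8]> {
        let start = ptr as usize;
        let end = start.checked_add(len as usize)?;
        self.bytes.get(start..end)
    }
}
```

If allocation lived in the guest instead, the same call would need an extra step in which the guest learns the result size, allocates, and then asks the host to write.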

Use zero-copy patterns in the WebAssembly host calls' API

This seems tricky and infeasible for genesis. It should be left as an open research task that we can pursue later.

Changes to verify

Here's a rough sketch of how verify works at a low level. Will flesh this out in a separate issue:

  • immediately allocates 64 bytes: half for pointer, half for length
  • immediately calls host fn and passes in pointer to this memory
  • host interprets as pointer to 64 byte value, allocates memory for tx data, returns this pointer + the length
  • hash the tx (up to signature) using incremental hash fn
    • svm_hash_init (do we need this? probably not assuming we only allow one hash at a time.)
    • svm_hash(pointer, length) -> () doesn't return anything, updates hash state
    • svm_hash_finalize(pointer-to-32-byte-memloc) -> (), cannot fail?
  • fetch pub key from global state
    • pub_key <- svm_load256(var_id, section_id)
  • svm_sig_verify
  • call each of svm_set_nonce, svm_set_maxgas, svm_set_gasprice, svm_set_maxspend once
  • SDK has to do gas metering and cut off verify (return failure) if it doesn't finish/succeed/checkpoint after a certain amount of gas spending - does verify() care about the account balance?
  • return valid
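The hash, key-load, and signature steps of this sketch could be wired together roughly as below. The `Host` trait mirrors the host calls named above, but its signatures are assumptions, as are the trailing-signature layout and the `(0, 0)` variable location. `ToyHost` is a test stand-in and is NOT cryptography. The setter calls and the gas cutoff are left out.

```rust
/// Host calls used by the sketch. Signatures are assumptions.
pub trait Host {
    fn hash_update(&mut self, data: &[u8]); // cf. svm_hash
    fn hash_finalize(&mut self) -> [u8; 32]; // cf. svm_hash_finalize
    fn load256(&self, var_id: u32, section_id: u32) -> [u8; 32]; // cf. svm_load256
    fn sig_verify(&self, pub_key: &[u8; 32], digest: &[u8; 32], sig: &[u8]) -> bool;
}

/// Hashes the tx up to the signature, loads the public key, checks the signature.
/// Assumes the signature is the last `sig_len` bytes of the tx.
pub fn verify<H: Host>(host: &mut H, tx: &[u8], sig_len: usize) -> bool {
    if tx.len() < sig_len {
        return false;
    }
    let (body, sig) = tx.split_at(tx.len() - sig_len);
    host.hash_update(body);
    let digest = host.hash_finalize();
    let pub_key = host.load256(0, 0); // hypothetical variable location
    host.sig_verify(&pub_key, &digest, sig)
}

/// Toy host: the "hash" XOR-folds bytes into 32 slots and a valid
/// "signature" is digest XOR public key. For exercising the flow only.
pub struct ToyHost {
    state: [u8; 32],
    pos: usize,
    pub_key: [u8; 32],
}

impl ToyHost {
    pub fn new(pub_key: [u8; 32]) -> Self {
        ToyHost { state: [0; 32], pos: 0, pub_key }
    }
}

impl Host for ToyHost {
    fn hash_update(&mut self, data: &[u8]) {
        for &b in data {
            self.state[self.pos % 32] ^= b;
            self.pos += 1;
        }
    }
    fn hash_finalize(&mut self) -> [u8; 32] {
        let out = self.state;
        self.state = [0; 32]; // reset so the host can hash again
        self.pos = 0;
        out
    }
    fn load256(&self, _var_id: u32, _section_id: u32) -> [u8; 32] {
        self.pub_key
    }
    fn sig_verify(&self, pub_key: &[u8; 32], digest: &[u8; 32], sig: &[u8]) -> bool {
        sig.len() == 32 && (0..32).all(|i| sig[i] == digest[i] ^ pub_key[i])
    }
}
```

Keeping the hash state on the host side, as here, matches the "one hash at a time" assumption in the sketch, which is what would make an explicit init call unnecessary.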

Generalized nonce schemes

I think we still have some outstanding questions here (CC @noamnelke). It seems that nonce checking could happen in one of two places, in the node or in the VM. If it happens in the node, then the node needs to call into the VM (via spacemeshos/SMIPS#80) to get the nonce/cmax value, and the comparison/sorting/validity logic can be implemented in the node. Questions:

  • does the VM need to check the nonce when executing a tx, or does it trust the node that it already did?
  • if yes, how do we implement this in the VM? (It can't happen in verify since it's not context-aware and cannot depend on account nonce or balance)

Expose transaction data to WebAssembly logic

nothing to add

Update the SDK and all SVM toolchain (e.g. the CLI) to reflect the changes to host calls, ABI encoding, verify, allocation, etc.

nothing to add

One more thing that's still outstanding: we still need to finalize how verify() works, top to bottom (see above sketch), and do the same for the other handler methods, handle() (high-level, entry-point VM handler that calls verify() and execute()) and execute() (which executes a transaction in situ).

@lrettig
Member Author

lrettig commented May 29, 2023

Closing as outdated

@lrettig lrettig closed this as completed May 29, 2023
@github-project-automation github-project-automation bot moved this from Paused to Spec available (on-going review) in Research+Sandwich work pipeline May 29, 2023
@github-project-automation github-project-automation bot moved this from Research to WIP SMIP (comment) in Old research pipeline May 29, 2023