Initial Fabric Private Chaincode RFC #21

mbrandenburger · 2020-02-19T13:18:20Z

Signed-off-by: Marcus Brandenburger [email protected]

cendhu

Thanks, @mbrandenburger, for a detailed proposal. I have listed some of my doubts/questions.

cendhu · 2020-02-24T17:55:17Z

text/0000-fabric-private-chaincode.md

+
+A prototype implementation of FPC is available as Hyperledger Lab on github (https://github.com/hyperledger-labs/fabric-private-chaincode).
+
+# Motivation


What problem is being solved here?

Is a client of an organization distrusting the peer hosted by that organization?

Is an organization distrusting the cloud vendor who hosts the peer?

Are we trying to protect the peer once the server is compromised?

All of the above

I am asking this because the solution could differ depending on the problem being solved. For example, if a client wants to hide data from the endorsing peer, or from an attacker once the peer is compromised, the client would first encrypt the data before sending it to the peer. As the decryption needs to be done by the chaincode, it could leak the key. Hence, an SGX enclave could be used to avoid leaking the key (I assume that the key itself was generated by the enclave using EGETKEY -- sealing key). At the same time, I don't see a reason for running the validation logic in an enclave or for holding ledger metadata there. If we try to solve all problems at once, the changes required at the peer might be very significant and have enough potential to introduce bugs IMO.

The primary problem solved here is that we protect an organization (and its clients) from other organizations (and their peers) and their "nosy-ness". Fabric does give guarantees for integrity in that scenario but not for confidentiality in the case where the organizations have to collaborate on a particular chaincode (private data collections do not help in that case).

Regarding the trusted ledger: it is paramount that a private chaincode works from accurate, consistent and committed ledger data, as otherwise you cannot give integrity and hence also no privacy guarantees (see the rollback attack referenced later in the text as one example of why we must provide these guarantees). As the private chaincode runs in a hostile environment -- the local peer is not necessarily trusted -- it cannot rely on the peer doing the validation. While there would conceivably be other ways to get the integrity, e.g., via some consensus among enclaves at other peers in other organizations, our approach seemed by far the most efficient.

Let me try to understand the problem with a use-case.

Use-Case: Every organization needs to vote on a particular decision by saying either yes or no. At the end of the voting, no organization should know the vote provided by another organization but only the number of yeses and nos.

Smart-Contract: The smart-contract would receive the vote from the organization and store it on the ledger. Once all organizations in the channel provided their vote, any organization can query the chaincode to count the votes alone. This smart-contract logic is visible to everyone.

Solution Based on An Existing Approach that Provides Data Confidentiality:
- implicit collection: Every organization can put their vote on their individual collection and only hash gets stored on other organization's peers. Note that the vote is passed with some salt so that from the hash it is not possible to determine the vote. In the end, the vote must be revealed to count the number of yeses and nos. Hence, the implicit collection does not solve the problem.
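
To make the salted-hash point concrete, here is a minimal sketch in plain Python. It does not use any Fabric API; the function names `commit` and `opens_to` and the choice of SHA-256 over `salt || vote` are assumptions made only for illustration. It shows why an unsalted hash of a yes/no vote can be guessed by enumeration, while a salted one cannot, and also why counting still requires opening the commitment (revealing the vote and salt).

```python
import hashlib
import secrets


def commit(vote: str, salt: bytes) -> str:
    """Hex SHA-256 digest of salt || vote (hypothetical commitment format)."""
    return hashlib.sha256(salt + vote.encode("utf-8")).hexdigest()


def opens_to(commitment: str, vote: str, salt: bytes) -> bool:
    """Check that (vote, salt) opens the given commitment."""
    return secrets.compare_digest(commitment, commit(vote, salt))


# What other organizations would see: only the salted hash.
salt = secrets.token_bytes(32)
on_chain_hash = commit("yes", salt)

# Without a salt, the two possible votes can simply be enumerated.
unsalted_hash = hashlib.sha256(b"yes").hexdigest()
guesses = {hashlib.sha256(v.encode("utf-8")).hexdigest(): v for v in ("yes", "no")}
recovered = guesses[unsalted_hash]  # the attacker learns the vote
```

Counting the votes still needs someone to call `opens_to` with the real vote and salt, which is exactly the disclosure the use-case forbids.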

Problem: In the above use-case and similar ones, an organization's data must never be revealed outside that organization, yet another organization must be able to compute something over that secret data using the smart-contract and its pre-defined logic. For this scenario, we have no mechanism in Fabric.

Proposed Solution: We can use a TEE to ensure the confidentiality of the data while other organizations can still run some computation on the data. The decryption key is only known to the SGX enclave and shared among all organizations' SGX enclaves. As a result, only the SGX enclave can decrypt the data, do computation based on the pre-defined logic and respond to the user with the result. In other words, only the CPU chip can see the real data.

The proposed solution also wants to run the validation component inside the enclave and maintain the ledger data in the enclave. This is to ensure that no room is given for an organization to try any attack and reveal any data. Sure, I will look at the replay attack and get back.

Let me know if I have understood the problem correctly. If yes, it is good to update the RFC to clearly mention the problem with a use-case.

Some followup questions: As I asked in some of my previous comments, I would like to understand how the keys are generated?

Does it use MRENCLAVE to generate the seal key?

Does it use MRSIGNER to generate the seal key and who is the signing authority?

How are the keys distributed to other organizations' enclaves?

How are failures managed?

There are a lot of question marks around key management and failure handling. It would be good to add separate sections to discuss these. I am asking this because, depending on the key management approach, a certain attack could be possible (Note: my area of expertise is not security; I am just an enthusiast).

Yes, you got the problem correctly. It is essentially the classical multi-party security problem we are tackling, such as the millionaires' problem and the like.

Regarding your questions:

We do seal enclave-local state with keying material, but externally visible keys (enclave and contract PKs) as well as contract state encryption keys are randomly generated and not derived via EGETKEY.

No, we do not use MRSIGNER but rely only on MRENCLAVE (with MRSIGNER you would have to use some threshold signature scheme to make it work, and essentially with the approval process in the Fabric v2 lifecycle we get the equivalent).

Enclaves are first registered on the ledger, including their attestation on an enclave-specific key. The first enclave will generate chaincode-specific key material. Via the enclave registry, enclaves can then figure out approved enclaves and propagate the keys in a mutually authenticated way.

Regarding failures, this is a bit of an open-ended question. We definitely keep different keys for different chaincodes, and using the registry you would know where a chaincode would have run, so you can do additional risk management, e.g., by limiting which chaincodes could be collocated on the same platform. Note that we do not currently expose actual policies. However, such policies could be built on top.
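
As a rough illustration of the registration and key-propagation steps described above, the following toy model may help. It is only a sketch under strong assumptions: `EnclaveRegistry`, `Enclave` and all method names are hypothetical, attestation verification is reduced to comparing an MRENCLAVE string, and the "mutually authenticated" channel is reduced to a registry lookup on both sides. None of this is the actual FPC code.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EnclaveRegistry:
    """Toy stand-in for the on-ledger enclave registry (hypothetical)."""
    approved: dict = field(default_factory=dict)    # chaincode name -> approved MRENCLAVE
    registered: dict = field(default_factory=dict)  # enclave id -> chaincode name

    def register(self, enclave: "Enclave") -> None:
        # Stands in for verifying the enclave's attestation report.
        if self.approved.get(enclave.chaincode) != enclave.mrenclave:
            raise ValueError("MRENCLAVE is not the approved one for this chaincode")
        self.registered[enclave.enclave_id] = enclave.chaincode

    def is_registered(self, enclave: "Enclave") -> bool:
        return self.registered.get(enclave.enclave_id) == enclave.chaincode


@dataclass
class Enclave:
    enclave_id: str
    chaincode: str
    mrenclave: str
    chaincode_key: Optional[bytes] = None

    def generate_chaincode_key(self) -> None:
        # Randomly generated, not derived from a sealing key.
        self.chaincode_key = secrets.token_bytes(32)

    def share_key_with(self, other: "Enclave", registry: EnclaveRegistry) -> None:
        if self.chaincode_key is None:
            raise RuntimeError("no key to share")
        # Both ends must be registered for the same chaincode.
        if not (registry.is_registered(self) and registry.is_registered(other)):
            raise PermissionError("both enclaves must be registered for this chaincode")
        other.chaincode_key = self.chaincode_key


registry = EnclaveRegistry(approved={"auction": "mrenclave-A"})
first = Enclave("enclave-1", "auction", "mrenclave-A")
second = Enclave("enclave-2", "auction", "mrenclave-A")
rogue = Enclave("enclave-3", "auction", "mrenclave-B")

registry.register(first)
registry.register(second)
first.generate_chaincode_key()
first.share_key_with(second, registry)
```

In this model the `rogue` enclave can neither register nor receive the key, because its MRENCLAVE differs from the approved one.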

I need some more time to digest these points. Failure handling is the most important IMO. In peer, we have a lot of code that is related to recovery after failure and ensuring data consistency. Hence, it would be nice to have a section on failure handling.

I have not clearly understood how the keys are passed between peers securely, who creates the key, where it is stored, etc. Maybe once the RFC is updated with some of the figures I have requested to understand these complex concepts, I can review the whole proposal again.

cendhu · 2020-02-24T18:21:45Z

text/0000-fabric-private-chaincode.md

+FPC is motivated by the many use cases in which it is desirable to embody an application in a Blockchain architecture, but where in addition to the existing Integrity assurances, the application also requires privacy. This may include private voting, sealed bid auctions, operations on sensitive data such as regulated medical or genomic data, and supply chain operations requiring contract secrecy. With Fabric's current privacy mechanisms, these use cases are not possible as they still require the endorsement nodes to be fully trusted.
+For example, the concept of channels and Private Data allows to restrict chaincode data sharing only within a group of authorized participants, still when the chaincode processes the data it is exposed to the endorsing peer in clear. In the example of a voting system, where a government may run an endorsing peer it is clear that this is not ideal. 
+
+A second motivation for FPC is its integrity model based on hardware-enabled remote cryptographic attestation; this model can provide similarly high confidence of integrity to the standard Fabric model of integrity through redundancy, but using less computation and communication. With TEE-based endorsement and remote attestation, a new set of endorsement policies are made possible, which can reduce the number of required endorsements and still provide sufficient assurance of integrity for many workloads.


The endorsement model is used not just to provide integrity of chaincode execution through redundancy; it is also a form of agreement/consensus among the various organizations on the result of a transaction simulation (on what they intend to do). I think we have named the block creator an ordering service (rather than a consensus service) because it does not look into the transaction semantics and make decisions depending on the business logic. For this reason, we introduced pluggable ESCC at the peer per chaincode. As there are problems with go-plugins, in v2.0 we allow each organization to run a customized chaincode whereas, in v1.x, all organizations must run the same chaincode (hash must match). As a result, any customized logic can be added and an error can be raised when the peer is not ready to endorse the changes. Copying the below snippet from https://hyperledger-fabric.readthedocs.io/en/latest/whatsnew.html

Chaincode packages do not need to be identical across channel members: Organizations can extend a chaincode for their own use case, for example, to perform different validations in the interest of their organization. As long as the required number of organizations endorse chaincode transactions with matching results, the transaction will be validated and committed to the ledger. This also allows organizations to individually roll out minor fixes on their own schedules without requiring the entire network to proceed in lock-step.

Further, the key-based endorsement is also used as an ACL for writes. If an asset is owned by two or more organizations together, all the member organizations need to sign to make any modification to the asset (by defining an endorsement policy for that key)

I think the TEE could be useful when the key-based endorsement policy has a single endorser only. In such a scenario, the chaincode integrity cannot be guaranteed through redundancy (as it is not possible). Btw, is the key-based endorsement supported?

Hence, I do not see TEE as a replacement for endorsement in Fabric.

TEE-based endorsement is indeed not a complete replacement of all Fabric endorsements/agreements. Any change of metadata, such as a code upgrade or the keys required to bootstrap security, will still rely on normal Fabric policies for approving the corresponding metadata state. For endorsements of actual chaincode transactions, enclave endorsements (attestations) are sufficient, though. Note that the n-programming model which Fabric now allows with v2 works well for integrity guarantees but does not (practically) work for confidentiality guarantees. In the latter case you require that a chaincode only ever releases information according to the specification at endorsement time; there is no way to catch a wrong execution at validation time (as you can for integrity), because the damage (leakage of sensitive information) has already happened, and a smart attacker would never send such a leaky endorsement for validation.

Regarding key-based endorsement: this is not a currently supported feature in FPC.

Agreed. A peer's endorsement can be replaced with TEE-based endorsement (attestation). However, my comment was related to the following points which suggested replacing existing endorsement policy with single/fewer TEE endorsement.

this model can provide similarly high confidence of integrity to the standard Fabric model of integrity through redundancy, but using less computation and communication. With TEE-based endorsement and remote attestation, a new set of endorsement policies are made possible, which can reduce the number of required endorsements and still provide sufficient assurance of integrity for many workloads.

Hence, I mentioned that endorsements are not only to ensure integrity, it is also to ensure

deterministic execution of the smart-contract (as we support general-purpose languages, unlike Ethereum, which supports deterministic Solidity).

agreements (a part of consensus).

to allow each organization to run a customized logic.

The first point is automatically guaranteed: for a particular chaincode, everybody will run the same enclave code. As mentioned, to ensure confidentiality this is almost unavoidable; in particular, it practically rules out organization-private custom logic (you would have to prove that such code still upholds the overall privacy guarantees, you would have to do that in zero-knowledge if you want to keep the logic secret, and this would have to hold for potentially arbitrary code. Theoretically possible, of course, but not very practical). For more restricted classes of customization the problem could be simpler: What concretely do you have in mind when you say customized logic?

cendhu · 2020-02-24T18:36:03Z

text/0000-fabric-private-chaincode.md

+
+Overall, FPC can be considered an addition to Fabric wherein all chaincode computation relies only on the correctness of data provided by an authenticated Client or computed inside of and signed by a validated Trusted Execution Environment. It assumes that the Ordering Service is a trusted system element. The Endorsing Peer outside of the Trusted Execution Environment, however, is considered untrusted.
+
+Writing chaincode for FPC should come natural to developers familiar with Fabric as the programming model (e.g., chaincode lifecycle, chaincode invocations and state) is the same as for normal Fabric chaincode.  The main differences are a (for now at least) different programming language (C++) and a shim API which, on the one hand, is somewhat simplified but, on the other hand, also contains FPC-specific extensions such as allowing for private as well as public ledger state as well as a secure (yet “replicable”) random-number generator.  In particular, a developer can write chaincode largely without having to be aware that the code executes within a Trusted Execution Environment (TEE), often referred to as an Enclave. However, to understand the architecture and security underpinning it we introduce in the following several new concepts and terms. Except where noted otherwise, all elements of the architecture described below reside on a Fabric Peer.


Already we support golang, nodejs, and java chaincodes. Sometimes, it is very tedious and time consuming to maintain consistency between various shims. In order to add a new chaincode API to the shim, we need to write the code in three different languages. Introducing another shim would add more management overhead IMO. I understand the C/C++ is needed for sgx enclave. Is it not possible to use cgo and extend go shim rather than building a new shim? https://github.com/rupc/go-with-intel-sgx I could be wrong as I do not have experience with sgx enclave.

As mentioned, our goal is to broaden the set of chaincode languages, probably based on a WebAssembly-based runtime. However, support for languages other than C/C++ for SGX is still evolving -- e.g., there is currently no good Go support (note that the project you reference only shows how to call an enclave function from (untrusted) Go code outside of the enclave, something we in fact also do in our code, and does not help with running Go code inside the enclave) -- and our goal is to first focus on the core architecture and functionality (which, as in Fabric, is chaincode-language independent) and then focus on adding other, "more Fabric-standard" chaincode languages.

cendhu · 2020-02-24T18:40:34Z

text/0000-fabric-private-chaincode.md

+
+FPC Chaincode: A chaincode created by a developer to run in a Chaincode Enclave. Unlike regular Fabric chaincode, an Enclave Chaincode  must currently be written in C++ using our FPC SDK. A future goal for the project is to support additional languages, e.g., by the use of WebAssembly.
+
+Chaincode Enclave: An enclave in which a single particular chaincode executes.  The Chaincode Enclave contains the chaincode (called an Enclave Chaincode) to be executed, with its linked Chaincode Library. The FPC runtime is responsible for creating the Chaincode Enclave when the FPC chaincode is installed on the endorsing peers.


Is the chaincode installed as plaintext? If the peer is compromised, can the attacker look at the business logic implemented in the chaincode?

The chaincode itself is not encrypted, as it is paramount that participating organizations can assure themselves that the chaincode behaves according to the commonly agreed specification. We currently assume that the chaincode runs only on peers from organizations participating in this chaincode and hence the business logic/specification is already known. Note that the code itself is not on the ledger, only a fingerprint (MRENCLAVE).

cendhu · 2020-02-24T18:47:53Z

text/0000-fabric-private-chaincode.md

+
+Chaincode Library (also known as the FPC Shim): This is a shim interface exposed to an FPC Chaincode.  The Chaincode Library contains two parts, one residing inside the enclave and another one outside the enclave. The shim interface exposed to the FPC chaincode is written primarily in C++, whereas the counterpart is written in Go and mainly responsible for communication between the Peer and the FPC chaincode. The Chaincode Library enables an Enclave Chaincode to talk to the Peer, in particular to invoke the chaincode and provide "normal" shim operations such as getState and putState. The FPC SDK provides the chaincode library to enable FPC chaincode developers to easily link to this library when compiling their FPC Chaincode and building a deployment artifact. It implements the interface from the Peer to the Enclave Chaincode, encrypts and decrypts state and arguments, and implements Attestation.
+
+Ledger Enclave: The Ledger Enclave is a crucial component to protect the execution of a FPC chaincode. In particular, it is responsible to ensure that a FPC chaincode processes committed world state data, i.e. only state data that comes from valid and committed transactions. The Ledger Enclave is also called the TLCC, Trusted Ledger Enclave or the Validation Enclave. The Ledger Enclave is a separate enclave which locally stores integrity metadata for validating information on the blockchain ledger. Like the Peer, it performs standard validation logic when a new block arrives, but it also creates and stores a cryptographic hash of each new key-value pair of the blockchain state. This makes it possible to verify that the data coming from the Peer is correct -- remember that in the FPC setting we do not necessarily trust a (single) peer. The Chaincode Enclave uses the Ledger Enclave as its tamperproof source of information on the blockchain world state. The Ledger Enclave and a Chaincode Enclave interact by through a secure channel that is initially established when the Chaincode Enclave is deployed. There is one Ledger Enclave per Channel on any given Peer. When a peer receives a new block from the ordering service, it forwards it to the Ledger Enclave. The Ledger Enclave is part of the FPC runtime.


Does the ledger enclave store the hash of each kv pair in memory?

If yes, don't we have some limit on the protected memory used by SGX? Maybe with EPC we can use more memory, but this would impact the performance.

Does the ledger enclave seal the data and store it on the disk?

If yes, can't the attacker tamper with the data? If such tampering is detected, what would the chaincode or ledger enclave do?

Conceptually, the ledger enclave maintains a hash of each kv pair in memory. As you pointed out, memory is a limited resource here and eventually the ledger enclave needs to implement a clever way to tackle this problem securely and efficiently.

The ledger enclave can maintain the integrity metadata (hashes) in a merkle tree structure and outsource them securely (using enclave data sealing) to untrusted disk on the peer.

Enclave data sealing provides authenticated encryption that enables the enclave to detect data corruption. In case of a violation, the ledger enclave must signal this violation to the chaincode, which must then abort transactions.
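
The tamper-detection half of this can be sketched in a few lines of Python. This is an assumption-laden illustration, not SGX sealing: it uses a plain HMAC-SHA256 tag from the standard library instead of authenticated encryption, the key is an ordinary random value rather than a CPU-derived sealing key, and the names `seal`, `unseal` and `SealedDataCorrupted` are hypothetical. Real sealing would also encrypt the data.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # size of an HMAC-SHA256 tag in bytes


class SealedDataCorrupted(Exception):
    """Raised when a blob read back from untrusted storage fails its integrity check."""


def seal(key: bytes, data: bytes) -> bytes:
    """Prepend an integrity tag (integrity only; no encryption in this sketch)."""
    tag = hmac.new(key, data, hashlib.sha256).digest()
    return tag + data


def unseal(key: bytes, blob: bytes) -> bytes:
    """Return the data if the tag verifies, otherwise raise SealedDataCorrupted."""
    tag, data = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise SealedDataCorrupted("integrity metadata was modified outside the enclave")
    return data


key = secrets.token_bytes(32)  # stand-in for an enclave-only sealing key
blob = seal(key, b"merkle-root:ab12")

# The untrusted peer flips one bit of the stored blob.
tampered = blob[:-1] + bytes([blob[-1] ^ 0x01])
```

A caller that catches `SealedDataCorrupted` would be the place where the ledger enclave signals the violation and transactions are aborted.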

Merkle tree-based integrity management is nice.

Using a regular Merkle tree is not enough because you also need to authenticate non-membership queries, and there is no way to prove a key doesn't exist in a Merkle tree efficiently without the keys being sorted in some manner.
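
One possible way to get authenticated non-membership, sketched below, is to sort the keys and let every leaf also commit to its successor key, so a single inclusion proof for the leaf whose range covers the queried key shows that the key is absent. This is only an illustration of the sorted-keys idea; the class `SortedMerkleTree`, the leaf encoding, the empty-string sentinel and the `verify` function are all assumptions and not something taken from the FPC code. Real keys are assumed to be non-empty strings.

```python
import bisect
import hashlib


def _h(*parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts, to avoid ambiguous concatenation."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def leaf_hash(key, value, next_key):
    nxt = b"\x00" if next_key is None else b"\x01" + next_key.encode()
    return _h(b"leaf", key.encode(), value, nxt)


class SortedMerkleTree:
    def __init__(self, state: dict):
        # "" is a sentinel that sorts before every real (non-empty) key.
        self.keys = [""] + sorted(state)
        values = [b""] + [state[k] for k in self.keys[1:]]
        successors = self.keys[1:] + [None]  # None means "no larger key"
        self.leaves = list(zip(self.keys, values, successors))
        level = [leaf_hash(*leaf) for leaf in self.leaves]
        self.levels = [level]
        while len(level) > 1:
            parent = []
            for i in range(0, len(level), 2):
                if i + 1 < len(level):
                    parent.append(_h(b"node", level[i], level[i + 1]))
                else:
                    parent.append(level[i])  # odd node is promoted unchanged
            level = parent
            self.levels.append(level)
        self.root = level[0]

    def prove(self, key):
        """Return (leaf, path) for the leaf whose key range covers `key`."""
        index = bisect.bisect_right(self.keys, key) - 1
        leaf, path = self.leaves[index], []
        for level in self.levels[:-1]:
            sibling = index ^ 1
            if sibling < len(level):
                path.append((level[sibling], sibling < index))
            index //= 2
        return leaf, path


def verify(root, key, leaf, path):
    """Return the value if present, None if provably absent; raise on a bad proof."""
    leaf_key, value, next_key = leaf
    node = leaf_hash(leaf_key, value, next_key)
    for sibling, sibling_is_left in path:
        node = _h(b"node", sibling, node) if sibling_is_left else _h(b"node", node, sibling)
    if node != root:
        raise ValueError("proof does not match the trusted root")
    if leaf_key == key:
        return value
    if leaf_key < key and (next_key is None or key < next_key):
        return None
    raise ValueError("leaf does not cover the queried key")


tree = SortedMerkleTree({"alice": b"10", "carol": b"30", "bob": b"20"})
```

Here the enclave would only need to keep `tree.root`; the untrusted peer stores the tree and answers queries with `prove`, and the enclave checks each answer with `verify`.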

cendhu · 2020-02-24T19:19:45Z

text/0000-fabric-private-chaincode.md

+The FPC team welcomes the community’s advice on how each of these touch-points to Fabric should be handled going forward, and hope to solidify our plans for each element through the RFC process.
+
+
+# Drawbacks


A few of the following limitations are missing (maybe I should call them current limitations).

The peer cannot run CouchDB as the stateDB. Even if CouchDB is used, GetQueryResult() APIs cannot be used as the value is encrypted and CouchDB is not a database that can operate on encrypted data (unlike CryptDB -- https://css.csail.mit.edu/cryptdb/)

GetStateByRange() GetStateByRangeWithPagination() GetStateByPartialCompositeKey() GetStateByPartialCompositeKeyWithPagination() are not supported in the chaincode.

Is the private data supported? -- if not, please state in the limitation.

Is the key-based endorsement supported? if not, please state in the limitation.

Is it possible for the client to take blocks out of the peer, decrypt rwset, and execute analytical queries?

I agree with all these current limitations. Let's focus on a robust integration of FPC with a minimum of features but make it end2end secure. From there we can start adding the features listed above.

cendhu · 2020-02-24T19:22:57Z

text/0000-fabric-private-chaincode.md

+The Ledger Enclave stores all attestation reports, as signed by the TEE vendor. Reports include information on what chaincode is running; what specific TEE is in use, including version number and hash of the chaincode, and including the public keys of the enclave. With the planned new key management scheme to support multiple endorsing peers, there may be many public keys instead of one. Upon creation each chaincode enclave generates public/private key pairs for signing and for encryption. The public signing key also denotes the chaincode enclave identity. In the new scheme, there will also be a public/private key pair and a symmetric state encryption key per FPC chaincode that are shared among all chaincode enclaves that run the same FPC chaincode using the key distribution protocol above.
+
+
+## FPC Chaincode Lifecycle


A figure or a flow diagram could help understand the proposed approach better.

I agree! Will add this.

cendhu · 2020-02-24T19:30:50Z

text/0000-fabric-private-chaincode.md

+
+Other intended future work includes new possible endorsement policies that can incorporate TEE policies across multiple TEE vendors. For instance, FPC developer could write endorsement policy to express the following: two SGX endorsements OR a single IBM Z endorsement; OR 2 AMD SEV endorsements OR 5 XXX TEE endorsements;
+
+## Deployment Process (in detail)


It seems like an important section. A figure or a flow diagram could help understand it deeper.

cendhu · 2020-02-24T19:36:04Z

text/0000-fabric-private-chaincode.md

+
+This new model of trust for Fabric smart contracts makes possible high-stakes markets such as private voting systems and sealed-bid auctions; which aren’t supported by the existing model of privacy in Fabric because of the requirement for endorsing peers to be fully trusted for confidentiality.
+
+A prototype implementation of FPC is available as Hyperledger Lab on github (https://github.com/hyperledger-labs/fabric-private-chaincode).


If a prototype is available, did we compare the performance of vanilla Fabric v2.0 and v2.0 with the enclave, with a different number of chaincodes, etc.?

I am asking this because the performance could degrade significantly with the usage of enclaves -- sgx-perf: A Performance Analysis Tool for Intel SGX Enclaves, Middleware 2018 https://www.ibr.cs.tu-bs.de/users/weichbr/papers/middleware2018.pdf It might be better to quantify the drop in performance in terms of percentage.

We have a performance evaluation using the prototype for Fabric v1.4, however, I don't think that this is representative anymore for the current prototype code.

Please see the paper:
Marcus Brandenburger, Christian Cachin, Rüdiger Kapitza, Alessandro Sorniotti: Blockchain and Trusted Computing: Problems, Pitfalls, and a Solution for Hyperledger Fabric. https://arxiv.org/abs/1805.08541

20% overhead seems like the best case. I was expecting at least a 50% drop in performance based on this paper https://www.ibr.cs.tu-bs.de/users/weichbr/papers/middleware2018.pdf Is enclave execution efficient?

Overhead is quite use-case dependent. One of the overheads is related to the number of enclave <-> non-enclave transitions you do (which in turn is proportional to the shim calls you do). This can be greatly optimized by using the switchless/exitless techniques for doing OCALLs. While we do not do that yet, it is certainly something which we will explore once more use-cases are clear and performance becomes an issue. (Note that with FPC you will require less redundancy than in "vanilla" Fabric, so there are compensating performance benefits.)

cendhu · 2020-02-24T19:42:27Z

text/0000-fabric-private-chaincode.md

+- Fabric Issue: (leave this empty)
+
+# Summary
+[summary]: #summary


Fundamentally, in a blockchain network, each organization could run a peer that is implemented in a different language from the others. This is true in the Ethereum world. The hard requirement is that a majority of the nodes follow the predefined rules. I believe that the same applies to Fabric too. Now, we are adding a constraint that all peers must use hardware with a specific feature. I am not sure about the feasibility. For example, I might host my peers on-prem and might have a very strong security team to keep the attackers away from my peers, whereas some other organization in the same network might be using a cloud vendor to host its peers, which may require a TEE. I am not sure whether it is good to force every channel member to use specific hardware.

I completely agree. Eventually, FPC should also follow these principles and support heterogeneous TEE platforms and let the users decide on endorsement policies that reflect their needs. With SGX we are just starting to explore the TEE capabilities in the context of blockchain.

MHBauer · 2020-03-12T05:34:06Z

text/0000-fabric-private-chaincode.md

+
+## Fabric Touchpoints
+
+- Additional Metadata in the Channel and Chaincode Definitions: In order to insure agreement on security-critical data across all Orgs on a Channel, we recommend that FPC introduce new metadata into the Channel Definition: for example, the TEE vendor’s Certificates, and into the Chaincode Definition (possibly using the “version” field): for example, the MRENCLAVE chaincode enclave identifier. In the initial implementation these are Intel SGX-specific and hardcoded, recording these metadata in the Enclave Registry, but this makes runtime upgrades impossible. We hope to explore the possible use of Configtxgen and approveForMyOrg in future releases to address this issue.


insure -> ensure

pretty sure we mean "guarantee" and not the "guard against contingency" meaning.

Signed-off-by: Michael Steiner <[email protected]>

Spellcheck and minor format improvements

* Fix terminology
* Clarify system model
* Add FPC Management API and FPC Shim
* Add deployment policies
* Add Roadmap features
* Add list of not (yet) supported features
* Add details on restart and sealing
* Typos

Co-authored-by: Michael Steiner <[email protected]>
Co-authored-by: jrlinton <[email protected]>
Signed-off-by: Marcus Brandenburger <[email protected]>

mbrandenburger · 2020-07-03T06:23:08Z

@cendhu We've updated the RFC based on your valuable feedback. Thank you!

yacovm · 2020-07-06T13:47:49Z

text/0000-fabric-private-chaincode.md

+
+* Trusted Execution Environment (TEE): The isolated secure environment in which programs run in encrypted memory, unreadable even by privileged users or system processes. FPC chaincodes run in TEEs.
+
+* Enclave: The TEE technology used for the initial release of FPC will be Intel SGX.  In SGX terminology a TEE is called an _enclave_.  In this document and in general, the terms TEE and Enclave are considered interchangeable.  Intel SGX is the first TEE technology supported as, to date, it is the only TEE with mature support for remote attestation as required by the FPC integrity architecture.  However, our architecture is generic enough to also allow other implementations based on AMD SEV-SNP, ARM TrustZone, or other TEEs.


it is the only TEE with mature support for remote attestation as required by the FPC integrity architecture.

I would remove this. This is time-dependent and also a matter of opinion. Someone from AMD, or the NSA may have other opinions.

Good point about time-dependency. BTW: the reason why we think SGX is the only one right now is that you also need security against (some) hardware adversaries (ruling out, e.g., TPM) and it has to provide publicly verifiable attestation (ruling out the currently available AMD SEV; SEV-SNP should probably address that, but the details are not all out yet and it is not yet available). Curious, though, what did you have in mind when you mentioned the NSA?

Historically, national security agencies have not disclosed vulnerabilities for periods of time in order to have a competitive advantage.

Ah, ok, you meant that the NSA might be of the opinion that no such thing exists, rather than that they had another technology in mind (which I thought was where you were heading) ...

yacovm · 2020-07-06T14:00:03Z

text/0000-fabric-private-chaincode.md

+The FPC SDK provides the FPC Shim to enable FPC chaincode developers to easily link to this library when compiling their FPC Chaincode and building a deployment artifact. It implements the interface from the Peer to the FPC Chaincode, encrypts and decrypts state and arguments, and implements Attestation.
+See the [FPC Shim](#fpc-shim) section for more information.
+
+**Ledger Enclave:** The Ledger Enclave is a crucial component to protect the execution of a FPC chaincode.


Why do you need the ledger enclave? Can't you sign each (key, value, version) in the read-write set and then have the FPC always verify the signature whenever it reads the value during computation?

You need consistency of the whole view and history. Strictly speaking, you don't need the trusted ledger and could validate the history back to the genesis block, but obviously that would be extremely inefficient. In essence, the trusted ledger is a cache of evaluated/validated history (as it is for standard Fabric, except in this case with a different threat/trust model).
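
A small sketch may help show why a valid signature on each (key, value, version) is not enough on its own. Everything here is an assumption for illustration: an HMAC stands in for a real signature, `SIGNING_KEY` is a made-up demo key, and `TrustedLedgerView` is a hypothetical minimal model of the validated-history cache. The point is that an old, correctly signed tuple still verifies, and only a trusted view of the latest committed version reveals that it is stale.

```python
import hashlib
import hmac

SIGNING_KEY = b"demo-signing-key"  # hypothetical; stands in for a real signing key


def sign(key: str, value: str, version: int) -> bytes:
    message = f"{key}|{value}|{version}".encode()
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).digest()


def signature_ok(key: str, value: str, version: int, signature: bytes) -> bool:
    return hmac.compare_digest(signature, sign(key, value, version))


class TrustedLedgerView:
    """Keeps the latest committed version and value hash per key."""

    def __init__(self):
        self.latest = {}

    def commit(self, key: str, value: str, version: int) -> None:
        self.latest[key] = (version, hashlib.sha256(value.encode()).digest())

    def is_current(self, key: str, value: str, version: int) -> bool:
        return self.latest.get(key) == (version, hashlib.sha256(value.encode()).digest())


view = TrustedLedgerView()
old = ("balance", "100", 1)
new = ("balance", "40", 2)
old_sig, new_sig = sign(*old), sign(*new)
view.commit(*old)
view.commit(*new)

# A malicious peer serves the old tuple with its still-valid signature.
stale_passes_signature = signature_ok(*old, old_sig)
stale_passes_view = view.is_current(*old)
```

In this toy run the stale tuple passes the signature check but fails the check against the trusted view, which is the consistency property the reply refers to.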

yacovm · 2020-07-06T14:01:41Z

text/0000-fabric-private-chaincode.md

+**Ledger Enclave:** The Ledger Enclave is a crucial component to protect the execution of a FPC chaincode.
+In particular, it is responsible to ensure that a FPC chaincode processes committed world state data, i.e. only state data that comes from valid and committed transactions.
+The Ledger Enclave is a separate enclave that establishes a trusted view of the ledger and locally maintains integrity metadata for validating information on the blockchain ledger.
+Like the Peer, it performs standard validation logic when a new block of transactions arrives, but it also creates and stores a cryptographic hash of each new key-value pair of the blockchain state. This makes it possible to verify that the data coming from the Peer is correct -- remember that in the FPC setting we do not necessarily trust a (single) peer. The Chaincode Enclave uses the Ledger Enclave as its tamperproof source of information on the blockchain world state. The Ledger Enclave and a Chaincode Enclave interact by through a secure channel that is initially established when the Chaincode Enclave is deployed. There is one Ledger Enclave per Channel on any given Peer. When a peer receives a new block from the ordering service, it forwards it to the Ledger Enclave. The Ledger Enclave is part of the FPC runtime.


The Ledger Enclave and a Chaincode Enclave interact by through a secure channel that is initially established when the Chaincode Enclave is deployed.

How do you protect from an attack where the peer rollbacks the ledger enclave to an earlier snapshot?

Integrity-wise, it's no different than Fabric: you won't be able to get them ordered properly. Confidentiality-wise, the chaincode will have to employ a commit-and-reveal strategy, i.e., reveal confidential information only after the trigger allowing the revelation is committed to the ledger. (This is necessary not only for ledger rollback but also for the case where a transactor never submits the transaction to the ledger once he gets the result.) The referenced arXiv paper has more information on the relevant security issues and how they are addressed.
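
A minimal sketch of the commit-and-reveal idea, assuming a hypothetical sealed-bid auction. `TrustedLedger`, `SealedBidAuction` and the `auction_closed` key are illustrative names only and are not part of FPC; the real protocol is the one described in the referenced paper.

```python
class TrustedLedger:
    """Hypothetical stand-in for the trusted ledger: exposes committed state only."""

    def __init__(self):
        self._committed = {}

    def commit(self, key, value):
        # Called only for transactions that were ordered and validated.
        self._committed[key] = value

    def get(self, key):
        return self._committed.get(key)


class SealedBidAuction:
    """Hypothetical chaincode: bids stay confidential until the close is committed."""

    def __init__(self, ledger):
        self._ledger = ledger
        self._bids = {}  # confidential state, visible only inside the enclave

    def submit_bid(self, bidder, amount):
        self._bids[bidder] = amount
        return "accepted"  # the response leaks nothing about other bids

    def close(self):
        # Commit step: emits only the trigger, no confidential data.
        return ("auction_closed", True)

    def reveal_winner(self):
        # Reveal step: refuses unless the trigger is part of committed state.
        if self._ledger.get("auction_closed") is not True:
            raise PermissionError("close transaction is not committed yet")
        return max(self._bids, key=self._bids.get)
```

In this sketch, a peer that rolls the ledger back to a state before the close, or a transactor who never submits the close, only gets a refusal from `reveal_winner`.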

yacovm · 2020-07-06T14:05:29Z

text/0000-fabric-private-chaincode.md

+Like the Peer, it performs standard validation logic when a new block of transactions arrives, but it also creates and stores a cryptographic hash of each new key-value pair of the blockchain state. This makes it possible to verify that the data coming from the Peer is correct -- remember that in the FPC setting we do not necessarily trust a (single) peer. The Chaincode Enclave uses the Ledger Enclave as its tamperproof source of information on the blockchain world state. The Ledger Enclave and a Chaincode Enclave interact by through a secure channel that is initially established when the Chaincode Enclave is deployed. There is one Ledger Enclave per Channel on any given Peer. When a peer receives a new block from the ordering service, it forwards it to the Ledger Enclave. The Ledger Enclave is part of the FPC runtime.
+
+**FPC Registry:** Also referred to as the Enclave Registry Chaincode (ERCC), this is a component which maintains a list of all Chaincode Enclaves deployed on the peers in a channel.
+The registry associates with each enclave their identity, associated public keys and an attestation linking them. Additionally, the registry manages chaincode specific keys, including a chaincode public encryption key, and facilitates corresponding key-management among authorized Chaincode Enclaves. Lastly, the registry also records information required to bootstrap the validation of attestation. All of this information is committed firmly on the ledger. This enables any Peer in the Channel (even those without SGX) to inspect the Attestation results before taking actions such as connecting to that chaincode or committing transactions produced by an FPC chaincode. Moreover, clients can query the registry to retrieve the chaincode public encryption keys of a particular FPC chaincode so they can send privately transaction proposals for endorsement.


Moreover, clients can query the registry to retrieve the chaincode public encryption keys of a particular FPC chaincode so they can send privately transaction proposals for endorsement.

But you would need the clients to have signature public keys of the enclaves, otherwise the peer could do a MITM, no?

So, why not assume clients have the encryption public keys in the first place? Or, do you assume there is some kind of PKI in place that certifies the signature public keys of the enclaves?

Note we assume the standard fabric model that clients talk to peer of their own organization for queries. So no MITM in that case and also no need to get the signature keys of enclaves. If you extended the model to clients outside of peer orgs, it would get a bit more complicated, but this would also be true for vanilla Fabric (i.e., each query would have to be replicated consistent with the implied endorsement policy).

Also note that the SGX attestation authority essentially roots a PKI. For efficiency and simplicity, the Enclave Registry forms its own (sub-)PKI, but clients could always verify that it is grounded in SGX attestation and appropriate measurements.

Note we assume the standard fabric model that clients talk to peer of their own organization for queries. So no MITM in that case and also no need to get the signature keys of enclaves.

So we trust that a peer won't be malicious against a client from its own organization?

That's correct. As mentioned, it is not essential and the architecture could be extended to clients outside of orgs, but the initial approach should suffice for what we foresee as the most common scenarios in a permissioned ledger.

yacovm · 2020-07-06T14:07:57Z

text/0000-fabric-private-chaincode.md

+The registry associates with each enclave their identity, associated public keys and an attestation linking them. Additionally, the registry manages chaincode specific keys, including a chaincode public encryption key, and facilitates corresponding key-management among authorized Chaincode Enclaves. Lastly, the registry also records information required to bootstrap the validation of attestation. All of this information is committed firmly on the ledger. This enables any Peer in the Channel (even those without SGX) to inspect the Attestation results before taking actions such as connecting to that chaincode or committing transactions produced by an FPC chaincode. Moreover, clients can query the registry to retrieve the chaincode public encryption keys of a particular FPC chaincode so they can send privately transaction proposals for endorsement.
+
+**FPC Validator:** This validation complements the Peer validation logic by validating transactions produced by an FPC Chaincode.
+It does this by checking that each transaction has been cryptographically signed (endorsed) by a Chaincode Enclave that is listed in the FPC Registry and is authorized for that chaincode.


Does the validator know the public keys of the consensus service? Is it aware of what are Fabric block signatures, and all Fabric config changes? I am asking because if the validator has no ability to validate Fabric blocks on its own, then it's trivial to do a reorder attack (change order between transactions) and make it think something is invalid when it is valid.

The validator exists at two places: as plugins in the peer and as part of the trusted-ledger. For plugins, they rely on the normal peer; for the validation logic inside the trusted ledger, they rely on the trusted ledger's core fabric validation logic which (a) explicitly references the (cryptographic) channel identity (and exposes this transitively to clients and chaincode enclaves) and (b) includes following/validating changes to ordering service info in channel definitions.

yacovm · 2020-07-06T14:16:52Z

text/0000-fabric-private-chaincode.md

+
+	![Invoke](../images/fpc/high-level/FPC-Invoke.png)
+
+	The Client prepares the Invocation of an FPC Chaincode by first encrypting the arguments of the Chaincode Invocation using the public key specific to a particular Chaincode. This encryption happens completely transparently using our FPC Client SDK extension. This Transaction Proposal is then sent to the Endorsing Peer where a corresponding Chaincode Enclave resides. Depending on the Endorsement Policy the client may perform this step with one or more Endorsing Peers and their respective Chaincode Enclaves. (For simplicity we will continue describing the process for a single Endorsing Peer.) The Peer forwards the Transaction Proposal to its FPC Chaincode running inside the Chaincode Enclave. Inside the Enclave, the FPC Shim decrypts the Proposal and invokes the FPC Chaincode.


The Client prepares the Invocation of an FPC Chaincode by first encrypting the arguments of the Chaincode Invocation using the public key specific to a particular Chaincode.

I guess it needs to encrypt it several times for multiple peers, because the proposal response contains a hash of the proposal?

The public key is not unique to an enclave but corresponds to the chaincode overall, i.e., all enclaves authorized for that chaincode share the private key corresponding to the chaincode public key

So where is the private key generated and how does it get to the enclaves?

It is created by the initial enclave itself with the public key attested as part of the enclave registration and the private key (together with the state encryption key) distributed to other authorized enclaves as part of the key distribution protocol.

yacovm · 2020-07-06T14:22:42Z

text/0000-fabric-private-chaincode.md

+	![Validate](../images/fpc/high-level/FPC-Validate.png)
+
+	As the Peers in the Channel receive the block of transactions, they perform the standard Fabric validation process.
+	In addition, a custom validation plugin in the Peer is responsible to verify FPC transactions. In particular, the FPC Validator queries its local FPC Registry to retrieve the enclave signature verification key and check that the signature was produced in an actual and correct Chaincode Enclave. This query retrieves the stored Attestation report associated with the Public Key of the Chaincode Enclave that produced and signed the transaction. The Attestation report is checked to verify the details of the Chaincode Enclave, affirming the validity of the Enclave.


In addition, a custom validation plugin in the Peer is responsible to verify FPC transactions. In particular, the FPC Validator queries its local FPC Registry to retrieve the enclave signature verification key and check that the signature was produced in an actual and correct Chaincode Enclave

It seems to me, that the validator needs to validate the entire block, to ensure that there is no omission of FPC transactions, no?

See above: there are two places where that validation logic applies. By FPC Validator we are referring only to the plugin which is part of the (untrusted) ledger, not the validation part of the trusted ledger.

Note that security-wise there are two reasons to have the validator plugin: it (a) prevents a denial of service by ensuring the "normal" ledger also considers a transaction aborted when the trusted ledger faults some FPC transaction and (b) prevents unvalidated public FPC state from being wrongly committed (and potentially acted upon later). The latter issue would go away by not supporting public state (i.e., put_public_state) and instead requiring a corresponding separate chaincode query which would decrypt the answer and return it as the query result. I.e., this would be our fallback strategy iff validator (& endorsement) plugins disappeared without replacement as part of the go-plugins eradication effort ...

yacovm · 2020-07-06T14:26:05Z

text/0000-fabric-private-chaincode.md

+	![Revalidate](../images/fpc/high-level/FPC-Revalidate.png)
+
+	In addition to the standard validation and commit process, the Peer also forwards all committed blocks to the Ledger Enclave in order to establish a full and current trusted view of the ledger. 
+	The same validation steps described above are repeated inside the Enclave, and then the transaction is committed to the trusted version of the Ledger inside this Enclave. This constitutes an update to the World State Integrity Metadata: for each Key Value Pair of World State, a second Key Value Pair is stored in the Trusted Ledger containing Integrity Metadata (a cryptographic hash value) along with channel-specific details needed to verify that the transaction was produced by a valid authorized participant in the channel.


for each Key Value Pair of World State, a second Key Value Pair is stored in the Trusted Ledger containing Integrity Metadata (a cryptographic hash value) along with channel-specific details needed to verify that the transaction was produced by a valid authorized participant in the channel.

So the enclave has limited storage capabilities that can securely store a hash, then? Why not instead use a MAC and just MAC the hash and channel pair? Then you can use this MAC key for all channels and all keys, and you don't need to store in the enclave a value for each channel and for each key?

I also don't understand how does this really work in reality? What if you have lots of keys? Where is all this information stored? Does SGX have a database and a hard drive? I thought it's just a tinfoil wrapped box inside the CPU that can store a few bytes like keys, etc.?

Currently, the metadata must fit into enclave memory. Note, though, that even with the currently still somewhat limited EPC, this supports millions of keys. As noted below, we also have it on the roadmap to do crypto-paging, which, besides scaling the number of possible keys, also improves start-up performance for peers.
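
As a rough back-of-the-envelope check of the capacity claim, the estimate below divides an assumed enclave memory budget by an assumed per-key entry size. Both numbers are assumptions for illustration: roughly 93 MiB of usable EPC, and 64 bytes per entry (a 32-byte hashed key plus a 32-byte SHA-256 digest). Real overheads, such as data-structure bookkeeping and memory needed by the validation code itself, are ignored.

```python
def max_entries(memory_bytes: int, entry_bytes: int) -> int:
    """Number of fixed-size metadata entries that fit into the given memory budget."""
    return memory_bytes // entry_bytes


# Assumed values, not taken from the RFC.
USABLE_EPC_BYTES = 93 * 1024 * 1024   # assumed usable enclave memory
ENTRY_BYTES = 32 + 32                 # assumed: hashed key + SHA-256 digest

print(max_entries(USABLE_EPC_BYTES, ENTRY_BYTES))
```

Under these assumptions the result is on the order of a million entries or more; smaller entries or larger EPC sizes scale the number up proportionally.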

the Peer also forwards all committed blocks to the Ledger Enclave in order to establish a full and current trusted view of the ledger.

I don't understand something.
It seems to me that the Ledger Enclave needs to process all blocks, and validate the transactions in these blocks, namely configuration transactions.

I believe this is needed, because otherwise the Enclave Ledger cannot maintain the correct view of the validity of the transactions inside of it.

Under the assumption that I'm correct and it is indeed needed, it seems to me that the Ledger Enclave needs to implement all logic of transaction validation: everything that Fabric transaction validation depends on, namely MSP, BCCSP, and config transaction processing.
Otherwise, a malicious party can look at the ledger enclave code and at the Fabric code, see a deviation in the semantics and craft a transaction that is invalid in Fabric but valid in the Enclave code, and thus cause a fork between the Fabric state and the Ledger Enclave state. Am I wrong?

If that is indeed the case, isn't that a lot of code to implement and maintain in C++ ? Fabric code has historically mutated across versions, and a lot of Fabric versions come with activation flags (called "Capabilities") which hint the validation pipeline whether they should be applied or not. Maintaining this code in Fabric has been an unpleasant and time consuming endeavor and you can look that we have validation code duplicated for three different versions.

@yacov You are right, re-implementing the entire validation pipeline of the peer in C++ and running it inside the ledger enclave makes the code hard to maintain.

I opened an issue for that topic, so we can find a solution/strategy in a separate conversation without polluting this one. Once we are happy, we will pipe the result back into the RFC.

Right, I think the answer really lies in reducing the scope and perhaps even deviating from the current Fabric logic in the FPC, to eventually reach something that co-exists within Fabric and appears to the user as Fabric, but has a more strict (but simple) validation logic than Fabric.

yacovm · 2020-07-06T14:34:27Z

text/0000-fabric-private-chaincode.md

+
+## Subsequent Query and Validation of World State Data by FPC Chaincodes
+
+Subsequent to the transaction described above: When any FPC Chaincode accesses the World State, the FPC Shim is responsible not only for passing the query to the Peer but also querying the Ledger Enclave for the Integrity Metadata for the Key in question. It performs a cryptographic hash of the Key Value Pair of World State returned by the Peer and checks it against the Integrity Metadata from the Ledger Enclave for that key. If the hashes do not match, it detects an integrity violation.


It performs a cryptographic hash of the Key Value Pair of World State returned by the Peer and checks it against the Integrity Metadata from the Ledger Enclave for that key. If the hashes do not match, it detects an integrity violation.

This raises the question - how do you validate that keys do not exist?

There is likely a better way of doing it, such as consistently maintaining an accumulator over the world state, and keeping track only of its value in the FPC persistent storage and when you query the peer, the peer provides a membership proof for the accumulator, or a non membership proof of the accumulator.

One construction for an accumulator is a sorted Merkle tree.

For keys which do not exist, there will be no metadata which would wrongly validate its existence. For keys which do exist but which a malicious peer tries to hide, we also cross-check that tags reported as non-existing indeed do not exist in our metadata.

Regarding merkle-tree, this was implicitly referred to as part of the "Trusted Ledger State Snapshot:" in the road-map features section.

For keys which do not exist, there will be no metadata which would wrongly validate its existence

But don't you need to prove that it is indeed the case? Otherwise I can simply trick the FPC to compute the wrong output, no?

What I forgot to mention: when the trusted ledger does not find metadata, it will notify the chaincode enclave of that fact in a cryptographically protected way (over the secure channel between the trusted ledger enclave and the chaincode enclave). That way, the chaincode enclave can validate responses from the untrusted peer, whether it provides a value or claims that no value exists for a given key.
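
The check discussed in this thread can be sketched as follows. This is a simplified illustration, not the FPC implementation: `LedgerEnclave`, `verify_peer_response` and the length-prefixed SHA-256 encoding are assumptions, and the secure channel between the two enclaves is modeled as a plain method call.

```python
import hashlib
import hmac


def digest(key: str, value: bytes) -> bytes:
    # Length-prefix the key so that different (key, value) splits cannot collide.
    k = key.encode()
    return hashlib.sha256(len(k).to_bytes(4, "big") + k + value).digest()


class LedgerEnclave:
    """Hypothetical trusted ledger: keeps one digest per committed key."""

    def __init__(self):
        self._meta = {}

    def commit(self, key, value):
        # Called after validating a block; a value of None models a delete.
        if value is None:
            self._meta.pop(key, None)
        else:
            self._meta[key] = digest(key, value)

    def metadata(self, key):
        # Returns the digest, or None as an authenticated "no such key" answer.
        return self._meta.get(key)


def verify_peer_response(ledger_enclave, key, peer_value):
    """Run inside the chaincode enclave on every value returned by the untrusted peer."""
    expected = ledger_enclave.metadata(key)
    if expected is None:
        # The key does not exist: the peer must not invent a value.
        return peer_value is None
    if peer_value is None:
        # The key exists: the peer must not hide it.
        return False
    return hmac.compare_digest(digest(key, peer_value), expected)
```

Both failure modes raised above are covered: a peer that returns a wrong or invented value, and a peer that claims an existing key is absent.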

yacovm · 2020-07-06T14:39:38Z

text/0000-fabric-private-chaincode.md

+
+![Encryption](../images/fpc/high-level/FPC-Encryption.png)
+
+Encrypted elements of the FPC architecture (over and above those in Fabric, such as TLS tunnels from Client to Peer) include:


The enumerations below are extremely high-level and obscure. To reason about security, I think the following should be explained:

What keys (symmetric, asymmetric) encrypt the data

Who owns the keys

How and where are the keys generated

How are the keys being distributed

The UML diagrams referenced further down have all the corresponding details.

I can't understand this. The zoom level in which the text is big enough to be readable, prevents you from seeing anything but a small section of the diagram so you lose context. If you zoom out, the text becomes unreadable and then you don't understand anything.

Can't you just write a short summary?

(see below)

yacovm · 2020-07-06T14:43:31Z

text/0000-fabric-private-chaincode.md

+    Thereby, the FPC Shim creates an new Chaincode Enclave running the FPC Chaincode.
+    In particular, when creating the enclave, the FPC Shim also fetches the MRENCLAVE and Channel ID of the Ledger Enclave and passes them with the create command.
+	Now, having all necessary information, the new Enclave proceeds with the key generation protocol.
+	It creates public/private key-pairs for signing and encryption. The public signing key is also used as enclave identifier.


It creates public/private key-pairs for signing and encryption.

How does this work in the case of several enclave instances?
Like, you have an enclave in your bank, but i want to have an enclave in my bank, so i need your encryption key as well.
Since key generation relies on secret entropy, one way or another one of us needs to transfer the other some random bytes. How do we do that? Do we have some kind of infrastructure to do key exchange? Do we have a global system setup of bootstrap encryption keys?

See the fpc-registration and fpc-key-dist UML diagrams referenced further down in the text, which give the details of the involved protocols.

Is there a paper that describes this UML flow in English? If I zoom out the text is unreadable, and if i zoom in i see like 1% of the entire diagram so I can't figure out what's going on...

Can you perhaps just write the high level idea?

There is no paper, but the source files (PUML) are fairly readable text files.
At a high level, as mentioned above, the first enclave creates all chaincode secrets and then distributes them to other enclaves, encrypted using the other enclave's encryption key. The key, of course, is to make sure that only authorized enclaves, i.e., enclaves running the correct code and with a consistent view on channel identity, chaincode definition and trusted ledger identity, get the key. The devil for security, though, is really in the details, which right now means the UML diagrams ...
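
The authorization condition in that summary can be illustrated with a small sketch. Everything here is hypothetical: the field names, the modeling of "consistent view" as simple equality, and the caller-supplied `encrypt` function standing in for public-key encryption. Attestation verification itself is assumed to have happened before an `AttestedIdentity` is constructed; the actual protocol is the one in the UML diagrams.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestedIdentity:
    """Attributes assumed to be taken from a verified attestation (names are hypothetical)."""
    mrenclave: str             # measurement of the chaincode enclave code
    channel_id: str
    chaincode_definition: str
    ledger_enclave_id: str     # identity of the trusted ledger the enclave is bound to
    encryption_key: str        # public encryption key of the enclave


class FirstEnclave:
    """The initial enclave: creates the chaincode secrets and hands them out."""

    def __init__(self, identity):
        self.identity = identity
        self._chaincode_secrets = secrets.token_bytes(32)

    def is_authorized(self, other):
        mine = self.identity
        return (
            other.mrenclave == mine.mrenclave
            and other.channel_id == mine.channel_id
            and other.chaincode_definition == mine.chaincode_definition
            and other.ledger_enclave_id == mine.ledger_enclave_id
        )

    def export_secrets(self, other, encrypt):
        # Secrets leave this enclave only encrypted for an authorized enclave.
        if not self.is_authorized(other):
            raise PermissionError("enclave is not authorized for this chaincode")
        return encrypt(other.encryption_key, self._chaincode_secrets)
```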

yacovm · 2020-07-06T14:49:04Z

text/0000-fabric-private-chaincode.md

+* Attestation: An important cryptographic feature of Trusted Execution Environments by which the hardware can reliably state exactly what software it is running. This statement is signed by the hardware so that anyone reading it can verify that the statement came from an actual valid and up-to-date TEE. The Attestation can cover both the software portion of the TEE itself as well as any program to be run in it. This attestation takes the form of a "quote" containing (among other things) a measurement of the code and data in the TEE and its software version number, which is signed by the TEE and can be verified by anyone using well known Public Key Infrastructure technology.
+
+* Trusted Execution Environment (TEE): The isolated secure environment in which programs run in encrypted memory, unreadable even by privileged users or system processes. FPC chaincodes run in TEEs.
+


Perhaps add a section that describes the capabilities / primitives of the TEE, such as:

The TEE can:

Produce a signature that every other TEE can verify

Store with integrity up to n values in persistent storage

Good point that we should be more explicit (e.g., as part of the TEE entry in the terminology section) about which properties we need in a TEE. The key properties, besides the already mentioned "encrypted memory, unreadable even by privileged users or system processes", are the ability to provide (publicly verifiable!) attestation and to seal state (i.e., encrypt data with a key which is known only to a particular enclave). These latter two properties are what enable, e.g., the two bullets you mention above.

mastersingh24 · 2020-07-08T10:05:41Z

text/0000-fabric-private-chaincode.md

+In particular, the FPC Client SDK extension will provide these core functions: 1) FPC transaction proposal creation (including transparent encryption of arguments) 2) FPC transaction proposal response validation and decryption of the result.
+The encryption and decryption is performed by the Client SDK "under the covers" without requiring any special action by the users, i.e.,
+the users still use normal `invoke`/`query` functions to issue FPC transaction invocations.
+For the MVP, we intend to support the peer CLI only. Extended support for the NodeSDK (or Go SDK) will be future work.



While ok for the labs project to use the CLI, let's not make the mistake of adding this to the current peer "CLI".
Pick one SDK and implement it there. I'd also say that you do not want to implement this inside the current SDKs ... meaning you should do something similar to how the higher level programming models work as a layer above the low level SDK code

The FPC client-side request and response handling can be done transparently as pre/post-transformation of request/response in a proxy pattern way. Our original plan was to create two corresponding go-functions for this transformation and wrap them in a small utility which can be used in our peer wrapper to transparently do the processing for the invoke and query commands. As the cli is currently used for CI and is convenient for first walk-thru/tutorials, we will still do that.

However, we hear you and have also added support for the Go SDK to the MVP plans. At a minimum, we would just expose and document the two functions mentioned above as a separate library. However, we will also investigate whether there is a straightforward way to implement a proxy on top of the normal SDK which will retain, with the exception of selecting a different package name or the like, the existing Fabric client programming interface (it seems the mock implementation does something similar).
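
The proxy pattern described here can be sketched as a thin wrapper around an existing invoke function. This is an illustration only: `FPCClientProxy` is a hypothetical name, the plan above mentions Go functions rather than Python, and `toy_protect`/`toy_unprotect` are base64 placeholders that provide no confidentiality; they merely mark where argument encryption and response decryption would go.

```python
import base64


def toy_protect(data: bytes) -> bytes:
    # Placeholder for encryption under the chaincode public key. NOT secure.
    return base64.b64encode(data)


def toy_unprotect(data: bytes) -> bytes:
    # Placeholder for decryption of the enclave's response. NOT secure.
    return base64.b64decode(data)


class FPCClientProxy:
    """Wraps an existing invoke function with pre- and post-transformations."""

    def __init__(self, invoke, protect=toy_protect, unprotect=toy_unprotect):
        self._invoke = invoke
        self._protect = protect
        self._unprotect = unprotect

    def invoke(self, chaincode, function, args):
        protected_args = [self._protect(a) for a in args]        # pre-transformation
        raw = self._invoke(chaincode, function, protected_args)  # unchanged client call
        return self._unprotect(raw)                              # post-transformation
```

Because the wrapper keeps the same `invoke` signature as the function it wraps, application code would only need to swap the object it calls.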

mastersingh24 · 2020-07-08T10:08:10Z

text/0000-fabric-private-chaincode.md

+The deployment of a FPC chaincode consists of multiple steps; it begins with the standard Fabric 2.0 Lifecycle mechanism to produce a chaincode definition that all participants agree upon. FPC requires adding some additional metadata to the chaincode definition, in particular a cryptographic hash (called the MRENCLAVE in the initial version of FPC) that identifies the code running inside the Chaincode Enclave. The FPC SDK provides tools to generate this MRENCLAVE in a way that enables all parties to screen the code upfront and produce this hash. Note that this is crucial for the security provided by the TEE and is part of the attestation capabilities.
+For more details on Intel SGX remote attestation we refer to the vendor [documentation](https://software.intel.com/content/www/us/en/develop/topics/software-guard-extensions/attestation-services.html).
+
+Once the chaincode definition for an FPC chaincode has been approved by the consortium, a Chaincode Enclave is created by calling `peer lifecycle chaincode createenclave`. This command triggers the creation of an enclave and registers it at the FPC Registry.  The registration includes the attestation of the Chaincode Enclave.


I think we need to find an alternative way of creating the enclave ... adding a command to the lifecycle seems like the wrong approach here.

If we had a CLI that is extendable via plugins (what's the status of https://github.com/hyperledger/fabric-cli ? Is it ongoing work? abandoned?) then it'd make sense to implement the support there.

Conceptually, the enclave creation is part of the chaincode lifecycle, hence our proposal to put it into the corresponding namespace of closely related commands in the peer CLI. Implementation-wise, we currently handle that with a simple script which wraps the normal command and extends it for FPC. The actual protocol message would initially be layered on top of a normal query flow.

To better understand your concern, is your concern the particular namespace used or that this command is exposed to admins at all? Or something else?

Changing the namespace would of course be trivial. For the short-term MVP solution with a designated peer, we could also simply hide the enclave creation and trigger it "under the covers". It is a bit less clear whether hiding enclave interaction still works easily if we go to more advanced endorsement policy models which involve additional key distribution protocol flows. Hence, we were conservative and thought we would propose something which is sure to work longer term as well. That said, we will of course try to handle additional requirements and other suggestions.

This PR proposes the *FPC 1.0* architecture and supersedes the earlier [PR#21](hyperledger#21). PR#21 proposed intrusive and complex changes for an initial realization of the concept of a (strongly) private chaincode. FPC 1.0 takes a different approach as it covers a smaller application domain but thereby avoids any change to the core Fabric code. How to broaden to the application domain covered in PR#21 is outlined in a roadmap of future work. Signed-off-by: Marcus Brandenburger <[email protected]> Co-authored-by: Michael Steiner <[email protected]> Co-authored-by: Bruno Vavala <[email protected]> Co-authored-by: Jeb Linton <[email protected]>

mbrandenburger · 2020-12-04T13:40:03Z

This PR is closed and superseded by #40, following the advice from @denyeart at the Fabric Contributor Meeting on Nov 11, 2020.

Initial Fabric Private Chaincode RFC

ccc3411

Signed-off-by: Marcus Brandenburger <[email protected]>

cendhu suggested changes Feb 24, 2020

MHBauer reviewed Mar 12, 2020

g2flyer mentioned this pull request May 14, 2020

FPC RFC hyperledger/fabric-private-chaincode#356

Closed

19 tasks

g2flyer and others added 3 commits June 9, 2020 09:32

Some spell-checking and small format improvements

99b4c15

Signed-off-by: Michael Steiner <[email protected]>

Merge pull request #1 from g2flyer/msteiner.spell-check

0c3d05d

Spellcheck and minor format improvements

mbrandenburger force-pushed the fpc-rfc branch from 577fe0a to b1beb4a Compare July 1, 2020 06:21

mbrandenburger requested review from cendhu and MHBauer July 3, 2020 06:21

yacovm reviewed Jul 6, 2020

mastersingh24 reviewed Jul 8, 2020

g2flyer mentioned this pull request Jul 9, 2020

RFC v3 hyperledger/fabric-private-chaincode#400

Closed

5 tasks

mbrandenburger mentioned this pull request Jul 10, 2020

Ledger Enclave Strategy hyperledger/fabric-private-chaincode#402

Open

3 tasks

mbrandenburger mentioned this pull request Dec 4, 2020

RFC Fabric Private Chaincode 1.0 #40

Merged

mbrandenburger closed this Dec 4, 2020


		A prototype implementation of FPC is available as Hyperledger Lab on github (https://github.com/hyperledger-labs/fabric-private-chaincode).

		# Motivation


		Overall, FPC can be considered an addition to Fabric wherein all chaincode computation relies only on the correctness of data provided by an authenticated Client or computed inside of and signed by a validated Trusted Execution Environment. It assumes that the Ordering Service is a trusted system element. The Endorsing Peer outside of the Trusted Execution Environment, however, is considered untrusted.

		Writing chaincode for FPC should come natural to developers familiar with Fabric as the programming model (e.g., chaincode lifecycle, chaincode invocations and state) is the same as for normal Fabric chaincode. The main differences are a (for now at least) different programming language (C++) and a shim API which, on the one hand, is somewhat simplified but, on the other hand, also contains FPC-specific extensions such as allowing for private as well as public ledger state as well as a secure (yet “replicable”) random-number generator. In particular, a developer can write chaincode largely without having to be aware that the code executes within a Trusted Execution Environment (TEE), often referred to as an Enclave. However, to understand the architecture and security underpinning it we introduce in the following several new concepts and terms. Except where noted otherwise, all elements of the architecture described below reside on a Fabric Peer.


		FPC Chaincode: A chaincode created by a developer to run in a Chaincode Enclave. Unlike regular Fabric chaincode, an Enclave Chaincode must currently be written in C++ using our FPC SDK. A future goal for the project is to support additional languages, e.g., by the use of WebAssembly.

		Chaincode Enclave: An enclave in which a single particular chaincode executes. The Chaincode Enclave contains the chaincode (called an Enclave Chaincode) to be executed, with its linked Chaincode Library. The FPC runtime is responsible for creating the Chaincode Enclave when the FPC chaincode is installed on the endorsing peers.


		Chaincode Library (also known as the FPC Shim): This is a shim interface exposed to an FPC Chaincode. The Chaincode Library contains two parts, one residing inside the enclave and another one outside the enclave. The shim interface exposed to the FPC chaincode is written primarily in C++, whereas the counterpart is written in Go and mainly responsible for communication between the Peer and the FPC chaincode. The Chaincode Library enables an Enclave Chaincode to talk to the Peer, in particular to invoke the chaincode and provide "normal" shim operations such as getState and putState. The FPC SDK provides the chaincode library to enable FPC chaincode developers to easily link to this library when compiling their FPC Chaincode and building a deployment artifact. It implements the interface from the Peer to the Enclave Chaincode, encrypts and decrypts state and arguments, and implements Attestation.

		Ledger Enclave: The Ledger Enclave is a crucial component for protecting the execution of an FPC chaincode. In particular, it is responsible for ensuring that an FPC chaincode processes only committed world state data, i.e., state data that comes from valid and committed transactions. The Ledger Enclave is also called the TLCC, Trusted Ledger Enclave or the Validation Enclave. The Ledger Enclave is a separate enclave which locally stores integrity metadata for validating information on the blockchain ledger. Like the Peer, it performs standard validation logic when a new block arrives, but it also creates and stores a cryptographic hash of each new key-value pair of the blockchain state. This makes it possible to verify that the data coming from the Peer is correct; remember that in the FPC setting we do not necessarily trust a (single) peer. The Chaincode Enclave uses the Ledger Enclave as its tamperproof source of information on the blockchain world state. The Ledger Enclave and a Chaincode Enclave interact through a secure channel that is initially established when the Chaincode Enclave is deployed. There is one Ledger Enclave per Channel on any given Peer. When a peer receives a new block from the ordering service, it forwards it to the Ledger Enclave. The Ledger Enclave is part of the FPC runtime.

		The FPC team welcomes the community’s advice on how each of these touch-points to Fabric should be handled going forward, and hopes to solidify its plans for each element through the RFC process.


		# Drawbacks

		The Ledger Enclave stores all attestation reports, as signed by the TEE vendor. Reports include information on what chaincode is running; what specific TEE is in use, including version number and hash of the chaincode, and including the public keys of the enclave. With the planned new key management scheme to support multiple endorsing peers, there may be many public keys instead of one. Upon creation each chaincode enclave generates public/private key pairs for signing and for encryption. The public signing key also denotes the chaincode enclave identity. In the new scheme, there will also be a public/private key pair and a symmetric state encryption key per FPC chaincode that are shared among all chaincode enclaves that run the same FPC chaincode using the key distribution protocol above.


		## FPC Chaincode Lifecycle


		Other intended future work includes new possible endorsement policies that can incorporate TEE policies across multiple TEE vendors. For instance, an FPC developer could write an endorsement policy to express the following: two SGX endorsements OR a single IBM Z endorsement OR two AMD SEV endorsements OR five XXX TEE endorsements.
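As a rough illustration of the kind of cross-vendor policy described above, the following Python sketch evaluates an "any one clause suffices" policy over a set of endorsements. Everything here is hypothetical: the vendor labels, the `POLICY` table, and the `(enclave_id, vendor)` representation of an endorsement are assumptions made for the example and are not part of Fabric or FPC. The placeholder "XXX TEE" clause is left out because no concrete vendor is named.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical policy table: vendor label -> number of endorsements required.
# The policy is satisfied as soon as ANY single clause is met (logical OR).
POLICY: Dict[str, int] = {"SGX": 2, "IBM_Z": 1, "AMD_SEV": 2}


def policy_satisfied(endorsements: Iterable[Tuple[str, str]],
                     policy: Dict[str, int]) -> bool:
    """Return True if the endorsements meet at least one clause of the policy.

    Each endorsement is an (enclave_id, vendor) pair. Repeated endorsements
    from the same enclave are counted once, so a single enclave cannot meet a
    threshold greater than one on its own.
    """
    distinct = set(endorsements)
    counts = Counter(vendor for _enclave_id, vendor in distinct)
    return any(counts[vendor] >= needed for vendor, needed in policy.items())
```

For example, `policy_satisfied([("e1", "SGX"), ("e2", "SGX")], POLICY)` meets the SGX clause, while a single SGX endorsement alone does not meet any clause.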

		## Deployment Process (in detail)


		This new model of trust for Fabric smart contracts makes possible high-stakes markets such as private voting systems and sealed-bid auctions, which aren’t supported by the existing model of privacy in Fabric because it requires endorsing peers to be fully trusted for confidentiality.


		## Fabric Touchpoints

		- Additional Metadata in the Channel and Chaincode Definitions: In order to ensure agreement on security-critical data across all Orgs on a Channel, we recommend that FPC introduce new metadata into the Channel Definition (for example, the TEE vendor’s Certificates) and into the Chaincode Definition, possibly using the “version” field (for example, the MRENCLAVE chaincode enclave identifier). In the initial implementation these metadata are Intel SGX-specific and hardcoded, and are recorded in the Enclave Registry, which makes runtime upgrades impossible. We hope to explore the possible use of Configtxgen and approveForMyOrg in future releases to address this issue.


		* Trusted Execution Environment (TEE): The isolated secure environment in which programs run in encrypted memory, unreadable even by privileged users or system processes. FPC chaincodes run in TEEs.

		* Enclave: The TEE technology used for the initial release of FPC will be Intel SGX. In SGX terminology a TEE is called an _enclave_. In this document and in general, the terms TEE and Enclave are considered interchangeable. Intel SGX is the first TEE technology supported as, to date, it is the only TEE with mature support for remote attestation as required by the FPC integrity architecture. However, our architecture is generic enough to also allow other implementations based on AMD SEV-SNP, ARM TrustZone, or other TEEs.


		![Invoke](../images/fpc/high-level/FPC-Invoke.png)

		The Client prepares the Invocation of an FPC Chaincode by first encrypting the arguments of the Chaincode Invocation using the public key specific to a particular Chaincode. This encryption happens completely transparently using our FPC Client SDK extension. This Transaction Proposal is then sent to the Endorsing Peer where a corresponding Chaincode Enclave resides. Depending on the Endorsement Policy the client may perform this step with one or more Endorsing Peers and their respective Chaincode Enclaves. (For simplicity we will continue describing the process for a single Endorsing Peer.) The Peer forwards the Transaction Proposal to its FPC Chaincode running inside the Chaincode Enclave. Inside the Enclave, the FPC Shim decrypts the Proposal and invokes the FPC Chaincode.


		## Subsequent Query and Validation of World State Data by FPC Chaincodes

		After the transaction described above, whenever an FPC Chaincode accesses the World State, the FPC Shim is responsible not only for passing the query to the Peer but also for querying the Ledger Enclave for the Integrity Metadata for the Key in question. It computes a cryptographic hash of the Key-Value Pair of World State returned by the Peer and checks it against the Integrity Metadata from the Ledger Enclave for that key. If the hashes do not match, it detects an integrity violation.
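The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the FPC implementation (which is written in C++ and Go): the class `LedgerEnclaveStub`, the functions `kv_digest` and `verified_get_state`, the choice of SHA-256, and the length-prefixed encoding of the key are all hypothetical.

```python
import hashlib
import hmac
from typing import Dict, Optional


class IntegrityViolation(Exception):
    """Raised when the Peer's data does not match the integrity metadata."""


def kv_digest(key: str, value: bytes) -> bytes:
    """Hash a key-value pair (assumed scheme: SHA-256 over a length-prefixed key plus value)."""
    key_bytes = key.encode("utf-8")
    h = hashlib.sha256()
    # The length prefix keeps ("ab", b"c") and ("a", b"bc") from hashing alike.
    h.update(len(key_bytes).to_bytes(4, "big"))
    h.update(key_bytes)
    h.update(value)
    return h.digest()


class LedgerEnclaveStub:
    """Hypothetical stand-in for the Ledger Enclave's integrity metadata store."""

    def __init__(self) -> None:
        self._metadata: Dict[str, bytes] = {}

    def commit(self, key: str, value: bytes) -> None:
        # Invoked for each write of a valid, committed transaction.
        self._metadata[key] = kv_digest(key, value)

    def integrity_metadata(self, key: str) -> Optional[bytes]:
        return self._metadata.get(key)


def verified_get_state(key: str, value_from_peer: bytes,
                       ledger_enclave: LedgerEnclaveStub) -> bytes:
    """Return the Peer's value only if it matches the Ledger Enclave's hash."""
    expected = ledger_enclave.integrity_metadata(key)
    actual = kv_digest(key, value_from_peer)
    if expected is None or not hmac.compare_digest(expected, actual):
        raise IntegrityViolation(f"world state for key {key!r} failed verification")
    return value_from_peer
```

In this sketch, a value that the untrusted Peer has altered or rolled back to an older version produces a different digest and is rejected before the chaincode ever sees it.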


		![Encryption](../images/fpc/high-level/FPC-Encryption.png)

		Encrypted elements of the FPC architecture (over and above those in Fabric, such as TLS tunnels from Client to Peer) include:

		* Attestation: An important cryptographic feature of Trusted Execution Environments by which the hardware can reliably state exactly what software it is running. This statement is signed by the hardware so that anyone reading it can verify that the statement came from an actual valid and up-to-date TEE. The Attestation can cover both the software portion of the TEE itself as well as any program to be run in it. This attestation takes the form of a "quote" containing (among other things) a measurement of the code and data in the TEE and its software version number, which is signed by the TEE and can be verified by anyone using well known Public Key Infrastructure technology.

	For the MVP, we intend to support the peer CLI only. Extended support for the NodeSDK (or Go SDK) will be future work.

