TAG review for v1.0 #94
No. The verification methods are always resolved using their IDs, not by happening to know a verification method is (supposedly) controlled by a particular controller and going to its controller document (perhaps in a cache) to find the verification method. This seems convoluted; one starts with the verification method ID (e.g., as expressed in a Notably, the text already says: "The following algorithm specifies how to safely retrieve a verification method..." -- and one would hope that anyone building a cache would only do so by safely retrieving a verification method first, not by doing it in an unsafe way and creating cache entries that could then allow avoiding safe retrieval. I suppose we could add a note that says (something like): "Don't build a cache by fetching controller documents, walking each one, and adding reverse entries from any verification method IDs found therein back to the controller document, as this is not a secure way to retrieve verification methods. Caches have to be built based on originally safe verification method retrieval processes or else they could allow unsafe retrieval." |
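A minimal sketch of the cache discipline described above, assuming a hypothetical `safe_retrieve` callable that stands in for the spec's verification method retrieval algorithm. The class name and the `(vm_id, relationship)` cache key are illustrative assumptions, not something the specification defines.

```python
from typing import Callable, Dict, Tuple

# Hypothetical signature: (verification method ID, verification relationship)
# -> verification method. It stands in for the spec's safe retrieval algorithm.
SafeRetriever = Callable[[str, str], dict]


class VerificationMethodCache:
    """Cache whose entries can only come from a safe retrieval.

    There is deliberately no method that accepts a controller document and
    adds reverse entries for the verification method IDs found inside it.
    """

    def __init__(self, safe_retrieve: SafeRetriever) -> None:
        self._safe_retrieve = safe_retrieve
        self._entries: Dict[Tuple[str, str], dict] = {}

    def get(self, vm_id: str, relationship: str) -> dict:
        key = (vm_id, relationship)
        if key not in self._entries:
            # A cache miss always goes through the safe retrieval path.
            self._entries[key] = self._safe_retrieve(vm_id, relationship)
        return self._entries[key]
```

The only way to populate the cache is a lookup that starts from a verification method ID, so a hostile controller document that merely mentions someone else's ID never gets a chance to write an entry.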
What choice do we have here? Certainly not to start designing something new from scratch. A well-known and well-established technology must be picked. The only question is whether the technology should allow extensibility. If yes, then there is no better choice than Multibase and Multihash: they are well established and well supported in all major and minor languages, and I'm happy to learn about better alternatives. If not, then you have to mandate one digest algorithm and one base encoding, but in the long term this choice, a choice you made on behalf of end users, won't be respected and will be seen as limiting. Implementers will then start inventing their own solutions to meet customer requirements, and there won't be any interoperability at all. Generally, on many W3C forums I see a lot of concerns about implementers not respecting this or that, with the implication that this is an issue. The truth is that as an implementer I must deliver what a customer requires, not the other way around. This concern is void. One could easily be sued for delivering a verifier that does not work as expected. Prohibiting extensibility is the sure way to make implementers stop respecting a spec.
(This is me discussing the issues; I'll go back and check for TAG consensus once things settle down.) @dlongley Re #94 (comment), why is it the right design for the verification methods to name themselves using absolute URLs instead of something that's explicitly scoped to the containing document? I see why it happened in a design that started as RDF -- RDF has no local names, just blank nodes and universal names -- but it seems vulnerable to bugs. @filip26 Re #94 (comment), the advice I've gotten from security experts is to "have one joint and keep it well oiled". The most obvious candidate for that joint in controller documents is the verification method's type, which means that type should determine everything else about the cryptographic system, from the signature algorithm to the hash to the binary->ascii encoding. If customer requirements change, you standardize a new value for that type field. You don't try to switch from base64 to base58 in place. This does imply that JWKs are also a mistake, but that's a lost battle while multihash is not. I think it's acceptable for a type to say "you must use base64, and encode it with multibase's initial ' |
Please, can you explain how multihash, multibase, or any other self-describing, well-documented/adopted format prevents you from doing so?
It's not, and you say it is, ... argumentation based on subjective feelings ends up like this. Please, let's go back to the first question I've asked. |
Please, can someone elaborate on why "we disagree"? Is it based on something?
I'm responding in my capacity as an Editor and not on behalf of the VCWG. We will try to review this response during W3C TPAC to see if the WG has consensus wrt. the suggestions below. @jyasskin wrote:
No problem, thank you for the thorough review. :)
The document exists primarily for two reasons: 1) the VC JOSE COSE specification authors did not want to create normative references to DIDs or Data Integrity, and 2) a few implementers wanted to generalize DID Documents to allow any URL in all places instead of just DID URLs. In other words, we are here because this was the compromise that achieved consensus in the VCWG. As an Editor, I agree that profiling is a non-trivial effort. That said, the DID WG has agreed to profile the Controller Document and build DID Core v1.1 on top of it, the VC JOSE COSE Editors have agreed to the same, and there is growing evidence that the ActivityPub community is doing things that look very close to what a Controller Document is (we're engaging with that community as well). We have three communities profiling so far, and that's better than three bespoke formats that do the same thing.
Yes, that is always a danger when you generalize a security format. That said, we know of no vulnerabilities now in the specifications that plan to profile the Controller Document and are monitoring how profiles are created in order to mitigate the concern raised above.
It sounds like the TAG would like more language in the specification on how to safely process a DID Document, but it's not clear what language would address the concern. Could you provide a few rough sentences on what the TAG would like the specification to say?
Removing JSON-LD support and using a centralized registry would lead to objections within the Working Group. What we have right now is where we have achieved consensus.
Hmm, selecting a single hash and digest algorithm has been discussed over the years and rejected for a variety of reasons (lack of algorithm agility, how do you pick the "right" algorithm, conflicting customer requirements across industries, there are legitimate needs for different base encodings, etc.). For example, some implementers want to use SHA2-256 while others want to use SHA2-384 to perform cryptographic strength matching. Some government customers are pushing for an upgrade to SHA3. While some implementers don't see that as necessary, others claim their customers require it. Similarly, the base encodings used in the VCWG appear in a variety of scenarios where picking one base-encoding format would create deployment issues. For example, base64url is an acceptable choice for base-encoding a value into a URL, but a poor choice when base-encoding a value into a QR Code (which is optimized for base45). Choosing one hash and one encoding mechanism has proven not to be workable for the diversity of use cases that the Working Group is addressing. That said, where the WG can pick one base encoding mechanism, such as with Data Integrity cryptosuites, it does that. Just because Multibase allows any base encoding does not mean we maximize the flexibility. For example, the ECDSA and EdDSA cryptosuites use base58 encoding only. Similarly, we only specify four multihash values, which are the ones we've heard are required based on customer demand.
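To make the self-describing formats under discussion concrete, here is a minimal sketch, not the specification's normative text. It assumes the Multibase prefixes `z` (base58btc) and `u` (base64url without padding) and the Multihash header bytes `0x12 0x20` for SHA2-256 and `0x20 0x30` for SHA2-384. All function names are illustrative.

```python
import base64
import hashlib

# Multihash header per algorithm: (algorithm code, digest length in bytes).
# Both codes are below 0x80, so each fits in a single varint byte.
MULTIHASH_HEADERS = {
    "sha2-256": (0x12, 0x20, hashlib.sha256),
    "sha2-384": (0x20, 0x30, hashlib.sha384),
}

_B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def multihash(data: bytes, algorithm: str = "sha2-256") -> bytes:
    """Prefix the digest with its algorithm code and length."""
    code, length, hasher = MULTIHASH_HEADERS[algorithm]
    return bytes([code, length]) + hasher(data).digest()


def base58btc_encode(data: bytes) -> str:
    number = int.from_bytes(data, "big")
    digits = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        digits = _B58[remainder] + digits
    # Each leading zero byte is written as the first alphabet character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + digits


def base58btc_decode(text: str) -> bytes:
    number = 0
    for char in text:
        number = number * 58 + _B58.index(char)
    leading_zeros = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return b"\x00" * leading_zeros + body


def multibase_encode(data: bytes, base: str) -> str:
    """Prefix the encoded string with one character naming the encoding."""
    if base == "base58btc":
        return "z" + base58btc_encode(data)
    if base == "base64url":
        return "u" + base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
    raise ValueError(f"unsupported base: {base}")


def multibase_decode(text: str) -> bytes:
    """Use the leading character to pick the decoder."""
    prefix, payload = text[0], text[1:]
    if prefix == "z":
        return base58btc_decode(payload)
    if prefix == "u":
        return base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4))
    raise ValueError(f"unsupported multibase prefix: {prefix}")
```

A profile that pins a single encoding, as the cryptosuites mentioned above do, would simply reject every prefix other than the one it mandates instead of dispatching on it.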
We have many implementations already of both formats in the wild, with multiple VCWG implementations having committed to the format.
The design of the
@dlongley provided a preliminary answer here: #94 (comment) We will discuss this item in the group in more detail at W3C TPAC 2024. |
From @jyasskin
Respectfully, this problem cannot be solved with registries. Not because we haven't figured out a way to do that yet, but because the point of the work is to solve identifier management without a registry. To wit, the web (including IP addresses and DNS) already embodies a wonderfully, operationally decentralized system. Unfortunately, that system is not decentralized from an authority perspective: interoperating with the public web relies on a centralized list of known root authorities. This work, decentralized identifiers, exists to solve that problem: how do we create a global identifier space that is NOT anchored to a centralized list of necessarily trusted authorities? The goal of the web has always been to democratize access to information. Moving beyond a centralized registry that gets to decide, however indirectly, who gets to participate as a first-class peer in the network is the point of the work. We have always seen DIDs as a continuation of the fundamental goals of TBL at the inception of the Web itself. Indeed, DIDs provide the most promising opportunity to connect the legacy web with Web3. Requiring that DIDs use a centralized registry would bring that opportunity for interoperability and integration to a halt.
The issue was discussed in a meeting on 2024-09-27
3.2. TAG review for v1.0 (issue controller-document#94)
See GitHub issue controller-document#94.
Brent Zundel: which is our TAG review issue.
Manu Sporny: first of all, thank you very much for the review, really appreciate it, I think that at a high level the TAG had a number of concerns around the document and some of the functionality in there. Some of it seemed to be more general uneasiness around some of the stuff that we are doing, and there were some very specific questions towards the end. At a high level, the TAG acknowledged that it was useful to express a more generalized form of this.
Jeffrey Yasskin: the worry that it might not be widely usable is not an argument against publication, maybe only against generic naming.
Manu Sporny: the group has struggled with whether this is worth it, we are now committed to getting it out there as we have dependencies in other specifications.
Pierre-Antoine Champin: I think that the linked web storage WG should probably try to reuse part of this, this needs to be discussed by the WG but I will encourage this.
Manu Sporny: some of the ActivityPub community would like this, BlueSky is using DID documents and would look at this.
Jeffrey Yasskin: That seems reasonable, security considerations to warn people about past vulnerabilities when they did this sort of thing.
Manu Sporny: There was also a note that said that you were happy to see that the document does not try to add a format that could be interpreted as JSON/JSON-LD at the same time, however there is discussion around using context parsing documents as JSON-LD using hash ??? values.
Jeffrey Yasskin: would like to introduce hadleybeeman, also on the TAG.
Ivan Herman: I do not know which version of the document you looked at, because there was a fairly extensive discussion on the JSON-LD presence in the document, it is now much more isolated than before, the document speaks about the vocabulary in general, it is much more concentrated now.
Jeffrey Yasskin: another thing that I saw in the discussion after we wrote this comment is that JSON-LD implementations are expected to internally inject the context instead of expecting it to be present, that helps with the interoperability concern about mixing JSON/JSON-LD.
Manu Sporny: high level, the desire is to make sure that no matter if you do JSON/JSON-LD, the outcome is the same, the meaning of all of the fields doesn't change between those two mechanisms.
Jeffrey Yasskin: I think that the TAG is likely to stay uncomfortable with this sort of document, but putting something in the specification to say it is a specification bug if you can get different results with JSON/JSON-LD would help.
Manu Sporny: we can certainly put that language in there.
Jeffrey Yasskin: We could write something about when it is appropriate to use JSON-LD, we may not have time or expertise to do this, but it would be good to be clear about the technical reasons to pick one or the other.
Manu Sporny: That would be helpful. Moving on to the next comment, skepticism that JSON-LD is necessary for controller document, extensibility could be achieved through registries.
You recognize that the DID WG sees registries as decentralized but the TAG disagrees.
Brent Zundel: maybe the controller document spec should strongly recommend or say that, when you profile this, pick one. That would be a step closer.
Jeffrey Yasskin: I had a couple thoughts. The first is that the base encoding is implied by QR codes or a couple other situations, but in controller document, it doesn't seem like you have to use the same base encoding as you might need for a QR code, you are conveying bits that you can re-encode or switch encodings on. I think I still prefer one base encoding. The argument about hashes is interesting, and I wonder if the document could explain the benefit of hash choices.
Hadley Beeman: I wonder if you have considered standardizing for use case, it would make a big step towards standardizing for interoperability. For example, for QR codes, standardize for base 45, you would have done the hard work for the implementers.
Manu Sporny: The reason multibase and multihash are in the controller document is historical, ideally they would be totally different specifications, they are in there because the implementation community was using multibase/multihash, not just controller documents that could use them, long history of different use cases. That said, the reason it is there is we needed normative documentation. I believe that work should be done, but I don't know if the right place is the controller document specification, for example nothing we are doing has anything to do with QR codes.
Hadley Beeman: we are regularly reminding ourselves that we have no power, you can do what you like, I will say that we have had this conversation before regarding interoperability around crypto, and that will continue to come up as a stumbling block. We will continue to say there are opportunities here. I was imagining per-use case profiles for the ecosystem.
Kevin Dean: as someone from the supply chain and barcodes, I would strongly recommend against aligning anything with an encoding mechanism for a specific barcode format. There is work underway to add support to other formats with different compression and encoding algorithms.
Hadley Beeman: Is it complex enough to not write up that nuance?
Kevin Dean: if you knew ahead of time you could, but from experience at GS1 with QR codes, we found that the compression algorithms built into the barcode format were about as good as anything we could come up with ourselves, not worth the extra effort.
Jeffrey Yasskin: The TAG has not come to consensus on if multibase/multihash is good to use, have already talked about how the spec profiles to only a few of the options therein, the TAG will continue discussing that.
Ivan Herman: as someone who is pretty much a newcomer in crypto related things, I have the impression that doing something like what you refer to goes beyond the scope of W3C. The crypto community is huge and has an enormous amount of work going at various organizations. Not up to W3C to make judgement calls on key formats, hash functions - the only thing we can do is give the possibility for various things to be used, up to various implementations to decide what to use. Not W3C's business.
Michael Jones: Ivan, I was once in a W3C WG that called itself Web Crypto that made choices about what algorithms and formats to use/deprecate.
Manu Sporny: jyasskin, this is about your comment on the TAG continuing to think about it. There is a common misconception around the multiformats that they suggest that you could use any of them in an application. What we are trying to say is no - you should pick one. The reason the multiformats exist is that the reality is that we have many different formats, and applications are using them without using any kind of header.
Manu Sporny: One of the pieces of feedback is that it seems risky in some cases - this has to do with who can make changes to a controller document. We have a field that can point to something else in the world, controller doc not always self contained, other authorities can have the ability to change a document.
Jeffrey Yasskin: This was not a request for a particular change, just to add text to security considerations.
Manu Sporny: agreed, but we should raise an issue about digest pinning.
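A minimal sketch of what digest pinning means in this discussion, assuming a hypothetical `fetch_bytes` callable and a SHA2-256 digest recorded alongside the URL. Neither the function name nor the choice of hex encoding comes from the specification.

```python
import hashlib
import hmac
from typing import Callable


def fetch_pinned(url: str, expected_sha256_hex: str,
                 fetch_bytes: Callable[[str], bytes]) -> bytes:
    """Fetch url and accept the content only if it matches the pinned digest.

    fetch_bytes is a hypothetical stand-in for whatever HTTP client the
    application uses; it receives the URL and returns the response body.
    """
    body = fetch_bytes(url)
    actual = hashlib.sha256(body).hexdigest()
    # compare_digest does a constant-time comparison of the two strings.
    if not hmac.compare_digest(actual, expected_sha256_hex.lower()):
        raise ValueError(f"content at {url} does not match the pinned digest")
    return body
```

Pinning ties a reference to one exact byte sequence, so any later change to the referenced document requires updating the referring document as well.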
Joe Andrieu: Just want to be a voice against digest pinning, DIDs' unique ability to have indirection between identifier and crypto material.
Brent Zundel: I think the group would oppose mandating digest pinning, maybe would support language making it optional.
Manu Sporny: Last set of feedback from the TAG has to do with a potential caching attack against keys. Currently in the spec we use a full URL to identify key information in the document (or you can), the question from the TAG is, if you have a full URL for Key A, and the attacker sees Key A, uses the same URL as the other person, and we have dmitriz over there, we know that the good actor wants to interact with dmitriz, we will interact with dmitriz first with Key A.
Jeffrey Yasskin: I did get this in the issue, it seems like a complication that arises because you are using JSON-LD where local names don't really exist where you wind up with names that look global but are only usable if they are in the requested document.
Manu Sporny: just to clarify, local names do not exist, but fragments do, fragments could achieve what you are suggesting. We could also say that URLs are invalid if they do not align with the base of the document and add tests to a test suite to test that. The concern is that we would have to be careful about how we do that, as there are use cases we have explored where you may keep key information that's yours external to the document. We would have to work through details, the controller is an example where you point outside of the document, and that is part of the security model. As a result of that feature, we have to care about external links, need to assume that is part of the core operating model.
Joe Andrieu: manu answered my direct question, which is that in DID document A, I can define a verificationMethod that is defined in another DID document/controller document.
Manu Sporny: they can, it's a confusion attack, meaning that you have DID doc A,B, DID doc B uses same identifiers as DID doc A, I could see there are variations of the attack where small misimplementations make the attack work, people need to defend themselves. I want to note this has nothing to do with JSON-LD and exists because we are using URLs so our security model is more complex.
Joe Andrieu: either I don't understand who is referring to who or I disagree. If DID document A is pointing, say attestation method, refers to a verification method in Document B, the listing of that in DID A is entirely under the control of DID document A, not DID document B.
Manu Sporny: that's not the attack. DID A has a URL that is Key A. DID B, the bad actor, will use the same URL that's in DID A for the purposes of confusing someone what the proper key is.
Jeffrey Yasskin: don't need to solve this right now, can put it in security considerations.
Brent Zundel: thank you for the time and review, anything else you want to express to the group from TAG, are we on the right track?
Jeffrey Yasskin: there are likely to be some concerns that the WG decides not to address, that is fine, you are on track to address the rest of the concerns.
Manu Sporny: jyasskin, you and I had a nice brief chat about continuing engagement with the TAG as we mature the work, the TAG will continue to have concerns about the work that the DID and VC WGs are doing, this is not resolvable in 6 months. Everyone being aware of that is good, I don't know what the engagement mechanism is other than horizontal review, but some discussion here goes beyond horizontal review comments, e.g. the discussion on multibase. What is the venue there?
Hadley Beeman: you can always open a TAG issue/TAG review, we would love to have discussions there other than "here is a done spec, please check it". We can offer help at the architecture stage, share our experience, connect you to people, etc.
Kevin Dean: Just would like to add a big +1 to that, I am still a member of the GS1 architecture group where we have the same model helping groups progress standards and ensure alignment, I would reiterate, as with the TAG, we don't bite.
Ivan Herman: We don't have to go into the details here, but what you say is something that should be better reflected in how the process works. The way current transitions go, and staff contacts communicate with the people in charge of these things, is different than what you said.
Hadley Beeman: There was some discussion of that this week.
@jyasskin and @hadleybeeman, thank you for engaging on behalf of the W3C TAG at the recent W3C Technical Plenary meeting in Anaheim. The transcript of that meeting can be found in the Github comment above this one. This comment is meant to summarize the changes we intend to make to the specification based on our conversations during W3C TPAC. Please let us know if you would like us to do more than what we propose below:
We'll link the PRs to each checkbox as they are raised. |
I think there's some nuance here. I agree that a controller document should not be able to set a verification method for an identifier other than the primary, singular "id" property--the document is canonical for just one identifier--but a controller SHOULD be able to set a verification method that uses an externally defined key because that is a reasonable policy decision for managing keys. However, there doesn't seem to be a way in the controller document to actually specify a verification method for an identifier that isn't the "id" of the base document. So, either I'm in disagreement (because external keys are a reasonable policy choice) or it seems to functionally be addressed (but could use better explanatory language). |
I agree with @jandrieu that there is some nuance here. I think any change here would ideally not eliminate use cases where "external VMs" can be referenced from controller documents. I think this can be done by requiring any externally referenced VM to be done only by reference and not by embedding, i.e., only the ID (a URL) of the verification method can be expressed in an external ("non-canonical" in Joe's parlance here) controller document. However, expressing an embedded VM that has an The spec should also more clearly state that any VM retrieval can only be safely performed through the VM retrieval algorithm (or an equivalent algorithm, as always). With this in mind, safely retrieving verification methods always means the starting point is a verification method ID, which is necessarily rooted in its expected "canonical" controller document URL, and an expected verification relationship (which can default to If the algorithm looks like this (which is already very close to what is in the spec):
Then using this algorithm to retrieve VMs will always result in finding the "canonical" controller document for the verification method, which is also the only acceptable place to express it in full (not just by reference). Now, this does not preclude another controller document from referencing external VMs. But note that verification retrieval algorithms won't start at that controller document, however -- they will start with a verification method ID as mentioned, eliminating any possible influence. For a use case for this, consider a VC with an issuer with an ID value of |
The issue was discussed in a meeting on 2024-10-09
3.3. TAG review for v1.0 (issue controller-document#94)
See GitHub issue controller-document#94.
Manu Sporny: the next item, subtopic issue 94, is the TAG's horizontal review of controller document.
Manu Sporny: we were joined by jyasskin at TPAC, there is an overlap with the PING's review around use cases, the second item is to clarify that the semantics between a "JSON interpretation" and a "JSON-LD interpretation" must be the same, and any differences are either a spec bug or an implementation bug. |
All PRs related to this issue have been created, reviewed, and merged. Closing. |
Sorry that this took so long. I'm pasting the comment the TAG agreed on this week, which is also in w3ctag/design-reviews#960 (comment):
We appreciate this effort to make the bag-of-keys functionality that Verifiable Credentials use more independent from the did: URL scheme. Beyond that, we're not confident that other systems will find much use in it, since the effort of profiling it is likely to be larger than the effort in defining a bespoke format. There is also a risk that defining a generic format will introduce security vulnerabilities into specific applications when libraries implement the generic format and fail to enforce the restrictions that those specific applications need. We've seen this in the past when generic JWT libraries allowed alg=none or symmetric keys in applications that were designed for asymmetric keys. While those specific flaws don't exist here, analogous ones might.
We were happy to see that this document doesn't try to define a format that can be interpreted as JSON and JSON-LD at the same time. Some of the discussion in issues has been worrying on that front: it sounds like some implementers might be intending to include @context properties, parse documents as JSON-LD using hash-pinned values for those @context URLs (which is better than not pinning them), and then interpret the result using an unspecified (though logical) mapping from URLs to the terms that this specification defines. We are concerned about such an implicit interoperability requirement that isn't captured in the format's specification, and we're concerned that attackers will find ways to exploit the complexity of JSON-LD context processing. We're also skeptical that JSON-LD provides benefits for a format designed for grouping cryptographic keys: interoperable extensibility can be achieved through IANA registries at least as well as through individually-owned URL prefixes. (We recognize that the DID WG sees registries as too-centralized, but we disagree.)
Some of us are concerned about the inclusion of multihash and multibase. We all think it's best to mandate that all implementations of this specification align on a single cryptographic digest algorithm and a single base encoding, to improve interoperability. We're split on whether it's a good idea to use the multihash and multibase formats to make those strings self-describing.
We don't see some security considerations that we were expecting to see:
It seems risky, at least in some cases, to say "https://some.url/ defines the keys that can approve changes to this document" without pinning the content of https://some.url/ either by hash or signature, and we don't see any facility in this specification to do that pinning. Where would that be defined?
If one controller document creates a "verification relationship" to "https://actor1.example/key1", can a hostile actor include a verification method in their controller document with "id": "https://actor1.example/key1" and cause their key to be trusted? https://www.w3.org/TR/2024/WD-controller-document-20240817/#retrieve-verification-method does say to fetch every verification method URL with no caching at all, but it seems unlikely that implementations will actually omit all caching.
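The second concern can be illustrated with a minimal sketch of a retrieval that always starts from the verification method ID. This is a simplification under stated assumptions, not the specification's algorithm: `fetch` is a hypothetical document loader, the example URLs use a fragment to separate the document from the key, and the key values are placeholders.

```python
from urllib.parse import urldefrag


def retrieve_verification_method(vm_id, relationship, fetch):
    """Resolve vm_id from its own base URL, never from a referring document."""
    document_url, _fragment = urldefrag(vm_id)
    document = fetch(document_url)
    if document.get("id") != document_url:
        raise ValueError("document id does not match the URL it was fetched from")
    embedded = {vm.get("id"): vm for vm in document.get("verificationMethod", [])}
    for entry in document.get(relationship, []):
        if isinstance(entry, dict):
            # Verification method embedded directly in the relationship.
            if entry.get("id") == vm_id:
                return entry
        elif entry == vm_id and vm_id in embedded:
            # Relationship refers to a method listed in the same document.
            return embedded[vm_id]
    raise LookupError(f"{vm_id} is not authorized for {relationship}")


# Hypothetical documents: the attacker reuses actor1's verification method ID.
DOCUMENTS = {
    "https://actor1.example/doc": {
        "id": "https://actor1.example/doc",
        "verificationMethod": [{
            "id": "https://actor1.example/doc#key1",
            "controller": "https://actor1.example/doc",
            "publicKeyMultibase": "zGenuineKeyPlaceholder",
        }],
        "authentication": ["https://actor1.example/doc#key1"],
    },
    "https://attacker.example/doc": {
        "id": "https://attacker.example/doc",
        "verificationMethod": [{
            "id": "https://actor1.example/doc#key1",
            "controller": "https://attacker.example/doc",
            "publicKeyMultibase": "zAttackerKeyPlaceholder",
        }],
        "authentication": ["https://actor1.example/doc#key1"],
    },
}


def fetch(url):
    """In-memory stand-in for dereferencing a controller document URL."""
    return DOCUMENTS[url]
```

Because the lookup derives the document URL from the verification method ID itself, the attacker's document is never consulted for `https://actor1.example/doc#key1`. The risk described above arises when an implementation instead indexes verification methods by ID from whatever documents it happens to have fetched.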