How to handle hostnames with ports? #7

Open
dmitrizagidulin opened this issue May 3, 2020 · 37 comments · Fixed by #38

Comments

@dmitrizagidulin
Collaborator

Now that we support paths (as of PR #5) in our did:web URLs, and we're using the : to encode the / characters for paths, this poses another challenge.

Since we're using the : character for paths, what do we do about the actual most common intended purpose of that character, which is to specify a port number?

So, specifically, say I am a developer who has just fired up a test server on my local machine, running on https://localhost:8443 (the https is deliberate, of course; I made a self-signed cert for it and everything). This kind of thing happens all the time (it's happening to me right now :) ).

If there's a did:web document residing on that domain (say in /.well-known/did.json), what will that URL look like? According to our rules so far:

did:web:localhost:8443

Except now we're using : (in the method-specific identifier portion of the url) to encode path segments. So that URL would "decode" to https://localhost/8443. Not what we want.
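
To make the failure concrete, here is a minimal sketch of a decoder that applies the colon-to-slash rule literally. The function name `naiveDidWebToUrl` is hypothetical and is not taken from any did:web library.

```typescript
// Hypothetical naive decoder: every ":" in the method-specific ID becomes "/".
function naiveDidWebToUrl(did: string): string {
  const prefix = "did:web:";
  if (did.slice(0, prefix.length) !== prefix) {
    throw new Error("not a did:web identifier: " + did);
  }
  const methodSpecificId = did.slice(prefix.length);
  // A port separator is indistinguishable from a path separator here.
  return "https://" + methodSpecificId.split(":").join("/");
}

// naiveDidWebToUrl("did:web:localhost:8443") returns "https://localhost/8443":
// the port has been misread as a path segment.
```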

So, what are people's thoughts on how to best handle this? @awoie, @OR13 ?

@OR13
Collaborator

OR13 commented May 4, 2020

retry logic I suppose.

@dmitrizagidulin
Collaborator Author

The other option we have is - we can require the hostname portion to be URL-encoded.
So, https://localhost:8443 would encode as did:web:localhost%3A8443.

@OR13
Collaborator

OR13 commented May 4, 2020

^ much better idea.

@felixwatts

Do we have a timeline for a resolution of this issue? As far as I can see this prevents actual adoption in a real life scenario.

@OR13
Collaborator

OR13 commented Sep 1, 2020

@felixwatts you could just use a URL instead of a DID, and return a did document with the same URL everywhere the DID would be.

@dmitrizagidulin
Collaborator Author

@felixwatts - apologies, I did the thing where I implemented the proposed solution and thought I updated the spec but didn't. Will be making a PR shortly.

@llorllale

@dmitrizagidulin is your PR arriving any time soon? Could you explain the proposed solution?

@llorllale

I guess the proposed solution is base64url-encoding the host portion as per #7 (comment) and what @felixwatts did.

@llorllale

We are going to adopt the base64url-encoding proposal in aries-framework-go.

@dmitrizagidulin
Collaborator Author

dmitrizagidulin commented Nov 11, 2020

@llorllale sure. I'll make the PR today.

The proposed solution is URL-encoding (as in, encodeURIComponent) both the host portion and each path portion.
So, the test vectors would be:

  1. did:web:localhost%3A8080 -> https://localhost:8080/.well-known/did.json
  2. https://example.com/path/some+subpath -> did:web:example.com:path:some%2Bsubpath
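
The URL-to-DID direction of these test vectors can be sketched as follows. This is only an illustration under stated assumptions: the function name `urlToDidWeb` is hypothetical, the input is assumed to be the https base location (not the did.json document URL), and query strings and fragments are not handled.

```typescript
// Hypothetical encoder: percent-encode the host (including any port) and each
// path segment with encodeURIComponent, then join the pieces with ":".
function urlToDidWeb(baseUrl: string): string {
  const scheme = "https://";
  if (baseUrl.slice(0, scheme.length) !== scheme) {
    throw new Error("expected an https:// URL: " + baseUrl);
  }
  // First element is host[:port]; the rest are path segments.
  const parts = baseUrl
    .slice(scheme.length)
    .split("/")
    .filter((part) => part.length > 0); // drop empty pieces from trailing slashes
  if (parts.length === 0) {
    throw new Error("missing host in URL: " + baseUrl);
  }
  return "did:web:" + parts.map((part) => encodeURIComponent(part)).join(":");
}

// urlToDidWeb("https://localhost:8080")
//   returns "did:web:localhost%3A8080"
// urlToDidWeb("https://example.com/path/some+subpath")
//   returns "did:web:example.com:path:some%2Bsubpath"
```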

@dmitrizagidulin
Collaborator Author

@llorllale -1 to base64url-encoding did:web URLs, though. (Since base64url-encoding removes one of the nice properties of did:web DIDs, which is, readability / recognition of the domain name.)

So for example, https://localhost:8080 would base64url-encode as did:web:bG9jYWxob3N0OjgwODA=, which is opaque to human eyes.

@llorllale

@dmitrizagidulin

> @llorllale -1 to base64url-encoding did:web URLs, though. (Since base64url-encoding removes one of the nice properties of did:web DIDs, which is, readability / recognition of the domain name.)
>
> So for example, https://localhost:8080 would base64url-encode as did:web:bG9jYWxob3N0OjgwODA=, which is opaque to human eyes.

Fully agree - I mixed them up this morning before coffee somehow.

@llorllale

> @llorllale sure. I'll make the PR today.
>
> The proposed solution is URL-encoding (as in, encodeURIComponent) both the host portion and each path portion.
> So, the test vectors would be:
>
>   1. did:web:localhost%3A8080 -> https://localhost:8080/.well-known/did.json
>   2. https://example.com/path/some+subpath -> did:web:example.com:path:some%2Bsubpath

+1

@llorllale

@dmitrizagidulin @OR13 I just realized url-encoding the path components results in a non-compliant DID as per the syntax: https://www.w3.org/TR/did-core/#did-syntax

@llorllale

llorllale commented Nov 19, 2020

Nevermind: https://tools.ietf.org/html/rfc3986#section-2.4

So in summary, if I understand it right:

  • parse (split) the did prefix, the method name, and the method-specific ID
  • url-decode the method-specific ID*
  • should implementations barf if a % is still present after decoding?
  • method-specific ID => s/:/\//g (replace every colon, not just the first)
  • HTTP GET https:// + previous result
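
A minimal sketch of these steps follows, with one deliberate change that is an assumption of this sketch rather than something settled in the thread: the method-specific ID is split on ":" before each piece is decoded. Decoding first would turn `%3A` back into ":" and the port would then be rewritten as a path segment. The function name `didWebToDocumentUrl` is hypothetical; the `/.well-known/did.json` and `/did.json` suffixes follow the examples given elsewhere in this thread.

```typescript
// Hypothetical resolver helper: maps a did:web DID to the URL of its document.
function didWebToDocumentUrl(did: string): string {
  const parts = did.split(":");
  if (parts.length < 3 || parts[0] !== "did" || parts[1] !== "web") {
    throw new Error("not a did:web identifier: " + did);
  }
  // Split first, decode second, so an encoded colon (%3A) survives as a port
  // separator. decodeURIComponent throws a URIError on malformed escapes.
  const segments = parts.slice(2).map((segment) => decodeURIComponent(segment));
  const host = segments[0];
  const path = segments.slice(1);
  if (host === "") {
    throw new Error("empty host in: " + did);
  }
  // No path: use the well-known location. Otherwise append did.json to the path.
  const suffix =
    path.length === 0 ? "/.well-known/did.json" : "/" + path.join("/") + "/did.json";
  return "https://" + host + suffix;
}

// didWebToDocumentUrl("did:web:localhost%3A8080")
//   returns "https://localhost:8080/.well-known/did.json"
```

Whether a literal % left over after decoding should be rejected is not answered by this sketch; it only surfaces malformed escapes through the `URIError` that `decodeURIComponent` throws.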

@llorllale

llorllale commented Nov 19, 2020

URI recommendations: https://www.w3.org/Addressing/URL/4_URI_Recommentations.html

The percent sign ("%", ASCII 25 hex) is used as the escape character in the encoding scheme and is never allowed for anything else.

Some test vectors for percent-encoding: https://www.w3.org/2004/04/uri-rel-test.html#reg-percent

@sk91

sk91 commented Feb 17, 2021

@llorllale Hi,

> I just realized url-encoding the path components results in a non-compliant DID as per the syntax: https://www.w3.org/TR/did-core/#did-syntax
>
> Nevermind: https://tools.ietf.org/html/rfc3986#section-2.4

Can you please clarify whether we should incorporate encodeURIComponent into our implementation or not?

@OR13
Collaborator

OR13 commented Feb 17, 2021

It sounds like did:web does not currently support percent-encoding (encodeURIComponent) or ports, and folks should assume that remains true until this issue is closed after the spec is updated.

@sk91

sk91 commented Feb 17, 2021

yeh, incorporating ngrok to use did:web in development. Seems like the easiest way to be compliant in development

@OR13
Collaborator

OR13 commented Feb 17, 2021

hah, nice i <3 ngrok.... thats an awesome idea.

@mirceanis

I'm not sure if this should be a separate issue or not, but it's related to encoding 😅.
The spec does not mention how to deal with non-ASCII domain names.

Punycode is an option for that, but it won't cover the port issue and is not as easily available as encodeURIComponent.
However, the did-core spec does not allow the % character in the method-specific-id, which is currently a blocker for encodeURIComponent.

@OR13
Collaborator

OR13 commented Feb 18, 2021

@mirceanis non-ASCII domain names cannot be DIDs. If the spec needs to be updated to support them, we need URL-safe bidirectional transformations.

@dmitrizagidulin
Collaborator Author

Update: The DID Core spec is being updated to accept % characters as part of the DID URI ABNF, as of PR w3c/did-core#703.
So, we'll go with percent url encoding, since that is now valid.

mirceanis added a commit to decentralized-identity/web-did-resolver that referenced this issue Mar 11, 2021
fixes #91
implements w3c-ccg/did-method-web#7

BREAKING CHANGES:
* resolver is now returning a DIDResolutionResult that wraps a DIDDocument
* No errors are thrown, instead returned as didResolutionMetadata.error/message
mirceanis added a commit to decentralized-identity/web-did-resolver that referenced this issue Mar 15, 2021
* feat: update to latest did spec

fixes #91
implements w3c-ccg/did-method-web#7

BREAKING CHANGE: Resolver now returns a `DIDResolutionResult` that wraps a DIDDocument. No errors are thrown, instead returned as `didResolutionMetadata.error/message`
@kdenhartog
Contributor

Updated the text for this in #38

@letmaik
Contributor

letmaik commented Nov 5, 2021

https://example.com/foo:bar is also an interesting case, since : is a valid path character in URIs. Would that become did:web:example.com:foo%3Abar? What if a URL already has encoded characters, are they decoded first and then re-encoded per path segment? For example, https://example.com/foo%3Abar -> https://example.com/foo:bar -> did:web:example.com:foo%3Abar. In this case, what URL does that DID refer to? https://example.com/foo:bar or https://example.com/foo%3Abar?
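
The two behaviours in question can be compared with a small sketch. Both helper names are hypothetical, and the thread does not settle which behaviour the spec should require.

```typescript
// Hypothetical helper: encode a path segment exactly as given.
// An escape that is already present gets escaped a second time.
function encodeSegmentRaw(segment: string): string {
  return encodeURIComponent(segment);
}

// Hypothetical helper: decode first, then re-encode, so that "foo:bar" and
// "foo%3Abar" both map to the same DID segment.
function encodeSegmentNormalized(segment: string): string {
  return encodeURIComponent(decodeURIComponent(segment));
}

// encodeSegmentRaw("foo:bar")          returns "foo%3Abar"
// encodeSegmentRaw("foo%3Abar")        returns "foo%253Abar"
// encodeSegmentNormalized("foo:bar")   returns "foo%3Abar"
// encodeSegmentNormalized("foo%3Abar") returns "foo%3Abar"
```

With normalization the mapping is not reversible: the DID alone no longer says whether the original URL contained `foo:bar` or `foo%3Abar`, which is exactly the ambiguity raised above.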

@dmitrizagidulin
Collaborator Author

@letmaik - good point, about : characters being allowed in the path segment of URLs. I'll update the proposal with that in mind.

@OR13
Collaborator

OR13 commented Nov 16, 2021

imo, did:web:example.com/foo:bar -> https://example.com/foo:bar/did.json... no problem here.

@letmaik
Contributor

letmaik commented Nov 17, 2021

> imo, did:web:example.com/foo:bar -> https://example.com/foo:bar/did.json... no problem here.

That's a DID URL, not a DID. I think this has to be solved for the DID itself, right?

@OR13
Collaborator

OR13 commented Nov 17, 2021

yes, that example was a did url with a path that contained a colon.

today:

> The method specific identifier MUST match the common name used in the SSL/TLS certificate, and it MUST NOT include IP addresses or port numbers.

if we wanted to allow the identifier to use ports:

did:web:localhost%3A3000 -> https://localhost:3000/did.json

and for completeness, here is a did url that uses ports and colons in paths:

did:web:example.com%3A1337/foo:bar -> https://example.com:1337/foo:bar/did.json

@letmaik
Contributor

letmaik commented Nov 17, 2021

> yes, that example was a did url with a path that contained a colon.
>
> today:
>
> > The method specific identifier MUST match the common name used in the SSL/TLS certificate, and it MUST NOT include IP addresses or port numbers.

But what follows after that is important:

> Directories and subdirectories MAY optionally be included, delimited by colons rather than slashes.

This applies to DIDs, not just DID URLs. Or are you suggesting to remove that part?

@dmitrizagidulin
Collaborator Author

@OR13

> imo, did:web:example.com/foo:bar -> https://example.com/foo:bar/did.json... no problem here.

The issue mentioned is in the other direction:

https://example.com/foo:bar/ to did:web:example.com:foo%3Abar

But that should still be covered by the overall algorithm (which requires percent-encoding of path segments when encoding from URL to DID).

@OR13
Collaborator

OR13 commented Nov 18, 2021

ahh yes, thanks for clarifying!

@gribneau
Contributor

Thinking in terms of the changes for #42 , the percent encoding would work like this:

did:web:example.com%3A1337:foo --> https://example.com:1337/foo/

and percent encoding in the path as well would work like this:

did:web:example.com%3A1337:foo%3Abar --> https://example.com:1337/foo:bar/

Is that correct?

@dmitrizagidulin
Collaborator Author

@gribneau -- that looks right, +1

@OR13
Collaborator

OR13 commented Nov 18, 2021

^ yes, i think thats the case we needed

@letmaik
Contributor

letmaik commented Nov 18, 2021

> and percent encoding in the path as well would work like this:
>
> did:web:example.com%3A1337:foo%3Abar --> https://example.com:1337/foo:bar/
>
> Is that correct?

What about IDNs? For the domain räksmörgås.josefsson.org, what would the did:web look like? Are the Unicode characters percent-encoded? I guess so. How does resolution work then? Since this is an IDN and the resolve algorithm assumes URLs, not IRIs, it would have to undo the percent-encoding and then convert via Punycode to a URL: https://xn--rksmrgs-5wao1o.josefsson.org.

I think this complicates things too much, and I would probably simplify this to URL-decoding each segment in full and putting the parts together to form an IRI. So, did:web:r%C3%A4ksm%C3%B6rg%C3%A5s.josefsson.org%3A1337:r%C3%A4ks becomes https://räksmörgås.josefsson.org:1337/räks (plus /did.json appended). If you need a URI, then that's an application concern, and most libraries accept UTF-8 encoded IRIs anyway these days. Doing it this way also covers the port automatically, so it doesn't have to be mentioned specifically in the spec.
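
The simplification suggested here (decode every segment in full and assemble an IRI, with no Punycode step) could look roughly like the sketch below. The function name `didWebToIri` is hypothetical, and appending `/did.json` is left to the caller.

```typescript
// Hypothetical helper: decode each ":"-separated segment in full and join the
// results into an IRI. No Punycode conversion is attempted; that is left to
// the application or HTTP library.
function didWebToIri(did: string): string {
  const parts = did.split(":");
  if (parts.length < 3 || parts[0] !== "did" || parts[1] !== "web") {
    throw new Error("not a did:web identifier: " + did);
  }
  // The first decoded segment is host[:port]; the rest are path segments.
  const segments = parts.slice(2).map((segment) => decodeURIComponent(segment));
  return "https://" + segments.join("/");
}

// didWebToIri("did:web:r%C3%A4ksm%C3%B6rg%C3%A5s.josefsson.org%3A1337:r%C3%A4ks")
//   returns "https://räksmörgås.josefsson.org:1337/räks"
```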

@dmitrizagidulin
Collaborator Author

@letmaik - Great point.
I'm reopening this issue as a reminder for us to address this in the next PR.
