Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manifest id should strip fragment at parse time #1121

Closed
mgiuca opened this issue May 6, 2024 · 1 comment · Fixed by #1122
Closed

Manifest id should strip fragment at parse time #1121

mgiuca opened this issue May 6, 2024 · 1 comment · Fixed by #1122
Assignees

Comments

@mgiuca
Copy link
Collaborator

mgiuca commented May 6, 2024

Currently, the id member does not have its fragment stripped in the parser, but both places where the spec mentions comparing identities, it uses exclude fragments. This strongly suggests that the canonical id should not include a fragment, and it's creating implementation problems.

There's even an example of the id parsing that leaves the fragment in, which is confusing:

json["id"] manifest["start_url"] manifest["id"]
undefined "https://example.com/my-app/#here" "https://example.com/my-app/#here"

Since for all intents and purposes, the "#here" is not part of the fragment (it would compare equal to any app id with the same origin, path and query, but a different fragment).

The problem with this is that it encourages user agents to retain the fragment when storing the identity in a database, but then do comparisons ignoring fragment later. However, if the id is used as a database primary key, or hashed before being compared, it can be difficult to correctly recognize two ids as the same app if they differ only by fragment.

The fragment should be stripped out at parse time, not at comparison time, so they represent the canonical id.

Historical analysis

I'm a bit confused because an earlier draft of this text in the original PR #988 included stripping out the fragment at parse time (discussion). But it wasn't in the final version that was merged and I don't see where it got taken out. Note that Chromium implemented stripping out the fragment at parse time, in the same week as that PR, citing the then-draft of the manifest, and still does.

We recently ran into an issue in Chromium (CC @phoglenix) where some old app database entries mysteriously had a fragment in them, which led us to investigate this. Although we can't figure out where those old entries came from (since Chromium always stripped out the fragment), it has highlighted the dangers of having fragments in the URL used as the primary key in a database, when the fragments are supposed to be ignored for comparison purposes.

@mgiuca mgiuca self-assigned this May 6, 2024
@mgiuca
Copy link
Collaborator Author

mgiuca commented May 6, 2024

Note that there is precedent for this: #793 was basically exactly the same issue with scope.

Prior to 2019, "scope" included the query and fragment, but the scope matching algorithm was expected to ignore those. This effectively meant that "scope" did not include the query and fragment, but the "processed manifest" version of scope did include those. We successfully changed it to strip query and fragment at parse time, and remove the check from the scope matching algorithm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant