Use URL hash fragment anchor for message permalink, add `id` attribute of message to jump on it
#238
@bkil What benefit are you trying to achieve with this? I assume you're after the permalink event scrolling into view even when JavaScript is disabled? Please note, we're not specifically optimizing for the disabled JavaScript case but simpler and semantic is better in terms of search engines which we do care about. I don't think search engines care about scroll though 🤔 We do need to set the Duplicating the event ID in the hash and
The way it is generated at present is actually inferior from an SEO standpoint. You now generate hundreds of pages per day (differentiated by the ID in the URI query), all containing the exact same content but somewhat interlinked, with the major difference being invisible SEO metadata and the single class hand crafted on top of the highlighted message substituting

Search engines have heuristics to detect such link farms and either penalize such results or downrank the whole domain for this. If keeping the continuation token is unavoidable, it may be included as long as it remains the same across links pointing towards the same wall of messages.
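To make the duplication argument concrete, here is a minimal TypeScript sketch comparing what a server, cache, or crawler sees for query-based and hash-based permalinks. The `archive.example` host and the `at` and `continuation` parameter names are assumptions chosen for illustration, not taken from the project. The fragment is never sent to the server, so links that differ only in the hash collapse to a single page URL.

```typescript
// Returns the part of a link that servers, caches and crawlers key on:
// everything except the fragment, which stays in the browser.
function serverVisibleUrl(link: string): string {
  const url = new URL(link);
  url.hash = ""; // an empty hash removes the fragment entirely
  return url.toString();
}

// Counts how many distinct pages a set of permalinks resolves to.
function distinctPages(links: string[]): number {
  return new Set(links.map(serverVisibleUrl)).size;
}

// Hypothetical page URL; the continuation token is identical for every link.
const page = "https://archive.example/room/date/2023/05/22?continuation=abc";

// Permalinks differentiated by a (hypothetical) `at` query parameter.
const queryLinks = [page + "&at=%24event1", page + "&at=%24event2"];

// Permalinks differentiated only by the hash fragment.
const hashLinks = [page + "#$event1", page + "#$event2"];

console.log(distinctPages(queryLinks)); // 2
console.log(distinctPages(hashLinks)); // 1
```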
For inspiration, this is how IndieWeb generates their online archive (backed by a git repository and a bridge between Slack-IRC-Matrix) with excellent JS & noJS accessibility and optimized for SEO:
@bkil Ahh, that's a really interesting point (especially in terms of caching)! But this seemed to work out fine for Gitter with the same URL pattern for permalinks. I don't think the Matrix Public Archive really qualifies as a link farm or spamdexing. Having a permalink for an item is pretty standard. You can even see this with Discourse or StackExchange sites. As an interesting point of comparison, in the case of StackExchange questions/answers, they do duplicate the answer ID in the URL and the hash (I assume the hash is for scrolling): https://stackoverflow.com/a/482129/796832 -> https://stackoverflow.com/questions/184618/what-is-the-best-comment-in-source-code-you-have-ever-encountered/482129#482129
I'm not sure about the distinction you're trying to make here? Can you give an example?
I also know of blog engines from the 90s that generate a similar URL including a message ID in both the hash and the query. Although all such ranking algorithms are proprietary, I'd probably allow for including a tiny bit of context around each referenced message; however, including a whole day's worth of chat on each separate page would definitely not fly with me. For tree-based or thread-based blog engines, this typically boils down to referring to a thread or subtree at a time, not the whole root every time. In the search engines I've tried, results that are accessible through content-unique URLs are ranked higher. I.e., answers are not at the top, as they have been downranked by The Algorithm.
Your linked StackOverflow example also includes this crucial piece:
Drawbacks of link differentiation via the query pointing to the same page:
Advantages:
Please create a new separate issue about adding this (with the SO example) ⏩ -> #251
Reddit and Twitter are good examples of this, but they are slightly different use cases since they support infinite nested levels of threads. Both include the permalink ID in the URL for reference. Reddit even has a

It's unclear what impact our current level of bulk surrounding messages has on SEO, but it's also something we haven't measured and not something I'm particularly worried about at this stage. Based on that experience with Gitter, I've seen plenty of relevant permalinks appear in Google. I'm leaning towards leaving things as-is.

In terms of the drawbacks you listed for using the

And in terms of following a reply-chain without a page reload (as long as the messages are on the page), this isn't really relevant since we can still accommodate that with the Hydrogen client-side JS. Caching seems like the most impactful benefit we could get from changing, but it's also not a total deal-breaker in my opinion with how it currently works.
Include the Matrix event ID in the URI hash, ex:
To make this work, we would also need to set the `id` attribute of each timeline message to the respective value (instead of the current `data-event-id`) so the browser will jump to it upon loading. You can use the `:target` CSS selector to highlight the matching message on the timeline with a different background and add a mark on the side as well.

If the backend for some reason would also need to access the event ID (without JavaScript) to return messages for the given date, consider adding it to both the query and the hash.
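As a rough illustration of the proposal, the TypeScript sketch below builds a permalink that carries the event ID in the hash and, optionally, in the query as well. The function names, the `archive.example` host, and the `at` parameter name are hypothetical and only serve the example. The second helper recovers the `id` value a browser would look for when it resolves the fragment, which is the value each timeline message would need in its `id` attribute.

```typescript
// Builds a permalink with the event ID in the hash. If `queryParam` is given
// (a hypothetical name such as "at"), the ID is duplicated in the query so a
// backend can read it without JavaScript.
function buildPermalink(
  pageUrl: string,
  eventId: string,
  queryParam?: string,
): string {
  const url = new URL(pageUrl);
  if (queryParam !== undefined) {
    url.searchParams.set(queryParam, eventId);
  }
  // Percent-encode so unusual characters survive in the fragment.
  url.hash = encodeURIComponent(eventId);
  return url.toString();
}

// Returns the element `id` that the fragment of `link` points at,
// i.e. the percent-decoded fragment without the leading "#".
function targetIdFromLink(link: string): string {
  return decodeURIComponent(new URL(link).hash.slice(1));
}

// Hypothetical usage with a made-up event ID.
const permalink = buildPermalink(
  "https://archive.example/room/date/2023/05/22",
  "$event1",
  "at",
);
console.log(permalink);
console.log(targetIdFromLink(permalink));
```

With markup along these lines, a stylesheet rule using `:target` would apply to whichever message carries the matching `id`, with no script involved.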
There were restrictions in former versions of HTML on the syntax of the ID, but from HTML5, it should be non-empty and can contain basically anything except whitespace:
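The HTML5 rule stated above (non-empty, no whitespace) is simple enough to check directly. The following is a minimal TypeScript sketch of such a check; the helper name and the sample event IDs are made up for illustration.

```typescript
// ASCII whitespace as listed by the HTML standard:
// tab, line feed, form feed, carriage return, and space.
const ASCII_WHITESPACE = /[\t\n\f\r ]/;

// Returns true if `value` is usable as an HTML5 `id` attribute value:
// at least one character and no ASCII whitespace.
function isValidHtmlId(value: string): boolean {
  return value.length > 0 && !ASCII_WHITESPACE.test(value);
}

console.log(isValidHtmlId("$event1:example.org")); // true
console.log(isValidHtmlId("")); // false
console.log(isValidHtmlId("event 1")); // false
```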