Regenerate cached attachments so that they get <title> tags #4071
Comments
They do, as of around ea1e040#diff-dc48ad4750a95626efc6b2a6655af4c6R4. That HTML page was cached at some point before then. We could blow away the caches of old attachments. I don't think we'd need to be too worried about letting fresh caches get generated on demand. Do we have a list of URLs that Search Console thinks are a problem? Incidentally, I've just noticed the old logo, so I opened mysociety/whatdotheyknow-theme#410.
We've now started clearing attachment caches automatically after a while, so this effectively isn't a problem any more. We've done this through a cron job that just removes any cached files more than N days old, so we haven't fixed the problem within Alaveteli itself. #1006 is where any further improvements should take place. I loaded a few attachments from the spreadsheet above and all were regenerated with a
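For anyone wanting to reproduce the age-based cleanup outside of our setup, here is a rough Python sketch of the idea. It is not the actual cron job: the function name `purge_old_cache`, its parameters, and the use of modification time as the age are all assumptions made for illustration.

```python
import time
from pathlib import Path


def purge_old_cache(cache_dir, max_age_days, now=None):
    """Delete regular files under cache_dir whose modification time is
    more than max_age_days in the past. Returns the removed paths.

    Hypothetical helper: modification time stands in for "age" here,
    because file creation time is not portable across filesystems.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    removed = []
    for path in Path(cache_dir).rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Anything removed this way would simply be regenerated on the next request, which is why clearing on a schedule works around the stale pages without changing Alaveteli itself.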
Google Search Console identifies that there are no HTML title tags for HTML versions of PDF, TXT etc. documents, e.g. https://www.whatdotheyknow.com/request/154598/response/381797/attach/html/2/Employee%20handbook%20Feb%20131.pdf.html
Adding a title tag would likely improve Google search results for this material. Perhaps use the title of the request to which the attachment belongs, or the filename?
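As a rough illustration of the suggestion, a title could be derived along these lines. This is only a Python sketch with a hypothetical function name and arguments. It is not how Alaveteli builds the page, and the "filename - request title" format is just one possible choice.

```python
from html import escape
from urllib.parse import unquote


def attachment_title(filename, request_title=None):
    """Build a <title> element for the HTML view of an attachment.

    Hypothetical helper: uses the decoded attachment filename and,
    when one is supplied, appends the title of the parent request.
    """
    name = unquote(filename)
    if name.endswith(".html"):
        name = name[: -len(".html")]  # drop the suffix added for the HTML view
    text = f"{name} - {request_title}" if request_title else name
    return f"<title>{escape(text)}</title>"
```

Escaping matters here because both filenames and request titles are user-supplied text.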