Regenerate cached attachments so that they get <title> tags #4071

Closed
kingqueen3065 opened this issue Jun 27, 2017 · 3 comments
Labels
f:request-management improvement Improves existing functionality (UI tweaks, refactoring, performance, etc) x:uk

Comments

@kingqueen3065
Collaborator

Google Search Console identifies that there are no HTML title tags for HTML versions of PDF, TXT etc. documents, e.g. https://www.whatdotheyknow.com/request/154598/response/381797/attach/html/2/Employee%20handbook%20Feb%20131.pdf.html

Adding a title tag would likely improve Google search results for this material. Perhaps it could use the title of the request to which the document is attached, or the filename?

@garethrees
Member

They do, as of around ea1e040#diff-dc48ad4750a95626efc6b2a6655af4c6R4. That html page is cached from some point before the <title> tags got introduced.

We could blow away the caches of old attachments – I don't think we'd need to be too worried about letting fresh caches get generated on demand. Do we get a list of URLs that the search console thinks are a problem?

Incidentally I've just noticed the old logo, so opened mysociety/whatdotheyknow-theme#410.

@kingqueen3065
Collaborator Author

kingqueen3065 reopened this Jul 3, 2017
garethrees changed the title from "Title tags" to "Regenerate cached attachments so that they get <title> tags" Jul 3, 2017
garethrees added the f:request-management and improvement (Improves existing functionality: UI tweaks, refactoring, performance, etc) labels May 29, 2018
@garethrees
Member

We've now started clearing attachment caches automatically after a while, so this effectively isn't a problem.

We've done this through a cron job which just removes any cached files once they are more than N days old, so we haven't fixed the problem within Alaveteli itself. #1006 is where any further improvements should take place.
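
For anyone wanting to reproduce this kind of age-based cleanup, here is a minimal sketch in Python. It is not the actual cron job; the function name `purge_stale_files`, its parameters, and the use of modification time as the age measure are all assumptions made for illustration.

```python
import time
from pathlib import Path


def purge_stale_files(cache_dir, max_age_days, now=None):
    """Delete regular files under cache_dir older than max_age_days.

    Hypothetical helper: age is judged by modification time, which is an
    assumption; the real job may use a different criterion.
    Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day

    removed = []
    # Materialise the listing first so deletions don't disturb iteration.
    for path in list(Path(cache_dir).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Run from cron with the cache directory and N as arguments, this would let stale attachment caches disappear so that the next request regenerates them on demand.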

I loaded a few attachments from the spreadsheet above and all were regenerated with a <title> tag (except actual .html attachments – always an edge case!)
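
A spot check like this could also be scripted rather than done by hand. The following is a rough sketch using only the Python standard library; `TitleFinder` and `extract_title` are hypothetical names, and fetching the cached page is left out.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collects the text of the first <title> element, if any."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None  # stays None when no <title> tag is seen

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is recorded.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the stripped <title> text, or None if the page has no title."""
    finder = TitleFinder()
    finder.feed(html)
    finder.close()
    return finder.title.strip() if finder.title is not None else None
```

Feeding each regenerated attachment page through `extract_title` and listing the URLs that return `None` would flag any caches that still predate the `<title>` change.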
