Merge pull request #14 from lsst-sqre/tickets/DM-40815
DM-40815: Embed Highwire Press metadata tag
Showing 19 changed files with 584 additions and 16 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,3 +11,8 @@ clean: | |
rm -rf docs/_build | ||
rm -rf docs/api | ||
rm -f demo/_build | ||
|
||
.PHONY: demo | ||
demo: | ||
npm run build | ||
tox run -e demo |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
### New features | ||
|
||
- Include common metadata in the technote HTML: | ||
|
||
- Standard HTML meta tags like `description`, and the `canonical` URL link. | ||
- Highwire Press meta tags (used by Google Scholar) | ||
- OpenGraph meta tags (used by social media and messaging apps) | ||
- microformats2 annotations on relevant elements | ||
- Custom data attributes on relevant elements (the link to the technote source repository) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
############################## | ||
Metadata published by technote | ||
############################## | ||
|
||
Technote publishes metadata with HTML documents. | ||
This metadata serves a number of purposes: search engine optimization, inclusion in Google Scholar, link unfurling in social media and messaging apps, and maintaining institutional documentation indices. | ||
Technote supports a number of metadata standards, including Highwire Press, Open Graph, microformats2, and custom element annotations with data attributes. | ||
This page describes the metadata that Technote publishes. | ||
|
||
Standard HTML metadata | ||
====================== | ||
|
||
Technote publishes standard HTML metadata: | ||
|
||
- ``meta name="title"`` is the document's title (h1 heading). | ||
- ``meta name="description"`` is the document's description derived from the ``abstract`` directive. | ||
- ``meta name="generator"`` is the name of the software that generated the document. Example: ``<meta name="generator" content="technote 1.0.0: https://technote.lsst.io">``. | ||
- ``link rel="canonical"`` is the canonical URL of the document, derived from the ``canonical_url`` field in a document's ``technote.toml`` configuration file. | ||
|
||
Highwire Press metadata | ||
======================= | ||
|
||
Google Scholar uses Highwire Press metadata to index literature. | ||
Technote publishes the following ``meta`` tags: | ||
|
||
- ``citation_title`` | ||
- ``citation_author`` | ||
- ``citation_author_institution`` | ||
- ``citation_author_email`` | ||
- ``citation_author_orcid`` | ||
- ``citation_date`` | ||
- ``citation_doi`` | ||
- ``citation_technical_report_number`` | ||
- ``citation_fulltext_html_url`` | ||
|
||
OpenGraph metadata | ||
================== | ||
|
||
Social media and messaging apps use OpenGraph metadata to unfurl links. | ||
Technote publishes the following ``meta`` tags: | ||
|
||
- ``og:title`` | ||
- ``og:description`` | ||
- ``og:url`` | ||
- ``og:type`` (always ``article``) | ||
- ``og:article:author`` | ||
- ``og:article:published_time`` | ||
- ``og:article:modified_time`` | ||
|
||
microformats2 metadata | ||
====================== | ||
|
||
microformats2 is a standard for annotating HTML elements that carry standard document metadata. | ||
The annotations are published as ``class`` attributes on HTML elements. | ||
|
||
- ``h-entry`` is applied to the container element for the document (including sidebars). | ||
- ``e-content`` is applied to the container element for the document's content. | ||
- ``p-summary`` is applied to the abstract's container section. | ||
- ``p-author`` is applied to the name of each author. | ||
- ``dt-updated`` is applied to the date element of the last update. | ||
- ``dt-published`` is applied to the date element of the original publication date. | ||
|
||
Element data attributes | ||
======================= | ||
|
||
Technote publishes on-page metadata that is not covered by the standards above as data attributes on HTML elements. | ||
|
||
- ``data-technote-source-url`` is set to the URL of the source repository for the document (e.g. on GitHub). This data attribute is applied to the ``a`` element that links to the source repository. |
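The tags listed in this documentation page can be read back from a rendered page with the Python standard library. The following is a minimal sketch, not part of the commit: `MetaTagCollector`, `collect_metadata`, and the sample HTML values are hypothetical names and data chosen for illustration. It assumes Highwire tags are keyed by a `name` attribute (as in the formatter in this commit) and falls back to `property`, which Open Graph tags commonly use.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect (key, content) pairs from ``<meta>`` elements."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # Highwire tags use ``name``; Open Graph tags commonly use ``property``.
        key = attr_map.get("name") or attr_map.get("property")
        content = attr_map.get("content")
        if key and content is not None:
            self.tags.append((key, content))


def collect_metadata(html: str) -> dict[str, list[str]]:
    """Group meta tag contents by key, keeping repeated keys in page order."""
    parser = MetaTagCollector()
    parser.feed(html)
    grouped: dict[str, list[str]] = {}
    for key, content in parser.tags:
        grouped.setdefault(key, []).append(content)
    return grouped


# Hypothetical page fragment; the values are made up for illustration.
SAMPLE_HTML = """
<head>
<meta name="citation_title" content="Example technote">
<meta name="citation_author" content="Ada Example">
<meta name="citation_author" content="Bo Sample">
<meta property="og:type" content="article">
</head>
"""

metadata = collect_metadata(SAMPLE_HTML)
```

Values are kept as lists because keys such as ``citation_author`` repeat once per author.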
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
"""Support for the Highwire Press schema for academic metadata in HTML.""" | ||
|
||
from __future__ import annotations | ||
|
||
from typing import TYPE_CHECKING | ||
|
||
from .metatagbase import MetaTagFormatterBase | ||
|
||
if TYPE_CHECKING: | ||
from technote.config import TechnoteTable | ||
|
||
|
||
class HighwireMetadata(MetaTagFormatterBase): | ||
"""A class that transforms technote metadata into Highwire metadata | ||
tags. | ||
Notes | ||
----- | ||
Resources for learning about Highwire metadata tags: | ||
- https://cheb.hatenablog.com/entry/2014/07/25/002548#f-c017c3cf | ||
- https://scholar.google.com/intl/en/scholar/inclusion.html#indexing | ||
""" | ||
|
||
def __init__( | ||
self, | ||
*, | ||
metadata: TechnoteTable, | ||
title: str, | ||
abstract: str | None = None, | ||
) -> None: | ||
self._metadata = metadata | ||
self._title = title | ||
self._abstract = abstract | ||
|
||
@property | ||
def tag_attributes(self) -> list[str]: | ||
"""The names of class properties that create tags.""" | ||
return [ | ||
"title", | ||
"author_info", | ||
"date", | ||
"doi", | ||
"technical_report_number", | ||
"html_url", | ||
] | ||
|
||
@property | ||
def title(self) -> str: | ||
"""The title metadata.""" | ||
return f'<meta name="citation_title" content="{ self._title }">' | ||
|
||
@property | ||
def author_info(self) -> list[str]: | ||
"""The author metadata. | ||
Each author is represented with these tags: | ||
- ``citation_author`` | ||
- ``citation_author_institution`` | ||
- ``citation_author_email`` | ||
- ``citation_author_orcid`` | ||
""" | ||
authors = self._metadata.authors | ||
author_tags: list[str] = [] | ||
for author in authors: | ||
author_tags.append( | ||
self._format_tag("author", author.name.plain_text_name) | ||
) | ||
affil_tags = [ | ||
self._format_tag("author_institution", affiliation.name) | ||
for affiliation in author.affiliations | ||
if affiliation.name is not None | ||
] | ||
author_tags.extend(affil_tags) | ||
if author.email is not None: | ||
author_tags.append( | ||
self._format_tag("author_email", author.email) | ||
) | ||
if author.orcid is not None: | ||
author_tags.append( | ||
self._format_tag("author_orcid", str(author.orcid)) | ||
) | ||
return author_tags | ||
|
||
@property | ||
def date(self) -> str | None: | ||
"""The ``citation_date`` metadata tag.""" | ||
if self._metadata.date_updated is None: | ||
return None | ||
iso8601_date = self._metadata.date_updated.isoformat() | ||
return self._format_tag("date", iso8601_date) | ||
|
||
@property | ||
def doi(self) -> str | None: | ||
"""The ``citation_doi`` metadata tag.""" | ||
if self._metadata.doi is None: | ||
return None | ||
return self._format_tag("doi", str(self._metadata.doi)) | ||
|
||
@property | ||
def technical_report_number(self) -> str | None: | ||
"""The ``citation_technical_report_number`` metadata tag.""" | ||
if self._metadata.id is None: | ||
return None | ||
return self._format_tag("technical_report_number", self._metadata.id) | ||
|
||
@property | ||
def html_url(self) -> str | None: | ||
"""The ``citation_fulltext_html_url`` metadata tag.""" | ||
if self._metadata.canonical_url is None: | ||
return None | ||
return self._format_tag( | ||
"fulltext_html_url", str(self._metadata.canonical_url) | ||
) | ||
|
||
def _format_tag(self, name: str, content: str) -> str: | ||
"""Format a Highwire metadata tag.""" | ||
return ( | ||
f'<meta name="citation_{ name }" content="{ content }" ' | ||
f'data-highwire="true">' | ||
) |
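`_format_tag` interpolates the content string directly into the `content` attribute, so a title or affiliation containing a double quote or an ampersand would produce malformed HTML. One possible hardening, sketched here under the assumption that values arrive as plain text rather than pre-escaped HTML, is to pass them through `html.escape` from the standard library. The function name `format_citation_tag` is hypothetical and is not part of the commit.

```python
from html import escape


def format_citation_tag(name: str, content: str) -> str:
    """Hypothetical escaped variant of ``HighwireMetadata._format_tag``."""
    # quote=True also escapes double and single quotes, which matters
    # because the value is placed inside a quoted attribute.
    safe_content = escape(content, quote=True)
    return (
        f'<meta name="citation_{name}" content="{safe_content}" '
        f'data-highwire="true">'
    )


tag = format_citation_tag("title", 'The "fast" & simple pipeline')
```

Whether escaping belongs here or upstream depends on how the title and author fields are produced, which this diff does not show.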
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
"""Support for generating HTML meta tags.""" | ||
|
||
from __future__ import annotations | ||
|
||
from abc import ABC, abstractmethod | ||
|
||
|
||
class MetaTagFormatterBase(ABC): | ||
"""A base class for generating HTML meta tags.""" | ||
|
||
def __str__(self) -> str: | ||
"""Create the HTML meta tags.""" | ||
return self.as_html() | ||
|
||
@property | ||
@abstractmethod | ||
def tag_attributes(self) -> list[str]: | ||
"""The names of class properties that create tags.""" | ||
raise NotImplementedError | ||
|
||
def as_html(self) -> str: | ||
"""Create the HTML meta tags, one per line.""" | ||
tags: list[str] = [] | ||
for prop in self.tag_attributes: | ||
self.extend_not_none(tags, getattr(self, prop)) | ||
return "\n".join(tags) + "\n" | ||
|
||
@staticmethod | ||
def extend_not_none( | ||
entries: list[str], new_item: None | str | list[str] | ||
) -> None: | ||
"""Extend a list with new items if they are not None.""" | ||
if new_item is None: | ||
return | ||
if isinstance(new_item, str): | ||
entries.append(new_item) | ||
else: | ||
entries.extend(new_item) |
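The base class works by looking up each name in `tag_attributes` with `getattr` and accepting a string, a list of strings, or `None` from each property. The sketch below illustrates that contract with a condensed copy of the base class (so the example is self-contained) and a hypothetical `ExampleMetadata` subclass. The `example_*` tag names are invented for illustration and do not correspond to any metadata standard.

```python
from abc import ABC, abstractmethod
from typing import Optional


class MetaTagFormatterBase(ABC):
    """Condensed copy of the base class from the diff above."""

    @property
    @abstractmethod
    def tag_attributes(self) -> list[str]:
        raise NotImplementedError

    def as_html(self) -> str:
        tags: list[str] = []
        for prop in self.tag_attributes:
            value = getattr(self, prop)
            if value is None:
                continue  # properties may opt out by returning None
            if isinstance(value, str):
                tags.append(value)
            else:
                tags.extend(value)
        return "\n".join(tags) + "\n"

    def __str__(self) -> str:
        return self.as_html()


class ExampleMetadata(MetaTagFormatterBase):
    """Hypothetical subclass showing the three supported return types."""

    def __init__(self, title: str, keywords: list[str]) -> None:
        self._title = title
        self._keywords = keywords

    @property
    def tag_attributes(self) -> list[str]:
        return ["title", "keywords", "doi"]

    @property
    def title(self) -> str:
        return f'<meta name="example_title" content="{self._title}">'

    @property
    def keywords(self) -> list[str]:
        return [
            f'<meta name="example_keyword" content="{k}">'
            for k in self._keywords
        ]

    @property
    def doi(self) -> Optional[str]:
        return None  # skipped in the output


formatter = ExampleMetadata(title="Demo", keywords=["a", "b"])
html_output = str(formatter)
```

The order of names in `tag_attributes` determines the order of tags in the output, and a property that returns `None` contributes nothing.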