Implementing accessibility metadata #94

Open
llemeurfr opened this issue Apr 24, 2019 · 14 comments

Comments

@llemeurfr
Contributor

llemeurfr commented Apr 24, 2019

During the 24/04/2019 call, the discussion led to:

  • We will use the EPUB OPF a11y metadata (or W3C Web Publications metadata) as a source
  • The RWPM will express each a11y metadata property as an array of strings
  • These metadata will be handled via the Readium accessibility mechanism
  • Each codebase will define helpers to “extract” a11y metadata from the object
  • These helpers will follow the Benetech UI recommendations (link below), i.e. we will have:
    -- ScreenReaderFriendly() returning yes / no / unknown
    -- Audiobook() returning yes / no
    -- etc.

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/principles/

Can we agree this is the way to go?
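
A minimal sketch of what such helpers might look like, assuming the a11y metadata has already been parsed into an object whose accessMode is an array of strings and whose accessModeSufficient is an array of token arrays. The function names follow the list above; the object shape and the decision rules are assumptions for illustration, not something agreed during the call or taken from the UX guide.

```javascript
// Hypothetical helpers: names follow the proposal, decision rules are assumptions.

// Returns "yes" | "no" | "unknown".
function screenReaderFriendly(a11y) {
  const sets = a11y && a11y.accessModeSufficient;
  // Without accessModeSufficient we cannot tell either way.
  if (!Array.isArray(sets) || sets.length === 0) {
    return "unknown";
  }
  // Assumed rule: at least one sufficient set consists of "textual" alone.
  const textualAlone = sets.some(
    (set) => set.length === 1 && set[0] === "textual"
  );
  return textualAlone ? "yes" : "no";
}

// Returns "yes" | "no".
function audiobook(a11y) {
  const modes = (a11y && a11y.accessMode) || [];
  // Assumed rule: at least one access mode is declared and all are "auditory".
  return modes.length > 0 && modes.every((m) => m === "auditory") ? "yes" : "no";
}
```

Returning "unknown" when the property is absent keeps "no metadata" distinct from "declared as not screen reader friendly", which matters if the result is later shown to users.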

@HadrienGardeur
Contributor

I think that accessibilitySummary should either be a string or a localized string rather than an array of strings.

accessModeSufficient needs to be expressed as an array of arrays of strings (🙄).

@JayPanoz
Contributor

accessModeSufficient → this one is even mega super confusing as an author.

Had to use it a few weeks ago, in my very last e-production gig and I was like “WTF‽”

Quite frankly, I hope that they redesign it at some point. Usage makes it even more difficult to understand what the definition is in the first place. 😫

@danielweck
Member

danielweck commented Apr 15, 2020

Review of @JayPanoz's current draft: https://github.com/JayPanoz/architecture/blob/a11y-metadata-parsing/streamer/parser/a11y-metadata-parsing.md

  • Correction: "The array is created from the meta elements whose property attribute has the value ..." => not just EPUB3 meta + property, but also EPUB2 meta + name
  • Missing: a11y:certifierCredential is a meta + name in EPUB2, but in EPUB3 can be meta + property, or alternatively link + property (in which case the value is expected to be a URL)
  • Missing: a11y:certifierReport is a meta + name in EPUB2, but in EPUB3 it cannot be meta + property, it must be a link + property (the value must be a URL)
  • Correction: dcterms:conformsTo not link + rel, but link + property (in EPUB3), or meta + name in EPUB2.
  • Correction: the set of enumerated values for schema:accessibilityFeature is actually open-ended, due to the possible displayTransformability suffixes, which map to CSS properties (typically: /font-size, /font-family, /line-height, /word-spacing, /letter-spacing, /color, /background-color, etc.). Also, note the missing highContrastAudio suffixes (/noBackground, /reducedBackground and /switchableBackground).
  • Clarification: although dcterms:conformsTo is strictly-speaking an open-ended choice of arbitrary URLs, it is likely one of: http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-a, http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-aa, http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-aaa
  • Clarification: schema:accessMode, schema:accessibilityFeature, schema:accessibilityHazard and schema:accessibilitySummary are "required" properties (in terms of validation against the a11y conformance rules)
  • Cardinality: there is some ambiguity about which accessibility metadata can be repeated. For example, it does not make sense for schema:accessibilitySummary to repeat, yet the specification isn't clear about that, so there can potentially be several properties with the same name/property in the EPUB package *.opf XML (a bit like dc:title). I think the R2 model should store them all, and leave it to the processor / consumer to figure out what to do with them (e.g. a reading system can display the first one only, or a concatenation). The clearly repeatable properties are: schema:accessMode, schema:accessibilityFeature, schema:accessibilityHazard, schema:accessModeSufficient (the only one which allows comma-separated values from the enumerated list of tokens), schema:accessibilityAPI (although currently likely just ARIA), and schema:accessibilityControl. I guess it makes sense for these to be repeatable as well: dcterms:conformsTo, a11y:certifiedBy, and a11y:certifierCredential. It would seem that a11y:certifierReport should be unique, but then again, I think the R2 models should be ready for the possibility of several occurrences.
  • To be debated: schema:accessModeSufficient can be repeated, and each occurrence is itself a comma-separated list of tokens from the enumeration. The current draft proposes to store these individual values as an array of tokens, rather than as the original linearized string. I am not so sure about this approach (I speak based on my own experience having implemented an editor for accessibility metadata). I think we should just naively preserve the original string value, with all its potential "weirdness" (e.g. insignificant whitespace - or lack thereof - between tokens and comma separators, token ordering, duplicates, etc.)
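
The EPUB 2 versus EPUB 3 point and the "store them all" suggestion in this review can be sketched roughly as follows. This assumes the OPF meta elements were already read into plain objects; the object shape (name, content, property, text) and the function name collectValues are hypothetical and are not taken from r2-shared-js. The link-based forms are left out to keep the sketch short.

```javascript
// Hypothetical sketch: each OPF <meta> is assumed to be pre-parsed into a plain object.
//   EPUB 2: <meta name="..." content="..."/>   -> { name, content }
//   EPUB 3: <meta property="...">text</meta>   -> { property, text }
function collectValues(metas, key) {
  const values = [];
  for (const meta of metas) {
    if (meta.property === key && typeof meta.text === "string") {
      values.push(meta.text); // EPUB 3 form
    } else if (meta.name === key && typeof meta.content === "string") {
      values.push(meta.content); // EPUB 2 form
    }
  }
  // Every occurrence is kept, in document order and exactly as authored;
  // the consumer decides what to do with repeats.
  return values;
}

const metas = [
  { property: "schema:accessMode", text: "textual" },
  { name: "schema:accessMode", content: "visual" },
  { property: "schema:accessibilitySummary", text: "First summary." },
  { property: "schema:accessibilitySummary", text: "Second summary." },
];

const accessModes = collectValues(metas, "schema:accessMode");
const summaries = collectValues(metas, "schema:accessibilitySummary");
```

Here both accessibilitySummary occurrences survive parsing, so a reading system can still choose to show only the first one or a concatenation.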

References:

@danielweck
Member

danielweck commented Apr 15, 2020

Note that r2-shared-js implements the above (nothing fancy, just boring repetitive parsing code), with careful handling of EPUB 2 name + content versus EPUB 3 property metadata, and of course special handling of metadata link + property for dcterms:conformsTo, a11y:certifierReport and optionally a11y:certifierCredential.

Code references:

https://github.com/readium/r2-shared-js/blob/77348ed92bdfdbf0e28573379d094a17297afc50/src/models/metadata.ts#L66-L217

https://github.com/readium/r2-shared-js/blob/77348ed92bdfdbf0e28573379d094a17297afc50/src/parser/epub.ts#L501-L710

@danielweck
Member

Side note: I do not know what the W3C webpub accessibility-report is, in relation to the specs linked above.

https://www.w3.org/TR/pub-manifest/#accessibility-report

@danielweck
Member

danielweck commented Apr 15, 2020

To be debated: schema:accessModeSufficient can be repeated, and each occurrence is itself a comma-separated list of tokens from the enumeration. The current draft proposes to store these individual values as an array of tokens, rather than as the original linearized string. I am not so sure about this approach (I speak based on my own experience having implemented an editor for accessibility metadata), I think we should just naively preserve the original string value, with all its potential "weirdness" (e.g. insignificant whitespaces - or lack thereof - between tokens and comma separators, etc.)

Note that the W3C draft spec. breaks down individual tokens in the linearized comma-separated enumeration for the accessModeSufficient property:

https://www.w3.org/TR/pub-manifest/#accessibility
https://www.w3.org/TR/pub-manifest/#webidl-wpm

https://www.w3.org/TR/pub-manifest/#example-19-setting-accessiblity-metadata-for-a-publication-that-provides-alternative-text-and-long-descriptions-appropriate-for-each-image-enabling-it-to-be-read-in-purely-textual-form:

{
    …
    "accessMode"              : ["textual", "visual"],
    "accessibilityFeature"    : ["alternativeText", "longDescription"],
    "accessModeSufficient"    : [
        {
            "type"            : "ItemList",
            "itemListElement" : ["textual", "visual"]
        },
        {
            "type"            : "ItemList",
            "itemListElement" : ["textual"]
        }
    ],
    …
}
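
To relate this shape to sets of tokens, here is a small sketch that wraps already tokenized sets in the ItemList structure shown in the W3C example. The function name toItemLists is hypothetical, and the input is assumed to be an array of token arrays.

```javascript
// Hypothetical helper: wraps each set of access-mode tokens in the
// { type: "ItemList", itemListElement: [...] } shape from the W3C example.
function toItemLists(tokenSets) {
  return tokenSets.map((tokens) => ({
    type: "ItemList",
    itemListElement: [...tokens], // copy, so the input arrays are not shared
  }));
}

const lists = toItemLists([["textual", "visual"], ["textual"]]);
console.log(JSON.stringify(lists, null, 2));
```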

@danielweck
Member

danielweck commented Apr 15, 2020

The current draft proposes to store these individual values (schema:accessModeSufficient) as an array of tokens, rather than as the original linearized string. I am not so sure about this approach ...

So, in r2-shared-js I added a convenient utility helper function to decompose and normalize the original/authored AccessModeSufficient string (i.e. raw linearized comma-separated value, when parsed from EPUB) into a canonical "array-of-(array-of-(string))" form, with insignificant whitespace removed, duplicates eliminated, and order preserved (the first occurrence of each token is kept and later duplicates are dropped).

Unfortunately due to a limitation in the declarative JSON (de)serialization library used for the R2 models, I was not able to directly implement array-of-array (array-of-object works fine, we use it a lot, but because of how prototypal class inheritance works in Javascript, array-of-array seems a no-go) ... thus the convenient, but separate helper.

Thorium / readium-desktop will invoke this utility function as needed, in order to present the accessibility metadata as per the standard UX guidelines:
https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html

PS JavaScript code:

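// AccessModeSufficient: array of raw comma-separated strings, as authored.
const AccessModeSufficientParsed = AccessModeSufficient.map((ams) =>
                ams.split(",").
                map((token) => token.trim()).       // drop insignificant whitespace
                filter((token) => token.length).    // drop empty tokens
                // keep the first occurrence of each token, in order
                reduce((pv, cv) => pv.includes(cv) ? pv : pv.concat(cv), [])).
                filter((arr) => arr.length);        // drop empty sets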

Example input/output:
["", " visual , textual ,, visual ", "auditory, auditory,,"]
=>
[["visual","textual"],["auditory"]]

@HadrienGardeur
Contributor

Aside from purely parsing and representing these metadata, I think that the real question remains: what can we actually use them for?

IMO the community around EPUB has failed so far to build compelling use cases of how these various properties can be leveraged.

I'd rather have less metadata and know what to actually make of them.

@danielweck
Member

the real question remains: what can we actually use them for?

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html

@JayPanoz
Contributor

@danielweck thanks for the review.

I must admit that I wasn’t particularly confident/comfortable with this draft, as accessibility metadata in EPUB isn’t necessarily my forte (and, well, that was an external contribution in Blitz, whose default was later modified because having everything by default, instead of a reasonable subset, might well have produced unreliable a11y metadata), so I’m indeed expecting quite a lot of massive changes to this draft.

@HadrienGardeur
Contributor

the real question remains: what can we actually use them for?

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html

Sure that's better than nothing, but beyond displaying these metadata, how can we truly leverage them?

@llemeurfr
Contributor Author

Sure that's better than nothing,

translate: this is already great :-)

beyond displaying these metadata, how can we truly leverage them?

Using them (I mean the mapped information, e.g. "Screen reader friendly") as filters in reading app bookshelves is the next step.
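
As a rough sketch of that filtering step, assuming each bookshelf entry already carries a precomputed label such as "yes" / "no" / "unknown": the entry shape, the field name screenReaderFriendly and the function filterByLabel are all hypothetical.

```javascript
// Hypothetical bookshelf entries; the field names are assumptions.
const bookshelf = [
  { title: "Book A", screenReaderFriendly: "yes" },
  { title: "Book B", screenReaderFriendly: "no" },
  { title: "Book C", screenReaderFriendly: "unknown" },
];

// Keeps only the entries whose mapped label equals the wanted value.
function filterByLabel(entries, field, wanted) {
  return entries.filter((entry) => entry[field] === wanted);
}

const friendly = filterByLabel(bookshelf, "screenReaderFriendly", "yes");
console.log(friendly.map((entry) => entry.title));
```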

@danielweck
Member

Unfortunately due to a limitation in the declarative JSON (de)serialization library used for the R2 models, I was not able to directly implement array-of-array (array-of-object works fine, we use it a lot, but because of how prototypal class inheritance works in Javascript, array-of-array seems a no-go) ... thus the convenient, but separate helper.

This is now fixed properly, so that the JSON syntax is optimal without the need of a helper function.
