Implementing accessibility metadata #94

llemeurfr · 2019-04-24T16:59:23Z

During the 24/04/2019 call, the discussion led to:

We will use the EPUB OPF a11y metadata (or W3C Web Publications metadata) as a source
The RWPM will express each a11y metadata as arrays of strings
These metadata will be handled via the Readium accessibility mechanism
Each codebase will define helpers to “extract” a11y metadata from the object
These helpers will follow the Benetech UI recommendations (link below), i.e. we will have:
-- ScreenReaderFriendly() returning yes / no / unknown
-- Audiobook() returning yes / no
-- etc.

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/principles/

Can we agree this is the way to go?
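
To make the helper idea concrete, here is a minimal sketch. The function names mirror the ones listed above, but the metadata shape (`accessModeSufficient` as an array of token arrays) and the rules used (a sufficient set made only of `textual` or only of `auditory`) are assumptions for illustration, not something agreed on the call.

```javascript
// Hypothetical helpers. The metadata shape and the derivation rules
// below are assumptions made for this sketch only.
function hasSufficientMode(metadata, mode) {
  const sets = metadata.accessModeSufficient || [];
  // True when at least one sufficient set consists of exactly this mode.
  return sets.some((set) => set.length === 1 && set[0] === mode);
}

function screenReaderFriendly(metadata) {
  const sets = metadata.accessModeSufficient;
  // No accessModeSufficient metadata at all: we cannot tell.
  if (!sets || sets.length === 0) {
    return "unknown";
  }
  return hasSufficientMode(metadata, "textual") ? "yes" : "no";
}

function audiobook(metadata) {
  return hasSufficientMode(metadata, "auditory") ? "yes" : "no";
}
```

A reading app would call these on the parsed publication metadata and display the mapped answer rather than the raw properties.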

HadrienGardeur · 2019-04-24T17:38:26Z

I think that accessibilitySummary should either be a string or a localized string rather than an array of strings.

accessModeSufficient needs to be expressed as an array or array of strings (🙄).

JayPanoz · 2019-04-24T17:44:47Z

accessModeSufficient → this one is even mega super confusing as an author.

Had to use it a few weeks ago, in my very last e-production gig and I was like “WTF‽”

Quite frankly, I hope that they redesign it at some point. Usage makes it even more difficult to understand what the definition is in the first place. 😫

danielweck · 2020-04-15T10:48:01Z

Review of @JayPanoz's current draft: https://github.com/JayPanoz/architecture/blob/a11y-metadata-parsing/streamer/parser/a11y-metadata-parsing.md

Correction: "The array is created from the meta elements whose property attribute has the value ..." => not just EPUB3 meta + property, but also EPUB2 meta + name
Missing: a11y:certifierCredential is a meta + name in EPUB2, but in EPUB3 can be meta + property, or alternatively link + property (in which case the value is expected to be a URL)
Missing: a11y:certifierReport is a meta + name in EPUB2, but in EPUB3 it cannot be meta + property, it must be a link + property (the value must be a URL)
Correction: dcterms:conformsTo not link + rel, but link + property (in EPUB3), or meta + name in EPUB2.
Correction: the set of values for schema:accessibilityFeature is actually open-ended, due to the possible displayTransformability suffixes which map to CSS rules (typically: /font-size, /font-family, /line-height, /word-spacing, /letter-spacing, /color, /background-color, etc.). Also, note the missing highContrastAudio suffixes (/noBackground, /reducedBackground and /switchableBackground).
Clarification: although dcterms:conformsTo is strictly-speaking an open-ended choice of arbitrary URLs, it is likely one of: http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-a, http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-aa, http://www.idpf.org/epub/a11y/accessibility-20170105.html#wcag-aaa
Clarification: schema:accessMode, schema:accessibilityFeature, schema:accessibilityHazard and schema:accessibilitySummary are "required" properties (in terms of validation against the a11y conformance rules)
Cardinality: there is some ambiguity about which accessibility metadata can be repeated. For example, it does not make sense for schema:accessibilitySummary to repeat, yet the specification isn't clear about that, so there can potentially be several properties with the name/property in the EPUB package *.opf XML (a bit like dc:title). I think the R2 model should store them all, and it is the responsibility of the processor / consumer to figure out what to do with it (e.g. reading system can display the first one only, or a concatenation). The clearly repeatable properties are: schema:accessMode, schema:accessibilityFeature, schema:accessibilityHazard, schema:accessModeSufficient (the only one which allows comma-separated values from the enumerated list of tokens), schema:accessibilityAPI (although currently likely just ARIA), and schema:accessibilityControl. I guess it makes sense for these to be repeatable as well: dcterms:conformsTo, a11y:certifiedBy, and a11y:certifierCredential, but it would seem that a11y:certifierReport should be unique ... but then again, the R2 models should be ready for the possibility of several occurrences, I think.
To be debated: schema:accessModeSufficient can be repeated, and each occurrence is itself a comma-separated list of tokens from the enumeration. The current draft proposes to store these individual values as an array of tokens, rather than as the original linearized string. I am not so sure about this approach (I speak based on my own experience having implemented an editor for accessibility metadata), I think we should just naively preserve the original string value, with all its potential "weirdness" (e.g. insignificant whitespaces - or lack thereof - between tokens and comma separators, token ordering, duplicates, etc.)
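
The EPUB 2 versus EPUB 3 `meta` distinction and the "store every occurrence" point above can be sketched as follows. This is only an illustration: the input is a hypothetical, simplified list of already-parsed `<meta>` elements (not a real parser API), and `link` elements are deliberately left out.

```javascript
// Hypothetical, simplified view of OPF <meta> elements:
//   EPUB 2: { name, content }    (name + content attributes)
//   EPUB 3: { property, text }   (property attribute + text content)
function collectMetaValues(metaElements) {
  const result = {};
  for (const el of metaElements) {
    const isEpub3 = el.property !== undefined;
    const key = isEpub3 ? el.property : el.name;
    const value = isEpub3 ? el.text : el.content;
    if (!key || typeof value !== "string") {
      continue; // skip elements without a usable key or value
    }
    // Keep every occurrence, with the authored string untouched.
    // The consumer decides whether to show the first one, all of them, etc.
    if (!result[key]) {
      result[key] = [];
    }
    result[key].push(value);
  }
  return result;
}
```

With this shape, a repeated schema:accessibilitySummary simply yields an array with two entries, and a reading system can pick the first one or concatenate them.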

References:

…tecture#94 (comment)

danielweck · 2020-04-15T11:53:42Z

Note that r2-shared-js implements the above (nothing fancy, just boring repetitive parsing code), with careful handling of EPUB 2 name + content versus EPUB 3 property metadata, and of course special handling of metadata link + property for dcterms:conformsTo, a11y:certifierReport and optionally a11y:certifierCredential.

Code references:

https://github.com/readium/r2-shared-js/blob/77348ed92bdfdbf0e28573379d094a17297afc50/src/models/metadata.ts#L66-L217

https://github.com/readium/r2-shared-js/blob/77348ed92bdfdbf0e28573379d094a17297afc50/src/parser/epub.ts#L501-L710

danielweck · 2020-04-15T11:56:10Z

Side note: I do not know what the W3C webpub accessibility-report is, in relation to the specs linked above.

https://www.w3.org/TR/pub-manifest/#accessibility-report

danielweck · 2020-04-15T11:58:54Z

> **To be debated**: `schema:accessModeSufficient` can be repeated, and each occurrence is itself a comma-separated list of tokens from the enumeration. The current draft proposes to store these individual values as an array of tokens, rather than as the original linearized string. I am not so sure about this approach (I speak based on my own experience having implemented an editor for accessibility metadata), I think we should just naively preserve the original string value, with all its potential "weirdness" (e.g. insignificant whitespaces - or lack thereof - between tokens and comma separators, etc.)

Note that the W3C draft spec. breaks down individual tokens in the linearized comma-separated enumeration for the accessModeSufficient property:

https://www.w3.org/TR/pub-manifest/#accessibility
https://www.w3.org/TR/pub-manifest/#webidl-wpm

https://www.w3.org/TR/pub-manifest/#example-19-setting-accessiblity-metadata-for-a-publication-that-provides-alternative-text-and-long-descriptions-appropriate-for-each-image-enabling-it-to-be-read-in-purely-textual-form:

{
    …
    "accessMode"              : ["textual", "visual"],
    "accessibilityFeature"    : ["alternativeText", "longDescription"],
    "accessModeSufficient"    : [
        {
            "type"            : "ItemList",
            "itemListElement" : ["textual", "visual"]
        },
        {
            "type"            : "ItemList",
            "itemListElement" : ["textual"]
        }
    ],
    …
}
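
For comparison, going from a plain array-of-arrays of tokens to the ItemList form shown in this example is a one-line mapping. The sketch below assumes the tokens have already been split out of the comma-separated values; `toItemLists` is a hypothetical name, not part of any Readium codebase.

```javascript
// Hypothetical helper: wrap each array of tokens in the ItemList
// structure used by the W3C example above.
function toItemLists(tokenSets) {
  return tokenSets.map((tokens) => ({
    type: "ItemList",
    itemListElement: tokens.slice(), // copy, so the input is not shared
  }));
}

const accessModeSufficient = toItemLists([["textual", "visual"], ["textual"]]);
```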

danielweck · 2020-04-15T15:03:48Z

> The current draft proposes to store these individual values (schema:accessModeSufficient) as an array of tokens, rather than as the original linearized string. I am not so sure about this approach ...

So, in r2-shared-js I added a convenient utility helper function to decompose and normalize the original/authored AccessModeSufficient string (i.e. raw linearized comma-separated value, when parsed from EPUB) into a canonical "array-of-(array-of-(string))" form, with removed insignificant whitespace, eliminated duplicates, and preserved order (the duplicates are removed on the trailing edge of the matching iteration).

Unfortunately due to a limitation in the declarative JSON (de)serialization library used for the R2 models, I was not able to directly implement array-of-array (array-of-object works fine, we use it a lot, but because of how prototypal class inheritance works in Javascript, array-of-array seems a no-go) ... thus the convenient, but separate helper.

Thorium / readium-desktop will invoke this utility function as needed, in order to present the accessibility metadata as per the standard UX guidelines:
https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html

PS Javascript code:

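// Normalize each authored comma-separated value into an array of unique tokens.
const AccessModeSufficientParsed = AccessModeSufficient.map((ams) =>
                ams.split(",").
                map((token) => token.trim()).      // remove insignificant whitespace
                filter((token) => token.length).   // drop empty tokens (e.g. from ",,")
                reduce((pv, cv) => pv.includes(cv) ? pv : pv.concat(cv), [])). // keep first occurrence only
                filter((arr) => arr.length);       // drop values that ended up empty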

Example input/output:
["", " visual , textual ,, visual ", "auditory, auditory,,"]
=>
[["visual","textual"],["auditory"]]

HadrienGardeur · 2020-04-15T15:27:33Z

Aside from purely parsing and representing these metadata, I think that the real question remains: what can we actually use them for?

IMO the community around EPUB has failed so far to build compelling use cases of how these various properties can be leveraged.

I'd rather have less metadata and know what to actually make of them.

danielweck · 2020-04-15T15:31:22Z

> the real question remains: what can we actually use them for?

https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html

JayPanoz · 2020-04-15T18:18:26Z

@danielweck thanks for the review.

I must admit that I wasn’t particularly confident/comfortable with this draft, as accessibility metadata in EPUB isn’t necessarily my forte. Also, that was an external contribution in Blitz, whose default was modified later, since having everything by default instead of a reasonable subset might well have produced unreliable a11y metadata. So I’m indeed expecting quite a lot of massive changes to this draft.

HadrienGardeur · 2020-04-16T10:27:09Z

> the real question remains: what can we actually use them for?
>
> https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html

Sure that's better than nothing, but beyond displaying these metadata, how can we truly leverage them?

llemeurfr · 2020-04-16T10:32:03Z

> Sure that's better than nothing,

translate: this is already great :-)

> beyond displaying these metadata, how can we truly leverage them?

Using them (I mean the mapped information, e.g. "Screen reader friendly") as filters in reading app bookshelves is the next step.
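
As a rough illustration of such a filter: the sketch below assumes a hypothetical publication shape and a hypothetical `isScreenReaderFriendly` predicate. The rule it applies (at least one sufficient access mode set made only of `textual`) is an assumption, not something settled in this thread.

```javascript
// Hypothetical predicate over a hypothetical publication shape.
function isScreenReaderFriendly(publication) {
  const metadata = publication.metadata || {};
  const sets = metadata.accessModeSufficient || [];
  return sets.some(
    (set) => set.length > 0 && set.every((mode) => mode === "textual")
  );
}

// Generic bookshelf filter: any mapped a11y predicate can be plugged in.
function filterBookshelf(publications, predicate) {
  return publications.filter(predicate);
}

const shelf = [
  { title: "A", metadata: { accessModeSufficient: [["textual", "visual"], ["textual"]] } },
  { title: "B", metadata: { accessModeSufficient: [["visual"]] } },
  { title: "C", metadata: {} },
];
const friendlyTitles = filterBookshelf(shelf, isScreenReaderFriendly).map((p) => p.title);
```

Keeping the predicate separate from the filter means other mapped statements can reuse the same bookshelf code.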

danielweck · 2020-04-22T15:38:02Z

> Unfortunately due to a limitation in the declarative JSON (de)serialization library used for the R2 models, I was not able to directly implement array-of-array (array-of-object works fine, we use it a lot, but because of how prototypal class inheritance works in Javascript, array-of-array seems a no-go) ... thus the convenient, but separate helper.

This is now fixed properly, so that the JSON syntax is optimal without the need of a helper function.

danielweck · 2020-04-24T10:26:52Z

Another point of interest, cross-walk project (EPUB, Schema.org and ONIX):
http://www.a11ymetadata.org/the-specification/metadata-crosswalk/
https://docs.google.com/spreadsheets/d/e/2PACX-1vTBWK6YwcDNYQTjE5dodNsMaIqRDUWu9SLsNwiaAZIrGn3BKa7iVlnTM6Nw5aU_qFKMUBcThEXlQAds/pubhtml

Summary of various useful references thus far:

http://kb.daisy.org/publishing/docs/metadata/schema-org.html
http://kb.daisy.org/publishing/docs/metadata/evaluation.html
https://www.w3.org/wiki/WebSchemas/Accessibility
https://www.w3.org/TR/pub-manifest/#accessibility
https://w3c.github.io/publ-a11y/UX-Guide-Metadata/techniques/schema-org.html

PS: I am not sure about the accessibility-report link, which seems close to a11y:certifierReport?
https://www.w3.org/TR/pub-manifest/#accessibility-report

danielweck added a commit to readium/r2-shared-js that referenced this issue Apr 15, 2020

a11y metadata parsing from EPUB, and new R2 models, see readium/archi…

6314ead

…tecture#94 (comment)

JayPanoz mentioned this issue Apr 28, 2020

A11y metadata parsing for EPUB #135

Merged
