IIIF GraphQL - proof of concept #231

Open: wants to merge 5 commits into base: master

Owner

@stephenwf stephenwf commented Oct 20, 2018

IIIF GraphQL

The IIIF specification through GraphQL, over REST. That's the goal; this is the start. It is a heavy client-side implementation built on top of IIIF Redux, which acts as a safe-access layer and a unified API to both IIIF Presentation 2 and Presentation 3 resources.

The goal of this library is to offer both a server implementation and a client-side implementation. The server implementation will have the option to be hooked up to a non-IIIF data source, through a layer over the resolver API, for institutions' collections. It will also feature an out-of-the-box proxy to content hosted in the IIIF space, so you can host your own IIIF GraphQL instance that can query any IIIF content and compose it together for building your own applications and tools.

The client implementation will get better with time, but the main focus is building a single implementation that can work both on the server and on the client.

The why

At the moment the IIIF space is full of duplicate code, multiple solutions to the same problem and lots of already solved problems that are hard to discover. This implementation of GraphQL is intended to become a place to collaborate on solving common problems parsing IIIF data and building a library of GraphQL resolvers and custom queries that can be used quickly in small to large projects.

The what

Here's the dream query:

query getManifest($id: String, $q: String!) {
  manifest(id: $id) {
    id
    label
    metadata {
      label
      value
    }
    search(query: $q) {
      results {
        canvasId
        contentAsText
        selector {
          x
          y
          width
          height
        }
      }
    }
    canvases {
      id
      label
      thumbnail: thumbnailAtSize(size: 200) {
        src
        height
        width
      }
      tiledImages {
        height
        width
        tiles
        scaleFactor
      }
    }
  }
}

If you are unfamiliar with GraphQL, the key concept is that the query language describes the JSON format that you want returned. You can read this query and know what the fields will be, and so can predictably start using these fields.

Reading through this query, you can see we are grabbing some basic properties like id and label from the manifest. We are also requesting the metadata pairs. This is always an array, so each item in the array will match the format {label: "str", value: "str"}. Translation is done automatically using a global context (to be documented).
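The automatic translation is not documented yet, so the following is only a sketch of one way it might work. The `translate` and `translateMetadata` helpers and the fallback order are assumptions for illustration, not part of IIIF Redux. The sketch accepts both Presentation 3 language maps and the Presentation 2 string and `@value`/`@language` forms.

```javascript
// Hypothetical helper: reduce an IIIF label/value to a plain string.
// Not part of IIIF Redux; the name and fallback order are assumptions.
function translate(value, lang = "en") {
  if (value == null) return "";
  if (typeof value === "string") return value; // Presentation 2 plain string

  if (Array.isArray(value)) {
    // Presentation 2: [{ "@value": "...", "@language": "en" }, ...] or ["..."]
    const match =
      value.find(
        (v) => v && typeof v === "object" && v["@language"] === lang
      ) || value[0];
    return translate(match, lang);
  }

  if ("@value" in value) return String(value["@value"]); // single P2 object

  // Presentation 3 language map: { en: ["..."], none: ["..."] }
  const texts = value[lang] || value.none || Object.values(value)[0] || [];
  return texts.join(" ");
}

// Flatten metadata pairs to the { label, value } shape the query returns.
function translateMetadata(metadata = [], lang = "en") {
  return metadata.map((pair) => ({
    label: translate(pair.label, lang),
    value: translate(pair.value, lang),
  }));
}
```

With a language chosen once (the "global context"), a resolver could return plain strings for every label and value regardless of which Presentation version the source used.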

The next section is where the power of GraphQL starts coming through: there's a search block that accepts a query, just like the IIIF specification. Behind the scenes this will be doing many things:

  • Checking if the resource has a search service
  • Fetching the service
  • Constructing and executing the search
  • Parsing the response and formatting it in your desired format
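The steps above could be sketched roughly as follows. This assumes a IIIF Content Search 1.0 service whose description is embedded in the resource (so there is no separate fetch of the service itself) and annotation targets given as strings with an `#xywh=` fragment. All function names here are hypothetical, and `fetchJson` is an injected placeholder rather than a real API.

```javascript
const SEARCH_PROFILE = "http://iiif.io/api/search/1/search";

// Check whether the resource advertises a search service.
function findSearchService(resource) {
  const services = [].concat(resource.service || []);
  return services.find((s) => s.profile === SEARCH_PROFILE) || null;
}

// Construct the search request: the Search API takes the terms in `q`.
function buildSearchUrl(service, query) {
  const url = new URL(service["@id"] || service.id);
  url.searchParams.set("q", query);
  return url.toString();
}

// Parse the returned annotation list into the flat shape the query asks for.
// Only string targets ("canvas#xywh=x,y,w,h") are handled in this sketch.
function formatResults(annotationList) {
  return (annotationList.resources || []).map((anno) => {
    const [canvasId, fragment = ""] = String(anno.on).split("#xywh=");
    const [x, y, width, height] = fragment.split(",").map(Number);
    return {
      canvasId,
      contentAsText: anno.resource ? anno.resource.chars : "",
      selector: fragment ? { x, y, width, height } : null,
    };
  });
}

// Tie the steps together; `fetchJson(url)` is assumed to resolve to parsed JSON.
async function search(resource, query, fetchJson) {
  const service = findSearchService(resource);
  if (!service) return { results: [] };
  const response = await fetchJson(buildSearchUrl(service, query));
  return { results: formatResults(response) };
}
```

Keeping the network call injectable means the same resolver could run against HTTP on the client or talk to a data source directly on the server.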

In addition, because this is using IIIF Redux under the hood, any references (such as targets) in the search results or any request are maintained as logical links. IIIF Redux maintains that graph and makes it query-able here.
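As an illustration of what "logical links" could mean in practice, here is a minimal sketch of a normalized store in which references are kept as `{ id, type }` pairs and resolved on demand. The state shape and the `resolveRef` and `targetOf` names are assumptions for this example and are not taken from IIIF Redux.

```javascript
// Hypothetical normalized state: every resource is stored once, by type and
// id, and other resources point at it with a { id, type } reference.
const state = {
  Canvas: {
    "https://example.org/canvas/1": {
      id: "https://example.org/canvas/1",
      type: "Canvas",
      label: "Page 1",
    },
  },
  Annotation: {
    "https://example.org/anno/1": {
      id: "https://example.org/anno/1",
      type: "Annotation",
      target: { id: "https://example.org/canvas/1", type: "Canvas" },
    },
  },
};

// Follow a reference back to the stored resource, or null if not loaded.
function resolveRef(state, ref) {
  const byId = state[ref.type] || {};
  return byId[ref.id] || null;
}

// Resolve an annotation's target so a query can keep descending the graph.
function targetOf(state, annotationId) {
  const annotation = resolveRef(state, { id: annotationId, type: "Annotation" });
  return annotation ? resolveRef(state, annotation.target) : null;
}
```

Because the target is a reference rather than a copy, a search result can lead straight back to the canvas it points at, and a query can continue selecting fields from there.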

The next section, canvases, drops deeper into the graph of the manifest. You can go as deep as you want or need to here. You can see another non-IIIF property in the query, thumbnailAtSize, being assigned to the thumbnail property. This is another theoretical extension where some derived data is reduced into a handful of fields. Under the hood this could be:

  • Finding a thumbnail service
  • Finding a thumbnail static image, checking its height/width against your requested height
  • Looking for an image service in the annotations
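The fallback order above might look something like the sketch below. It is not the resolver from this pull request: the canvas shape loosely follows Presentation 3, `size` is treated as a target width, image services are only recognised when they declare a type, and all function names are hypothetical.

```javascript
// Find an image service on a resource, if one is declared with a type.
function imageServiceOf(resource) {
  const services = [].concat((resource && resource.service) || []);
  return (
    services.find((s) => /ImageService/.test(s.type || s["@type"] || "")) ||
    null
  );
}

// Build an Image API URL: {id}/{region}/{size}/{rotation}/{quality}.{format}
function fromService(service, size) {
  const id = service.id || service["@id"];
  const height =
    service.width && service.height
      ? Math.round((size * service.height) / service.width)
      : null; // unknown without the service's dimensions
  return { src: `${id}/full/${size},/0/default.jpg`, width: size, height };
}

function thumbnailAtSize(canvas, size) {
  const thumbs = [].concat(canvas.thumbnail || []);

  // 1. A thumbnail backed by an image service can be requested at any size.
  for (const thumb of thumbs) {
    const service = imageServiceOf(thumb);
    if (service) return fromService(service, size);
  }

  // 2. Otherwise take the smallest static thumbnail that is wide enough.
  const wideEnough = thumbs
    .filter((t) => t.width >= size)
    .sort((a, b) => a.width - b.width)[0];
  if (wideEnough) {
    const { id, width, height } = wideEnough;
    return { src: id, width, height };
  }

  // 3. Fall back to an image service on a painting annotation's body.
  for (const page of canvas.items || []) {
    for (const anno of page.items || []) {
      for (const body of [].concat(anno.body || [])) {
        const service = imageServiceOf(body);
        if (service) return fromService(service, size);
      }
    }
  }
  return null; // nothing usable was found
}
```

Each step is independent, so a better heuristic (for example, preferring sizes a service advertises) could replace one step without changing the field the query exposes.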

In the end, its goal is to find the best thumbnail given the criteria. Once created, it is immediately available in the query language for reuse, and it can be improved over time.

The how

Slowly. This is the start of a project that I am working on in my down-time. Currently, with IIIF Redux, you can use functional composition to create a bunch of queries for IIIF content, but the interface is low-level, albeit fail-safe. The goal of IIIF-GraphQL is to be both fail-safe and nice to work with.

GraphQL and Apollo, the library that this is using, have integrations with every frontend framework under the sun. There is no lock-in with this library, which broadens developer interest and collaboration.

Roadmap

The road map is not in any particular order. The first goal is a base for collaboration. The second goal is exploring some of the challenges in IIIF-space using GraphQL.

v1.0

  • Fully IIIF compliant GraphQL type definition
  • Full compatibility with Presentation 2 over a Presentation 3 query interface
  • Extension model to start adding derived fields to queries
  • v1.0 IIIF explorer using GraphQL
  • Cookbook of GraphQL queries for common UI interfaces

v1.x

  • W3C Annotation compliant GraphQL type definition
  • IIIF Image 2.x compliant GraphQL type definition
  • IIIF Activity stream compliant GraphQL type definition
  • IIIF Discovery integration in GraphQL server (invalidating caches)
  • IIIF Authentication compliant GraphQL type definition
  • Authentication extensions

@stephenwf
Owner Author

Current working query in this proof of concept:

const queryText = gql`
  query {
    getCollection(
      collectionId: "https://view.nls.uk/collections/7446/74466699.json"
    ) {
      id
      type
      label
      metadata {
        label
        value
      }
    }
    getManifest(
      manifestId: "https://wellcomelibrary.org/iiif/b20432033/manifest"
    ) {
      id
      type
      label
      metadata {
        label
        value
      }
    }
    getTest {
      label
    }
  }
`;

@coveralls

coveralls commented Oct 20, 2018

Pull Request Test Coverage Report for Build 542

  • 18 of 38 (47.37%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-1.3%) to 98.712%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
packages/iiif-redux/src/spaces/iiif-resource.js | 18 | 38 | 47.37%

Totals
Change from base Build 540: -1.3%
Covered Lines: 1634
Relevant Lines: 1654

💛 - Coveralls

@tomcrane

So, what's the relationship (potential relationship) between this, Presley and Mathmos?

@stephenwf
Owner Author

So I can think of a few integrations that would work for this case. The first is GraphQL mutations for Presley.

mutation {
  updateManifest(id: "...") {
    updateDescriptive(label: "Some new label", summary: "Some new summary") {
      label
      summary
    }
  }
}

This simple mutation updates a manifest with a new label and summary. The body of the mutation describes the response: the newly updated fields that were saved (label and summary), which are used to update our own local copy (keeping things reactive).
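To make the "update our own local copy" step concrete, here is a minimal sketch of merging the returned label and summary into local state and notifying subscribers. The `createLocalStore` helper is hypothetical; in practice a GraphQL client cache would normally do this bookkeeping, and nothing here reflects how Presley or IIIF Redux actually store data.

```javascript
// Hypothetical local store: holds manifests by id and notifies subscribers
// whenever a mutation result is merged in.
function createLocalStore(initial) {
  let state = initial;
  const listeners = new Set();

  return {
    get: () => state,

    // Register a callback; returns a function that unsubscribes it.
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },

    // Merge only the fields the mutation returned; other fields are kept.
    applyMutationResult(id, returnedFields) {
      state = { ...state, [id]: { ...state[id], ...returnedFields } };
      listeners.forEach((listener) => listener(state[id]));
    },
  };
}
```

Because the mutation returns exactly the fields that changed, the merge touches nothing else, and any UI subscribed to the store sees the new label and summary without refetching the manifest.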

Mathmos could either have its search API consumed over HTTP, or have a custom resolver for talking to it directly. It really depends on how far you want to go. This implementation will work as a proxy for content available over HTTP, so it is still interoperable, but it's not impossible to have this work without HTTP (talking directly to a collection) and then fall back to HTTP, if we wanted to host a IIIF-over-GraphQL server ourselves.

Ideally, this library could be a foundation for a unified API that can query across multiple resources (hosted in different databases or services) and even across institutions. If this was hosted on a server, you could in theory make a single API request and receive enough data to display the IIIF resource, an external annotation list and then inside the annotations any IIIF content through linking annotations (and maybe grab the tile source). Effectively allowing something like the Galway viewer to be bootstrapped with the data it needs in a single HTTP call (aside from the tile images themselves).

@tomcrane

Effectively allowing something like the Galway viewer to be bootstrapped with the data it needs in a single HTTP call (aside from the tile images themselves).

This is very nice. But then it's not straight IIIF and becomes a GraphQL Service client; it's loading data by querying a service with a model it wants returned from across the graph, rather than loading a resource and following the links in that resource to more resources. Which is hugely powerful, but I would be careful about positioning - a really powerful library that might be assembling disparate IIIF data into resources under the hood (e.g., in Presley or other bits of the DLCS), vs an alternative public access model for that data, as a service rather than as a set of linked resources. Is a viewer like the Galway viewer positioning itself as a IIIF client, or a query service client? I wouldn't want to move to a world where IIIF publishers felt they had to offer services as well as resources (static, cheap) or sign up to having their resources indexed into a big IIIF Query Server.

https://iiif.io/api/annex/notes/design_patterns/#use-resource-oriented-design

This principle is important for simple publishing and simple consuming. For Presley, for DLCS, the query approach is valuable, and for certain specific types of client applications it's extremely powerful.

@stephenwf
Owner Author

stephenwf commented Oct 29, 2018

sign up to having their resources indexed into a big IIIF Query Server.

This is something that this implementation won't do: it won't provide a search over an entire collection. The only thing it's really doing is following the graph of the flat resources, which a viewer would already be doing, like dereferencing a tile source to get tile information, or annotation lists to display them, potentially across multiple services. It can, however, read an existing search service and provide an interface to search that. That is no more stress than an individual viewer making the requests.

@stephenwf
Owner Author

Here's a link to the GraphQL types that match up to Presentation 3 (but where Presentation 2 resources can be shoe-horned): https://github.com/stephenwf/iiif-redux/blob/e1d23c16f813762a8b5bba243f269e1671bcbdd7/packages/iiif-graphql/src/schema.graphql

@stephenwf
Owner Author

Extracting the selector improvements and P2 -> P3 parity and then putting this on the back-burner. Interesting proof of concept, but way too heavy to run on the client side efficiently.
