jindrichmynarz/MARC_A_to_RDF

MARC for Authority Records to RDF

An XSL transformation that converts data represented in MARC 21 for Authority Records and serialized as MARC XML to RDF. The resulting RDF primarily uses the Simple Knowledge Organization System (SKOS) and MADS/RDF. The XSLT is accompanied by scripted tasks for driving the transformation, loading the data into an RDF store, and executing SPARQL queries that enrich the data.

The transformation was developed for an LOD2 Publink project with the National Library of Israel.

Steps

The steps of the transformation are implemented as Rake tasks. Use rake -T to list all available tasks. Before running any of the tasks, edit the configuration file etc/config.xml.

  1. rake xslt[path/to/marc-21-a.xml] to execute the XSL transformation from MARC XML to RDF/XML (file tmp/output.rdf).
  2. rake fuseki:load to load the created RDF into Jena TDB.
  3. rake fuseki:start to start a SPARQL endpoint for the loaded data.
  4. rake sparql:enrich to issue several SPARQL Update requests that will enrich the processed data.
  5. rake sparql:metadata to compute dataset statistics and generate the corresponding metadata in a separate named graph.
  6. rake fuseki:stop to stop the SPARQL endpoint.
  7. rake fuseki:dump to export the transformed dataset into N-Quads files located in the tmp directory.
  8. rake fuseki:purge to clear all Jena TDB files.
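
The steps above can be chained in a small driver script. The following is a minimal sketch, not part of this repository: the helper names pipeline_commands and run_pipeline are hypothetical, and it assumes rake is on the PATH and the script is run from the repository root. It leaves out fuseki:purge on purpose, because that task clears the Jena TDB files.

```ruby
require "shellwords"

# Rake tasks from the steps above that follow the XSL transformation,
# in the order they are listed. fuseki:purge is left out deliberately.
TASKS_AFTER_XSLT = %w[
  fuseki:load
  fuseki:start
  sparql:enrich
  sparql:metadata
  fuseki:stop
  fuseki:dump
].freeze

# Build the argument vectors for one full run.
# marc_xml is the path to the MARC XML input file.
def pipeline_commands(marc_xml)
  [["rake", "xslt[#{marc_xml}]"]] + TASKS_AFTER_XSLT.map { |task| ["rake", task] }
end

# Run each command in turn and stop at the first one that fails.
def run_pipeline(marc_xml)
  pipeline_commands(marc_xml).each do |argv|
    puts "==> #{argv.shelljoin}"
    # Passing an argument vector bypasses the shell, so the square
    # brackets in xslt[...] are not treated as a glob pattern.
    system(*argv) or abort("Step failed: #{argv.shelljoin}")
  end
end

# Only run when invoked directly with an input path.
run_pipeline(ARGV[0]) if __FILE__ == $PROGRAM_NAME && ARGV[0]
```

Saved under a name of your choosing (for example run_all.rb, a hypothetical file name), it would be invoked as ruby run_all.rb path/to/marc-21-a.xml. Stopping at the first failure keeps a failed load or enrichment from being followed by a dump of incomplete data.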

Dependencies

  • Fuseki
  • Jena: uses Jena TDB as database
  • Rake: works with Ruby version 1.8.7 or newer
  • Saxon: version 9.x, can be replaced by any XSLT 2.0 processor

Known caveats

If you get timeout errors (Timeout::Error) for some of the enrichment or metadata generation SPARQL queries, try increasing the timeout limit (the ja:cxtValue property for ja:cxtName "arq:queryTimeout") in the Fuseki server configuration in etc/fuseki.ttl, and then run the failed Rake task again.
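
For orientation, a query timeout setting in a Fuseki server configuration typically looks like the fragment below. This is a sketch based on the general Fuseki configuration format, not a copy of this repository's etc/fuseki.ttl, so the surrounding statements in the actual file may differ. The value 60000 is only an illustrative assumption; Fuseki reads it as milliseconds.

```turtle
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .

# Server-wide query timeout in milliseconds (60000 is an example value).
[] a fuseki:Server ;
   ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "60000" ] .
```

Restart the SPARQL endpoint (rake fuseki:stop, then rake fuseki:start) so that the new value is picked up before you rerun the task.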
