Idoia edited this page Apr 28, 2015 · 37 revisions

Linked Data Server

The Linked Data Server module makes the URIs of Aliada's dataset dereferenceable. This means that when the URI of a resource is requested from an HTML or RDF browser, the server returns the HTML or RDF description of the resource instead of a 404 Not Found error. The following request formats are handled:

  • RDF XML: application/rdf+xml, ".rdf"
  • JSON: application/rdf+json, ".json"
  • JSON-LD: application/ld+json
  • N3/Turtle: text/rdf+n3, ".ttl"
  • HTML: text/html, ".html"
  • TEXT: text/plain (N-Triples)
  • OPAC: ".opac"
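The list above can be read as a lookup from a URI extension or an Accept header to a representation. The following is a minimal sketch of that lookup, not the module's implementation (which relies on URL rewrite rules in the RDF store). The function name `pick_format`, the precedence of the extension over the Accept header, and the fact that q-values are ignored are all assumptions made for illustration.

```python
import posixpath

# Media types and extensions copied from the list of handled formats.
MEDIA_TYPES = {
    "application/rdf+xml": "RDF XML",
    "application/rdf+json": "JSON",
    "application/ld+json": "JSON-LD",
    "text/rdf+n3": "N3/Turtle",
    "text/html": "HTML",
    "text/plain": "TEXT",
}
EXTENSIONS = {
    ".rdf": "RDF XML",
    ".json": "JSON",
    ".ttl": "N3/Turtle",
    ".html": "HTML",
    ".opac": "OPAC",
}


def pick_format(path, accept=""):
    """Return the format name for a request, or None if nothing matches."""
    # Assumption: a known extension wins over the Accept header.
    ext = posixpath.splitext(path)[1]
    if ext in EXTENSIONS:
        return EXTENSIONS[ext]
    # Walk the Accept header in the order given, ignoring q-values.
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in MEDIA_TYPES:
            return MEDIA_TYPES[media_type]
    return None
```

For instance, `pick_format("/doc/a/b.ttl")` selects N3/Turtle from the extension alone, while a path without a known extension falls back to the Accept header.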

This module executes several ISQL commands in the RDF store to create URL rewrite rules, so that when a URI of the dataset is requested, the server translates it into the corresponding SPARQL query or redirection.

It also creates the default web page for the generated dataset, where the following information is presented:

  • A description of the dataset.
  • The URL of the data source from where the dataset has been generated.
  • The dataset generation date.
  • The dataset license.
  • The SPARQL endpoint of the dataset.
  • The vocabulary used (e.g. the ALIADA ontology).
  • The number of triples in the dataset.
  • The list of resources of the dataset.
  • The list of subsets included in the dataset and the list of resources of each of them.

To help read the URI dereferencing table shown below, the terms used in it are explained first. The URIs are composed of the following parameters:

These parameters are set in the dataset and subset DB tables.

The following table shows the URI dereferencing carried out by the Linked Data Server module:

URI dereferencing table

REST Interface

The Linked Data Server module provides a RESTful interface. It offers the following services:

  • Create a new Linked Data Server job. The identifier of the job to be started must be provided and must be a valid integer.

    • method: POST
    • URL: http://<host>:<port>/lds/job
    • parameters sent as form data (application/x-www-form-urlencoded):
      • jobid=<job identifier>
  • Get the state/info of a Linked Data Server job. The identifier of the job must be provided and must be a valid integer.

    • method: GET
    • URL: http://<host>:<port>/lds/job/<job identifier>
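As an illustration of the two services listed above, the sketch below builds the corresponding HTTP requests with Python's standard library. The function names are hypothetical, and the `Accept: application/json` header is an assumption about how a client might ask for the JSON structure; the wiki does not state how the response format is selected.

```python
from urllib.parse import urlencode
from urllib.request import Request


def create_job_request(host, port, job_id):
    """Build the POST request that starts a Linked Data Server job."""
    # int() rejects identifiers that are not valid integers.
    body = urlencode({"jobid": int(job_id)}).encode("ascii")
    return Request(
        f"http://{host}:{port}/lds/job",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def job_state_request(host, port, job_id):
    """Build the GET request that reads the state/info of a job."""
    return Request(
        f"http://{host}:{port}/lds/job/{int(job_id)}",
        method="GET",
        # Assumption: the Accept header selects JSON over XML.
        headers={"Accept": "application/json"},
    )
```

Either request can then be sent with `urllib.request.urlopen(request)` against a running instance of the module.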

Once the Linked Data Server module receives any of these service invocations, it reads the input parameters of the job from table aliada.linkeddataserver_job_instances of a relational DB. The parameters to connect to this DB are obtained from the "context.xml" file of the Linked Data Server module. The services will return an XML or JSON structure with the following information:

  • id: the job identifier.
  • startDate: the starting date of the job.
  • endDate: the end date of the job.
  • status: the status of the job. Possible values:
    • idle: the job hasn't started yet. That is, the DB table row exists, but the job creation REST service hasn't been invoked yet.
    • running: the job is still running.
    • finished: the job has finished.

Here is an example in JSON format:

    {
        "endDate": "2014-07-09T10:33:43",
        "id": 1,
        "startDate": "2014-07-10T10:33:08",
        "status": "finished"
    }
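A client might turn such a JSON structure into typed values along these lines. This is a sketch only: the function name `parse_job_info` is hypothetical, and treating `startDate` and `endDate` as optional (for example for an idle job) is an assumption, since the wiki does not say which fields are present in each state.

```python
import json
from datetime import datetime

KNOWN_STATUSES = ("idle", "running", "finished")


def parse_job_info(payload):
    """Parse the JSON job structure into a dict of Python values."""
    info = json.loads(payload)
    status = info["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected job status: {status!r}")

    def to_datetime(value):
        # Assumption: a missing or null date means "not set yet".
        return datetime.fromisoformat(value) if value else None

    return {
        "id": int(info["id"]),
        "status": status,
        "startDate": to_datetime(info.get("startDate")),
        "endDate": to_datetime(info.get("endDate")),
    }
```

A polling client could call the GET service repeatedly and stop once the parsed `status` equals `"finished"`.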

Relational DB tables used

The Linked Data Server module uses the following tables:

  • Table aliada.linkeddataserver_job_instances. This table stores the configuration parameters and the state of each job instance. The configuration parameters are set by the module that creates the job instance in the DB, that is, the User Interface module. The state-related fields are set by the job itself.
  • Table aliada.dataset. This table contains information about the dataset created by the ALIADA tool when converting the input records into RDF.
  • Table aliada.subset. This table contains information about the subsets of a dataset.

Table aliada.linkeddataserver_job_instances

This table contains the following fields, grouped into configuration fields and state-related fields:

  • job_id
  • Configuration fields:
    • store_ip: IP address of the machine where the RDF store resides.
    • store_sql_port: port of the RDF store for SQL access.
    • sql_login: the login of the SQL access.
    • sql_password: the password of the SQL access.
    • isql_command_path: full path to the ISQL command.
    • isql_commands_file_dataset_default: full path of the default ISQL commands file to execute for the dataset. It is used when the dataset.isql_commands_file_dataset field is null or the file it names does not exist.
    • isql_commands_file_subset_default: full path of the default ISQL commands file to execute for the subset. It is used when the subset.isql_commands_file_subset field is null or the file it names does not exist.
    • virtuoso_http_server_root: full path of Virtuoso HTTP server root folder, where the web page for the dataset will be generated.
    • aliada_ontology: ALIADA ontology URI, for dereferencing the vocabulary used for the generated dataset.
    • datasetId: dataset identifier to get the dataset information from dataset table.
    • organisationId: organization identifier to get the organization information from organisation table.
    • tmp_dir: the name of the temporary folder used to store the organisation logo image temporarily. Afterwards, the image is copied to the web page folder of the dataset.
  • State fields:
    • start_date
    • end_date
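The fallback between a dataset-specific (or subset-specific) ISQL commands file and the corresponding default file can be sketched as follows. The function name `choose_commands_file` is hypothetical, and treating an empty string like a null value is an assumption.

```python
import os


def choose_commands_file(specific, default):
    """Return `specific` if it is set and exists on disk, else `default`."""
    # `specific` plays the role of dataset.isql_commands_file_dataset or
    # subset.isql_commands_file_subset; `default` plays the role of the
    # matching *_default field of linkeddataserver_job_instances.
    if specific and os.path.isfile(specific):
        return specific
    return default
```

The same helper covers both the dataset and the subset case, since the rule is identical for the two fields.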

Table aliada.dataset

This table contains the following fields:

  • datasetId: dataset identifier.
  • organisationId: organization identifier.
  • dataset_desc: dataset description.
  • domain_name: dataset domain name, e.g.: data.artium.org
  • uri_id_part: used to generate Identifier URIs, e.g.: "id", URI: http://data.szepmuveszeti.hu/id/museumcollection/E18_Physical_Thing/szepmuveszeti.hu_object_29
  • uri_doc_part: used to generate Document URIs, e.g.: "doc", URI: http://data.szepmuveszeti.hu/doc/museumcollection/E18_Physical_Thing/szepmuveszeti.hu_object_29
  • uri_def_part: used to generate the Ontology URIs, e.g.: "def", URI: http://data.szepmuveszeti.hu/def/museumcollection
  • uri_concept_part: used in all URI types as a prefix to give a description of the dataset in the URI, e.g.: "data", URI: http://data.szepmuveszeti.hu/id/data/museumcollection/E18_Physical_Thing/szepmuveszeti.hu_object_29
  • uri_set_part: used to generate the subset URIs, e.g.: "set", URI: http://data.artium.org/set/library/bib
  • listening_host: The address of the network interface the Virtuoso HTTP server uses to listen and accept connections.
  • virtual_host: The virtual host name that clients use to address the Virtuoso HTTP server, as sent in the Host header of the request.
  • sparql_endpoint_uri: SPARQL endpoint URI.
  • sparql_endpoint_login: SPARQL endpoint user name.
  • sparql_endpoint_password: SPARQL endpoint password.
  • public_sparql_endpoint_uri: public SPARQL endpoint URI.
  • dataset_author: dataset author name. E.g.: Aliada Consortium.
  • ckan_dataset_name: dataset name in CKAN datahub.
  • dataset_long_desc: dataset long description for CKAN datahub.
  • dataset_source_url: URL of the data source from where the dataset has been generated.
  • license_ckan_id: CKAN license identifier of the dataset to be published in CKAN datahub. E.g.: cc-zero.
  • license_url: license URL of the dataset to be published in CKAN datahub. E.g.: http://creativecommons.org/publicdomain/zero/1.0/
  • isql_commands_file_dataset: full path of the ISQL commands file to execute for the dataset. If it is null or the file does not exist, the linkeddataserver_job_instances.isql_commands_file_dataset_default field is used.
  • dataset_web_page_root: full path of the dataset web page folder.

Table aliada.subset

This table contains the following fields:

  • datasetId: dataset identifier.
  • subsetId: subset identifier.
  • subset_desc: subset description.
  • uri_concept_part: used in all URI types as a prefix to give a description of the subset in the URI, e.g.: "museumcollection", URI: http://data.szepmuveszeti.hu/id/data/museumcollection/E18_Physical_Thing/szepmuveszeti.hu_object_29
  • graph_uri: URI of the graph in Virtuoso where the generated RDF triples are saved.
  • links_graph_uri: URI of the graph in Virtuoso where the discovered links are saved.
  • isql_commands_file_subset: full path of the ISQL commands file to execute for the subset. If it is null or the file does not exist, the linkeddataserver_job_instances.isql_commands_file_subset_default field is used.
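The example URIs given for the dataset and subset fields suggest how the URI parts fit together. The sketch below reproduces those examples; the segment order (type part, optional dataset concept part, subset concept part, then the remaining segments) is inferred from the examples only and is an assumption, as is the function name `build_uri`.

```python
def build_uri(domain_name, type_part, *segments, concept_part=""):
    """Join the URI parts into an http URI, skipping empty parts.

    type_part is one of uri_id_part, uri_doc_part, uri_def_part or
    uri_set_part; concept_part is the dataset's uri_concept_part.
    """
    parts = [type_part, concept_part, *segments]
    path = "/".join(p.strip("/") for p in parts if p)
    return f"http://{domain_name}/{path}"


# Identifier URI with the dataset concept part "data" and the subset
# concept part "museumcollection", as in the field descriptions above.
example = build_uri(
    "data.szepmuveszeti.hu",
    "id",
    "museumcollection",
    "E18_Physical_Thing",
    "szepmuveszeti.hu_object_29",
    concept_part="data",
)
```

With these arguments, `example` is the Identifier URI shown in the uri_concept_part description.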

Configuration file

The Linked Data Server module uses ISQL commands files that contain the ISQL commands to create the URL rewrite rules for the datasets and the subsets. There are two default ISQL command files:

  • The default ISQL command file for the datasets, which performs the following actions:
    • Creates the folder for the default web page of the dataset.
    • Identifier URIs are 303 redirected to their corresponding Document URIs. Supported formats: ".ttl", ".rdf", ".json", ".html", application/ld+json, text/plain.
    • Document URIs. Supported formats: ".ttl", ".rdf", ".json", ".html", ".opac", text/rdf+n3, application/rdf+xml, application/rdf+json, application/ld+json, text/plain.
    • List URIs of a dataset. Supported formats: ".ttl", ".rdf", ".json", text/rdf+n3, application/rdf+xml, application/rdf+json, application/ld+json, text/plain.
    • Ontology URIs. Supported formats: ".ttl", ".rdf", ".json", text/rdf+n3, application/rdf+xml, application/rdf+json, application/ld+json, text/plain.
  • The default ISQL command file for the subsets, which creates the following rewrite rules:
    • Document URIs. Supported formats: ".ttl", ".rdf", ".json", ".html", ".opac", text/rdf+n3, application/rdf+xml, application/rdf+json, application/ld+json, text/plain.
    • List URIs of a subset. Supported formats: ".ttl", ".rdf", ".json", text/rdf+n3, application/rdf+xml, application/rdf+json, application/ld+json, text/plain.
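The 303 redirection from an Identifier URI to its Document URI can be pictured as swapping the first path segment. The real rules are written as ISQL commands in the files described here; the Python function below is only an illustrative model of their effect, with a hypothetical name and with the default values "id" and "doc" taken from the field examples.

```python
from urllib.parse import urlsplit, urlunsplit

SEE_OTHER = 303  # HTTP status used for the redirection


def redirect_identifier(uri, uri_id_part="id", uri_doc_part="doc"):
    """Return (303, Document URI) for an Identifier URI, else None."""
    parts = urlsplit(uri)
    # The path starts with "/", so segments[0] is the empty string.
    segments = parts.path.split("/")
    if len(segments) < 2 or segments[1] != uri_id_part:
        return None
    segments[1] = uri_doc_part
    target = urlunsplit((parts.scheme, parts.netloc, "/".join(segments), "", ""))
    return SEE_OTHER, target
```

A URI whose first path segment is not the identifier part, such as a Document URI, is left alone and the function returns `None`.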

In the folder "aliada-tool/aliada/aliada-linked-data-server/src/main/resources/" of Aliada's code, the following ISQL command files can be found:

  • isql_rewrite_rules_dataset_default.sql : This file contains the default URL rewrite rules for the datasets.
  • isql_rewrite_rules_subset_default.sql : This file contains the default URL rewrite rules for the subsets.
  • isql_rewrite_rules_dataset_mfab.sql : This file contains the URL rewrite rules for the MFAB dataset, considering that the OPAC representation is provided by http://www.szepmuveszeti.hu/adatlap/.
  • isql_rewrite_rules_subset_mfab.sql : This file contains the URL rewrite rules for the MFAB subsets, considering that the OPAC representation is provided by http://www.szepmuveszeti.hu/adatlap/.
  • isql_rewrite_rules_dataset_artium.sql : This file contains the URL rewrite rules for the ARTIUM dataset, considering that the OPAC representation is provided by http://biblioteca.artium.org/Record/.
  • isql_rewrite_rules_subset_artium.sql : This file contains the URL rewrite rules for the ARTIUM subsets, considering that the OPAC representation is provided by http://biblioteca.artium.org/Record/.