Open and FAIR implementation of the PREDICT method described in: Gottlieb A, Stein GY, Ruppin E, Sharan R. "PREDICT: a method for inferring novel drug indications with application to personalized medicine." Mol Syst Biol. 2011;7:496. Published 2011 Jun 7. doi:10.1038/msb.2011.26
This step applies to the Bio2RDF datasets: the format of each one has to be fixed before it is uploaded to the triple store. Example:
python src/preprocess_bio2rdf.py -i drugbank.nq.gz -o refined_drugbank.nq.gz
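What exactly the script fixes is not stated here, so the following is only a minimal sketch of the general shape such a pass could take: stream a gzipped N-Quads file line by line and keep the lines that pass a check. The function names and the check itself (subject looks like an IRI or blank node, line ends with a period) are hypothetical and are not taken from src/preprocess_bio2rdf.py.

```python
import gzip


def looks_like_statement(line):
    """Crude, hypothetical check: subject is an IRI or blank node and the line ends with '.'."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return False
    return stripped.startswith(("<", "_:")) and stripped.endswith(".")


def refine_nquads(src_path, dst_path):
    """Copy src to dst, keeping only lines that pass the check. Returns (kept, dropped)."""
    kept = dropped = 0
    with gzip.open(src_path, "rt", encoding="utf-8", errors="replace") as src, \
            gzip.open(dst_path, "wt", encoding="utf-8") as dst:
        for line in src:
            if looks_like_statement(line):
                # Make sure every kept statement ends with a newline.
                dst.write(line if line.endswith("\n") else line + "\n")
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Streaming keeps memory use flat for large dumps; a real cleanup would likely need a proper N-Quads parser rather than this string check.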
Upload each RDF dataset to the triple store (GraphDB or Virtuoso).
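For GraphDB, one way to script the upload is the RDF4J-style REST interface, which accepts a POST of RDF data to a repository's statements resource. The sketch below assumes that interface is enabled; the base URL, repository name and function names are placeholders, and Virtuoso needs a different loading mechanism that is not covered here.

```python
import gzip
import urllib.request


def build_upload_request(base_url, repository, nquads_bytes):
    """Build a POST request that adds N-Quads data to an RDF4J-style repository."""
    url = f"{base_url.rstrip('/')}/repositories/{repository}/statements"
    return urllib.request.Request(
        url,
        data=nquads_bytes,
        method="POST",
        headers={"Content-Type": "application/n-quads"},
    )


def upload_file(base_url, repository, path):
    """Read a (possibly gzipped) N-Quads file into memory and send it. Returns the HTTP status."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        payload = handle.read()
    request = build_upload_request(base_url, repository, payload)
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call might look like `upload_file("http://localhost:7200", "openpredict", "refined_drugbank.nq.gz")`, where the host and port are assumptions about a local installation. Because the whole file is read into memory, very large dumps are better loaded with the triple store's own bulk import tools.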
see Dockerfile
CQ1.1: Which steps are meant to be executed manually and which to be executed computationally?
PREFIX bpmn: <http://dkm.fbk.eu/index.php/BPMN2_Ontology#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX opredict: <http://purl.org/plex/Instances/OpenPREDICT#>
SELECT ?step ?stepType ?instructions ?description
WHERE
{
values ?stepType { bpmn:ManualTask bpmn:ScriptTask }
?instructions rdf:type p-plan:Plan.
?step rdf:type ?stepType.
?step dul:isDescribedBy ?instructions.
?instructions dc:description ?description.
?step p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v01.
}
To run : Yasgui Link
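Besides Yasgui, a competency query like the one above can be sent to a SPARQL endpoint from a script. The sketch below follows the SPARQL 1.1 Protocol (form-encoded POST, JSON results); the function names are hypothetical, and the endpoint URL is the OpenPREDICT GraphDB endpoint mentioned in this README, which may or may not be reachable from your network.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://graphdb.dumontierlab.com/repositories/openpredict"


def build_query_request(endpoint, sparql):
    """Build a SPARQL 1.1 Protocol POST request that asks for JSON results."""
    body = urllib.parse.urlencode({"query": sparql}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def run_query(endpoint, sparql):
    """Send the query and return the parsed JSON result document."""
    with urllib.request.urlopen(build_query_request(endpoint, sparql)) as response:
        return json.load(response)
```

Paste the full query text, including its PREFIX lines, into a string and pass it as `sparql`, for example `run_query(ENDPOINT, cq_1_1)`.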
CQ1.2: For the manual parts, who are the developers and who are the agents responsible to execute each step?
PREFIX bpmn: <http://dkm.fbk.eu/index.php/BPMN2_Ontology#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX opredict: <http://purl.org/plex/Instances/OpenPREDICT#>
SELECT ?step ?role ?agent ?creator ?publisher ?instructions ?description
WHERE
{
values ?stepType { bpmn:ManualTask }
?instructions rdf:type p-plan:Plan.
?step rdf:type ?stepType.
?step dul:isDescribedBy ?instructions.
?instructions dc:description ?description.
?association prov:hadPlan ?instructions.
?association prov:agent ?agent.
?association prov:hadRole ?role.
# Creator and publisher are assumed to be attached to the instructions plan.
OPTIONAL {?instructions dc:creator ?creator}
OPTIONAL {?instructions dc:publisher ?publisher}
?step p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v01.
}
To run : Yasgui Link
CQ1.3: Which datasets were used as input for the computational steps and their respective formats?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX edam: <http://edamontology.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX opredict: <http://purl.org/plex/Instances/OpenPREDICT#>
SELECT ?step ?instructions ?usage ?usageEntity ?downloadURL ?dataFormat ?dataFormatLabel
WHERE
{
?usageEntity rdf:type dcat:Distribution.
?usage prov:entity ?usageEntity.
?plan prov:qualifiedUsage ?usage.
?step dul:isDescribedBy ?plan.
?step rdf:type edam:operation_2409.
?usageEntity dcat:mediaType ?dataFormat.
OPTIONAL { ?dataFormat rdfs:label ?dataFormatLabel.}
OPTIONAL { ?usageEntity dcat:downloadURL ?downloadURL.}
OPTIONAL { ?plan dc:description ?instructions.}
OPTIONAL { ?step p-plan:hasInputVar ?varInput.}
OPTIONAL { ?step p-plan:hasOutputVar ?varOutput.}
?step p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v01.
}
To run : Yasgui Link
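Because this query uses OPTIONAL, some result rows will lack a value for variables such as ?downloadURL or ?dataFormatLabel. In the SPARQL 1.1 JSON results format an unbound variable is simply absent from the binding, so code that reads the results has to allow for that. A minimal sketch, with a hypothetical helper name and made-up example values:

```python
def bindings_to_rows(results, fill=None):
    """Flatten a SPARQL JSON result document into a list of dicts.

    Every row has a key for each variable in the result head; variables
    left unbound by OPTIONAL clauses are set to `fill`.
    """
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append({
            var: binding[var]["value"] if var in binding else fill
            for var in variables
        })
    return rows


# Made-up result document, for illustration only.
example = {
    "head": {"vars": ["step", "downloadURL"]},
    "results": {"bindings": [
        {"step": {"type": "uri", "value": "http://example.org/step1"},
         "downloadURL": {"type": "uri", "value": "http://example.org/data.nt"}},
        {"step": {"type": "uri", "value": "http://example.org/step2"}},
    ]},
}
rows = bindings_to_rows(example)
```

Here the second row keeps its ?step value and gets `None` for the missing ?downloadURL.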
CQ1.4: What are the inputs and outputs of manual steps?
PREFIX bpmn: <http://dkm.fbk.eu/index.php/BPMN2_Ontology#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX opredict: <http://purl.org/plex/Instances/OpenPREDICT#>
SELECT ?step ?varInput ?varOutput ?instructions ?description
WHERE
{
values ?stepType { bpmn:ManualTask }
?instructions rdf:type p-plan:Plan.
?step rdf:type ?stepType.
?step dul:isDescribedBy ?instructions.
?instructions dc:description ?description.
OPTIONAL { ?step p-plan:hasInputVar ?varInput.}
OPTIONAL { ?step p-plan:hasOutputVar ?varOutput.}
?step p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v01.
}
ORDER BY DESC(?varInput)
To run : Yasgui Link
CQ2.1: What are the main steps of the OpenPREDICT protocol?
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX opredict: <http://purl.org/plex/Instances/OpenPREDICT#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX pwo: <http://purl.org/spar/pwo#>
SELECT ?stepA ?stepB
WHERE
{
?stepA p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v01.
?stepA dul:precedes ?stepB.
OPTIONAL {
opredict:Plan_Main_Protocol_v01 pwo:hasFirstStep ?stepTopLevel.
?stepTopLevel dul:precedes ?stepB.
}
}
ORDER BY DESC(?stepTopLevel)
To run : Yasgui Link
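The query returns pairs (?stepA, ?stepB) connected by dul:precedes rather than one ordered list. If a single execution order is wanted, the pairs can be sorted topologically on the client side. A minimal sketch using Python's standard library (Python 3.9 or later); the function name and the step identifiers in the usage note are made up:

```python
from graphlib import TopologicalSorter


def order_steps(precedes_pairs):
    """Return the steps in an order consistent with every (stepA, stepB) pair.

    Each pair means "stepA precedes stepB", so stepB is registered as
    depending on stepA. graphlib raises CycleError if the pairs are cyclic.
    """
    sorter = TopologicalSorter()
    for step_a, step_b in precedes_pairs:
        sorter.add(step_b, step_a)
    return list(sorter.static_order())
```

For a simple chain, `order_steps([("prepare", "train"), ("load", "prepare")])` lists `load` before `prepare` and `prepare` before `train`. When the protocol branches, several valid orders exist and this returns just one of them.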
CQ2.2: What are the steps of a plan and how is each step's instruction described?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX dc: <http://purl.org/dc/terms/>
SELECT ?language ?instructions ?description ?step
WHERE
{
?instructions rdf:type p-plan:Plan.
?step dul:isDescribedBy ?instructions.
?instructions dc:description ?description.
?instructions dc:language ?language.
}
To run : Yasgui Link
CQ2.3: What instructions specify the code used in OpenPREDICT steps?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX bpmn: <http://dkm.fbk.eu/index.php/BPMN2_Ontology#>
PREFIX dc: <http://purl.org/dc/terms/>
SELECT ?specInstruction ?specification ?instructions ?description ?step ?language
WHERE
{
?instructions rdf:type p-plan:Plan.
?step dul:isDescribedBy ?instructions.
?step rdf:type bpmn:ScriptTask.
?instructions dc:description ?description.
?instructions dc:language ?language.
OPTIONAL
{
?instructions dul:isDescribedBy ?specInstruction.
?specInstruction dc:description ?specification.
}
}
ORDER BY ?step
To run : Yasgui Link
CQ3.1: What are the existing versions of a workflow and what is their provenance?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX dc: <http://purl.org/dc/terms/>
SELECT ?workflow ?wflVersion ?creator ?createDate
WHERE
{
?workflow rdf:type dul:Workflow.
?workflow dc:hasVersion ?wflVersion.
?workflow dc:creator ?creator.
?workflow dc:created ?createDate.
}
To run : Yasgui Link
CQ3.2: Which instructions were removed/changed/added from one version to another?
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX opredict: <http://purl.org/plex/Instances/OpenPREDICT#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX bpmn: <http://dkm.fbk.eu/index.php/BPMN2_Ontology#>
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT *
WHERE
{
?step p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v01.
?step dul:isDescribedBy ?instruction.
?step rdf:type ?stepType.
values ?stepType { bpmn:ManualTask bpmn:ScriptTask }
FILTER NOT EXISTS
{
?step p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v02.
}
FILTER NOT EXISTS
{
?instructionNextVersion prov:wasRevisionOf ?instruction.
?stepNextVersion dul:isDescribedBy ?instructionNextVersion.
?stepNextVersion p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v02.
}
}
To run : Yasgui Link
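The two FILTER NOT EXISTS blocks amount to a set difference: a v01 step is reported only if it is not itself part of v02 and its instruction has no revision describing a v02 step. The same logic can be sketched on toy data as follows; the function name and the sample identifiers are made up, and the step-type restriction of the query is left out for brevity.

```python
def removed_instructions(v1_steps, v2_steps, revision_of):
    """Return (step, instruction) pairs of v1 that were neither kept nor revised in v2.

    v1_steps, v2_steps: dict mapping step -> instruction for each plan version.
    revision_of: dict mapping a newer instruction -> the instruction it revises
                 (the direction of prov:wasRevisionOf).
    """
    # Prior instructions that have a revision describing some v2 step.
    revised_in_v2 = {
        revision_of[instr] for instr in v2_steps.values() if instr in revision_of
    }
    return sorted(
        (step, instr)
        for step, instr in v1_steps.items()
        if step not in v2_steps and instr not in revised_in_v2
    )


# Toy data: "a" is kept, "b" is revised into "b2", "c" disappears.
v1 = {"a": "instr_a", "b": "instr_b", "c": "instr_c"}
v2 = {"a": "instr_a", "b2": "instr_b2"}
revisions = {"instr_b2": "instr_b"}
removed = removed_instructions(v1, v2, revisions)
```

With this toy data only step `c` and its instruction remain in `removed`.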
CQ3.3: Which steps were automated from one version to another?
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX bpmn: <http://dkm.fbk.eu/index.php/BPMN2_Ontology#>
SELECT ?stepPriorVersion ?planPriorVersion ?stepNewVersion ?planNewVersion
WHERE
{
?planNewVersion prov:wasRevisionOf ?planPriorVersion.
?planNewVersion dc:description ?planNewVersionDesc.
?planPriorVersion dc:description ?planPriorVersionDesc.
?stepNewVersion dul:isDescribedBy ?planNewVersion.
?stepNewVersion rdf:type ?stepNewVersionType.
?stepPriorVersion dul:isDescribedBy ?planPriorVersion.
?stepPriorVersion rdf:type ?stepPriorVersionType.
values ?stepPriorVersionType { bpmn:ManualTask}.
values ?stepNewVersionType { bpmn:ScriptTask}
}
To run : Yasgui Link
CQ3.4: Which datasets were removed/changed/added for the different versions?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX edam: <http://edamontology.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX opredict: <http://purl.org/plex/Instances/OpenPREDICT#>
SELECT ?step ?instructions ?usage ?usageEntity ?downloadURL ?dataFormat ?dataFormatLabel
WHERE
{
?usageEntity rdf:type dcat:Distribution.
?usage prov:entity ?usageEntity.
?plan prov:qualifiedUsage ?usage.
?step dul:isDescribedBy ?plan.
?step rdf:type edam:operation_2409.
?usageEntity dcat:mediaType ?dataFormat.
OPTIONAL { ?dataFormat rdfs:label ?dataFormatLabel.}
OPTIONAL { ?usageEntity dcat:downloadURL ?downloadURL.}
OPTIONAL { ?plan dc:description ?instructions.}
OPTIONAL { ?step p-plan:hasInputVar ?varInput.}
OPTIONAL { ?step p-plan:hasOutputVar ?varOutput.}
OPTIONAL {?step p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v01}
OPTIONAL {?step p-plan:isStepOfPlan opredict:Plan_Main_Protocol_v02}
}
To run : Yasgui Link
CQ3.5: Which workflow version was used in each execution and what was generated?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX p-plan: <http://purl.org/net/p-plan#>
PREFIX dul: <http://www.ontologydesignpatterns.org/ont/dul/DUL.owl#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT ?plan ?version ?execution ?stepExecuted ?wflExecArtifact
WHERE
{
?execution rdf:type p-plan:Activity.
?execution p-plan:correspondsToStep ?stepExecuted.
?stepExecuted p-plan:isStepOfPlan ?plan.
?plan rdf:type dul:Workflow.
?plan dc:hasVersion ?version.
?execution prov:generated ?wflExecArtifact.
}
ORDER BY ?version
To run : Yasgui Link
- Use the OpenPREDICT GraphDB SPARQL endpoint (http://graphdb.dumontierlab.com/repositories/openpredict) to query all data
- If you prefer not to use the given SPARQL endpoint, collect all sources from the given links, pre-process the Bio2RDF datasets (see section: Pre-processing Bio2RDF data), then create your own triple store and upload each RDF dataset to it (currently tested with GraphDB and Virtuoso)
- Clone the project
git clone https://github.com/fair-workflows/openpredict.git
- Install Docker to set up the environment
To install Docker: https://docs.docker.com/install/
- Build
From the openpredict directory, edit the workflow/config.yml file and set sparql_ep to the SPARQL endpoint given above or to your own SPARQL endpoint
cd openpredict/
docker build -t openpredict .
- Run Jupyter
docker run -d --rm --name openpredict -p 8888:8888 openpredict
- Execute CWL workflow
docker exec -it openpredict cwltool --outdir=/juypter/run/ workflow/openpredict-ipynb.cwl workflow/config.yml
--outdir: the folder in which the outputs are generated
You should see two Jupyter notebook output files (output_fg.ipynb, output_ml.ipynb), along with the other generated results, in your outdir ('/juypter/run/')
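After a run, a quick check that both notebooks were written can be scripted. This is a minimal sketch with a hypothetical function name; it only looks for the two notebook names listed above and has to run where the output folder is visible (inside the container, or on the host if the folder is mounted as a volume).

```python
from pathlib import Path

# Notebook names taken from the expected outputs listed above.
EXPECTED_NOTEBOOKS = ("output_fg.ipynb", "output_ml.ipynb")


def missing_outputs(outdir, expected=EXPECTED_NOTEBOOKS):
    """Return the expected file names that are not present in outdir."""
    base = Path(outdir)
    return [name for name in expected if not (base / name).is_file()]
```

An empty list from `missing_outputs("/juypter/run/")` means both notebooks are present; any names returned point to a step that did not finish.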