Merge pull request #120 from ntalluri/direction
Adding Directionality
agitter authored Dec 3, 2023
2 parents 1cc6acb + cccf5ce commit acea45e
Showing 97 changed files with 162,957 additions and 162,458 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/test-spras.yml
@@ -152,8 +152,8 @@ jobs:
path: docker-wrappers/Cytoscape/.
dockerfile: docker-wrappers/Cytoscape/Dockerfile
repository: reedcompbio/py4cytoscape
tags: v1
cache_froms: reedcompbio/py4cytoscape:v1
tags: v2
cache_froms: reedcompbio/py4cytoscape:v2
push: false

# Run pre-commit checks on source files
11 changes: 10 additions & 1 deletion CONTRIBUTING.md
@@ -188,6 +188,15 @@ Follow the example for any of the other pathway reconstruction algorithm.
First pull the image `<username>/local-neighborhood` from Docker Hub.
Then build the Docker image using the `Dockerfile` that was completed in Step 2.

Modify generate inputs:
1. Include a key-value pair in the algo_exp_file dictionary that links the specific algorithm to its expected network file.
2. Obtain the expected network file from the workflow, manually confirm it is correct, and save it to `test/generate-inputs/expected`. Name it as `{algorithm_name}-{network_file_name}-expected.txt`.

Modify parse outputs:
1. Obtain the raw-pathway output (e.g. from the run function in your wrapper by running the Snakemake workflow) and save it to `test/parse-outputs/input`. Name it as `{algorithm_name}-raw-pathway.txt`.
2. Obtain the expected universal output from the workflow, manually confirm it is correct, and save it to `test/parse-outputs/expected` directory. Name it as `{algorithm_name}-pathway-expected.txt`.
3. Add the new algorithm's name to the algorithms list in `test/parse-outputs/test_parse_outputs.py`.
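The file naming conventions in the steps above can be sketched as a small helper. This is only an illustration: the function `expected_test_files` and the dictionary keys are hypothetical and are not part of SPRAS; only the directory names and file name patterns come from the instructions.

```python
from pathlib import PurePosixPath
from typing import Dict


def expected_test_files(algorithm_name: str, network_file_name: str) -> Dict[str, PurePosixPath]:
    """Build the test file paths described in the steps above.

    Hypothetical helper for illustration; SPRAS does not provide it.
    """
    generate_dir = PurePosixPath('test/generate-inputs/expected')
    parse_dir = PurePosixPath('test/parse-outputs')
    return {
        # Expected network file produced by generate inputs
        'generate_inputs_expected': generate_dir / f'{algorithm_name}-{network_file_name}-expected.txt',
        # Raw pathway output saved as input for the parse outputs test
        'parse_outputs_input': parse_dir / 'input' / f'{algorithm_name}-raw-pathway.txt',
        # Expected universal output for the parse outputs test
        'parse_outputs_expected': parse_dir / 'expected' / f'{algorithm_name}-pathway-expected.txt',
    }


# The algorithm and network file names here are placeholders
paths = expected_test_files('local-neighborhood', 'network')
for label, path in paths.items():
    print(label, path)
```

Printing the paths before copying files makes it easier to confirm that the names match the patterns the tests look for.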

### Step 6: Work with SPRAS maintainers to revise the pull request
Step 0 previously described how to create a `local-neighborhood` branch and create a pull request.
Make sure to commit all of the new and modified files and push them to the `local-neighborhood` branch on your fork.
@@ -205,7 +214,7 @@ The pull request will be closed so that the `master` branch of the fork stays sy
1. Import the new class in `src/runner.py` so the wrapper functions can be accessed
1. Document the usage of the Docker wrapper and the assumptions made when implementing the wrapper
1. Add example usage for the new algorithm and its parameters to the template config file
1. Write test functions and provide example input data in a new test subdirectory `test/<algorithm>`
1. Write test functions and provide example input data in a new test subdirectory `test/<algorithm>`. Add the example data and the algorithm/expected file names to the lists or dicts in `test/generate-inputs` and `test/parse-outputs`. Use the full path with the names of the test files.
1. Extend `.github/workflows/test-spras.yml` to pull and build the new Docker image

When adding new algorithms, there are many other considerations that are not relevant with the simple Local Neighborhood example.
22 changes: 12 additions & 10 deletions Snakefile
@@ -57,10 +57,11 @@ def write_dataset_log(dataset, logfile):
def make_final_input(wildcards):
final_input = []

#TODO analysis could be parsed in the parse_config() function.
# TODO analysis could be parsed in the parse_config() function.
if config["analysis"]["summary"]["include"]:
# add summary output file for each pathway
final_input.extend(expand('{out_dir}{sep}{dataset}-{algorithm_params}{sep}summary.txt',out_dir=out_dir,sep=SEP,dataset=dataset_labels,algorithm_params=algorithms_with_params))
# TODO: reuse in the future once we make summary work for mixed graphs. See https://github.com/Reed-CompBio/spras/issues/128
# final_input.extend(expand('{out_dir}{sep}{dataset}-{algorithm_params}{sep}summary.txt',out_dir=out_dir,sep=SEP,dataset=dataset_labels,algorithm_params=algorithms_with_params))
# add table summarizing all pathways for each dataset
final_input.extend(expand('{out_dir}{sep}{dataset}-pathway-summary.txt',out_dir=out_dir,sep=SEP,dataset=dataset_labels))

@@ -219,14 +220,15 @@ rule parse_output:
run:
runner.parse_output(wildcards.algorithm, input.raw_file, output.standardized_file)

# TODO: reuse in the future once we make summary work for mixed graphs. See https://github.com/Reed-CompBio/spras/issues/128
# Collect summary statistics for a single pathway
rule summarize_pathway:
input:
standardized_file = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'pathway.txt'])
output:
summary_file = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'summary.txt'])
run:
summary.run(input.standardized_file,output.summary_file,directed=algorithm_directed[wildcards.algorithm])
# rule summarize_pathway:
# input:
# standardized_file = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'pathway.txt'])
# output:
# summary_file = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'summary.txt'])
# run:
# summary.run(input.standardized_file,output.summary_file)

# Write GraphSpace JSON graphs
rule viz_graphspace:
@@ -235,7 +237,7 @@ rule viz_graphspace:
graph_json = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'gs.json']),
style_json = SEP.join([out_dir, '{dataset}-{algorithm}-{params}', 'gsstyle.json'])
run:
graphspace.write_json(input.standardized_file,output.graph_json,output.style_json,directed=algorithm_directed[wildcards.algorithm])
graphspace.write_json(input.standardized_file,output.graph_json,output.style_json)


# Write a Cytoscape session file with all pathways for each dataset
7 changes: 0 additions & 7 deletions config/config.yaml
@@ -29,14 +29,12 @@
- name: "pathlinker"
params:
include: true
directed: true
run1:
k: range(100,201,100)

- name: "omicsintegrator1"
params:
include: true
directed: false
run1:
r: [5]
b: [5, 6]
@@ -47,7 +45,6 @@
- name: "omicsintegrator2"
params:
include: true
directed: false
run1:
b: [4]
g: [0]
@@ -58,7 +55,6 @@
- name: "meo"
params:
include: true
directed: true
run1:
max_path_length: [3]
local_search: ["Yes"]
@@ -67,20 +63,17 @@
- name: "mincostflow"
params:
include: true
directed: false
run1:
flow: [1] # The flow must be an int
capacity: [1]

- name: "allpairs"
params:
include: true
directed: false

- name: "domino"
params:
include: true
directed: false
run1:
slice_threshold: [0.3]
module_threshold: [0.05]
9 changes: 2 additions & 7 deletions config/egfr.yaml
@@ -9,7 +9,6 @@ algorithms:
-
name: pathlinker
params:
directed: true
include: true
run1:
k:
@@ -18,7 +17,6 @@ algorithms:
-
name: omicsintegrator1
params:
directed: false
include: true
run1:
b:
@@ -38,8 +36,7 @@ algorithms:
-
name: omicsintegrator2
params:
directed: false
include: false
include: true
run1:
b:
- 4
@@ -53,8 +50,7 @@ algorithms:
-
name: meo
params:
directed: true
include: false
include: true
run1:
local_search:
- "Yes"
@@ -65,7 +61,6 @@ algorithms:
-
name: domino
params:
directed: false
include: true
run1:
slice_threshold:
2 changes: 2 additions & 0 deletions docker-wrappers/Cytoscape/README.md
@@ -19,7 +19,9 @@ The Docker wrapper can be tested with `pytest`.

## Versions:
- v1: Use supervisord to launch Cytoscape from a Python subprocess, then connect to Cytoscape with py4cytoscape. Only loads undirected pathways. Compatible with Singularity in local testing (Apptainer version 1.2.2-1.el7) but fails in GitHub Actions.
- v2: Add support for edge direction column.

## TODO
- Add an auth file for `xvfb-run`
- Java initial heap size, maximum Java heap size, and thread stack size are hard-coded in `Cytoscape.vmoptions` file
- Resolve issues with `Cytoscape.vmoptions` line endings being reset to Windows-style. They must be reset periodically, and the image will fail if they are not Unix-style.
2 changes: 1 addition & 1 deletion docker-wrappers/Cytoscape/cytoscape_util.py
@@ -115,7 +115,7 @@ def load_pathways(pathways: List[str], output: str) -> None:
path, name = parse_name(pathway)
suid = p4c.networks.import_network_from_tabular_file(
file=path,
column_type_list='s,t,x',
column_type_list='s,t,x,ea',
delimiters='\t'
)
p4c.networks.rename_network(name, network=suid)
17 changes: 12 additions & 5 deletions input/README.md
@@ -23,14 +23,20 @@ This format may be deprecated.

### Edge file
Edge files do not include a header row.
Each row lists the two nodes that are connected with an undirected edge and a weight for that edge.
Directed edges are not currently supported.
Each row lists the two nodes that are connected with an edge, the weight for that edge, and, optionally, a directionality column to indicate whether the edge is directed or undirected.
The directionality values are either a 'U' for an undirected edge or a 'D' for a directed edge.
If the directionality column is not included, SPRAS will assume that the file's edges are entirely undirected.
The weights are typically in the range [0,1] with 1 being the highest confidence for the edge.

For example:
```
A B 0.98 U
B C 0.77 D
```
or
```
A B 0.98
B C 0.77
B C 0.77
```
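The edge file rules above can be sketched as a small parser. This is a minimal illustration under stated assumptions, not the SPRAS implementation: the function name `parse_edge_lines` is hypothetical, columns are assumed to be whitespace-separated, and the default is applied per row, whereas the README describes the directionality column as optional for the whole file.

```python
from typing import List, Tuple

# (node1, node2, weight, direction) where direction is 'U' or 'D'
Edge = Tuple[str, str, float, str]


def parse_edge_lines(lines: List[str]) -> List[Edge]:
    """Parse edge rows, treating a missing directionality column as undirected.

    Hypothetical helper for illustration; not part of SPRAS.
    """
    edges = []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) == 3:
            node1, node2, weight = fields
            direction = 'U'  # no directionality column: assume undirected
        elif len(fields) == 4:
            node1, node2, weight, direction = fields
        else:
            raise ValueError(f'Line {line_number}: expected 3 or 4 columns, found {len(fields)}')
        if direction not in ('U', 'D'):
            raise ValueError(f"Line {line_number}: direction must be 'U' or 'D', found {direction!r}")
        edges.append((node1, node2, float(weight), direction))
    return edges


# Rows taken from the two examples above
print(parse_edge_lines(['A B 0.98 U', 'B C 0.77 D']))
print(parse_edge_lines(['A B 0.98', 'B C 0.77']))
```

Rejecting any value other than 'U' or 'D' is a design choice of this sketch so that malformed rows fail loudly instead of being silently treated as undirected.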

## Toy datasets
@@ -46,15 +52,15 @@ The following files are very small toy datasets used to illustrate the supported
This dataset represents protein phosphorylation changes in response to epidermal growth factor (EGF) treatment.
The network includes protein-protein interactions from [iRefIndex](http://irefindex.org/) and kinase-substrate interactions from [PhosphoSitePlus](http://www.phosphosite.org/).
The files are originally from the [Temporal Pathway Synthesizer (TPS)](https://github.com/koksal/tps) repository.
They have been lightly modified for SPRAS by lowering one edge weight that was greater than 1, removing a PSEUDONODE prize, adding a prize of 10.0 to EGF_HUMAN, and converting all edges to undirected edges.
They have been lightly modified for SPRAS by lowering one edge weight that was greater than 1, removing 182 self-edges, removing a PSEUDONODE prize, and adding a prize of 10.0 to EGF_HUMAN.
The only source is EGF_HUMAN.
All proteins with phosphorylation-based prizes are also labeled as targets.
All nodes are considered active.

If you use any of the input files `tps-egfr-prizes.txt` or `phosphosite-irefindex13.0-uniprot.txt`, reference the publication

[Synthesizing Signaling Pathways from Temporal Phosphoproteomic Data](https://doi.org/10.1016/j.celrep.2018.08.085).
Ali Sinan Köksal, Kirsten Beck, Dylan R. Cronin, Aaron McKenna, Nathan D. Camp, Saurabh Srivastava, Matthew E. MacGilvray, Rastislav Bodík, Alejandro Wolf-Yadlin, Ernest Fraenkel, Jasmin Fisher, Anthony Gitter.
Ali Sinan Köksal, Kirsten Beck, Dylan R. Cronin, Aaron McKenna, Nathan D. Camp, Saurabh Srivastava, Matthew E. MacGilvray, Rastislav Bodík, Alejandro Wolf-Yadlin, Ernest Fraenkel, Jasmin Fisher, Anthony Gitter.
*Cell Reports* 24(13):3607-3618 2018.

If you use the network file `phosphosite-irefindex13.0-uniprot.txt`, also reference iRefIndex and PhosphoSitePlus.
@@ -68,3 +74,4 @@ Peter V Hornbeck, Bin Zhang, Beth Murray, Jon M Kornhauser, Vaughan Latham, Elzb
*Nucleic Acids Research* 43(D1):D512-520 2015.

The TPS [publication](https://doi.org/10.1016/j.celrep.2018.08.085) describes how the network data and protein prizes were prepared.

18 changes: 9 additions & 9 deletions input/alternative-network.txt
@@ -1,9 +1,9 @@
A B 0.98
B C 0.77
A D 0.12
C D 0.89
C E 0.59
C F 0.50
F G 0.76
G H 0.92
G I 0.66
A B 0.98 U
B C 0.77 U
A D 0.12 U
C D 0.89 U
C E 0.59 U
C F 0.50 U
F G 0.76 U
G H 0.92 U
G I 0.66 U
4 changes: 2 additions & 2 deletions input/network.txt
@@ -1,2 +1,2 @@
A B 0.98
B C 0.77
A B 0.98 U
B C 0.77 U
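The change to the toy network files, appending a `U` column to every three-column row, could be reproduced for other legacy edge files with a sketch like the one below. The function name `add_undirected_column` is hypothetical and not part of SPRAS, and writing tab-separated output is an assumption of this example.

```python
from typing import List


def add_undirected_column(lines: List[str]) -> List[str]:
    """Append 'U' to rows that have only node1, node2, and weight columns.

    Hypothetical helper for illustration; rows with any other number of
    columns keep their fields and are only re-joined with tabs.
    """
    converted = []
    for line in lines:
        fields = line.split()
        if len(fields) == 3:
            fields.append('U')  # mark the legacy edge as undirected
        converted.append('\t'.join(fields))
    return converted


# Rows from the old version of input/network.txt
for row in add_undirected_column(['A B 0.98', 'B C 0.77']):
    print(row)
```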