.Rprofile
003_prepare-Tzelepis2016-score-sgrna-library.R
005_collate-score-readcounts.R
007_extract-score-pdna-batch-read-counts.R
010_prepare-ccle-raw-data.R
015_prepare-dempap-raw-data.R
020_prepare-achilles-pdna-batch-read-counts.R
025_prepare-score-raw-data.R
030_split-file-by-depmapid.R
035_merge-modeling-data.R
040_combine-modeling-data.R
045_check-depmap-modeling-data_exec.ipynb
045_check-depmap-modeling-data_exec.md
045_check-depmap-modeling-data_original.ipynb
050_depmap-subset-dataframes.R
055_auxiliary-data-files.py
057_cell-line-information.R
058_split-modeling-data-per-lineage.R
059_split-broad-modeling-data-per-sublineage.R
060_prep-sanger-cgc.R
061_prep-bailey-2018-cancer-genes.R
065_total-read-count-tables.py
README.md
genes_over_chromosomal_range.R
munge-config.json
munge-dag.png
munge.sh
munge.smk
munge_functions.R

README.md

Data preparation

Raw data is stored in "data/" and the prepared data is saved to "modeling_data/". (The "cache/" directory holds intermediate data from analyses.) See the README in "data/" for instructions on downloading the raw data for this project.
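
Before running the pipeline, it can help to confirm that the directories named above are present. The following is a minimal sketch, not part of this repository: the function name `missing_project_dirs` is hypothetical, and it assumes the three directories sit directly under the project root. "modeling_data/" and "cache/" may well be created by the pipeline itself, so a missing entry is not necessarily an error.

```python
from pathlib import Path

# Directory names and roles as described in the paragraph above.
PROJECT_DIRS = {
    "data": "raw data (input)",
    "modeling_data": "prepared data (output)",
    "cache": "intermediate data of analyses",
}


def missing_project_dirs(root="."):
    """Return the expected directory names that do not exist under `root`.

    Hypothetical helper for illustration; assumes the directories live
    directly under the project root.
    """
    root = Path(root)
    return [name for name in PROJECT_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Report anything that is absent from the current working directory.
    for name in missing_project_dirs():
        print(f"missing: {name}/ ({PROJECT_DIRS[name]})")
```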

A single Snakemake workflow prepares all of the data. Run it from the root directory of the project with the following commands.

conda activate speclet_smk
make munge

If running on O2, the jobs can instead be parallelized across the HPC cluster with the following command.

make munge_o2
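
The scripts in this directory carry three-digit prefixes (003_, 005_, 010_, and so on). As a rough orientation aid, the sketch below lists such files in prefix order. This is only an assumption about how to read the names: the actual dependencies between steps are defined by the Snakemake workflow (munge.smk), not by the prefixes, and the helper `numbered_scripts` is hypothetical.

```python
import re

# Matches a three-digit prefix followed by an underscore, e.g. "010_".
PREFIX = re.compile(r"^(\d{3})_")


def numbered_scripts(filenames):
    """Return (number, filename) pairs for prefixed files, sorted by number.

    Files without a three-digit prefix (e.g. "munge.smk") are skipped.
    """
    pairs = []
    for name in filenames:
        match = PREFIX.match(name)
        if match:
            pairs.append((int(match.group(1)), name))
    return sorted(pairs)


# A few file names from this directory, deliberately out of order.
example = [
    "munge.smk",
    "025_prepare-score-raw-data.R",
    "003_prepare-Tzelepis2016-score-sgrna-library.R",
    "README.md",
    "010_prepare-ccle-raw-data.R",
]

for number, name in numbered_scripts(example):
    print(f"{number:03d}  {name}")
```

The same function can be applied to a real directory listing, for example `numbered_scripts(os.listdir("munge"))`, assuming the scripts live in a directory named "munge".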

Below is the DAG of the pipeline (scaled down to just 5 cell lines).
