PD-2800: Add filtered option as output to MergeStarOutput task #1437

Open: wants to merge 7 commits into base: develop
10 changes: 5 additions & 5 deletions pipeline_versions.txt
@@ -30,11 +30,11 @@ ExomeReprocessing 3.3.3 2024-11-04
BuildIndices 3.1.0 2024-11-26
scATAC 1.3.2 2023-08-03
snm3C 4.0.4 2024-08-06
Multiome 5.9.4 2024-12-05
PairedTag 1.9.0 2024-12-05
Multiome 5.9.4 2025-01-09
PairedTag 1.9.0 2025-01-09
MultiSampleSmartSeq2 2.2.22 2024-09-11
MultiSampleSmartSeq2SingleNucleus 2.0.6 2024-11-15
Optimus 7.9.0 2024-12-05
MultiSampleSmartSeq2SingleNucleus 2.0.7 2025-01-09
Optimus 7.9.0 2025-01-09
atac 2.5.3 2024-11-22
SmartSeq2SingleSample 5.1.21 2024-09-11
SlideSeq 3.4.7 2024-12-3
SlideSeq 3.4.8 2025-01-09
3 changes: 2 additions & 1 deletion pipelines/skylab/multiome/Multiome.changelog.md
@@ -1,7 +1,8 @@
# 5.9.4
2024-12-05 (Date of Last Commit)
2025-01-09 (Date of Last Commit)

* Moved the optional CellBender task to the Optimus.wdl
* Added filtered_mtx_files as an intermediate output to MergeStarOutput task; this does not affect the outputs of the pipeline

# 5.9.3
2024-12-3 (Date of Last Commit)
1 change: 1 addition & 0 deletions pipelines/skylab/multiome/Multiome.wdl
@@ -9,6 +9,7 @@ workflow Multiome {

String pipeline_version = "5.9.4"


input {
String cloud_provider
String input_id
3 changes: 2 additions & 1 deletion pipelines/skylab/optimus/Optimus.changelog.md
@@ -1,7 +1,8 @@
# 7.9.0
2024-12-05 (Date of Last Commit)
2025-01-09 (Date of Last Commit)

* Added an optional task to the Optimus.wdl that will run CellBender on the Optimus output h5ad file
* Added filtered_mtx_files as an intermediate output to MergeStarOutput task; this does not affect the outputs of the pipeline

# 7.8.4
2024-12-3 (Date of Last Commit)
2 changes: 2 additions & 0 deletions pipelines/skylab/optimus/Optimus.wdl
@@ -74,9 +74,11 @@ workflow Optimus {
}

# version of this pipeline

String pipeline_version = "7.9.0"



# this is used to scatter matched [r1_fastq, r2_fastq, i1_fastq] arrays
Array[Int] indices = range(length(r1_fastq))

3 changes: 2 additions & 1 deletion pipelines/skylab/paired_tag/PairedTag.changelog.md
@@ -1,7 +1,8 @@
# 1.9.0
2024-12-05 (Date of Last Commit)
2025-01-09 (Date of Last Commit)

* Added an optional task to the Optimus.wdl that will run CellBender on the Optimus output h5ad file
* Added filtered_mtx_files as an intermediate output to MergeStarOutput task; this does not affect the outputs of the pipeline

# 1.8.4
2024-12-3 (Date of Last Commit)
1 change: 0 additions & 1 deletion pipelines/skylab/paired_tag/PairedTag.wdl
@@ -10,7 +10,6 @@ workflow PairedTag {

String pipeline_version = "1.9.0"


input {
String input_id
# Additional library aliquot id
5 changes: 5 additions & 0 deletions pipelines/skylab/slideseq/SlideSeq.changelog.md
@@ -1,3 +1,8 @@
# 3.4.8
2025-01-09 (Date of Last Commit)

* Added filtered_mtx_files as an intermediate output to MergeStarOutput task; this does not affect the outputs of the pipeline

# 3.4.7
2024-12-3 (Date of Last Commit)

2 changes: 1 addition & 1 deletion pipelines/skylab/slideseq/SlideSeq.wdl
@@ -25,7 +25,7 @@ import "../../../tasks/broad/Utilities.wdl" as utils

workflow SlideSeq {

String pipeline_version = "3.4.7"
String pipeline_version = "3.4.8"

input {
Array[File] r1_fastq
@@ -1,3 +1,8 @@
# 2.0.7
2025-01-09 (Date of Last Commit)

* Added filtered_mtx_files as an intermediate output to MergeStarOutput task; this does not affect the outputs of the pipeline

# 2.0.6
2024-11-15 (Date of Last Commit)

@@ -57,7 +57,7 @@ workflow MultiSampleSmartSeq2SingleNucleus {
}

# Version of this pipeline
String pipeline_version = "2.0.6"
String pipeline_version = "2.0.7"

if (false) {
String? none = "None"
17 changes: 11 additions & 6 deletions tasks/skylab/StarAlign.wdl
@@ -490,7 +490,10 @@ task MergeStarOutput {
}

command <<<
set -e
set -euo pipefail
set -x

# declare arrays for the files
declare -a barcodes_files=(~{sep=' ' barcodes})
declare -a features_files=(~{sep=' ' features})
declare -a matrix_files=(~{sep=' ' matrix})
@@ -511,10 +514,8 @@ task MergeStarOutput {
cp ~{input_id}.uniform.mtx ./matrix/matrix.mtx
cp ~{barcodes_single} ./matrix/barcodes.tsv
cp ~{features_single} ./matrix/features.tsv

tar -zcvf ~{input_id}.mtx_files.tar ./matrix/*


# Running star for combined cell matrix
# outputs will be called outputbarcodes.tsv, outputmatrix.mtx, and outputfeatures.tsv
STAR --runMode soloCellFiltering ./matrix ./output --soloCellFilter EmptyDrops_CR
@@ -523,8 +524,6 @@ task MergeStarOutput {
echo "listing files"
ls



if [ -f "${cell_reads_files[0]}" ]; then

# Destination file for cell reads
@@ -602,13 +601,18 @@ task MergeStarOutput {
echo "No text files found in the folder."
fi

#
# create the compressed raw count matrix with the counts, gene names and the barcodes
python3 /scripts/scripts/create-merged-npz-output.py \
--barcodes ${barcodes_files[@]} \
--features ${features_files[@]} \
--matrix ${matrix_files[@]} \
--input_id ~{input_id}

# tar up filtered matrix outputbarcodes.tsv, outputfeatures.tsv, outputmatrix.mtx
ls
echo "Tarring up filtered matrix files"
tar -cvf ~{input_id}_filtered_mtx_files.tar outputbarcodes.tsv outputfeatures.tsv outputmatrix.mtx
echo "Done"
>>>

runtime {
@@ -627,6 +631,7 @@ task MergeStarOutput {
File? cell_reads_out = "~{input_id}.star_metrics.tar"
File? library_metrics="~{input_id}_library_metrics.csv"
File? mtx_files ="~{input_id}.mtx_files.tar"
File? filtered_mtx_files = "~{input_id}_filtered_mtx_files.tar"
Contributor:

Will outputbarcodes.tsv always exist? If not, will this line fail?

Contributor Author:

Yes, the outputbarcodes.tsv file will still exist.

File? outputbarcodes = "outputbarcodes.tsv"
}
}
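
On the review question above about whether outputbarcodes.tsv always exists: with `set -euo pipefail` in effect, `tar` exits non-zero when a named file is missing, so the new tar step would abort the command in that case. If that ever becomes a concern, one option is to guard the step. The following is only a sketch, assuming that skipping the archive is acceptable when any of the three files is absent; the function name `tar_filtered_matrix` and the sample id are hypothetical and not part of this PR.

```shell
# Hypothetical helper, not part of the PR: archive the filtered matrix files
# only when all three exist, so a missing file does not abort a script that
# runs under `set -euo pipefail`.
tar_filtered_matrix() {
  input_id="$1"
  for f in outputbarcodes.tsv outputfeatures.tsv outputmatrix.mtx; do
    if [ ! -f "$f" ]; then
      # Report which file is missing and skip the archive without failing.
      echo "Skipping filtered matrix tar: $f not found" >&2
      return 0
    fi
  done
  tar -cf "${input_id}_filtered_mtx_files.tar" \
    outputbarcodes.tsv outputfeatures.tsv outputmatrix.mtx
}

# Example call; "sample1" stands in for ~{input_id}.
# tar_filtered_matrix sample1
```

Since `filtered_mtx_files` is declared as `File?`, a skipped archive should be tolerated as an absent optional output, though that is worth confirming for the workflow engine in use.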