Problem with grist output #278

Open
marsfro opened this issue Apr 19, 2023 · 0 comments

marsfro commented Apr 19, 2023

Hello everyone!
Could you please help me with the following problem?
I ran genome-grist like this:
genome-grist run conf-tutorial.yml summarize_gather summarize_mapping

conf-tutorial.yml
samples:
- S1e8747dc_1
outdir: output_S1e8747dc/
sourmash_databases:
- gtdb-rs207.genomic-reps.dna.k31.zip

genome-grist found only 1 genome, and the mapping folder contains only 1 BAM file.
The sourmash output for this sample listed 21 genomes, and the genome found by genome-grist is not among them.

What's wrong?
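
To make the mismatch concrete, here is a minimal Python sketch that lists the matches recorded in the gather CSV and compares them with the BAM files in the mapping folder. It assumes the paths follow the pattern visible in the log below (`<outdir>/gather/<sample>.gather.csv.gz` and `<outdir>/mapping/<sample>.x.<ident>.bam`), that the gather CSV has a `name` column, and that each name starts with the genome accession. The function names are hypothetical helpers, not part of genome-grist.

```python
import csv
import gzip
from pathlib import Path


def gather_match_names(gather_csv, column="name"):
    """Return the values of `column` from a gzipped gather CSV, one per row."""
    # Assumption: the CSV has a header row containing `column`.
    with gzip.open(gather_csv, "rt", newline="") as fp:
        return [row[column] for row in csv.DictReader(fp)]


def mapped_idents(mapping_dir, sample):
    """Return identifiers parsed from '<sample>.x.<ident>.bam' file names."""
    prefix = f"{sample}.x."
    return sorted(
        p.name[len(prefix):-len(".bam")]
        for p in Path(mapping_dir).glob(f"{prefix}*.bam")
    )


def idents_missing_from_gather(names, idents):
    """Return the mapped identifiers that no gather match name starts with."""
    # Assumption: each gather match name begins with the genome accession.
    return [i for i in idents if not any(n.startswith(i) for n in names)]
```

For this run, the calls would be `gather_match_names("output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz")` and `mapped_idents("output_S1e8747dc/mapping", "S1e8747dc_1")`; passing both results to `idents_missing_from_gather` shows which mapped genomes do not appear among the gather matches.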
My log file:

Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job                                 count    min threads    max threads
--------------------------------  -------  -------------  -------------
copy_sample_genomes_to_output_wc        1              1              1
gather_reads_wc                         1              1              1
make_combined_info_csv_wc               1              1              1
make_gather_notebook_wc                 1              1              1
make_mapping_notebook_wc                1              1              1
set_kernel                              1              1              1
smash_trim_wc                           1              1              1
sourmash_gather_wc                      1              1              1
sourmash_prefetch_wc                    1              1              1
summarize_gather                        1              1              1
summarize_mapping                       1              1              1
summarize_samtools_depth_wc             2              1              1
total                                  13              1              1

Select jobs to execute...

[Fri Apr 14 00:36:17 2023]
rule smash_trim_wc:
input: output_S1e8747dc/trim/S1e8747dc_1.trim.fq.gz
output: output_S1e8747dc/sigs/S1e8747dc_1.trim.sig.zip
jobid: 3
reason: Missing output files: output_S1e8747dc/sigs/S1e8747dc_1.trim.sig.zip
wildcards: sample=S1e8747dc_1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/be8b6eadbe483a28b8a41700266a1d23_
[Fri Apr 14 03:00:37 2023]
Finished job 3.
1 of 13 steps (8%) done
Select jobs to execute...

[Fri Apr 14 03:00:37 2023]
Job 6:
Find all potentially relevant database matches for S1e8747dc_1

Reason: Missing output files: output_S1e8747dc/gather/S1e8747dc_1.prefetch.csv.gz; Input files updated by another job: output_S1e8747dc/sigs/S1e8747dc_1.trim.sig.zip

Activating conda environment: .snakemake/conda/be8b6eadbe483a28b8a41700266a1d23_
Touching output file output_S1e8747dc/gather/S1e8747dc_1.prefetch.csv.gz.
Touching output file output_S1e8747dc/gather/S1e8747dc_1.known.sig.zip.
Touching output file output_S1e8747dc/gather/S1e8747dc_1.unknown.sig.zip.
[Fri Apr 14 03:49:03 2023]
Finished job 6.
2 of 13 steps (15%) done
Select jobs to execute...

[Fri Apr 14 03:49:03 2023]
Job 2:
Run gather for S1e8747dc_1

Reason: Missing output files: output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz; Input files updated by another job: output_S1e8747dc/gather/S1e8747dc_1.prefetch.csv.gz, output_S1e8747dc/sigs/S1e8747dc_1.trim.sig.zip

Activating conda environment: .snakemake/conda/be8b6eadbe483a28b8a41700266a1d23_
[Fri Apr 14 03:49:21 2023]
Finished job 2.
3 of 13 steps (23%) done
Select jobs to execute...

[Fri Apr 14 03:49:21 2023]
localcheckpoint gather_reads_wc:
input: output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz
output: output_S1e8747dc/gather/.gather.S1e8747dc_1
jobid: 9
reason: Missing output files: output_S1e8747dc/gather/.gather.S1e8747dc_1; Input files updated by another job: output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz
wildcards: sample=S1e8747dc_1
resources: tmpdir=/tmp
Downstream jobs will be updated after completion.

Touching output file output_S1e8747dc/gather/.gather.S1e8747dc_1.
[Fri Apr 14 03:49:21 2023]
Finished job 9.
4 of 13 steps (31%) done
Select jobs to execute...

[Fri Apr 14 03:49:21 2023]
checkpoint copy_sample_genomes_to_output_wc:
input: genbank_cache/GCF_014648495.1_genomic.fna.gz, genbank_cache/GCF_014648495.1.info.csv
output: output_S1e8747dc/genomes/.genomes.S1e8747dc_1
jobid: 8
reason: Missing output files: output_S1e8747dc/genomes/.genomes.S1e8747dc_1
wildcards: sample=S1e8747dc_1
resources: tmpdir=/tmp
Downstream jobs will be updated after completion.

Touching output file output_S1e8747dc/genomes/.genomes.S1e8747dc_1.
[Fri Apr 14 03:49:21 2023]
Finished job 8.
5 of 23 steps (22%) done
Select jobs to execute...

[Fri Apr 14 03:49:21 2023]
rule minimap_wc:
input: output_S1e8747dc/genomes/GCF_014648495.1_genomic.fna.gz, output_S1e8747dc/trim/S1e8747dc_1.trim.fq.gz
output: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
jobid: 25
reason: Missing output files: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
wildcards: sample=S1e8747dc_1, ident=GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/052c7d4415d4fa072e20f9c6e1aa5026_
[Fri Apr 14 05:38:07 2023]
Finished job 25.
6 of 23 steps (26%) done
Select jobs to execute...

[Fri Apr 14 05:38:07 2023]
rule samtools_count_wc:
input: output_S1e8747dc/genomes/GCF_014648495.1_genomic.fna.gz, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
output: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.count_mapped_reads.txt
jobid: 27
reason: Missing output files: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.count_mapped_reads.txt; Input files updated by another job: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
wildcards: dir=mapping, sample=S1e8747dc_1, ident=GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/9b3a1923d8812e952bfc5c5b9669e4d4_
[Fri Apr 14 05:38:36 2023]
Finished job 27.
7 of 23 steps (30%) done
Select jobs to execute...

[Fri Apr 14 05:38:36 2023]
rule samtools_mpileup_wc:
input: output_S1e8747dc/genomes/GCF_014648495.1_genomic.fna.gz, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
output: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bcf, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.vcf.gz, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.vcf.gz.csi
jobid: 26
reason: Missing output files: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.vcf.gz; Input files updated by another job: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
wildcards: dir=mapping, sample=S1e8747dc_1, ident=GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/9b3a1923d8812e952bfc5c5b9669e4d4_
[Fri Apr 14 06:13:50 2023]
Finished job 26.
8 of 23 steps (35%) done
Select jobs to execute...

[Fri Apr 14 06:13:50 2023]
rule bam_to_fastq_wc:
input: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
output: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.mapped.fq.gz
jobid: 31
reason: Missing output files: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.mapped.fq.gz; Input files updated by another job: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
wildcards: bam=S1e8747dc_1.x.GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/052c7d4415d4fa072e20f9c6e1aa5026_
[Fri Apr 14 06:24:29 2023]
Finished job 31.
9 of 23 steps (39%) done
Select jobs to execute...

[Fri Apr 14 06:24:29 2023]
rule bam_to_depth_wc:
input: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
output: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.depth.txt
jobid: 24
reason: Missing output files: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.depth.txt; Input files updated by another job: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.bam
wildcards: dir=mapping, bam=S1e8747dc_1.x.GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/052c7d4415d4fa072e20f9c6e1aa5026_
[Fri Apr 14 06:25:29 2023]
Finished job 24.
10 of 23 steps (43%) done
Select jobs to execute...

[Fri Apr 14 06:25:29 2023]
rule extract_leftover_reads_wc:
input: output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.mapped.fq.gz
output: output_S1e8747dc/leftover/.leftover.S1e8747dc_1
jobid: 30
reason: Missing output files: output_S1e8747dc/leftover/.leftover.S1e8747dc_1; Input files updated by another job: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.mapped.fq.gz
wildcards: sample=S1e8747dc_1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/be8b6eadbe483a28b8a41700266a1d23_
Touching output file output_S1e8747dc/leftover/.leftover.S1e8747dc_1.
[Fri Apr 14 07:04:28 2023]
Finished job 30.
11 of 23 steps (48%) done
Select jobs to execute...

[Fri Apr 14 07:04:28 2023]
rule summarize_samtools_depth_wc:
input: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.depth.txt, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.vcf.gz, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.count_mapped_reads.txt
output: output_S1e8747dc/mapping/S1e8747dc_1.summary.csv
jobid: 13
reason: Missing output files: output_S1e8747dc/mapping/S1e8747dc_1.summary.csv; Input files updated by another job: output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.count_mapped_reads.txt, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.vcf.gz, output_S1e8747dc/mapping/S1e8747dc_1.x.GCF_014648495.1.depth.txt
wildcards: dir=mapping, sample=S1e8747dc_1
resources: tmpdir=/tmp

[Fri Apr 14 07:04:31 2023]
Finished job 13.
12 of 23 steps (52%) done
Select jobs to execute...

[Fri Apr 14 07:04:31 2023]
rule map_leftover_reads_wc:
input: output_S1e8747dc/mapping/S1e8747dc_1.summary.csv, output_S1e8747dc/genomes/GCF_014648495.1_genomic.fna.gz, output_S1e8747dc/leftover/.leftover.S1e8747dc_1
output: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bam
jobid: 29
reason: Missing output files: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bam; Input files updated by another job: output_S1e8747dc/mapping/S1e8747dc_1.summary.csv, output_S1e8747dc/leftover/.leftover.S1e8747dc_1
wildcards: sample=S1e8747dc_1, ident=GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/052c7d4415d4fa072e20f9c6e1aa5026_
[Fri Apr 14 08:00:10 2023]
Finished job 29.
13 of 23 steps (57%) done
Select jobs to execute...

[Fri Apr 14 08:00:10 2023]
rule samtools_count_wc:
input: output_S1e8747dc/genomes/GCF_014648495.1_genomic.fna.gz, output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bam
output: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.count_mapped_reads.txt
jobid: 33
reason: Missing output files: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.count_mapped_reads.txt; Input files updated by another job: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bam
wildcards: dir=leftover, sample=S1e8747dc_1, ident=GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/9b3a1923d8812e952bfc5c5b9669e4d4_
[Fri Apr 14 08:00:38 2023]
Finished job 33.
14 of 23 steps (61%) done
Select jobs to execute...

[Fri Apr 14 08:00:38 2023]
rule samtools_mpileup_wc:
input: output_S1e8747dc/genomes/GCF_014648495.1_genomic.fna.gz, output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bam
output: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bcf, output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.vcf.gz, output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.vcf.gz.csi
jobid: 32
reason: Missing output files: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.vcf.gz; Input files updated by another job: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bam
wildcards: dir=leftover, sample=S1e8747dc_1, ident=GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/9b3a1923d8812e952bfc5c5b9669e4d4_
[Fri Apr 14 08:35:46 2023]
Finished job 32.
15 of 23 steps (65%) done
Select jobs to execute...

[Fri Apr 14 08:35:46 2023]
rule bam_to_depth_wc:
input: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bam
output: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.depth.txt
jobid: 28
reason: Missing output files: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.depth.txt; Input files updated by another job: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.bam
wildcards: dir=leftover, bam=S1e8747dc_1.x.GCF_014648495.1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/052c7d4415d4fa072e20f9c6e1aa5026_
[Fri Apr 14 08:36:47 2023]
Finished job 28.
16 of 23 steps (70%) done
Select jobs to execute...

[Fri Apr 14 08:36:47 2023]
rule summarize_samtools_depth_wc:
input: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.depth.txt, output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.vcf.gz, output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.count_mapped_reads.txt
output: output_S1e8747dc/leftover/S1e8747dc_1.summary.csv
jobid: 14
reason: Missing output files: output_S1e8747dc/leftover/S1e8747dc_1.summary.csv; Input files updated by another job: output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.depth.txt, output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.vcf.gz, output_S1e8747dc/leftover/S1e8747dc_1.x.GCF_014648495.1.count_mapped_reads.txt
wildcards: dir=leftover, sample=S1e8747dc_1
resources: tmpdir=/tmp

[Fri Apr 14 08:36:49 2023]
Finished job 14.
17 of 23 steps (74%) done
Select jobs to execute...

[Fri Apr 14 08:36:49 2023]
rule make_combined_info_csv_wc:
input: output_S1e8747dc/genomes/GCF_014648495.1.info.csv
output: output_S1e8747dc/gather/S1e8747dc_1.genomes.info.csv
jobid: 7
reason: Missing output files: output_S1e8747dc/gather/S1e8747dc_1.genomes.info.csv
wildcards: sample=S1e8747dc_1
resources: tmpdir=/tmp

[Fri Apr 14 08:36:49 2023]
Finished job 7.
18 of 23 steps (78%) done
Select jobs to execute...

[Fri Apr 14 08:36:49 2023]
rule set_kernel:
input: /home/mfrolova/anaconda3/envs/grist/lib/python3.7/site-packages/genome_grist/conf/env/papermill.yml
output: output_S1e8747dc/.kernel.set
jobid: 10
reason: Missing output files: output_S1e8747dc/.kernel.set
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/6ecac7d573969eb57b185be5d53a8113_
Touching output file output_S1e8747dc/.kernel.set.
[Fri Apr 14 08:36:50 2023]
Finished job 10.
19 of 23 steps (83%) done
Select jobs to execute...

[Fri Apr 14 08:36:50 2023]
rule make_gather_notebook_wc:
input: /home/mfrolova/anaconda3/envs/grist/lib/python3.7/site-packages/genome_grist/conf/../notebooks/report-gather.ipynb, output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz, output_S1e8747dc/gather/S1e8747dc_1.genomes.info.csv, output_S1e8747dc/.kernel.set
output: output_S1e8747dc/reports/report-gather-S1e8747dc_1.ipynb, output_S1e8747dc/reports/report-gather-S1e8747dc_1.html
jobid: 1
reason: Missing output files: output_S1e8747dc/reports/report-gather-S1e8747dc_1.html; Input files updated by another job: output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz, output_S1e8747dc/.kernel.set, output_S1e8747dc/gather/S1e8747dc_1.genomes.info.csv
wildcards: sample=S1e8747dc_1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/6ecac7d573969eb57b185be5d53a8113_
[Fri Apr 14 08:36:54 2023]
Finished job 1.
20 of 23 steps (87%) done
Select jobs to execute...

[Fri Apr 14 08:36:54 2023]
localrule summarize_gather:
input: output_S1e8747dc/reports/report-gather-S1e8747dc_1.html
jobid: 0
reason: Input files updated by another job: output_S1e8747dc/reports/report-gather-S1e8747dc_1.html
resources: tmpdir=/tmp

[Fri Apr 14 08:36:54 2023]
Finished job 0.
21 of 23 steps (91%) done
Select jobs to execute...

[Fri Apr 14 08:36:54 2023]
rule make_mapping_notebook_wc:
input: /home/mfrolova/anaconda3/envs/grist/lib/python3.7/site-packages/genome_grist/conf/../notebooks/report-mapping.ipynb, output_S1e8747dc/mapping/S1e8747dc_1.summary.csv, output_S1e8747dc/leftover/S1e8747dc_1.summary.csv, output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz, output_S1e8747dc/gather/S1e8747dc_1.genomes.info.csv, output_S1e8747dc/.kernel.set
output: output_S1e8747dc/reports/report-mapping-S1e8747dc_1.ipynb, output_S1e8747dc/reports/report-mapping-S1e8747dc_1.html
jobid: 12
reason: Missing output files: output_S1e8747dc/reports/report-mapping-S1e8747dc_1.html; Input files updated by another job: output_S1e8747dc/gather/S1e8747dc_1.gather.csv.gz, output_S1e8747dc/.kernel.set, output_S1e8747dc/mapping/S1e8747dc_1.summary.csv, output_S1e8747dc/gather/S1e8747dc_1.genomes.info.csv, output_S1e8747dc/leftover/S1e8747dc_1.summary.csv
wildcards: sample=S1e8747dc_1
resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/6ecac7d573969eb57b185be5d53a8113_
[Fri Apr 14 08:37:01 2023]
Finished job 12.
22 of 23 steps (96%) done
Select jobs to execute...

[Fri Apr 14 08:37:01 2023]
localrule summarize_mapping:
input: output_S1e8747dc/reports/report-mapping-S1e8747dc_1.html, output_S1e8747dc/reports/report-gather-S1e8747dc_1.html
jobid: 11
reason: Input files updated by another job: output_S1e8747dc/reports/report-mapping-S1e8747dc_1.html, output_S1e8747dc/reports/report-gather-S1e8747dc_1.html
resources: tmpdir=/tmp

[Fri Apr 14 08:37:01 2023]
Finished job 11.
23 of 23 steps (100%) done
Complete log: .snakemake/log/2023-04-14T003616.351359.snakemake.log
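
A log like the one above can be scanned for the genome identifiers the workflow actually processed. The following is a minimal Python sketch, assuming the log keeps the `wildcards: ..., ident=<value>` lines shown above; `idents_in_log` is a hypothetical helper name, not a Snakemake or genome-grist function.

```python
import re

# Matches the value of the `ident` wildcard, up to the next comma or whitespace.
IDENT_RE = re.compile(r"\bident=([^,\s]+)")


def idents_in_log(log_text):
    """Return distinct `ident` wildcard values, in order of first appearance."""
    seen = {}
    for match in IDENT_RE.finditer(log_text):
        seen.setdefault(match.group(1), None)  # dicts preserve insertion order
    return list(seen)
```

Reading the file named on the "Complete log" line and passing its text to `idents_in_log` gives one entry per genome that reached the mapping rules.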
