I accidentally created a sample sheet in single-end mode, and then ran an Illumina data set through. I'm not sure what the correct behavior is here: I could see either failing with a clear error or running all the way through on just the forward reads I supplied. But what I got instead was an error that didn't make much sense. On dev (124abcf) I got: gzip: null.gz: No such file or directory and:
Executing jgi.BBDuk [in=stdin.fastq, ref=virus-genomes-masked.fasta.gz, out=[sample]_1_viral_bbduk_pass.fastq.gz, outm=[sample]_1_viral_bbduk_fail.fastq.gz, stats=[sample]_1_viral_bbduk.stats.txt, minkmerhits=1, k=24, interleaved=t, t=8, -Xmx16g]
Version 39.01
Set INTERLEAVED to true
Set threads to 8
0.015 seconds.
Initial:
Memory: max=16464m, total=16464m, free=15948m, used=516m
Added 77992608 kmers; time: 57.584 seconds.
Memory: max=16464m, total=16464m, free=12894m, used=3570m
Input is being processed as paired
java.lang.AssertionError:
Error in stdin.fastq, line 1150, with these 4 lines:
@[readid]
[sequence]
+
at stream.FASTQ.quadToRead_slow(FASTQ.java:735)
at stream.FASTQ.toReadList(FASTQ.java:642)
at stream.FastqReadInputStream.fillBuffer(FastqReadInputStream.java:107)
at stream.FastqReadInputStream.hasMore(FastqReadInputStream.java:73)
at stream.ConcurrentGenericReadInputStream$ReadThread.readLists(ConcurrentGenericReadInputStream.java:669)
at stream.ConcurrentGenericReadInputStream$ReadThread.run(ConcurrentGenericReadInputStream.java:658)
Fusion Info:
ami-id: ami-0ba0650e6cd15c2b4
instance-id: i-0f55f284adc2d19ee
instance-type: c4.8xlarge
fusion_version: 2.4.8-11743c7
clone_namespace: false
kernel_version: 6.1
disk_cache_size: 999Gb
max_open_files: 1048576
I don’t think this is actually a problem with incorrect endedness (the pipeline automatically infers endedness from the structure of the samplesheet); rather, the main RUN pipeline just isn’t set up to run single-end data yet. (Simon hasn’t implemented several parts of the workflow for single-end data, and parts of the pipeline assume paired-end.)
So I actually think what we should do is put a roadblock early in the main RUN workflow, where if it infers you’re running single-end data it just stops and errors out.
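A rough sketch of what that roadblock could look like, written in Python for illustration rather than in the pipeline's own workflow language. It assumes the samplesheet is a CSV and that paired-end sheets have a `fastq_2` column while single-end sheets do not; those column names, and the names `infer_endedness`, `require_paired_end`, and `SingleEndNotSupportedError`, are hypothetical and not taken from the pipeline.

```python
import csv
import io


class SingleEndNotSupportedError(ValueError):
    """Raised when the samplesheet appears to describe single-end data."""


def infer_endedness(samplesheet_text: str) -> str:
    """Return "paired" or "single" based on the samplesheet header row.

    Assumption (hypothetical): a paired-end sheet has a `fastq_2` column
    and a single-end sheet does not.
    """
    reader = csv.DictReader(io.StringIO(samplesheet_text))
    columns = reader.fieldnames or []  # None for an empty sheet
    return "paired" if "fastq_2" in columns else "single"


def require_paired_end(samplesheet_text: str) -> None:
    """Stop early with a clear message instead of failing deep in a later step."""
    if infer_endedness(samplesheet_text) == "single":
        raise SingleEndNotSupportedError(
            "Samplesheet looks single-end (no fastq_2 column), but the RUN "
            "workflow currently assumes paired-end reads. Stopping early."
        )
```

Calling `require_paired_end` on the samplesheet contents before any processing starts would turn the confusing `gzip: null.gz` and BBDuk assertion failures into a single explicit error.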
More context: https://twist.com/a/197793/inbox/t/6738804/