Pierre Lindenbaum edited this page Nov 20, 2013 · 11 revisions

##Motivation

Implementation of https://twitter.com/DNAntonie/status/402909852277932032 :

Shrink your FASTQ.bz2 files by 40+% using this one weird tip -> order them by alignment to reference before compression
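The effect described in the tip can be illustrated with a toy simulation. This is only a sketch and not part of bam2fastq: the random reference, the read length, the coverage and the helper names (`simulate_reads`, `gzip_size`) are all assumptions made for the demonstration, and only the sequence lines are modelled (no read names or qualities). It uses gzip, as in the examples further down this page, rather than bzip2.

```python
import gzip
import random


def simulate_reads(reference, n_reads, read_len, rng):
    """Draw (start, sequence) pairs uniformly from a reference string."""
    last_start = len(reference) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [(s, reference[s:s + read_len]) for s in starts]


def gzip_size(sequences):
    """Size in bytes of the gzip-compressed, newline-joined sequences."""
    return len(gzip.compress("\n".join(sequences).encode("ascii")))


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical 50 kb random "genome", sampled at roughly 10x coverage.
reference = "".join(rng.choice("ACGT") for _ in range(50_000))
reads = simulate_reads(reference, n_reads=5_000, read_len=100, rng=rng)

arrival_order = [seq for _, seq in reads]          # as drawn, i.e. random order
aligned_order = [seq for _, seq in sorted(reads)]  # ordered by start position

size_arrival = gzip_size(arrival_order)
size_aligned = gzip_size(aligned_order)
print("arrival order:", size_arrival, "bytes")
print("aligned order:", size_aligned, "bytes")
```

Once the reads are ordered by position, neighbouring reads overlap, so the compressor finds long matches within its short look-back window. Real FASTQ files also carry read names and quality strings, so the gain on real data will differ from this toy case.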

##Compilation

See also Compilation.

$ ant bam2fastq

##Options

| Name | Description |
|------|-------------|
| -v | Print version and exit. |
| -E (name) | Restrict to that enzyme. Can be called multiple times. Optional. |
| -t (dir) | Set temporary directory. Optional. |
| -F (fastq) | Save fastq_R1 to file (default: stdout). Optional. |
| -R (fastq) | Save fastq_R2 to file (default: interlaced with forward). Optional. |
| -r | Repair: insert missing read. |
| -N (int) | Max records in memory. Optional. |

##Example

Example 1 : piping bwa mem

$ bwa mem -M  human_g1k_v37.fasta  Sample1_L001_R1_001.fastq.gz Sample1_L001_R2_001.fastq.gz |\
  java -jar dist/bam2fastq.jar  -F tmpR1.fastq.gz -R tmpR2.fastq.gz

before:

$ ls -lah Sample1_L001_R1_001.fastq.gz Sample1_L001_R2_001.fastq.gz
-rw-r--r-- 1 lindenb lindenb 181M Jun 14 15:20 Sample1_L001_R1_001.fastq.gz
-rw-r--r-- 1 lindenb lindenb 190M Jun 14 15:20 Sample1_L001_R2_001.fastq.gz

after:

$ ls -lah tmpR1.fastq.gz  tmpR2.fastq.gz
-rw-rw-r-- 1 lindenb lindenb  96M Nov 20 17:10 tmpR1.fastq.gz
-rw-rw-r-- 1 lindenb lindenb 106M Nov 20 17:10 tmpR2.fastq.gz

check that the number of lines (four per read) is unchanged:

$ gunzip -c Sample1_L001_R1_001.fastq.gz | wc -l
5824676
$ gunzip -c tmpR1.fastq.gz | wc -l
5824676
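The same check can be scripted. The following is a minimal sketch, assuming plain four-line FASTQ records (no wrapped sequence lines); `count_fastq_records` and `count_gzipped_fastq` are hypothetical helper names, not part of bam2fastq.

```python
import gzip


def count_fastq_records(lines):
    """Count FASTQ records in an iterable of lines, assuming 4 lines per record."""
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError(f"{n_lines} lines is not a multiple of 4")
    return n_lines // 4


def count_gzipped_fastq(path):
    """Count the records of a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return count_fastq_records(handle)


# The 5824676 lines reported by `wc -l` above correspond to this many reads:
reads_in_example = 5824676 // 4
print(reads_in_example)  # 1456169
```

Calling `count_gzipped_fastq("Sample1_L001_R1_001.fastq.gz")` and `count_gzipped_fastq("tmpR1.fastq.gz")` should then return the same number. Equal counts only show that no read was lost or duplicated; they do not prove that the two files hold the same reads.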

Example 2 : from BAM

$ java -jar dist/bam2fastq.jar \
    -F tmpR1.fastq.gz -R tmpR2.fastq.gz file.bam

(...)
-rw-r--r-- 1 lindenb lindenb 565M Nov 18 10:44 Sample_S1_L001_R1_001.fastq.gz
-rw-r--r-- 1 lindenb lindenb 649M Nov 18 10:45 Sample_S1_L001_R2_001.fastq.gz
-rw-rw-r-- 1 lindenb lindenb 470M Nov 20 16:17 tmpR1.fastq.gz.fastq.gz
-rw-rw-r-- 1 lindenb lindenb 554M Nov 20 16:17 tmpR2.fastq.gz.fastq.gz
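To put the listings above in perspective, the relative size reduction can be computed from the sizes reported by `ls`. This is a rough sketch: the sizes are the rounded megabyte values shown on this page, the pairing of each original file with its reordered counterpart is assumed from the file names, and `percent_reduction` is a hypothetical helper.

```python
def percent_reduction(before, after):
    """Relative size reduction, in percent, from `before` to `after`."""
    return 100.0 * (before - after) / before


# (original size, reordered size) in MB, as printed by `ls -lah` above.
sizes_mb = {
    "example 1, R1": (181, 96),
    "example 1, R2": (190, 106),
    "example 2, R1": (565, 470),
    "example 2, R2": (649, 554),
}

for name, (before, after) in sizes_mb.items():
    print(f"{name}: {percent_reduction(before, after):.1f}% smaller")
```

With these rounded values, the reduction is about 47% and 44% in Example 1, and about 17% and 15% in Example 2.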