The IsoSeq pipeline runs the following steps:
- lima
  - demultiplexing and primer removal
- merge
  - combine CCS reads by barcode pair
  - only valid barcode pairs are merged: all combinations of 3p__5p barcode pairs defined in barcodes.fasta
- refine
  - poly(A) tail trimming
  - concatemer removal
- cluster
  - hierarchical n*log(n) clustering, aligning shorter sequences to longer ones
  - iterative cluster merging
  - generate a consensus for each read cluster using QV-guided partial order alignment (POA)
- align
  - align reads to the genome using pbmm2
- collapse
  - collapse redundant aligned reads into unique isoforms
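
The merge rule above (every 3' barcode paired with every 5' barcode) can be illustrated with a short Python sketch. This is not the pipeline's own code: the function name `valid_barcode_pairs`, the example barcode names, and the `__` separator (taken from the "3p__5p" wording above) are assumptions for illustration only.

```python
from itertools import product


def valid_barcode_pairs(names, sep="__"):
    """Return every 3'/5' pairing of the given barcode names.

    Assumes the naming convention used by this pipeline:
    3' barcodes end in "_3p" and 5' barcodes end in "_5p".
    The separator is a hypothetical label format.
    """
    three_prime = [n for n in names if n.endswith("_3p")]
    five_prime = [n for n in names if n.endswith("_5p")]
    # Cartesian product: each 3' barcode combined with each 5' barcode.
    return [f"{t}{sep}{f}" for t, f in product(three_prime, five_prime)]


# Hypothetical barcode names, as they might appear in barcodes.fasta.
names = ["bc1_5p", "bc2_5p", "polyA_3p"]
pairs = valid_barcode_pairs(names)
print(pairs)
```

With one 3' barcode and two 5' barcodes, the sketch yields two pairs, so two merged read sets would be expected under this reading of the rule.
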

Run the pipeline:

nextflow run isoseq3.nf --ccs_reads ccs/read/dir --barcodes barcodes.fasta --genome_fasta genome.fasta --name example

Run with Singularity instead of Docker:

nextflow run isoseq3.nf --ccs_reads ccs/read/dir --barcodes barcodes.fasta --genome_fasta genome.fasta --name example -with-singularity -without-docker
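
If the pipeline is launched from a script, the invocation can be assembled as an argument list. The following Python sketch only builds and prints the command; the helper name `build_command` and its `singularity` switch are hypothetical, and the flags are exactly those shown in the commands above.

```python
import shlex


def build_command(ccs_reads, barcodes, genome_fasta, name, singularity=False):
    """Assemble the nextflow invocation as a list of arguments."""
    cmd = [
        "nextflow", "run", "isoseq3.nf",
        "--ccs_reads", ccs_reads,
        "--barcodes", barcodes,
        "--genome_fasta", genome_fasta,
        "--name", name,
    ]
    if singularity:
        # Same container flags as in the Singularity example above.
        cmd += ["-with-singularity", "-without-docker"]
    return cmd


cmd = build_command("ccs/read/dir", "barcodes.fasta", "genome.fasta",
                    "example", singularity=True)
# shlex.join quotes arguments where needed, so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would start the run without going through a shell, which avoids quoting problems with paths that contain spaces.
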

Parameters:

- --ccs_reads
  - ccs.bam file, or a directory containing ccs.bam files
- --barcodes
  - FASTA file of barcode primers to use when demultiplexing
  - 3' barcode names must end in _3p
  - 5' barcode names must end in _5p
- --genome_fasta
  - genome FASTA file to align reads against
- --name
  - name of the experiment
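
Because the barcode naming rule is easy to get wrong, it can help to check the barcode file before starting a run. The sketch below is a minimal pre-flight check in Python, not part of the pipeline; the function names and the example sequences are hypothetical, and it assumes the barcode name is the first whitespace-delimited token of each FASTA header.

```python
def fasta_names(text):
    """Extract record names (first token after '>') from FASTA text."""
    names = []
    for line in text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip headers that carry no name
                names.append(fields[0])
    return names


def invalid_barcode_names(names):
    """Return names that end in neither _3p nor _5p."""
    return [n for n in names if not n.endswith(("_3p", "_5p"))]


# Hypothetical barcode file contents; the third record breaks the rule.
example = ">bc1_5p\nACGT\n>polyA_3p\nTTTT\n>oops\nGGGG\n"
bad = invalid_barcode_names(fasta_names(example))
print(bad)
```

To check a real file, the same functions could be applied to the result of `open("barcodes.fasta").read()`.
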