Long Read Processing Pipeline

A Nextflow pipeline for processing long reads

Dependencies

samtools

minimap2

clair3

File Format

Sample sheet -- csv

group	barcode_start	barcode_end	barcode_template	fastq	fasta
barcode_55	1200	1237	NNNNATNNNNATNNNNATNNNNATNNNNATNNNNATNN	/path/of/fastq/directory/55	/path/of/fasta/reference_55.fa
barcode_56	1200	1237	NNNNATNNNNATNNNNATNNNNATNNNNATNNNNATNN	/path/of/fastq/directory/56	/path/of/fasta/reference_56.fa
barcode_57	1200	1237	NNNNATNNNNATNNNNATNNNNATNNNNATNNNNATNN	/path/of/fastq/directory/57	/path/of/fasta/reference_57.fa
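
In the example rows above, the barcode template is 38 characters long, which matches barcode_end - barcode_start + 1 (1237 - 1200 + 1). A minimal sketch of a pre-flight check built on that observation follows. The helper name check_samplesheet is hypothetical and not part of nf_longread. It assumes comma-delimited fields, as the "csv" heading suggests, and 1-based inclusive barcode coordinates, which the pipeline documentation does not state. Pass a different delimiter as the second argument if your sheet uses one.

```shell
#!/bin/sh
# Hypothetical helper, not part of nf_longread.
# Usage: check_samplesheet FILE [DELIMITER]   (delimiter defaults to a comma)
check_samplesheet() {
    awk -F"${2:-,}" '
        # Every row, header included, should have the six documented columns.
        NF != 6 {
            printf "line %d: expected 6 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        # The header row carries no coordinates to check.
        NR == 1 { next }
        # Assumption: coordinates are 1-based and inclusive, so the template
        # length should equal barcode_end - barcode_start + 1.
        length($4) != $3 - $2 + 1 {
            printf "line %d (%s): template length %d, expected %d\n", NR, $1, length($4), $3 - $2 + 1
            bad = 1
        }
        # Exit non-zero if any row was flagged.
        END { exit bad + 0 }
    ' "$1"
}
```

For example, check_samplesheet inputs/samplesheet.csv && echo "sample sheet looks consistent" prints the message only when no row is flagged.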

Structure of input directories

Usage

Run job

Submit the bash script below as an LSF job:

#!/bin/bash
#BSUB -o %J.o
#BSUB -e %J.e
#BSUB -R "select[mem>1000] rusage[mem=1000]"
#BSUB -M 1000
#BSUB -q normal

# modules
module load HGI/common/nextflow/23.10.0
module load HGI/softpack/groups/team354/nf_longread
module load HGI/common/clair3

#--------------#
# user specify #
#--------------#
# LSF group
export LSB_DEFAULT_USERGROUP=hgi

# Paths
export INPUTSAMPLE=$PWD/inputs/samplesheet.csv
export OUTPUTRES=$PWD/outputs

#-----------#
# pipelines #
#-----------#
nextflow run -resume nf_longread/main.nf --sample_sheet $INPUTSAMPLE \
                                         --protocol DNA \
                                         --platform nanopore \
                                         --outdir $OUTPUTRES

Usage options

nextflow run check_inputs.nf --sample_sheet "/path/of/sample/sheet"

    Mandatory arguments:
        --sample_sheet        Path of the sample sheet
    
    Optional arguments:
    Basic:
        --outdir              the directory path of output results, default: the current directory
    
    Alignment:
        --protocol            DNA, cDNA, directRNA, default: DNA
        --platform            nanopore, pacbio, hifi, default: nanopore
    
    Variant Calling:
        --model               the training model for variant calling, default: ont_r10
    
    Barcode Detection:
        --mapq                the mapping quality for filtering, default: 1
        --qualcut             the base quality in the barcode for filtering, default: 10
        --numcut              the number of low-quality bases in the barcode for filtering, default: 3
        --countcut            the number of reads supporting the barcode for filtering, default: 5

    Extract SNVs:
        --basequal            the base quality for filtering, default: 30
        --region              the expected region of variants, e.g. 100,200, default: 0,0

    Step arguments:
        --skip_align          skip alignment
        --skip_variant        skip variant calling
        --skip_barcode        skip barcode detection
        --skip_snvcov         skip snv coverage extraction
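
As an illustration of combining the options above, the sketch below assembles a command that tightens two barcode filters and skips SNV coverage extraction. The values 10 and 10 are arbitrary examples, not recommendations. The helper name longread_cmd is hypothetical, and it only prints the command so it can be reviewed before running. It assumes the step arguments are switches that take no value (Nextflow sets a bare --flag parameter to true), and it does not account for dependencies between steps, which the option list does not describe.

```shell
#!/bin/sh
# Hypothetical helper, not part of nf_longread: print an example command
# rather than executing it.
# Usage: longread_cmd SAMPLE_SHEET OUTDIR
longread_cmd() {
    # Unquoted here-document: $1 and $2 are expanded, and each "\\" is
    # emitted as a single backslash so the output is a valid multi-line command.
    cat <<EOF
nextflow run -resume nf_longread/main.nf \\
    --sample_sheet $1 \\
    --outdir $2 \\
    --protocol DNA \\
    --platform nanopore \\
    --mapq 10 \\
    --countcut 10 \\
    --skip_snvcov
EOF
}

longread_cmd "$PWD/inputs/samplesheet.csv" "$PWD/outputs"
```

Once the printed command looks right, it can be pasted into the job script in place of the default invocation.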
