diff --git a/dev/fai/index.html b/dev/fai/index.html
index 2304315..8ee174c 100644
--- a/dev/fai/index.html
+++ b/dev/fai/index.html
@@ -21,7 +21,7 @@
 julia> idx = faidx(IOBuffer(str));
-julia> rdr = FASTAReader(IOBuffer(str), index=idx);

You can also add an index to an existing reader using the index! function:

FASTX.FASTA.index!Function
index!(r::FASTA.Reader, ind::Union{Nothing, Index, IO, AbstractString})

Set the index of r, and return r. If ind isa Union{Nothing, Index}, directly set the index to ind. If ind isa IO, parse the index from the FAI-formatted IO first. If ind isa AbstractString, treat it as the path to a FAI file to parse.

See also: Index, FASTA.Reader

source

Seeking using an Index

With an Index attached to a Reader, you can do the following operations in O(1) time. In these examples, we will use the following FASTA file:

>seq1 sequence
+julia> rdr = FASTAReader(IOBuffer(str), index=idx);

You can also add an index to an existing reader using the index! function:

FASTX.FASTA.index!Function
index!(r::FASTA.Reader, ind::Union{Nothing, Index, IO, AbstractString})

Set the index of r, and return r. If ind isa Union{Nothing, Index}, directly set the index to ind. If ind isa IO, parse the index from the FAI-formatted IO first. If ind isa AbstractString, treat it as the path to a FAI file to parse.

See also: Index, FASTA.Reader

source

Seeking using an Index

With an Index attached to a Reader, you can do the following operations in O(1) time. In these examples, we will use the following FASTA file:

>seq1 sequence
 TAGAAAGCAA
 TTAAAC
 >seq2 sequence
@@ -36,7 +36,7 @@
 "GAA"

FASTX.jl does not yet support indexing FASTQ files.
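
As a rough illustration of why an index makes seeking constant-time: each FAI entry records a sequence's length, the byte offset of its first base, the number of bases per line, and the number of bytes per line, so the position of any base follows from simple arithmetic. The sketch below is not FASTX.jl's implementation; `FaiEntry` and `byte_offset` are hypothetical names introduced only for this example, and the entry values are taken from the `faidx` example on this page.

```julia
# Hypothetical illustration; these names are not part of FASTX.jl.
struct FaiEntry
    name::String
    length::Int      # total number of bases in the sequence
    offset::Int      # 0-based byte offset of the first base
    linebases::Int   # bases per line
    linewidth::Int   # bytes per line, including the line terminator
end

# 0-based byte offset of the i'th (1-based) base of the sequence.
function byte_offset(e::FaiEntry, i::Integer)
    1 <= i <= e.length || throw(ArgumentError("base index out of range"))
    full_lines, rest = divrem(i - 1, e.linebases)
    return e.offset + full_lines * e.linewidth + rest
end

data = ">ab\nTA\nT\n>x y\nGAG\nGA"
x = FaiEntry("x", 5, 14, 3, 4)

pos = byte_offset(x, 4)              # offset of the 4th base of "x"
base = Char(codeunit(data, pos + 1)) # Julia strings are 1-indexed
```

No scanning of the file is needed: the cost is independent of where the sequence sits in the file.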

Reference:

FASTX.FASTA.faidxFunction
faidx(io::IO)::Index

Read a FASTA.Index from io.

See also: Index

Examples

julia> ind = faidx(IOBuffer(">ab\nTA\nT\n>x y\nGAG\nGA"))
 Index:
   ab	3	4	2	3
-  x	5	14	3	4
source
faidx(fnapath::AbstractString, [idxpath::AbstractString], check=true)

Index the FASTA file at fnapath and write the index to idxpath. If idxpath is not given, default to fnapath * ".fai". If check, throw an error if the output file already exists.

See also: Index

source
FASTX.FASTA.seekrecordFunction
seekrecord(reader::FASTAReader, i::Union{AbstractString, Integer})

Seek Reader to the i'th record. The next iterated record will be the i'th record. i can be the identifier of a sequence, or the 1-based record number in the Index.

The Reader needs to be indexed for this to work.

source
FASTX.FASTA.extractFunction
extract(reader::Reader, name::AbstractString, range::Union{Nothing, UnitRange})

Extract a subsequence given by index range from the sequence named in a Reader with an index. Returns a String. If range is nothing (the default value), return the entire sequence.

source
FASTX.FASTA.IndexType
Index(src::Union{IO, AbstractString})

FASTA index object, which allows constant-time seeking of FASTA files by name. The index is assumed to be in FAI format.

Notable methods:

  • Index(::Union{IO, AbstractString}): Read FAI file from IO or file at path
  • write(::IO, ::Index): Write index in FAI format
  • faidx(::IO)::Index: Index FASTA file
  • seekrecord(::Reader, ::AbstractString): Go to position of seq
  • extract(::Reader, ::AbstractString): Extract part of sequence

Note that the FAI specs are stricter than FASTX.jl's definition of FASTA, such that some valid FASTA records may not be indexable. See the specs at: http://www.htslib.org/doc/faidx.html

See also: FASTA.Reader

Examples

julia> src = IOBuffer("seqname\t9\t14\t6\t8\nA\t1\t3\t1\t2");
+  x	5	14	3	4
source
faidx(fnapath::AbstractString, [idxpath::AbstractString], check=true)

Index the FASTA file at fnapath and write the index to idxpath. If idxpath is not given, default to fnapath * ".fai". If check, throw an error if the output file already exists.

See also: Index

source
FASTX.FASTA.seekrecordFunction
seekrecord(reader::FASTAReader, i::Union{AbstractString, Integer})

Seek Reader to the i'th record. The next iterated record will be the i'th record. i can be the identifier of a sequence, or the 1-based record number in the Index.

The Reader needs to be indexed for this to work.

source
FASTX.FASTA.extractFunction
extract(reader::Reader, name::AbstractString, range::Union{Nothing, UnitRange})

Extract a subsequence given by index range from the sequence named in a Reader with an index. Returns a String. If range is nothing (the default value), return the entire sequence.

source
FASTX.FASTA.IndexType
Index(src::Union{IO, AbstractString})

FASTA index object, which allows constant-time seeking of FASTA files by name. The index is assumed to be in FAI format.

Notable methods:

  • Index(::Union{IO, AbstractString}): Read FAI file from IO or file at path
  • write(::IO, ::Index): Write index in FAI format
  • faidx(::IO)::Index: Index FASTA file
  • seekrecord(::Reader, ::AbstractString): Go to position of seq
  • extract(::Reader, ::AbstractString): Extract part of sequence

Note that the FAI specs are stricter than FASTX.jl's definition of FASTA, such that some valid FASTA records may not be indexable. See the specs at: http://www.htslib.org/doc/faidx.html

See also: FASTA.Reader

Examples

julia> src = IOBuffer("seqname\t9\t14\t6\t8\nA\t1\t3\t1\t2");
 
 julia> fna = IOBuffer(">A\nG\n>seqname\nACGTAC\r\nTTG");
 
@@ -45,4 +45,4 @@
 julia> seekrecord(rdr, "seqname");
 
 julia> sequence(String, first(rdr))
-"ACGTACTTG"
source
+"ACGTACTTG"source
diff --git a/dev/fasta/index.html b/dev/fasta/index.html
index eb0f788..7801162 100644
--- a/dev/fasta/index.html
+++ b/dev/fasta/index.html
@@ -14,7 +14,7 @@
 "TAqACC"
 julia> typeof(description(rec)) == typeof(sequence(rec)) <: AbstractString
-truesource

FASTAReader and FASTAWriter

FASTAWriter can optionally be passed the keyword width to control the line width. If this is zero or negative, it will write all record sequences on a single line. Else, it will wrap lines to the given maximal width.

Reference:

FASTX.FASTA.ReaderType
FASTA.Reader(input::IO; index=nothing, copy::Bool=true)

Create a buffered data reader of the FASTA file format. The reader is a BioGenerics.IO.AbstractReader, a stateful iterator of FASTA.Record. Readers take ownership of the underlying IO. Mutating or closing the underlying IO other than through the reader is undefined behaviour. Closing the Reader also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTA.Record, FASTA.Writer

Arguments

  • input: data source
  • index: Optional random access index (currently fai is supported). index can be nothing, a FASTA.Index, or an IO in which case an index will be parsed from the IO, or AbstractString, in which case it will be treated as a path to a fai file.
  • copy::Bool: iterating returns fresh copies instead of the same Record. Set to false for improved performance, but be wary that iterating mutates records.

Examples

julia> rdr = FASTAReader(IOBuffer(">header\nTAG\n>another\nAGA"));
+true
source

FASTAReader and FASTAWriter

FASTAWriter can optionally be passed the keyword width to control the line width. If this is zero or negative, it will write all record sequences on a single line. Else, it will wrap lines to the given maximal width.
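
The wrapping rule described above can be sketched in plain Julia. This is only an illustration of the behaviour, not the writer's actual code; `wrap_sequence` is a hypothetical helper that assumes an ASCII sequence.

```julia
# Hypothetical helper illustrating the `width` rule; not part of FASTX.jl.
# Assumes `seq` is ASCII, so that byte indices are valid character indices.
function wrap_sequence(seq::AbstractString, width::Integer)
    # Zero or negative width: keep the whole sequence on a single line.
    width < 1 && return [String(seq)]
    n = ncodeunits(seq)
    # Otherwise cut the sequence into chunks of at most `width` characters.
    return [String(seq[i:min(i + width - 1, n)]) for i in 1:width:n]
end

wrapped = wrap_sequence("TAGAAAGCAATTAAAC", 10)
single = wrap_sequence("TAGAAAGCAATTAAAC", 0)
```

With a width of 10, the 16-base sequence is split into a 10-base line followed by a 6-base line; with a width of 0 it stays on one line.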

Reference:

FASTX.FASTA.ReaderType
FASTA.Reader(input::IO; index=nothing, copy::Bool=true)

Create a buffered data reader of the FASTA file format. The reader is a BioGenerics.IO.AbstractReader, a stateful iterator of FASTA.Record. Readers take ownership of the underlying IO. Mutating or closing the underlying IO other than through the reader is undefined behaviour. Closing the Reader also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTA.Record, FASTA.Writer

Arguments

  • input: data source
  • index: Optional random access index (currently fai is supported). index can be nothing, a FASTA.Index, or an IO in which case an index will be parsed from the IO, or AbstractString, in which case it will be treated as a path to a fai file.
  • copy::Bool: iterating returns fresh copies instead of the same Record. Set to false for improved performance, but be wary that iterating mutates records.

Examples

julia> rdr = FASTAReader(IOBuffer(">header\nTAG\n>another\nAGA"));
 
 julia> records = collect(rdr); close(rdr);
 
@@ -24,10 +24,10 @@
 
 julia> foreach(println, map(sequence, records))
 TAG
-AGA
source
FASTX.FASTA.WriterType
FASTA.Writer(output::IO; width=70)

Create a data writer of the FASTA file format. The writer is a BioGenerics.IO.AbstractWriter. Writers take ownership of the underlying IO. Mutating or closing the underlying IO other than through the writer is undefined behaviour. Closing the writer also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTA.Record, FASTA.Reader

Arguments

  • output: Data sink to write to
  • width: Wrapping width of sequence characters. If < 1, no wrapping.

Examples

julia> FASTA.Writer(open("some_file.fna", "w")) do writer
+AGA
source
FASTX.FASTA.WriterType
FASTA.Writer(output::IO; width=70)

Create a data writer of the FASTA file format. The writer is a BioGenerics.IO.AbstractWriter. Writers take ownership of the underlying IO. Mutating or closing the underlying IO other than through the writer is undefined behaviour. Closing the writer also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTA.Record, FASTA.Reader

Arguments

  • output: Data sink to write to
  • width: Wrapping width of sequence characters. If < 1, no wrapping.

Examples

julia> FASTA.Writer(open("some_file.fna", "w")) do writer
     write(writer, record) # a FASTA.Record
-end
source
FASTX.FASTA.validate_fastaFunction
validate_fasta(io::IO) >: Nothing

Check if io is a valid FASTA file. Return nothing if it is, and an instance of another type if not.

Examples

julia> validate_fasta(IOBuffer(">a bc\nTAG\nTA")) === nothing
+end
source
FASTX.FASTA.validate_fastaFunction
validate_fasta(io::IO) >: Nothing

Check if io is a valid FASTA file. Return nothing if it is, and an instance of another type if not.

Examples

julia> validate_fasta(IOBuffer(">a bc\nTAG\nTA")) === nothing
 true
 
 julia> validate_fasta(IOBuffer(">a bc\nT>G\nTA")) === nothing
-false
source
+falsesource
diff --git a/dev/fastq/index.html b/dev/fastq/index.html
index 8dbbf98..4d33fd2 100644
--- a/dev/fastq/index.html
+++ b/dev/fastq/index.html
@@ -32,7 +32,7 @@
 Int8[73, 73, 74]
 julia> typeof(description(rec)) == typeof(sequence(rec)) <: AbstractString
-truesource

Qualities

Unlike FASTARecords, a FASTQRecord contains quality scores; see the example above.

The quality string can be obtained using the quality method:

julia> record = parse(FASTQRecord, "@ILL01\nCCCGC\n+\nKM[^d");
+true
source

Qualities

Unlike FASTARecords, a FASTQRecord contains quality scores; see the example above.

The quality string can be obtained using the quality method:

julia> record = parse(FASTQRecord, "@ILL01\nCCCGC\n+\nKM[^d");
 
 julia> quality(record)
 "KM[^d"

Qualities are numerical values that are encoded by ASCII characters. Unfortunately, multiple encoding schemes exist, although PHRED+33 is the most common. The scores can be obtained using the quality_scores function, which returns an iterator of PHRED+33 scores:

julia> collect(quality_scores(record))
@@ -54,13 +54,13 @@
 julia> qe = QualityEncoding('a':'z', 16); # hypothetical encoding
 
 julia> collect(quality_scores(read, qe)) == [Int8(i) - 16 for i in "abc"]
-true
source

Reference:

FASTX.FASTQ.qualityFunction
quality([T::Type{String, StringView}], record::FASTQ.Record, [part::UnitRange])

Get the ASCII quality of record at positions part as type T. If not passed, T defaults to StringView. If not passed, part defaults to the entire quality string.

Examples

julia> rec = parse(FASTQ.Record, "@hdr\nUAGUCU\n+\nCCDFFG");
+true
source

Reference:

FASTX.FASTQ.qualityFunction
quality([T::Type{String, StringView}], record::FASTQ.Record, [part::UnitRange])

Get the ASCII quality of record at positions part as type T. If not passed, T defaults to StringView. If not passed, part defaults to the entire quality string.

Examples

julia> rec = parse(FASTQ.Record, "@hdr\nUAGUCU\n+\nCCDFFG");
 
 julia> qual = quality(rec)
 "CCDFFG"
 
 julia> qual isa AbstractString
-true
source
FASTX.FASTQ.quality_scoresFunction
quality_scores(record::FASTQ.Record, [encoding::QualityEncoding], [part::UnitRange])

Get an iterator of PHRED base quality scores of record at positions part. This iterator is corrupted if the record is mutated. By default, part is the whole sequence. By default, the encoding is PHRED33 Sanger encoding, but may be specified with a QualityEncoding object.

source
quality(record::Record, encoding_name::Symbol, [part::UnitRange])::Vector{UInt8}

Get an iterator of base quality of the slice part of record's quality.

The encoding_name can be either :sanger, :solexa, :illumina13, :illumina15, or :illumina18.

source
FASTX.FASTQ.quality_header!Function
quality_header!(record::Record, x::Bool)

Set whether the record repeats its header on the quality comment line, i.e. the line with +.

Examples

julia> record = parse(FASTQ.Record, "@A B\nT\n+\nJ");
+true
source
FASTX.FASTQ.quality_scoresFunction
quality_scores(record::FASTQ.Record, [encoding::QualityEncoding], [part::UnitRange])

Get an iterator of PHRED base quality scores of record at positions part. This iterator is corrupted if the record is mutated. By default, part is the whole sequence. By default, the encoding is PHRED33 Sanger encoding, but may be specified with a QualityEncoding object.

source
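
For intuition, PHRED+33 decoding amounts to subtracting 33 from the ASCII code of each quality character. The following is a minimal plain-Julia sketch of that arithmetic, assuming an ASCII quality string; `phred33` is a hypothetical helper and not a FASTX.jl function. The input is the quality string from the FASTQReader example on this page.

```julia
# Hypothetical helper; not part of FASTX.jl.
# Decode an ASCII quality string under the PHRED+33 convention:
# score = ASCII code of the character minus 33.
phred33(qual::AbstractString) = [Int8(Int(c) - 33) for c in qual]

scores = phred33("jk;]")
```

For real records, quality_scores is the better choice, since it works on the record's data directly and accepts other encodings through a QualityEncoding.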
quality(record::Record, encoding_name::Symbol, [part::UnitRange])::Vector{UInt8}

Get an iterator of base quality of the slice part of record's quality.

The encoding_name can be either :sanger, :solexa, :illumina13, :illumina15, or :illumina18.

source
FASTX.FASTQ.quality_header!Function
quality_header!(record::Record, x::Bool)

Set whether the record repeats its header on the quality comment line, i.e. the line with +.

Examples

julia> record = parse(FASTQ.Record, "@A B\nT\n+\nJ");
 
 julia> string(record)
 "@A B\nT\n+\nJ"
@@ -68,7 +68,7 @@
 julia> quality_header!(record, true);
 
 julia> string(record)
-"@A B\nT\n+A B\nJ"
source

FASTQReader and FASTQWriter

FASTQWriter can optionally be passed the keyword quality_header to control whether or not to print the description on the third line (the one with +). By default this is nothing, meaning that it will print the second header, if present in the record itself.

If set to a Bool value, the Writer will override the Records, without changing the records themselves.

Reference:

FASTX.FASTQ.ReaderType
FASTQ.Reader(input::IO; copy::Bool=true)

Create a buffered data reader of the FASTQ file format. The reader is a BioGenerics.IO.AbstractReader, a stateful iterator of FASTQ.Record. Readers take ownership of the underlying IO. Mutating or closing the underlying IO other than through the reader is undefined behaviour. Closing the Reader also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTQ.Record, FASTQ.Writer

Arguments

  • input: data source
  • copy::Bool: iterating returns fresh copies instead of the same Record. Set to false for improved performance, but be wary that iterating mutates records.

Examples

julia> rdr = FASTQReader(IOBuffer("@readname\nGGCC\n+\njk;]"));
+"@A B\nT\n+A B\nJ"
source

FASTQReader and FASTQWriter

FASTQWriter can optionally be passed the keyword quality_header to control whether or not to print the description on the third line (the one with +). By default this is nothing, meaning that it will print the second header, if present in the record itself.

If set to a Bool value, the Writer will override the Records, without changing the records themselves.

Reference:

FASTX.FASTQ.ReaderType
FASTQ.Reader(input::IO; copy::Bool=true)

Create a buffered data reader of the FASTQ file format. The reader is a BioGenerics.IO.AbstractReader, a stateful iterator of FASTQ.Record. Readers take ownership of the underlying IO. Mutating or closing the underlying IO other than through the reader is undefined behaviour. Closing the Reader also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTQ.Record, FASTQ.Writer

Arguments

  • input: data source
  • copy::Bool: iterating returns fresh copies instead of the same Record. Set to false for improved performance, but be wary that iterating mutates records.

Examples

julia> rdr = FASTQReader(IOBuffer("@readname\nGGCC\n+\njk;]"));
 
 julia> record = first(rdr); close(rdr);
 
@@ -79,10 +79,10 @@
 "GGCC"
 
 julia> show(collect(quality_scores(record))) # phred 33 encoding by default
-Int8[73, 74, 26, 60]
source
FASTX.FASTQ.WriterType
FASTQ.Writer(output::IO; quality_header::Union{Nothing, Bool}=nothing)

Create a data writer of the FASTQ file format. The writer is a BioGenerics.IO.AbstractWriter. Writers take ownership of the underlying IO. Mutating or closing the underlying IO other than through the writer is undefined behaviour. Closing the writer also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTQ.Record, FASTQ.Reader

Arguments

  • output: Data sink to write to
  • quality_header: Whether to print second header on the + line. If nothing (default), check the individual Record objects for whether they contain a second header.

Examples

julia> FASTQ.Writer(open("some_file.fq", "w")) do writer
+Int8[73, 74, 26, 60]
source
FASTX.FASTQ.WriterType
FASTQ.Writer(output::IO; quality_header::Union{Nothing, Bool}=nothing)

Create a data writer of the FASTQ file format. The writer is a BioGenerics.IO.AbstractWriter. Writers take ownership of the underlying IO. Mutating or closing the underlying IO other than through the writer is undefined behaviour. Closing the writer also closes the underlying IO.

See more examples in the FASTX documentation.

See also: FASTQ.Record, FASTQ.Reader

Arguments

  • output: Data sink to write to
  • quality_header: Whether to print second header on the + line. If nothing (default), check the individual Record objects for whether they contain a second header.

Examples

julia> FASTQ.Writer(open("some_file.fq", "w")) do writer
     write(writer, record) # a FASTQ.Record
-end
source
FASTX.FASTQ.validate_fastqFunction
validate_fastq(io::IO) >: Nothing

Check if io is a valid FASTQ file. Return nothing if it is, and an instance of another type if not.

Examples

julia> validate_fastq(IOBuffer("@i1 r1\nuuag\n+\nHJKI")) === nothing
+end
source
FASTX.FASTQ.validate_fastqFunction
validate_fastq(io::IO) >: Nothing

Check if io is a valid FASTQ file. Return nothing if it is, and an instance of another type if not.

Examples

julia> validate_fastq(IOBuffer("@i1 r1\nuuag\n+\nHJKI")) === nothing
 true
 
 julia> validate_fastq(IOBuffer("@i1 r1\nu;ag\n+\nHJKI")) === nothing
-false
source
+falsesource
diff --git a/dev/files/index.html b/dev/files/index.html
index 7556c61..8ddd1ec 100644
--- a/dev/files/index.html
+++ b/dev/files/index.html
@@ -53,4 +53,4 @@
     for record in my_records
         write(writer, record)
     end
-end
+end
diff --git a/dev/index.html b/dev/index.html
index 50147f6..1ffdeef 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -23,4 +23,4 @@
     for record in records
         write(writer, record)
     end
-end

See more details in the sections in the sidebar.

Contributing

We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.

Take a look at the contributing files for detailed contributor and maintainer guidelines, and the code of conduct.

+end

See more details in the sections in the sidebar.

Contributing

We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.

Take a look at the contributing files for detailed contributor and maintainer guidelines, and the code of conduct.

diff --git a/dev/records/index.html b/dev/records/index.html
index d0d19ff..a96bc0d 100644
--- a/dev/records/index.html
+++ b/dev/records/index.html
@@ -24,18 +24,18 @@
 6

Reference:

FASTX.identifierFunction
identifier(record::Record)::AbstractString

Get the sequence identifier of record. The identifier is the description before any whitespace. If the identifier is missing, return an empty string. Returns an AbstractString view into the record. If the record is overwritten, the string data will be corrupted.

See also: description, sequence

Examples

julia> record = parse(FASTA.Record, ">ident_here some descr \nTAGA");
 
 julia> identifier(record)
-"ident_here"
source
FASTX.descriptionFunction
description(record::Record)::AbstractString

Get the description of record. The description is the entire header line, minus the leading > or @ symbols for FASTA/FASTQ records, respectively, including trailing whitespace. Returns an AbstractString view into the record. If the record is overwritten, the string data will be corrupted.

See also: identifier, sequence

Examples

julia> record = parse(FASTA.Record, ">ident_here some descr \nTAGA");
+"ident_here"
source
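
The relationship between a description and its identifier can be sketched in plain Julia: the identifier is everything before the first whitespace character. `identifier_of` below is a hypothetical helper written for illustration, not the package's implementation.

```julia
# Hypothetical helper; not part of FASTX.jl.
# Return the part of a header description that precedes the first whitespace.
function identifier_of(desc::AbstractString)
    i = findfirst(isspace, desc)
    # No whitespace at all: the whole description is the identifier.
    i === nothing && return String(desc)
    # Leading whitespace gives an empty identifier, since the range 1:0 is empty.
    return String(desc[1:prevind(desc, i)])
end

id = identifier_of("ident_here some descr ")
```

Unlike this copying sketch, identifier returns a view into the record, which is why it is corrupted if the record is overwritten.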
FASTX.descriptionFunction
description(record::Record)::AbstractString

Get the description of record. The description is the entire header line, minus the leading > or @ symbols for FASTA/FASTQ records, respectively, including trailing whitespace. Returns an AbstractString view into the record. If the record is overwritten, the string data will be corrupted.

See also: identifier, sequence

Examples

julia> record = parse(FASTA.Record, ">ident_here some descr \nTAGA");
 
 julia> description(record)
-"ident_here some descr "
source
FASTX.sequenceFunction
sequence([::Type{S}], record::Record, [part::UnitRange{Int}])::S

Get the sequence of record.

S can be either a subtype of BioSequences.BioSequence, AbstractString or String. If elided, S defaults to an AbstractString subtype. If part argument is given, it returns the specified part of the sequence.

See also: identifier, description

Examples

julia> record = parse(FASTQ.Record, "@read1\nTAGA\n+\n;;]]");
+"ident_here some descr "
source
FASTX.sequenceFunction
sequence([::Type{S}], record::Record, [part::UnitRange{Int}])::S

Get the sequence of record.

S can be either a subtype of BioSequences.BioSequence, AbstractString or String. If elided, S defaults to an AbstractString subtype. If part argument is given, it returns the specified part of the sequence.

See also: identifier, description

Examples

julia> record = parse(FASTQ.Record, "@read1\nTAGA\n+\n;;]]");
 
 julia> sequence(record)
 "TAGA"
 
 julia> sequence(LongDNA{2}, record)
 4nt DNA Sequence:
-TAGA
source
FASTX.seqsizeFunction
seqsize(::Record)::Int

Get the number of bytes in the sequence of a Record. Note that in the presence of non-ASCII characters, this may differ from length(sequence(record)).

See also: sequence

Examples

julia> seqsize(parse(FASTA.Record, ">hdr\nKRRLPW\nYHS"))
+TAGA
source
FASTX.seqsizeFunction
seqsize(::Record)::Int

Get the number of bytes in the sequence of a Record. Note that in the presence of non-ASCII characters, this may differ from length(sequence(record)).

See also: sequence

Examples

julia> seqsize(parse(FASTA.Record, ">hdr\nKRRLPW\nYHS"))
 9
 
 julia> seqsize(parse(FASTA.Record, ">hdr\nαβγδϵ"))
-10
source
+10source
diff --git a/dev/search/index.html b/dev/search/index.html
index 0bb77c2..b7048d0 100644
--- a/dev/search/index.html
+++ b/dev/search/index.html
@@ -1,2 +1,2 @@
-Search · FASTX.jl

Loading search...

    +Search · FASTX.jl

    Loading search...