Version 0.15.0
What's Changed
Breaking changes
- The Rust component of STRkit is now required. Pre-built wheels of strkit_rust_ext for some platforms are provided, but otherwise you'll need the Rust toolchain to install STRkit 0.15+.
- Fractional TR genotype calling has been removed.
- Support for specifying more than one alignment file at a time has been removed.
Features and changes
- Caller:
- Optional incorporation of haplotype-tagged reads from phased alignments
- Phased blocks of SNVs for tandem repeat phasing
- Progressive output for JSON, VCFs, and TSVs instead of storing all results in memory
- Tweaked SNV incorporation logic
- Added a minimum quality threshold for SNV incorporation
- Better consensus sequence logic
- More complete VCF output
- Call reads with > max reads aligned reads (truncate to max reads reads)
- Log current processing rate: # loci / second
- Visualization
- Log version on startup
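As a rough illustration of the progressive-output change listed above, the sketch below writes each TSV row as soon as its result is available rather than collecting every result in memory first. It is not STRkit's implementation: the function names, the result fields, and the column layout are all hypothetical.

```python
import csv
import io
from typing import Iterable, Iterator, TextIO


def iter_results(loci: Iterable[tuple[str, int, int]]) -> Iterator[dict]:
    # Hypothetical stand-in for per-locus calling: yields one result at a time.
    for contig, start, end in loci:
        yield {"contig": contig, "start": start, "end": end, "call": "."}


def write_tsv_progressively(results: Iterable[dict], out: TextIO) -> int:
    # Write and flush each row as it arrives, so memory use stays flat
    # no matter how many loci are processed.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    n_written = 0
    for r in results:
        writer.writerow([r["contig"], r["start"], r["end"], r["call"]])
        out.flush()
        n_written += 1
    return n_written


buf = io.StringIO()
n_written = write_tsv_progressively(
    iter_results([("chr1", 100, 130), ("chr2", 5, 50)]), buf
)
```

Because the results come from a generator, only one locus result is held at a time; the same pattern applies to any line-oriented output.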
Bug fixes
- Caller:
- Many important fixes for VCF output
- Fix many issues with reference repeat-counting logic
- fix(call): properly wait for realign process to terminate on timeout
- fix(call): better contig format detection for SNVs
- fix(call): bug with terminating progress worker (need to reset stuck count)
- fix(call): mitigate potential divide-by-zero in locus logging
- fix(call): using mix of process and main logger in SNV helpers
- fix(call): properly use process worker, log more info about worker ID
- fix(call): locking logic + SNV flip logic; make SNV calls tuples
- fix(call): missing lock call for phase set counter
- fix(call): missing peak k-mer calculation with SNV/HP peaks
- fix(call): properly normalize peak weights from GMM medians
- fix(call): fix peak calling from GMM parameters
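To make the divide-by-zero fix in locus logging concrete, here is a minimal sketch of a guarded loci-per-second calculation. The function name and the choice to report 0.0 when no time has elapsed are assumptions for illustration, not STRkit's actual code.

```python
def loci_per_second(n_loci: int, elapsed_seconds: float) -> float:
    # Guard against a zero (or negative) elapsed time, which can occur when
    # the rate is logged immediately after startup.
    if elapsed_seconds <= 0:
        return 0.0
    return n_loci / elapsed_seconds


rate_at_start = loci_per_second(10, 0.0)
rate_later = loci_per_second(10, 4.0)
```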
Performance
- Caller:
- Reduce reference sequence re-fetching / reference FASTA accesses
- CIGAR decoding optimization
- Repeat-counting optimizations
- Maximum # of iterations for repeat-counting procedures
- Approximate repeat counting for very large regions
- Skip regions which take too long to repeat-count
- Some memory usage improvements
- Misc. micro optimizations (lambdas, partial functions, etc.)
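The iteration cap mentioned above can be pictured with a simple sketch: a loop that counts back-to-back exact motif copies and gives up once it exceeds a fixed number of iterations. This is an illustrative stand-in only; STRkit's repeat counting is not an exact-match scan, and the function name and max_iters parameter are hypothetical.

```python
from typing import Optional


def count_motif_copies(seq: str, motif: str, max_iters: int = 1000) -> Optional[int]:
    # Count consecutive exact copies of `motif` from the start of `seq`.
    # Returns None if the iteration cap is hit, so the caller can skip the
    # region or fall back to a cheaper approximation.
    if not motif:
        raise ValueError("motif must be non-empty")
    count = 0
    pos = 0
    iters = 0
    while seq.startswith(motif, pos):
        if iters >= max_iters:
            return None
        count += 1
        pos += len(motif)
        iters += 1
    return count


short_count = count_motif_copies("CAGCAGCAGT", "CAG")
capped = count_motif_copies("CAG" * 10, "CAG", max_iters=5)
```

Returning None rather than a partial count keeps a capped result from being mistaken for a real one.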
Documentation
- docs: mention installing as a docker container
- docs: clarify that --realign adds time
- docs: specify that BED file should be sorted
- docs: update alliance install instructions
- docs: mention ONT duplex data
- Update output format documentation
Full Changelog: v0.14.0...v0.15.0