New features
kallisto quant-tcc: This new command runs the EM algorithm on a supplied transcript-compatibility counts (TCC) matrix file, such as the one generated by "bustools count", to produce transcript-level abundance estimates. When a gene-mapping file is supplied, gene-level abundances are also output. Effective length normalization is performed only if both a kallisto index and fragment length information are supplied.
New technologies were added to "kallisto bus": -x SmartSeq3 (--tag can be used to supply a 5′ tag sequence that identifies UMI-containing reads), -x BDWTA (BD Rhapsody), -x Visium (10x Visium), -x SPLIT-SEQ (SPLiT-seq preprocessing), and -x Bulk (for preprocessing non-demultiplexed bulk RNA-seq files).
"kallisto bus" can be run with no technology specified: In this case, it either processes a batch file (supplied via --batch), as the old "kallisto pseudo" did, or processes FASTQ files supplied directly on the command line, treating each FASTQ file (or each pair of FASTQ files, if --paired is specified) as an individual sample. This is useful for generating BUS files when each sample is in a separate FASTQ file. Combined with bustools and kallisto quant-tcc, this feature makes the old "kallisto pseudo" effectively obsolete.
Strand-specificity is now enabled by default for the 10x, SureCell, CelSeq, BD Rhapsody, and Smart-seq3 UMI technologies (unstranded remains the default for other technologies). The user can override the default by supplying --fr-stranded, --rf-stranded, or --unstranded.
Various performance improvements, mostly in data ingestion throughput.
A minimal form of the kallisto index is written to a file named index.saved and a file containing fragment length distributions (flens.txt) is written when "kallisto bus" is run on paired-end reads (which can be specified via the option --paired). This allows kallisto quant-tcc to perform effective length normalization if needed.
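To make the quant-tcc description above more concrete, here is a minimal Python sketch of an EM iteration over transcript-compatibility counts, with optional effective length normalization. It is an illustration only, not kallisto's implementation: the function names (effective_length, em_tcc), the in-memory representation of equivalence classes, and the assumption that a fragment length distribution is a list of counts indexed by fragment length are all hypothetical.

```python
from typing import List, Sequence


def effective_length(tx_len: int, flen_counts: Sequence[float]) -> float:
    """Transcript length minus the mean usable fragment length, plus one.

    ASSUMPTION: flen_counts[i] is the number of observed fragments of
    length i. Fragments longer than the transcript are ignored.
    """
    usable = flen_counts[: tx_len + 1]
    total = sum(usable)
    if total == 0:
        return float(tx_len)  # no usable fragment information
    mean = sum(length * count for length, count in enumerate(usable)) / total
    return max(tx_len - mean + 1.0, 1.0)


def em_tcc(
    ec_transcripts: Sequence[Sequence[int]],
    ec_counts: Sequence[float],
    eff_lens: Sequence[float],
    max_iter: int = 1000,
    tol: float = 1e-9,
) -> List[float]:
    """Estimate per-transcript counts from equivalence-class counts.

    ec_transcripts[k] lists the transcript indices compatible with class k,
    and ec_counts[k] is the number of reads (or UMIs) in that class.
    Passing eff_lens of all 1.0 disables length normalization.
    """
    n = len(eff_lens)
    total = float(sum(ec_counts))
    if total == 0:
        return [0.0] * n
    est = [total / n] * n  # start from a uniform split of the counts
    for _ in range(max_iter):
        new = [0.0] * n
        for txs, count in zip(ec_transcripts, ec_counts):
            # E-step: share the class count in proportion to est / eff_len.
            weights = [est[t] / eff_lens[t] for t in txs]
            denom = sum(weights)
            if count == 0 or denom == 0:
                continue
            for t, w in zip(txs, weights):
                new[t] += count * w / denom
        # M-step is implicit: the reassigned counts are the new estimates.
        change = max(abs(a - b) for a, b in zip(est, new))
        est = new
        if change < tol:
            break
    return est
```

For example, em_tcc([[0], [1], [0, 1]], [30, 10, 40], [1.0, 1.0]) converges to roughly 60 and 20 counts for the two transcripts: the 40 ambiguous counts are split 3:1, matching the ratio of the uniquely assigned counts.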
Deprecation
"kallisto pseudo" is now deprecated and will be removed in a future release; users should supply batch files of FASTQ file names to "kallisto bus" instead.