release 4.6.0
dbolotin committed Dec 9, 2023
1 parent edbf660 commit d6b91d9
Showing 3 changed files with 99 additions and 126 deletions.
6 changes: 3 additions & 3 deletions build.gradle.kts
@@ -134,18 +134,18 @@ val toObfuscate: Configuration by configurations.creating {
val obfuscationLibs: Configuration by configurations.creating


-val mixcrAlgoVersion = "4.5.0-132-alignment-overlap-optimization"
+val mixcrAlgoVersion = "4.6.0"
// may be blank (will be inherited from mixcr-algo)
val milibVersion = ""
// may be blank (will be inherited from mixcr-algo or milib)
val miuVersion = ""
// may be blank (will be inherited from mixcr-algo)
-val mitoolVersion = "2.1.0-6-main"
+val mitoolVersion = ""
// may be blank (will be inherited from mixcr-algo)
val repseqioVersion = ""

val picocliVersion = "4.6.3"
-val jacksonBomVersion = "2.15.2"
+val jacksonBomVersion = "2.16.0"
val milmVersion = "4.1.0"

val cliktVersion = "3.5.0"
123 changes: 0 additions & 123 deletions changelogs/v4.5.1.md

This file was deleted.

96 changes: 96 additions & 0 deletions changelogs/v4.6.0.md
@@ -0,0 +1,96 @@
## New features

### 🚀 Single-Cell Somatic Hypermutation Trees

MiXCR can now build combined heavy-light somatic hypermutation trees for single-cell data.

- Added a step in `findShmTrees` that combines trees using the groups formed by the `groupClones` command. In the resulting topology, a single node can hold clones from different chains if they are connected by the same group. If there is no connection to any clone from a different chain, the node will hold a reconstructed version of the clone for that chain. The step can be disabled with the `--dont-combine-tree-by-cells` option
- Trees can be split or combined in the `Group trees by cells` step of `findShmTrees`
- Added the `exportShmSingleCellTrees` command, which exports one node per line. If there are several roots in a tree, their data will be exported in different columns.
- Added `-subtreeId` to tree exports to differentiate the parts of trees that come from different chains
- The `exportShmTreesWithNodes` and `exportShmTrees` commands now export subtrees with different chains in separate rows.

### Changes in `groupClones` command

- The previous algorithm has been replaced with a new one that handles contamination better, can detect multi-mappers (when one cell barcode marks two different cells) and can work with non-functional clones
- Some clones can now be marked as contamination. They are marked with a separate label in the `groupId` column of `exportClones`. Such clones can be filtered out of the export with `--filter-out-group-types contamination`
- More informative report
- Fixed behaviour that caused clones with the `undefined` group to be split by cell barcodes
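
The contamination label can also be handled downstream of the export. The following is a minimal Python sketch, not MiXCR code: it assumes a tab-separated `exportClones` table with a `groupId` column, and it assumes contaminated clones carry the literal label `contamination` in that column (the real label text may differ). The function name, the other column names and the sample rows are hypothetical.

```python
import csv
import io


def drop_group_label(tsv_text, label="contamination", column="groupId"):
    """Return the rows of a tab-separated table whose `column` is not `label`.

    Assumption: the label written by the real export may differ, so it is
    a parameter rather than a hard-coded value.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row[column] != label]


# Hypothetical three-clone table, for illustration only.
SAMPLE = (
    "cloneId\tgroupId\treadCount\n"
    "0\t12\t340\n"
    "1\tcontamination\t5\n"
    "2\tundefined\t18\n"
)

kept = drop_group_label(SAMPLE)
print([row["cloneId"] for row in kept])
```

When the filtering can be done at export time, `--filter-out-group-types contamination` is the more direct route, since it does not depend on the label text.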

### New characteristics in SHM trees exports

- `-subtreeId` to distinguish different chains in the same tree
- `-numberOfClonesInTree [forChain]` Number of unique clones in the SHM tree.
- `-numberOfNodesWithClones` Number of nodes with clones, i.e. nodes with different clone sequences.
- `-totalReadsCountInTree [forChain]` Total sum of read counts of clones in the SHM tree.
- `-totalUniqueTagCountInTree (Molecule|Cell|Sample) [forChain]` Total count of unique tags in the SHM tree with specified type.
- `-chains` Chain type of the tree
- `-treeHeight` Height of the tree
- `-vGene`, `-jGene`, `-vFamily`, `-jFamily` - in the previous version these were exported only for nodes with clones
- `-vBestIdentityPercent`, `-jBestIdentityPercent`, `-isOOF` and `-isProductive` are now exported for reconstructed nodes too
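
To make two of the tree-level characteristics above concrete, here is a small Python sketch on a toy tree. It is an illustration under stated assumptions, not MiXCR's implementation: the `Node` class is hypothetical, a node without a `clone_id` stands for a reconstructed node, and height is taken to be the number of edges on the longest root-to-leaf path, which may differ from how `-treeHeight` is actually computed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy tree node; `clone_id` is None for a reconstructed node."""
    clone_id: Optional[int] = None
    children: List["Node"] = field(default_factory=list)


def tree_height(node):
    """Number of edges on the longest root-to-leaf path (assumed definition)."""
    if not node.children:
        return 0
    return 1 + max(tree_height(child) for child in node.children)


def nodes_with_clones(node):
    """Count the nodes that carry an observed clone."""
    own = 1 if node.clone_id is not None else 0
    return own + sum(nodes_with_clones(child) for child in node.children)


# Hypothetical tree: a reconstructed root with one child (clone 1), which has
# two children: clone 2, and a reconstructed node that leads to clone 3.
root = Node(children=[
    Node(clone_id=1, children=[
        Node(clone_id=2),
        Node(children=[Node(clone_id=3)]),
    ]),
])

print(tree_height(root), nodes_with_clones(root))
```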

### New characteristics in clonotype export

- `-aaLength` and `-allAALength` are available alongside `-nLength` and `-allNLength`
- `-aaMutationsRate` is available alongside `-nMutationsRate`
- Added an optional `germline` argument to `-nFeature`, `-aaFeature`, `-nLength`, `-aaLength` in `exportClones`, `exportAlignments` and `exportCloneGroups`. It exports the sequence of the germline instead of the sequence of the gene.
- For all mutation exports (excluding `-mutationsDetailed`) added an optional filter by mutation type: `... [(substitutions|indels|inserts|deletions)]`
- Added `-nMutationsCount`, `-aaMutationsCount`, `-allNMutationsCount`, `-allAAMutationsCount` for all relevant exports
- For mutation exports in `exportShmTreesWithNodes` the `(germline|mrca|parent)` option is now optional. Mutations from `germline` are exported by default
- Added `--export-clone-groups-sort-chains-by` mixin
- Nucleotide mutations can now be exported for features that contain `VCDR3Part`, `DCDR3Part` or `JCDR3Part`
- Now `-nLength`, `-nMutationsCount`, `-nMutationsRate` can be calculated for multiple gene features (e.g. `-nMutationsRate VRegionTrimmed,JRegionTrimmed`)
- Added the `--export-clone-groups-sort-chains-by` mixin, which sets how clones are sorted to determine the primary and the secondary chains. It applies to the `exportCloneGroups` command. By default, it's `Auto` (by UMI if it's available, by Read otherwise; the previous default value was `Read`)
- Added the `--filter-out-group-types` mixin to filter out clones with a certain kind of clone group assignment: `found`, `undefined` or `contamination`. It applies to the `exportClones` command
- `exportCloneGroups` now exports groups in separate files for `IG`, `TRAB`, `TRGD` and `mixed` by default. This behaviour can be switched off with `--reset-export-clone-table-splitting` or a single `--export-clone-groups-for-cell-type`. If several `--export-clone-groups-for-cell-type` options are given, every cell type is exported to a separate file.
- When `--export-clone-groups-for-cell-type` is used in `exportCloneGroups`, all mixed or unmatched groups are filtered out.
- Added read and molecule fraction columns to the single-cell `exportClones` output.
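
The `Auto` rule for `--export-clone-groups-sort-chains-by` (by UMI if available, by read otherwise) can be sketched as follows. This is a hedged Python illustration of the rule as the changelog states it, not MiXCR's code: the dictionary keys `umi_count` and `read_count`, the function names and the sample clones are all hypothetical, and only the `Auto` and `Read` mode names come from the changelog.

```python
def pick_count_field(mode, has_umi):
    """Choose which count ranks the clones of one chain within a cell group."""
    if mode == "Auto":
        # Auto: use UMI counts when they exist, read counts otherwise.
        return "umi_count" if has_umi else "read_count"
    if mode == "Read":
        return "read_count"
    raise ValueError(f"mode not covered by this sketch: {mode}")


def primary_and_secondary(clones, mode="Auto"):
    """Return the two top-ranked clones (primary, secondary) for one chain."""
    has_umi = bool(clones) and all(c.get("umi_count") is not None for c in clones)
    count_field = pick_count_field(mode, has_umi)
    ranked = sorted(clones, key=lambda c: c[count_field], reverse=True)
    primary = ranked[0] if ranked else None
    secondary = ranked[1] if len(ranked) > 1 else None
    return primary, secondary


# Hypothetical clones of one chain in one cell group.
clones = [
    {"id": "A", "read_count": 100, "umi_count": 3},
    {"id": "B", "read_count": 40, "umi_count": 9},
]

auto_primary, _ = primary_and_secondary(clones, mode="Auto")
read_primary, _ = primary_and_secondary(clones, mode="Read")
print(auto_primary["id"], read_primary["id"])
```

The sample shows why the change of default matters: a clone with many reads but few molecules ranks first under `Read` and second under `Auto` when UMIs are present.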

## 📚 Preset updates

- The `milab-human-rna-tcr-umi-race` preset has been updated: now clones are assembled by default based on the CDR3, in line with the manufacturer's recommended read length.
- The `flairr-seq-bcr` preset has been updated: now the preset sets species to `human` by default according to a built-in tag pattern with primer sequences.
- The following presets have been added to cover Invivoscribe assay panels: `invivoscribe-human-dna-trg-lymphotrack`, `invivoscribe-human-dna-trb-lymphotrack`, `invivoscribe-human-dna-igk-lymphotrack`, `invivoscribe-human-dna-ighv-leader-lymphotrack`, `invivoscribe-human-dna-igh-fr3-lymphotrack`, `invivoscribe-human-dna-igh-fr2-lymphotrack`, `invivoscribe-human-dna-igh-fr1-lymphotrack`, `invivoscribe-human-dna-igh-fr123-lymphotrack`.
- The following presets have been added for mouse Thermofisher assays: `thermofisher-mouse-rna-tcb-ampliseq-sr`, `thermofisher-mouse-dna-tcb-ampliseq-sr`, `thermofisher-mouse-rna-igh-ampliseq-sr`, `thermofisher-mouse-dna-igh-ampliseq-sr`.
- A preset has been added for the SMARTer Human scTCR a/b Profiling Kit: `takara-sc-human-rna-tcr-smarter`
- The `milab-human-rna-ig-umi-multiplex` preset has been updated: the pattern now trims fewer nucleotides, which facilitates CDR1 identification. The splits by V and J genes have been removed as redundant due to the full-length assembling feature.

## 🛠️ Other improvements & fixes

- Stricter `Combining trees` step in the `findShmTrees` command
- Better calculation of indel mutations between nodes while building SHM trees
- Increased percent of successful alignment-aided overlaps by removing unnecessary overlap region quality sum threshold
- Requesting the germline sequence for `VJJunction` in `shmTrees` exports, which cannot be exported, now produces an error
- Parameter validation fix in `-nMutationsRate`
- Fix for `-nMutationsRate` if region is not covered for the clone
- Fix for the format of `exportAlignmentsPretty` broken in the previous version
- Fix for IllegalArgumentException in `exportAlignmentsPretty` for cases where translation can't be performed
- Fix for an error when `analyze` is executed with `-f` and `--output-not-used-reads` at the same time
- Resolutions of wildcards are excluded from calculation of `-nMutationsRate` for CDR3 in `exportShmTreesWithNodes`
- Fix OutOfMemory exception in command `extend` with `.vdjca` input
- In `findShmTrees` the filter for productive-only clones now checks for stop codons in all features, not only in CDR3
- Change default value for filter for productive clones in `findShmTrees` to false (was true before)
- Add option `--productive-only` to `findShmTrees`
- Fixed parsing of `--export-clone-groups-for-cell-type` parameter
- Fixed usage of `slice` command on clnx files that weren't ordered by id.
- In `slice` the default behaviour is now to keep original ids. The previous behaviour is available with the `--reassign-ids` option
- Fixed parsing of composite gene features with offsets like `--assemble-clonotypes-by [VDJRegion,CBegin(0,10)]`
- Fixed parent directory creation for output of `exportClonesOverlap`
- Fixed `exportAirr` in case of a clone with a CDR3 that doesn't have VCDR3Part and JCDR3Part
- Optimized calculation of ranks in the clone set. This speeds up export with tags and several other places.
- Added `clone_id` column in `exportAirr`
- Fixed `exportClones` in case of splitting file by `tag:...` if there is a clone that has several tags of the requested level
- Fixed calculation of `-nMutationsCount`, `-nMutationsRate`, `-aaMutationsCount` and `-aaMutationsRate`. Previously, in some cases they were calculated on a different region than the one requested.
- Added `CellBarcodesWithFoundGroups` for `groupClones` QC checks
- New filter `--no-feature` in `exportAlignmentsPretty`
- Fixed reporting in `align`, now coverage takes into account alignment-aided overlap
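
The change to the productive-only filter in `findShmTrees` (stop codons are now checked in all features, not only in CDR3) can be illustrated with a short Python sketch. It is not the MiXCR implementation: it assumes amino acid sequences are available per feature as plain strings with `*` marking a stop codon, it ignores the out-of-frame check entirely, and the function name and sequences are made up for the example.

```python
STOP = "*"  # assumed stop-codon symbol in translated sequences


def has_stop_codon(aa_by_feature, features=None):
    """True if any of the selected features contains a stop codon.

    `features=None` checks every feature present for the clone.
    """
    names = list(aa_by_feature) if features is None else features
    return any(STOP in aa_by_feature.get(name, "") for name in names)


# Hypothetical clone with a stop codon in CDR1 but a clean CDR3.
clone = {
    "CDR1": "GFT*SSYA",
    "CDR2": "ISGSGGST",
    "CDR3": "CAKDRGYFDYW",
}

cdr3_only = has_stop_codon(clone, features=["CDR3"])
all_features = has_stop_codon(clone)
print(cdr3_only, all_features)
```

A clone like this one would pass a CDR3-only check but is rejected once every feature is inspected, which is the situation the fix addresses.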

## ❗ Breaking changes

- Option `--build-from <path>` was removed from `findShmTrees` command

### Deprecations of export options

- `-lengthOf` is now deprecated, use `-nLength` instead
- `-allLengthOf` is now deprecated, use `-allNLength` instead
- `-mutationRate` is now deprecated, use `-nMutationsRate` instead
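
Scripts that still pass the deprecated export options can be updated mechanically. The Python sketch below applies the three renames listed above to an argument list; the helper name, the sample command line and the file names are hypothetical, and the sketch only swaps option names without checking that the remaining arguments are valid.

```python
# Old option name -> replacement, as listed in the deprecations above.
DEPRECATED_OPTIONS = {
    "-lengthOf": "-nLength",
    "-allLengthOf": "-allNLength",
    "-mutationRate": "-nMutationsRate",
}


def modernize_args(args):
    """Replace deprecated export options; leave every other argument as is."""
    return [DEPRECATED_OPTIONS.get(arg, arg) for arg in args]


# Hypothetical export invocation, for illustration only.
old_args = ["exportClones", "-lengthOf", "CDR3", "-mutationRate", "VRegion",
            "clones.clns", "clones.tsv"]

new_args = modernize_args(old_args)
print(" ".join(new_args))
```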
