Comparing changes

base repository: dkoslicki/CMash
base: v0.5.2
head repository: dkoslicki/CMash
compare: master
  • 3 commits
  • 1 file changed
  • 1 contributor

Commits on Mar 8, 2022

  1. Put in correct citation

    Copy-paste error from this repo: https://github.com/dkoslicki/MinHashMetagenomics
    dkoslicki authored Mar 8, 2022
    59cee69

Commits on Jan 12, 2024

  1. Update README.md

    dkoslicki authored Jan 12, 2024
    ef14495

Commits on Jul 19, 2024

  1. Update README.md

    dkoslicki authored Jul 19, 2024
    fb6825f
Showing with 6 additions and 4 deletions.
  1. +6 −4 README.md
10 changes: 6 additions & 4 deletions README.md
@@ -1,9 +1,11 @@
# CMash
CMash is a fast and accurate way to estimate the similarity of two sets. This is a probabilistic data analysis approach that uses containment min hashing. Please see the [associated paper](http://www.biorxiv.org/content/early/2017/09/04/184150) for further details (and please cite if you use it):
>Improving Min Hash via the Containment Index with applications to Metagenomic Analysis
>David Koslicki, Hooman Zabeti
>bioRxiv 184150; doi: https://doi.org/10.1101/184150

# Please Note: CMash has largely been supplanted by [Sourmash](https://github.com/sourmash-bio/sourmash) and [YACHT](https://github.com/KoslickiLab/YACHT). While these packages technically do not possess the ability to change k-mer sizes, we have decided to adopt Sourmash, incorporate our other results (e.g., estimating ANI and AAI) into the Sourmash code base, and build tooling, YACHT, on top of that. Consider this repo deprecated.

CMash is a fast and accurate way to estimate the similarity of two sets. This is a probabilistic data analysis approach that uses containment min hashing. Please see the [associated paper](https://doi.org/10.1101/2021.12.06.471436) for further details (and please cite if you use it):
```
Liu, S., & Koslicki, D. (2021). CMash: fast, multi-resolution estimation of k-mer-based Jaccard and containment indices. bioRxiv. https://doi.org/10.1101/2021.12.06.471436
```
# Be aware, this is a work in progress and isn't guaranteed to be functional

## Installation