update readme with paper link and conda info
1 parent 870fafc, commit d6aa2ac
Showing 1 changed file with 6 additions and 2 deletions.
````diff
@@ -1,6 +1,8 @@
 # Paraphase: HiFi-based SMN1/SMN2 variant caller
 
-SMN1, the gene that causes spinal muscular atrophy, is considered a 'dark' region of the genome due to high sequence similarity with its paralog SMN2. Paraphase is a Python tool that takes HiFi BAMs as input (whole-genome or enrichment), phases complete SMN1 and SMN2 haplotypes, determines copy numbers and makes phased variant calls for both genes. It also categorizes the haplotypes, enabling future haplotype-based screening of silent carriers (2+0). Please check out our [preprint](https://www.biorxiv.org/content/10.1101/2022.10.19.512930) for more details about the method and our population-wide haplotype analysis.
+SMN1, the gene that causes spinal muscular atrophy, is considered a 'dark' region of the genome due to high sequence similarity with its paralog SMN2. Paraphase is a Python tool that takes HiFi BAMs as input (whole-genome or enrichment), phases complete SMN1 and SMN2 haplotypes, determines copy numbers and makes phased variant calls for both genes. It also categorizes the haplotypes, enabling future haplotype-based screening of silent carriers (2+0). Please check out our paper for more details about the method and our population-wide haplotype analysis.
+
+Chen X, Harting J, Farrow E, et al. Comprehensive SMN1 and SMN2 profiling for spinal muscular atrophy analysis using long-read PacBio HiFi sequencing. The American Journal of Human Genetics. 2023;0(0). doi:10.1016/j.ajhg.2023.01.001
 
 For whole-genome sequencing (WGS) data, we recommend >20X, ideally 30X, genome coverage. Low coverage or short read length could result in less accurate phasing, especially when haplotypes are highly similar to each other in Exons 1-6. For hybrid capture-based enrichment data, a higher read depth (>50X) is recommended as the read length is generally shorter than WGS.
 
@@ -17,9 +19,11 @@ Xiao Chen: [email protected]
 
 ## Installation
 
-Paraphase can be installed through pip:
+Paraphase can be installed through pip or conda:
 ```bash
 pip install paraphase
+# or
+conda install -c conda-forge -c bioconda paraphase
 ```
 
 Alternatively, Paraphase can be installed from GitHub.
 
````
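The coverage guidance in the README text above (>20X for WGS, >50X for hybrid-capture enrichment) can be sketched as a small pre-flight check. This is only an illustration: the helper `meets_recommended_depth`, the `RECOMMENDED_MIN_DEPTH` table, and the `"wgs"`/`"enrichment"` labels are hypothetical names and are not part of Paraphase. The mean depth is assumed to come from whatever coverage tool you already use.

```python
# Hypothetical helper, not part of Paraphase: encodes the README's
# recommended minimum depths as strict lower bounds.
RECOMMENDED_MIN_DEPTH = {
    "wgs": 20.0,         # README: >20X, ideally 30X
    "enrichment": 50.0,  # README: >50X for hybrid capture-based data
}


def meets_recommended_depth(mean_depth: float, data_type: str) -> bool:
    """Return True if mean_depth exceeds the README's recommended minimum."""
    try:
        minimum = RECOMMENDED_MIN_DEPTH[data_type]
    except KeyError:
        raise ValueError(
            f"unknown data_type {data_type!r}; "
            f"expected one of {sorted(RECOMMENDED_MIN_DEPTH)}"
        )
    # The README states the thresholds as strict inequalities (">20X", ">50X").
    return mean_depth > minimum
```

For example, `meets_recommended_depth(30, "wgs")` passes, while a 40X enrichment sample falls below the suggested depth.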