holiday01/HASCAD

HASCAD

Concept

HASCAD is a cell composition deconvolution model that predicts the abundance of 15 immune cell types from RNA-seq data. It is an ensemble deep learning model trained on three PBMC scRNA-seq datasets. We use Harmony and Symphony for pre-processing and to remove batch effects between the scRNA-seq datasets when building the reference data.

Model trained on PBMC results

When you prepare your gene expression matrix, check that the genes are in the same order as the reference genes. You can follow the example file when formatting your query.
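The gene-order check can be sketched as follows. This is a minimal illustration, not part of HASCAD: the helper `align_genes`, the gene symbols, and the layout (genes as row labels, samples as columns) are all assumptions made for the example, and the actual reference gene list must come from the repository's own files.

```python
import pandas as pd


def align_genes(query: pd.DataFrame, reference_genes: list) -> pd.DataFrame:
    """Reorder the rows of `query` to match `reference_genes` (hypothetical helper).

    Assumes genes are the row index of `query`. Raises if a reference gene
    is absent, because silently dropping genes would shift the model input.
    """
    missing = [g for g in reference_genes if g not in query.index]
    if missing:
        raise ValueError(f"query is missing reference genes: {missing}")
    return query.loc[reference_genes]


# Toy data: three genes in a different order than the reference.
query = pd.DataFrame({"sample1": [5.0, 1.0, 3.0]}, index=["CD3E", "CD19", "CD14"])
reference_genes = ["CD14", "CD3E", "CD19"]  # illustrative order only

aligned = align_genes(query, reference_genes)
print(aligned.index.tolist())
```

If the query already matches the reference order, the function returns the rows unchanged, so it is safe to run as a routine check before prediction.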

Getting started

Run main.ipynb to get a result.

Without Symphony-Harmony

To use your own data, edit this line in the script and replace "Example.csv" with your file:

sample = pd.read_csv("../Source/Example.csv",header=None)
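Before passing your own file to the notebook, a quick sanity check can catch formatting problems early. The sketch below reads a file the same way as the line above (`header=None`) and rejects empty cells; the helper name `load_query` and the in-memory demo data are assumptions for illustration, and the real layout should follow Example.csv.

```python
import io

import pandas as pd


def load_query(source) -> pd.DataFrame:
    """Read a headerless CSV and fail early on empty cells (hypothetical helper)."""
    sample = pd.read_csv(source, header=None)
    if sample.isna().any().any():
        raise ValueError("query file contains empty cells")
    return sample


# In-memory stand-in for a file path such as "../Source/Example.csv".
demo = io.StringIO("1.0,2.0\n3.0,4.0\n")
sample = load_query(demo)
print(sample.shape)
```

In practice you would pass the path to your own CSV file instead of the `io.StringIO` object.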

With Symphony-Harmony

Run Harmony-Symphony/HS_main.R first. Then edit this line in the script so that it reads the Harmony-Symphony output file instead of "Example.csv":

sample = pd.read_csv("Harmony-Symphony/hs_exmple_output.csv",header=None)

Run your query

Provide your gene expression matrix, processed either without or with Symphony-Harmony.

Then run the script to obtain a plot like the one below.

[Figure: example output plot]

Training your model

This section has two steps. First, prepare your reference data and query data. Second, train HASCAD on the reference data and use it to predict the cell composition of the query data.

With Symphony-Harmony

Make preparations

Requirements

R for Harmony-Symphony analysis

R version 4.1.0

irlba 2.3.5

Symphony

Python for model training, prediction, and plotting

See requirement_python.txt

Manuscript

Under review at a BMC journal.
