Update docs and README

gao-lab · Oct 20, 2024 · 09292d2
1 parent 73698d4
commit 09292d2

Showing 4 changed files with 622 additions and 41 deletions.
diff --git a/README.md b/README.md
@@ -1,4 +1,4 @@
-<!-- [![stars-badge](https://img.shields.io/github/stars/gao-lab/DECIPHER?logo=GitHub&color=yellow)](https://github.com/gao-lab/DECIPHER/stargazers) -->
+[![stars-badge](https://img.shields.io/github/stars/gao-lab/DECIPHER?logo=GitHub&color=yellow)](https://github.com/gao-lab/DECIPHER/stargazers)
 [![build-badge](https://github.com/gao-lab/DECIPHER/actions/workflows/build.yml/badge.svg)](https://github.com/gao-lab/DECIPHER/actions/workflows/build.yml)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 [![license-badge](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -8,7 +8,7 @@
 # DECIPHER
 <div align="center">
 
-[Installation](#Installation) • [Documentation](#Documentation) • [Manuscript](#Manuscript) • [FAQ](#FAQ) • [Acknowledgement](#Acknowledgement)
+[Installation](#Installation) • [Documentation](#Documentation) • [Citation](#Citation) • [FAQ](#FAQ) • [Acknowledgement](#Acknowledgement)
 
 </div>
 
@@ -18,7 +18,7 @@
 
 ## Installation
 > [!IMPORTANT]
-> Requires Python >= 3.10 and CUDA-enabled GPU (not recommend using CPU because it‘s too slow).
+> Requires Python >= 3.10 and a CUDA-enabled GPU (running on CPU only is not recommended).
 
 ### PyPI
 We recommend installing `cell-decipher` in a new conda environment with [RAPIDS](https://docs.rapids.ai/install) dependencies.
@@ -36,7 +36,10 @@ docker pull huhansan666666/decipher:latest
 ```
 
 ## Documentation
-Here is a minimal example for quick start. Please check our [**documentation**](https://slat.readthedocs.io/en/latest/) for advanced tutorials.
+> Please check the [**documentation**](https://slat.readthedocs.io/en/latest/) for detailed tutorials.
+
+### Minimal example
+Here is a minimal example for a quick start:
 
 ```python
 import scanpy as sc
@@ -57,7 +60,17 @@ omics_emb = model.center_emb
 spatial_emb = model.nbr_emb
 ```
 
-## Manuscript
+### Demo
+
+| Name | Description |
+| ---- | ----------- |
+| [Basic Model Tutorial]() ([Colab](https://colab.research.google.com/drive/14PEtrgqlf-KbLOTfBLc9gbx0YvY6mi0S?usp=sharing)) | Tutorial on how to train DECIPHER |
+| [Identify Localization Related Genes]() | Tutorial on how to identify cells’ localization-related genes via DECIPHER embeddings |
+| [Multi-slices with Batch Effects]() | Tutorial on how to remove batch effects across multiple slices |
+| [DDP Training]() | Tutorial on how to use multiple GPUs on large datasets |
+
+
+## Citation
 TBD
 
 > If you want to repeat our benchmarks and case studies, please check the [**benchmark**](./benchmark/README.md) and [**experiments**](./experiments/README.md) folder.

diff --git a/docs/tutorials.md b/docs/tutorials.md
@@ -5,24 +5,26 @@ We provide following tutorials for you to get started with `decipher`.
 ## Basic usage
 You can check the basic usage of `decipher` in the following notebooks:
 
-1. **Train model**: shows how to train `decipher` and get independent omics embedding and spatial embedding from spatial omics data.
-2. **Identify localization related genes**: shows how to identity cell localization related genes via `decipher` omics embedding and spatial embedding.
+1. **Train model**: shows how to train a `decipher` model on spatial omics data and obtain high-fidelity, disentangled omics and spatial embeddings.
+2. **Identify localization related genes**: shows how to identify cell localization-related genes via `decipher`'s disentangled embeddings.
 
 ```{eval-rst}
 .. nbgallery::
     tutorials/train_model.ipynb
     tutorials/explain_select_genes.ipynb
 ```
 
-## Advanced
+## Advanced topics
 
 ### Hyperparameter setting
 
-Default hyperparameters are robust for most cases. If you want to change the hyperparameters, you can specify them when initializing the `decipher` object. The `CFG` object is a nested dictionary that contains all the hyperparameters. You can modify the hyperparameters by changing the values in `CFG` object.
+The default hyperparameters are robust for most cases. To change them, pass a modified configuration when initializing the `DECIPHER` class. You can modify any hyperparameter by changing its value in the `CFG` object, which is a nested dictionary.
 
 ```python
 from decipher import DECIPHER, CFG
 
+print(CFG)  # inspect the default hyperparameters
+
 # modify the hyperparameters
 CFG.omics.model.batch_size = 512
 
@@ -33,7 +35,7 @@ model = DECIPHER(work_dir='/path/to/work_dir', user_cfg=CFG)
 
 
 ### Multiple slices / Batch removal
-You can input a list of Anndata objects, each one is a spaital slices. DECIPHER model will automatically view each object as one batch and remove the batch effects. If you do not want to remove batch effects, please set `CFG.omics.ignore_batch = True`
+You can pass a list of AnnData objects, each representing one spatial slice. `DECIPHER` will automatically treat each object as one batch and remove the batch effects. If you do not want to remove batch effects, set `CFG.omics.ignore_batch = True`.
 
 ```python
 from decipher import DECIPHER, CFG
@@ -48,13 +50,11 @@ model.register_data(adatas)
 model.fit_omics()
 ```
 
-If your input is an integrated Anndata object, you can specify `split_by` in `register_data()` function, decipher will automatically split the integrated Anndata object into multiple batches inside. If you do not want to remove batch effects, please set `CFG.omics.ignore_batch = True`
+If the input is an integrated AnnData object, you can specify `split_by` in the `register_data()` function; `DECIPHER` will then automatically split the AnnData object into batches internally.
 
 ```python
 from decipher import DECIPHER, CFG
 
-# CFG.omics.ignore_batch = True  # uncomment this line if you do not want to remove batch effects
-
 model = DECIPHER(work_dir='/path/to/work_dir', user_cfg=CFG)
 
 adata = sc.read_h5ad('/path/to/adata.h5ad')
@@ -66,20 +66,22 @@ model.fit_omics()
 
 ### Multi-GPU training
 
-For spatial atlas with > 5 millons cells, use DDP (distributed data parallel) mode with multi-GPUs can save a lot of time. Here is an example showing how to run with DDP:
+`decipher` supports DDP (distributed data parallel across multiple GPUs) to accelerate training on large spatial atlases (especially those with > 1 million cells). You only need to change **one line** of code to enable DDP:
 
 ```python
 from decipher import DECIPHER
 
 # Init model
 model = DECIPHER('/path/to/work_dir')
+
 # Register data
 adata = sc.read_h5ad('/path/to/adata.h5ad')
 model.register_data(adata)
-# DDP training
+
+# DDP training, use `fit_ddp` instead of `fit_omics`
 model.fit_ddp(gpus = 4)
 ```
 
 ```{note}
-DDP training also consumes n times the memory
+DDP training consumes n times the system memory (n is the number of GPUs).
 ```
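
One way to read the DDP memory note in `docs/tutorials.md` above: if a single training process needs a given amount of system memory, running on `n` GPUs needs roughly `n` times as much. The sketch below only performs that arithmetic. `estimate_ddp_host_memory_gb` is a hypothetical helper, not part of `decipher`, and it assumes one process per GPU and a per-process footprint that you have measured yourself.

```python
def estimate_ddp_host_memory_gb(per_process_gb: float, n_gpus: int) -> float:
    """Estimate total system memory for DDP, assuming one process per GPU.

    Hypothetical helper (not part of decipher): it applies the
    "n times the system memory" rule of thumb from the note, nothing more.
    """
    if per_process_gb < 0:
        raise ValueError("per_process_gb must be non-negative")
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return per_process_gb * n_gpus


# Example: an assumed 32 GB single-process footprint, run on 4 GPUs.
print(estimate_ddp_host_memory_gb(32.0, 4))  # 128.0
```

Comparing this estimate with the machine's available RAM before choosing `gpus` can help avoid out-of-memory failures on large atlases.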
diff --git a/docs/tutorials/explain_select_genes.ipynb b/docs/tutorials/explain_select_genes.ipynb
@@ -6,7 +6,7 @@
    "source": [
     "# Identify localization related genes\n",
     "\n",
-    "> (Estimated time: ~3 min with 2 GPUs)"
+    "> (Estimated time: ~5 min with a single GPU)"
    ]
   },
   {

diff --git a/docs/tutorials/train_model.ipynb b/docs/tutorials/train_model.ipynb