Tutorial for selecting marker genes for image-based spatial transcriptomic experiments

This repository provides a series of Python notebooks to help users select marker genes for image-based spatial transcriptomic experiments. Because of experimental constraints, only a limited number of genes can be measured in these experiments. It is therefore crucial to select a small yet effective subset of genes that can answer biological questions such as cell type mapping and tissue neighborhood analysis.

Marker genes must be discriminative and robust enough to identify the cell type of interest. In the absence of cytoplasm staining, RNA-cell assignment is difficult.
Assignment errors can arise from irregular cell shapes or RNA expression, and they can lead to inaccurate cell type identification.

To mitigate these challenges, the notebooks use tissue simulation to validate the chosen marker genes, estimate potential RNA-cell assignment errors, and assess how these errors affect cell type identification. The notebooks are designed to be user-friendly, so that users can easily follow the steps and adapt the code to their own data. The workflow is illustrated in the figure below.

The notebooks are written in Python and are divided into the following sections:

1- Marker gene selection: The notebook guides the user in selecting marker genes from scRNA-seq data.
2- Marker gene validation: The notebook guides the user in validating the selected marker genes using scRNA-seq data.
3- Tissue simulation: The notebook guides the user in simulating the tissue using the selected marker genes. These simulations are used to validate the selected marker genes while taking into account experimental smFISH constraints.
4- RNA-cell assignment: The notebook guides the user in assigning RNA to the cells in the simulated tissue.
5- Assessment of in-situ cell type calling in smFISH simulation: The notebook guides the user in assessing cell type calling in the simulated tissue, using the previously simulated tissue and the RNA-cell assignment.
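
To give a feel for step 1, here is a minimal sketch of one common way to rank candidate marker genes: a one-vs-rest log2 fold change of mean expression per cell type. This is an illustrative assumption, not necessarily the method used in the notebooks; the function name `select_markers`, the `pseudocount` parameter, and the toy data are all hypothetical.

```python
import numpy as np

def select_markers(counts, labels, gene_names, n_per_type=1, pseudocount=1.0):
    """Rank genes per cell type by one-vs-rest log2 fold change.

    counts: (n_cells, n_genes) expression matrix (hypothetical toy input).
    labels: cell type label for each cell.
    Returns a dict mapping cell type -> list of top gene names.
    """
    counts = np.asarray(counts, dtype=float)
    labels = np.asarray(labels)
    markers = {}
    for cell_type in np.unique(labels):
        in_type = labels == cell_type
        mean_in = counts[in_type].mean(axis=0)    # mean expression inside the type
        mean_out = counts[~in_type].mean(axis=0)  # mean expression in all other cells
        log_fc = np.log2((mean_in + pseudocount) / (mean_out + pseudocount))
        top = np.argsort(log_fc)[::-1][:n_per_type]  # highest fold change first
        markers[str(cell_type)] = [gene_names[i] for i in top]
    return markers

# Toy scRNA-seq-like matrix: 4 cells x 3 genes (made-up values).
counts = np.array([
    [9, 0, 1],
    [8, 1, 0],
    [0, 7, 1],
    [1, 9, 0],
])
labels = ["neuron", "neuron", "astrocyte", "astrocyte"]
gene_names = ["GeneA", "GeneB", "GeneC"]

markers = select_markers(counts, labels, gene_names)
print(markers)
```

A fold-change ranking is only one possible criterion; in practice the selection also has to respect the limited gene panel size mentioned above.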

The primary objective of the tissue simulation is to validate the chosen marker genes under the constraints of smFISH experiments. The simulation accounts for potential RNA-cell assignment errors, which can arise from irregular cell shapes or RNA expression, particularly when cytoplasm staining is not feasible.

Depending on the estimated in-situ cell type accuracy, users can choose to adjust their marker gene list.
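
As a rough illustration of what "cell type accuracy" can mean here, the sketch below calls each simulated cell as the type whose marker genes have the highest summed count after RNA-cell assignment, then compares the calls to the simulated ground truth. The calling rule, the function names `call_cell_types` and `accuracy`, and the toy counts are assumptions made for this example, not the notebooks' actual implementation.

```python
import numpy as np

def call_cell_types(cell_by_gene, gene_names, markers):
    """Call each cell as the type whose markers have the highest summed count."""
    cell_by_gene = np.asarray(cell_by_gene, dtype=float)
    gene_index = {gene: i for i, gene in enumerate(gene_names)}
    types = sorted(markers)
    # One score column per cell type: total count over that type's marker genes.
    scores = np.column_stack([
        cell_by_gene[:, [gene_index[g] for g in markers[t]]].sum(axis=1)
        for t in types
    ])
    return [types[i] for i in scores.argmax(axis=1)]

def accuracy(predicted, truth):
    """Fraction of cells whose called type matches the ground truth."""
    return float(np.mean(np.asarray(predicted) == np.asarray(truth)))

gene_names = ["GeneA", "GeneB"]
markers = {"neuron": ["GeneA"], "astrocyte": ["GeneB"]}
true_types = ["neuron", "neuron", "astrocyte", "astrocyte"]

# Counts per cell after RNA-cell assignment (made-up values); the second
# neuron has picked up GeneB spots from a neighbouring cell.
assigned_counts = [
    [5, 1],
    [2, 3],
    [0, 6],
    [1, 4],
]

predicted = call_cell_types(assigned_counts, gene_names, markers)
acc = accuracy(predicted, true_types)
print(predicted, acc)
```

If the accuracy obtained this way is too low for a given cell type, that is a signal to revisit the markers chosen for it.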

Figure: workflow overview (panels A to C).

A - A list of user-provided marker genes is used to simulate smFISH images. Optionally, realistic RNA levels are drawn from scRNA-seq data. The simulation framework provides cell volumes and nuclei. The RNA-to-cell assignment can then be compared quantitatively to the known ground truth with metrics such as the Jaccard index.
B - Marker selection methods can be compared in terms of in-situ cell type accuracy and RNA-cell assignment error (Jaccard index).
C - RNA-cell assignment methods can also be compared in terms of in-situ cell type accuracy and RNA-cell assignment error (Jaccard index).
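
For reference, the Jaccard index between two sets is the size of their intersection divided by the size of their union. The sketch below applies it per cell to compare the RNA spots assigned to a cell with the spots that truly belong to it in the simulation. The data layout (a dict mapping spot id to cell id) and the function names are assumptions chosen for this example.

```python
def jaccard_index(predicted, truth):
    """Jaccard index |A & B| / |A | B| between two collections of spot ids."""
    predicted, truth = set(predicted), set(truth)
    union = predicted | truth
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(predicted & truth) / len(union)

def per_cell_jaccard(predicted_assignment, true_assignment):
    """Compute the Jaccard index for every ground-truth cell.

    Both arguments map an RNA spot id to a cell id (hypothetical layout).
    """
    def group_by_cell(assignment):
        cells = {}
        for spot, cell in assignment.items():
            cells.setdefault(cell, set()).add(spot)
        return cells

    predicted = group_by_cell(predicted_assignment)
    truth = group_by_cell(true_assignment)
    return {
        cell: jaccard_index(predicted.get(cell, set()), spots)
        for cell, spots in truth.items()
    }

# Ground truth from the simulation and one assignment result (toy values):
# spot 2 belongs to cell "c1" but was assigned to "c2".
true_assignment = {0: "c1", 1: "c1", 2: "c1", 3: "c2", 4: "c2"}
predicted_assignment = {0: "c1", 1: "c1", 2: "c2", 3: "c2", 4: "c2"}

scores = per_cell_jaccard(predicted_assignment, true_assignment)
print(scores)
```

A value of 1 means the assignment for that cell matches the ground truth exactly; lower values indicate missing or extra spots.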