DeepSpot: Leveraging Spatial Context for Enhanced Spatial Transcriptomics Prediction from H&E Images
Authors: Kalin Nonchev, Sebastian Dawo, Karina Silina, Holger Moch, Sonali Andani, Tumor Profiler Consortium, Viktor Hendrik Koelzer, and Gunnar Rätsch
The preprint is available here.
We introduce DeepSpot, a novel deep-learning model that predicts spatial transcriptomics from H&E images. DeepSpot employs a deep-set neural network to model spots as bags of sub-spots and integrates multi-level tissue details and spatial context. This integration, supported by robust pretrained H&E foundation models, significantly enhances the accuracy and granularity of gene expression predictions from H&E images.
Fig.: DeepSpot leverages pathology foundation models and spatial tissue context. Workflow of DeepSpot: H&E slides are first divided into tiles, each corresponding to a spot. For each spot, we create a bag of sub-spots by dividing it into sub-tiles that capture the local morphology, and a bag of neighboring spots to represent the global context. A pretrained pathology model extracts tile features, which are input to the model. The concatenated representations are then fed into the gene head predictor, ρgene, to predict spatial gene expression.
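The bag-of-sub-spots idea above can be illustrated with a toy NumPy sketch. Everything here is an illustrative stand-in, not the actual DeepSpot implementation: the pixel-averaging "embedder" replaces the pretrained pathology model, and the random linear layer replaces the learned gene head ρgene.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the pretrained pathology model: in DeepSpot, each tile is
# embedded by a foundation model; here we just average pixels (toy).
def embed_tiles(tiles):
    return tiles.mean(axis=(1, 2))            # (n_tiles, 3)

# One spot = a bag of sub-tiles (local morphology) plus a bag of
# neighboring spots (global context).
sub_tiles = rng.random((4, 16, 16, 3))        # 4 sub-spot tiles
neighbor_tiles = rng.random((6, 16, 16, 3))   # 6 neighboring spots

# Deep-set aggregation: permutation-invariant pooling over each bag.
local = embed_tiles(sub_tiles).mean(axis=0)         # (3,)
context = embed_tiles(neighbor_tiles).mean(axis=0)  # (3,)

# The concatenated representation is fed to the gene head (toy linear layer).
n_genes = 5
W_rho = rng.random((6, n_genes))
spot_representation = np.concatenate([local, context])  # (6,)
predicted_expression = spot_representation @ W_rho      # (n_genes,)
print(predicted_expression.shape)  # (5,)
```

Because the pooling is a mean over each bag, the prediction is invariant to the order of sub-spots and neighbors, which is the defining property of a deep-set architecture.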
git clone https://github.com/ratschlab/DeepSpot
cd DeepSpot
conda env create --file=environment.yaml
conda activate deepspot
python setup.py install
NB: Please ensure pyvips is installed in a way appropriate for your machine. We suggest installing it through conda:
conda install conda-forge::pyvips
Install the Jupyter kernel:
python -m ipykernel install --user --name deepspot --display-name "deepspot"
Please take a look at our notebook collection to get started with DeepSpot. We provide a small toy example.
- Spatial transcriptomics data preprocessing
- DeepSpot training
- DeepSpot inference
- DeepSpot inference with pretrained model
Moreover, we provide pretrained DeepSpot weights, generated during model training for our publication and used, for example, to generate spatial transcriptomics data for TCGA skin melanoma and kidney cancer slides. Download DeepSpot weights here.
Please ensure that you download the weights for the pathology foundation models and update their file paths in deepspot/utils/utils_image.py. You may need to agree to specific terms and conditions before downloading.
We publicly provide the predicted spatial transcriptomics data, comprising over 37 million spots from ~1,792 TCGA patients with melanoma or kidney cancer. You can find the data here. Please navigate to the Hugging Face dataset card for more information.
pip install datasets
from huggingface_hub import login, hf_hub_download, snapshot_download
import squidpy as sq
import pandas as pd
import scanpy as sc
login(token="YOUR HUGGINGFACE TOKEN")
# Define dataset details
repo_id = "nonchev/TCGA_digital_spatial_transcriptomics"
filename = "metadata_2025-01-11.csv"
# Create path
file_path = hf_hub_download(repo_id=repo_id, filename=filename, repo_type="dataset")
# Load metadata
metadata = pd.read_csv(file_path)
metadata.head()
dataset slide_type sample_id n_spots file_path
0 TCGA_SKCM FFPE TCGA-BF-AAP6-01Z-00-DX1.EFF1D6E1-CDBC-4401-A10... 5860 TCGA_SKCM/FFPE/TCGA-BF-AAP6-01Z-00-DX1.EFF1D6E...
1 TCGA_SKCM FFPE TCGA-FS-A1ZU-06Z-00-DX3.0C477EE6-C085-42BE-8BA... 2856 TCGA_SKCM/FFPE/TCGA-FS-A1ZU-06Z-00-DX3.0C477EE...
2 TCGA_SKCM FFPE TCGA-D9-A1X3-06Z-00-DX1.17AC16CC-5B22-46B3-B9C... 6236 TCGA_SKCM/FFPE/TCGA-D9-A1X3-06Z-00-DX1.17AC16C...
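Once loaded, the metadata can be sliced like any pandas DataFrame, e.g. to select a cohort and count its spots. The toy frame below only mimics the columns shown above with illustrative values:

```python
import pandas as pd

# Toy metadata mimicking the columns above (values are illustrative)
metadata = pd.DataFrame({
    "dataset": ["TCGA_SKCM", "TCGA_SKCM", "TCGA_KIRC"],
    "slide_type": ["FFPE", "FFPE", "FFPE"],
    "sample_id": ["sample_a", "sample_b", "sample_c"],
    "n_spots": [5860, 2856, 6236],
    "file_path": ["TCGA_SKCM/FFPE/sample_a.h5ad.gz",
                  "TCGA_SKCM/FFPE/sample_b.h5ad.gz",
                  "TCGA_KIRC/FFPE/sample_c.h5ad.gz"],
})

# Select melanoma slides and count their spots
skcm = metadata[metadata["dataset"] == "TCGA_SKCM"]
total_spots = skcm["n_spots"].sum()
print(len(skcm), total_spots)  # 2 8716
```

The resulting `file_path` values can be passed to `allow_patterns` in `snapshot_download` to fetch only the slides you need.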
local_dir = 'TCGA_data' # Change the folder path as needed
snapshot_download("nonchev/TCGA_digital_spatial_transcriptomics",
local_dir=local_dir,
allow_patterns="TCGA_SKCM/FFPE/TCGA-D9-A3Z3-06Z-00-DX1.C4820632-C64D-4661-94DD-9F27F75519C3.h5ad.gz",
repo_type="dataset")
adata = sc.read_h5ad("path/to/h5ad.gz")
sq.pl.spatial_scatter(adata,
color=["SOX10", "CD37", "COL1A1", "predicted_label"],
size=20, img_alpha=0.8, ncols=2)
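The released files are gzip-compressed AnnData objects (`.h5ad.gz`). Depending on your scanpy/anndata version, you may need to decompress them before reading. A minimal sketch using only the standard library (the file names below are hypothetical):

```python
import gzip
import shutil

# Decompress a gzip-compressed file (e.g. slide.h5ad.gz -> slide.h5ad),
# after which it can be opened with sc.read_h5ad.
def decompress_gz(src, dst):
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)

# Small round trip to demonstrate the helper on dummy bytes
with gzip.open("example.h5ad.gz", "wb") as f:
    f.write(b"example bytes")
decompress_gz("example.h5ad.gz", "example.h5ad")
with open("example.h5ad", "rb") as f:
    print(f.read())  # b'example bytes'
```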
local_dir = 'TCGA_data' # Change the folder path as needed
# Note that the full dataset is around 2TB
snapshot_download("nonchev/TCGA_digital_spatial_transcriptomics",
local_dir=local_dir,
repo_type="dataset")
NB: To distinguish in-tissue spots from the background, tiles with a mean RGB value above 200 (near white) were discarded. Additional preprocessing can remove potential image artifacts.
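The threshold described above can be sketched as a simple tile filter (the function name and tile shapes are illustrative):

```python
import numpy as np

# Keep a tile only if its mean RGB value is at most 200; near-white tiles
# (mean above 200) are treated as background, as described above.
def is_tissue_tile(tile, threshold=200):
    return tile.mean() <= threshold

white_tile = np.full((32, 32, 3), 250, dtype=np.uint8)   # background
tissue_tile = np.full((32, 32, 3), 120, dtype=np.uint8)  # stained tissue

print(is_tissue_tile(white_tile), is_tissue_tile(tissue_tile))  # False True
```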
If you find our work useful, please consider citing us:
@article{nonchev2025deepspot,
title={DeepSpot: Leveraging Spatial Context for Enhanced Spatial Transcriptomics Prediction from H\&E Images},
author={Nonchev, Kalin and Dawo, Sebastian and Silina, Karina and Moch, Holger and Andani, Sonali and Tumor Profiler Consortium and Koelzer, Viktor H and Raetsch, Gunnar},
journal={medRxiv},
pages={2025--02},
year={2025},
publisher={Cold Spring Harbor Laboratory Press}
}
The code for reproducing the paper results can be found here.
If you have questions, please get in touch with Kalin Nonchev.