
Generating citations for Census slices


This notebook demonstrates how to generate a citation string for all datasets contained in a Census slice.


Contents

  1. Requirements

  2. Generating citation strings

    1. Via cell metadata query

    2. Via an AnnData query

⚠️ Note that the Census RNA data includes duplicate cells that are present across multiple datasets. Duplicate cells can be filtered in or out using the cell metadata variable is_primary_data, which is described in the Census schema.
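
The note above can be turned into a query filter. The following is a minimal sketch, assuming the filter is passed as the value_filter (or obs_value_filter) string used in the queries later in this notebook. The helper name `primary_only` is hypothetical and not part of cellxgene_census; it only builds the filter string.

```python
def primary_only(value_filter: str = "") -> str:
    """Return a filter string that also requires is_primary_data to be True.

    Hypothetical helper, not part of cellxgene_census: it only builds the
    string that would be passed as value_filter / obs_value_filter.
    """
    condition = "is_primary_data == True"
    if not value_filter.strip():
        return condition
    # Parenthesize the original filter so its own and/or logic is preserved.
    return f"({value_filter}) and {condition}"


print(primary_only("tissue == 'cardiac atrium'"))
```

Restricting a slice to primary cells this way would presumably avoid counting the same cell twice, but it may also change which datasets appear in the resulting citation list.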


Requirements


This notebook requires:

  • cellxgene_census Python package.

  • Census data release with schema version 1.3.0 or greater.

Generating citation strings


First, we open a handle to the Census data. To ensure we open a data release with schema version 1.3.0 or greater, we use census_version="latest".

[1]:

import cellxgene_census

census = cellxgene_census.open_soma(census_version="latest")
census["census_info"]["summary"].read().concat().to_pandas()

[1]:

   soma_joinid                       label       value
0            0       census_schema_version       1.3.0
1            1           census_build_date  2024-01-01
2            2      dataset_schema_version       4.0.0
3            3            total_cell_count    75694072
4            4           unique_cell_count    45846761
5            5  number_donors_homo_sapiens       16292
6            6  number_donors_mus_musculus        2153

Then we load the dataset table, which contains a "citation" column with one entry for each dataset included in Census.

[2]:

datasets = census["census_info"]["datasets"].read().concat().to_pandas()
datasets["citation"]

[2]:

0      Dataset Version: https://datasets.cellxgene.cz...
1      Dataset Version: https://datasets.cellxgene.cz...
2      Dataset Version: https://datasets.cellxgene.cz...
3      Dataset Version: https://datasets.cellxgene.cz...
4      Publication: https://doi.org/10.1002/ctm2.1356...
                             ...
695    Publication: https://doi.org/10.1038/s41586-02...
696    Publication: https://doi.org/10.1038/s41586-02...
697    Publication: https://doi.org/10.1016/j.isci.20...
698    Publication: https://doi.org/10.1371/journal.p...
699    Publication: https://doi.org/10.1016/j.isci.20...
Name: citation, Length: 700, dtype: object

Now we can use the "dataset_id" column, which is present in both the dataset table and the Census cell metadata, to create citation strings for any Census slice.


Via cell metadata query

[3]:

# Query cell metadata
cell_metadata = census["census_data"]["homo_sapiens"].obs.read(
    value_filter="tissue == 'cardiac atrium'", column_names=["dataset_id", "cell_type"]
)
cell_metadata = cell_metadata.concat().to_pandas()

# Get a citation string for the slice
slice_datasets = datasets[datasets["dataset_id"].isin(cell_metadata["dataset_id"])]
print(*slice_datasets["citation"], sep="\n\n")

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/4866a804-37eb-436f-8c87-9cd585260061.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/bfd80f12-725c-4482-ad7f-1ed2b4909b0d.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/e6df8a57-f54f-413a-9d4d-dee03294d778.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/8d599205-5c51-4b50-9d48-3dec31238587.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/f6065c51-bd26-4aa5-a05d-2805aeea48d9.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/8cdbf790-4d29-4f46-9aef-21adfb2e21da.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5
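
The slice above can also be summarized per dataset. The following is a minimal sketch, not part of the original notebook, that counts how many cells of a slice come from each dataset and pairs that count with the dataset's citation. The function name `citations_with_cell_counts` and the toy IDs are hypothetical; only the standard library is used so the logic can be checked without a Census connection.

```python
from collections import Counter


def citations_with_cell_counts(cell_dataset_ids, citation_by_dataset):
    """Pair each dataset's citation with the number of slice cells it contributes.

    cell_dataset_ids: iterable with one dataset_id per cell in the slice.
    citation_by_dataset: mapping of dataset_id -> citation string.
    Returns (citation, n_cells) tuples, largest contribution first.
    """
    counts = Counter(cell_dataset_ids)
    return [
        (citation_by_dataset[dataset_id], n_cells)
        for dataset_id, n_cells in counts.most_common()
    ]


# Toy data for illustration only; the IDs and citations are placeholders.
toy_citations = {"ds-a": "Citation for ds-a", "ds-b": "Citation for ds-b"}
toy_cell_ids = ["ds-a", "ds-b", "ds-a", "ds-a"]

for citation, n_cells in citations_with_cell_counts(toy_cell_ids, toy_citations):
    print(f"{n_cells} cells: {citation}")
```

With the objects created in this notebook, the inputs would presumably be `cell_metadata["dataset_id"]` and `dict(zip(datasets["dataset_id"], datasets["citation"]))`.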

Via AnnData query

[4]:

# Fetch an AnnData object
adata = cellxgene_census.get_anndata(
    census=census,
    organism="homo_sapiens",
    measurement_name="RNA",
    obs_value_filter="tissue == 'cardiac atrium'",
    var_value_filter="feature_name == 'MYBPC3'",
    column_names={"obs": ["dataset_id", "cell_type"]},
)

# Get a citation string for the slice
slice_datasets = datasets[datasets["dataset_id"].isin(adata.obs["dataset_id"])]
print(*slice_datasets["citation"], sep="\n\n")

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/4866a804-37eb-436f-8c87-9cd585260061.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/bfd80f12-725c-4482-ad7f-1ed2b4909b0d.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/e6df8a57-f54f-413a-9d4d-dee03294d778.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/8d599205-5c51-4b50-9d48-3dec31238587.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/f6065c51-bd26-4aa5-a05d-2805aeea48d9.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5

Publication: https://doi.org/10.1126/science.abl4896 Dataset Version: https://datasets.cellxgene.cziscience.com/8cdbf790-4d29-4f46-9aef-21adfb2e21da.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5
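
All six citations printed above share one publication, so it can help to group dataset versions under it. The following is a sketch, assuming every citation string follows the layout visible in the output: an optional "Publication:" field, then "Dataset Version:" and "Collection:". The pattern and the helper `group_by_publication` are illustrative assumptions, not part of cellxgene_census, and the layout may differ in other Census releases.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the printed citations; "Publication:" is optional.
CITATION_PATTERN = re.compile(
    r"(?:Publication: (?P<publication>\S+) )?"
    r"Dataset Version: (?P<dataset_version>\S+)"
    r".*?Collection: (?P<collection>\S+)"
)


def group_by_publication(citations):
    """Map each publication URL to the dataset-version URLs that cite it."""
    grouped = defaultdict(list)
    for citation in citations:
        match = CITATION_PATTERN.match(citation)
        if match is None:
            # Keep unparseable strings visible rather than dropping them silently.
            grouped["(unparsed)"].append(citation)
            continue
        publication = match["publication"] or "(no publication listed)"
        grouped[publication].append(match["dataset_version"])
    return dict(grouped)


# Two of the citation strings printed above, rebuilt from their parts.
COLLECTION = "https://cellxgene.cziscience.com/collections/e5f58829-1a66-40b5-a624-9046778e74f5"
example_citations = [
    "Publication: https://doi.org/10.1126/science.abl4896 "
    "Dataset Version: https://datasets.cellxgene.cziscience.com/4866a804-37eb-436f-8c87-9cd585260061.h5ad "
    f"curated and distributed by CZ CELLxGENE Discover in Collection: {COLLECTION}",
    "Publication: https://doi.org/10.1126/science.abl4896 "
    "Dataset Version: https://datasets.cellxgene.cziscience.com/bfd80f12-725c-4482-ad7f-1ed2b4909b0d.h5ad "
    f"curated and distributed by CZ CELLxGENE Discover in Collection: {COLLECTION}",
]

grouped = group_by_publication(example_citations)
for publication, versions in grouped.items():
    print(publication)
    for version in versions:
        print("  ", version)
```

With the notebook's objects, the input would presumably be `slice_datasets["citation"]`.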

Finally, don't forget to close the Census handle.

[6]:

census.close()