How can I generate a UMAP like this (cluster centroids)? Can I also change the format and position of this figure? #80

melancholy12 · 2022-10-16T01:55:12Z

parashardhapola · 2022-10-18T16:18:51Z

Sorry for the slow reply on this. This functionality doesn't exist in Scarf yet, but the code that accompanied the manuscript does include a function for drawing this plot.

But to simplify things, I have created the following function that you can run out of the box. I will bake this into Scarf in the next release (lots of good things coming there).

Just copy and paste the function below and then call it like this:

plot_centroid_umap(ds)
where ds is the Scarf DataStore.

def plot_centroid_umap(
    ds,
    group_key: str = "RNA_leiden_cluster",
    layout_key: str = "RNA_UMAP",
    show_edges: bool = True,
    padding: float = 1.25,
    cluster_size_multiplier: float = 1.0,
    edge_width_multiplier: float = 100.0,
    cmap: str = None,
    custom_color_key = None,
    save_name: str = None
):
    
    import pandas as pd
    import matplotlib.pyplot as plt
    from matplotlib.collections import LineCollection
    from scarf.plots import _scatter_make_colors
    
    c_x = f"{layout_key}1"
    c_y = f"{layout_key}2"
    
    df = ds.cells.to_pandas_dataframe(
        key="I",
        columns=[c_x, c_y, group_key]
    ).groupby(group_key).mean()
    clusts = ds.cells.fetch(group_key)
    df["cluster_size"] = pd.Series(clusts).value_counts() * cluster_size_multiplier
    
    # Fix group colours as per Scarf's `plot_layout` logic
    cmap = _scatter_make_colors(
        df.index.astype("category"),
        cmap,
        custom_color_key, None, None
    )[1]
    colors = [cmap[x] for x in df.index]

    fig, ax = plt.subplots(1, 1, figsize=(4,4))
    ax.scatter(
        df[c_x],
        df[c_y],
        s=df.cluster_size,
        lw=1,
        edgecolors='k',
        c=colors,
    )
    
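    # Pad the axis limits so that the blobs are not truncated. Scaling the
    # min/max directly would shrink the view when a limit is on the wrong
    # side of zero, so the margin is (padding - 1) times the data range.
    x_pad = (df[c_x].max() - df[c_x].min()) * (padding - 1)
    y_pad = (df[c_y].max() - df[c_y].min()) * (padding - 1)
    ax.set_xlim(df[c_x].min() - x_pad, df[c_x].max() + x_pad)
    ax.set_ylim(df[c_y].min() - y_pad, df[c_y].max() + y_pad)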
    
    # Draw edges between centroids. For each cluster, the width of an edge is
    # the fraction of that cluster's KNN-graph edges that end in the other cluster
    if show_edges:
        g = ds.load_graph()
        lines = []
        widths = []
        for i in set(clusts):
            v = pd.Series(
                clusts[g[clusts==i].tocoo().col]
            ).value_counts()
            v = v/v.sum()
            for j,k in v.to_dict().items():
                if j == i:
                    continue
                lines.append([
                    (df[c_x][i], df[c_y][i]),
                    (df[c_x][j], df[c_y][j])
                ])
                widths.append((edge_width_multiplier*k))
    
        lc = LineCollection(lines, linewidths=widths, color='k', zorder=0)
        ax.add_collection(lc)
    
    if save_name is not None:
        plt.savefig(save_name, dpi=300)
        
    plt.show()
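
To see what the edge widths represent without loading a DataStore: for each cluster, the loop above collects every KNN edge that starts in that cluster and computes the fraction that ends in each other cluster. Below is a minimal sketch of that calculation in plain Python on toy data. The helper name edge_fractions, the labels, and the neighbour lists are all made up for illustration; in the real function the neighbours come from ds.load_graph().

```python
from collections import Counter


def edge_fractions(neighbours, labels):
    """For each cluster, return the fraction of its outgoing KNN edges
    that end in every other cluster. Edges within a cluster count towards
    the total but are left out of the result, as in plot_centroid_umap."""
    counts = {}
    for cell, nbrs in enumerate(neighbours):
        src = labels[cell]
        counts.setdefault(src, Counter()).update(labels[n] for n in nbrs)
    fractions = {}
    for src, counter in counts.items():
        total = sum(counter.values())
        fractions[src] = {
            dst: n / total for dst, n in counter.items() if dst != src
        }
    return fractions


# Hypothetical toy data: 6 cells, 2 neighbours each, 2 clusters
labels = [1, 1, 1, 2, 2, 2]
neighbours = [
    [1, 2],  # cell 0 (cluster 1) -> clusters 1, 1
    [0, 2],  # cell 1 (cluster 1) -> clusters 1, 1
    [1, 3],  # cell 2 (cluster 1) -> clusters 1, 2
    [4, 5],  # cell 3 (cluster 2) -> clusters 2, 2
    [3, 5],  # cell 4 (cluster 2) -> clusters 2, 2
    [3, 2],  # cell 5 (cluster 2) -> clusters 2, 1
]
fractions = edge_fractions(neighbours, labels)
print(fractions)
```

Here one of the six edges leaving cluster 1 ends in cluster 2, so the fraction is 1/6. With the default edge_width_multiplier of 100 that would correspond to a line width of roughly 16.7, which is why lowering the multiplier helps when clusters are heavily interconnected.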

Pro-Tip: To copy the code, click the icon in the top-right corner when you hover over the code block.
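
On the format question: the function passes save_name straight to plt.savefig, and Matplotlib infers the output format from the file extension, so a save_name ending in .pdf or .svg should give vector output. The figure size is fixed at figsize=(4,4) inside the function, so changing it means editing that line. The following standalone sketch shows both knobs on made-up points; the file names and the temporary output directory are assumptions for illustration, and nothing here touches Scarf.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this also runs headless
import matplotlib.pyplot as plt

out_dir = tempfile.mkdtemp()

# A wider canvas than the 4x4 inches used in plot_centroid_umap
fig, ax = plt.subplots(1, 1, figsize=(6, 4))
ax.scatter([0, 1, 2], [0, 1, 0], s=[200, 400, 100], edgecolors="k", lw=1)

# The file extension decides the output format
saved = []
for ext in ("png", "pdf", "svg"):
    path = os.path.join(out_dir, f"centroids.{ext}")
    fig.savefig(path, dpi=300)
    saved.append(path)
plt.close(fig)
```

Note that dpi only affects raster output such as PNG; for PDF and SVG the shapes stay sharp at any zoom level.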

Expected output:

The UMAP for this data looked like this:

Hope this helps, and feel free to reach out if you face an issue.

/PD

parashardhapola self-assigned this Oct 18, 2022

parashardhapola added the enhancement New feature or request label Oct 18, 2022
