How can I generate a UMAP like this (cluster centroids)? Can I also change the format and position of this figure? #80

Open
melancholy12 opened this issue Oct 16, 2022 · 1 comment

@melancholy12

image

@parashardhapola parashardhapola self-assigned this Oct 18, 2022
@parashardhapola parashardhapola added the enhancement New feature or request label Oct 18, 2022
@parashardhapola
Owner

parashardhapola commented Oct 18, 2022

Hi @melancholy12

Sorry for the slow reply on this. This functionality doesn't exist in Scarf yet, but the function used to draw this figure is included in the code that accompanied the manuscript.

But to simplify things, I have created the following function that you can run out of the box. I will bake this into Scarf in the next release (lots of good things coming there).

Just copy and paste the function below and then call it like this:

plot_centroid_umap(ds)
where ds is your Scarf DataStore.

def plot_centroid_umap(
    ds,
    group_key: str = "RNA_leiden_cluster",
    layout_key: str = "RNA_UMAP",
    show_edges: bool = True,
    padding: float = 1.25,
    cluster_size_multiplier: float = 1.0,
    edge_width_multiplier: float = 100.0,
    cmap: str = None,
    custom_color_key = None,
    save_name: str = None
):
    
    import pandas as pd
    import matplotlib.pyplot as plt
    from matplotlib.collections import LineCollection
    from scarf.plots import _scatter_make_colors
    
    c_x = f"{layout_key}1"
    c_y = f"{layout_key}2"
    
    df = ds.cells.to_pandas_dataframe(
        key="I",
        columns=[c_x, c_y, group_key]
    ).groupby(group_key).mean()
    clusts = ds.cells.fetch(group_key)
    df["cluster_size"] = pd.Series(clusts).value_counts() * cluster_size_multiplier
    
    # Fix group colours as per Scarf's `plot_layout` logic
    cmap = _scatter_make_colors(
        df.index.astype("category"),
        cmap,
        custom_color_key, None, None
    )[1]
    colors = [cmap[x] for x in df.index]

    fig, ax = plt.subplots(1, 1, figsize=(4,4))
    ax.scatter(
        df[c_x],
        df[c_y],
        s=df.cluster_size,
        lw=1,
        edgecolors='k',
        c=colors,
    )
    
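    # Give the plot a bit of padding so that the blobs are not truncated.
    # The margin is a fraction (padding - 1) of the data range on each side,
    # so it also works when the coordinates do not straddle zero.
    x_pad = (df[c_x].max() - df[c_x].min()) * (padding - 1)
    y_pad = (df[c_y].max() - df[c_y].min()) * (padding - 1)
    ax.set_xlim(df[c_x].min() - x_pad, df[c_x].max() + x_pad)
    ax.set_ylim(df[c_y].min() - y_pad, df[c_y].max() + y_pad)
    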
    
    # For each cluster, compute the fraction of its KNN graph edges that connect to each other cluster
    if show_edges:
        g = ds.load_graph()
        lines = []
        widths = []
        for i in set(clusts):
            v = pd.Series(
                clusts[g[clusts==i].tocoo().col]
            ).value_counts()
            v = v/v.sum()
            for j,k in v.to_dict().items():
                if j == i:
                    continue
                lines.append([
                    (df[c_x][i], df[c_y][i]),
                    (df[c_x][j], df[c_y][j])
                ])
                widths.append((edge_width_multiplier*k))
    
        lc = LineCollection(lines, linewidths=widths, color='k', zorder=0)
        ax.add_collection(lc)
    
    if save_name is not None:
        plt.savefig(save_name, dpi=300)
        
    plt.show()
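
To make the computation inside the function easier to follow, here is a minimal sketch of its two core steps on toy data, without Scarf: the per-cluster centroids and sizes, and the fraction of each cluster's KNN edges that land in another cluster (used as line widths). The six-cell table, the column names UMAP1/UMAP2/cluster, the unweighted toy graph and the helper name edge_fractions are all assumptions made for illustration; they are not part of Scarf.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

# Toy data (hypothetical): six cells in two clusters with 2-D layout coordinates.
cells = pd.DataFrame({
    "UMAP1": [0.0, 1.0, 2.0, 10.0, 11.0, 12.0],
    "UMAP2": [0.0, 0.0, 3.0, 5.0, 5.0, 8.0],
    "cluster": [1, 1, 1, 2, 2, 2],
})
clusts = cells["cluster"].to_numpy()

# Centroid of each cluster = mean layout coordinate of its cells.
centroids = cells.groupby("cluster").mean()
# Blob size = number of cells per cluster (aligned on the cluster label).
centroids["cluster_size"] = pd.Series(clusts).value_counts()

# Toy KNN graph as a sparse adjacency matrix: row = cell, column = neighbour.
rows = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
cols = np.array([1, 2, 0, 2, 1, 3, 4, 5, 3, 5, 4, 2])
graph = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(6, 6))


def edge_fractions(graph, clusts):
    """Return {(i, j): fraction of cluster i's edges that end in cluster j}."""
    out = {}
    for i in np.unique(clusts):
        # Cluster labels of all neighbours of the cells in cluster i.
        neighbour_clusts = clusts[graph[clusts == i].tocoo().col]
        frac = pd.Series(neighbour_clusts).value_counts(normalize=True)
        for j, k in frac.items():
            if j != i:  # skip within-cluster edges, as the function above does
                out[(int(i), int(j))] = float(k)
    return out


fractions = edge_fractions(graph, clusts)
print(centroids)
print(fractions)
```

In the function above, each of these fractions is multiplied by edge_width_multiplier to get the width of the line drawn between the two centroids.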

Pro-Tip: To copy the code, click the icon in the top-right corner when you hover over the code block.

Expected output:
image

The UMAP for this data looked like this:
image
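
On the format and position part of the question: the function as posted creates its own 4x4 inch figure and passes save_name to plt.savefig, which infers the file format from the extension (for example .pdf or .svg). Placing the panel next to other plots would need the drawing code to accept an existing Axes. The following is only a sketch of that idea using plain matplotlib on made-up centroid values; the helper draw_centroids and its arguments are hypothetical and not part of Scarf.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical centroid values: x, y and blob size for three clusters.
xs, ys, sizes = [0.0, 4.0, 2.0], [0.0, 1.0, 5.0], [300, 150, 80]


def draw_centroids(ax, xs, ys, sizes):
    """Hypothetical helper: draw centroid blobs on an existing Axes."""
    ax.scatter(xs, ys, s=sizes, lw=1, edgecolors="k")
    ax.margins(0.25)  # keep the blobs away from the frame
    ax.set_xticks([])
    ax.set_yticks([])
    return ax


# Position: put the centroid panel in the right-hand slot of a 1x2 figure.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 4))
ax_left.set_title("full UMAP (placeholder)")
draw_centroids(ax_right, xs, ys, sizes)
ax_right.set_title("cluster centroids")

# Format: savefig takes the format from the file extension or from `format=`.
buf = io.BytesIO()
fig.savefig(buf, format="pdf", dpi=300)
pdf_bytes = buf.getvalue()
plt.close(fig)
```

The same approach could be applied to plot_centroid_umap by replacing its plt.subplots call with an Axes passed in by the caller, though that is a modification of the posted function rather than something it supports as written.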

Hope this helps, and feel free to reach out if you face an issue.

/PD
