How to build the symphony reference if the reference has no clear batch effect? #9

Closed
Echo226 opened this issue May 13, 2021 · 2 comments

Echo226 commented May 13, 2021

Hi Symphony team,

Thanks for developing this great tool.
I have a question about building the Symphony reference. What if I have a reference dataset with no clear batch effect (the cells do not cluster by donor or technology)? How can I use Symphony to build this reference? There is a RunHarmony step before building the Symphony compression, but I am not sure whether running Harmony on a dataset without batch effects is good practice.

For example, I have a study containing 3 batches. I have done unsupervised clustering and used marker genes to annotate the cells in batch1, and now I want to transfer the labels from batch1 to the remaining 2 batches. How would you suggest using Symphony in this case?

Thanks in advance for your reply.
Xinting

Collaborator

joycekang commented May 13, 2021

Hi Xinting,

Good question. There are 2 quick fixes:

(1) You can run buildReference with vars=NULL to skip batch correction. In this case, Symphony defines the soft clusters for the reference mixture model using soft k-means (with cosine distance). This option is already implemented.

(2) If you don't want to run buildReference from scratch (e.g. because you already have the PCA embedding for your batch1, called Z_pca_ref), you can run the following code to build a Symphony reference piece by piece. As long as the reference components are named correctly, Symphony mapping should work.

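# Assumes metadata_ref, s (the PCA/SVD result), vargenes_means_sds, Z_pca_ref, and K (number of soft clusters) already exist
reference = list(meta_data = metadata_ref) # initialize reference as a list with metadata slot
reference$loadings = s$u # add loadings from PCA
reference$vargenes = vargenes_means_sds # add variable gene info
clust_res <- symphony:::soft_kmeans(Z_pca_ref, K) # soft k-means on the PCA embedding
reference$centroids <- clust_res$Y
reference$R <- clust_res$R
reference$Z_orig <- Z_pca_ref
reference$Z_corr <- Z_pca_ref # no batch correction
# Use the slots set above (there is no `res` object here); ::: resolves the function whether or not it is exported
reference$cache = symphony:::compute_ref_cache(reference$R, reference$Z_corr)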
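
In case it helps, here is a minimal base-R sketch of how the inputs used above (s, vargenes_means_sds, Z_pca_ref) might be prepared. It is an illustration under assumptions, not Symphony's own pipeline: the data are synthetic, base svd() stands in for whatever PCA routine you used, the top-variance gene pick stands in for a proper variable-gene selection, and the symbol/mean/stddev column names are my assumption about what the vargenes slot should contain.

```r
set.seed(0)

# Synthetic counts (genes x cells); replace with your batch1 expression matrix.
n_genes <- 200
n_cells <- 60
d <- 5  # number of PCs to keep (hypothetical choice)
exp_ref <- matrix(
  rpois(n_genes * n_cells, lambda = 3),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)

# Library-size normalize each cell to 10,000 counts, then log1p.
exp_norm <- log1p(sweep(exp_ref, 2, colSums(exp_ref), "/") * 1e4)

# Stand-in for variable-gene selection: keep the 50 highest-variance genes.
gene_var <- apply(exp_norm, 1, var)
var_genes <- names(sort(gene_var, decreasing = TRUE))[1:50]
exp_var <- exp_norm[var_genes, ]

# Per-gene mean and standard deviation; the same values would be needed
# later to scale query data consistently with the reference.
vargenes_means_sds <- data.frame(
  symbol = var_genes,
  mean = rowMeans(exp_var),
  stddev = apply(exp_var, 1, sd),
  stringsAsFactors = FALSE
)

# Scale each gene (row) to mean 0 and sd 1; the vectors recycle down rows.
exp_scaled <- (exp_var - vargenes_means_sds$mean) / vargenes_means_sds$stddev

# PCA via SVD: s$u holds gene loadings (genes x d).
s <- svd(exp_scaled, nu = d, nv = d)

# Cell embedding, PCs x cells; svd() returns all singular values, so subset.
Z_pca_ref <- diag(s$d[1:d]) %*% t(s$v)
```

With these objects in hand, the piece-by-piece code above only additionally needs metadata_ref (one row per cell, in the same order as the columns of Z_pca_ref) and K.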

That should be able to get you through until we can implement a better solution.
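
For option (1) applied to the batch1-to-other-batches label transfer described in the question, a rough sketch could look like the following. The wrapper name transfer_labels and its arguments are hypothetical; it assumes an installed symphony version that provides buildReference, mapQuery, and knnPredict with the arguments used here, so please check them against your version's documentation.

```r
# Hypothetical helper: build a reference from batch1 without batch correction,
# map the remaining batches onto it, and transfer labels by k-NN.
transfer_labels <- function(exp_ref, metadata_ref,
                            exp_query, metadata_query,
                            label_col, K = 100, k = 5, query_vars = NULL) {
  # vars = NULL skips Harmony; soft clusters come from soft k-means.
  reference <- symphony::buildReference(
    exp_ref, metadata_ref,
    vars = NULL,
    K = K,
    do_umap = FALSE
  )

  # query_vars can name a metadata column (e.g. a batch column) if the
  # query batches should be corrected against each other during mapping.
  query <- symphony::mapQuery(
    exp_query, metadata_query, reference,
    vars = query_vars,
    do_umap = FALSE
  )

  # Predict a label for each query cell from its nearest reference cells.
  query <- symphony::knnPredict(
    query, reference,
    train_labels = reference$meta_data[[label_col]],
    k = k
  )

  list(reference = reference, query = query)
}
```

Here exp_ref and metadata_ref would hold batch1 (with the annotations in the column named by label_col), and exp_query and metadata_query would hold batches 2 and 3 combined; the predicted labels should then appear in the metadata of the returned query object.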

Author

Echo226 commented Jun 28, 2021

Hi joy,

Thanks for your reply. This solves my question!

Best regards,
Xinting
