MAJOR BUG: reanalyze diffExp may discard significantly differentially expressed genes in the csv output #164
Comments
Hi, we're sorry that there is no such parameter at present. Because highly variable genes (3000 by default) are selected before differential analysis, if you want information about all genes, you can read the h5ad file in the directory after the diffExp analysis. For example, uns['DE_1_marker_genes']['mean_count'] in the h5ad file stores the average expression of all genes in the different regions.
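A minimal sketch of reading that entry, assuming the `anndata` package is installed and that the key names match the ones quoted above. The helper names `get_mean_count` and `load_mean_count` are hypothetical, and the type of the stored object (table, array, or mapping) is not stated in this thread, so the sketch returns it unchanged.

```python
from collections.abc import Mapping


def get_mean_count(uns: Mapping, de_key: str = "DE_1_marker_genes"):
    """Return the 'mean_count' entry stored under `de_key` in an AnnData `.uns` mapping."""
    if de_key not in uns:
        # List what is actually there, since the key may differ between runs.
        available = ", ".join(sorted(str(k) for k in uns))
        raise KeyError(f"{de_key!r} not found in uns; available keys: {available}")
    return uns[de_key]["mean_count"]


def load_mean_count(h5ad_path: str, de_key: str = "DE_1_marker_genes"):
    """Open an h5ad file and return its mean_count entry (path is supplied by the caller)."""
    # Imported lazily so get_mean_count() can be used without anndata installed.
    import anndata

    adata = anndata.read_h5ad(h5ad_path)
    return get_mean_count(adata.uns, de_key)
```

If the key is missing, the error message lists the keys that are present in `uns`, which helps when the DE result is stored under a different index.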
3000 appears to be somewhat arbitrary? If I have used the lasso tool to delineate 5 regions, how do I obtain the gene counts for each region? In particular, is there documentation on the data structure of the h5ad file? Thanks!
Hi, here is the code to get the gene count for each region
Hi, 3000 is just the default value in SAW. Based on spatial omics analysis software such as scanpy and on tests with real data, 3000 highly variable genes can be used for downstream feature analysis and reduce the computational load. If users want to adjust the analysis parameters, we encourage them to use Stereopy for customized analysis. If you want to obtain the gene expression levels in different regions, you could export lasso.geojson in StereoMap, and then use
h5ad is the AnnData format, which records the gene expression, spatial coordinates, clustering results and other information of all spots. You could refer to the following URL
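As a rough illustration of the lasso.geojson route, the sketch below sums gene counts per lassoed region in plain Python. It is not the SAW or Stereopy implementation. It assumes the exported file is a standard GeoJSON FeatureCollection of Polygon features, that each feature may carry a `label` property (a hypothetical name), and that spots are available as `(x, y, gene, count)` tuples in the same coordinate system as the polygons.

```python
import json
from collections import defaultdict


def point_in_ring(x, y, ring):
    """Ray-casting test: is (x, y) inside the ring of [x, y] vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i][0], ring[i][1]
        x2, y2 = ring[(i + 1) % n][0], ring[(i + 1) % n][1]
        # Toggle for every edge crossed by a horizontal ray going right from the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def load_regions(geojson_text):
    """Map a region label to the outer ring of every Polygon feature."""
    collection = json.loads(geojson_text)
    regions = {}
    for idx, feature in enumerate(collection["features"]):
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # this sketch handles plain polygons only
        # "label" is an assumed property name; fall back to a positional name.
        label = (feature.get("properties") or {}).get("label", f"region_{idx}")
        regions[label] = geometry["coordinates"][0]
    return regions


def counts_per_region(spots, regions):
    """Sum counts per gene within each region; spots are (x, y, gene, count) tuples."""
    totals = {label: defaultdict(int) for label in regions}
    for x, y, gene, count in spots:
        for label, ring in regions.items():
            if point_in_ring(x, y, ring):
                totals[label][gene] += count
    return {label: dict(genes) for label, genes in totals.items()}
```

Holes inside a polygon and MultiPolygon features are ignored here, and the loop over all spots and regions is quadratic, so a spatial index would be preferable for full-size chips.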
Ok thanks!
@melop Would you mind providing the tissue.gef and geojson files for us to troubleshoot?
I sent these files to your email, please have a look, thanks!
Hi, the SAW pipeline by default runs differential analysis on only the 3,000 highly variable genes. The genes you reported are not among the highly variable genes and are therefore not included in the differential expression analysis. This is a shortcoming of the SAW pipeline. We will discuss and correct it as soon as possible. In the meantime, users can use Stereopy to adjust the parameters for DEG analysis. Thank you for your suggestion, and sorry for the inconvenience.
Hello, after running saw reanalyze diffExp, I only saw 3000 genes in the output file *.find_marker_genes.csv. Is there a way to make it output all genes even if no differential expression is found? This would be important for some downstream analyses like GO enrichment etc.
Thanks!