add new features #30
Comments
1. Updated data import functionality: Replace the hard-coded file location with, e.g., cellxgene (search over metadata and download datasets by ID).
2. Ask dataframe agent: Functionality to ask questions of a dataframe.
3. Omics data QC: Add filtering based on QC metrics such as mitochondrial gene counts, doublets, number of genes, etc.
4. Batch effects correction and data visualization: Possibly call Seurat for batch correction and UMAP.
6. Plot UMAP for tissue types: Unify the plotting by cell type or by tissue type. In general, one should be able to plot a UMAP on whatever dimension they are interested in.
7a. Cell annotation: Option 1: the user provides a custom gene list; Option 2: use a database, e.g., ImmuCellAI or CIBERSORT.
7b. Cell discovery: Clustering and detection of clusters via a defined metric.
12. Cell compositional analysis: Quantified proportions of cell types in a tissue or region of interest.
14. Metabolic modeling: Integration with GRN to constrain a metabolic model, with applications in cancer and in immunological and inflammatory diseases, etc.
15. Multi-omics data integration
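As a rough illustration of the QC filtering described in item 3, here is a minimal pure-Python sketch. The `CellQC` record, the `passes_qc` and `filter_cells` helpers, and all threshold values are hypothetical and are not taken from addon.py; in practice the per-cell metrics would come from whatever single-cell toolkit the project uses.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC metrics (hypothetical record, not part of addon.py)."""
    barcode: str
    n_genes: int          # number of genes detected in the cell
    total_counts: int     # total counts across all genes
    mito_counts: int      # counts assigned to mitochondrial genes
    doublet_score: float  # score from any doublet detector, assumed in [0, 1]


def passes_qc(cell, min_genes=200, max_genes=6000,
              max_mito_frac=0.2, max_doublet_score=0.5):
    """Return True if the cell satisfies every threshold.

    The default thresholds are placeholders for illustration only.
    """
    if cell.total_counts == 0:
        return False  # empty droplet: nothing to evaluate
    mito_frac = cell.mito_counts / cell.total_counts
    return (
        min_genes <= cell.n_genes <= max_genes
        and mito_frac <= max_mito_frac
        and cell.doublet_score <= max_doublet_score
    )


def filter_cells(cells, **thresholds):
    """Keep only the cells that pass QC, preserving input order."""
    return [c for c in cells if passes_qc(c, **thresholds)]
```

Keeping the thresholds as keyword arguments would let the agent expose them to the user instead of hard-coding them.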
Prerequisites
Background
The script addon.py already has functions for preprocessing, normalization, clustering, cell type prediction, gene lists for cells/tissues, gene expression per cell type, DEG per tissue type, and UMAP plotting for cell types.
Suggested features
Adding the following features will complete the omics agent task:
1. Data import module
2. CSV reader function for agent
3. Omics data QC
4. Batch effects correction
5. DEG for cell types
6. Plot UMAP for tissue types
7. Cell annotation and cell discovery
8. Lineage/trajectory inference analysis
9. Functional enrichment, ontology and pathway analysis
10. Gene regulatory networks
11. Cell-cell communication
12. Cell compositional analysis
13. Gene perturbation modeling
14. Metabolic modeling
15. Multi-omics data integration
16. Method-specific features (e.g. TFBS from scATAC-seq, immune receptors)
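To make item 12 (cell compositional analysis) concrete, the following is a minimal sketch that computes the proportion of each cell type per tissue from per-cell labels. The function name `cell_type_proportions` and the `(tissue, cell_type)` input format are assumptions made for illustration; they are not part of addon.py.

```python
from collections import Counter, defaultdict


def cell_type_proportions(labels):
    """Compute per-tissue cell type proportions.

    labels: iterable of (tissue, cell_type) pairs, one pair per cell.
    Returns a dict mapping tissue -> {cell_type: fraction of cells}.
    """
    counts = defaultdict(Counter)
    for tissue, cell_type in labels:
        counts[tissue][cell_type] += 1

    proportions = {}
    for tissue, type_counts in counts.items():
        total = sum(type_counts.values())
        proportions[tissue] = {
            cell_type: n / total for cell_type, n in type_counts.items()
        }
    return proportions
```

The same grouping could be done on a region of interest instead of a tissue by passing a different first element in each pair.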
Next steps
1. Unit testing
2. Connect these tasks with a LangChain agent