ecDNA gene expression

Lin, M.S.; Jo, S.; Luebeck, J.; Chang, H.Y.; Wu, S.; Mischel, P.S.; Bafna, V. “Transcriptional immune suppression and upregulation of double stranded DNA damage and repair repertoires in ecDNA-containing tumors.” eLife. 12:RP88895 (Reviewed preprint). February 27, 2024. https://doi.org/10.7554/eLife.88895.2.

Impute missing values using K-nearest neighbors (KNN)

python -u ./src/cBioPortal_data_preprocessing.py --cBioPortal_data ./data/reference/TCGA/cBioportal --output ./data/KNN_imputed_data
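
For readers unfamiliar with KNN imputation, the idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code in `cBioPortal_data_preprocessing.py`: the helper name `knn_impute`, the choice of `k`, the distance rescaling, and the toy values are all hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows (hypothetical helper)."""

    def distance(a, b):
        # Euclidean distance over columns observed in both rows,
        # rescaled by the fraction of columns that could be compared.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        squared = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(squared * len(a) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row with column j observed.
            candidates = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            neighbours = [v for d, v in candidates[:k] if d != math.inf]
            if neighbours:
                imputed[i][j] = sum(neighbours) / len(neighbours)
    return imputed


# Toy matrix (made-up values): rows are samples, columns are genes.
toy = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [10.0, 10.0, 100.0],
]
filled = knn_impute(toy, k=2)
```

In this toy matrix the missing value in the first sample is filled from the two samples closest to it, so the distant fourth sample does not contribute.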

Generate gene expression matrices

python -u ./src/Generate_TCGA_KNN_matrix.py --minimum_ecdna 3 --protein_coding 1 --KNN_imputed_data_dir ./data/KNN_imputed_data --output ./data/RNAseq_expression_matrix
python -u ./src/Generate_TCGA_KNN_matrix.py --minimum_ecdna 0 --protein_coding 0 --KNN_imputed_data_dir ./data/KNN_imputed_data --output ./data/RNAseq_expression_matrix 

Generate training/testing datasets

python -u ./src/Generate_testing_training_datasets.py --matrix ./data/RNAseq_expression_matrix/AC049_TCGA_KNN_imputed_min_ecdna_3_matrix.txt

Compute Cliff's delta for all TCGA genes in matrix

python -u ./src/Compute_cliff_delta_all_TCGA_genes.py --matrix ./data/RNAseq_expression_matrix/AC049_TCGA_KNN_imputed_min_ecdna_3_matrix.txt
python -u ./src/Compute_cliff_delta_all_TCGA_genes.py --matrix ./data/RNAseq_expression_matrix/AC049_TCGA_KNN_imputed_matrix.txt
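
Cliff's delta is a non-parametric effect size: the fraction of cross-group pairs in which the first group's value is larger, minus the fraction in which it is smaller. A minimal sketch of the statistic follows. The function name `cliffs_delta` and the example values are illustrative assumptions, and the repository script may compute it differently (for example, with a vectorised or rank-based method).

```python
def cliffs_delta(x, y):
    """Cliff's delta between two samples, in the range [-1, 1]."""
    if not x or not y:
        raise ValueError("both groups must be non-empty")
    # Count every (a, b) pair across the two groups; ties contribute zero.
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


# Made-up expression values for one gene in two sample groups.
group_a = [3, 4, 5]
group_b = [1, 2, 3]
delta = cliffs_delta(group_a, group_b)
```

A value near 1 means the first group is almost always higher, a value near -1 means it is almost always lower, and 0 means the two groups overlap completely.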

Generate read counts matrix

python -u ./src/Generate_TCGA_level3_RSEM_raw_count_matrix.py --minimum_ecdna 3 --protein_coding 1 --output ./data/RNAseq_expression_matrix
python -u ./src/Generate_TCGA_level3_RSEM_raw_count_matrix.py --minimum_ecdna 0 --protein_coding 0 --output ./data/RNAseq_expression_matrix

Compute DESeq2 log2 fold change (LFC) for all TCGA genes in matrix

./src/Run_deseq2.R

Determine gene directionality based on Cliff's delta and DESeq2 LFC

python -u ./src/Gene_directionality_cliffd_LFC.py --matrix ./data/RNAseq_expression_matrix/AC049_TCGA_KNN_imputed_min_ecdna_3_matrix.txt
python -u ./src/Gene_directionality_cliffd_LFC.py --matrix ./data/RNAseq_expression_matrix/AC049_TCGA_KNN_imputed_matrix.txt
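
One plausible way to combine the two statistics is to call a direction only when their signs agree. The sketch below shows that reading. It is an assumption, not a description of `Gene_directionality_cliffd_LFC.py`: the function name, the labels, and the zero default thresholds are all hypothetical.

```python
def gene_direction(cliff_delta, log2_fold_change, min_delta=0.0, min_lfc=0.0):
    """Label a gene 'up', 'down', or 'ambiguous' (hypothetical rule)."""
    # Both statistics must point the same way and clear their thresholds.
    if cliff_delta > min_delta and log2_fold_change > min_lfc:
        return "up"
    if cliff_delta < -min_delta and log2_fold_change < -min_lfc:
        return "down"
    return "ambiguous"


# Made-up (Cliff's delta, LFC) pairs for three placeholder genes.
labels = [
    gene_direction(0.4, 1.2),
    gene_direction(-0.3, -0.8),
    gene_direction(0.2, -0.5),
]
```

Requiring agreement guards against genes where a few outlier samples drive the fold change but the rank-based effect size points the other way.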

Boruta analysis

python -u ./src/Perform_Boruta_Trials.py --matrix ./data/RNAseq_expression_matrix/AC049_TCGA_KNN_imputed_min_ecdna_3_matrix.txt

Co-expression

./src/Parse_pvclust_object.R
python -u ./src/Coexpressed_genes.py --boruta_dir ./data/AC049_TCGA_KNN_imputed_min_ecdna_3_matrix/Boruta_Trials

Precision/recall CorEx genes

python -u ./src/CorEx_genes_precision_recall.py --matrix ./data/RNAseq_expression_matrix/AC049_TCGA_KNN_imputed_min_ecdna_3_matrix.txt
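
Precision and recall for a selected gene list against a reference list reduce to set arithmetic. The sketch below uses placeholder gene labels. The function name and the choice of which list counts as "predicted" and which as "reference" are assumptions for illustration, not taken from `CorEx_genes_precision_recall.py`.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of a predicted gene set against a reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Placeholder labels, not real results.
predicted_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
reference_genes = ["GENE_A", "GENE_B", "GENE_E"]
precision, recall = precision_recall(predicted_genes, reference_genes)
```

Here two of the four predicted genes are in the reference, so precision is 0.5, and two of the three reference genes are recovered, so recall is 2/3.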

Gene set enrichment and clustering of gene sets

python -u ./src/Perform_geneset_enrichment.py --matrix ./data/RNAseq_expression_matrix/AC049_TCGA_KNN_imputed_min_ecdna_3_matrix.txt