The scripts in this repository rely on the data in the associated Zenodo repository and are described below, roughly in the intended order of execution.
hc_clust.R - use hierarchical clustering of SNP distances between isolate genomes to define genomic clusters.
- subset FASTA alignment to samples of interest.
transmission.R - infer potential transmission links based on genomic relatedness.
cf_per_clone.R - get clone CF/non-CF proportions (CF vs non-CF patients) and compare with surveillance data.
plotDCCBayesianSkylinePlot.R - based on C. Ruis 20. Aggregates Skyline model population size data across clones and outputs plots of inferred clone historical population sizes.
- modifies a BEAST XML template generated with BEAUti to prepare input XMLs for BEAST runs for multiple clones.
- works out the root age for the input tree from TreeAnnotator-annotated trees, the alignment and sample dates.
- uses an outgroup to root the global Pseudomonas tree.
- implementation of parsimony ancestral character-state reconstruction of binary traits for gene presence/absence or indels.
- subset Panaroo final_graph.gml to samples of interest.
- subset Panaroo gene presence/absence (csv/Rtab) according to samples of interest.
- get ancestral genome representatives based on parsimony ancestral character-state reconstruction.
- extract FASTA file with nucleotide/protein sequences for the reconstructed events.
- extract multi-FASTA file for Panaroo gene ids of interest from gene_data.csv.
- parse emapper output for functional enrichment analysis.
emapper_stats.R - test functional enrichment of COG categories in epidemic/sporadic clones and along CF proportions.
cf_association_analysis.R - perform gene expression CF association analysis.
analyse_assay_measurements.R - plot virulence factor measurements and test association of virulence factor expression with CF proportions.
plot_abundance_individua.R - plot macrophage survival across clones with variable CF proportions and test differential survival.
- annotate nodes/branches in clone trees based on (ancestral) infection type and transmissibility (needs the correct columns from samples_qc_filtered_and_annotated_v2.txt as inputs).
mutational_distribution.R - aggregate variant effect annotation across clones and perform mutational burden test.
manhattan_plot.R - make Manhattan plot and QQ-plot (run mutational_distribution.R first).
- infer sequence of mutations in evolutionary time (if the output is used for subsequent analyses, only use mutations in mutational burden test hits as input).
plot_gene2patho_position.R - plot gene-specific burden over evolutionary time.
- get UMAP of pathoadaptive space.
plot_patho_per_sample.R - UMAP scatter plot (run first).
test_transmissibility_and_cfness.R - test mutational burden test hits for over/underrepresentation of CF vs non-CF, and transmitted vs untransmitted mutations (run mutational_distribution.R and first).
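
As a rough illustration of the first step in the list (defining genomic clusters by hierarchical clustering of pairwise SNP distances), the following minimal base R sketch may help. It is not taken from hc_clust.R: the toy distance matrix, the isolate names, the single-linkage method and the 10-SNP threshold are all assumptions chosen for the example.

```r
# Toy symmetric SNP distance matrix for five hypothetical isolates
# (values are invented for illustration only).
isolates <- paste0("iso", 1:5)
snp_dist <- matrix(
  c( 0,  2, 30, 31, 60,
     2,  0, 29, 30, 61,
    30, 29,  0,  3, 59,
    31, 30,  3,  0, 58,
    60, 61, 59, 58,  0),
  nrow = 5, byrow = TRUE,
  dimnames = list(isolates, isolates)
)

# Assumed SNP threshold below which isolates are grouped together.
snp_threshold <- 10

# Single-linkage hierarchical clustering on the distance matrix
# (the linkage method is an assumption, not the repository's choice).
tree <- hclust(as.dist(snp_dist), method = "single")

# Cut the dendrogram at the threshold to assign cluster labels.
clusters <- cutree(tree, h = snp_threshold)
print(clusters)
```

With single linkage, cutting at height `snp_threshold` places two isolates in the same cluster whenever they are connected by a chain of pairwise distances at or below the threshold. Other linkage methods (for example `"complete"` or `"average"`) give stricter or intermediate definitions.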