PhotoModGNN: visualizing genome neighborhoods as a network
Input
1 gene neighborhood list (SR1_synteny_table.txt)
2 gene homolog list (SR1_cluster_table.txt)
3 coding sequences of the neighboring genes (protein_fasta.txt)
4 16S rRNA sequences of the observed genomes (16s_GenBank.fasta)
These inputs can be generated with GLASSgo_postprocessing_8_modi.r.
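Before starting a run it can help to check that the two sequence inputs are readable. The sketch below assumes that inputs 3 and 4 are plain FASTA files (suggested by the file names, not confirmed here); the function name count_fasta_records is made up for this example and is not part of PhotoModGNN. The paths are the ones used in the example run command.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA records; raise ValueError if the file does not look like FASTA.

    Hypothetical helper for illustration only, not part of PhotoModGNN.
    """
    count = 0
    seen_content = False
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if not seen_content:
                seen_content = True
                # Assumption: a valid input starts with a '>' header line.
                if not line.startswith(">"):
                    raise ValueError(f"{path}: first non-empty line is not a FASTA header")
            if line.startswith(">"):
                count += 1
    if count == 0:
        raise ValueError(f"{path}: no FASTA records found")
    return count


if __name__ == "__main__":
    # Paths taken from the example run command; adjust them to your own data.
    for name in ("./test_input/protein_fasta.txt", "./test_input/16s_GenBank.fasta"):
        if Path(name).is_file():
            print(name, count_fasta_records(name), "records")
        else:
            print(name, "is missing")
```

The two table inputs are not checked because their column layout is not described here.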
Run
1 Create the conda environment build_tree and activate it
"conda env create --name build_tree --file build_tree.yml"
"conda activate build_tree"
2 Download the UniProt database (ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.dat.gz), decompress it, and extract GO terms and protein sequences
"gunzip uniprot_sprot.dat.gz"
"python extract_GO_from_uniprot_DAT.py uniprot_sprot.dat >uniprot_GO2019.txt"
"python extract_fasta_from_uniprot_DAT.py uniprot_sprot.dat >uniprot.fasta"
3 Calculate the genome neighborhood conservation score. The fourth argument sets the taxonomy level used for color labeling (e.g. 'genus', 'family', 'order')
"python run.py ./test_input/SR1_synteny_table.txt ./test_input/SR1_cluster_table.txt uniprot.fasta 'genus' ./test_input/16s_GenBank.fasta ./test_input/protein_fasta.txt"
Output
1 node2edge.cy
2 mapping.q
3 mapping.n
If run.py cannot connect to an X server, install xvfb and start the run through xvfb-run
"apt-get install -y xvfb"
"xvfb-run python run.py ./test_input/SR1_synteny_table.txt ./test_input/SR1_cluster_table.txt uniprot.fasta 'genus' ./test_input/16s_GenBank.fasta ./test_input/protein_fasta.txt"
4 Build the js file from the three output files
"python read_output_network_photomod_v0_3.py node2edge.cy mapping.q mapping.n result_file_name"
5 Open "result_file_name.html" in a web browser to see the output
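Steps 3 and 4 can be chained from a small Python wrapper, for example on a headless server. This is only a sketch under two assumptions that are not confirmed here: that an unset DISPLAY variable is what causes the "cannot connect to X server" error, and that run.py writes node2edge.cy, mapping.q and mapping.n to the current directory. The function names are made up for this example. The script names and argument order are copied from the commands above.

```python
import os
import shutil
import subprocess


def build_run_command(synteny, cluster, uniprot_fasta, tax_level, rrna_16s, proteins,
                      display=None, xvfb_path=None):
    """Return the argument list for step 3.

    The command is prefixed with xvfb-run when no display is set and
    xvfb-run is available (hypothetical helper, not part of PhotoModGNN).
    """
    cmd = ["python", "run.py", synteny, cluster, uniprot_fasta,
           tax_level, rrna_16s, proteins]
    if not display and xvfb_path:
        # Assumption: a missing DISPLAY means no X server can be reached.
        cmd = [xvfb_path] + cmd
    return cmd


def build_html_command(result_name):
    """Return the argument list for step 4 (output files assumed in the current directory)."""
    return ["python", "read_output_network_photomod_v0_3.py",
            "node2edge.cy", "mapping.q", "mapping.n", result_name]


def main():
    run_cmd = build_run_command(
        "./test_input/SR1_synteny_table.txt",
        "./test_input/SR1_cluster_table.txt",
        "uniprot.fasta",
        "genus",
        "./test_input/16s_GenBank.fasta",
        "./test_input/protein_fasta.txt",
        display=os.environ.get("DISPLAY"),
        xvfb_path=shutil.which("xvfb-run"),
    )
    # check=True stops the pipeline if a step exits with an error.
    subprocess.run(run_cmd, check=True)
    subprocess.run(build_html_command("result_file_name"), check=True)


# Only start the pipeline when called from the directory that contains run.py.
if __name__ == "__main__" and os.path.exists("run.py"):
    main()
```

Run it inside the activated build_tree environment so that "python" resolves to the interpreter with the required packages.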