Comparative genomic analysis of flavivirids using GLUE
This is Flavivirid-GLUE, a GLUE project for the flavivirids (family Flaviviridae).
The Flaviviridae comprise enveloped, positive-strand RNA viruses, many of which pose serious risks to human health on a global scale. Arthropod-borne flaviviruses such as Zika virus (ZIKV), dengue virus (DENV), and yellow fever virus (YFV) are the causative agents of large-scale outbreaks that result in millions of human infections every year, while the bloodborne hepatitis C virus (HCV) is a major cause of chronic liver disease.
Figure: Projected urbanisation in 2027 (from The Economist magazine). Urbanisation is often associated with the emergence and spread of mosquito-borne diseases because it creates favourable conditions for the survival of mosquito vector species. Genome data can directly inform efforts to control diseases caused by mosquito-borne flaviviruses.
Since the emergence of the SARS-CoV-2 pandemic, many people have become familiar with the use of virus genome data to track the spread and evolution of pathogenic viruses, e.g. via tools such as Nextstrain. However, it is less widely appreciated that the same kinds of data sets and comparative genomic approaches can also be used to explore the structural and functional basis of virus adaptations.
The GLUE software framework provides an extensible platform for implementing computational genomic analysis of viruses in an efficient, standardised and reproducible way. GLUE projects can not only incorporate all of the data items typically used in comparative genomic analysis (e.g. sequences, alignments, genome feature annotations) but can also represent the complex semantic links between these data items via a relational database. This 'poises' sequences and associated data for application in computational analysis, minimising the requirement for labour-intensive pre-processing of datasets.
GLUE projects are as well suited to exploratory work (e.g. using virus genome data to investigate structural and functional properties of viruses) as they are to implementing operational procedures (e.g. producing standardised reports in a public or animal health setting).
Hosting of GLUE projects in an online version control system (e.g. GitHub) provides a mechanism for their stable, collaborative development, as shown below.
What is a GLUE project?
GLUE is an open, integrated software toolkit that provides functionality for storage and interpretation of sequence data. It supports the development of “projects” containing the data items required for comparative genomic analysis (e.g. sequences, multiple sequence alignments, genome feature annotations, and other sequence-associated data).
Projects are loaded into the GLUE "engine", creating a relational database that represents the semantic relationships between data items. This provides a robust foundation for the implementation of systematic comparative analyses and the development of sequence-based resources. The database schema can be extended to accommodate the idiosyncrasies of different projects. GLUE provides a scripting layer (based on JavaScript) for developing custom analysis tools.
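
To make the idea of linked data items concrete, the following is a minimal sketch of how sequences, an alignment, and a genome feature annotation might be related in a relational database, and how a project-specific column could be added. It uses Python's built-in sqlite3 module purely for illustration and does not call GLUE itself. All table names, column names, identifiers and sequences are hypothetical assumptions, not GLUE's actual schema, and the sketch assumes member sequences are already in reference coordinates (no gaps).

```python
import sqlite3

# Hypothetical, simplified schema: these table and column names are
# illustrative assumptions, not GLUE's actual data model.
SCHEMA = """
CREATE TABLE sequence (
    seq_id      TEXT PRIMARY KEY,
    nucleotides TEXT NOT NULL
);
CREATE TABLE alignment (
    aln_name   TEXT PRIMARY KEY,
    ref_seq_id TEXT NOT NULL REFERENCES sequence(seq_id)
);
CREATE TABLE alignment_member (
    aln_name TEXT NOT NULL REFERENCES alignment(aln_name),
    seq_id   TEXT NOT NULL REFERENCES sequence(seq_id),
    PRIMARY KEY (aln_name, seq_id)
);
CREATE TABLE feature_location (
    ref_seq_id TEXT NOT NULL REFERENCES sequence(seq_id),
    feature    TEXT NOT NULL,
    ref_start  INTEGER NOT NULL,  -- 1-based, inclusive
    ref_end    INTEGER NOT NULL,  -- 1-based, inclusive
    PRIMARY KEY (ref_seq_id, feature)
);
"""


def build_demo_db():
    """Create an in-memory database holding a tiny made-up data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Project-specific schema extension: an extra field on sequences.
    conn.execute("ALTER TABLE sequence ADD COLUMN host_species TEXT")
    conn.executemany(
        "INSERT INTO sequence (seq_id, nucleotides, host_species) "
        "VALUES (?, ?, ?)",
        [
            ("REF_A", "ATGGCTAAATGA", None),
            ("SEQ_1", "ATGGCAAAATGA", "Homo sapiens"),
            ("SEQ_2", "ATGGCTAAGTGA", "Aedes aegypti"),
        ],
    )
    conn.execute("INSERT INTO alignment VALUES ('AL_DEMO', 'REF_A')")
    conn.executemany(
        "INSERT INTO alignment_member VALUES ('AL_DEMO', ?)",
        [("SEQ_1",), ("SEQ_2",)],
    )
    conn.execute(
        "INSERT INTO feature_location VALUES ('REF_A', 'demo_region', 4, 9)"
    )
    return conn


def feature_segments(conn, aln_name, feature):
    """Return (seq_id, host_species, segment) for each alignment member.

    The segment is the slice of the member sequence covering the feature,
    located via the alignment's reference sequence annotation.
    """
    rows = conn.execute(
        """
        SELECT s.seq_id, s.host_species,
               substr(s.nucleotides, f.ref_start, f.ref_end - f.ref_start + 1)
        FROM alignment a
        JOIN alignment_member m ON m.aln_name = a.aln_name
        JOIN sequence s ON s.seq_id = m.seq_id
        JOIN feature_location f ON f.ref_seq_id = a.ref_seq_id
        WHERE a.aln_name = ? AND f.feature = ?
        ORDER BY s.seq_id
        """,
        (aln_name, feature),
    )
    return rows.fetchall()
```

Calling `feature_segments(build_demo_db(), "AL_DEMO", "demo_region")` answers a question that spans three kinds of data item (alignment membership, feature coordinates and sequence metadata) with a single query, which is the sense in which linked data are ready for analysis without further pre-processing.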
Some examples of 'sequence-based resources' built for viruses using GLUE include:
- CoV-GLUE: A GLUE resource for tracking genetic variation in SARS-CoV-2. CoV-GLUE contains a database of amino acid replacements, insertions and deletions that have been observed in GISAID hCoV-19 sequences sampled from the pandemic.
- RABV-GLUE: A GLUE resource tailored toward epidemiological tracking of rabies virus (RABV). It includes a database of RABV sequences and metadata from NCBI, updated daily and arranged into major and minor clades, and an analysis tool providing genotyping, analysis and visualisation of submitted FASTA sequences.
- HCV-GLUE: A GLUE resource that aims to support analysis of drug resistance and vaccine escape in hepatitis C virus (HCV). It includes a database of HCV sequences and metadata from NCBI, updated daily and arranged into clades (genotypes, subtypes). As well as pre-built multiple-sequence alignments of NCBI sequences, it includes an analysis tool providing genotyping, drug resistance analysis and visualisation of submitted FASTA sequences.
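
The kind of variation catalogued by the resources listed above (amino acid replacements, insertions and deletions) can be illustrated with a short Python sketch. It is not how any GLUE resource is implemented: the function name `call_variants`, the output notation, and the requirement that the input is a gapped, codon-aligned pair of coding sequences of equal length are all assumptions made for this example. The translation table is the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def call_variants(ref_aligned, query_aligned):
    """List amino acid differences between two codon-aligned sequences.

    Hypothetical helper: both inputs must have the same length, a multiple
    of three, with gaps ('---') covering whole codons. Output uses an ad hoc
    notation: 'A2T' (replacement at reference codon 2), 'del2' (codon 2
    deleted), 'ins1_P' (proline inserted after reference codon 1).
    """
    if len(ref_aligned) != len(query_aligned) or len(ref_aligned) % 3:
        raise ValueError("inputs must be codon-aligned and of equal length")

    variants = []
    ref_pos = 0  # 1-based index of the most recent reference codon
    for i in range(0, len(ref_aligned), 3):
        ref_codon = ref_aligned[i:i + 3].upper()
        qry_codon = query_aligned[i:i + 3].upper()

        if ref_codon == "---":
            # Gap in the reference: the query carries an inserted codon.
            if qry_codon != "---":
                inserted = CODON_TABLE.get(qry_codon, "X")
                variants.append(f"ins{ref_pos}_{inserted}")
            continue

        ref_pos += 1
        if qry_codon == "---":
            variants.append(f"del{ref_pos}")
            continue

        # Codons with ambiguous bases or partial gaps translate to 'X'
        # and are skipped rather than reported.
        ref_aa = CODON_TABLE.get(ref_codon, "X")
        qry_aa = CODON_TABLE.get(qry_codon, "X")
        if "X" not in (ref_aa, qry_aa) and ref_aa != qry_aa:
            variants.append(f"{ref_aa}{ref_pos}{qry_aa}")
    return variants
```

For example, `call_variants("ATGGCTAAATGA", "ATGACTAAATGA")` returns `["A2T"]`, because the second codon changes from GCT (alanine) to ACT (threonine), whereas a synonymous change such as GCT to GCA yields an empty list.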