Robert J. Gifford edited this page Oct 17, 2024 · 28 revisions

Welcome to the Flu-GLUE User Guide!

Flu-GLUE is an open resource supporting the comparative genomic analysis of influenza viruses, developed using the GLUE software framework. This resource facilitates the study of influenza A virus (IAV), influenza B virus (IBV), influenza C virus (ICV), and influenza D virus (IDV), emphasizing collaborative research through decentralized data sharing.

Influenza viruses are members of the Orthomyxoviridae family, known for causing seasonal epidemics of respiratory disease worldwide. The segmented genomes of these viruses consist of negative-sense, single-stranded RNA. The ongoing evolution of influenza viruses presents a significant public health challenge due to the emergence of new strains capable of causing seasonal outbreaks or even pandemics.

Decentralized Data Sharing

Flu-GLUE promotes a decentralized model of data sharing, enabling researchers to share data, tools, and insights directly with each other. Centralized resources like the Global Initiative on Sharing All Influenza Data (GISAID) and the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) follow a "hub-and-spoke" model, in which data flows through a central repository. Platforms like Nextstrain, which aggregate data from GenBank and GISAID, act as "spokes": they draw on centralized data but do not necessarily facilitate direct collaboration between users. Flu-GLUE instead encourages active collaboration among researchers. Its decentralized approach is intended to foster closer interaction, shared tool development, and distributed governance, which in turn can accelerate discoveries and improve responses to influenza outbreaks.

GenBank Filtering Tools

An integral component of Flu-GLUE is the inclusion of GenBank filtering tools. These tools enhance the usability of influenza virus data from GenBank by:

  • Capturing Links Between Sequences and Isolates: Establishing connections between sequences and their corresponding isolates to ensure data consistency.

  • Validating and Standardizing Metadata: Addressing issues like redundancy, variable data quality, and non-standard definitions by validating and standardizing sequence-associated metadata.

  • Segment Recognition and Genotyping: Independently confirming the segment origin of sequences and performing genotyping, which is crucial for influenza virus classification.

  • Redundancy Management: Handling redundant sequences by selecting the best representative for each isolate segment.

  • Incomplete Isolate Identification: Identifying isolates with missing segment sequences and exporting this information for further analysis.

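By incorporating these filtering tools, Flu-GLUE imposes a consistent structure on influenza virus sequence data: each sequence is tied to an isolate, a confirmed segment, and a genotype, and redundant or incomplete records are flagged. This allows researchers to process and analyze large datasets more efficiently and reliably.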
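The redundancy-management and incomplete-isolate steps in the list above can be illustrated with a minimal Python sketch. This is not the Flu-GLUE implementation: the record fields (`accession`, `isolate`, `segment`, `length`), the function names, and the "longest sequence wins" selection rule are all assumptions made for illustration. The only fixed facts used are the segment counts per genus (eight for IAV and IBV, seven for ICV and IDV).

```python
from collections import defaultdict

# Genome segment counts: IAV and IBV have 8 segments, ICV and IDV have 7.
EXPECTED_SEGMENTS = {"IAV": 8, "IBV": 8, "ICV": 7, "IDV": 7}


def best_per_isolate_segment(records):
    """Keep one record per (isolate, segment) pair.

    Hypothetical selection rule: prefer the longest sequence and, on a tie,
    the alphabetically lowest accession so the result is deterministic.
    """
    best = {}
    for rec in records:
        key = (rec["isolate"], rec["segment"])
        current = best.get(key)
        rank = (-rec["length"], rec["accession"])
        if current is None or rank < (-current["length"], current["accession"]):
            best[key] = rec
    return best


def incomplete_isolates(best, species_by_isolate):
    """Return {isolate: sorted missing segment numbers} for incomplete isolates."""
    seen = defaultdict(set)
    for isolate, segment in best:
        seen[isolate].add(segment)

    missing = {}
    for isolate, segments in seen.items():
        n_expected = EXPECTED_SEGMENTS[species_by_isolate[isolate]]
        gaps = sorted(set(range(1, n_expected + 1)) - segments)
        if gaps:
            missing[isolate] = gaps
    return missing
```

Given a list of record dictionaries, `best_per_isolate_segment` returns the chosen representative for each isolate segment, and `incomplete_isolates` reports which segment numbers are absent for each isolate. A real pipeline would likely weigh sequence quality and coverage as well as length.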

Key Features

  • Comprehensive Database: Flu-GLUE integrates influenza genome feature definitions, genome-length reference sequences, multiple sequence alignments, and standardized metadata for all major influenza lineages (IAV, IBV, ICV, IDV).

  • GLUE Framework Integration: Built on the GLUE software framework, Flu-GLUE offers an extensible platform for efficient, standardized, and reproducible genomic analysis of influenza viruses.

  • Phylogenetic Structure: Flu-GLUE organizes influenza virus sequence data in a phylogenetically structured manner, enabling easy exploration of evolutionary relationships among different virus strains.

  • Rich Annotations: Annotated reference sequences support rigorous comparative genomic analysis of conservation, viral adaptation, structural context, and genotype-to-phenotype associations.

  • Automated Genotyping: Flu-GLUE supports automated genotyping of influenza virus sequences (including subgenomic sequences) via GLUE's maximum likelihood clade assignment (MLCA) algorithm.

  • Collaborative and Decentralized: Flu-GLUE fosters direct collaboration between researchers, moving beyond the traditional hub-and-spoke model by promoting a decentralized system for data sharing and tool development.

  • GenBank Filtering Tools Integration: The inclusion of GenBank filtering tools enhances data quality and consistency, facilitating more accurate and reliable analyses.

  • Extensible Resource: The core Flu-GLUE project can be extended with additional layers, openly available via GitHub, enabling customized analyses and further project-specific developments.

Getting Started

To begin using Flu-GLUE for comparative genomic analysis of influenza viruses, follow these steps:

  1. Install GLUE: First, install the GLUE software framework. You can either opt for a native installation or use Docker, depending on your system setup.

  2. Download and Install the Flu-GLUE Project: Flu-GLUE can be installed as a prebuilt database for rapid setup, or it can be built from scratch through a local project build process for more customization.

  3. Utilize the GenBank Filtering Tools: Enhance your data analysis by using the provided GenBank filtering tools to process influenza virus sequence data from GenBank. These tools will help you:

    • Download and process GenBank sequence entries for IAV, IBV, ICV, and IDV.
    • Validate and standardize metadata.
    • Recognize segments and perform genotyping.
    • Export processed data for use in Flu-GLUE.
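As an illustration of the metadata standardization step, the sketch below parses conventional influenza strain names (for example `A/Puerto Rico/8/1934(H1N1)`) into structured fields and derives a normalized isolate key that could be used to link sequences from the same isolate. It is a simplified example, not the parser used by the Flu-GLUE tools: the function name, the returned field names, and the handling of edge cases are assumptions. It relies on the standard naming convention in which the host is omitted for human isolates, and it leaves two-digit years unresolved rather than guessing the century.

```python
import re

# Optional trailing subtype such as "(H1N1)"; only meaningful for influenza A.
SUBTYPE_RE = re.compile(r"\((H\d+N\d+)\)\s*$", re.IGNORECASE)


def parse_strain_name(name):
    """Parse 'type/[host/]location/strain/year[(subtype)]' into a dict.

    Returns None when the name does not fit this simplified pattern.
    """
    name = name.strip()
    subtype = None
    match = SUBTYPE_RE.search(name)
    if match:
        subtype = match.group(1).upper()
        name = name[:match.start()].strip()

    parts = [p.strip() for p in name.split("/")]
    if len(parts) == 4:
        virus_type, location, strain_id, year = parts
        host = "human"  # by convention the host is omitted for human isolates
    elif len(parts) == 5:
        virus_type, host, location, strain_id, year = parts
    else:
        return None

    virus_type = virus_type.upper()
    if virus_type not in {"A", "B", "C", "D"}:
        return None

    return {
        "type": virus_type,
        "host": host.lower(),
        "location": location,
        "strain_id": strain_id,
        # Two-digit years are ambiguous, so only four-digit years are accepted.
        "year": int(year) if re.fullmatch(r"\d{4}", year) else None,
        "subtype": subtype,
        # Hypothetical normalized key for grouping sequences by isolate.
        "isolate_key": "/".join([virus_type] + parts[1:]),
    }
```

Names that do not fit the pattern return `None`, so they can be set aside for manual review instead of being silently coerced.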

By following these steps, you can use Flu-GLUE to conduct comparative genomic analyses of influenza viruses while benefiting from its collaborative, decentralized approach to sharing data and tools.