EVE Nomenclature

Robert J. Gifford edited this page Nov 28, 2024 · 15 revisions

Overview

In DIGS-for-EVEs, we have applied a systematic approach to naming non-retroviral endogenous viral elements (EVEs), following a convention developed for endogenous retroviruses (ERVs). This convention supports clear identification, organization, and comparison of EVEs across species and research contexts.


Contents

  1. Construction of EVE Nomenclature
  2. Guidelines for Using EVE Nomenclature
  3. Important Considerations
  4. Glossary of EVE Types

Construction of EVE Nomenclature

Each EVE is assigned a unique identifier (ID) composed of three components, separated by hyphens. This ID structure captures the essential characteristics of the EVE, including its viral origin, insertion event, and host species. An example of a typical EVE ID is shown below:

EBLG-Carbovirus.2-Boreoeutheria

The components of the EVE ID are as follows:

  1. First Component: EVE Type (e.g., EBLG)

    • Identifies the type of EVE. For example, EBLG stands for endogenous bornavirus-like glycoprotein.
    • A glossary of EVE types is provided below to clarify these abbreviations.
  2. Second Component: Subgroup and Numeric ID (e.g., Carbovirus.2)

    • Subgroup: Represents the taxonomic group of the virus from which the EVE derives, such as Carbovirus.
    • Numeric ID: A unique identifier for the insertion event within the specific EVE category and taxonomic group, with orthologous copies in different species sharing the same number.
  3. Third Component: Host Species or Species Group (e.g., Boreoeutheria)

    • Specifies the host species or species group where the EVE is found.
    • For EVEs known to occur in a single species, the full Latin binomial name is used (e.g., Myotis daubentonii), written with an underscore in the ID (Myotis_daubentonii). If the EVE is found in multiple species, the name of a taxonomic group spanning those species is used instead.

This structured approach enables precise referencing of EVEs and highlights their evolutionary relationships within and across species.
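
The three-part structure described above can be checked mechanically. The following is a minimal Python sketch, not part of DIGS-for-EVEs: the names EVE_ID_PATTERN, EveId, and parse_eve_id are hypothetical, and the pattern assumes that the EVE type and subgroup contain no hyphens and that the numeric ID is a plain integer.

```python
import re
from typing import NamedTuple

# Hypothetical pattern for a complete ID: TYPE-Subgroup.N-Host
# Assumes the EVE type and subgroup never contain a hyphen.
EVE_ID_PATTERN = re.compile(
    r"^(?P<eve_type>[^-]+)-(?P<subgroup>[^-.]+)\.(?P<numeric_id>\d+)-(?P<host>.+)$"
)


class EveId(NamedTuple):
    eve_type: str    # e.g. "EBLG"
    subgroup: str    # e.g. "Carbovirus"
    numeric_id: int  # e.g. 2
    host: str        # e.g. "Boreoeutheria"


def parse_eve_id(eve_id: str) -> EveId:
    """Split a complete EVE ID into its three components."""
    match = EVE_ID_PATTERN.match(eve_id)
    if match is None:
        raise ValueError(f"Not a complete EVE ID: {eve_id!r}")
    return EveId(
        eve_type=match["eve_type"],
        subgroup=match["subgroup"],
        numeric_id=int(match["numeric_id"]),
        host=match["host"],
    )


parsed = parse_eve_id("EBLG-Carbovirus.2-Boreoeutheria")
print(parsed.eve_type, parsed.subgroup, parsed.numeric_id, parsed.host)
```

Keeping the numeric ID as an integer makes it straightforward to sort or group insertion events within one EVE type and subgroup.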


Guidelines for Using EVE Nomenclature

1. Always Use a Complete ID for Initial Reference

When first referencing an EVE, always use the full identifier (ID) to ensure clear and unambiguous communication. For example, a complete ID for an endogenous bornavirus-like L-protein element could be:

EBLL-Cultervirus.10-Myotis_daubentonii

This full ID provides all the details needed to identify the EVE, including its viral origin, insertion event, and host species.

Where the host species is clear from context, a slightly abbreviated form can be used:

EBLL-Cultervirus.10-MyoDau

Note: For official records, always use the unabbreviated host species name to ensure that the taxonomy is clear and precise.

2. Abbreviate IDs to Facilitate Clear Discussion

While full-form EVE IDs are essential for the initial reference, they can be cumbersome in extended discussions. Once the context has been established, the IDs can be shortened.

Examples of abbreviations:

  • If the genus Cultervirus is referred to as CV, the ID can be shortened to:
    EBLL-CV.10-MyoDau
  • If a two-letter abbreviation such as Md is sufficient to identify the host species, the host component can be shortened further:
    EBLL-Cultervirus.10-Md
  • In discussions specifically about EBLL elements, the EVE type component can be omitted if it becomes redundant:
    CV.10-MyoDau
    or
    CV.10-Md

These shortened forms are permissible as long as they unambiguously refer to the specific EVE element in the given context.
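
As an illustration of these abbreviation rules, here is a minimal Python sketch. The function abbreviate_eve_id and its parameters are hypothetical and not part of DIGS-for-EVEs. The abbreviation tables are supplied by the caller, because which short forms are unambiguous depends on the context of the discussion.

```python
def abbreviate_eve_id(eve_id, subgroup_abbrevs=None, host_abbrevs=None, drop_type=False):
    """Shorten a complete EVE ID using caller-supplied abbreviations.

    Assumes the ID has the form TYPE-Subgroup.N-Host and that the
    EVE type and subgroup contain no hyphens.
    """
    eve_type, insertion, host = eve_id.split("-", 2)
    subgroup, numeric_id = insertion.rsplit(".", 1)

    # Fall back to the full name when no abbreviation was supplied.
    subgroup = (subgroup_abbrevs or {}).get(subgroup, subgroup)
    host = (host_abbrevs or {}).get(host, host)

    short = f"{subgroup}.{numeric_id}-{host}"
    # Omit the EVE type only when the discussion makes it redundant.
    return short if drop_type else f"{eve_type}-{short}"


full_id = "EBLL-Cultervirus.10-Myotis_daubentonii"
print(abbreviate_eve_id(full_id, {"Cultervirus": "CV"}, {"Myotis_daubentonii": "MyoDau"}))
print(abbreviate_eve_id(full_id, {"Cultervirus": "CV"}, {"Myotis_daubentonii": "Md"}, drop_type=True))
```

Generating short forms from the full ID, rather than typing them by hand, keeps every abbreviated ID traceable to one unabbreviated record.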

3. Adjust IDs When Referring to EVE Loci Versus EVE Alleles

It's essential to distinguish between species-specific copies of an EVE (alleles) and the broader EVE locus shared by orthologous copies in multiple species.

  • EVE Alleles: The species-specific copy should include the host species name:
EBLL-Cultervirus.10-Myotis_daubentonii
  • EVE Loci: When referring to the locus itself, shared by multiple species, the host component can be generalized or omitted:
EBLL-CV.10-Myotis
EBLL-Cultervirus.10
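
The relationship between allele IDs and locus IDs can be sketched in Python as follows. The function locus_id is a hypothetical helper, not part of DIGS-for-EVEs, and it assumes the allele ID has the complete TYPE-Subgroup.N-Host form with no hyphens in the first two components.

```python
def locus_id(allele_id, host_group=None):
    """Derive a locus-level ID from a species-specific (allele) ID.

    With no host_group the host component is omitted; otherwise it is
    replaced by the supplied taxonomic group name.
    """
    eve_type, insertion, _host = allele_id.split("-", 2)
    locus = f"{eve_type}-{insertion}"
    return f"{locus}-{host_group}" if host_group else locus


allele = "EBLL-Cultervirus.10-Myotis_daubentonii"
print(locus_id(allele))                       # host omitted
print(locus_id(allele, host_group="Myotis"))  # host generalized to a group
```

Because orthologous copies share the same numeric ID, alleles from different species that map to the same locus ID can be grouped together under that key.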

Important Considerations

  1. Taxonomic Assignment: EVEs were assigned to virus taxonomic groups based on phylogenetic and genomic analysis. When a precise subgroup assignment was not possible, the lowest confidently assigned taxonomic rank was used.
  2. Ortholog Grouping: Numeric IDs were used to group orthologous EVEs, although some relationships may be uncertain. The 'digs_results' table includes BLAST-based confidence scores to help assess these groupings.
  3. Species Representation: When an EVE is found in only one species, its full species name is used. For orthologous EVEs found across multiple species, a taxonomic group name is used instead, with a 'UR' (unranked) designation if the group cannot be precisely named.
  4. ERV Nomenclature: Although this convention is adapted from one developed for ERVs, it has not yet been applied to ERV loci within DIGS-for-EVEs, because resolving orthologous relationships among large numbers of ERV insertions is complex. Doing so remains a future goal for the project.