Recording Variations in AAV Atlas
AAV Atlas identifies and catalogs amino acid replacements in AAV sequences relative to reference sequences. The analysis is restricted to coding features, so that every recorded difference corresponds to an actual amino acid change. Below is an explanation of the core principles behind the variation analysis script used in AAV Atlas.
The script follows a structured workflow to systematically capture amino acid differences between AAV sequences and reference sequences. The key steps include:
- Identifying Relevant Alignments -- The script retrieves a list of tip alignments. Tip alignments are those without child alignments, representing the most granular level of sequence data.
- Filtering for Coding Features -- Only coding features are considered, so non-coding regions are excluded. This filtering is essential because amino acid replacements are only defined within coding regions.
- Mapping Reference Features -- The script maps each reference sequence to its corresponding coding features, creating a comprehensive list of amino acid locations to analyze.
- Processing Alignment Members -- For each alignment, the script processes individual member sequences, comparing their amino acid composition to the reference.
- Recording Variations -- Any differences in amino acid sequences between the alignment member and reference are cataloged. Each variation is assigned a unique identifier, ensuring it can be tracked and revisited.
The script begins by identifying tip alignments. These alignments are crucial because they represent terminal nodes in the alignment hierarchy, directly reflecting raw sequence data.
getTipAlignments(tipAlignments);
The function getTipAlignments ensures that only alignments without child nodes are included, refining the pool of alignments to the most informative subset.
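To illustrate what "tip" means here, the sketch below filters a list of alignment records down to those that no other alignment names as its parent. The record shape (`name`, `parentName`), the alignment names, and the helper `findTipAlignments` are assumptions made for this example. It is not the implementation of getTipAlignments, which works against the GLUE project itself.

```javascript
// Hypothetical helper: return the alignments that have no child alignments.
function findTipAlignments(alignments) {
  // Collect the name of every alignment that is the parent of another one.
  var parentNames = {};
  alignments.forEach(function (almt) {
    if (almt.parentName) {
      parentNames[almt.parentName] = true;
    }
  });
  // A tip alignment is one that never appears as a parent.
  return alignments.filter(function (almt) {
    return !parentNames[almt.name];
  });
}

// Made-up alignment tree, for illustration only.
var exampleAlignments = [
  { name: "AL_ROOT", parentName: null },
  { name: "AL_CLADE_A", parentName: "AL_ROOT" },
  { name: "AL_CLADE_B", parentName: "AL_ROOT" },
  { name: "AL_CLADE_A1", parentName: "AL_CLADE_A" }
];

var tipNames = findTipAlignments(exampleAlignments).map(function (almt) {
  return almt.name;
});
console.log(tipNames);
```

In this made-up tree, only the two alignments with no children survive the filter.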
To ensure only relevant amino acid replacements are analyzed, the script retrieves coding features marked by the metatag CODES_AMINO_ACIDS.
var featuresList = glue.tableToObjects(
glue.command(["list", "feature", "-w", "featureMetatags.name = 'CODES_AMINO_ACIDS'"])
);
Each feature is stored in a codingFeaturesMap to facilitate quick lookup.
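A minimal sketch of how such a lookup map could be built is shown here. The sample rows stand in for the result of the feature listing; the `name` column and the feature names are assumptions for illustration, and `buildCodingFeaturesMap` is a hypothetical helper rather than a function from the script.

```javascript
// Illustrative rows standing in for the feature listing result
// (the column layout is an assumption).
var sampleFeaturesList = [
  { name: "VP1" },
  { name: "Rep78" }
];

// Hypothetical helper: index features by name for constant-time lookup.
function buildCodingFeaturesMap(featuresList) {
  var map = {};
  featuresList.forEach(function (feature) {
    map[feature.name] = feature;
  });
  return map;
}

var sampleCodingFeaturesMap = buildCodingFeaturesMap(sampleFeaturesList);
console.log(Object.keys(sampleCodingFeaturesMap));
```

Keying the map by feature name means a later membership check is a single property access instead of a scan over the whole list.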
The script iterates over each alignment and maps coding features to their respective reference sequences. This ensures that the analysis focuses on biologically significant regions.
refFeaturesMap[refseqName] = _.filter(featureLocations, function(featureLoc) {
return codingFeaturesMap[featureLoc["feature.name"]];
});
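The filtering step can be tried in isolation with plain arrays. In the sketch below, the reference name, the feature names, and the helper `codingLocationsFor` are made up for illustration; only the `"feature.name"` key is taken from the snippet above. It uses the built-in `Array.prototype.filter` instead of the `_.filter` call, so it runs without any library.

```javascript
// Made-up lookup map and feature locations, for illustration only.
var demoCodingFeaturesMap = {
  VP1: { name: "VP1" },
  Rep78: { name: "Rep78" }
};
var demoFeatureLocations = [
  { "feature.name": "VP1" },
  { "feature.name": "ITR" },   // not a coding feature in this example
  { "feature.name": "Rep78" }
];

// Hypothetical helper: keep only locations whose feature is in the coding map.
function codingLocationsFor(featureLocations, codingFeaturesMap) {
  return featureLocations.filter(function (featureLoc) {
    return Object.prototype.hasOwnProperty.call(
      codingFeaturesMap, featureLoc["feature.name"]);
  });
}

var demoRefFeaturesMap = {};
demoRefFeaturesMap["REF_EXAMPLE"] =
  codingLocationsFor(demoFeatureLocations, demoCodingFeaturesMap);
console.log(demoRefFeaturesMap["REF_EXAMPLE"].length);
```

The `hasOwnProperty` check is a design choice for this sketch: it avoids false positives for names such as `toString` that exist on every object.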
For each alignment member, the script compares amino acids at each codon position to the reference.
if (refAaObj && refAaObj.definiteAas !== memberAaObj.definiteAas) {
// Record mismatch
}
Mismatched amino acids are flagged as replacements and cataloged for further analysis.
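The comparison can be sketched as a self-contained function. Here the `definiteAas` field comes from the snippet above, while the `codonLabel` field, the sample residues, and the helper `findReplacements` are assumptions made for this example. Positions with no reference residue are skipped, mirroring the `refAaObj &&` guard.

```javascript
// Hypothetical helper: list positions where member and reference residues differ.
function findReplacements(refAas, memberAas) {
  // Index reference residues by codon label for direct lookup.
  var refByCodon = {};
  refAas.forEach(function (refAaObj) {
    refByCodon[refAaObj.codonLabel] = refAaObj;
  });
  var replacements = [];
  memberAas.forEach(function (memberAaObj) {
    var refAaObj = refByCodon[memberAaObj.codonLabel];
    // Skip positions absent from the reference; record only true mismatches.
    if (refAaObj && refAaObj.definiteAas !== memberAaObj.definiteAas) {
      replacements.push({
        codonLabel: memberAaObj.codonLabel,
        refAa: refAaObj.definiteAas,
        memberAa: memberAaObj.definiteAas
      });
    }
  });
  return replacements;
}

// Made-up residues, for illustration only.
var demoRefAas = [
  { codonLabel: "1", definiteAas: "M" },
  { codonLabel: "2", definiteAas: "A" },
  { codonLabel: "3", definiteAas: "L" }
];
var demoMemberAas = [
  { codonLabel: "1", definiteAas: "M" },
  { codonLabel: "2", definiteAas: "T" },
  { codonLabel: "3", definiteAas: "L" },
  { codonLabel: "4", definiteAas: "K" }  // no reference residue at this position
];

var demoReplacements = findReplacements(demoRefAas, demoMemberAas);
console.log(JSON.stringify(demoReplacements));
```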
Once replacements are identified, they are stored in a custom table. The script also calculates biochemical distances, such as Grantham and Miyata distances, to quantify how physicochemically different the replacement residue is from the reference residue.
glue.command(["create", "custom-table-row", "aav_replacement", replacementObj.id]);
glue.command(["set", "field", "grantham_distance_double", grantham_distance_double]);
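One way such a distance could be looked up is with a symmetric table keyed by residue pair. The sketch below is an assumption about the approach, not the script's actual code: `granthamDistance` and `pairKey` are hypothetical helpers, and the table holds only a four-entry excerpt of the Grantham matrix, which should be checked against the published values before any real use.

```javascript
// Small excerpt of the Grantham distance matrix, keyed by sorted residue pair.
// Only four pairs are included; a real table would cover all pairs.
var GRANTHAM_EXCERPT = {
  "I:L": 5,
  "K:R": 26,
  "D:E": 45,
  "C:W": 215
};

// Build an order-independent key so that (L, I) and (I, L) share one entry.
function pairKey(aa1, aa2) {
  return aa1 < aa2 ? aa1 + ":" + aa2 : aa2 + ":" + aa1;
}

// Hypothetical helper: return the distance, 0 for identical residues,
// or null when the pair is missing from the excerpt.
function granthamDistance(refAa, memberAa) {
  if (refAa === memberAa) {
    return 0;
  }
  var distance = GRANTHAM_EXCERPT[pairKey(refAa, memberAa)];
  return distance === undefined ? null : distance;
}

console.log(granthamDistance("L", "I"));
console.log(granthamDistance("W", "C"));
```

Returning `null` for an unknown pair, instead of a default number, keeps a missing table entry from being mistaken for a small distance.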
Recording amino acid variations in this structured manner allows researchers to:
- Monitor evolutionary changes in AAV sequences
- Identify potentially significant mutations
- Maintain a comprehensive, accessible catalog of sequence variations
By focusing on coding regions and systematically logging replacements, AAV Atlas provides a reliable tool for gene therapy and genomic research. This approach supports the development of improved AAV vectors and enhances our understanding of AAV diversity.
AAV Atlas by Robert J Gifford Lab.
For questions, issues, or feedback, please open an issue on the GitHub repository.
For collaboration please contact Dr Robert Gifford.