Exon objects in Biosurfer are considered different if they belong to different Transcript objects, even if they have the same coordinates. However, this leads to duplication of information in the underlying SQL database's exon table and long insert query times when populating the database. (I haven't measured select query times specifically but I imagine they are impacted as well.)
A possible solution might be to have the exon table only store a single row for each unique exon (as defined by its genomic coordinates), while somehow still mapping each Exon object to a combination of a genomic exon and a transcript from the database. It seems like this can be accomplished by having a transcript-exon association table (where column 1 holds transcript IDs, column 2 holds exon IDs, and each row represents a transcript-includes-exon relationship) and then mapping Exon against a join of the association and exon tables, as per [1].
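To make the idea concrete, here is a minimal sketch of the proposed layout using Python's built-in `sqlite3` module rather than Biosurfer's actual ORM setup. Every table name, column name, and helper function below (`transcript_exon`, `start_pos`, `add_exon`, `exons_of`, and so on) is hypothetical and chosen only for illustration. It assumes a unique exon is identified by chromosome, strand, start, and end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transcript (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- One row per unique genomic exon (hypothetical columns).
CREATE TABLE exon (
    id         INTEGER PRIMARY KEY,
    chromosome TEXT    NOT NULL,
    strand     TEXT    NOT NULL,
    start_pos  INTEGER NOT NULL,
    end_pos    INTEGER NOT NULL,
    UNIQUE (chromosome, strand, start_pos, end_pos)
);
-- Association table: each row is one transcript-includes-exon relationship.
CREATE TABLE transcript_exon (
    transcript_id INTEGER NOT NULL REFERENCES transcript(id),
    exon_id       INTEGER NOT NULL REFERENCES exon(id),
    PRIMARY KEY (transcript_id, exon_id)
);
""")


def add_exon(conn, transcript_name, chromosome, strand, start_pos, end_pos):
    """Link an exon to a transcript, reusing the exon row if it already exists."""
    coords = (chromosome, strand, start_pos, end_pos)
    conn.execute(
        "INSERT OR IGNORE INTO transcript (name) VALUES (?)", (transcript_name,)
    )
    conn.execute(
        "INSERT OR IGNORE INTO exon (chromosome, strand, start_pos, end_pos) "
        "VALUES (?, ?, ?, ?)",
        coords,
    )
    conn.execute(
        """
        INSERT OR IGNORE INTO transcript_exon (transcript_id, exon_id)
        SELECT t.id, e.id
        FROM transcript AS t, exon AS e
        WHERE t.name = ?
          AND e.chromosome = ? AND e.strand = ?
          AND e.start_pos = ? AND e.end_pos = ?
        """,
        (transcript_name, *coords),
    )


def exons_of(conn, transcript_name):
    """Return (transcript, chromosome, strand, start, end) rows from the join."""
    return conn.execute(
        """
        SELECT t.name, e.chromosome, e.strand, e.start_pos, e.end_pos
        FROM transcript_exon AS te
        JOIN exon AS e ON e.id = te.exon_id
        JOIN transcript AS t ON t.id = te.transcript_id
        WHERE t.name = ?
        ORDER BY e.start_pos
        """,
        (transcript_name,),
    ).fetchall()


# Two transcripts that share one exon.
add_exon(conn, "TX-A", "chr1", "+", 100, 200)
add_exon(conn, "TX-A", "chr1", "+", 300, 400)
add_exon(conn, "TX-B", "chr1", "+", 100, 200)
```

Here the shared exon is stored once in `exon`, while each row returned by `exons_of` still corresponds to one (transcript, exon) pair, which is what an `Exon` object would be mapped against. Whether this actually shortens insert times would still need to be measured.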
jsaquing added the postponed (This should be handled in the future, but is low priority right now) and design (high-level design ideas, brainstorming) labels on Sep 24, 2021
It's something I'd like to try optimizing eventually, but it isn't necessary right now (hence the postponed label). I just wanted to write this down for future reference.