Cardinalities on subjectVariant, objectTherapeutic, and geneContextQualifier #252

Open
mbrush opened this issue Jan 16, 2025 · 4 comments


mbrush commented Jan 16, 2025

Originally posted by @brendanreardon in #234 (comment):

I just noticed that subjectVariant, geneContextQualifier, and objectTherapeutic in VariantTherapeuticResponseProposition are of limit 1..1. Should this instead be a 1..m to allow for statements that involve multiple subjectVariants or objectTherapeutics?

I can imagine that an argument could be made that a statement involving AND conditions on multiple variants could be represented as a single categorical variant. For therapies, do you intend for implementers to link to therapy groups, which then associates to therapies? It looks like therapy group has a minimum set size of 2 though.

p.s. Still wrapping my head around this trial version of va-spec


mbrush commented Jan 21, 2025

Responses to original comment:

From korikuzma:

I also have a question on TherapyGroups. Before, we had a model for combination and a model for substitutes. Therapy Group is nice, but is there a way where we can indicate the interaction type?

Edit: Oh, is this what groupType is for? It might be good to have this be an enum.

From mbrush:

@korikuzma I had thought this is what `groupType` was for as well, and this is consistent with the value of this field in this example:

However, I want @larrybabb to weigh in, as he told me that TherapyGroup is specifically for cases of combination therapies. And that the CIViC notion of 'substitutes' should be handled by creating separate Statements for each of the individual therapies in a 'substitute group'. I approve of this convention (separate statements for substitutes). But if this is indeed the case it seems we don't need the groupType attribute (as it should always be 'combination') - unless it is there for other purposes?

From mbrush:

@brendanreardon My take is that similar logic applies for sets of variants:

  • If the set represents variations whose co-occurrence is associated with a condition, then I think the subject could be represented as VRS Haplotype or Genotype.
  • If the set represents variants that fit a Categorical Variation definition (e.g. they map to each other across genome references, or they all result in the same protein change, or they are all missense changes impacting the same exon) - then the relevant CatVRS Categorical Variant model would be used to represent the Statement subject.
  • If a data provider wants to say that multiple variants are associated with a condition, but they are completely unrelated and don't fit a Categorical Variation definition - then separate Statements should be made to relate each variant to the condition.
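The three cases above can be sketched as a small decision function. This is only an illustration of the convention described in the list, not part of va-spec: the names `SubjectRepresentation` and `choose_subject_representation` are hypothetical, and checking co-occurrence before the Categorical Variation test is an assumption about precedence that the list does not state.

```python
from enum import Enum


class SubjectRepresentation(Enum):
    """Hypothetical labels for the three options in the list above."""
    HAPLOTYPE_OR_GENOTYPE = "VRS Haplotype or Genotype"
    CATEGORICAL_VARIANT = "Cat-VRS Categorical Variant"
    SEPARATE_STATEMENTS = "one Statement per variant"


def choose_subject_representation(
    co_occurrence_is_associated: bool,
    fits_categorical_definition: bool,
) -> SubjectRepresentation:
    """Pick how a set of variants should be represented as a Statement subject."""
    # Case 1: the co-occurrence of the variants is what is associated with the condition.
    if co_occurrence_is_associated:
        return SubjectRepresentation.HAPLOTYPE_OR_GENOTYPE
    # Case 2: the variants share a Categorical Variation definition.
    if fits_categorical_definition:
        return SubjectRepresentation.CATEGORICAL_VARIANT
    # Case 3: unrelated variants, so each one gets its own Statement.
    return SubjectRepresentation.SEPARATE_STATEMENTS
```

In this reading, a set of unrelated variants never becomes a single multi-valued subject, which is why subjectVariant can stay at a maximum of one value.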

From korikuzma:

@mbrush Ah okay. Ya, that makes sense. I would also be in favor of removing groupType if it will always be combination. Additionally, I'd suggest renaming to include Combination in the name.

From brendanreardon:

@mbrush Thank you so much, especially over the holidays, for those thoughts. Is the primary concept around limits for subjectVariant, geneContextQualifier, and objectTherapeutic being 0..1 or 1..1 that a single object is intended to be passed to these fields, which may then reference 1..n entities? This seems to be how therapies are handled at the moment, with objectTherapeutic accepting 1..1 objects, which can be either a Therapeutic (1..1) or TherapyGroup (2..n).
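A minimal sketch of the pattern described in the paragraph above: a field that takes exactly one object, where that object may itself hold several entities. The class and field names are hypothetical Python stand-ins (snake_case, simplified types), not the actual va-spec schema; only the 1..1 and 2..n limits come from the discussion.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Therapeutic:
    label: str


@dataclass
class TherapyGroup:
    therapies: List[Therapeutic]

    def __post_init__(self) -> None:
        # The thread notes that a therapy group has a minimum set size of 2.
        if len(self.therapies) < 2:
            raise ValueError("TherapyGroup requires at least 2 therapies")


@dataclass
class TherapeuticResponseProposition:
    # Exactly one subject object (placeholder type; real subjects are variant objects).
    subject_variant: str
    # Exactly one object, which is either a single Therapeutic or a TherapyGroup.
    object_therapeutic: Union[Therapeutic, TherapyGroup]
    # Optional qualifier, at most one value.
    gene_context_qualifier: Optional[str] = None


# Usage: several therapies still occupy a single object_therapeutic slot.
group = TherapyGroup([Therapeutic("Therapy A"), Therapeutic("Therapy B")])
proposition = TherapeuticResponseProposition("Variant X", group)
```

The multiplicity lives inside the wrapped object rather than on the proposition field itself.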

CIViC got around using lists/arrays for multiple co-occurring variants by implementing molecular profiles, and we'll have to figure something out in CatVRS eventually. Still, it may be worthwhile reconsidering these limits, at least for geneContextQualifier. For example, implementers would want to be able to fill in details for both BCR and ABL1 in VariantTherapeuticResponsePropositions that involve BCR::ABL1, and fusions more generally.

Something that Wes, I think, has said on the cat-vrs calls at times is that the schema should be flexible enough to enable implementers to be as vague or as specific as they want or need to be. You and the va-spec group have been super thoughtful in crafting this version. My worry, as an outsider, about these fields only supporting 1 value per statement is coming from a place of concern that it may be too strict and thus alienate some users.

What do you think?

From mbrush:

Hi Brendan. Your explanation above is well aligned with my thinking, and the rationale for crafting the schema as we did. As for the cardinality of the geneContextQualifier attribute - you raise a scenario that I don't think we considered when we restricted this to having max 1 value. We will discuss how to address this on an upcoming call, or in a dedicated ticket. Thanks!

ahwagner moved this to Backlog in VA-Spec on Jan 22, 2025
mbrush added this to the VA 1.0 milestone on Jan 22, 2025
larrybabb moved this from Backlog to In Progress in VA-Spec on Jan 22, 2025
@larrybabb

MB, LB, and AW discussed a resolution for this on the 2025-01-22 call.
MB will respond to the geneContextQualifier cardinality concern (it is used only when needed to further qualify the variant; fusions inherently reference the transcripts, and therefore the genes, that they fuse).
MB will create an issue for AW to explain rationale and use of therapyGroup.groupType


mbrush commented Jan 23, 2025

The geneContextQualifier attribute is there to add info about the context of a variant that is not apparent from the subjectVariant. This is not a required field, but it is useful for many simpler variants / SNVs where we cannot infer the gene context from the representation of the variant.

Gene fusions are necessarily described in the context of transcripts, which are in turn described in the context of Genes. The genes are included in the CatVRS representation of these fusion variants, so there is no need for a geneContextQualifier here - it would be duplicative and could accidentally pass on contradictory information.
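As a rough illustration of the duplication and contradiction risk, an implementer could lint their own data along these lines. The function `gene_context_warnings` and its inputs are hypothetical and not part of va-spec or Cat-VRS; it assumes the genes already carried by the variant representation are available as a plain list of symbols.

```python
from typing import List, Optional


def gene_context_warnings(
    genes_in_variant: List[str],
    gene_context_qualifier: Optional[str],
) -> List[str]:
    """Return warnings when a qualifier repeats or conflicts with the variant's own genes."""
    warnings: List[str] = []
    # No qualifier supplied, or the variant carries no gene info (e.g. a simple SNV):
    # the qualifier is either absent or doing its intended job.
    if gene_context_qualifier is None or not genes_in_variant:
        return warnings
    warnings.append("geneContextQualifier duplicates gene information carried by the variant")
    if gene_context_qualifier not in genes_in_variant:
        warnings.append("geneContextQualifier contradicts the genes carried by the variant")
    return warnings


# Usage with the BCR::ABL1 fusion mentioned earlier in the thread.
print(gene_context_warnings(["BCR", "ABL1"], "ABL1"))
```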

@brendanreardon if this addresses your concern, go ahead and close this issue.


Re: TraitSet.groupType: Alex clarified that a 'substitutes' value is warranted in one specific scenario found in CIViC data and the cancer literature. A study is done on a cohort composed of individuals with Variant X and one of Conditions A, B, or C, and the treatment response is determined from an aggregate statistical analysis of this cohort. We cannot say whether there is sufficient statistical power to draw a conclusion for each condition individually, because the analysis wasn't done this way. So we should not create separate statements that assert an association between the variant and treatment in the context of each condition separately. Doing so could introduce erroneous conclusions that the study does not actually support.

Similarly for TherapyGroup.groupType: some cohorts are assembled to include patients with Variant V and Condition C, where the intervention may be any one of Therapy A, B, or C. Typically these are all members of the same class of drug (e.g. EGFR inhibitors). For the same reasons, we don't have the statistical power to say that the association holds for each treatment independently, given how the study / analysis was performed.

So, while groupType for TraitSets and TherapyGroups will most often be Combination, in this specific type of situation a groupType of Substitutes can be used.

@ahwagner will ensure that this is all very clearly documented in our specification / implementation guidance, with examples.

Finally, we acknowledged that in the future we will be able to formalize representation of things like Drug Classes (e.g. 'EGFR Inhibitors') that can be the subjects of statements in these cases, such that we may not need to use the groupType = Substitutes pattern.


mbrush commented Feb 3, 2025

Outcome of 1-30-25 Call:

  • TraitSets and TherapyGroups are intended to hold arrays of members that are either OR-ed or AND-ed
  • we will change the name of groupType to membershipOperator, and include this attribute in both TraitSet and TherapyGroup
  • the data type for this attribute will be a string with an enumeration of {"AND", "OR"} as permissible values
  • we will leave complex operations involving nested combinations of ANDs and ORs for future development
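For implementers, the outcome above might look roughly like the following. This is a sketch under assumptions, not the published schema: the Python names (`MembershipOperator`, `membership_operator`, `therapies`) are hypothetical, and only the two permissible string values and the flat, non-nested grouping come from the call notes. Reading "AND" as the former Combination and "OR" as the former Substitutes is my interpretation of the earlier comments.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Union


class MembershipOperator(str, Enum):
    """String enumeration with the two permissible values from the call."""
    AND = "AND"
    OR = "OR"


@dataclass
class TherapyGroup:
    therapies: List[str]  # placeholder: real members would be therapy objects
    membership_operator: Union[MembershipOperator, str]

    def __post_init__(self) -> None:
        # Coerce plain strings; any value other than "AND" or "OR" raises ValueError.
        self.membership_operator = MembershipOperator(self.membership_operator)
        # Minimum group size of 2, as noted earlier in the thread.
        if len(self.therapies) < 2:
            raise ValueError("TherapyGroup requires at least 2 therapies")


# Usage: a cohort treated with any one of several therapies, analysed in aggregate.
substitutes = TherapyGroup(["Therapy A", "Therapy B", "Therapy C"], "OR")
print(substitutes.membership_operator.value)
```

Nested mixes of AND and OR are deliberately not modelled here, in line with the last bullet.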
