Downstream plugin inconsistently reports a leading wild type (unmutated) amino acid #342

Open
susannasiebert opened this issue Aug 5, 2020 · 10 comments

@susannasiebert

We've encountered situations where the Downstream plugin sometimes reports one leading wild type amino acid and sometimes does not. For example, this user (griffithlab/pVACtools#576), using the GRCh37 cache, reports results that do not include a leading wild type amino acid, while this user's prediction on GRCh38 (griffithlab/pVACtools#484) does return a leading wild type amino acid. I'm not sure whether the different reference builds are actually the cause, but in our variant prediction pipelines using GRCh38 and VEP 95 we also encounter the leading wild type amino acid.

Is this intentional and, if so, is there a heuristic to determine when a Downstream prediction includes this leading wild type amino acid?

@susannasiebert changed the title from "Downstream plugin inconsistently reports a leading wild type amino acid" to "Downstream plugin inconsistently reports a leading wild type (unmutated) amino acid" Aug 5, 2020
@at7 self-assigned this Aug 6, 2020
@at7
Contributor

at7 commented Aug 6, 2020

Hi @susannasiebert,

The Downstream plugin never reports a leading wild type amino acid. It always reports the amino acid of the first codon that is affected by the variant allele.
In release 100 we introduced the option of shifting variant alleles in repetitive regions. The new functionality is switched off by default except for the HGVS calculation. The user from griffithlab/pVACtools#576, who uses VEP release 100 and also requests HGVS output, gets a different Downstream amino acid sequence than with VEP release 95 or with VEP release 100 without the --hgvs option.
The difference is due to the shifted position of the variant allele in release 100.
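
To illustrate why a shifted position can change the reported sequence, here is a toy sketch in Python. It is not VEP or Downstream plugin code. The coding sequence, the helper names, and the assumption that shifting moves a deletion to the 3'-most equivalent position (as in HGVS conventions) are all made up for illustration. Deleting any base of a repeat yields the same mutant sequence, but the codon counted as "first affected" depends on where the deletion is placed.

```python
from itertools import product

# Standard genetic code in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate complete codons of cds, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def shift_3prime(cds, pos):
    """Move a single-base deletion (0-based pos) as far 3' as the repeat allows."""
    while pos + 1 < len(cds) and cds[pos + 1] == cds[pos]:
        pos += 1
    return pos


def downstream_from(cds, del_pos):
    """Mutant protein, starting at the codon that contains the deleted base."""
    mutant = cds[:del_pos] + cds[del_pos + 1:]
    first_codon = del_pos // 3
    return translate(mutant)[first_codon:]


# Hypothetical CDS: ATG GCA AAA CTG GGC TGG TAA -> M A K L G W
# The run of four A's spans codons 2 and 3.
CDS = "ATGGCAAAACTGGGCTGGTAA"

unshifted = 5                          # first A of the run (codon 2, "GCA")
shifted = shift_3prime(CDS, unshifted)  # last A of the run (codon 3, "AAA")

print(translate(CDS))                   # wild type protein
print(downstream_from(CDS, unshifted))  # starts with the unchanged "A"
print(downstream_from(CDS, shifted))    # starts with the first changed residue
```

In this toy case the unshifted deletion falls in a codon whose amino acid is unchanged, so the reported sequence begins with a wild type residue, while the shifted deletion falls in the next codon and the sequence begins with the first altered residue.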

But I need to emphasise that I would not expect --hgvs to have any effect on the calculations in the Downstream plugin. This looks like a bug to me and we need to investigate further. I will keep you updated here on any progress we make. For now, I recommend that the user not use --hgvs when annotating input data that is then used by your tool.

Best wishes,
Anja

@at7 added the bug label Aug 6, 2020
@susannasiebert
Author

Please see griffithlab/pVACtools#576 (comment) for example results from running VEP 100 with and without the --hgvs flag. As suspected, not using the --hgvs flag does result in a leading wildtype amino acid in the Downstream protein prediction.
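
For a tool consuming these predictions, one possible check is sketched below in Python. This is only an assumption about how such a heuristic could look, not something VEP or pVACtools is confirmed to do. The function name and arguments are hypothetical, and it assumes the caller knows the 1-based protein position at which the Downstream sequence starts.

```python
def has_leading_wildtype(wt_protein, protein_pos, downstream_seq):
    """Return True if the first Downstream residue equals the wild type
    residue at the 1-based position protein_pos.

    Hypothetical helper: assumes downstream_seq starts at protein_pos.
    """
    if not downstream_seq or not (1 <= protein_pos <= len(wt_protein)):
        return False
    return downstream_seq[0] == wt_protein[protein_pos - 1]


# Made-up wild type protein and two made-up Downstream predictions.
print(has_leading_wildtype("MAKLGW", 2, "ANWAG"))  # first residue matches wild type
print(has_leading_wildtype("MAKLGW", 3, "NWAG"))   # first residue already differs
```

A tool could use such a check to strip the unchanged residue before building peptides, provided the starting position it relies on is the same one the plugin used.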

@at7
Contributor

at7 commented Aug 7, 2020

Thank you for letting us know. We are still working on a fix and will let you know as soon as we have updated the VEP code.

@susannasiebert
Author

Thank you for looking into it. On a (marginally) related note: would it be possible to output the version of a plugin in the plugin's VCF header entry? This would make it possible for our downstream tools to ensure that the new/fixed plugin was used during annotation.

@susannasiebert
Author

@at7 I was wondering if there had been any updates on this issue?

@at7
Contributor

at7 commented Oct 6, 2020

Hi @susannasiebert,
We are still working on a bug fix and hope to ship it with Ensembl release 102, which we plan to release by the end of October. I will keep you updated here on any progress.

Best regards,
Anja

@susannasiebert
Author

Wonderful. Thank you!

@at7
Contributor

at7 commented Nov 30, 2020

Hi Susanna,
I would like to give you an update on this issue. We have been working on a bug fix, but we still need more time for testing. I will give you another update this week with a more concrete date for when we can provide the fix.
Thank you very much for your patience.
Anja

@susannasiebert
Author

@at7 we were able to resolve this issue on our end by switching to a plugin we wrote ourselves, called Frameshift, which reports the full mutated transcript protein sequence for frameshift mutations.

Is there a process to submit new plugins to VEP for consideration?

@at7
Contributor

at7 commented Feb 11, 2021

We are happy to receive pull requests for new VEP plugins. Our contribution guide is here. Please let me know if you have any more questions.

Best wishes,
Anja

@jamie-m-a assigned likhitha-surapaneni and unassigned at7 Mar 29, 2022