Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue #606 - catch index error for variants at alignment gap #609

Closed

Conversation

kyuhas
Copy link

@kyuhas kyuhas commented Dec 17, 2020

Hi Reece,

I tried adding unit tests in test_hgvs_variantmapper.py for the scenarios described in issue 606. However, I couldn't add them because of the cache that was being used. I was unable to open the cache-py3.hdp file. Would you be able to help me with this? I'd like to add a couple of tests to confirm that the correct exception is being thrown.

Thanks!
Kaylee

@coveralls
Copy link

coveralls commented Dec 17, 2020

Pull Request Test Coverage Report for Build 352

  • 38 of 38 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.02%) to 92.416%

Totals Coverage Status
Change from base Build 347: 0.02%
Covered Lines: 2169
Relevant Lines: 2347

💛 - Coveralls

@reece
Copy link
Member

reece commented Dec 17, 2020

Happy to help! Thanks for give this a shot.

Did you find/confirm the underlying cause? Do you have test cases? Please see tests/issues/test_* files for examples and put your tests in a new file. Also, a diagram of the issue would be very valuable. (test_437.py has a good example)

You don't edit cache-py3.hdp directly. Instead, you run tests in a special mode to add new data required for tests. For example:

$ HGVS_CACHE_MODE=learn make test
$ HGVS_CACHE_MODE=learn pytest tests/issues/test_606.py

The learn mode will fetch and cache and new data.

@kyuhas
Copy link
Author

kyuhas commented Dec 18, 2020

Thanks Reece, that was super helpful! While I don't think this code change solves the core issue (which seems to be related to the _get_altered_sequence method), this at least solves my use case of throwing a more meaningful error. So this wouldn't close issue 606 (since the core issue isn't addressed), but it would make the integration with my code a little easier. I'll be working on some diagrams and will update this PR early next week.

Thanks!

@reece
Copy link
Member

reece commented Dec 19, 2020

Hi @kyuhas. I would prefer to understand the root cause and solve that once rather than expect all callers to wrap. I look forward to seeing your assessment of what's happening and appreciate your time.

@kyuhas
Copy link
Author

kyuhas commented Dec 23, 2020

Hi Reece,
I have updated the pull request to (1) add unit tests, and (2) move the try/catch for Index Errors into the "_get_altered_sequence" method where the error occurs. _get_altered_sequence is only used in variantmapper.py, and only when the variant is at an "alignment gap". So the idea is that if we already know something is funky with the alignment, and then you end up with an index error, throwing the HGVSInvalidIntervalError seems appropriate.

There seems to be multiple root causes possible, two of which are illustrated by the unit tests.

The first unit test example, NC_000001.10:g.16890441C>G has transcript NM_017940.4 with 25 exons in the 'transcript' alignment to itself, but 28 exons in the 'splign' alignment to the reference sequence. So in this case, ultimately the root cause is inconsistent data.

The second unit test example, NM_001291722.1:c.283-3C>T, seems to be invalid input (this is the example from issue 606 #606). According to the transcript in UTA, exon 5 starts at position 286 of the transcript, not position 283. The input "NM_001291722.1:c.286-3C>T" does not give an index error.

In both cases, in the "_get_altered_sequence" method, either bad data or bad input ultimately causes pos_start/pos_end to be outside of the bounds of the seq array, causing the index error.

I am not yet familiar enough with hgvs to identify if there is a more appropriate spot to identify the bad data/bad input-- it will take a while to fully understand.

Unfortunately, we have a time-sensitive need to handle this error. Let me know if this updated pull request is a good fix for now. If not, then I'll have my code catch Index Errors when calling the hgvs package until this is resolved some time in the future.

Thanks!

Base automatically changed from master to main February 26, 2021 21:22
Copy link

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale Issue is stale and subject to automatic closing label Dec 13, 2023
Copy link

This PR was closed because it has been stalled for 7 days with no activity.

@github-actions github-actions bot closed this Dec 21, 2023
@reece reece reopened this Feb 19, 2024
@reece reece requested a review from a team as a code owner February 19, 2024 17:45
@reece reece added resurrected and removed stale Issue is stale and subject to automatic closing labels Feb 19, 2024
@reece
Copy link
Member

reece commented Feb 19, 2024

This issue was closed by stalebot. It has been reopened to give more time for community review. See biocommons coding guidelines for stale issue and pull request policies. This resurrection is expected to be a one-time event.

Copy link

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale Issue is stale and subject to automatic closing label Mar 21, 2024
@ahwagner ahwagner removed the stale Issue is stale and subject to automatic closing label Mar 26, 2024
Copy link

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale Issue is stale and subject to automatic closing label Apr 26, 2024
Copy link

github-actions bot commented May 3, 2024

This PR was closed because it has been stalled for 7 days with no activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-by-stale resurrected stale Issue is stale and subject to automatic closing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants