Hello,
I found this behavior to be a bit unintuitive. I would be happy to submit a PR to address it, but it's not necessarily clear what the right behavior is. Here is an MRE; the issue is that calculating diversity on a range that falls between variants raises an exception, rather than just returning zero pairwise diversity:
Because the selected range resides between two variants, `pos.locate_range()` errors out with a `KeyError`. Here is the full traceback:
Generally, I think addressing this by modifying the behavior of `pos.locate_range()` is probably the wrong approach. Instead, we could wrap the call in a `try/except` block in `sequence_diversity()` and similar functions, and immediately return 0 if there are no variants within the region of interest (though perhaps, if no bases are accessible there, it should be `np.nan` instead? With the usual caveat that this is a bit of a hack for dealing with missing values).
Additionally, it's worth pointing out that the behavior is similar if no bases are accessible but the range does include variants:
The issue here is that `mask_inaccessible()` removes all positions that are inaccessible. In this case, I think `np.nan` might be a more desirable return value too?
I am curious what you think.
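To make the proposal a bit more concrete, here is a rough sketch of the wrapper idea. The name `safe_diversity` is hypothetical and not part of the library. The sketch assumes that the wrapped function accepts `pos`, `ac`, `start`, `stop` and `is_accessible` the way `sequence_diversity()` does, that both failure cases above surface as a `KeyError`, and that `is_accessible` is indexed from 0 while `start` and `stop` are 1-based and inclusive. It uses `math.nan` in place of `np.nan` only to keep the example free of dependencies.

```python
import math


def safe_diversity(diversity_func, pos, ac, start, stop, is_accessible=None):
    """Hypothetical wrapper (not library API): call diversity_func and turn
    the "no variants in range" KeyError into a return value."""
    try:
        return diversity_func(
            pos, ac, start=start, stop=stop, is_accessible=is_accessible
        )
    except KeyError:
        # Assumption: KeyError here means no usable variants in [start, stop].
        if is_accessible is not None:
            # Assumption: 0-based mask, 1-based inclusive start/stop.
            n_accessible = sum(bool(x) for x in is_accessible[start - 1:stop])
            if n_accessible == 0:
                # Nothing was callable in the window, so diversity is undefined.
                return math.nan
        # The window was (at least partly) callable but held no variants.
        return 0.0
```

With this, something like `safe_diversity(allel.sequence_diversity, pos, ac, start, stop)` would return 0.0 for a window between variants, and NaN when an accessibility mask shows that no base in the window is accessible. Whether catching `KeyError` this broadly is acceptable is part of what I am unsure about.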
thanks,
Vince