Hi scikit-allel team and thanks for doing a good job!
While checking the implementation of Hudson's Fst estimator, I got the impression that, despite the reference to Bhatia et al. (2013), the allel.stats.hudson_fst() function does not account for the systematic bias of using $π_{within}$ as an estimator for $H_w$ and $π_{between}$ as an estimator for $H_b$. It just does
, right? These are the naive estimators for the numerator and denominator sensu Bhatia et al. (2013); the way to correct them is shown in equation (10) of the paper (and there is a nice section on its justification and derivation in the Supplementary Materials).
where $n_1$ and $n_2$ are the numbers of alleles sampled from populations 1 and 2, and $p_1$ and $p_2$ are the corresponding sample allele frequencies.
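For reference, the bias-corrected Hudson estimator for a single biallelic site, as equation (10) of Bhatia et al. (2013) is usually quoted, can be written as follows. This is restated from the paper rather than taken from this thread, so it is worth checking against the original; the symbols match the definitions above.

$$
\hat{F}_{ST}^{\mathrm{Hudson}} = \frac{(p_1 - p_2)^2 - \dfrac{p_1(1 - p_1)}{n_1 - 1} - \dfrac{p_2(1 - p_2)}{n_2 - 1}}{p_1(1 - p_2) + p_2(1 - p_1)}
$$

The two subtracted terms in the numerator are the finite-sample corrections; without them, $(p_1 - p_2)^2$ overestimates the squared difference of the population frequencies.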
Did I miss the place where the bias is eventually accounted for, or should the function be modified (or at least the note about using an unbiased estimator removed)? I can try to work on fixing the function if that is the case.
Cheers,
Nikita
Update: I implemented the formula above and compared it with hudson_fst(); the results are the same. Apparently I am missing something in the source code...
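One possible explanation for the identical results, sketched under the assumption that the within-population term is a mean pairwise difference taken over the $n(n-1)/2$ distinct pairs of sampled alleles (so it already carries the $n/(n-1)$ correction): in that case "between minus within" is algebraically the same as the corrected numerator from Bhatia et al. (2013), and "between" is the same as its denominator. The NumPy-only check below compares the two forms on random biallelic counts. The function names `pairwise_form` and `bhatia_form` are hypothetical helpers for illustration, not part of scikit-allel.

```python
import numpy as np


def pairwise_form(a1, n1, a2, n2):
    """Numerator/denominator built from mean pairwise differences.

    a1, a2: alternate-allele counts per site; n1, n2: alleles sampled.
    Assumes biallelic sites (hypothetical helper, not scikit-allel code).
    """
    # Within-population diversity: differing pairs / all n*(n-1)/2 pairs.
    pi1 = 2 * a1 * (n1 - a1) / (n1 * (n1 - 1))
    pi2 = 2 * a2 * (n2 - a2) / (n2 * (n2 - 1))
    within = (pi1 + pi2) / 2
    # Between-population diversity: differing pairs / all n1*n2 pairs.
    between = (a1 * (n2 - a2) + (n1 - a1) * a2) / (n1 * n2)
    return between - within, between


def bhatia_form(a1, n1, a2, n2):
    """Numerator/denominator with explicit sample-size corrections."""
    p1 = a1 / n1
    p2 = a2 / n2
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


# Random alternate-allele counts at 1000 sites for two sample sizes.
rng = np.random.default_rng(0)
n1, n2 = 20, 34
a1 = rng.integers(0, n1 + 1, size=1000)
a2 = rng.integers(0, n2 + 1, size=1000)

num_pw, den_pw = pairwise_form(a1, n1, a2, n2)
num_bh, den_bh = bhatia_form(a1, n1, a2, n2)

print("numerators agree:  ", np.allclose(num_pw, num_bh))
print("denominators agree:", np.allclose(den_pw, den_bh))
```

If the within-population term were instead the plug-in quantity $2 p (1 - p)$ without the $n/(n-1)$ factor, the numerators would differ, which would be the biased case the question is concerned about.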