Question: Is allel.stats.hudson_fst() actually unbiased following Bhatia et al. 2013? #407

Open
taprs opened this issue Feb 13, 2024 · 1 comment

taprs commented Feb 13, 2024

Hi scikit-allel team and thanks for doing a good job!

While checking the implementation of Hudson's Fst estimator, I got the impression that, despite the reference to Bhatia et al. (2013), the allel.stats.hudson_fst() function does not account for the systematic bias of using $π_{within}$ as an estimator for $H_w$ and $π_{between}$ as an estimator for $H_b$. It seems to just compute

$$ F_{st} = { { π_{between}-π_{within} } \over { π_{between} } } $$

Is that right? These are the naive estimators for the numerator and denominator in the sense of Bhatia et al. (2013). The way to correct them is shown in equation (10) of the paper, and the Supplementary Materials include a nice section on its justification and derivation.

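$$ \hat{F}_{st}^{Hudson} = { { (p_1 - p_2)^2 - { p_1 (1 - p_1) \over n_1 - 1 } - { p_2 (1 - p_2) \over n_2 - 1 } } \over { p_1 (1 - p_2) + p_2 (1 - p_1) } } $$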

where $n_1$ and $n_2$ are the numbers of alleles sampled in populations 1 and 2, and $p_1$ and $p_2$ are the corresponding sample allele frequencies.

Did I miss the place where the bias is accounted for, or should the function be modified (or at least the note about using an unbiased estimator removed)? I can try to work on fixing the function if that is the case.

Cheers,
Nikita

taprs commented Feb 14, 2024

Update: I implemented the formula above and compared it with hudson_fst(), and the results are the same. Apparently I am missing something in the source code...
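One possible explanation for the agreement, offered as a sketch rather than a reading of the actual source: if $π_{within}$ is computed as the number of differing allele pairs divided by the number of pairs $n(n-1)/2$, rather than as $2p(1-p)$, then the $n/(n-1)$ correction is already built in, and the naive-looking ratio is algebraically the same as equation (10). The check below does this for a single biallelic site with exact fractions. The function names are hypothetical and nothing here calls scikit-allel, so whether hudson_fst() computes $π_{within}$ this way is an assumption to confirm in the code.

```python
from fractions import Fraction


def pi_within(alt, n):
    """Mean pairwise difference within one population at a biallelic site.

    Differing pairs (alt * ref) divided by the number of pairs n*(n-1)/2.
    Assumes n >= 2.
    """
    return Fraction(alt * (n - alt), n * (n - 1) // 2)


def pi_between(alt1, n1, alt2, n2):
    """Mean pairwise difference between two populations at a biallelic site."""
    return Fraction(alt1 * (n2 - alt2) + (n1 - alt1) * alt2, n1 * n2)


def fst_from_pi(alt1, n1, alt2, n2):
    """(pi_between - pi_within) / pi_between, averaging the two within terms."""
    within = (pi_within(alt1, n1) + pi_within(alt2, n2)) / 2
    between = pi_between(alt1, n1, alt2, n2)
    return (between - within) / between


def fst_bhatia(alt1, n1, alt2, n2):
    """Equation (10) written with sample frequencies and sample sizes."""
    p1, p2 = Fraction(alt1, n1), Fraction(alt2, n2)
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num / den


# Hypothetical counts: 3 of 10 alleles in population 1, 12 of 20 in population 2.
example = (3, 10, 12, 20)
assert fst_from_pi(*example) == fst_bhatia(*example)
```

The identity behind it is that $2p(1-p) \cdot n/(n-1) = 2p(1-p) + 2p(1-p)/(n-1)$, and $p_1(1-p_2) + p_2(1-p_1) - p_1(1-p_1) - p_2(1-p_2) = (p_1 - p_2)^2$, so the two numerators match term by term while the denominators are identical.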
