
Input Format

Katie Siewert edited this page Jan 15, 2019 · 12 revisions

BetaScan takes a tab-separated file with three columns. The first column contains the coordinate of each variant. The second contains the frequency of the derived allele at that variant, given as a count of haploid individuals (note: this is the opposite of the BALLET software). In practice, for folded Beta only, it doesn't matter whether the derived, ancestral, or already-folded allele frequency is used in the second column, because BetaScan folds the frequency anyway. The third column contains the sample size, in number of haploid individuals, used to calculate the frequency of that variant. The file should be sorted by position (the Unix command sort -g will do this for you). If you are using the Beta2 statistic, substitutions should be coded as SNPs with frequency equal to the sample size. The scan should be run on each chromosome separately. An example input file is below:

14	2	99
15	99	99
25	1	100
47	99	100
48	82	95
98	100	100
103	10	100
245	93	96
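
If you are generating this file from your own variant calls, the format described above can be produced with a few lines of Python. The following is only a minimal sketch, not part of BetaScan: the function names `format_betascan_rows` and `fold` are hypothetical, the range check (count between 0 and the sample size) is an assumption about what counts as sensible input, and `fold` uses the standard minor-allele definition of folding, which may not match BetaScan's internal code exactly.

```python
def format_betascan_rows(variants):
    """Return tab-separated text for (position, allele_count, sample_size) tuples.

    Hypothetical helper: sorts by position and writes one variant per line.
    """
    rows = []
    for pos, count, n in variants:
        # Assumption: a count outside 0..n is treated as an input error.
        if not 0 <= count <= n:
            raise ValueError(f"count {count} outside 0..{n} at position {pos}")
        rows.append((pos, count, n))
    rows.sort(key=lambda row: row[0])  # the file must be sorted by position
    return "".join(f"{pos}\t{count}\t{n}\n" for pos, count, n in rows)


def fold(count, n):
    """Standard folded (minor-allele) count; shown only to illustrate folding."""
    return min(count, n - count)


if __name__ == "__main__":
    # A few rows from the example file, deliberately out of order.
    example = [(25, 1, 100), (14, 2, 99), (98, 100, 100)]
    print(format_betascan_rows(example), end="")
```

The row at position 98 has a count equal to its sample size, which is how a substitution would be coded for Beta2. Run the helper once per chromosome, since the scan expects each chromosome in its own file.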