Hi,
I don't know if this is the right forum for this post, as it is not so much an issue as a question about best-practice usage. Is there a mailing list or similar for usage questions?
Anyway, I'm trying out tsinfer on a data set that includes some 20 populations (species / subspecies) and three outgroups. To begin with, we are interested in analyzing a subset of about 12 populations. My question is whether one should build trees using only the subset, or use all of the populations and then subset the trees for the analyses. Also, we have derived the ancestral states based on the outgroups. Should the outgroups be excluded from tree building? Is there a best-practice way forward?
Cheers,
Per
Replies: 1 comment 15 replies
Hi Per - good to hear you are trying out tsinfer. @benjeffery has kindly turned on "GitHub discussions" and moved this question there, so hopefully others will also use this as a forum for tsinfer questions.
In answer to your question, it's early days for working out "best practice", but normally we would advise putting in as much data as is available. I think this would include the outgroups too, as long as you are reasonably sure that the ancestral state for the entire clade (including the outgroups) is correct.
It would be sensible to exclude groups that are poorly sequenced, but I suspect that it will be individual sites that will be poorly known (or perhaps where the ancestral state is uncertain). These can be excluded from the entire inference process by setting …
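To make the site-level advice above concrete, here is a minimal sketch in plain Python of how one might flag unreliable sites while keeping every population in the data. It is not tsinfer's API: the `Site` record, its `missing_fraction` field, the `positions_to_exclude` helper, and the 10% threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    # Hypothetical per-site summary; not a tsinfer or tskit class.
    position: int                    # genomic coordinate of the variant
    ancestral_state: Optional[str]   # None when the outgroups give no clear answer
    missing_fraction: float          # fraction of samples without a genotype call


def positions_to_exclude(sites: List[Site], max_missing: float = 0.1) -> List[int]:
    """Return the sorted positions of sites judged too unreliable for inference.

    A site is flagged if its ancestral state is unknown or if more than
    `max_missing` of the samples lack a call (an arbitrary, assumed threshold).
    """
    excluded = set()
    for site in sites:
        if site.ancestral_state is None or site.missing_fraction > max_missing:
            excluded.add(site.position)
    return sorted(excluded)


sites = [
    Site(101, "A", 0.00),
    Site(250, None, 0.02),  # ancestral state uncertain
    Site(377, "G", 0.35),   # poorly sequenced across samples
    Site(512, "T", 0.05),
]
print(positions_to_exclude(sites))  # [250, 377]
```

The resulting list of positions would then be passed to whichever inference setting excludes sites; the samples themselves, including the outgroups, stay in the data set.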