Post-processing is supposed to remove the virtual-root-like ancestor (0) and then split the ultimate ancestor (1) into multiple roots whenever its children change. However, it seems that this process misses some cases. For example:
import msprime
import tsinfer
import numpy as np

sim_ts = msprime.sim_ancestry(
    100,
    sequence_length=1e6,
    random_seed=2,
    recombination_rate=1e-8,
    population_size=1e4,
)
sim_mts = msprime.sim_mutations(sim_ts, rate=1.29e-8, random_seed=2)
sample_data = tsinfer.SampleData.from_tree_sequence(sim_mts)
base_ts = tsinfer.infer(sample_data, post_process=False)
ts = tsinfer.post_process(base_ts)

roots = []
edges_right = ts.edges_right
edges_parent = ts.edges_parent
problem_trees = []
for tree in ts.trees():
    if len(tree.roots) == 1:  # skip first and last trees
        root = tree.root
        roots.append(root)
        right = edges_right[edges_parent == root]
        if not np.all(right == right[0]):  # some children change at different positions
            problem_trees.append(tree.index)
print(f"{len(problem_trees)} root changes are missed out of {len(roots)} roots")  # 116 out of 1244
Here is one of the problematic trees:
tree = ts.at_index(351)
print(f'Root: {tree.root}, children = {tree.children(tree.root)}')
# Root: 1744, children = (1683, 1151)
tree.prev()
print(f'Root: {tree.root}, children = {tree.children(tree.root)}')
# Root: 1744, children = (1683, 1027, 1335)
The calculation of root breaks in split_ultimate_ancestor seems to be correct in this example: I modified post_process to return the root_breaks, and they match the children_changes I calculated above exactly. So I think the problem might be in the step that modifies the tables to add new roots, but I'm not sure where the bug is yet.
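For readers following along, here is a minimal pure-Python sketch of one way the positions at which a parent's children change could be computed from an edge list. It is not the tsinfer implementation: the function name `children_change_positions` and the toy edges are hypothetical, and it assumes adjacent edges with the same parent and child have already been merged, so that every edge endpoint marks a real change.

```python
def children_change_positions(edges, parent):
    """Return the sorted positions where the child set of `parent` changes.

    `edges` is an iterable of (left, right, parent, child) tuples.
    Assumes edges are already squashed (no adjacent duplicates).
    """
    positions = set()
    for left, right, p, _child in edges:
        if p == parent:
            # A child is gained at `left` and lost at `right`.
            positions.add(left)
            positions.add(right)
    return sorted(positions)


# Toy edges for a hypothetical ultimate ancestor with node ID 1.
edges = [
    (0, 4, 1, 5),
    (0, 10, 1, 6),
    (4, 10, 1, 7),
]
breaks = children_change_positions(edges, parent=1)
print(breaks)
```

In this toy case child 5 is replaced by child 7 at position 4, so 4 is the only internal break.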
I just found #850 and tskit-dev/tsdate#452, which discuss this. I'm confused: aren't all the local root nodes created by splitting up the ultimate ancestor (1) whenever its children change? I don't understand why we would expect to have local roots with children that change over multiple trees after post-processing. @hyanwong, could you please clarify?
I had made a silly error here: I forgot that simplification will remove the ultimate root nodes where there is a younger MRCA, and there is obviously no guarantee that these new roots will have the same children over multiple trees. Even when the ultimate root node is present in the simplified tree, the edges with that node as parent will not always share the same left and right coordinates, because simplification will also merge adjacent edges with the same parent and child. Since it's not a bug, I'm closing the issue.
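To illustrate the edge-merging point, here is a small pure-Python sketch. It is a toy model rather than tskit's simplify: `squash_edges` and the node IDs are hypothetical, and how such adjacent edges arise in real tsinfer output is not modelled. It only shows that once adjacent edges with the same parent and child are merged, the edges below a single root need not share the same right coordinate, which is exactly what the check in the original script flags.

```python
def squash_edges(edges):
    """Merge adjacent (left, right, parent, child) edges that share parent and child."""
    merged = []
    for left, right, parent, child in sorted(edges, key=lambda e: (e[2], e[3], e[0])):
        last = merged[-1] if merged else None
        if last and last[2] == parent and last[3] == child and last[1] == left:
            # Extend the previous edge instead of adding a new one.
            merged[-1] = (last[0], right, parent, child)
        else:
            merged.append((left, right, parent, child))
    return merged


# Hypothetical root node 9: child 1 spans both intervals, children 2 and 3 do not.
edges = [
    (0, 5, 9, 1),
    (5, 10, 9, 1),
    (0, 5, 9, 2),
    (5, 10, 9, 3),
]
squashed = squash_edges(edges)
rights = [right for (_left, right, parent, _child) in squashed if parent == 9]
all_same_right = len(set(rights)) == 1
print(squashed, all_same_right)
```

Here the two edges to child 1 collapse into one edge covering the whole interval, so the right coordinates under root 9 differ and the tree would be counted as a "problem tree" even though nothing is wrong.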