Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defining reference path on graph #532

Open
ManuelTgn opened this issue Sep 28, 2023 · 3 comments
Open

Defining reference path on graph #532

ManuelTgn opened this issue Sep 28, 2023 · 3 comments

Comments

@ManuelTgn
Copy link

Hi team,

thank you for developing this amazing tool!

I have a couple of questions regarding odgi graphs manipulation through the Python API.

I created an odgi graph from a gfa containing > 1000 haplotypes + 1 reference sequence. Is there a way to understand which path belongs to the reference sequence? All paths seem to point to an haplotype.

While traversing the paths embedded in the graph, is it possible to understand to which position correspond a nucleotide in a node, using the reference coordinates space?

Thanks for the help!

@subwaystation
Copy link
Member

Hi @ManuelTgn,

since I don't know in detail how you created the graphs, it is hard to tell. In ODGI's variation graph model all paths are equal, there is no sense of a reference path.

If you want to use ODGI to translate path positions to your path of choice, please take a look at https://odgi.readthedocs.io/en/latest/rst/tutorials/navigating_and_annotating_graphs.html#path-to-path-position-mapping.

@ManuelTgn
Copy link
Author

Hi @subwaystation,

thank you for the reply. That was very helpful!

I created my input graph using the vg toolkit, enriching a chromosome with some variants stored in a phased VCF. Are there python functions in ODGI's API to replicate the command line functionalities showed in the documentation page you shared?

@subwaystation
Copy link
Member

subwaystation commented Oct 11, 2023

Hi @ManuelTgn,

I see, so no crazy variants with crazy repeats. That's good :)

As far as I know, the Python API only exposes basic ODGI functionality. You would have to build such code on your own. Take a look at

#pragma omp parallel for schedule(dynamic,1)
for (auto& path_pos : path_positions) {
// TODO we need a better input format
pos_t pos;
step_handle_t step_handle_graph_pos;
// handle the lift into the target graph
if (lifting) {
lift_result_t source_result;
pos_t _pos = get_graph_pos(source_graph, path_pos, step_handle_graph_pos);
if (id(_pos) && get_position(source_graph, lift_path_set_source, _pos, source_result, step_handle_graph_pos, true)) {
pos = get_graph_pos(target_graph,
{ target_graph.get_path_handle(
source_graph.get_path_name(
source_graph.get_path_handle_of_step(
source_result.ref_hit))),
(uint64_t)source_result.path_offset,
source_result.is_rev_vs_ref },
step_handle_graph_pos);
} else {
pos = make_pos_t(0,false,0); // couldn't lift
}
} else {
//path_pos = _path_pos;
pos = get_graph_pos(target_graph, path_pos, step_handle_graph_pos);
}
lift_result_t result;
//std::cerr << "Got graph pos " << id(pos) << std::endl;
if (id(pos)) {
if (give_graph_pos) {
#pragma omp critical (cout)
std::cout << "#source.path.pos\ttarget.graph.pos" << std::endl
<< (lifting ? source_graph.get_path_name(path_pos.path) : target_graph.get_path_name(path_pos.path))
<< "," << path_pos.offset << "," << (path_pos.is_rev ? "-" : "+")
<< "\t" << id(pos) << "," << offset(pos) << "," << (is_rev(pos) ? "-" : "+") << std::endl;
} else if (get_position(target_graph, ref_path_set, pos, result, step_handle_graph_pos, true)) {
bool ref_is_rev = false;
path_handle_t p = target_graph.get_path_handle_of_step(result.ref_hit);
#pragma omp critical (cout)
std::cout << "#source.path.pos\ttarget.path.pos\tdist.to.ref\tstrand.vs.ref" << std::endl
<< (lifting ? source_graph.get_path_name(path_pos.path) : target_graph.get_path_name(path_pos.path)) << ","
<< path_pos.offset << "," << (path_pos.is_rev ? "-" : "+") << "\t"
<< target_graph.get_path_name(p) << "," << result.path_offset << "," << (ref_is_rev ? "-" : "+") << "\t"
<< result.walked_to_hit_ref << "\t" << (result.is_rev_vs_ref ? "-" : "+") << std::endl;
}
}
}
as in inspiration.

Best,
Simon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants