Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writing maps shouldn't require a ligand #382

Closed
pslacerda opened this issue Feb 2, 2025 · 16 comments · Fixed by #384 · May be fixed by #383
Closed

Writing maps shouldn't require a ligand #382

pslacerda opened this issue Feb 2, 2025 · 16 comments · Fixed by #384 · May be fixed by #383

Comments

@pslacerda
Copy link
Contributor

pslacerda commented Feb 2, 2025

Hi,

I'm optimizing my workflow for speed and observed that I should use precomputed vina/vinardo maps.

The task is done by first computing maps for the target receptor, then following by application of maps in batch mode. The problem is that the command vina requires the user (me) to provide the --ligand or --batch option if I want to --write-maps. This isn't reasonable.

I'm using v1.2.5 by command line (not Python) because I want Windows support and can't find v1.2.6 on Anaconda builds.

A limitation I observed in the --batch mode is that it requires to pass many arguments, one for each ligand to be tried. This easily reaches the command line length limit, which are 2097152 and 830473 characters on Fedora Linux 41 and Windows, respectively. It sould be supported another way to enable batch processing which don't relies on command line, maybe iterating over contents of a folder.

I'm blaming the if-elif at

if (vm.count("ligand")) {
and
} else if (vm.count("batch")) {
not having one more elif clause with a single v.write_maps(out_maps); call. The v.set_ligand_from_file(ligand_names); shouldn't be required for such cases. Also, there are multiple calls to v.write_maps(out_maps); indicating a confusing logic (in my opinion).

I'd like to work on a pull request on this issue, it's feasible?

Best Regards,
Pedro

@diogomart
Copy link
Member

Hi,

I agree that --write_maps shouldn't require a ligand. If you open a PR for that it would be appreciated.

About --batch, it could take either the directory name and then find the files, or explicit filenames. This functionality exists in AutoDock GPU. It is also possible to use the python bindings for batch docking, just loop over the ligands and dock one at a time. If you open a PR for this please make it separate from the write_maps PR.

Thank you for looking into this.

@pslacerda
Copy link
Contributor Author

Just found a Github bug! It will close this issue as soon as any one of these PR get accepted. In this particular occasion both should got into to close this issue.

Image

@pslacerda
Copy link
Contributor Author

Sorry, my bad, I mixed the PRs. I'll touch this in a few days.

@rwxayheee
Copy link
Collaborator

rwxayheee commented Feb 10, 2025

Hi @pslacerda

Looking back at this issue. I remember that I discussed with other users here before, on the same topic, whether to compute maps or initialize ligands first. One of the discussions I could find is: #320

You might want to take a quick look at the documentation:
https://autodock-vina.readthedocs.io/en/latest/docking_python.html

Specifically the documentation mentioned that:

There is a small subility here, the behavior of the compute_vina_maps() function changes if the ligand was loaded before or after computing the vina maps. If no ligand was initialized, compute_vina_maps() will compute the affinity map for each atom types defined in the Vina forcefield (22 in total). 

Based on the description I think by design the compute maps funtionality wants to know what atom types are needed, although I never checked closely the codes.

Regarding this issue you brought up:

The problem is that the command vina requires the user (me) to provide the --ligand or --batch option 

There was a time I also wanted to retrieve the Vina maps (to make some figures showing druggability of a binding site) without docking a specific ligand. For that, I initialize with a mock ligand, not a real one but just enough atom types to get the maps I wanted. This is one workaround to get maps without a real ligand.

@pslacerda
Copy link
Contributor Author

Hey @rwxayheee,

How did you made the druggability figures? It's relevant for me.

But I didn't understood the workflow of computing the maps with the ligand first and then doing the dock with the same ligand. Why compute the map and to dock in different Vina calls?

@rwxayheee
Copy link
Collaborator

rwxayheee commented Feb 11, 2025

Hi @pslacerda

How did you made the druggability figures?

The maps contain volumetric data. The easiest way to process from our maps and to visualize them in PyMOL would be to convert it to DX format. See these references:
https://www.mdanalysis.org/GridDataFormats/gridData/formats/OpenDX.html
https://apbs.readthedocs.io/en/latest/formats/opendx.html

But I didn't understood the workflow of computing the maps with the ligand first and then doing the dock with the same ligand. Why compute the map and to dock in different Vina calls?

I was suggesting this as a workaround to write maps without a ligand, the condition you were trying to handle in this issue.

Why compute the map and to dock in different Vina calls?

I have never done this, but potentially useful when someone wants to manipulate the maps with an additional step (for biased search?).

@pslacerda
Copy link
Contributor Author

Cool!

However I don't understand the maps file format. It is full of zeroes, I suppose that density is zero on such voxels. And there's a map file for each atom type, these types are for the receptor grid, alright? How are these grids related to ligands?

Another related question is the --force_even_voxels option that seems to be an exposed implementation detail. I'd rather remove this option and force all maps to be even.

@rwxayheee
Copy link
Collaborator

Hi @pslacerda

It is full of zeroes,

There can be many zeros (especially if your grid box is large) when the grid points are far away from the receptor. Near the receptors, we will see some non-zero values. You can re-format it into DX file. Actually you can just override the header part and PyMOL will be able to visualize the DX file.

And there's a map file for each atom type, these types are for the receptor grid, alright? How are these grids related to ligands?

// Check that all the affinity map are present for ligands/flex residues (if initialized already)
if (m_ligand_initialized) {
atom_type::t atom_typing = m_scoring_function->get_atom_typing();
szv atom_types = m_model.get_movable_atom_types(atom_typing);

These are atom types in the ligand. This is also why it could be better if ligand atoms types are known before computing maps, to compute just enough types of maps.

Not sure on the --force_even_voxels option.. Because I didn't write this program, I would prefer not to change the behavior unless it's a bug that impacts my outputs.

@pslacerda
Copy link
Contributor Author

pslacerda commented Feb 11, 2025 via email

@rwxayheee
Copy link
Collaborator

Hi @pslacerda

So are the grid points how specific ligand atom types will interact with
the receptor? If a atom is positioned in non-zero point, it will interact
with the specified "strength"? How the point value is used on the score
function?

Yes, the grid values indicate how specific ligand atom types will interact with the receptor.
Currently the grid values are used if energies are computed from --local_only and --search_only. But docking calculation outputs direct interactions, not using the grid values. Please see this discussion:
#232 (comment)
When AutoDock4 maps are supplied, the maps are the only reference for the potential and the grid values are always used to compute the energies.

For batch runs probably full maps should be compiled because ligands aren't
know at first.

Yes I would just write all possible maps at the beginning. This should be already supported in Python, as described in the documentation.

Pretty much like to parse and write maps
simultaneously have no practical use

Agree, but to me it's not a big problem to have them as non-conflicting flags. It's because we have only one entry point, but have to support many functions :'(

@pslacerda
Copy link
Contributor Author

But docking calculation outputs direct interactions, not using the grid values.

Did you get confused? If the grid is used, when a atom recurrently is positioned on such a point while searching, there isn't need to compute pairwise interactions. Am I wrong?

When AutoDock4 maps are supplied, the maps are the only reference for the potential and the grid values are always used to compute the energies.

Vina and Vinardo maps aren't used?

Agree, but to me it's not a big problem to have them as non-conflicting flags. It's because we have only one entry point, but have to support many functions :'(

Not a big problem, just a bit less friendlier.

@rwxayheee
Copy link
Collaborator

rwxayheee commented Feb 11, 2025

Hi @pslacerda

If the grid is used, when a atom recurrently is positioned on such a point while searching, there isn't need to compute pairwise interactions. Am I wrong?

Yes, what you said was incorrect. Please read the linked issue. Grid maps have limited precision.

Vina and Vinardo maps aren't used?

When AD4 maps are supplied, which means scoring function is chosen to be ad4. In this case, direct interactions are never computed.

@pslacerda
Copy link
Contributor Author

@rwxayheee I'm confused.

If --local_only is set, v.optimize() is done without interaction maps as they are imprecise.

} else if (local_only) {
std::vector<double> energies;
energies = v.optimize();
v.write_pose(out_name);
v.show_score(energies);

If a global search is done (not --score_only or neither --local_only), the computation uses the interaction maps.

} else if (!m_map_initialized) {
throw vina_runtime_error("Cannot do the global search. Affinity maps were not initialized.");

Don't local optimization should be applied after the global search (seems it isn't done by main.cpp)?

@rwxayheee
Copy link
Collaborator

rwxayheee commented Feb 11, 2025

Hi @pslacerda
It depends on whether the optimize or the score function uses maps. You can find them in vina.cpp, and honestly, I don't remember..

@rwxayheee
Copy link
Collaborator

rwxayheee commented Feb 11, 2025

I was under the impression only global_search does the full eval of ligand conformation (model) independently of grids. This is also what I read from the previous issues & comments.

You can take a closer look at vina.cpp and model.cpp and tell us if that's correct. In model.cpp, there are eval functions that either compute interactions from atom pairs, or pre-calculated per atoms. It's not very apparent from the docking workflow (main.cpp) and can't just be interpreted by names which ones are using maps, which ones are not.

@pslacerda
Copy link
Contributor Author

Yes @rwxayheee, it does optimize each pose by quasi_newton_par after docking.

VINA_FOR_IN(i, poses){
VINA_FOR(p, 5){
m_non_cache.slope = 100 * std::pow(10.0, 2.0*p);
quasi_newton_par(m_model, m_precalculated_byatom, m_non_cache, poses[i], g, authentic_v, evalcount);
if(m_non_cache.within(m_model))
break;
}

But I don't know if m_precalculated_byatom indicates if the model is being precalculated by atom using maps.

Anyway, I'm giving up.

Do you know if the related PRs will be reviewed and maybe merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants