Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support importing (some) virtual sites in Interchange.from_openmm #1081

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

mattwthompson
Copy link
Member

@mattwthompson mattwthompson commented Oct 21, 2024

Description

Resolves #1063

Checklist

Stuff to do in later PRs

  • Process LocalCoordinatesSites
  • Handle getting positions of virtual sites mixes from OpenMM and SMIRNOFF sources
  • Export virtual sites from OpenMM to GROMACS
  • Refactor VirtualSiteType to be more explicit
  • Consider storing virtual site and particle maps directly on Interchange objects
  • Prototype subclassing Topology

@mattwthompson mattwthompson added this to the 0.4.2 milestone Oct 21, 2024
Copy link

codecov bot commented Oct 23, 2024

Codecov Report

Attention: Patch coverage is 97.98995% with 4 lines in your changes missing coverage. Please review.

Project coverage is 93.69%. Comparing base (fcf4399) to head (356718f).

Additional details and impacted files

@mattwthompson mattwthompson marked this pull request as ready for review October 24, 2024 17:21
Copy link
Collaborator

@Yoshanuikabundi Yoshanuikabundi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this on a hand-built OpenMM Simulation featuring a single water molecule with made up parameters, and also on a single Tip4P molecule. Things mostly worked, and errors were mostly clear. I do have a few notes:

  • export INTERCHANGE_EXPERIMENTAL=1 seems to be required, but this does not seem to be documented and the error message doesn't make this clear. This was the only time I was confused by an error message.
  • At the moment, the number of atoms in the imported topology must match the number of particles in the system. I think we should consider if topologies matching the number of real atoms in the system should be accepted as well. This would allow OpenFF Topology objects to work when importing systems with supported virtual sites, as they currently do for systems without virtual sites. This makes particular sense given that it appears that virtual sites are stripped out of the OpenFF topology that's available from interchange.topology.
  • The exported OpenMM topology via to_openmm_simulation() had its virtual sites in a separate chain and residue - I think it'd be better if they were in the same chain and residue as the parent atom.
  • This is unrelated to the PR, but I tried a hand-built system without bonds (single water molecule with constraints, but no bonds force), and Interchange complained:
    136 # TODO: Does this run through the Interchange.box validator?
    137 interchange.box = _box_vectors
--> 139 if interchange.topology.n_bonds > len(interchange.collections["Bonds"].key_map):
    140     # There are probably missing (physics) bonds from rigid waters. The topological
    141     # bonds are probably processed correctly.
    142     _fill_in_rigid_water_bonds(interchange)
    144 return interchange

KeyError: 'Bonds'

Since this code currently attempts to account for bond parameters missing from the bond parameters, maybe we should account for this possibility?

A full review for such a big patch is gonna take me some time to work through, especially since I'm gonna need to get my head around how everything works, but those are my notes from some independent testing.

But yeah this is great! Things seem to be working pretty robustly once you come up with a system and topology that works - I was able to swap the topology of simple molecules out for a "real" openff topology after the Interchange was constructed and things kept working. Round trips work, energy minimization works from Interchange land, and visualizations work perfectly.


### Virtual sites must be listed after heavy atoms each molecule

It's assumed that, in each molecule in an OpenMM topology, all heavy atoms are listed before any virtual sites. This includes the case of all virtual sites being listed after all heavy atoms, i.e. not collated into molecules/residues. There are no community standards areound particle ordering, but virtual sites are typically listed after heavy atoms in each molecule or residue.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It's assumed that, in each molecule in an OpenMM topology, all heavy atoms are listed before any virtual sites. This includes the case of all virtual sites being listed after all heavy atoms, i.e. not collated into molecules/residues. There are no community standards areound particle ordering, but virtual sites are typically listed after heavy atoms in each molecule or residue.
It's assumed that, in each molecule in an OpenMM topology, all heavy atoms are listed before any virtual sites. This includes the case of all virtual sites being listed after all heavy atoms, i.e. not collated into molecules/residues. There are no community standards around particle ordering, but virtual sites are typically listed after heavy atoms in each molecule or residue.

Comment on lines 7 to 11
_UNSUPPORTED_VIRTUAL_SITE_TYPES: set[type] = {
openmm.TwoParticleAverageSite,
openmm.OutOfPlaneSite,
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My instinct would be to flip this and explicitly list the supported virtual site types, so that if OpenMM adds a new virtual site type this won't need to be updated.

If there's a nice place/name for it, making this constant public would allow the supported virtual sites to be documented automatically directly from the code.

openff/interchange/components/toolkit.py Outdated Show resolved Hide resolved
Comment on lines +190 to +200
topology._molecule_virtual_site_map = defaultdict(list)

# TODO: This iteration strategy scales horribly with system size - need to refactor -
# since looking up topology atom indices is slow. It's probably repetitive to
# look up the molecule index over and over again
for particle in virtual_sites:
molecule_index = topology.molecule_index(topology.atom(particle.orientation_atom_indices[0]).molecule)

topology._molecule_virtual_site_map[molecule_index].append(particle)

topology._particle_map = openmm_openff_particle_map
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be reasonable to squish topology._particle_map and topology._molecule_virtual_site_map into atom metadata rather than monkey patching? I think the answer is probably "no", and I see this information is moved to the Interchange, and I've tested that Interchange.topology seems to be able to be swapped out for a true topology (with full molecular graphs etc).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly; I don't love the current design but it has the benefit of not interacting with anything the user might do in the public API

@mattwthompson
Copy link
Member Author

Thanks for the review

export INTERCHANGE_EXPERIMENTAL=1 seems to be required

This shouldn't be the case in the 0.4.x line, possibly your local setup is out of date. I've removed some obsolete stuff in tests now that Interchange.from_openmm is not "experimental."

I think we should consider if topologies matching the number of real atoms in the system should be accepted as well

Agree, but something for future work. I want to get this functionality over the finish line in its current scope as soon as I can

virtual sites in a separate chain and residue

Right, this was the result of #1051

we should account for this possibility

Probably yeah, topological edges being defined different ways is pretty frustrating to work with

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support of virtual sites for creating an Interchange from_openmm
2 participants