Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initialize guess_atoms function PR #264

Merged
merged 11 commits into from
Aug 12, 2023
Merged

initialize guess_atoms function PR #264

merged 11 commits into from
Aug 12, 2023

Conversation

cadeduckworth
Copy link
Contributor

  • add helper function to correctly guess atom/element types from topology
  • helper function guess_atoms used in rdkit_conversion function
  • temporary changes pending MDAnalysis guessers class bug fixes

@codecov
Copy link

codecov bot commented Jul 30, 2023

Codecov Report

Merging #264 (3f90ad2) into develop (10922e8) will increase coverage by 0.17%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #264      +/-   ##
===========================================
+ Coverage    81.12%   81.29%   +0.17%     
===========================================
  Files           15       15              
  Lines         1960     1978      +18     
  Branches       296      297       +1     
===========================================
+ Hits          1590     1608      +18     
  Misses         278      278              
  Partials        92       92              
Files Changed Coverage Δ
mdpow/workflows/base.py 83.07% <100.00%> (+5.52%) ⬆️
mdpow/workflows/dihedrals.py 95.97% <100.00%> (+0.04%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@cadeduckworth
Copy link
Contributor Author

@orbeckst please review

I did not try it out on any cases outside of the notebook during development, however, I did incorporate usage into the dihedrals module and also added the test for the base module. Everything passed locally, so hopefully there are no issues with the CI tests.

If you see any issues please let me know. If everything looks good I will update my SAMPL9 environment, do a test run, and add documentation once we are sure everything is working as intended.

If you would like more detail on the new loop added for creating the boolean array used for identifying problem cases I can upload the updated notebook here.. just let me know!

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start, especially having a test right away. But I'd like the code to be nicer (see comments) and documented.

Please also add a test case that includes dummies and/or other problematic inputs that you encountered.

mdpow/tests/test_workflows_base.py Outdated Show resolved Hide resolved
mdpow/tests/test_workflows_base.py Outdated Show resolved Hide resolved
mdpow/tests/test_workflows_base.py Outdated Show resolved Hide resolved
mdpow/tests/test_workflows_base.py Outdated Show resolved Hide resolved
@@ -173,3 +178,34 @@ def automated_project_analysis(project_paths, ensemble_analysis, **kwargs):

logger.info('all analyses completed')
return

def guess_elements(universe_atom_group):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to the simpler

Suggested change
def guess_elements(universe_atom_group):
def guess_elements(atoms):

and then add documentation that explains parameters and returns. It really can be any atomgroup that contains correct masses.

# universe_atom_group = universe.atoms
# universe from MDPOW FEP project results
names = universe_atom_group.names
masses = universe_atom_group.masses
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally I would say "we need to add checks that we can get masses etc" but within MDPOW we now that we use TPRs so I am fine with just documenting it.

Comment on lines 183 to 185
# input from dihedrals module
# universe_atom_group = universe.atoms
# universe from MDPOW FEP project results
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove comments, the function is more general

Comment on lines 193 to 201
i=0
while i < len(masses):
if masses[i] != 0:
value = math.isclose(guessed_masses[i], masses[i], rel_tol=0.001)
value = not(value)
else:
value = False
problem_cases.append(value)
i+=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ugly. Can't we get numpy.isclose() get to work? And don't use math if we already have numpy.

If you provide me with a minimal example where the old code didn't work, I'll have a look. (Basically, add a test that fails with the old numpy-based code)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like

problems = np.logical_not(np.isclose(masses, guessed_masses))
problems[np.isclose(masses, 0)] = False

should set any entries where masses are zero to False, i.e., they are not an issue.

Comment on lines 184 to 187
from .base import guess_elements
# temporary
# keep here to avoid circular imports?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I currently don't see that base imports dihedrals in any way, or did I overlook something? What's your rationale for not putting the imports at the top of the file?

- make sure that we always get a list, even if we didn't have any problem elements
- add documentation
@orbeckst
Copy link
Member

This needs some debugging. Ideally you can put whatever kills it at the moment in a test case.

GROMACS masses are not exactly identical to MDAnalysis masses.
We need a relative tolerance of at least 1e-3 to reliably
match them.
- improved guess_elements() by avoiding slow loops: use array look ups
- guess ALL elements
- updated CHANGES
- removed ALL loops and only use array lookups
- manually set all zero masses to DUMMY (MW is detected as DUMMY, DUMMY as D, so
  this makes all dummies DUMMY for consistency)
- hard code ATOL=1e-6 for detecting dummies with very small discrepancy from 0
- updated docs
- tests for approximate masses
@orbeckst orbeckst self-assigned this Aug 12, 2023
@orbeckst orbeckst merged commit b8842a1 into develop Aug 12, 2023
11 checks passed
@orbeckst orbeckst deleted the guess-atoms branch August 12, 2023 10:24
@orbeckst
Copy link
Member

@cadeduckworth see the bug MDAnalysis/mdanalysis#4167 for LigParGen atom types and the fix PR MDAnalysis/mdanalysis#4168 which is now in the latest released version of MDA. Do we still need our own fix?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants