Efficiency when searching for SMIRKS matches #12

Open
bannanc opened this issue Apr 26, 2018 · 1 comment
Labels
long term Caitlin did not have time to address

Comments

@bannanc
Member

bannanc commented Apr 26, 2018

This will be the first of a number of issues that I am migrating from the smarty repo, that is, problems around chemical perception searching that we discussed but chose not to address directly in that code.
I'm not actually sure if this will end up being as big a problem here as it was with smarty/smirky.

Essentially, looping through molecules and SMIRKS patterns was the cause of smirky's slowness. Discussion is available at smarty issue #261. Here is the original text:

I wanted to get an issue started for this that can be used for documentation. I continue to believe that it isn't worth investing significant time into speeding up SMIRKY now, but certain functions from this code will likely carry over to future move proposal engines.

This issue will focus on places where the code can be more efficient, not on making "smarter" chemical moves, though that is also going to be important.

At the OFF meeting this week, Daniel Smith has been helping me diagnose what is causing the code to be slow. We've identified get_typed_molecules as a particularly problematic part of the code; it scales as N^4.
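For documentation purposes, the loop structure described above can be sketched roughly as follows. This is only an illustration, not the actual get_typed_molecules code: `type_molecules`, `toy_match`, and the string "molecules" are hypothetical stand-ins, and the "last matching pattern wins" rule is an assumption about how types are assigned. The point is that one substructure search is issued for every (molecule, pattern) pair, and each search has its own internal cost on top of that.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

# Hypothetical matcher signature: returns the atom-index tuples in `mol`
# that are matched by `smirks`. A real matcher would call a cheminformatics
# toolkit; that call is the expensive part.
MatchFn = Callable[[str, Hashable], List[Tuple[int, ...]]]


def toy_match(smirks: str, mol: str) -> List[Tuple[int, ...]]:
    """Toy stand-in: a 'molecule' is a string of element symbols and a
    'pattern' is a single symbol, or '*' to match every atom."""
    if smirks == "*":
        return [(i,) for i in range(len(mol))]
    return [(i,) for i, symbol in enumerate(mol) if symbol == smirks]


def type_molecules(
    smirks_list: Sequence[str],
    molecules: Sequence[Hashable],
    match_fn: MatchFn,
) -> Tuple[Dict[Hashable, Dict[Tuple[int, ...], int]], int]:
    """Assign every matched atom tuple the index of the last pattern that
    matches it (assumed convention), and count the matcher calls."""
    typed: Dict[Hashable, Dict[Tuple[int, ...], int]] = {}
    n_calls = 0
    for mol in molecules:                             # loop over molecules
        assignments: Dict[Tuple[int, ...], int] = {}
        for idx, smirks in enumerate(smirks_list):    # loop over patterns
            for atoms in match_fn(smirks, mol):       # loop over matches
                assignments[atoms] = idx              # later patterns override
            n_calls += 1
        typed[mol] = assignments
    return typed, n_calls


typed, n_calls = type_molecules(["*", "O"], ["CCO", "CO"], toy_match)
print(n_calls)  # one matcher call per (molecule, pattern) pair
```

With M molecules and P patterns this issues M × P searches every time the molecules are typed, so the cost is paid again on every move that changes the pattern list.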

@bannanc
Member Author

bannanc commented Jul 2, 2019

I did not bring this up in the preprint, but I think it's something worth thinking about. We don't want SMIRKS generation to take significant computing time; it should be the fast step. It's worth thinking about whether there is a way to flatten the loops currently required for typing molecules.

This was an issue Daniel and I found really early on. However, I think it's possible it's a bigger openforcefield question. Basically, the problem is that looping over all molecules and all SMIRKS patterns doesn't scale well when you have hundreds of molecules. In testing ChemPer, I found that running 5,000 Reducer steps for torsions on the 20 amino acid polypeptide required 5ish hours on my laptop.
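One possible direction, sketched here purely as an assumption and not as anything ChemPer currently does: cache the match results per (pattern, molecule) pair, so that a step which changes only one SMIRKS re-runs the substructure search for that pattern alone. `CachedTyper` and `toy_match` are hypothetical names, the molecules are toy strings, and "last matching pattern wins" is again an assumed typing rule.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

# Hypothetical matcher signature, standing in for a real toolkit call.
MatchFn = Callable[[str, Hashable], List[Tuple[int, ...]]]


def toy_match(smirks: str, mol: str) -> List[Tuple[int, ...]]:
    """Toy stand-in: a 'molecule' is a string of element symbols and a
    'pattern' is a single symbol, or '*' to match every atom."""
    if smirks == "*":
        return [(i,) for i in range(len(mol))]
    return [(i,) for i, symbol in enumerate(mol) if symbol == smirks]


class CachedTyper:
    """Types a fixed set of molecules, remembering earlier match results."""

    def __init__(self, molecules: Sequence[Hashable], match_fn: MatchFn):
        self.molecules = list(molecules)
        self.match_fn = match_fn
        # (smirks, molecule) -> matched atom tuples; molecules must be hashable
        self.cache: Dict[Tuple[str, Hashable], List[Tuple[int, ...]]] = {}
        self.n_calls = 0  # number of real (uncached) matcher calls

    def matches(self, smirks: str, mol: Hashable) -> List[Tuple[int, ...]]:
        key = (smirks, mol)
        if key not in self.cache:
            self.cache[key] = self.match_fn(smirks, mol)
            self.n_calls += 1
        return self.cache[key]

    def type_all(
        self, smirks_list: Sequence[str]
    ) -> Dict[Hashable, Dict[Tuple[int, ...], int]]:
        typed: Dict[Hashable, Dict[Tuple[int, ...], int]] = {}
        for mol in self.molecules:
            assignments: Dict[Tuple[int, ...], int] = {}
            for idx, smirks in enumerate(smirks_list):
                for atoms in self.matches(smirks, mol):
                    assignments[atoms] = idx  # later patterns override
            typed[mol] = assignments
        return typed


typer = CachedTyper(["CCO", "CO"], toy_match)
typer.type_all(["*", "O"])        # first pass: every pair is searched
typer.type_all(["*", "O", "C"])   # only the new pattern "C" is searched
print(typer.n_calls)
```

This does not remove the Python loops themselves; it only avoids repeating the expensive searches, at the cost of memory that grows with the number of distinct patterns tried. Whether that trade is worthwhile for thousands of Reducer steps would need profiling.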
