[WIP] Integration for CP2K #383
Open
nwinner wants to merge 55 commits into hackingmaterials:main from nwinner:cp2k
Commits
c856382 Fixed import. (nwinner)
8ff38d1 A lot of clean-up. Had not been addressed in a while by the devs it l… (nwinner)
87ea810 Added lammps_input_set so that atom_style can be retained as a variable (nwinner)
29d7e94 Playing with the NEB workflow. Might try to figure out a way to (nwinner)
6c845e6 Currently thinking it would be best to use env_chk with vasp_neb_cmd … (nwinner)
02776ff fix (nwinner)
68e5465 fix (nwinner)
2600dca Minor correction to glue_tasks. If reading in the WAVECAR (which coul… (nwinner)
f294cf3 Lammps should support env_chk (nwinner)
2665de6 Lammps should support env_chk (nwinner)
c2a17fa Exploring a new firetask. (nwinner)
6c3872d Debug. (nwinner)
5108d19 Lammps should support env_chk (nwinner)
21b9fb2 Trying something different. Adding a seperate file for my user task. (nwinner)
17a466a Added a brief function to transmute after relaxing. (nwinner)
9b1cd17 Working on this new task. (nwinner)
914b9cd Working on Lammps run_calc (nwinner)
7005625 Debug (nwinner)
1d2e29e Lammps Run (nwinner)
e3d670f Working on it. (nwinner)
748e0c9 debug (nwinner)
9bc9563 Lammps should support env_chk (nwinner)
e46e77a Lammps should support env_chk (nwinner)
31830ba Debug (nwinner)
39d04a7 debug (nwinner)
33e6987 debug (nwinner)
e298e49 debug (nwinner)
d71f00c debug (nwinner)
f76d774 Debug. (nwinner)
b087ea2 First design of cp2k module for atomate. (nwinner)
6262805 First commit for the cp2k module in atomate. Pretty rough, but to start (nwinner)
d22d72d Continuing testing and refinement. Added database and drone functions… (nwinner)
b137ee9 File copying is a little messy right now. The change of file names ac… (nwinner)
9168197 Wavefunction files are bytes files. File copying needs to handle that. (nwinner)
ab1bd74 File copying is now to the point where things will at least run (nwinner)
7051dd7 Merge with master (nwinner)
10f846e Merge https://github.com/hackingmaterials/atomate into cp2k (nwinner)
229a187 user_tasks.py not needed, now in MPMorph (nwinner)
808cb60 Database was missing kwargs, led to errors. (nwinner)
58f2152 Cleanup. (nwinner)
361cad5 Beginning to integrate the ability to do defect workflows by building (nwinner)
c0e7acd Debugging last commit on the cluster. (nwinner)
226eabe Minor refinements (nwinner)
0eaded2 Minor refinements (nwinner)
b8867c5 Typo. (nwinner)
2dd859a Testing from very simple cell_opt wfs on the cluster. (nwinner)
b801820 Database manipulation for CalcDb task (nwinner)
0bac0d9 Missing import statement. (nwinner)
f0cc95a Drones: fft can only apply to array, not array of arrays (nwinner)
6edb5db Incorrect hartree file parsing. Works now, but might change again later. (nwinner)
c67165e Bug. (nwinner)
6d9c4e1 Corrections to the drone assimilation. (nwinner)
2b3b579 Written incorrectly. It was working because the underlying sets were not (nwinner)
60c15a9 Fixing write inputs.py. (nwinner)
4201338 Quick commit. Saving state. (nwinner)
@@ -0,0 +1,281 @@

    # coding: utf-8

    from monty.json import MontyEncoder
    from monty.serialization import loadfn

    """
    This module defines the database classes.
    """

    import zlib
    import json
    from bson import ObjectId

    from pymatgen.electronic_structure.bandstructure import (
        BandStructure,
        BandStructureSymmLine,
    )
    from pymatgen.electronic_structure.dos import CompleteDos

    import gridfs
    from pymongo import ASCENDING, DESCENDING

    from atomate.utils.database import CalcDb
    from atomate.utils.utils import get_logger

    __author__ = "Nicholas Winner"

    logger = get_logger(__name__)


    class Cp2kCalcDb(CalcDb):
        """
        Class to help manage database insertions of CP2K drones
        """

        def __init__(
            self,
            host="localhost",
            port=27017,
            database="cp2k",
            collection="tasks",
            user=None,
            password=None,
            **kwargs
        ):
            super(Cp2kCalcDb, self).__init__(
                host, port, database, collection, user, password, **kwargs
            )

        def build_indexes(self, indexes=None, background=True):
            """
            Build the indexes.

            Args:
                indexes (list): list of single field indexes to be built.
                background (bool): Run in the background or not.

            TODO: make sure that the index building is sensible and check for
                existing indexes.
            """
            _indices = (
                indexes
                if indexes
                else [
                    "formula_pretty",
                    "formula_anonymous",
                    "output.energy",
                    "output.energy_per_atom",
                    "dir_name",
                ]
            )
            self.collection.create_index(
                "task_id", unique=True, background=background
            )
            # build single field indexes
            for i in _indices:
                self.collection.create_index(i, background=background)
            # build compound indexes
            for formula in ("formula_pretty", "formula_anonymous"):
                self.collection.create_index(
                    [
                        (formula, ASCENDING),
                        ("output.energy", DESCENDING),
                        ("completed_at", DESCENDING),
                    ],
                    background=background,
                )
                self.collection.create_index(
                    [
                        (formula, ASCENDING),
                        ("output.energy_per_atom", DESCENDING),
                        ("completed_at", DESCENDING),
                    ],
                    background=background,
                )

        def insert_task(self, task_doc, use_gridfs=False):
            """
            Inserts a task document (e.g., as returned by Drone.assimilate()) into the database.
            If use_gridfs is set, the DOS of the last calculation (calcs_reversed[0]) is removed
            from the document and stored in the "dos_fs" GridFS collection; the inserted document
            then records only the GridFS file id and the compression type.

            Args:
                task_doc: (dict) the task document
                use_gridfs: (bool) store the DOS in GridFS
            Returns:
                (int) - task_id of inserted document
            """
            dos = None

            # move the DOS from the doc to GridFS
            if use_gridfs and "calcs_reversed" in task_doc:

                if (
                    "dos" in task_doc["calcs_reversed"][0]
                ):  # only store idx=0 (last step)
                    dos = json.dumps(
                        task_doc["calcs_reversed"][0]["dos"], cls=MontyEncoder
                    )
                    del task_doc["calcs_reversed"][0]["dos"]

            # insert the task document
            t_id = self.insert(task_doc)

            # insert the dos into gridfs and update the task document
            if dos:
                dos_gfs_id, compression_type = self.insert_gridfs(
                    dos, "dos_fs", task_id=t_id
                )
                self.collection.update_one(
                    {"task_id": t_id},
                    {
                        "$set": {
                            "calcs_reversed.0.dos_compression": compression_type
                        }
                    },
                )
                self.collection.update_one(
                    {"task_id": t_id},
                    {"$set": {"calcs_reversed.0.dos_fs_id": dos_gfs_id}},
                )

            return t_id

        def retrieve_task(self, task_id):
            """
            Retrieves a task document and, if the DOS was stored in GridFS, unpacks it as a dict

            Args:
                task_id: (int) task_id to retrieve

            Returns:
                (dict) complete task document with the DOS included

            """
            task_doc = self.collection.find_one({"task_id": task_id})
            calc = task_doc["calcs_reversed"][0]
            if "dos_fs_id" in calc:
                dos = self.get_dos(task_id)
                calc["dos"] = dos.as_dict()
            return task_doc

        def insert_gridfs(
            self, d, collection="fs", compress=True, oid=None, task_id=None
        ):
            """
            Insert the given document into GridFS.

            Args:
                d (str): the JSON-serialized document
                collection (string): the GridFS collection name
                compress (bool): Whether to compress the data or not
                oid (ObjectId()): the _id of the file; if specified, it must not already exist in GridFS
                task_id (int or str): the task_id to store into the gridfs metadata
            Returns:
                file id, the type of compression used.
            """
            oid = oid or ObjectId()
            compression_type = None

            if compress:
                d = zlib.compress(d.encode(), compress)  # the bool doubles as the zlib level: True means level 1
                compression_type = "zlib"

            fs = gridfs.GridFS(self.db, collection)
            if task_id:
                # Putting task id in the metadata subdocument as per mongo specs:
                # https://github.com/mongodb/specifications/blob/master/source/gridfs/gridfs-spec.rst#terms
                fs_id = fs.put(
                    d,
                    _id=oid,
                    metadata={"task_id": task_id, "compression": compression_type},
                )
            else:
                fs_id = fs.put(
                    d, _id=oid, metadata={"compression": compression_type}
                )

            return fs_id, compression_type

        def get_band_structure(self, task_id):
            m_task = self.collection.find_one(
                {"task_id": task_id}, {"calcs_reversed": 1}
            )
            fs_id = m_task["calcs_reversed"][0]["bandstructure_fs_id"]
            fs = gridfs.GridFS(self.db, "bandstructure_fs")
            bs_json = zlib.decompress(fs.get(fs_id).read())
            bs_dict = json.loads(bs_json.decode())
            if bs_dict["@class"] == "BandStructure":
                return BandStructure.from_dict(bs_dict)
            elif bs_dict["@class"] == "BandStructureSymmLine":
                return BandStructureSymmLine.from_dict(bs_dict)
            else:
                raise ValueError(
                    "Unknown class for band structure! {}".format(bs_dict["@class"])
                )

        def get_dos(self, task_id):
            m_task = self.collection.find_one(
                {"task_id": task_id}, {"calcs_reversed": 1}
            )
            fs_id = m_task["calcs_reversed"][0]["dos_fs_id"]
            fs = gridfs.GridFS(self.db, "dos_fs")
            dos_json = zlib.decompress(fs.get(fs_id).read())
            dos_dict = json.loads(dos_json.decode())
            return CompleteDos.from_dict(dos_dict)

        def reset(self):
            self.collection.delete_many({})
            self.db.counter.delete_one({"_id": "taskid"})
            self.db.counter.insert_one({"_id": "taskid", "c": 0})
            self.db.dos_fs.files.delete_many({})
            self.db.dos_fs.chunks.delete_many({})
            self.build_indexes()

        # TODO: This should become part of CalcDb; VASP/CP2K-specific Db methods don't make sense anyway
        @classmethod
        def from_db_file(cls, db_file, admin=True, user_settings=None):
            """
            Create a Cp2kCalcDb from a database file. File requires host, port, database,
            collection, and optionally admin_user/readonly_user and
            admin_password/readonly_password

            Args:
                db_file (str): path to the file containing the credentials
                admin (bool): whether to use the admin user
                user_settings (dict): User settings to overwrite those in the db file.
                    Example: db_file is used to acquire all credentials, but
                    {'collection': 'test'} is used to overwrite the default DB insertion
                    collection to something else.

            Returns:
                Cp2kCalcDb object
            """
            creds = loadfn(db_file)
            if user_settings:
                creds.update(user_settings)

            if admin and "admin_user" not in creds and "readonly_user" in creds:
                raise ValueError("Trying to use admin credentials, "
                                 "but no admin credentials are defined. "
                                 "Use admin=False if only read_only "
                                 "credentials are available.")

            if admin:
                user = creds.get("admin_user")
                password = creds.get("admin_password")
            else:
                user = creds.get("readonly_user")
                password = creds.get("readonly_password")

            kwargs = creds.get("mongoclient_kwargs", {})  # any other MongoClient kwargs can go here ...

            if "authsource" in creds:
                kwargs["authsource"] = creds["authsource"]
            else:
                kwargs["authsource"] = creds["database"]

            return cls(creds["host"], int(creds.get("port", 27017)), creds["database"], creds["collection"],
                       user, password, **kwargs)
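
For readers following the GridFS path in this diff: `insert_gridfs` stores a JSON string as zlib-compressed bytes, and `get_dos` reverses that with `zlib.decompress` followed by `json.loads`. The sketch below reproduces only that round trip with the standard library, so it runs without MongoDB. The helper names `pack_payload` and `unpack_payload` and the sample dictionary are made up for illustration and are not part of the PR.

```python
import json
import zlib


def pack_payload(doc, compress=True):
    """Serialize a dict to JSON bytes, optionally zlib-compressed.

    Returns (payload_bytes, compression_type). The second value plays the
    role of the compression type that insert_gridfs reports to its caller.
    """
    raw = json.dumps(doc).encode()
    if compress:
        return zlib.compress(raw), "zlib"
    return raw, None


def unpack_payload(payload, compression_type):
    """Invert pack_payload: decompress if needed, then parse the JSON."""
    if compression_type == "zlib":
        payload = zlib.decompress(payload)
    return json.loads(payload.decode())


# Hypothetical stand-in for a serialized DOS dictionary.
dos_like = {"efermi": 1.5, "energies": [-1.0, 0.0, 1.0], "densities": [0.2, 0.4, 0.1]}

blob, kind = pack_payload(dos_like)
restored = unpack_payload(blob, kind)
```

One difference from the diff is deliberate: `get_dos` always calls `zlib.decompress`, whereas this sketch branches on the recorded compression type, which would also cover payloads written with `compress=False`.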
Is this different to VASPCalcDb at all? If not, we should probably rename VASPCalcDb to something more general, and then it can be used in the CP2K module as well.
It's not different at all, I think; a more general VASPCalcDb would be a good idea.
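
One way the generalization discussed in this thread could look is sketched below. It is a toy, in-memory illustration of the idea only: `InMemoryStore` stands in for the real `CalcDb` base class, and `GenericCalcDb`, `VaspDbSketch`, and `Cp2kDbSketch` are hypothetical names, not classes from atomate. The `"cp2k"` default comes from the diff above; the `"vasp"` default is an assumption made for the example.

```python
class InMemoryStore:
    """Hypothetical stand-in for a database-backed base class; keeps tasks in a dict."""

    def __init__(self, database, collection="tasks"):
        self.database = database
        self.collection = collection
        self._tasks = {}
        self._next_id = 1

    def insert(self, task_doc):
        """Store a copy of the document under a fresh integer task_id."""
        task_id = self._next_id
        self._next_id += 1
        self._tasks[task_id] = dict(task_doc, task_id=task_id)
        return task_id


class GenericCalcDb(InMemoryStore):
    """Hypothetical shared class: all code-agnostic logic lives here."""

    default_database = "calcs"

    def __init__(self, database=None, collection="tasks"):
        # Subclasses only override the default database name.
        super().__init__(database or self.default_database, collection)

    def retrieve_task(self, task_id):
        return self._tasks[task_id]


class VaspDbSketch(GenericCalcDb):
    default_database = "vasp"  # assumed default, for illustration only


class Cp2kDbSketch(GenericCalcDb):
    default_database = "cp2k"  # matches the default in the diff above


db = Cp2kDbSketch()
tid = db.insert({"formula_pretty": "Si"})
```

With this layout the code-specific subclasses carry only their defaults, which is the point both comments make: if the CP2K class does not differ in behavior, a single general class could serve both modules.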