Integration with MINT
Deborah Khider edited this page Sep 27, 2021 · 13 revisions
We need to register the outputs generated by Data Transformation and Model Execution. Registration happens at several points in the following workflow:
Data Download (Data/update) -> Data Registration (update registration) -> Data Transformation -> Data Registration -> Model Execution -> Data Registration
- The ensemble manager (which performs the execution) can call the data registration script. It passes a list of file paths and a dataset_id.
- The files are published as resources on the Data Catalog
$ python <the_script> --dataset-id <id> [file(s)] [geospatial] [temporal] [variables] [variable IDs]
The files are given as local paths, not URLs (yet). The script can read the files.
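The command line above could be parsed along these lines. This is only a sketch: apart from `--dataset-id`, which appears in the command, the flag names (`--geospatial`, `--temporal`, `--variables`, `--variable-ids`) and their value formats are assumptions made for illustration, not the script's actual interface.

```python
import argparse


def build_parser():
    """Build a parser mirroring the command shown above (hypothetical flag names)."""
    parser = argparse.ArgumentParser(
        description="Register local files as resources of a dataset (sketch)."
    )
    # The dataset id is always supplied by the caller.
    parser.add_argument("--dataset-id", required=True)
    # One or more local file paths (not URLs).
    parser.add_argument("files", nargs="+")
    # Optional metadata; the value formats below are assumptions.
    parser.add_argument("--geospatial", default=None,
                        help="assumed format: xmin,ymin,xmax,ymax")
    parser.add_argument("--temporal", default=None,
                        help="assumed format: start/end")
    parser.add_argument("--variables", nargs="*", default=[])
    parser.add_argument("--variable-ids", nargs="*", default=[])
    return parser


# Example invocation with an explicit argument list instead of sys.argv.
example = build_parser().parse_args(
    ["--dataset-id", "d1", "a.nc", "b.nc", "--variables", "rainfall"]
)
print(example.dataset_id, example.files, example.variables)
```

Keeping the metadata flags optional matches the two cases described on this page: NetCDF files can carry that information themselves, while other formats need it passed in by the caller.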
Define what will be done and what will not be done as part of this project.
- Extraction of geospatial and temporal information + variables for NetCDF files
In this case, the geospatial, temporal, and variable parameters are contained in the NetCDF file itself.
- Other formats will be registered without this extraction step
- Ensemble manager will need to pass the necessary information (geo/temporal) to the registration script.
- Model catalog will need to pass information about the variables to the registration script.
- Extraction of geospatial and temporal information for NetCDF files - Done
- If the dataset doesn't exist, creation of the dataset and extraction of SVO for NetCDF files.
- Register the variables in the data catalog if the variables do not already exist.
- Registration of the resources.
- For Topoflow, one folder output is a dataset with each file represented as a resource.
- The dataset id is a parameter. The Data Registration script should not generate it.
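The registration steps listed above can be sketched as follows. `InMemoryCatalog` and `register_outputs` are hypothetical names used for illustration; the class is a stand-in for the Data Catalog and does not reflect its real API.

```python
from pathlib import Path


class InMemoryCatalog:
    """Hypothetical stand-in for the Data Catalog (not its real API)."""

    def __init__(self):
        self.datasets = {}      # dataset_id -> list of resource dicts
        self.variables = set()  # names of variables already registered


def register_outputs(catalog, dataset_id, paths, variables, metadata=None):
    """Register files (or the files inside folders) as resources of a dataset."""
    # The dataset id is a parameter; this function never generates one.
    if not dataset_id:
        raise ValueError("dataset_id must be supplied by the caller")

    # Create the dataset only if it does not exist yet.
    resources = catalog.datasets.setdefault(dataset_id, [])

    # Register only the variables the catalog does not already know.
    new_variables = []
    for name in variables:
        if name not in catalog.variables:
            catalog.variables.add(name)
            new_variables.append(name)

    # A folder (e.g. one Topoflow output folder) expands to one resource per file.
    files = []
    for path in map(Path, paths):
        if path.is_dir():
            files.extend(sorted(f for f in path.iterdir() if f.is_file()))
        else:
            files.append(path)

    for f in files:
        resources.append(
            {"name": f.name, "path": str(f), "metadata": metadata or {}}
        )
    return new_variables, len(files)
```

Calling `register_outputs` twice with the same `dataset_id` appends resources to the existing dataset rather than creating a second one, and a variable that is already known is not registered again.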