Hub Validations Scope #3
annakrystalli started this conversation in General
Hub repositories: https://github.com/Infectious-Disease-Modeling-Hubs
3 levels of validation:
Files on the way in
Configs are correct
Q: should this functionality move to hubValidation? Currently in hubUtils
A: yes
Data in hub consistent with configs
Schema validation, so that the schemas the consortium publishes are correct
Code in hubValidation, schema, or hubDev?
Working decision: hubValidation
Tasks:
General thoughts about architecture/framework
hub-validations.json: metadata file that contains info about which validations to implement in a CI setting:
- hub-config folder
- validate_model_metadata
- Validate_model_data_file
The specific validation groupings are defined inside hub-validations.json
hubValidations package: Scope is to write many functions that each address one test specified on the validations list. All tests implemented by this package are for model output submission file contents, file name, and placement of file in a hub structure.
Possibly define a generic “validate” function
Arguments:
Or separate it into multiple functions:
Acquire information from the hub metadata files (possibly via hubUtils hub_connection function)
E.g. Specification of columns and values in the data files, format of the columns, etc.
Implement functions for specific tests that have standard inputs and outputs
Inputs:
- files/data objects from environment
- Hub metadata specifications
- Auxiliary data files (or pointers to such files if not in hub metadata)
Outputs:
- Name: Name of the test
- Result: Boolean validation result. Expect this to be TRUE if there were no errors or warnings, FALSE if there were any errors or warnings
- Errors: Collection of errors, warnings, and informational messages
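A minimal sketch of what such a standard result object could look like. This is written in Python purely for illustration; it is not the hubValidations API. The class name `CheckResult`, the function `check_file_extension`, and the allowed extensions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class CheckResult:
    """Hypothetical container mirroring the Name / Result / Errors fields."""
    name: str
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    info: List[str] = field(default_factory=list)

    @property
    def result(self) -> bool:
        # TRUE only when there are no errors and no warnings;
        # informational messages do not affect the result.
        return not self.errors and not self.warnings


def check_file_extension(
    file_name: str, allowed: Sequence[str] = (".csv", ".parquet")
) -> CheckResult:
    """Illustrative test: the file name must end with an allowed extension.

    The default extensions are an assumption made for this example only.
    """
    res = CheckResult(name="file_extension")
    if not file_name.endswith(tuple(allowed)):
        res.errors.append(
            f"{file_name!r} does not end with one of {list(allowed)}"
        )
    return res
```

Keeping `result` derived from the message lists (rather than stored separately) means the boolean can never disagree with the collected errors and warnings.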
Open questions
Collect results from multiple specific tests into a list
Consider implementing a concept of a group of tests
Idea being that common
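One possible way to collect results from several tests and to group them, sketched in Python for illustration only. The check functions, the `GROUPS` mapping, and the group name `file_name` are hypothetical and not part of any existing package.

```python
from typing import Callable, Dict, List, Tuple

# Assumed signature for a check: takes a file path, returns (passed, messages).
Check = Callable[[str], Tuple[bool, List[str]]]


def name_has_no_spaces(path: str) -> Tuple[bool, List[str]]:
    ok = " " not in path
    return ok, ([] if ok else [f"{path!r} contains spaces"])


def is_csv(path: str) -> Tuple[bool, List[str]]:
    ok = path.endswith(".csv")
    return ok, ([] if ok else [f"{path!r} is not a .csv file"])


# A "group" is just a named, ordered list of checks.
GROUPS: Dict[str, List[Check]] = {
    "file_name": [name_has_no_spaces, is_csv],
}


def run_group(group: str, path: str) -> List[dict]:
    """Run every check in a group and collect one result record per check."""
    results = []
    for check in GROUPS[group]:
        passed, messages = check(path)
        results.append(
            {"name": check.__name__, "result": passed, "errors": messages}
        )
    return results


if __name__ == "__main__":
    for record in run_group("file_name", "my file.txt"):
        print(record)
```

All checks in the group run even if an earlier one fails, so a submitter sees every problem in a single pass.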
Example functions:
“hubCI” package: Scope is to wrap tests from hubValidations for integration in a CI server, and implement validation steps that are specific to storage of files in version control.
Acquire information from the hub metadata files (possibly via hubUtils hub_connection function)
Acquire information from a pull request with model output submissions
Run each validation step, collect outputs
Output report, messages
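The run / collect / report flow above could look roughly like the following Python sketch. It assumes the list of files changed in the pull request has already been obtained; the step functions and the `model-output/` directory name are assumptions made for illustration, not a description of the hubCI package.

```python
from typing import Callable, List, Tuple

# Assumed shape of a step: (name, function returning a list of error messages).
Step = Tuple[str, Callable[[List[str]], List[str]]]


def at_least_one_file(files: List[str]) -> List[str]:
    return [] if files else ["pull request changes no files"]


def no_files_outside_model_output(files: List[str]) -> List[str]:
    # Hypothetical rule: a submission PR only touches files under "model-output/".
    return [
        f"unexpected file: {f}" for f in files if not f.startswith("model-output/")
    ]


def run_steps(steps: List[Step], changed_files: List[str]) -> Tuple[str, int]:
    """Run each step, collect its messages, and build a text report.

    Returns the report and an exit code (0 = all passed, 1 = any failure),
    which a CI job can use to mark the pull request check as failed.
    """
    lines: List[str] = []
    failed = False
    for name, step in steps:
        errors = step(changed_files)
        failed = failed or bool(errors)
        lines.append(f"[{'FAIL' if errors else 'PASS'}] {name}")
        lines.extend(f"    {e}" for e in errors)
    return "\n".join(lines), (1 if failed else 0)


if __name__ == "__main__":
    steps = [
        ("at_least_one_file", at_least_one_file),
        ("no_files_outside_model_output", no_files_outside_model_output),
    ]
    report, exit_code = run_steps(steps, ["README.md"])
    print(report)
    raise SystemExit(exit_code)
```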
Example tests
Process thoughts
Longer term?: retain ability for hubs to write their own hub specific validation functions (say in R) that could be combined with a wrapper framework
Validation Process
a. Local validation
b. Validation on PR
a. Use either:
1. Input variables, or
2. Metadata
b. Define validation structure (flow of function calls and validation checks)
Scope:
For today: just the first point, validation of submissions?
What is required:
Validation development details/plan
Checks should be cross-referenced with elements in JSON file
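A rough Python sketch of such a cross-reference: compare the check names listed in a JSON document against the checks that are actually implemented. The JSON layout (group name mapped to a list of check names) and the individual check names are assumptions for illustration; the real structure of hub-validations.json is not specified in these notes.

```python
import json
from typing import Dict, List, Set


def cross_reference(config_text: str, implemented: Set[str]) -> Dict[str, List[str]]:
    """Report checks listed in the JSON but not implemented, and vice versa."""
    config = json.loads(config_text)
    # Assumed layout: {"group name": ["check name", ...], ...}
    listed = {name for checks in config.values() for name in checks}
    return {
        "missing_implementation": sorted(listed - implemented),
        "not_in_config": sorted(implemented - listed),
    }


if __name__ == "__main__":
    example = json.dumps(
        {
            "validate_model_metadata": ["metadata_file_exists"],
            "validate_model_data_file": ["file_name_format", "columns_match"],
        }
    )
    print(cross_reference(example, {"file_name_format", "metadata_file_exists"}))
```

Running a comparison like this in the package's own test suite would flag a check that was added to the JSON file without code behind it, or code that no configuration refers to.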
Current practices
SMH hub validations,
Repo: https://github.com/midas-network/SMHvalidation
US Scenario Modeling Hub (covid, flu):
R package: https://github.com/midas-network/covid19SMHvalidation
validation-checks
flu_update
Limitations:
US Forecast hubs (covid, flu, west nile virus)
Validations are implemented in python: https://github.com/reichlab/covid19-forecast-hub-validations
Documentation: https://docs.google.com/document/d/1OAL2pcWmfssJlE6wIbV3PduU689ZctKJNFGv257W0Vk/edit#heading=h.o8oxo12w6bl8
Files modified:
File contents are correct. Things like:
Limitations:
EU Hubs:
Python library, inherited from the DE/PL hub, which inherited it from the US flu forecast hub. However, this succession of maintainers and a series of rushed patches made it completely unmaintainable. It took us months to even realize that some tests were not running.
As a reaction, we developed the HubValidations R package: https://github.com/covid19-forecast-hub-europe/HubValidations
Pros:
100% configurable by users. Changes in formats shouldn’t require changes in the package code
Cons:
Zoltar
Tests for the Zoltar repository: https://github.com/reichlab/forecast-repository/tree/master/forecast_app/tests