As discussed in #131 it would be helpful to have a consistent pipeline to evaluate prediction models. This way we get to know how well the currently implemented models perform, which ones need to be improved, and how well a new model performs. The pipeline should calculate the appropriate metrics that have been specified in #221, while some of them are already available here.
Acceptance Criteria
- all evaluation methods take a model and a parameter indicating whether to use 5-fold cross-validation or a test set (default)
- for all of them, calculate data I/O, disk space usage, memory usage, and prediction time (if this is not (easily) possible, specify why and how to measure it manually)
- for all of them, calculate the training time
Please refer to the PR template for further explanations of the metrics.
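As a starting point, here is a minimal sketch of what such an evaluation entry point could look like in Python, assuming a scikit-learn-style `fit`/`predict` model; the function names and parameters below are hypothetical, and data I/O and disk space measurement are left out since they depend on how a given model is stored:

```python
import time
import tracemalloc

import numpy as np
from sklearn.model_selection import KFold, train_test_split


def _timed_fit_predict(model, X_train, y_train, X_test):
    """Fit and predict while recording wall-clock times and peak Python memory."""
    tracemalloc.start()
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    training_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    predictions = model.predict(X_test)
    prediction_time = time.perf_counter() - t0

    _, peak_memory = tracemalloc.get_traced_memory()  # bytes, Python allocations only
    tracemalloc.stop()
    return predictions, training_time, prediction_time, peak_memory


def evaluate(model, X, y, use_5fold=False):
    """Evaluate `model` on a held-out test set (default) or with 5-fold CV."""
    if use_5fold:
        splits = KFold(n_splits=5, shuffle=True, random_state=0).split(X)
    else:
        train_idx, test_idx = train_test_split(
            np.arange(len(X)), test_size=0.2, random_state=0)
        splits = [(train_idx, test_idx)]

    reports = []
    for train_idx, test_idx in splits:
        _, t_train, t_predict, mem = _timed_fit_predict(
            model, X[train_idx], y[train_idx], X[test_idx])
        reports.append({
            'training_time_s': t_train,
            'prediction_time_s': t_predict,
            'peak_memory_bytes': mem,
        })
    return reports
```

Note that `tracemalloc` only tracks Python-level allocations, so for models backed by native code the memory would have to be measured externally, which is exactly the case where the criteria above ask to specify why and how to measure it manually.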
The LUNA16 authors proposed a handy evaluation framework for both the identification and the classification (false positive reduction) tasks.
They employed Free-Response Receiver Operating Characteristic (FROC) analysis and the competition performance metric (CPM). The CPM is the average of seven sensitivities measured at fixed false positives per scan (FPPS) thresholds: the true positive rate is computed at each FPPS ∈ {0.125, 0.25, 0.5, 1, 2, 4, 8}, and the mean of these values forms the CPM. From my point of view, it is worth paying attention to the CPM rather than the log loss.
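To make that concrete, here is a minimal sketch of the CPM calculation in Python, assuming the FROC curve (paired FPPS and sensitivity values) has already been computed; the function name and signature are illustrative, not the LUNA16 reference implementation:

```python
import numpy as np

# The seven FPPS thresholds prescribed by the LUNA16 evaluation.
FPPS_THRESHOLDS = (0.125, 0.25, 0.5, 1, 2, 4, 8)


def competition_performance_metric(fpps, sensitivity):
    """Average the sensitivities interpolated at the seven FPPS thresholds.

    fpps        -- false positives per scan at each FROC operating point (ascending)
    sensitivity -- true positive rate at each operating point
    """
    # Read the FROC curve off at the required FPPS values, then average.
    interpolated = np.interp(FPPS_THRESHOLDS, fpps, sensitivity)
    return float(np.mean(interpolated))
```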
I can work on adjusting their pipeline, if no one minds.