Add F-measure to the computed model metrics, and include the raw confusion matrix in the output #179
Comments
_get_aggregate_metrics() now calls these core library functions.
This function is now simple enough that we can just inline it in the one place where it's called.
- `tp`, `tn`, `fp`, `fn` are easy to type but look a little too similar to be easily readable.
- `true_positives`, `true_negatives`, `false_positives`, `false_negatives` are really explicit but difficult to type.
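One hypothetical middle ground is to spell the long names out once in a small container so call sites stay short; `ConfusionMatrix` here is just an illustration, not an existing class in the library:

```python
# Hypothetical sketch: explicit field names are typed once at the definition,
# and access stays reasonably compact at the call site.
from typing import NamedTuple


class ConfusionMatrix(NamedTuple):
    true_positives: int
    true_negatives: int
    false_positives: int
    false_negatives: int


cm = ConfusionMatrix(true_positives=90, true_negatives=80,
                     false_positives=10, false_negatives=20)
precision = cm.true_positives / (cm.true_positives + cm.false_positives)
```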
I think this is nice because it disentangles the core library from numpy. But it does mean that we have to explicitly convert NaNs to numpy.nan in model exploration. So it's a bit messy.
This lets us handle math.nan when aggregating threshold metrics results. It keeps np.nan more contained to the code that actually cares about Pandas and Numpy.
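As a rough sketch of that split (the helper name is made up, not the project's actual layout), the aggregation code can deal with missing values using only the standard library:

```python
# Illustrative sketch: the core library reports undefined metrics as math.nan,
# and the aggregation step filters them out without touching numpy or pandas.
import math


def drop_nans(values: list[float]) -> list[float]:
    # e.g. precision is NaN at thresholds where nothing was predicted positive
    return [v for v in values if not math.isnan(v)]
```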
This required fixing a bug in core.model_metrics.f_measure where it errored out instead of returning NaN when its denominator was 0.
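A rough sketch of the fixed behavior, assuming the confusion-matrix counts are passed in directly (the real `core.model_metrics.f_measure` signature may differ):

```python
import math


def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    # F = 2*tp / (2*tp + fp + fn); undefined when all three counts are zero.
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # Previously this path errored out; now it reports the result as NaN.
        return math.nan
    return 2 * true_positives / denominator
```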
By pulling the mean and stdev calculation code out into its own function, we can reduce some of the duplication. And in this case catching a StatisticsError seems simpler than checking for certain conditions to be met before calling the statistics functions.
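Something along these lines (the helper name is made up) is what that refactor suggests:

```python
import math
import statistics


def _mean_and_stdev(values: list[float]) -> tuple[float, float]:
    # statistics raises StatisticsError for an empty list (mean) or a
    # single-element list (stdev); report those cases as NaN instead of
    # checking the list length up front.
    try:
        mean = statistics.mean(values)
    except statistics.StatisticsError:
        mean = math.nan
    try:
        stdev = statistics.stdev(values)
    except statistics.StatisticsError:
        stdev = math.nan
    return mean, stdev
```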
I also renamed the existing columns to remove the "_test" part, since we aren't computing "_train" versions of these metrics anymore.
I'm not really sure how to add the confusion matrix into the thresholded metrics data frame, since it aggregates the computed metrics from the …
F-measure is another helpful model metric, which can be computed in terms of precision and recall:

    F = 2 * precision * recall / (precision + recall)

If you plug in the definitions of precision and recall in terms of true positives (`tp`), false positives (`fp`), and false negatives (`fn`), you get:

    F = 2 * tp / (2 * tp + fp + fn)

In addition to providing this metric, we should include the raw confusion matrix so that users can compute their own additional metrics if they would like to.