
Add F-measure to the computed model metrics, and include the raw confusion matrix in the output #179

Open · riley-harper opened this issue Dec 11, 2024 · 1 comment

@riley-harper (Contributor)

F-measure is another helpful model metric, which can be computed in terms of precision and recall:

f-measure = 2 * ((precision * recall) / (precision + recall))

If you plug in the definitions of precision and recall in terms of true positives (tp), false positives (fp), and false negatives (fn), you get

f-measure = (2 * tp) / (2 * tp + fp + fn)

In addition to providing this metric, we should include the raw confusion matrix so that users can compute their own additional metrics if they would like to.
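
As a minimal sketch of the second form above (the function name and signature here are illustrative, not necessarily what the library uses), the metric can be computed directly from the raw confusion matrix counts:

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F-measure computed from raw confusion matrix counts.

    Algebraically equivalent to 2 * (precision * recall) / (precision + recall).
    """
    return (2 * true_positives) / (
        2 * true_positives + false_positives + false_negatives
    )
```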

riley-harper added a commit that referenced this issue Dec 11, 2024
_get_aggregate_metrics() now calls these core library functions.
riley-harper added a commit that referenced this issue Dec 11, 2024
This function is now simple enough that we can just inline it in the one
place where it's called.
riley-harper added a commit that referenced this issue Dec 11, 2024
- tp, tn, fp, fn are easy to type but look a little too similar to be
  easily readable.
- true_positives, true_negatives, false_positives, false_negatives are
  really explicit but difficult to type.
riley-harper added a commit that referenced this issue Dec 11, 2024
I think this is nice because it disentangles the core library from
numpy. But it does mean that we have to explicitly convert NaNs to
numpy.nan in model exploration. So it's a bit messy.
riley-harper added a commit that referenced this issue Dec 12, 2024
This lets us handle math.nan when aggregating threshold metrics results. It
keeps np.nan more contained to the code that actually cares about Pandas and
Numpy.
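
A rough illustration of the boundary described in these commits; the helper name is hypothetical and only shows the idea of converting at the edge of the Pandas/Numpy-facing code:

```python
import math

import numpy as np


def to_numpy_nan(value: float) -> float:
    # Hypothetical helper: the core library reports undefined metrics as
    # math.nan, and only the model exploration code that works with Pandas
    # and Numpy converts them to np.nan.
    return np.nan if math.isnan(value) else value
```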
riley-harper added a commit that referenced this issue Dec 12, 2024
This required fixing a bug in core.model_metrics.f_measure where it errored out
instead of returning NaN when its denominator was 0.
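
The fix described here might look roughly like this (a sketch of the behavior, not the actual code in core.model_metrics):

```python
import math


def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # Previously this raised ZeroDivisionError; returning NaN lets callers
        # aggregate threshold metrics without special-casing this metric.
        return math.nan
    return (2 * true_positives) / denominator
```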
riley-harper added a commit that referenced this issue Dec 12, 2024
By pulling the mean and stdev calculation code out into its own
function, we can reduce some of the duplication. And in this case
catching a StatisticsError seems simpler than checking for certain
conditions to be met before calling the statistics functions.
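
A sketch of that helper, assuming it takes a plain list of metric values (the name and exact shape are guesses):

```python
import math
import statistics


def mean_and_stdev(values: list[float]) -> tuple[float, float]:
    # statistics.mean needs at least one value and statistics.stdev needs at
    # least two; catching StatisticsError is simpler than checking those
    # conditions before calling the statistics functions.
    try:
        mean = statistics.mean(values)
    except statistics.StatisticsError:
        mean = math.nan
    try:
        stdev = statistics.stdev(values)
    except statistics.StatisticsError:
        stdev = math.nan
    return mean, stdev
```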
riley-harper added a commit that referenced this issue Dec 12, 2024
I also renamed the existing columns to remove the "_test" part, since we aren't
computing "_train" versions of these metrics anymore.
@riley-harper (Contributor, Author)

I'm not really sure how to add the confusion matrix into the thresholded metrics data frame. Since that data frame aggregates the computed metrics from the ThresholdTestResults, it's not clear how to handle the counts of true/false positives/negatives. One idea is to include several array columns with the data (sketched below). I'm not a big fan of aggregating the confusion matrix data, since the point of including it is to give users the raw, unchanged data.
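
To illustrate the array-column idea (the column names and every number below are placeholders, not output from the library), each threshold's row could keep one entry per ThresholdTestResults in list-valued columns:

```python
import pandas as pd

# Illustrative only: one possible shape for the thresholded metrics data frame,
# where the aggregated metrics sit next to unaggregated, list ("array") columns
# of raw confusion matrix counts for each threshold.
thresholded_metrics = pd.DataFrame(
    {
        "threshold": [0.5, 0.8],
        "precision": [0.91, 0.95],
        "recall": [0.84, 0.72],
        "true_positives": [[105, 98, 110], [90, 88, 93]],
        "false_positives": [[12, 9, 14], [4, 6, 5]],
        "false_negatives": [[20, 25, 18], [35, 37, 33]],
        "true_negatives": [[863, 868, 858], [871, 869, 869]],
    }
)
```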
