Replies: 3 comments
-
Regarding options 2 and 3 in the "ideas" section of the main post: it wasn't clear to me whether normalizing would solve things, given the example you provide. It seems like the simplest thing for us to do at hubverse would be idea 1 (just return the errors). How much would that put back on users to correct/normalize things, and is that work feasible? In general, idea 1 seems like the simpler and more ideal solution (less for hubverse to maintain down the road).
-
It seems like it's worth bringing this up with the scoringRules maintainers, especially since you cannot guarantee that a normalized vector will sum to one.
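As a quick base-R illustration of that point (nothing here is hubverse- or scoringRules-specific, and the exact failure count will vary by platform and R build, but it is typically nonzero):

```r
# Renormalize random probability vectors and test for an exact sum of 1.
set.seed(42)
n_fail <- 0L
for (i in 1:10000) {
  p <- runif(5)     # arbitrary positive weights
  p <- p / sum(p)   # normalize so the probabilities "sum to 1"
  if (sum(p) != 1) n_fail <- n_fail + 1L
}
n_fail  # typically > 0: normalization does not guarantee an exact sum of 1
```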
-
First, a question: have we established that submission files (i.e. the outputs of most models) are more likely than not to fail the stricter scoringRules check? Having said that, I feel the standard R equality test tolerance should be an acceptable tolerance for R stats packages. As such, I also vote for contacting the scoringRules maintainers.
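For reference, the "standard R equality test" here is `all.equal()`, whose default numeric tolerance is `sqrt(.Machine$double.eps)`; a minimal illustration in base R:

```r
formals(all.equal.numeric)$tolerance  # sqrt(.Machine$double.eps)
sqrt(.Machine$double.eps)             # about 1.49e-8 on typical hardware

0.1 + 0.2 == 0.3                      # FALSE: exact comparison is fragile
isTRUE(all.equal(0.1 + 0.2, 0.3))     # TRUE: within the default tolerance
```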
-
For hubEvals, we're using the scoringutils package, which in turn uses scoringRules to calculate the ranked probability score (rps). Between hubValidations, scoringutils, and scoringRules, there are three different checks being done for whether or not class probabilities sum to 1, and they do not all use the same tolerance. One of the tolerances in play is the default used by `all.equal` in R, which depends on machine architecture, but for me is about 1.5e-8.

There are two issues here: a submission whose probabilities pass one check can still fail a stricter check downstream, and, because of floating point representation, even normalizing the probabilities does not guarantee that they will sum to exactly 1.
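To make the magnitudes concrete, here is a small base-R sketch. It adds sequentially via `Reduce()` rather than `sum()`, because R's `sum()` may accumulate in extended precision on some platforms and round back to exactly 1:

```r
.Machine$double.eps         # machine epsilon, about 2.2e-16
sqrt(.Machine$double.eps)   # all.equal()'s default tolerance, about 1.5e-8

p <- rep(0.1, 10)           # ten class probabilities of 0.1
s <- Reduce(`+`, p)         # add them one at a time in double precision
s == 1                      # FALSE: s is 0.9999999999999999
isTRUE(all.equal(s, 1))     # TRUE: the discrepancy is far below 1.5e-8
```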
So I think we should file an issue about this at scoringRules, but I'm not knowledgeable enough about this to be confident in what to suggest that they change their check to. Should it just be `sqrt(.Machine$double.eps)`?

Three ideas:
1. Allow `scoringutils` and/or `scoringRules` to throw errors if their more-stringent criteria are not met, and return those errors to the user.
2. Normalize the class probabilities on behalf of the user before scoring.
3. Call the `scoringRules` function, but catch errors. If errors are thrown, normalize the class probabilities on behalf of the user and issue a warning instead of an error (see the sketch after this list).
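A minimal sketch of what idea 3 could look like; `score_fn` is a hypothetical stand-in for whichever scoringutils/scoringRules function ends up being wrapped, not a real API:

```r
# Idea 3 (sketch): try the scoring call as-is; on error, normalize the
# class probabilities on the user's behalf, warn, and retry once.
score_with_fallback <- function(score_fn, observed, probs, ...) {
  tryCatch(
    score_fn(observed, probs, ...),
    error = function(e) {
      warning(
        "Class probabilities did not pass the scoring package's check; ",
        "normalizing and retrying. Original error: ", conditionMessage(e)
      )
      score_fn(observed, probs / sum(probs), ...)
    }
  )
}
```

Note that, per the comment above about normalized vectors, `probs / sum(probs)` is still not guaranteed to sum to exactly 1, so this fallback only helps if the downstream check uses a tolerance rather than exact equality.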