Generating pairs that are too close in score harms accurate final rankings #1040

ionparticle · 2022-10-04T20:07:41Z

This is more of a design consideration for future development. Our current pair generator tries to minimize the difference between scores when generating pairs. This actually enhances ranking reliability when we have expert judges, as they can distinguish answers of very similar quality. But with untrained judges, it hurts ranking reliability, as it's harder for them to tell such answers apart.

For untrained judges, having a gap between the scores of the two answers in a pair makes it easier for them to tell the answers apart and also gives 'incorrectly' judged answers more of a chance to climb back up. We should consider implementing this gap in our pair generator.

There are two additional factors to consider, given that ComPAIR is a learning tool rather than an assessment tool:

- There might be more pedagogical benefit in having students try to distinguish between two answers of very similar quality.
- Even with this score gap, it's recommended that we have around 12-15 rounds of comparisons for a reliable ranking. This is far more than the 3 rounds that are ComPAIR's default.

So perhaps the size of the gap could be made configurable.
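
To make the idea concrete, here is a rough sketch of what a configurable gap could look like. This is not ComPAIR's actual pair generator: the function name `pick_pair`, the `min_gap` parameter, and the plain dict of scores are all made up for illustration. It picks the closest-scoring pair whose score difference is still at least `min_gap`, so `min_gap=0` falls back to the "minimize the difference" behaviour described above.

```python
from itertools import combinations


def pick_pair(scores, min_gap=0.0, seen=frozenset()):
    """Pick the closest-scoring pair whose score difference is >= min_gap.

    scores:  dict mapping a (hypothetical) answer id to its current score
    min_gap: configurable minimum score difference between the two answers
    seen:    set of frozenset({id1, id2}) pairs that were already compared
    Returns a tuple of two answer ids, or None if no pair qualifies.
    """
    best_pair = None
    best_diff = None
    # Sort the ids so the result is deterministic when differences tie.
    for a, b in combinations(sorted(scores), 2):
        if frozenset((a, b)) in seen:
            continue  # do not repeat a pair that was already compared
        diff = abs(scores[a] - scores[b])
        if diff < min_gap:
            continue  # too close in score for this gap setting
        if best_diff is None or diff < best_diff:
            best_pair, best_diff = (a, b), diff
    return best_pair


example_scores = {"a": 0.0, "b": 0.1, "c": 1.0, "d": 2.5}
closest = pick_pair(example_scores)                # no gap: closest pair
gapped = pick_pair(example_scores, min_gap=0.5)    # skip pairs closer than 0.5
```

With a gap larger than the whole score range the sketch returns `None`, so a real implementation would need some fallback (for example, relaxing the gap) for that case.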

Thanks to Peter Thwaites (UCLouvain) for bringing this up and providing the papers below:

Paper 1 provides recommendations for the score gap size. Papers 2 & 3 detail the issue with 'highly adaptive' pair generators like ComPAIR's.

ionparticle added the enhancement label Oct 4, 2022

ionparticle added this to the Future Versions milestone Oct 4, 2022
