Analyze Results
The correctness of judgments along with the question frequencies can then be used to plot precision and ROC curves.
themis analyze collate <qa-pairs.csv> <answers.system1.csv> <answers.system2.csv> <answers.system3.csv> --judgments <judgments.csv> > <collate_agree.csv>
Here, qa-pairs.csv is the question frequency file generated by the 'question extract' command. answers.system1.csv, answers.system2.csv, and answers.system3.csv are answer files generated by querying the respective systems (there can be multiple files from multiple systems). The optional argument judgments.csv contains the Q&A pair judgments generated by the 'judge interpret' command. The output is written to collate_agree.csv.
This command collates system answer confidences and annotator judgments by question-answer pair. If annotation has not been completed for all or part of the question list, judgments.csv may not be a subset of qa-pairs.csv. If multiple systems are being judged, there may be Q&A pairs in the judgments that don't appear in the system answers.
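
To make the collation step concrete, here is a minimal Python sketch of joining answers and judgments by question-answer pair. It is not the themis implementation: the function name collate and the column names Question, Answer, Confidence, and Correct are assumptions chosen for illustration, and the real files may use different headers. It shows how an unjudged pair can be kept with an empty judgment, and how a judgment with no matching system answer drops out.

```python
import csv
import io


def collate(answer_files, judgments_file=None):
    """Join system answers with judgments by (question, answer) pair.

    Hypothetical helper for illustration only; column names are assumed.
    """
    judgments = {}
    if judgments_file is not None:
        for row in csv.DictReader(judgments_file):
            judgments[(row["Question"], row["Answer"])] = row["Correct"]

    rows = []
    for system, handle in answer_files.items():
        for row in csv.DictReader(handle):
            key = (row["Question"], row["Answer"])
            rows.append({
                "System": system,
                "Question": row["Question"],
                "Answer": row["Answer"],
                "Confidence": float(row["Confidence"]),
                # None marks a pair that has not been judged yet.
                "Correct": judgments.get(key),
            })
    return rows


# Tiny in-memory stand-ins for the answer and judgment files.
answers_1 = io.StringIO("Question,Answer,Confidence\nq1,a1,0.9\nq2,a2,0.4\n")
answers_2 = io.StringIO("Question,Answer,Confidence\nq1,a3,0.7\n")
# The judgment for (q3, a9) matches no system answer, so it never appears in the output.
judged = io.StringIO("Question,Answer,Correct\nq1,a1,True\nq3,a9,False\n")

collated = collate({"system1": answers_1, "system2": answers_2}, judged)
for row in collated:
    print(row)
```

Keying on the (question, answer) pair rather than on the question alone matters when several systems return different answers to the same question, since each answer needs its own judgment.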