Evaluation, Reproducibility, Benchmarks Meeting 11

AReinke edited this page May 26, 2021 · 1 revision

Minutes of meeting 11

Date: 26th May 2021

Present: Lena, Annika, David, Jens, Keyvan


TOP 1: General

  • Before each meeting, an email with a reminder and the agenda will be sent to all members

TOP 2: Delphi process on metrics

  • Annika presented (preliminary) results of round 3 (30/31 replies):
    • The updated inclusion criteria were accepted by >= 80% of the members
    • The following task terminology was selected by vote:
      • Image-level classification (may end in a tie with whole-image classification, depending on the missing reply)
      • Object detection
      • Semantic segmentation
      • Instance segmentation
    • The updated version of the task mapping was accepted by 87% of the members. A few points were discussed:
      • The terms “input” and “labels” will be replaced by “algorithm input” and “reference annotation”
      • The formulation “ultimate interest in …” was criticized. The following suggestions were made:
        • Interest ultimately in …
        • Interest in the end result …
        • Interest in …
      • “Global or local?” may be replaced by “whole or partial?”
      • The term “underlying (clinical) problem” will be replaced by “driving clinical question”. In the caption, we will replace “grouping problems” by “grouping questions”
    • Metric pools:
      • TPs, TNs, FPs and FNs should not be listed as metrics but will be discussed in the paper to define the metrics (recommendation: state those numbers in addition to the derived metrics)
      • We need to vote on whether all metrics should be described, listed in the appendix etc.
      • The metrics will be grouped into primary metrics, comprising overlap metrics (those based on TPs etc. as well as others) and boundary metrics, and secondary metrics (e.g. AUC or AP)
      • We will present a matrix showing the relationships between the metrics at the beginning. The metrics should be grouped into clusters: closely related metrics should appear next to each other
    • Problem characteristics:
      • Memory consumption: It should be made clear that this refers to the memory needed for metric computation
      • Should be discussed across tasks
      • Nicola may help us here
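
Illustration for the recommendation under "Metric pools" to state TPs, TNs, FPs and FNs in addition to the derived metrics: the sketch below is not part of the meeting outcome. It assumes binary labels given as flat lists of 0/1, and the function names `confusion_counts` and `overlap_metrics` are hypothetical. DSC and IoU are chosen here only as two common examples of overlap metrics derived from these counts.

```python
import math


def confusion_counts(prediction, reference):
    """Count TP, FP, FN and TN for two equally long sequences of 0/1 labels."""
    if len(prediction) != len(reference):
        raise ValueError("prediction and reference must have the same length")
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for pred, ref in zip(prediction, reference):
        if pred and ref:
            counts["TP"] += 1
        elif pred and not ref:
            counts["FP"] += 1
        elif ref:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def overlap_metrics(counts):
    """Derive DSC and IoU from the raw counts (NaN when both inputs are empty)."""
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    denominator = tp + fp + fn
    if denominator == 0:
        # Neither prediction nor reference contains a positive label.
        return {"DSC": math.nan, "IoU": math.nan}
    return {"DSC": 2 * tp / (2 * tp + fp + fn), "IoU": tp / denominator}


# Hypothetical example data (assumption, not from the minutes).
prediction = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 1, 1, 0]

counts = confusion_counts(prediction, reference)
metrics = overlap_metrics(counts)

# Report the raw counts next to the derived metrics.
print(counts)
print(metrics)
```

Reporting the counts alongside DSC and IoU lets a reader recompute other count-based metrics without access to the original predictions.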