n-stop instead of stop-if
jteijema committed Jan 9, 2025
1 parent cec5968 commit 3fe3ec6
Showing 12 changed files with 24 additions and 19 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci-workflow.yml
@@ -29,7 +29,7 @@ jobs:
 ruff check .
 - name: Render makita templates
 run: |
-asreview makita template basic -p basic -d .github/workflows/test_data/ --classifier nb --feature_extractor tfidf --query_strategy max --n_runs 1 --prior_seed 1 --model_seed 2 --skip_wordclouds --overwrite --instances_per_query 2 --stop_if min --balance_strategy double | tee output_basic.txt
+asreview makita template basic -p basic -d .github/workflows/test_data/ --classifier nb --feature_extractor tfidf --query_strategy max --n_runs 1 --prior_seed 1 --model_seed 2 --skip_wordclouds --overwrite --instances_per_query 2 --n-stop min --balance_strategy double | tee output_basic.txt
 grep -q "ERROR" output_basic.txt && exit 1 || true
 asreview makita template arfi -p arfi -d .github/workflows/test_data/ | tee output_arfi.txt
 grep -q "ERROR" output_arfi.txt && exit 1 || true
10 changes: 5 additions & 5 deletions README.md
@@ -193,7 +193,7 @@ optional arguments:
--query_strategy QUERY_STRATEGY Query strategy to use. Default: max.
--balance_strategy BALANCE_STRATEGY Balance strategy to use. Default: double.
--instances_per_query INSTANCES_PER_QUERY Number of instances per query. Default: 1.
-–stop_if STOP_IF The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
+--n-stop N_STOP The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
```

### ARFI template
@@ -225,7 +225,7 @@ optional arguments:
--query_strategy QUERY_STRATEGY Query strategy to use. Default: max.
--balance_strategy BALANCE_STRATEGY Balance strategy to use. Default: double.
--instances_per_query INSTANCES_PER_QUERY Number of instances per query. Default: 1.
---stop_if STOP_IF The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
+--n-stop N_STOP The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
```

### Multimodel template
@@ -251,7 +251,7 @@ optional arguments:
--skip_wordclouds Disables the generation of wordclouds.
--overwrite Automatically accepts all overwrite requests.
--instances_per_query INSTANCES_PER_QUERY Number of instances per query. Default: 1.
---stop_if STOP_IF The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
+--n-stop N_STOP The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
--classifiers CLASSIFIERS Classifiers to use Default: ['logistic', 'nb', 'rf', 'svm']
--feature_extractors FEATURE_EXTRACTOR Feature extractors to use Default: ['doc2vec', 'sbert', 'tfidf']
--query_strategies QUERY_STRATEGY Query strategies to use Default: ['max']
@@ -312,7 +312,7 @@ optional arguments:
--query_strategy QUERY_STRATEGY Query strategy to use. Default: max.
--balance_strategy BALANCE_STRATEGY Balance strategy to use. Default: double.
--instances_per_query INSTANCES_PER_QUERY Number of instances per query. Default: 1.
---stop_if STOP_IF The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
+--n-stop N_STOP The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
```

#### Example usage
@@ -382,7 +382,7 @@ use it, use `-s` (source) and `-o` (output) to tweak paths.
Adding a legend to the plot can be done with the `-l` or `--show_legend` flag,
with the labels clustered on any of the following: `'filename', 'model',
'query_strategy', 'balance_strategy', 'feature_extraction', 'n_instances',
-'stop_if', 'n_prior_included', 'n_prior_excluded', 'model_param', 'query_param',
+'n-stop', 'n_prior_included', 'n_prior_excluded', 'model_param', 'query_param',
'feature_param', 'balance_param'`

#### Available scripts
9 changes: 7 additions & 2 deletions asreviewcontrib/makita/entrypoint.py
@@ -80,7 +80,7 @@ def execute(self, argv): # noqa: C901
help="Number of instances per query.",
)
parser_template.add_argument(
-"--stop_if",
+"--n-stop",
type=str,
default="min",
help="The number of label actions to simulate.",
@@ -150,6 +150,11 @@ def execute(self, argv): # noqa: C901
nargs="+",
help="Model combinations to exclude.",
)
+parser_template.add_argument(
+"--no-balance-strategy",
+nargs="+",
+help="Do not use a balance strategy.",
+)

parser_template.set_defaults(func=self._template_cli)
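Review note on `--no-balance-strategy`: with `nargs="+"` argparse requires at least one value after the flag, which may not match a help text that reads like an on/off switch. The following is a minimal standalone sketch of the difference, not Makita's code. The parser name `flag-sketch` and the second flag `--no-balance-switch` are hypothetical and exist only for comparison.

```python
import argparse

parser = argparse.ArgumentParser(prog="flag-sketch")
# Same shape as the added argument: nargs="+" demands one or more values.
parser.add_argument("--no-balance-strategy", nargs="+")
# Hypothetical alternative: a boolean switch that takes no value.
parser.add_argument("--no-balance-switch", action="store_true")

# With nargs="+", the values are collected into a list.
with_values = parser.parse_args(["--no-balance-strategy", "double"])
print(with_values.no_balance_strategy)

# With store_true, the bare flag is enough; the unused option stays None.
switch = parser.parse_args(["--no-balance-switch"])
print(switch.no_balance_switch, switch.no_balance_strategy)

# A bare --no-balance-strategy is rejected: argparse reports the error
# on stderr and raises SystemExit.
try:
    parser.parse_args(["--no-balance-strategy"])
    bare_ok = True
except SystemExit:
    bare_ok = False
print(bare_ok)
```

Whether the flag should take values is a design decision for the author; the sketch only shows what each spelling accepts.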

@@ -316,7 +321,7 @@ def _get_template_args(self):
"balance_strategies",
"impossible_models",
"instances_per_query",
-"stop_if",
+"n_stop",
]
return {
key: vars(self.args).get(key)
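Review note on the key lookup: argparse derives the attribute name from the long option by replacing hyphens with underscores, so a value passed as `--n-stop` is read back from the namespace under `n_stop`, and a lookup with a hyphenated key returns `None`. A minimal sketch of that behaviour, assuming a standalone parser (the name `makita-sketch`, the key list, and the `None` filter are illustrative, not the project's code):

```python
import argparse

parser = argparse.ArgumentParser(prog="makita-sketch")
parser.add_argument(
    "--n-stop",
    type=str,
    default="min",
    help="The number of label actions to simulate.",
)

args = parser.parse_args(["--n-stop", "100"])
namespace = vars(args)

# The namespace key uses an underscore, not the hyphen from the flag.
print(namespace)

# A hyphenated lookup silently misses; the underscore form finds the value.
hyphen_lookup = namespace.get("n-stop")
underscore_lookup = namespace.get("n_stop")
print(hyphen_lookup, underscore_lookup)

# Illustrative key filter in the style of a template-args helper.
keys = ["instances_per_query", "n_stop"]
template_args = {
    key: namespace.get(key) for key in keys if namespace.get(key) is not None
}
print(template_args)
```

Passing `dest="..."` to `add_argument` is the standard way to choose a different attribute name if one is wanted.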
2 changes: 1 addition & 1 deletion asreviewcontrib/makita/template_arfi.py
@@ -61,7 +61,7 @@ def get_template_specific_params(self, params):
"query_strategy": query_strategy,
"balance_strategy": balance_strategy,
"instances_per_query": self.instances_per_query,
-"stop_if": self.stop_if,
+"n_stop": self.n_stop,
"prior_seed": self.prior_seed,
"output_folder": self.paths.output_folder,
"scripts_folder": self.paths.scripts_folder,
4 changes: 2 additions & 2 deletions asreviewcontrib/makita/template_base.py
@@ -24,7 +24,7 @@ def __init__(
model_seed,
balance_strategy,
instances_per_query,
-stop_if,
+n_stop,
**kwargs,
):
self.datasets = datasets
@@ -34,7 +34,7 @@ def __init__(
self.model_seed = model_seed
self.balance_strategy = balance_strategy
self.instances_per_query = instances_per_query
-self.stop_if = stop_if
+self.n_stop = n_stop
self.file_handler = file_handler
self.__version__ = '.'.join(
str(part)
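Review note on naming: Python identifiers cannot contain a hyphen, so the value that arrives through `--n-stop` has to be carried in parameters and attributes spelled with an underscore. A hyphenated attribute access parses as a subtraction, and a hyphen in a parameter list does not parse at all. A minimal standard-library sketch; the class `TemplateSketch` is a hypothetical stand-in, not Makita's template base class:

```python
import ast

# A hyphenated name is not a valid identifier; the underscore form is.
hyphen_valid = "n-stop".isidentifier()
underscore_valid = "n_stop".isidentifier()
print(hyphen_valid, underscore_valid)

# `self.n-stop` parses as the binary operation (self.n) - (stop).
expr = ast.parse("self.n-stop", mode="eval").body
print(type(expr).__name__, type(expr.op).__name__)

# A hyphen inside a parameter list is a syntax error.
try:
    ast.parse("def __init__(self, n-stop): pass")
    parsed = True
except SyntaxError:
    parsed = False
print(parsed)


class TemplateSketch:
    """Hypothetical stand-in showing the underscore spelling end to end."""

    def __init__(self, instances_per_query, n_stop, **kwargs):
        self.instances_per_query = instances_per_query
        self.n_stop = n_stop

    def params(self):
        # Keys double as template variable names, so they use underscores too.
        return {
            "instances_per_query": self.instances_per_query,
            "n_stop": self.n_stop,
        }


sketch_params = TemplateSketch(instances_per_query=1, n_stop="min").params()
print(sketch_params)
```

The command-line flag itself can keep its hyphen; only the Python-side names need the underscore.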
2 changes: 1 addition & 1 deletion asreviewcontrib/makita/template_basic.py
@@ -68,7 +68,7 @@ def get_template_specific_params(self, params):
"datasets": params,
"skip_wordclouds": self.skip_wordclouds,
"instances_per_query": self.instances_per_query,
-"stop_if": self.stop_if,
+"n_stop": self.n_stop,
"output_folder": self.paths.output_folder,
"scripts_folder": self.paths.scripts_folder,
"version": self.__version__,
2 changes: 1 addition & 1 deletion asreviewcontrib/makita/template_multimodel.py
@@ -52,7 +52,7 @@ def get_template_specific_params(self, params):
"datasets": params,
"skip_wordclouds": self.skip_wordclouds,
"instances_per_query": self.instances_per_query,
-"stop_if": self.stop_if,
+"n_stop": self.n_stop,
"output_folder": self.paths.output_folder,
"scripts_folder": self.paths.scripts_folder,
"n_runs": n_runs,
2 changes: 1 addition & 1 deletion asreviewcontrib/makita/template_prior.py
@@ -170,7 +170,7 @@ def get_template_specific_params(self, params):
"datasets": params,
"skip_wordclouds": self.skip_wordclouds,
"instances_per_query": self.instances_per_query,
-"stop_if": self.stop_if,
+"n_stop": self.n_stop,
"output_folder": self.paths.output_folder,
"scripts_folder": self.paths.scripts_folder,
"version": self.__version__,
@@ -47,7 +47,7 @@ python -m asreview wordcloud {{ dataset.input_file }} -o {{ output_folder }}/fig
# Simulate runs, collect metrics and create plots
mkdir {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files
{% for prior in dataset.priors %}
-python -m asreview simulate {{ dataset.input_file }} -s {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}_{{ prior[0] }}.asreview --prior_record_id {{ " ".join(prior) }} --seed {{ dataset.model_seed }} -m {{ classifier }} -e {{ feature_extractor }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --stop_if {{ stop_if }}
+python -m asreview simulate {{ dataset.input_file }} -s {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}_{{ prior[0] }}.asreview --prior_record_id {{ " ".join(prior) }} --seed {{ dataset.model_seed }} -m {{ classifier }} -e {{ feature_extractor }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --n-stop {{ n_stop }}
python -m asreview metrics {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}_{{ prior[0] }}.asreview -o {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/metrics/metrics_sim_{{ dataset.input_file_stem }}_{{ prior[0] }}.json
{% endfor %}

@@ -49,7 +49,7 @@ python -m asreview wordcloud {{ dataset.input_file }} -o {{ output_folder }}/fig
# Simulate runs
mkdir {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files
{% for run in range(n_runs) %}
-python -m asreview simulate {{ dataset.input_file }} -o {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview --prior-seed {{ dataset.prior_seed + run }} --seed {{ dataset.model_seed + run }} -m {{ classifier }} -e {{ feature_extractor }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --stop_if {{ stop_if }}
+python -m asreview simulate {{ dataset.input_file }} -o {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview --prior-seed {{ dataset.prior_seed + run }} --seed {{ dataset.model_seed + run }} -m {{ classifier }} -e {{ feature_extractor }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --n-stop {{ n_stop }}
python -m asreview metrics {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview -o {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/metrics/metrics_sim_{{ dataset.input_file_stem }}{{ "_{}".format(run) if n_runs > 1 else "" }}.json
{% endfor %}

@@ -55,7 +55,7 @@ mkdir {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files
# Skipped {{ classifier }} + {{ feature_extraction }} + {{ query_strategy}} model
{% else %}# Classifier = {{ classifier }}, Feature extractor = {{ feature_extraction }}, Query strategy = {{ query_strategy }}, Balance strategy = {{balance_strategy}}
{% for run in range(n_runs) %}
-python -m asreview simulate {{ dataset.input_file }} -s {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}_{{ classifier }}_{{ feature_extraction }}_{{ query_strategy }}_{{ balance_strategy }}{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview --model {{ classifier }} --query_strategy {{query_strategy}} --feature_extraction {{ feature_extraction }} --prior_seed {{ dataset.prior_seed + run }} --seed {{ dataset.model_seed }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --stop_if {{ stop_if }}
+python -m asreview simulate {{ dataset.input_file }} -s {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}_{{ classifier }}_{{ feature_extraction }}_{{ query_strategy }}_{{ balance_strategy }}{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview --model {{ classifier }} --query_strategy {{query_strategy}} --feature_extraction {{ feature_extraction }} --prior_seed {{ dataset.prior_seed + run }} --seed {{ dataset.model_seed }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --n-stop {{ n_stop }}
python -m asreview metrics {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/state_files/sim_{{ dataset.input_file_stem }}_{{ classifier }}_{{ feature_extraction }}_{{ query_strategy }}_{{ balance_strategy }}{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview -o {{ output_folder }}/simulation/{{ dataset.input_file_stem }}/metrics/metrics_sim_{{ dataset.input_file_stem }}_{{ classifier }}_{{ feature_extraction }}_{{ query_strategy }}_{{ balance_strategy }}{{ "_{}".format(run) if n_runs > 1 else "" }}.json
{% endfor %}{% endif %}
{% endfor %}
4 changes: 2 additions & 2 deletions asreviewcontrib/makita/templates/template_prior.txt.template
@@ -39,10 +39,10 @@ python -m asreview wordcloud {{ filepath_without_priors }} -o {{ output_folder }
{% endif %}

{% for run in range(n_runs) %}
-python -m asreview simulate {{ filepath_with_priors }} -s {{ output_folder }}/simulation/state_files/sim_custom_priors{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview --seed {{ model_seed + run }} -m {{ classifier }} -e {{ feature_extractor }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --stop_if {{ stop_if }} --prior_idx {{ prior_idx }}
+python -m asreview simulate {{ filepath_with_priors }} -s {{ output_folder }}/simulation/state_files/sim_custom_priors{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview --seed {{ model_seed + run }} -m {{ classifier }} -e {{ feature_extractor }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --n-stop {{ n_stop }} --prior_idx {{ prior_idx }}
python -m asreview metrics {{ output_folder }}/simulation/state_files/sim_custom_priors{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview -o {{ output_folder }}/simulation/metrics/metrics_sim_custom_priors{{ "_{}".format(run) if n_runs > 1 else "" }}.json

-python -m asreview simulate {{ filepath_without_priors }} -s {{ output_folder }}/simulation/state_files/sim_minimal_priors{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview --prior_seed {{ prior_seed + run }} --seed {{ model_seed + run }} -m {{ classifier }} -e {{ feature_extractor }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --stop_if {{ stop_if }}
+python -m asreview simulate {{ filepath_without_priors }} -s {{ output_folder }}/simulation/state_files/sim_minimal_priors{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview --prior_seed {{ prior_seed + run }} --seed {{ model_seed + run }} -m {{ classifier }} -e {{ feature_extractor }} -q {{ query_strategy }} -b {{ balance_strategy }} --n_instances {{ instances_per_query }} --n-stop {{ n_stop }}
python -m asreview metrics {{ output_folder }}/simulation/state_files/sim_minimal_priors{{ "_{}".format(run) if n_runs > 1 else "" }}.asreview -o {{ output_folder }}/simulation/metrics/metrics_sim_minimal_priors{{ "_{}".format(run) if n_runs > 1 else "" }}.json

{% endfor %}
