AB4TSV is a hybrid architecture that combines BERT with analogies for tackling Target Sense Verification (TSV). In this repository we provide scripts for training and evaluating AB4TSV on the WiC-TSV evaluation benchmark.
pip install -r requirements.txt
The WiC-TSV dataset, as well as the training and evaluation scripts for HyperBertCLS and HyperBert3, are available here. Make sure to copy the contents of wic-tsv/data/en to ab4tsv/data.
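The copy step can also be scripted. The following is a minimal sketch, assuming the wic-tsv and ab4tsv repositories are cloned side by side in the current working directory; the helper name copy_wic_tsv_data is hypothetical and is not part of this repository.

```python
import shutil
from pathlib import Path


def copy_wic_tsv_data(src="wic-tsv/data/en", dst="ab4tsv/data"):
    """Copy the contents of `src` into `dst`, creating `dst` if needed.

    Hypothetical helper: the default paths assume both repositories
    sit next to each other in the current working directory.
    """
    src_path, dst_path = Path(src), Path(dst)
    if not src_path.is_dir():
        raise FileNotFoundError(f"dataset directory not found: {src_path}")
    # dirs_exist_ok=True (Python 3.8+) merges into an existing target directory.
    shutil.copytree(src_path, dst_path, dirs_exist_ok=True)
    return dst_path
```

Calling copy_wic_tsv_data() from the directory that contains both clones performs the copy; pass explicit paths if your layout differs.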
There are two equivalent ways to fine-tune AB4TSV on WiC-TSV.
Initialize the training parameters inside scripts/train.sh
encoding=swap_fc # -e
target_fc=$ # -tfc
hyps_start_fc=$ # -hfc1
hyps_end_fc=$ # -hfc2
permutation_invariance=False # -pi
seed=42 # --seed
num_epochs=5 # --num_epochs
batch_size_train=16 # --batch_size_train
batch_size_val=8 # --batch_size_val
output_dir=results # --output_dir
python src/train.py -a $A $B $C $D -e $encoding -pi $permutation_invariance
Then simply run the following command:
bash ./scripts/train.sh
Alternatively, pass the arguments of interest directly to src/train.py
python src/train.py \
--analogy tgt hyps def hyps \
--encoding swap_fc \
--permutation_invariance False
As with training, there are two ways to evaluate your AB4TSV model. Note that results can be obtained only on the development set, since the test set is private. For test set results, submit your predictions on CodaLab.
Set the evaluation parameters inside scripts/eval.sh
dataset=dev # -d
encoding=swap_fc # -e
target_fc=$ # -tfc
hyps_starting_fc=$ # -hfc1
hyps_ending_fc=$ # -hfc2
permutation_invariance=False # -pi
save_preds=True # --save_preds
out_binary_preds=False # --out_binary_preds
output_dir=results # --output_dir
python src/eval.py -a $A $B $C $D -d $dataset -e $encoding -pi $permutation_invariance
Then simply run the following command:
bash ./scripts/eval.sh
Alternatively, pass the arguments of interest directly to src/eval.py
python src/eval.py \
--analogy tgt hyps def hyps \
--encoding swap_fc \
--permutation_invariance False \
--save_preds True
To reproduce the experimental results of the analogical proportions optimization, train AB4TSV four times (with seeds 42, 100, 142, and 200) for each encoding and each relation listed in analogical_relations.txt.
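The resulting training grid can be enumerated with a short script. This is only a sketch under several assumptions: that analogical_relations.txt holds one relation per line as four whitespace-separated terms (for example tgt hyps def hyps), that src/train.py accepts the --seed flag noted in scripts/train.sh, and that the caller supplies the list of encodings (only swap_fc appears in this README). The function name build_training_commands is hypothetical.

```python
from itertools import product

SEEDS = (42, 100, 142, 200)  # seeds listed in this README


def build_training_commands(relation_lines, encodings, seeds=SEEDS):
    """Return one argument list per (encoding, relation, seed) combination.

    Assumes each non-empty line holds whitespace-separated analogy terms;
    the real format of analogical_relations.txt may differ.
    """
    relations = [line.split() for line in relation_lines if line.strip()]
    commands = []
    for encoding, relation, seed in product(encodings, relations, seeds):
        commands.append(
            ["python", "src/train.py",
             "--analogy", *relation,
             "--encoding", encoding,
             "--seed", str(seed)]
        )
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True) to launch one training run. Other flags, such as --permutation_invariance or --output_dir, would need to be appended to match your setup.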