liv4ever-mt-re-eval

Re-evaluating Liv4ever-MT

We found that the performance of Liv4ever-MT has been underestimated because of a Unicode inconsistency problem. We describe this problem in detail in our system description paper. Here we provide the scripts to reproduce the experiments.

Unicode is easy to normalize with Python (see norm_unicode.py).
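
As a rough illustration of the kind of inconsistency involved, the sketch below uses only Python's standard `unicodedata` module. It is not the content of norm_unicode.py; the helper name `normalize_nfkc` and the sample strings are assumptions made for this example.

```python
import unicodedata


def normalize_nfkc(text: str) -> str:
    """Return the NFKC-normalized form of text (hypothetical helper)."""
    return unicodedata.normalize("NFKC", text)


# The same visible letter, encoded in two different ways.
composed = "\u0101"      # 'a with macron' as a single code point
decomposed = "a\u0304"   # 'a' followed by a combining macron

# The raw strings differ, so an exact token match between them fails.
raw_equal = composed == decomposed

# After normalization both strings have the same code point sequence.
normalized_equal = normalize_nfkc(composed) == normalize_nfkc(decomposed)

print(raw_equal, normalized_equal)
```

If a hypothesis and its reference use different encodings of the same character, string-matching metrics count the tokens as mismatches, which is one way a score can end up lower than it should be.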

Dependency

python3 -m pip install -U prettytable

Normalize references to NFKC

sh norm-ref.sh
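
The contents of norm-ref.sh are not reproduced here. As a minimal sketch of what normalizing a reference file could look like in Python, assuming one UTF-8 sentence per line, the function below writes an NFKC-normalized copy. The name `normalize_file` and its parameters are hypothetical and not part of the repository.

```python
import unicodedata


def normalize_file(src: str, dst: str, form: str = "NFKC") -> int:
    """Write a normalized copy of src to dst.

    Returns the number of lines that normalization changed.
    Hypothetical helper: assumes UTF-8 text with one sentence per line.
    """
    changed = 0
    with open(src, encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            normalized = unicodedata.normalize(form, line)
            if normalized != line:
                changed += 1
            fout.write(normalized)
    return changed
```

The returned count gives a quick check of how many reference lines were affected, for instance to see which subsets contain non-normalized text.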

Download Liv4ever-MT model

See the Preparation section in the main README.md.

Generate translations

sh gen.sh

Evaluate

python3 score.py

Outputs:

+---------------------------------------------------------------------------------------------------------------------------+
|                                           Before normalizing references to NFKC                                           |
+-------------+-------+-------+--------+-------+-------+--------+-------+-------+--------+--------+--------+--------+-------+
|    subset   | et-en | lv-en | liv-en | en-et | lv-et | liv-et | en-lv | et-lv | liv-lv | en-liv | et-liv | lv-liv |  avg. |
+-------------+-------+-------+--------+-------+-------+--------+-------+-------+--------+--------+--------+--------+-------+
|     Full    | 25.90 | 17.94 | 18.90  | 19.28 | 22.31 | 22.86  | 20.20 | 23.31 | 24.88  | 10.90  | 16.62  | 17.69  | 20.07 |
|   Facebook  | 28.43 | 13.95 | 19.44  | 25.32 | 22.97 | 24.89  | 26.60 | 28.21 | 33.14  | 13.93  | 19.26  | 21.23  | 23.11 |
| livones.net | 25.80 | 18.85 | 19.73  | 21.05 | 20.54 | 18.39  | 26.98 | 30.16 | 29.73  | 15.09  | 19.93  | 23.82  | 22.51 |
|  dictionary | 16.06 | 12.12 |  7.94  | 13.30 | 25.27 | 36.73  |  7.89 | 28.63 | 26.75  | 10.61  | 32.01  | 28.51  | 20.48 |
|   trilium   | 32.36 | 17.93 | 18.89  | 27.02 | 28.66 | 26.71  | 21.08 | 30.79 | 27.78  | 14.42  | 20.59  | 20.00  | 23.85 |
|    stalte   | 21.86 | 12.59 | 13.81  | 12.69 | 24.83 | 29.34  | 10.86 | 24.53 | 31.84  |  9.38  | 25.25  | 24.63  | 20.14 |
|    esuka    | 14.94 | 24.31 |  7.26  | 11.15 | 11.21 | 14.67  | 32.31 | 13.71 |  7.58  |  5.15  | 10.40  |  5.71  | 13.20 |
|  satversme  | 27.50 | 19.77 | 24.68  | 16.69 | 20.22 | 18.68  | 16.05 | 15.10 | 19.38  |  7.58  |  7.18  |  9.23  | 16.84 |
+-------------+-------+-------+--------+-------+-------+--------+-------+-------+--------+--------+--------+--------+-------+
+---------------------------------------------------------------------------------------------------------------------------+
|                                            After normalizing references to NFKC                                           |
+-------------+-------+-------+--------+-------+-------+--------+-------+-------+--------+--------+--------+--------+-------+
|    subset   | et-en | lv-en | liv-en | en-et | lv-et | liv-et | en-lv | et-lv | liv-lv | en-liv | et-liv | lv-liv |  avg. |
+-------------+-------+-------+--------+-------+-------+--------+-------+-------+--------+--------+--------+--------+-------+
|     Full    | 26.20 | 18.06 | 19.26  | 20.72 | 24.28 | 24.42  | 24.10 | 27.77 | 29.33  | 14.31  | 20.51  | 22.35  | 22.61 |
|   Facebook  | 28.43 | 13.95 | 19.44  | 25.32 | 22.97 | 24.89  | 26.60 | 28.21 | 33.14  | 13.93  | 19.26  | 21.23  | 23.11 |
| livones.net | 25.80 | 18.85 | 19.73  | 21.05 | 20.54 | 18.39  | 26.98 | 30.16 | 29.73  | 15.09  | 19.93  | 23.82  | 22.51 |
|  dictionary | 16.06 | 12.12 |  7.94  | 13.30 | 25.27 | 36.73  |  7.89 | 28.63 | 26.75  | 10.61  | 32.01  | 28.51  | 20.48 |
|   trilium   | 32.36 | 17.93 | 18.89  | 27.02 | 28.66 | 26.71  | 21.08 | 30.79 | 27.78  | 14.42  | 20.59  | 20.00  | 23.85 |
|    stalte   | 21.86 | 12.59 | 13.81  | 12.69 | 24.83 | 29.34  | 10.86 | 24.53 | 31.84  |  9.38  | 25.25  | 24.63  | 20.14 |
|    esuka    | 14.94 | 24.31 |  7.26  | 11.15 | 11.21 | 14.67  | 32.31 | 13.71 |  7.58  |  5.15  | 10.40  |  5.71  | 13.20 |
|  satversme  | 28.45 | 20.21 | 25.76  | 21.41 | 26.74 | 23.75  | 29.10 | 29.82 | 33.56  | 18.23  | 19.87  | 24.15  | 25.09 |
+-------------+-------+-------+--------+-------+-------+--------+-------+-------+--------+--------+--------+--------+-------+
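
To make the effect of normalization easier to read off, the following sketch computes the per-direction gain on the Full subset. The numbers are copied from the Full rows of the two tables above; the variable names are illustrative only.

```python
# Column order matches the tables above.
directions = [
    "et-en", "lv-en", "liv-en", "en-et", "lv-et", "liv-et",
    "en-lv", "et-lv", "liv-lv", "en-liv", "et-liv", "lv-liv", "avg.",
]

# Full subset, before normalizing references to NFKC.
full_before = [25.90, 17.94, 18.90, 19.28, 22.31, 22.86,
               20.20, 23.31, 24.88, 10.90, 16.62, 17.69, 20.07]

# Full subset, after normalizing references to NFKC.
full_after = [26.20, 18.06, 19.26, 20.72, 24.28, 24.42,
              24.10, 27.77, 29.33, 14.31, 20.51, 22.35, 22.61]

# Gain per direction, rounded to two decimals like the tables.
gains = {
    d: round(after - before, 2)
    for d, before, after in zip(directions, full_before, full_after)
}

# Direction with the largest gain, excluding the average column.
largest = max((d for d in directions if d != "avg."), key=gains.get)

for d in directions:
    print(f"{d:>7}: {gains[d]:+.2f}")
```

In the tables above, only the Full and satversme rows differ between the two settings; the other subsets have identical scores before and after normalization.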