The IWSLT test sets have moved, yaml files no longer available. #4

Open
bhaddow opened this issue Feb 22, 2023 · 5 comments

Comments

@bhaddow

bhaddow commented Feb 22, 2023

Hi

The IWSLT test sets are no longer at the location given in the README, and they no longer include a yaml file. I presume
that it has been replaced by the xml files containing the transcription/translation.

The IWSLT test sets are now at http://i13pc106.ira.uka.de/~jniehues/IWSLT-SLT/data/eval/en-de/

best
Barry

@bhaddow
Author

bhaddow commented Feb 22, 2023

Hi

Actually, the segmented versions do contain the yaml files, and they are available for 2019 and 2020. To prepare the tsv, run:

python $IWSLT_ROOT/scripts/prepare_iwslt_tst.py --test-dir-root $IWSLT_TEST_ROOT/IWSLT.tst2020 

The arguments differ slightly from those in the README.
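
Since only the segmented downloads carry the yaml files, it may help to check for one before running the script. The following is a minimal sketch, not part of the repository: the helper names `find_yaml_files` and `has_segmentation` are hypothetical, and it assumes nothing about the directory layout beyond a yaml file existing somewhere under the test set directory.

```python
from pathlib import Path


def find_yaml_files(test_dir_root):
    """Return a sorted list of YAML files found anywhere under test_dir_root."""
    root = Path(test_dir_root)
    # Search recursively, because the exact layout of the download is not assumed.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )


def has_segmentation(test_dir_root):
    """Return True if at least one YAML file exists under test_dir_root."""
    return bool(find_yaml_files(test_dir_root))
```

For example, `has_segmentation(os.path.join(os.environ["IWSLT_TEST_ROOT"], "IWSLT.tst2020"))` should return `True` for a segmented download and `False` if only the xml files are present.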

best
Barry

@gegallego
Member

Hi Barry,

Thanks for taking the time to submit this issue. I have fixed it in 3f03765.

By the way, what happened to the segmented version of tst2021? It was available on the previous website, right?

Best,
Gerard

@bhaddow
Author

bhaddow commented Feb 24, 2023

Hi Gerard

Thanks! Yes, I noticed that tst2021 was missing, but I do not know what has happened to it. Jan Niehues may know - it looks like he hosts the test sets.

best
Barry

@gegallego
Member

Thanks, Barry!

@gegallego
Member

Hi @jniehues-kit,

Do you have the segmented version of tst2021? Could you make it available on the new website?

Thanks!
