Commit

No longer making both .in and .gold files, so we need to check the .gold files only
AngledLuffa committed Nov 18, 2024
1 parent 215c69e commit 1a53dbc
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions stanza/utils/datasets/prepare_mwt_treebank.py
@@ -69,8 +69,8 @@ def process_treebank(treebank, model_type, paths, args):
 if language in KNOWN_COMPOSABLE_MWTS:
     print("Language %s is known to have all MWT composed of exactly its word pieces. Checking..." % language)
     check_mwt_composition(f"{mwt_dir}/{short_name}.train.in.conllu")
-    check_mwt_composition(f"{mwt_dir}/{short_name}.dev.in.conllu")
-    check_mwt_composition(f"{mwt_dir}/{short_name}.test.in.conllu")
+    check_mwt_composition(f"{mwt_dir}/{short_name}.dev.gold.conllu")
+    check_mwt_composition(f"{mwt_dir}/{short_name}.test.gold.conllu")

 contract_mwt(f"{mwt_dir}/{short_name}.dev.gold.conllu",
              f"{mwt_dir}/{short_name}.dev.in.conllu")
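For readers unfamiliar with the check being pointed at the .gold files, here is a minimal sketch of what "all MWT composed of exactly its word pieces" could mean for a CoNLL-U file. It is not the body of stanza's `check_mwt_composition`, which this diff does not show. The names `find_noncomposed_mwts`, `row`, and `sample` are hypothetical, and the exact, case-sensitive string comparison is an assumption.

```python
def find_noncomposed_mwts(lines):
    """Return (range_id, surface, pieces) for every multi-word token whose
    surface form is not the plain concatenation of its word forms.

    Hypothetical helper for illustration only; not stanza's implementation.
    """
    mismatches = []
    current = None  # (range_id, surface, start, end, pieces) of the open MWT
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            # A blank line ends the sentence, so drop any unfinished MWT.
            current = None
            continue
        if line.startswith("#"):
            # Sentence-level comments carry no tokens.
            continue
        fields = line.split("\t")
        token_id, form = fields[0], fields[1]
        if "-" in token_id:
            # Range IDs such as "3-4" introduce a multi-word token.
            start, end = (int(part) for part in token_id.split("-"))
            current = (token_id, form, start, end, [])
        elif token_id.isdigit() and current is not None:
            range_id, surface, start, end, pieces = current
            index = int(token_id)
            if start <= index <= end:
                pieces.append(form)
            if index == end:
                # Assumption: compare exactly, with no case folding.
                if "".join(pieces) != surface:
                    mismatches.append((range_id, surface, pieces))
                current = None
    return mismatches


def row(token_id, form):
    # Fill the remaining eight CoNLL-U columns with "_" placeholders.
    return "\t".join([token_id, form] + ["_"] * 8)


sample = [
    "# text = don't del",
    row("1-2", "don't"),
    row("1", "do"),
    row("2", "n't"),
    row("3-4", "del"),
    row("3", "de"),
    row("4", "el"),
    "",
]

print(find_noncomposed_mwts(sample))  # [('3-4', 'del', ['de', 'el'])]
```

In the sample, "don't" splits cleanly into "do" + "n't", while "del" does not equal "de" + "el", so only the second token is reported. A language whose treebank produced such a report would presumably not belong in a set like `KNOWN_COMPOSABLE_MWTS`.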
