KeyError: '1' when I ran "make_train_from_ranking.py" #2

XY2323819551 · 2022-05-10T01:33:47Z

Hello, thanks for your amazing work. I would really like to reproduce it. However, I ran into an issue when running the code. Could you help me?

command line:
python make_train_from_ranking.py --ranking-file /home/zhangxy/QA/ANCE-PRF/pyserini/runs/run.msmarco-passage.ance.bf.tsv --model-type ANCE --query-file /home/zhangxy/QA/ANCE-PRF-main/data/marco_raw_data/queries.train.tsv --collection-file ./data/msmarco_passage/collection/collection.tsv --pair-file /home/zhangxy/QA/ANCE-PRF-main/data/marco_raw_data/qrels.train.tsv --output data/hard/negative.result --encoder /home/zhangxy/QA/pyserini_for_ance-prf/pyserini/encoders/ance-msmarco-passage

processing:
Load Query: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 808731/808731 [00:00<00:00, 1140903.16it/s]
Load Collection: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 8841823/8841823 [00:16<00:00, 521248.96it/s]
Load Q-D Pair: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 532761/532761 [00:00<00:00, 989247.88it/s]
Load Ranking: 0%| | 0/808731000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "make_train_from_ranking.py", line 94, in <module>
rankings, topk = read_ranking(args.ranking_file, pair, args.prf_k, args.from_top)
File "make_train_from_ranking.py", line 35, in read_ranking
targets = pair[qid].keys()
KeyError: '1'
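
For readers hitting the same traceback: it suggests that a query ID read while loading the ranking ('1') has no entry in the dictionary built from the --pair-file. The following is only a minimal sketch of that failure mode, not the repository's code. The `pair` contents and both helper names are hypothetical, and the nested-dict shape is an assumption inferred from the `pair[qid].keys()` call in the traceback.

```python
def lookup_targets(pair, qid):
    """Direct indexing, as in the traceback: raises KeyError for an unknown qid."""
    return pair[qid].keys()


def lookup_targets_or_none(pair, qid):
    """Guarded variant (hypothetical): returns None when qid has no entry."""
    judged = pair.get(qid)
    return None if judged is None else set(judged.keys())


# Hypothetical data: only query "2" has a judged passage.
pair = {"2": {"100": 1}}

try:
    lookup_targets(pair, "1")
except KeyError as err:
    print("KeyError:", err)  # prints: KeyError: '1'

print(lookup_targets_or_none(pair, "1"))  # prints: None
print(lookup_targets_or_none(pair, "2"))  # prints: {'100'}
```

The guarded variant is shown only to make the missing-key case visible. Whether skipping such queries is appropriate for training is a separate question that the thread does not settle.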

hanglics · 2022-05-10T06:43:45Z

Hi, sorry about this. You need this file for training: train_query_passage_pair.tsv, for the --pair-file arg.

XY2323819551 · 2022-05-10T07:58:03Z

/home/zhangxy/QA/ANCE-PRF-main/data/marco_raw_data/qrels.train.tsv

I tried the new pair file, but it still failed. I noticed that the "queries.train.tsv" file I used for the "--query-file" arg has 808731 examples, whereas "train_query_passage_pair.tsv" for the "--pair-file" arg has 532751 examples, which is fewer than "queries.train.tsv". I guess the issue is caused by a mismatch between the two files. Could you provide the file to use for the "--query-file" arg? Thank you very much!
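
One way to check a suspected mismatch like this is to compare the query IDs in the two files directly. The sketch below assumes that both files are tab-separated with the query ID in the first column. That matches the usual MS MARCO layout for queries.train.tsv, but it is only an assumption for train_query_passage_pair.tsv. The function names are made up for illustration.

```python
def first_column_ids(path):
    """Collect the distinct values found in the first tab-separated column."""
    ids = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line:
                # Assumption: the query ID is the first tab-separated field.
                ids.add(line.split("\t", 1)[0])
    return ids


def missing_query_ids(source_path, pair_path):
    """Return IDs that appear in source_path but not in pair_path, sorted as strings."""
    return sorted(first_column_ids(source_path) - first_column_ids(pair_path))
```

Calling `missing_query_ids` with the --query-file path and the --pair-file path would list the query IDs that have no entry in the pair file. A non-empty result would be consistent with the KeyError above.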

XY2323819551 · 2022-05-10T08:03:08Z

Hi, sorry about this. You need this file for training: train_query_passage_pair.tsv, for the --pair-file arg.

I ran into this problem earlier in this issue. I mistakenly thought I had found the correct file, but it seems I hadn't.

hanglics · 2022-05-11T03:26:07Z

For the --query-file arg, please use this file: train_query_judged.tsv
