KeyError: '1' when running "make_train_from_ranking.py" #2
Comments
Hi, sorry about this. You need this file for training: train_query_passage_pair.tsv for the |
I tried the new pair file, but it still failed. I noticed that "queries.train.tsv", which I passed as "--query-file", has 808731 examples, whereas "train_query_passage_pair.tsv", which I passed as "--pair-file", has only 532751. I suspect the error is caused by this mismatch between the two files. Could you provide the file that should be used for the "--query-file" arg? Thank you very much! |
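The suspected mismatch can be checked directly by comparing the query IDs in the two files. The following is a minimal sketch, assuming both files are tab-separated with the query ID in the first column (the layout of "train_query_passage_pair.tsv" is not shown in this thread, so that part is an assumption). The helper names `first_column_ids` and `missing_ids` are hypothetical and not part of the repository.

```python
from typing import Iterable, Set


def first_column_ids(lines: Iterable[str]) -> Set[str]:
    """Collect the first tab-separated field of every non-empty line."""
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        if line:
            # Assumption: the query ID is the first column of the TSV.
            ids.add(line.split("\t", 1)[0])
    return ids


def missing_ids(query_lines: Iterable[str], pair_lines: Iterable[str]) -> Set[str]:
    """Return query IDs that appear in the query file but not in the pair file."""
    return first_column_ids(query_lines) - first_column_ids(pair_lines)


# Tiny in-memory stand-ins for the real files (made-up rows, not MS MARCO data).
queries = ["1\tfirst query\n", "2\tsecond query\n", "3\tthird query\n"]
pairs = ["2\t10\n", "3\t11\n"]
print(sorted(missing_ids(queries, pairs)))  # ['1']
```

With the real data, the same function could be called on two open file handles, one for the "--query-file" path and one for the "--pair-file" path. Any ID it reports would raise a KeyError as soon as the script looks it up in the pair mapping.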
I ran into this problem earlier in this issue. I mistakenly thought I had found the correct file, but it seems I hadn't. |
For the |
Hello, thanks for your amazing work. I would really like to reproduce it, but I ran into an issue when running the code. Could you help me?
command line:
python make_train_from_ranking.py --ranking-file /home/zhangxy/QA/ANCE-PRF/pyserini/runs/run.msmarco-passage.ance.bf.tsv --model-type ANCE --query-file /home/zhangxy/QA/ANCE-PRF-main/data/marco_raw_data/queries.train.tsv --collection-file ./data/msmarco_passage/collection/collection.tsv --pair-file /home/zhangxy/QA/ANCE-PRF-main/data/marco_raw_data/qrels.train.tsv --output data/hard/negative.result --encoder /home/zhangxy/QA/pyserini_for_ance-prf/pyserini/encoders/ance-msmarco-passage
processing:
Load Query: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 808731/808731 [00:00<00:00, 1140903.16it/s]
Load Collection: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 8841823/8841823 [00:16<00:00, 521248.96it/s]
Load Q-D Pair: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 532761/532761 [00:00<00:00, 989247.88it/s]
Load Ranking: 0%| | 0/808731000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "make_train_from_ranking.py", line 94, in <module>
    rankings, topk = read_ranking(args.ranking_file, pair, args.prf_k, args.from_top)
  File "make_train_from_ranking.py", line 35, in read_ranking
    targets = pair[qid].keys()
KeyError: '1'
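The traceback points at `pair[qid].keys()`, which raises a KeyError whenever the ranking file contains a query ID that is absent from the pair mapping. The sketch below illustrates that failure mode and one way to surface the offending IDs instead of crashing. It assumes `pair` is a dict that maps a query ID to a dict keyed by passage ID (inferred only from the `.keys()` call in the traceback). `collect_targets` is a hypothetical helper, not the repository's code or its intended fix.

```python
from typing import Dict, Iterable, List, Tuple


def collect_targets(
    ranking_qids: Iterable[str],
    pair: Dict[str, Dict[str, int]],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Look up target passages per query ID, recording IDs missing from `pair`."""
    found: Dict[str, List[str]] = {}
    skipped: List[str] = []
    for qid in ranking_qids:
        entry = pair.get(qid)  # .get() avoids the KeyError raised by pair[qid]
        if entry is None:
            skipped.append(qid)
            continue
        found[qid] = list(entry.keys())
    return found, skipped


# Made-up data: query '1' is ranked but has no entry in the pair mapping.
pair = {"2": {"10": 1}}
found, skipped = collect_targets(["1", "2"], pair)
print(found)    # {'2': ['10']}
print(skipped)  # ['1']
```

Skipping silently may hide a genuine mismatch between the input files, so the list of skipped IDs is returned for inspection rather than discarded.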