getting error with vbert #41

somnath-banerjee · 2021-09-01T23:22:22Z

While using vbert, I am getting the following error. Please help.

vbert = onir_pt.reranker('vanilla_transformer', 'bert', text_field='abstract', vocab_config={'train': True})
vbert_pipeline = (
    pt.BatchRetrieve(index, wmodel='BM25', metadata=["docno", "text"]) % 1000
    >> pt.text.get_text(index, "text")
    >> vbert
)
df_res = vbert_pipeline.search("can vitamin d cure covid 19")

[2021-09-02 01:10:08,346][onir_pt][DEBUG] using GPU (deterministic)
[2021-09-02 01:10:11,481][onir_pt][DEBUG] [starting] batches
[2021-09-02 01:10:11,485][onir][CRITICAL] Uncaught exception
Traceback (most recent call last):
  File "vbert_baseline.py", line 123, in <module>
    df_res= vbert_pipeline.search("can vitamin d cure covid 19")
  File "/home/sbanerjee/miniconda3/envs/mytorch/lib/python3.8/site-packages/pyterrier/transformer.py", line 177, in search
    rtr = self.transform(queryDf)
  File "/home/sbanerjee/miniconda3/envs/mytorch/lib/python3.8/site-packages/pyterrier/transformer.py", line 807, in transform
    topics = m.transform(topics)
  File "/home/sbanerjee/miniconda3/envs/mytorch/lib/python3.8/site-packages/onir_pt/__init__.py", line 277, in transform
    for count, batch in _logger.pbar(batches, desc='batches', tqdm=pyterrier.tqdm, total=math.ceil(len(dataframe) / self.config['batch_size'])):
  File "/home/sbanerjee/miniconda3/envs/mytorch/lib/python3.8/site-packages/onir/log.py", line 110, in pbar
    yield from pbar
  File "/home/sbanerjee/miniconda3/envs/mytorch/lib/python3.8/site-packages/tqdm/std.py", line 1185, in __iter__
    for obj in iterable:
  File "/home/sbanerjee/miniconda3/envs/mytorch/lib/python3.8/site-packages/onir_pt/__init__.py", line 417, in _iter_batches
    batch[f].append(len(doc_tok))
TypeError: object of type 'NoneType' has no len()

seanmacavaney · 2021-09-09T14:29:22Z

Hi @somnath-banerjee,

Sorry for the delay. It looks like the vbert model is trying to re-rank based on the "abstract" field (text_field='abstract'), whereas only a "text" field is available (metadata=["docno", "text"]). I think switching to text_field='text' should resolve your problem!
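
To illustrate the mismatch, here is a minimal plain-Python sketch. It does not use onir_pt or PyTerrier; the `rows` list and the `missing_field_rows` helper are hypothetical stand-ins for the retrieved results and for a sanity check you could run before re-ranking. It assumes the missing "abstract" value surfaces as None, which is what the `len(doc_tok)` line in the traceback suggests.

```python
def missing_field_rows(rows, text_field):
    """Return the indices of rows whose `text_field` is absent or None."""
    return [i for i, row in enumerate(rows) if row.get(text_field) is None]


# Hypothetical stand-in for the retrieved results: only "docno" and "text" exist,
# mirroring metadata=["docno", "text"] in the pipeline above.
rows = [
    {"docno": "d1", "text": "vitamin d and covid 19"},
    {"docno": "d2", "text": "bm25 retrieval"},
]

# Every row lacks "abstract"; no row lacks "text".
missing_abstract = missing_field_rows(rows, "abstract")
missing_text = missing_field_rows(rows, "text")
print(missing_abstract)
print(missing_text)

# Looking up the absent field gives None, and len(None) raises a TypeError
# with the same message as the one at the bottom of the traceback.
try:
    len(rows[0].get("abstract"))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

A check like this on the field named in text_field, run before the re-ranker, points at the missing field directly rather than failing deep inside batching.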

somnath-banerjee · 2021-09-09T20:41:20Z

Hi @seanmacavaney,
Thanks. It worked after changing to text_field='text'.
Some of the scores I am getting are negative. I am new to IR. Could you kindly let me know how to interpret this from a theoretical point of view?
Thanks in advance.

seanmacavaney · 2021-09-09T20:46:10Z

Yes. The query-document relevance scores produced by the model are only meaningful relative to other query-document relevance scores for the same query. In other words, the only thing that matters is whether document A's score is greater or less than document B's; this determines the order of the two documents in the ranking. A negative score therefore has no special meaning on its own.

Some other models make stronger claims about the meaning of the scores produced. For instance, probabilistic models frame the scores as a probability.
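
As a small illustration of the ordering point, here is a sketch in plain Python. The scores are made up for the example (they are not vbert outputs) and the `rank` helper is hypothetical. It shows that the ranking is the same whether or not the scores are negative, because adding a constant to every score leaves the order unchanged.

```python
def rank(scores):
    """Return document ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Made-up scores for one query; two of them are negative.
scores = {"docA": 1.7, "docB": -0.4, "docC": -2.3}

# Shift every score by the same constant so that all of them are positive.
shifted = {doc: score + 10.0 for doc, score in scores.items()}

ranking = rank(scores)
shifted_ranking = rank(shifted)

print(ranking)
print(ranking == shifted_ranking)
```

Here docB is ranked above docC even though both scores are negative; the sign alone does not say whether a document is relevant.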

somnath-banerjee · 2021-09-09T20:52:45Z

Thanks a lot for your answer.
But if the vbert model produces a negative score for a query-document pair, what does this mean? How does it differ from a query-document pair for which it gives a positive score?
