You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
100% 1061/1061 [2:04:03<00:00, 3.45s/it]Traceback (most recent call last):
File "/content/mm-cot/main.py", line 395, in
T5Trainer(
File "/content/mm-cot/main.py", line 284, in T5Trainer
metrics = trainer.evaluate(eval_dataset = test_set, max_length=args.output_len)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer_seq2seq.py", line 159, in evaluate
return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3043, in evaluate
output = eval_loop(
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3343, in evaluation_loop
metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
File "/content/mm-cot/main.py", line 215, in compute_metrics_rougel
preds = tokenizer.batch_decode(preds, skip_special_tokens=True, clean_up_tokenization_spaces=True)
File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3469, in batch_decode
return [
File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3470, in
self.decode(
File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3509, in decode
return self._decode(
File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_fast.py", line 546, in _decode
text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
OverflowError: out of range integral type conversion attempted
This happens after those 1061 iterations are completed. As a consequence, it doesn't generate experiments/rationale_declare-lab-flan-alpaca-large_vit_QCM-E_lr5e-05_bs8_op512_ep50/predictions_ans_eval.json, which the answer inference phase expects as input.
I suspect the error is caused by the tokenizer failing to decode preds that contain -100, the label-padding sentinel, which the fast tokenizer cannot convert to a valid token id.
It is related to this issue: huggingface/transformers#22634
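The workaround discussed in that transformers issue is to replace the -100 sentinel with the tokenizer's pad token id before calling batch_decode. A minimal sketch, assuming a helper named sanitize_for_decode is added to main.py and applied inside compute_metrics_rougel (the helper name and the default pad id are illustrative, not part of the repo):

```python
import numpy as np

def sanitize_for_decode(token_ids, pad_token_id=0):
    """Replace the -100 loss-masking sentinel with a valid pad id.

    Fast tokenizers convert ids to unsigned integers internally, so a
    -100 raises "OverflowError: out of range integral type conversion
    attempted" during decode. Mapping -100 back to the pad token makes
    the array decodable; skip_special_tokens=True then drops the pads.
    """
    token_ids = np.asarray(token_ids)
    return np.where(token_ids != -100, token_ids, pad_token_id)

# Inside compute_metrics_rougel, before tokenizer.batch_decode(...):
# preds = sanitize_for_decode(preds, tokenizer.pad_token_id)
# labels = sanitize_for_decode(labels, tokenizer.pad_token_id)
```

For T5 tokenizers pad_token_id is typically 0, but passing tokenizer.pad_token_id keeps the sketch model-agnostic.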
I get the same error when I run the inference for rationale generation.