
OverflowError: out of range integral type conversion attempted #69

Open
1-sf opened this issue Dec 31, 2023 · 3 comments

Comments


1-sf commented Dec 31, 2023

I get the following error:

You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
100% 1061/1061 [2:04:03<00:00, 3.45s/it]
Traceback (most recent call last):
  File "/content/mm-cot/main.py", line 395, in <module>
    T5Trainer(
  File "/content/mm-cot/main.py", line 284, in T5Trainer
    metrics = trainer.evaluate(eval_dataset = test_set, max_length=args.output_len)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer_seq2seq.py", line 159, in evaluate
    return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3043, in evaluate
    output = eval_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3343, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "/content/mm-cot/main.py", line 215, in compute_metrics_rougel
    preds = tokenizer.batch_decode(preds, skip_special_tokens=True, clean_up_tokenization_spaces=True)
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3469, in batch_decode
    return [
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3470, in <listcomp>
    self.decode(
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 3509, in decode
    return self._decode(
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_fast.py", line 546, in _decode
    text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
OverflowError: out of range integral type conversion attempted

This happens when I run inference for rationale generation:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py \
  --data_root data/ScienceQA/data \
  --caption_file data/instruct_captions.json \
  --model declare-lab/flan-alpaca-large \
  --user_msg rationale --img_type vit \
  --bs 2 --eval_bs 4  --epoch 50 --lr 5e-5 --output_len 512 \
  --use_caption --use_generate --prompt_format QCM-E \
  --output_dir experiments \
  --evaluate_dir models/mm-cot-large-rationale

This happens after the 1061 iterations are completed. As a consequence, it doesn't generate experiments/rationale_declare-lab-flan-alpaca-large_vit_QCM-E_lr5e-05_bs8_op512_ep50/predictions_ans_eval.json, which the answer-inference phase expects as input.

@SanghyeokSon

I have the same problem.

I suspect the error occurs because the tokenizer cannot decode preds, since preds contains -100 values (the ignore index used for loss masking).
It is related to this issue: huggingface/transformers#22634
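If that diagnosis is right, the usual workaround (a sketch of the common fix for this class of error, not necessarily the exact patch from the linked issue) is to replace the -100 padding with the tokenizer's pad token id before decoding. The helper name `sanitize_for_decode` is hypothetical:

```python
import numpy as np

def sanitize_for_decode(token_ids, pad_token_id):
    """Replace the -100 ignore-index used for loss masking with a real
    token id, so the fast (Rust) tokenizer can decode without the
    'out of range integral type conversion' OverflowError."""
    token_ids = np.asarray(token_ids)
    return np.where(token_ids != -100, token_ids, pad_token_id)

# Hypothetical usage inside compute_metrics_rougel, before batch_decode:
# preds = sanitize_for_decode(preds, tokenizer.pad_token_id)
# preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
```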


1-sf commented Jan 14, 2024

I tried the fix from huggingface/transformers#24433 (comment) and it seems to have worked.


cooelf commented May 19, 2024

This issue may be due to an update of the transformers library. The solution above seems to be effective.
