During prediction batch seems to miss 'src_token_ids' key #1

Open
LeHarter opened this issue May 14, 2022 · 1 comment

@LeHarter

I've tried to parse our Spanish AMR test set with your xl-amr cross-lingual parser, using this command:

```
python -u -m xlamr_stog.commands.predict --archive-file C:/Users/user/Documents/AMR/SpanishAMR/xl-amr/models/xl-amr_bilingual_en_es_trans_amr --weights-file C:/Users/user/Documents/AMR/SpanishAMR/xl-amr/models/xl-amr_bilingual_en_es_trans_amr/best.th --input-file C:/Users/user/Documents/AMR/SpanishAMR/Training_t5wtense/test_es.txt.features.input_clean.recategorize --batch-size 32 --use-dataset-reader --output-file C:/Users/user/Documents/AMR/SpanishAMR/Training_t5wtense/test_output.txt --silent --beam-size 5 --predictor STOG
```

Unfortunately, I got this error:
Original exception was:

```
Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\commands\predict.py", line 275, in <module>
    _predict(args)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\commands\predict.py", line 227, in _predict
    manager.run()
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\commands\predict.py", line 200, in run
    for model_input_instance, result in zip(batch, self._predict_instances(batch)):
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\commands\predict.py", line 158, in _predict_instances
    results, encoder_last_state_seq = self._predictor.predict_batch_instance(batch_data)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\predictors\stog.py", line 39, in predict_batch_instance
    _outputs, encoder_last_state_seq = super(STOGPredictor, self).predict_batch_instance(instances)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\predictors\predictor.py", line 62, in predict_batch_instance
    outputs, encoder_last_state_seq = self._model.forward_on_instances(instances)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\models\model.py", line 149, in forward_on_instances
    encoder_outputs = self(model_input)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\models\stog.py", line 350, in forward
    encoder_outputs = self.encode(
  File "C:\Users\user\Documents\AMR\SpanishAMR\xl-amr\xlamr_stog\models\stog.py", line 437, in encode
    bert_mask = bert_tokens.ne(0)
AttributeError: 'NoneType' object has no attribute 'ne'
```

It seems that bert_tokens is None because the batch tensor dictionary is missing the "src_token_ids" key:
```python
def prepare_batch_input(self, batch):
    # [batch, num_tokens]
    bert_token_inputs = batch.get('src_token_ids', None)
    if bert_token_inputs is not None:
        bert_token_inputs = bert_token_inputs.long()
    encoder_token_subword_index = batch.get('src_token_subword_index', None)
    if encoder_token_subword_index is not None:
        encoder_token_subword_index = encoder_token_subword_index.long()
    encoder_token_inputs = batch['src_tokens']['encoder_tokens']
    encoder_pos_tags = batch['src_pos_tags']
    encoder_must_copy_tags = batch['src_must_copy_tags']
    # [batch, num_tokens, num_chars]
    encoder_char_inputs = batch['src_tokens']['encoder_characters']
    # [batch, num_tokens]
    encoder_mask = get_text_field_mask(batch['src_tokens'])
```
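
For reference, the failure mode itself is easy to reproduce outside xl-amr: batch.get returns None when the field was never added by the dataset reader, and the later call to .ne on None raises exactly this AttributeError. A minimal sketch (independent of xl-amr, the batch contents below are made up for illustration):

```python
# Minimal reproduction of the failure mode, independent of xl-amr:
# a missing 'src_token_ids' entry makes dict.get return None, and the
# later call bert_tokens.ne(0) then fails with this AttributeError.
import torch

batch = {"src_tokens": {"encoder_tokens": torch.zeros(2, 5, dtype=torch.long)}}

bert_tokens = batch.get("src_token_ids", None)  # None: the reader never added the field
try:
    bert_mask = bert_tokens.ne(0)
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'ne'
```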

How could I fix that?

@GerlinGreen

Hello, we ran into a similar issue while using the cktp-amr-2.0 archive model. In our case the BERT vocabulary file could not be found, and we solved the problem by correcting the parameters in the config file under our archive-file path.
Maybe you can check whether the BERT model paths in the config.json under your archive-file path are correct, especially the word_splitter entry, which points to the BERT vocab file.
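
One quick way to check this is to scan the config for path-like string values that don't exist on disk. A rough sketch, not part of xl-amr; ARCHIVE_DIR and the path heuristic are assumptions, adjust them to your setup:

```python
# Hypothetical sanity check (not part of xl-amr): walk config.json inside the
# archive directory and flag any string value that looks like a file path but
# is missing on disk, e.g. a stale BERT vocab path copied from another machine.
import json
import os

ARCHIVE_DIR = "models/xl-amr_bilingual_en_es_trans_amr"  # adjust to your --archive-file path

def find_missing_paths(node, trail=""):
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_missing_paths(value, f"{trail}.{key}" if trail else key)
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from find_missing_paths(value, f"{trail}[{i}]")
    elif isinstance(node, str):
        # Heuristic: treat values with a path separator or a known file suffix as paths.
        looks_like_path = "/" in node or os.sep in node or node.endswith((".txt", ".json", ".bin"))
        if looks_like_path and not os.path.exists(node):
            yield trail, node

with open(os.path.join(ARCHIVE_DIR, "config.json"), encoding="utf-8") as f:
    config = json.load(f)

for trail, value in find_missing_paths(config):
    print(f"possibly broken path: {trail} -> {value}")
```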
