ValueError: Could not interpret optimizer identifier: <keras.src.optimizers.adam.Adam object at 0x79d9071160e0> #19262

Closed
YikunHan42 opened this issue Mar 7, 2024 · 13 comments
Labels: stale, stat:awaiting response from contributor, type:support

Comments

@YikunHan42

import tensorflow as tf
from datasets import load_dataset
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, DataCollatorWithPadding
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.optimizers.schedules import PolynomialDecay
from tensorflow.keras.losses import SparseCategoricalCrossentropy

def prepare_imdb_dataset(tokenizer):
    """
    Prepares the IMDB dataset for training and validation.

    Args:
        tokenizer: The tokenizer to use for text tokenization.

    Returns:
        A tuple containing the tokenized training and validation datasets.
    """
    imdb = load_dataset("imdb")
    train_set = imdb['train'].map(lambda x: tokenizer(x['text'], truncation=True), batched=True)
    test_set = imdb['test'].map(lambda x: tokenizer(x['text'], truncation=True), batched=True)
    return train_set, test_set

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

train_set, test_set = prepare_imdb_dataset(tokenizer)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer, return_tensors="tf")

tf_train_dataset = train_set.to_tf_dataset(
    columns=["attention_mask", "input_ids"],
    label_cols=["label"],
    shuffle=True,
    collate_fn=data_collator,
    batch_size=8,
)

tf_validation_dataset = test_set.to_tf_dataset(
    columns=["attention_mask", "input_ids"],
    label_cols=["label"],
    shuffle=False,
    collate_fn=data_collator,
    batch_size=8,
)

batch_size = 16
num_epochs = 1
num_train_steps = len(tf_train_dataset) * num_epochs
lr_scheduler = PolynomialDecay(
    initial_learning_rate=5e-5, end_learning_rate=0.0, decay_steps=num_train_steps
)

optimizer = Adam(learning_rate=lr_scheduler)
loss = SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])

model.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=5)
Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFDistilBertForSequenceClassification: ['vocab_layer_norm.weight', 'vocab_transform.weight', 'vocab_projector.bias', 'vocab_transform.bias', 'vocab_layer_norm.bias']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
Some weights or buffers of the TF 2.0 model TFDistilBertForSequenceClassification were not initialized from the PyTorch model and are newly initialized: ['pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Map: 100% 25000/25000 [00:23<00:00, 1086.84 examples/s]
Map: 100% 25000/25000 [00:20<00:00, 1304.86 examples/s]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-ac80246ded67> in <cell line: 55>()
     53 optimizer = Adam(learning_rate=lr_scheduler)
     54 loss = SparseCategoricalCrossentropy(from_logits=True)
---> 55 model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
     56 
     57 model.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=5)

2 frames
/usr/local/lib/python3.10/dist-packages/tf_keras/src/optimizers/__init__.py in get(identifier, **kwargs)
    332         )
    333     else:
--> 334         raise ValueError(
    335             f"Could not interpret optimizer identifier: {identifier}"
    336         )

ValueError: Could not interpret optimizer identifier: <keras.src.optimizers.adam.Adam object at 0x79d9071160e0>
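
The traceback points to a package mismatch: the check that fails lives in `tf_keras` (`tf_keras/src/optimizers/__init__.py`), while the rejected optimizer is a `keras.src.optimizers.adam.Adam` object. A minimal sketch for confirming that kind of mismatch is to compare the top-level package of each object's class. The helper name `top_level_package` is hypothetical and not part of Keras or Transformers, and the stand-in class only mimics the module path shown in the error message.

```python
def top_level_package(obj):
    """Return the top-level package that defines obj's class."""
    return type(obj).__module__.split(".")[0]


# Stand-in for the optimizer in the error message (assumption: same module path).
FakeAdam = type("Adam", (), {"__module__": "keras.src.optimizers.adam"})

optimizer_pkg = top_level_package(FakeAdam())
# Package that raised the ValueError, taken from the traceback path.
checker_pkg = "tf_keras"

if optimizer_pkg != checker_pkg:
    print(f"Mismatch: optimizer from '{optimizer_pkg}', checked by '{checker_pkg}'")
```

In a real session, calling `top_level_package(optimizer)` on the actual optimizer would show which Keras package it came from.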
@SuryanarayanaY added the "To investigate" label Mar 7, 2024
@SuryanarayanaY
Contributor

Hi @YikunHan42,

Could you please confirm which TF version you have? Please try the latest TF version, i.e. 2.16.0rc0, which also uses Keras 3.

I observed a similar error with the Embedding layer, whose arguments changed in Keras 3. Please refer to #62317.

Since the error is raised from the Transformers model, you need to use the same TF version that the model was built on. Could you find out which TF version that is?
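
A small sketch for answering the version question in one go, using only the standard library. The function name `installed_versions` is hypothetical, and the package list is an assumption based on the packages mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("tensorflow", "keras", "tf-keras", "transformers")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Seeing both `keras` and `tf-keras` in the output is a hint that two Keras implementations are present and may be getting mixed.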

@YikunHan42
Author

> Hi @YikunHan42,
>
> Could you please confirm which TF version you have? Please try the latest TF version, i.e. 2.16.0rc0, which also uses Keras 3.
>
> I observed a similar error with the Embedding layer, whose arguments changed in Keras 3. Please refer to #62317.
>
> Since the error is raised from the Transformers model, you need to use the same TF version that the model was built on. Could you find out which TF version that is?

Hi, I just upgraded both tensorflow and transformers

!pip show tensorflow transformers
Name: tensorflow
Version: 2.16.0rc0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: [email protected]
License: Apache 2.0
Location: /usr/local/lib/python3.10/dist-packages
Requires: absl-py, astunparse, flatbuffers, gast, google-pasta, grpcio, h5py, keras, libclang, ml-dtypes, numpy, opt-einsum, packaging, protobuf, requests, setuptools, six, tensorboard, tensorflow-io-gcs-filesystem, termcolor, typing-extensions, wrapt
Required-by: dopamine-rl
---
Name: transformers
Version: 4.38.2
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: [email protected]
License: Apache 2.0 License
Location: /usr/local/lib/python3.10/dist-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by:

Now the original code fails with:

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:88: UserWarning: 
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
  warnings.warn(
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
   1389         try:
-> 1390             return importlib.import_module("." + module_name, self.__name__)
   1391         except Exception as e:

28 frames
AttributeError: module 'tensorflow._api.v2.compat.v2.__internal__' has no attribute 'register_load_context_function'

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
   1390             return importlib.import_module("." + module_name, self.__name__)
   1391         except Exception as e:
-> 1392             raise RuntimeError(
   1393                 f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
   1394                 f" traceback):\n{e}"

RuntimeError: Failed to import transformers.models.distilbert.modeling_tf_distilbert because of the following error (look up to see its traceback):
module 'tensorflow._api.v2.compat.v2.__internal__' has no attribute 'register_load_context_function'

It seems to be running into a new compatibility issue now.

@SuryanarayanaY
Contributor

Hi @YikunHan42,
I observed the same. As I suspected, you may need to use the same TF version that this Transformers model was built with. The error trace comes from the Transformers model itself, so you may need to report the issue there. Downgrading TF might work, but I am not sure which TF version it works with.

I believe this has to be fixed in the Transformers repo.

@arnoldvialfont

In this discussion, setting os.environ['TF_USE_LEGACY_KERAS'] = '1' solved the problem associated with Keras's recent update to v3.

@MigeoDaSelva

What @arnoldvialfont mentioned worked for me as a palliative...

@LeonHecht

> In this discussion, setting os.environ['TF_USE_LEGACY_KERAS'] = '1' solved the problem associated with Keras's recent update to v3.

It didn't work for me in Colab; I still get the same error.
This is what I do:
!pip install tf-keras
import os
os.environ['TF_USE_LEGACY_KERAS'] = '1'

And I am using TFDistilBertForSequenceClassification.

@SuryanarayanaY
Contributor

> TF_USE_LEGACY_KERAS

That means the Transformers model being used is built on Keras 2. Making this model work with Keras 3 has to be handled by the model's developers. Thanks!

@SuryanarayanaY added the "stat:awaiting response from contributor" and "type:support" labels and removed the "To investigate" label Apr 5, 2024
@Harshith-H-K

> > TF_USE_LEGACY_KERAS
>
> That means the Transformers model being used is built on Keras 2. Making this model work with Keras 3 has to be handled by the model's developers. Thanks!

Is there any way I could use Keras 2, for example by rolling back?

@SuryanarayanaY
Contributor

> > > TF_USE_LEGACY_KERAS
> >
> > That means the Transformers model being used is built on Keras 2. Making this model work with Keras 3 has to be handled by the model's developers. Thanks!
>
> Is there any way I could use Keras 2, for example by rolling back?

To use Keras 2, you need to install the tf_keras package and set the environment variable TF_USE_LEGACY_KERAS to '1'.

import os
os.environ["TF_USE_LEGACY_KERAS"] = "1"
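
A hedged sketch of the two steps above as a single helper. The function name `enable_legacy_keras` is hypothetical. The sketch assumes the variable has to be set before TensorFlow is imported in the process, which is the safe ordering, and it only checks that `tf_keras` can be found rather than importing it.

```python
import importlib.util
import os


def enable_legacy_keras():
    """Request Keras 2 via tf_keras; return True if tf_keras is importable."""
    # Assumption: this runs before the first `import tensorflow` in the process.
    os.environ["TF_USE_LEGACY_KERAS"] = "1"
    return importlib.util.find_spec("tf_keras") is not None


if enable_legacy_keras():
    print("tf_keras found; import tensorflow and transformers after this point.")
else:
    print("tf_keras is missing; install it first with: pip install tf-keras")
```

In a notebook, this would go in the first cell after a runtime restart, since a TensorFlow import from an earlier cell may already have picked a Keras version.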


This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label Apr 25, 2024

github-actions bot commented May 9, 2024

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.

@github-actions github-actions bot closed this as completed May 9, 2024

@liqi6811

liqi6811 commented Aug 4, 2024

!pip install transformers==4.36.0
!pip install keras==2.15.0
import os
os.environ['TF_USE_LEGACY_KERAS'] = '1'

optimizer = keras.optimizers.Adam(learning_rate=LEARNING_RATE)
model.compile(optimizer=optimizer)

The above code works in Google Colab.
