KeyError: 'Cache only has 0 layers, attempted to access layer with index 0' #25
Using transformers==4.33.0 solved it.
Hi @Kyriection, as I mentioned, one workaround to avoid this error is to downgrade transformers to something < 4.36. But I am trying to test H2O on some of the latest LLMs, and for that I need a newer version of transformers (>= 4.37.2). Do you have any suggestions? The error is raised here, in the forward function of H2OLlamaAttention_streaming --
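For context, the failure mode is the pre-4.36 tuple-style indexing hitting the new Cache object. A minimal sketch of that pattern (hypothetical code, not the exact H2O forward; requires transformers >= 4.36):

```python
from transformers.cache_utils import DynamicCache

# On transformers >= 4.36 the attention forward receives a Cache object,
# not the old tuple of (key, value) tensors.
past_key_value = DynamicCache()  # freshly created: no layers populated yet

# Legacy tuple-style access, as pre-4.36 attention code does:
try:
    kv_seq_len = past_key_value[0].shape[-2]
except KeyError as e:
    print(e)  # 'Cache only has 0 layers, attempted to access layer with index 0'
```

Even once the cache is populated, `past_key_value[0]` returns a `(key, value)` tuple for layer 0 rather than the key tensor itself, so the old indexing logic has to be rewritten against the Cache API either way.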
Hi @hasanibnarif, Hugging Face updated their cache implementation in version 4.36. Previously, past_key_value was a list of tensors containing the key and value embeddings, while now a Cache instance is used to maintain the KV cache. The KV cache is defined in https://github.com/huggingface/transformers/blob/main/src/transformers/cache_utils.py#L76. I have an initial version of the new H2O KV cache implementation based on the Cache class (https://github.com/Kyriection/llama-recipes/blob/main/research/long-context-llama/H2O/utils/cache.py#L342). Please note that this version is still under development; I will release it once it is finished.
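To illustrate the interface change described above, here is a minimal sketch of the standard transformers >= 4.36 Cache API (tensor shapes are made up for the example):

```python
import torch
from transformers.cache_utils import DynamicCache

cache = DynamicCache()

# (batch, num_heads, seq_len, head_dim) -- illustrative shapes
key_states = torch.randn(1, 8, 4, 64)
value_states = torch.randn(1, 8, 4, 64)

# update() appends the new key/value states for the given layer and returns
# the full cached key/value tensors for that layer.
keys, values = cache.update(key_states, value_states, layer_idx=0)

print(cache.get_seq_length(0))  # 4 tokens cached for layer 0
print(keys.shape)               # torch.Size([1, 8, 4, 64])
```

An eviction policy like H2O's can subclass Cache and drop low-score entries inside update(), which is presumably the approach taken in the linked cache.py prototype.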
How do I reproduce?