
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 9 but got size 3 for tensor number 1 in the list. #26

Open
captainzero93 opened this issue Sep 22, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@captainzero93

Finish loading model caption_coco_opt6.7b!
generating...
Traceback (most recent call last):
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\extensions\sd-webui-blip2\scripts\main.py", line 120, in prepare
    caption = gen_caption(raw, process_type, caption_type, length_penalty, repetition_penalty, temperature)
  File "B:\ASSD16\stable-diffusion-webui\extensions\sd-webui-blip2\scripts\main.py", line 143, in gen_caption
    caption = model.generate(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\lavis\models\blip2_models\blip2_opt.py", line 220, in generate
    outputs = self.opt_model.generate(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\transformers\generation\utils.py", line 1611, in generate
    return self.beam_search(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\transformers\generation\utils.py", line 2909, in beam_search
    outputs = self(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\lavis\models\blip2_models\modeling_opt.py", line 1037, in forward
    outputs = self.model.decoder(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\lavis\models\blip2_models\modeling_opt.py", line 703, in forward
    inputs_embeds = torch.cat([query_embeds, inputs_embeds], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 25 but got size 5 for tensor number 1 in the list.
generating...
Traceback (most recent call last):
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\extensions\sd-webui-blip2\scripts\main.py", line 120, in prepare
    caption = gen_caption(raw, process_type, caption_type, length_penalty, repetition_penalty, temperature)
  File "B:\ASSD16\stable-diffusion-webui\extensions\sd-webui-blip2\scripts\main.py", line 149, in gen_caption
    caption = model.generate(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\lavis\models\blip2_models\blip2_opt.py", line 220, in generate
    outputs = self.opt_model.generate(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\transformers\generation\utils.py", line 1572, in generate
    return self.sample(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\transformers\generation\utils.py", line 2619, in sample
    outputs = self(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\lavis\models\blip2_models\modeling_opt.py", line 1037, in forward
    outputs = self.model.decoder(
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "B:\ASSD16\stable-diffusion-webui\venv\lib\site-packages\lavis\models\blip2_models\modeling_opt.py", line 703, in forward
    inputs_embeds = torch.cat([query_embeds, inputs_embeds], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 9 but got size 3 for tensor number 1 in the list.
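Note: both tracebacks fail at the same `torch.cat([query_embeds, inputs_embeds], dim=1)` call in `modeling_opt.py`. `torch.cat` requires every dimension except the concatenation dimension to match, so the error means the Q-Former's query embeddings and the prompt's token embeddings arrive with different batch sizes (dimension 0). A minimal pure-Python sketch of that shape rule, using a hypothetical `cat_shape` helper and made-up sizes (not the model's real hidden dimensions):

```python
def cat_shape(shapes, dim):
    """Mimic torch.cat's shape rule: all dimensions except `dim` must
    match across inputs; sizes along `dim` are summed."""
    out = list(shapes[0])
    for i, shape in enumerate(shapes[1:], start=1):
        for d in range(len(out)):
            if d == dim:
                continue
            if shape[d] != shapes[0][d]:
                raise RuntimeError(
                    f"Sizes of tensors must match except in dimension {dim}. "
                    f"Expected size {shapes[0][d]} but got size {shape[d]} "
                    f"for tensor number {i} in the list."
                )
        out[dim] += shape[dim]
    return tuple(out)

# Matching batch sizes concatenate fine along dim 1:
cat_shape([(3, 32, 8), (3, 5, 8)], dim=1)  # → (3, 37, 8)

# Mismatched batch sizes (9 vs 3 in dim 0) reproduce the error text
# seen in the second traceback:
# cat_shape([(9, 32, 8), (3, 5, 8)], dim=1)  # raises RuntimeError
```

The 9-vs-3 and 25-vs-5 pairs in the logs are both 5x off by a batch-expansion factor, which is consistent with one of the two tensors having been expanded (e.g. for multiple candidates per image) while the other was not.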

@antis0007

Having this issue as well.

@flankechen

same issue

@jiayev

jiayev commented Dec 17, 2023

same issue

@Tps-F Tps-F added the bug Something isn't working label Dec 17, 2023
@Jdbye

Jdbye commented Apr 11, 2024

Same issue as well. No matter what image or model I use, it gives the error "RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 25 but got size 5 for tensor number 1 in the list."

@NoobToNirvana

Hi, this bug still persists as of this writing. Any ideas on how to fix it? Thanks

@Jdbye

Jdbye commented May 9, 2024

I ended up installing LAVIS in Python and using BLIP2 through that, which worked just fine.
That is what this extension does under the hood, but something must be wrong with how it sets up the Python environment or with the installed packages, since I didn't hit this error when installing and using LAVIS directly. I don't see anything wrong with the extension's code; it looks the same as the working Python script I used with LAVIS.
I later switched to Transformers instead, as I found it easier to set up and use, and since it's universal it works with many more captioning models, including Microsoft GIT (GenerativeImage2Text). BLIP2 still seems to produce the best results, though.
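For anyone following this route: a minimal sketch of BLIP-2 captioning via Hugging Face Transformers (not this extension's code; the checkpoint id and generation settings below are assumptions, and the checkpoint itself is several GB, downloaded on first use):

```python
# Sketch: image captioning with BLIP-2 through Hugging Face Transformers.
# CHECKPOINT is an assumed model id; any BLIP-2 checkpoint should work.
CHECKPOINT = "Salesforce/blip2-opt-2.7b"

def caption_image(image_path: str, max_new_tokens: int = 40) -> str:
    """Caption a single image with BLIP-2. Imports are lazy so that
    merely defining this function does not pull in heavy dependencies."""
    from PIL import Image
    from transformers import Blip2ForConditionalGeneration, Blip2Processor

    processor = Blip2Processor.from_pretrained(CHECKPOINT)
    model = Blip2ForConditionalGeneration.from_pretrained(CHECKPOINT)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True).strip()
```

Because the processor and model come from the same checkpoint, the query embeddings and token embeddings are guaranteed to be shaped consistently, which sidesteps the `torch.cat` mismatch above.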

@NoobToNirvana

I managed to use blip2 another way. Many thanks for pointing me in the right direction. :)

7 participants