
ValueError: Input image size (490*490) doesn't match model (336*336). #461

Open
ZTWHHH opened this issue Dec 25, 2024 · 4 comments

ZTWHHH commented Dec 25, 2024

When I ran the example inference code for the xcomposer2-vl-7b model provided on its Hugging Face page:

import torch
from transformers import AutoModel, AutoTokenizer

torch.set_grad_enabled(False)

# init model and tokenizer
model = AutoModel.from_pretrained('internlm/internlm-xcomposer2-vl-7b', trust_remote_code=True).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2-vl-7b', trust_remote_code=True)

# query with an image placeholder, plus the path to the local image to describe
query = '<ImageHere>Please describe this image in detail.'
image = 'Our image path'

# run inference under mixed precision
with torch.cuda.amp.autocast():
    response, _ = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
print(response)

I got an error:
ValueError: Input image size (490*490) doesn't match model (336*336)
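Since the failure turns out to depend on the installed transformers release (see the last comment in this thread), a minimal sketch for recording the relevant versions when reproducing (not part of the original report):

# Minimal environment dump for reproducing this report; the size check that
# raises the ValueError depends on the installed transformers release.
import torch
import transformers

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())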


dle666 commented Dec 30, 2024

I had the same problem. Did you solve it?


ZTWHHH commented Jan 11, 2025

> I had the same problem. Did you solve it?

I haven't solved it, but XComposer-2.5 works.


Moshindeiru commented Jan 11, 2025 via email

@stickydream commented

This is caused by the transformers package version being too new. Downgrading it to 4.40.0 lets the code run normally.
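For reference, a minimal sketch of that workaround, assuming pip manages the environment and taking 4.40.0 as the last known-good release reported here:

# Guard the inference script against a transformers release newer than 4.40.0,
# the version reported to work in this thread. If the check fails, downgrade
# from the shell, e.g.: pip install "transformers==4.40.0"
from packaging import version
import transformers

if version.parse(transformers.__version__) > version.parse("4.40.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} may reject the 490*490 input; "
        "try downgrading to 4.40.0"
    )

Pinning transformers==4.40.0 in the project's requirements achieves the same thing without a runtime check.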
