Support llama3.2vl(WIP). #5555
base: main
Conversation
Force-pushed from a3a23ac to b0e6f98
```python
images = [Image.open(image) if isinstance(image, str) else image for image in images]
image_features = processor.image_processor(images)
_ = image_features.pop("num_tiles")
image_features = {k: v if isinstance(v, torch.Tensor) else torch.tensor(v) for k, v in image_features.items()}
```
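The last line of the diff can be exercised on its own: any non-tensor feature (e.g. nested Python lists returned by the image processor) is wrapped in `torch.tensor`, while values that are already tensors pass through untouched. A minimal standalone sketch, with made-up feature names for illustration:

```python
import torch

# Toy feature dict: one plain nested list, one value that is already a tensor.
image_features = {
    "pixel_values": [[[0.5, 0.5], [0.5, 0.5]]],  # plain nested lists
    "aspect_ratio_ids": torch.tensor([[1]]),      # already a tensor, left as-is
}

# Same conversion as in the diff above.
image_features = {
    k: v if isinstance(v, torch.Tensor) else torch.tensor(v)
    for k, v in image_features.items()
}
```

After the comprehension, every value in the dict is a `torch.Tensor`, so the features can be collated and moved to device uniformly.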
That is because we can't get the text at `get_mm_inputs`.
How do you think we should fix this? Add a new stage, or add a text input to `get_mm_inputs`?
yep, we should do some work here
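One possible shape for the "add a text input" option, sketched under assumptions: the `batch_text` parameter and the stub image processor below are hypothetical, not the repo's actual API — the point is only that the text becomes visible at the same stage where image features are built.

```python
from typing import Any, Dict, List


class StubImageProcessor:
    """Stand-in for processor.image_processor; returns list-valued features."""

    def __call__(self, images: List[Any]) -> Dict[str, Any]:
        return {
            "pixel_values": [[0.0, 0.0] for _ in images],
            "num_tiles": [1 for _ in images],
        }


def get_mm_inputs(
    images: List[Any],
    batch_text: List[str],
    image_processor: StubImageProcessor,
) -> Dict[str, Any]:
    features = image_processor(images)
    features.pop("num_tiles")  # dropped, as in the diff above
    features["text"] = batch_text  # text is now available at this stage
    return features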
What does this PR do?
Support Llama-3.2-11B-Vision.
Before submitting
Linked issues
#5549
bitsandbytes 8-bit quantization is not functional; 4-bit works, but 8-bit does not.
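Since only the 4-bit path works here, a config fragment for the working setup, assuming the standard `transformers` + `bitsandbytes` loading route (the model ID and dtype choice are illustrative, not mandated by this PR):

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization config; 8-bit (load_in_8bit=True) is the path
# reported as broken for this model.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
# Pass as quantization_config=bnb_config to from_pretrained(...)
```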