Fix broken image inference for Fuyu model #39915

Isotr0py · 2025-08-05T12:24:54Z

What does this PR do?

While updating the Transformers version in vLLM CI, we found that Fuyu's image inference is broken and produces gibberish:
https://buildkite.com/vllm/ci/builds/25473#01985c4d-d267-408b-87f5-5d77ae09d90c

This PR fixes the broken Fuyu model so that image features (image_patches) are handled correctly.
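
The description above does not show the diff, so the following is only a rough illustration of what handling image patches correctly involves in general: patch embeddings have to land at exactly the sequence positions reserved for the image. It is a minimal sketch, not the code from this PR. The function name `scatter_patch_embeddings`, the `patch_indices` argument, and the convention that `-1` marks a text position are all assumptions made for this example, and plain Python lists stand in for tensors so that it runs without dependencies.

```python
from typing import List, Sequence

Vector = List[float]


def scatter_patch_embeddings(
    token_embeddings: Sequence[Vector],
    patch_embeddings: Sequence[Vector],
    patch_indices: Sequence[int],
) -> List[Vector]:
    """Return a copy of token_embeddings with the image slots filled in.

    patch_indices[i] == -1 means position i holds text and is left alone
    (an assumed convention for this sketch). Any other value is the row of
    patch_embeddings that belongs at position i.
    """
    if len(token_embeddings) != len(patch_indices):
        raise ValueError("need exactly one patch index per sequence position")

    # Copy so the caller's embeddings are not modified in place.
    merged = [list(vec) for vec in token_embeddings]
    for position, patch_row in enumerate(patch_indices):
        if patch_row < 0:
            continue  # text position, keep the token embedding
        if patch_row >= len(patch_embeddings):
            raise ValueError(
                f"position {position} asks for patch {patch_row}, "
                f"but only {len(patch_embeddings)} patches were given"
            )
        merged[position] = list(patch_embeddings[patch_row])
    return merged


# Two image slots followed by one text token.
text = [[0.0, 0.0], [0.0, 0.0], [9.0, 9.0]]
patches = [[1.0, 1.0], [2.0, 2.0]]
merged = scatter_patch_embeddings(text, patches, [0, 1, -1])
```

If the indices and the patches disagree, the sketch raises instead of silently producing a wrong sequence, which is the kind of mismatch that otherwise tends to surface only as gibberish output.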

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Signed-off-by: Isotr0py <2037008807@qq.com>

HuggingFaceDocBuilderDev · 2025-08-05T12:38:04Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp

Thanks! I just noticed that Fuyu has no multimodal tests, so we are more likely to break it. It would be nice if you could add a multimodal test class.

It would go in this file, similar to the llava tests, where dummy image inputs are also prepared in self.prepare_config_and_inputs.
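
As a rough idea of what such a fast test with dummy inputs could look like, here is a minimal, dependency-free sketch. It is not the test class added in this PR: `DummyMultimodalInputs`, `IMAGE_PLACEHOLDER_ID`, and all sizes are hypothetical, and nested lists stand in for tensors. The point it illustrates is that the dummy inputs must keep the number of image placeholder tokens equal to the number of patches.

```python
import random
import unittest
from typing import Dict

IMAGE_PLACEHOLDER_ID = 0  # hypothetical token id reserved for image slots


class DummyMultimodalInputs:
    """Builds tiny random inputs, loosely in the spirit of prepare_config_and_inputs."""

    def __init__(self, batch_size=2, num_patches=4, text_len=3,
                 patch_dim=6, vocab_size=50, seed=0):
        self.batch_size = batch_size
        self.num_patches = num_patches
        self.text_len = text_len
        self.patch_dim = patch_dim
        self.vocab_size = vocab_size
        self.rng = random.Random(seed)  # seeded so the test is deterministic

    def prepare_inputs(self) -> Dict[str, list]:
        input_ids, image_patches = [], []
        for _ in range(self.batch_size):
            # Text ids start at 1 so they never collide with the placeholder id.
            text_ids = [self.rng.randrange(1, self.vocab_size)
                        for _ in range(self.text_len)]
            input_ids.append([IMAGE_PLACEHOLDER_ID] * self.num_patches + text_ids)
            image_patches.append(
                [[self.rng.random() for _ in range(self.patch_dim)]
                 for _ in range(self.num_patches)]
            )
        return {"input_ids": input_ids, "image_patches": image_patches}


class DummyMultimodalInputsTest(unittest.TestCase):
    def test_placeholders_match_patches(self):
        inputs = DummyMultimodalInputs().prepare_inputs()
        for ids, patches in zip(inputs["input_ids"], inputs["image_patches"]):
            self.assertEqual(ids.count(IMAGE_PLACEHOLDER_ID), len(patches))
```

Because the inputs are tiny and randomly generated, a test like this needs no checkpoint download and can run on every PR.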

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

Isotr0py · 2025-08-06T14:20:09Z

tests/models/fuyu/test_modeling_fuyu.py

+        inputs = processor(images=image, text=text_prompt_coco_captioning, return_tensors="pt").to(
+            torch_device, torch.float16
+        )


It seems there is already a generation test for Fuyu, but I found that it was running on CPU and is also failing on the main branch:

>       self.assertEqual(generated_text, "A blue bus parked on the side of a road.")
E       AssertionError: 'image shows what \n\n\n' != 'A blue bus parked on the side of a road.'
E       + A blue bus parked on the side of a road.
E       - image shows what
E       -
E       -
E       -

tests/models/fuyu/test_modeling_fuyu.py:254: AssertionError

So I just moved this test to run on GPU.

zucchini-nlp · 2025-08-06

Yep, we have slow integration tests. I meant a fast test with dummy inputs, which always runs on every PR.
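
To make the fast/slow distinction concrete, here is a small standalone sketch of how a slow test can be gated behind an environment variable while fast tests always run. Transformers ships its own `slow` decorator in `transformers.testing_utils`, controlled by `RUN_SLOW`; the version below is a simplified stand-in written for this example, not the library's implementation, and the test names are hypothetical.

```python
import os
import unittest


def slow(test_item):
    """Skip the decorated test unless RUN_SLOW is set to a truthy value.

    The environment variable is read once, when the decorator is applied.
    """
    enabled = os.environ.get("RUN_SLOW", "").strip().lower() in {
        "1", "true", "yes", "y", "on"
    }
    return unittest.skipUnless(enabled, "slow test; set RUN_SLOW=1 to run")(test_item)


class ExampleTests(unittest.TestCase):
    def test_fast_dummy_inputs(self):
        # Cheap check on tiny inputs: runs on every PR.
        self.assertEqual(len([0] * 4 + [7, 8, 9]), 7)

    @slow
    def test_slow_generation(self):
        # Placeholder for an expensive integration test (real checkpoint, GPU).
        self.assertTrue(True)
```

With this split, a regression in the fast test is caught on every PR, while the slow test only runs when explicitly requested, for example through the `run-slow: fuyu` comment used in this thread.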

Isotr0py · 2025-08-06T14:20:36Z

run-slow: fuyu

Signed-off-by: Isotr0py <2037008807@qq.com>

github-actions · 2025-08-08T07:10:07Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: fuyu

* fix fuyu
  Signed-off-by: Isotr0py <2037008807@qq.com>
* oops
  Signed-off-by: Isotr0py <2037008807@qq.com>
* run test on GPU
  Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
* clean unused
  Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
* revert
  Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
* add fuyu multimodal test
  Signed-off-by: Isotr0py <2037008807@qq.com>
* fix
  Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

Isotr0py added 2 commits August 5, 2025 20:20

fix fuyu

cd6d392

Signed-off-by: Isotr0py <2037008807@qq.com>

oops

6b4f347

Signed-off-by: Isotr0py <2037008807@qq.com>

Isotr0py requested a review from zucchini-nlp August 5, 2025 12:24

zucchini-nlp approved these changes Aug 5, 2025

Isotr0py and others added 4 commits August 6, 2025 22:15

run test on GPU

cc7414b

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

Merge branch 'main' into fix-fuyu

7ee251b

clean unused

f3e98c1

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

revert

2d5036b

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

Isotr0py commented Aug 6, 2025

zucchini-nlp added the "for patch" label (Tag issues / labels that should be included in the next patch) Aug 7, 2025

Isotr0py and others added 2 commits August 8, 2025 00:53

add fuyu multimodal test

116706d

Signed-off-by: Isotr0py <2037008807@qq.com>

Merge branch 'main' into fix-fuyu

05a3a5c

Isotr0py enabled auto-merge (squash) August 8, 2025 05:38

Isotr0py disabled auto-merge August 8, 2025 07:06

fix

76f794a

Signed-off-by: Isotr0py <2037008807@qq.com>

Isotr0py enabled auto-merge (squash) August 8, 2025 07:09

Isotr0py merged commit b374c3d into huggingface:main Aug 8, 2025
18 checks passed

Isotr0py deleted the fix-fuyu branch August 8, 2025 07:34

ZJY0516 mentioned this pull request Aug 28, 2025

[CI] enable idefics3 and fuyu-8b test in multimodal test vllm-project/vllm#23790

Merged
