Fix broken image inference for Fuyu model by Isotr0py · Pull Request #39915 · huggingface/transformers · GitHub

@Isotr0py Isotr0py commented Aug 5, 2025

What does this PR do?

While updating the Transformers version in vLLM CI, we found that Fuyu's image inference is broken and produces gibberish:
https://buildkite.com/vllm/ci/builds/25473#01985c4d-d267-408b-87f5-5d77ae09d90c

  • This PR fixes the broken Fuyu model so that image features from image_patches are handled correctly.
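
The description does not show the diff, so the following is only a rough illustration of what "handling image_patches correctly" involves. Fuyu feeds an image to the decoder by writing projected patch embeddings over placeholder positions in the token embedding sequence, guided by an index tensor (image_patches_indices in the Transformers API). The sketch below assumes the convention that a non-negative index selects a patch and a negative index keeps the text embedding. The function name scatter_patch_embeddings and the toy data are hypothetical, plain lists stand in for tensors, and this is not the code changed in this PR.

```python
from typing import List

Vector = List[float]


def scatter_patch_embeddings(
    text_embeddings: List[Vector],
    patch_embeddings: List[Vector],
    patch_indices: List[int],
) -> List[Vector]:
    """Return a copy of text_embeddings with patch embeddings written in.

    patch_indices has one entry per sequence position: a value >= 0 selects
    a row of patch_embeddings, a negative value keeps the text embedding.
    (Assumed convention for this sketch.)
    """
    if len(patch_indices) != len(text_embeddings):
        raise ValueError("patch_indices must have one entry per position")

    # Copy so the caller's embeddings are not modified in place.
    output = [list(vec) for vec in text_embeddings]
    for position, patch_idx in enumerate(patch_indices):
        if patch_idx < 0:
            continue  # ordinary text token, leave untouched
        if patch_idx >= len(patch_embeddings):
            raise ValueError(f"patch index {patch_idx} out of range")
        output[position] = list(patch_embeddings[patch_idx])
    return output


# Toy sequence: two image placeholder positions followed by two text tokens.
text = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
patches = [[9.0, 9.0], [8.0, 8.0]]
indices = [0, 1, -1, -1]
merged = scatter_patch_embeddings(text, patches, indices)
print(merged)  # [[9.0, 9.0], [8.0, 8.0], [1.0, 1.0], [2.0, 2.0]]
```

If the patch embeddings never reach these positions, or are written to the wrong ones, the decoder only sees placeholder embeddings where the image should be, which is consistent with the gibberish output reported above.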

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
@Isotr0py Isotr0py requested a review from zucchini-nlp August 5, 2025 12:24
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp zucchini-nlp left a comment

Thanks! I just noticed that Fuyu has no multimodal tests, so we are more likely to break it. It would be nice if you could add a multimodal test class.

It can go in this file, similar to the llava tests, where dummy image inputs are also prepared in self.prepare_config_and_inputs.
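
As a rough, framework-free sketch of the kind of fast test being requested: a tester object builds small dummy text and image inputs in prepare_config_and_inputs, and a test case checks them without downloading a checkpoint. Everything here is hypothetical (the class names, the default sizes, the dict-based config, and the use of nested lists instead of torch tensors); the actual Transformers test classes are structured differently and would run the model on these inputs.

```python
import random
import unittest


class DummyMultimodalInputsTester:
    """Hypothetical stand-in for a model tester; not the Transformers class."""

    def __init__(self, batch_size=2, seq_length=7, num_patches=3,
                 patch_dim=12, vocab_size=99, seed=0):
        self.batch_size = batch_size
        self.seq_length = seq_length
        self.num_patches = num_patches
        self.patch_dim = patch_dim
        self.vocab_size = vocab_size
        self.seed = seed

    def prepare_config_and_inputs(self):
        rng = random.Random(self.seed)  # seeded so the test is deterministic
        input_ids = [
            [rng.randrange(self.vocab_size) for _ in range(self.seq_length)]
            for _ in range(self.batch_size)
        ]
        image_patches = [
            [[rng.random() for _ in range(self.patch_dim)]
             for _ in range(self.num_patches)]
            for _ in range(self.batch_size)
        ]
        # Assumed layout: the first num_patches positions hold image patches,
        # the remaining positions are text and are marked with -1.
        image_patches_indices = [
            list(range(self.num_patches))
            + [-1] * (self.seq_length - self.num_patches)
            for _ in range(self.batch_size)
        ]
        config = {"vocab_size": self.vocab_size, "patch_dim": self.patch_dim}
        inputs = {
            "input_ids": input_ids,
            "image_patches": image_patches,
            "image_patches_indices": image_patches_indices,
        }
        return config, inputs


class DummyInputsShapeTest(unittest.TestCase):
    def setUp(self):
        self.tester = DummyMultimodalInputsTester()

    def test_patch_indices_match_patches(self):
        _, inputs = self.tester.prepare_config_and_inputs()
        for patches, indices in zip(inputs["image_patches"],
                                    inputs["image_patches_indices"]):
            used = sorted(i for i in indices if i >= 0)
            # Every patch must be referenced exactly once by the indices.
            self.assertEqual(used, list(range(len(patches))))
```

Because the inputs are tiny and synthetic, a test like this can run on every PR, unlike a generation test against the full checkpoint.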

Isotr0py and others added 4 commits August 6, 2025 22:15
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Comment on lines +247 to +249
inputs = processor(images=image, text=text_prompt_coco_captioning, return_tensors="pt").to(
    torch_device, torch.float16
)

@Isotr0py Isotr0py Aug 6, 2025

It seems there is already a generation test for Fuyu, but I found that it was running on CPU and is also failing on the main branch:

>       self.assertEqual(generated_text, "A blue bus parked on the side of a road.")
E       AssertionError: 'image shows what \n\n\n' != 'A blue bus parked on the side of a road.'
E       + A blue bus parked on the side of a road.
E       - image shows what 
E       - 
E       - 
E       -

tests/models/fuyu/test_modeling_fuyu.py:254: AssertionError

So I just moved this test to run on GPU.

Member

Yep, we have slow integration tests. I meant a fast test with dummy inputs, which always runs on each PR.

Isotr0py commented Aug 6, 2025

run-slow: fuyu

@zucchini-nlp zucchini-nlp added the "for patch" label (tag issues / labels that should be included in the next patch) Aug 7, 2025
Isotr0py and others added 2 commits August 8, 2025 00:53
Signed-off-by: Isotr0py <2037008807@qq.com>
@Isotr0py Isotr0py enabled auto-merge (squash) August 8, 2025 05:38
@Isotr0py Isotr0py disabled auto-merge August 8, 2025 07:06
Signed-off-by: Isotr0py <2037008807@qq.com>
@Isotr0py Isotr0py enabled auto-merge (squash) August 8, 2025 07:09
github-actions bot commented Aug 8, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: fuyu

@Isotr0py Isotr0py merged commit b374c3d into huggingface:main Aug 8, 2025
18 checks passed
@Isotr0py Isotr0py deleted the fix-fuyu branch August 8, 2025 07:34
ArthurZucker pushed a commit that referenced this pull request Aug 13, 2025
* fix fuyu

Signed-off-by: Isotr0py <2037008807@qq.com>

* oops

Signed-off-by: Isotr0py <2037008807@qq.com>

* run test on GPU

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

* clean unused

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

* revert

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

* add fuyu multimodal test

Signed-off-by: Isotr0py <2037008807@qq.com>

* fix

Signed-off-by: Isotr0py <2037008807@qq.com>

---------

Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>