[Glm4.5V] fix vLLM support by zucchini-nlp · Pull Request #40696 · huggingface/transformers · GitHub

Conversation

@zucchini-nlp (Member) commented on Sep 4, 2025

What does this PR do?

GLM-4.5V inference does not work in the latest release and this PR fixes it. Specifically, users can now pass a VideoMetadata object to the processor's call, which is needed for backwards compatibility (a test for this is being added).

It also includes a tiny fix in image processing to align the helper utility with self._preprocess.
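For illustration, a call exercising the restored behavior might look roughly like the sketch below. This is a hypothetical example, not taken from the PR's tests: the checkpoint id, the videos / video_metadata keyword names, the placeholder token, and the VideoMetadata field names are assumptions.

# Hypothetical usage sketch, not from the PR's tests: the checkpoint id,
# keyword names, placeholder token, and metadata fields are assumptions.
import numpy as np
from transformers import AutoProcessor
from transformers.video_utils import VideoMetadata

processor = AutoProcessor.from_pretrained("zai-org/GLM-4.5V")

video = np.zeros((8, 336, 336, 3), dtype=np.uint8)  # 8 dummy RGB frames standing in for a video

# After this fix both formats should be accepted by the processor call:
metadata_obj = VideoMetadata(total_num_frames=8, fps=2.0, duration=4.0, video_backend="opencv")
metadata_dict = {"total_num_frames": 8, "fps": 2.0, "duration": 4.0, "video_backend": "opencv"}

inputs = processor(
    text="<video> Describe the video.",  # placeholder token is illustrative only
    videos=[video],
    video_metadata=[metadata_obj],  # the dict form (metadata_dict) works the same way
    return_tensors="pt",
)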

github-actions bot (Contributor) commented on Sep 4, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm4v

@zucchini-nlp added the "for patch" label on Sep 4, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qubvel (Contributor) left a comment

Thanks for fixing! I'm trusting you, I'm a bit out of context 😄


 @dataclass
-class VideoMetadata:
+class VideoMetadata(Mapping):
Contributor (@qubvel):

What is the motivation for making it iterable?

Member Author (@zucchini-nlp):

So that we can use VideoMetadata in the same way as a dict. We currently allow users to pass both formats.
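For illustration only, a dataclass can implement the Mapping protocol roughly as in the standalone sketch below; the field names are made up and this is not the actual transformers implementation.

# Standalone sketch of a Mapping-backed dataclass; field names are illustrative
# and this is not the actual transformers implementation.
from collections.abc import Mapping
from dataclasses import dataclass, fields

@dataclass
class VideoMetadata(Mapping):
    total_num_frames: int = 0
    fps: float = 24.0

    def __getitem__(self, key):
        return getattr(self, key)

    def __iter__(self):
        return iter(field.name for field in fields(self))

    def __len__(self):
        return len(fields(self))

meta = VideoMetadata(total_num_frames=8, fps=2.0)
# The object can now be consumed exactly like a dict:
assert dict(meta) == {"total_num_frames": 8, "fps": 2.0}
assert meta["fps"] == 2.0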

 patch_size = images_kwargs.get("patch_size", self.patch_size)
 merge_size = images_kwargs.get("merge_size", self.merge_size)
-size = images_kwargs.get("size", self.size)
+size = images_kwargs.get("size", {"shortest_edge": 112 * 112, "longest_edge": 28 * 28 * 15000})
Contributor (@qubvel):

should we align self.size instead?

Member Author (@zucchini-nlp) commented on Sep 4, 2025:

The defaults already use these values. It is the model checkpoint that has another set of values saved, and for some reason we never use them. I didn't want to break BC, but using the config values is actually correct.

Contributor (@qubvel) commented on Sep 4, 2025:

Hmm, can we update the config values instead? Or would a PR on the Hub not get merged?

Member Author (@zucchini-nlp):

We can try, let me ask one of the authors who is on Slack.

Member Author (@zucchini-nlp):

I asked the authors and the config values are the correct ones. So I will update the processor in the next PR to use the config defaults and mark it as slightly breaking.
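As a rough, self-contained sketch of what "use config defaults" could look like (an assumption about the follow-up, not the actual diff): the helper would simply fall back to the size loaded from the checkpoint's preprocessor config instead of a hardcoded literal. The class and method names below are illustrative.

# Self-contained sketch of the planned fallback-to-config behavior; the class
# and method names are illustrative, not the actual transformers code.
class DummyImageProcessor:
    def __init__(self, size=None):
        # In transformers, size would come from the checkpoint's
        # preprocessor_config.json rather than a hardcoded literal.
        self.size = size or {"shortest_edge": 112 * 112, "longest_edge": 28 * 28 * 15000}

    def resolve_size(self, **images_kwargs):
        # Prefer an explicit user override, otherwise fall back to the config value.
        return images_kwargs.get("size", self.size)

proc = DummyImageProcessor(size={"shortest_edge": 128 * 128, "longest_edge": 28 * 28 * 20000})
assert proc.resolve_size()["shortest_edge"] == 128 * 128                      # config value wins
assert proc.resolve_size(size={"shortest_edge": 64})["shortest_edge"] == 64   # user override wins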

@Cyrilvallez (Member) left a comment

Alright, thanks a lot!

@Cyrilvallez merged commit 586dc5d into huggingface:main on Sep 4, 2025
24 checks passed
Cyrilvallez pushed a commit that referenced this pull request Sep 4, 2025


Labels

for patch: Tag issues / labels that should be included in the next patch


5 participants