GLM-4.5V Model Support by zRzRzRzRzRzRzR · Pull Request #39805 · huggingface/transformers · GitHub

Conversation

@zRzRzRzRzRzRzR (Contributor) commented Jul 31, 2025

This PR includes two changes:

  1. Modifications to the default parameters for GLM-4.1V
  2. Adding the GLM-4.5V model (a hedged usage sketch follows below)
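
For reviewers who want to try the new model quickly, here is a hedged usage sketch via the generic Auto classes; the checkpoint id "zai-org/GLM-4.5V", the example image URL, and the generation settings are assumptions, not taken from this PR:

```python
# Hedged sketch: loading GLM-4.5V through the generic Auto classes.
# The checkpoint id below is an assumption; substitute the released repo id.
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "zai-org/GLM-4.5V"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/cat.png"},  # hypothetical image
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(out[0], skip_special_tokens=True))
```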

@zucchini-nlp (Member) left a comment
Perfect! I think there are a few things we can still remove and let the modular copy. Overall very clean

@Cyrilvallez (Member) left a comment

Hey! Looks nice overall 🤗 But we absolutely want to rename ALL classes with Glm4vMoe prefixes instead of Glm4v_moe as we enforce CamelCasing! Made a few other comments as well 🤗

Comment on lines 46 to 48

class Glm4v_moeVisionConfig(Glm4vVisionConfig):
    pass
Member

We want the model name to be Glm4vMoe everywhere, not Glm4v_moe!!! We only use CamelCasing!
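
For illustration, a minimal sketch of the requested rename; the import path assumes the usual transformers layout for the existing GLM-4.1V config, and only the CamelCase class name is the point here:

```python
# Hedged sketch of the rename requested above: the CamelCase Glm4vMoe prefix
# replaces Glm4v_moe, while the vision config is still inherited unchanged.
from transformers.models.glm4v.configuration_glm4v import Glm4vVisionConfig


class Glm4vMoeVisionConfig(Glm4vVisionConfig):
    pass
```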

Contributor (Author)

Fix

Comment on lines +303 to +306

def apply_multimodal_rotary_pos_emb(q, k, cos, sin, mrope_section, unsqueeze_dim=1):
"""Applies Rotary Position Embedding with Multimodal Sections to the query and key tensors (https://qwenlm.github.io/blog/qwen2-vl/).
Member

This should be imported, not redefined here (I see it was already mentioned as a review comment)

Contributor (Author)

The internal implementation of apply_multimodal_rotary_pos_emb has changed here, so it can't simply be imported.

Contributor (Author)

[image] The line in the red box has been removed in the new implementation.
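
For context, here is a minimal self-contained sketch of how a multimodal rotary helper with this signature typically splits cos/sin across mrope_section chunks. It is modeled on the Qwen2-VL variant the docstring links to; the exact GLM-4.5V body differs, as noted above:

```python
import torch


def rotate_half(x):
    # Standard RoPE helper: swap and negate the two halves of the last dim.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)


def apply_multimodal_rotary_pos_emb(q, k, cos, sin, mrope_section, unsqueeze_dim=1):
    # cos/sin carry separate temporal/height/width sections (leading dim of 3);
    # interleave them along the head dimension, then apply the usual rotary formula.
    mrope_section = mrope_section * 2
    cos = torch.cat(
        [m[i % 3] for i, m in enumerate(cos.split(mrope_section, dim=-1))], dim=-1
    ).unsqueeze(unsqueeze_dim)
    sin = torch.cat(
        [m[i % 3] for i, m in enumerate(sin.split(mrope_section, dim=-1))], dim=-1
    ).unsqueeze(unsqueeze_dim)
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed
```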

Comment on lines 376 to 377

if self.use_qk_norm: # main diff from Llama
Member

If it's always True, let's simply use it instead of having an if ...

Contributor (Author)

For the latest model, this setting is False, and it is not yet certain whether any model with it set to True will be released, just like our GLM-4.5.

Collaborator

> For the latest model, this setting is False

In that case we can remove the code path! If a new model is released, we'll add it for you!
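
For reference, a generic sketch (not the transformers implementation) of the pattern a use_qk_norm flag usually gates; the flag name and the RMSNorm choice follow the snippet above, everything else is illustrative (nn.RMSNorm requires a recent PyTorch):

```python
import torch
from torch import nn


class QKNormStub(nn.Module):
    """Illustrative only: a `use_qk_norm` flag gating per-head q/k normalization."""

    def __init__(self, head_dim: int, use_qk_norm: bool = False, eps: float = 1e-6):
        super().__init__()
        self.use_qk_norm = use_qk_norm
        if use_qk_norm:
            # Normalize queries/keys per head before rotary embedding / attention.
            self.q_norm = nn.RMSNorm(head_dim, eps=eps)
            self.k_norm = nn.RMSNorm(head_dim, eps=eps)

    def maybe_norm(self, q: torch.Tensor, k: torch.Tensor):
        if self.use_qk_norm:  # main diff from Llama, per the quoted snippet
            q, k = self.q_norm(q), self.k_norm(k)
        return q, k
```

If every shipped checkpoint sets the flag to False, the branch and both norm modules can simply be dropped, which is the point of the review comment.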

@ArthurZucker (Collaborator) left a comment

Thanks for your hard work!

Collaborator

nice! 🧼

Collaborator

Very nice and minimal! 🚀

@ArthurZucker (Collaborator)

run-slow: auto, glm4_moe, glm4v, glm4v_moe

github-actions bot commented Aug 8, 2025

This comment contains run-slow, running the specified jobs:

models: ['models/auto', 'models/glm4_moe', 'models/glm4v', 'models/glm4v_moe']
quantizations: [] ...

@zRzRzRzRzRzRzR changed the title from "GLM-4.1V default parameter modifications and updates." to "GLM-4.5V Model Support" on Aug 8, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

github-actions bot commented Aug 8, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, glm4_moe, glm4v, glm4v_moe

@ArthurZucker (Collaborator)

@bot /style

@huggingface huggingface deleted a comment from github-actions bot Aug 8, 2025
@ArthurZucker ArthurZucker merged commit 7b20915 into huggingface:main Aug 8, 2025
19 of 22 checks passed
@zRzRzRzRzRzRzR zRzRzRzRzRzRzR deleted the glm-45-vl branch August 21, 2025 08:44
shimizust added a commit to linkedin/Liger-Kernel that referenced this pull request Sep 5, 2025
## Summary

This PR adds support for GLM-4.1V (GLM-4 Vision) models to the Liger Kernel (#855).
https://huggingface.co/zai-org/GLM-4.5
This model has been merged in huggingface/transformers#39805.

## Testing Done

Found that `python3 -m pytest test/convergence/bf16/test_mini_models.py -k 'glm4v_moe' -rF`
fails with `AssertionError: [Loss]Number of mismatched elements: 14`:
<details>
  <summary>Test result</summary>

  ```
AssertionError: [Loss]Number of mismatched elements: 14
Mismatch at index (0, 5): tensor1[(0, 5)] = 8.733983993530273, tensor2[(0, 5)] = 8.52511215209961
Mismatch at index (0, 8): tensor1[(0, 8)] = 7.2776618003845215, tensor2[(0, 8)] = 7.524500846862793
Mismatch at index (0, 9): tensor1[(0, 9)] = 6.917590618133545, tensor2[(0, 9)] = 7.175967216491699
Mismatch at index (0, 13): tensor1[(0, 13)] = 5.685216426849365, tensor2[(0, 13)] = 5.427236557006836
Mismatch at index (0, 14): tensor1[(0, 14)] = 5.337466239929199, tensor2[(0, 14)] = 5.049449443817139
  ... and 9 more mismatched elements.
```
</details>
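
For context, here is a hedged sketch of the kind of element-wise tolerance check that produces a report like the one above; the function name, tolerances, and output format are illustrative and not Liger-Kernel's actual test utility:

```python
import torch


def assert_losses_close(t1, t2, rtol=1e-5, atol=1e-5, max_print=5):
    # Compare two loss curves element-wise and report which steps diverge,
    # mirroring the "[Loss]Number of mismatched elements" style shown above.
    mismatch = ~torch.isclose(t1, t2, rtol=rtol, atol=atol)
    idx = mismatch.nonzero(as_tuple=False)
    if idx.numel() == 0:
        return
    lines = [
        f"Mismatch at index {tuple(i.tolist())}: "
        f"tensor1 = {t1[tuple(i.tolist())].item()}, tensor2 = {t2[tuple(i.tolist())].item()}"
        for i in idx[:max_print]
    ]
    if idx.shape[0] > max_print:
        lines.append(f"  ... and {idx.shape[0] - max_print} more mismatched elements.")
    raise AssertionError(
        f"[Loss]Number of mismatched elements: {idx.shape[0]}\n" + "\n".join(lines)
    )
```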

- Hardware Type: <BLANK>
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence

---------

Co-authored-by: Shao Tang <tangshao28@gmail.com>
Co-authored-by: Steven Shimizu <shimizust@gmail.com>