Fix return typehint for decoder and annotate inv_freq by qubvel · Pull Request #39610 · huggingface/transformers · GitHub

Conversation

@qubvel (Contributor) commented Jul 23, 2025

What does this PR do?

Fix return type hint for decoder layer and annotate inv_freq
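
A minimal, hedged sketch of the decoder-layer part of the title, assuming the layer's forward returns only the hidden-states tensor; the class name follows modeling_llama.py, but the signature and previous annotation in the actual diff are not reproduced here:

import torch
from torch import nn

class LlamaDecoderLayer(nn.Module):
    # Illustrative only: the return hint is aligned with what the layer
    # actually returns in this sketch, a single hidden-states tensor.
    def forward(self, hidden_states: torch.Tensor, **kwargs) -> torch.Tensor:
        # ... attention and MLP blocks elided ...
        return hidden_states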

@qubvel qubvel requested a review from Cyrilvallez July 23, 2025 15:32
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qubvel qubvel marked this pull request as ready for review July 23, 2025 16:22
@qubvel qubvel added the typing label Jul 24, 2025
Comment on lines 98 to 99
inv_freq, self.attention_scaling = self.rope_init_fn(self.config, device)
self.inv_freq: torch.Tensor
@Cyrilvallez (Member):

What is the purpose of this type hint? It looks like a weird pattern to annotate it alone like this without assigning a value, no?

@qubvel (Contributor Author):

@Cyrilvallez, later, when we use this attribute in the forward method, it is an unknown member of the class because we do not assign it in the usual way with self.inv_freq = ..., but instead register it with torch via self.register_buffer(...). Annotating it fixes this linting issue.

This change is not strictly necessary, it's just a nit, so I can remove it if you think it's redundant.
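
To make the linting issue concrete, a minimal sketch (not the actual modeling_llama.py implementation; the placeholder values and simplified signatures are invented for illustration) of why the attribute is invisible to static analysis without the annotation:

import torch
from torch import nn

class LlamaRotaryEmbedding(nn.Module):
    def __init__(self, device=None):
        super().__init__()
        inv_freq = torch.arange(1, 5, dtype=torch.float32)  # placeholder values
        # register_buffer creates the attribute dynamically at runtime, so a
        # static analyzer does not know that `self.inv_freq` exists ...
        self.register_buffer("inv_freq", inv_freq, persistent=False)
        # ... unless it is annotated explicitly, as in the lines under review:
        self.inv_freq: torch.Tensor

    def forward(self, position_ids: torch.Tensor) -> torch.Tensor:
        # Without the annotation, this access is reported as an unknown member.
        return self.inv_freq[None, :, None] * position_ids[:, None, :].float()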

@qubvel (Contributor Author), Jul 31, 2025:

Before: [screenshot: the linter flags self.inv_freq as an unknown attribute]

After: [screenshot: the warning is resolved once the attribute is annotated]

@qubvel (Contributor Author):

Alternatively, the annotation can be moved to the class level:

class LlamaRotaryEmbedding(nn.Module):
    inv_freq: torch.Tensor

    def __init__(self, config: LlamaConfig, device=None):
        ...

@qubvel (Contributor Author):

@Cyrilvallez, friendly ping 😄

@Cyrilvallez (Member):

Oops, sorry! Yes, I would rather have it at the class level, or inline such as

self.register_buffer("inv_freq", inv_freq, persistent=False); self.inv_freq: torch.Tensor

because it feels weird as 2 successive lines IMO

But a class-level annotation, with a comment explaining that the linter is not able to follow register_buffer, is probably better, wdyt?

@qubvel (Contributor Author):

Thanks, sounds good to me. Moved it to the class level.
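
For reference, a rough sketch of what the settled-on class-level form looks like; the comment wording here is paraphrased, not the exact merged line:

import torch
from torch import nn

class LlamaRotaryEmbedding(nn.Module):
    # Annotated at class level because the attribute is only created at runtime
    # via register_buffer, which the linter cannot follow.
    inv_freq: torch.Tensor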

@github-actions bot commented Aug 6, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: arcee, aria, bamba, bitnet, chameleon, cohere, cohere2, csm, dbrx, deepseek_v2, deepseek_v3, dia, diffllama, doge, dots1, efficientloftr

@qubvel qubvel requested a review from Cyrilvallez August 6, 2025 13:15
@Cyrilvallez (Member) left a comment

Yes, makes a lot of sense! Thanks for cleaning up the types, it's not the most glamorous but super useful!!

@qubvel (Contributor Author) commented Aug 7, 2025

Thanks for the review!

@qubvel qubvel merged commit b347e93 into huggingface:main Aug 7, 2025
24 checks passed