[EfficientLoFTR] LRU Cached embedding computation at inference by sbucaille · Pull Request #40329 · huggingface/transformers

Conversation

@sbucaille
Contributor

What does this PR do?

Reverted the EfficientLoFTR RoPE computation back to inference time to allow usage with different image sizes
Decorated it with an LRU cache, similarly to the DINOv3 implementation
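
The cached-at-inference pattern looks roughly like the sketch below. This is a minimal illustration, not the merged code: the class name, frequency layout, and cache size are assumptions; only the compute_embeddings name, the i_indices/j_indices construction, and the lru_cache decoration come from the PR discussion.

```python
from functools import lru_cache

import torch
from torch import nn


class RotaryEmbedding2D(nn.Module):
    """Illustrative 2D RoPE: embeddings are built at inference time for
    whatever (embed_height, embed_width) arrives, so any image size works,
    and lru_cache memoizes the result so repeated sizes cost nothing."""

    def __init__(self, hidden_size: int, base: float = 10000.0):
        super().__init__()
        dim = hidden_size // 4  # half the channels per spatial axis, then sin+cos doubles them
        inv_freq = 1.0 / (base ** (torch.arange(dim, dtype=torch.float32) / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    @lru_cache(maxsize=32)  # keyed on (self, embed_height, embed_width)
    def compute_embeddings(self, embed_height: int, embed_width: int) -> torch.Tensor:
        device, dtype = self.inv_freq.device, self.inv_freq.dtype
        # Row and column position indices, shape (H, W, 1)
        i_indices = torch.ones(embed_height, embed_width, device=device, dtype=dtype).cumsum(0).unsqueeze(-1)
        j_indices = torch.ones(embed_height, embed_width, device=device, dtype=dtype).cumsum(1).unsqueeze(-1)
        # Per-axis frequencies concatenated, then expanded to sin/cos pairs
        emb = torch.cat([i_indices * self.inv_freq, j_indices * self.inv_freq], dim=-1)  # (H, W, hidden/2)
        return torch.cat([emb.sin(), emb.cos()], dim=-1)  # (H, W, hidden)
```

Note that @lru_cache on a bound method keeps a reference to self alive; with a single long-lived module instance, as here, that trade-off is acceptable.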

Who can review?

@qubvel

Contributor

@qubvel qubvel left a comment


Thanks, just a question, otherwise looks good 👍

Comment on lines 74 to 75
i_indices = torch.ones(embed_height, embed_width).cumsum(0).float().unsqueeze(-1)
j_indices = torch.ones(embed_height, embed_width).cumsum(1).float().unsqueeze(-1)
Contributor


device / dtype is also needed, no?

Contributor Author


compute_embeddings is only used inside the torch.autocast context manager, so I assume we don't need it?

i_indices = torch.ones(embed_height, embed_width).cumsum(0).float().unsqueeze(-1)
j_indices = torch.ones(embed_height, embed_width).cumsum(1).float().unsqueeze(-1)

emb = torch.zeros(1, embed_height, embed_width, hidden_size // 2)
Contributor


same here for zeros

@sbucaille
Contributor Author

@qubvel gentle bump! I answered your comment

@qubvel
Contributor

qubvel commented Aug 26, 2025

Thanks for the ping! Missed it.

I still think we should pass inv_freq.device and inv_freq.dtype while creating tensors to make this function safe to use even outside of the autocast block
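
A sketch of the requested change, reusing the variable names from the snippets above (embed_height, embed_width, hidden_size, and inv_freq are assumed to be in scope):

```python
device, dtype = inv_freq.device, inv_freq.dtype

# Creating the tensors on inv_freq's device/dtype keeps compute_embeddings
# correct even when it is called outside a torch.autocast block.
i_indices = torch.ones(embed_height, embed_width, device=device, dtype=dtype).cumsum(0).unsqueeze(-1)
j_indices = torch.ones(embed_height, embed_width, device=device, dtype=dtype).cumsum(1).unsqueeze(-1)
emb = torch.zeros(1, embed_height, embed_width, hidden_size // 2, device=device, dtype=dtype)
```

With an explicit dtype, the original .float() calls become redundant and can be dropped.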

@sbucaille
Contributor Author

Addressed in 3fa6b94

@qubvel
Contributor

qubvel commented Aug 26, 2025

run-slow: efficientloftr

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/efficientloftr']
quantizations: [] ...

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sbucaille
Contributor Author

You can rerun the slow tests, it should be fixed. I forgot the features were aggregated before the attention, so the embeddings need to be the correct size.
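
A hedged sketch of the size relationship being described (the stride name mirrors the q_aggregation_stride config parameter from the diff further down; the exact aggregation arithmetic is illustrative):

```python
# Features are aggregated (strided) before attention, so the rotary
# embeddings must be computed at the aggregated resolution, not at the
# backbone's output resolution.
embed_height = feature_height // q_aggregation_stride
embed_width = feature_width // q_aggregation_stride
rotary_embeddings = compute_embeddings(embed_height, embed_width)
```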

@qubvel
Contributor

qubvel commented Aug 27, 2025

run-slow: efficientloftr

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/efficientloftr']
quantizations: [] ...

@qubvel qubvel self-requested a review August 27, 2025 13:41
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: efficientloftr

  if self.hidden_size != self.out_features[-1]:
      raise ValueError(
-         f"hidden_size should be equal to the last value in out_features. hidden_size = {self.hidden_size}, out_features = {self.stage_out_channels}"
+         f"hidden_size should be equal to the last value in out_features. hidden_size = {self.hidden_size}, out_features = {self.out_features[-1]}"
Contributor


fix attribute name

Comment on lines +53 to +61
  out_features: list[int] = [32, 32, 128],
  stage_stride: list[int] = [2, 1, 2],
  q_aggregation_kernel_size: int = 1,
  kv_aggregation_kernel_size: int = 1,
  q_aggregation_stride: int = 1,
  kv_aggregation_stride: int = 1,
  num_attention_layers: int = 2,
  num_attention_heads: int = 8,
- hidden_size: int = 64,
+ hidden_size: int = 128,
Contributor


that's for FA2 tests to pass

Contributor Author


How did I not catch this one before? 🤔 But thanks for taking care of it!

Contributor


That's skipped locally in case you don't have FA2 installed 😄

@qubvel
Contributor

qubvel commented Aug 27, 2025

run-slow: efficientloftr

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/efficientloftr']
quantizations: [] ...

@qubvel qubvel merged commit 52aaa3f into huggingface:main Aug 27, 2025
20 checks passed
@sbucaille sbucaille deleted the eloftr-dynamic-rope branch September 2, 2025 03:14