[EfficientLoFTR] LRU Cached embedding computation at inference #40329
Conversation
Thanks, just a question, otherwise looks good 👍
    i_indices = torch.ones(embed_height, embed_width).cumsum(0).float().unsqueeze(-1)
    j_indices = torch.ones(embed_height, embed_width).cumsum(1).float().unsqueeze(-1)
device / dtype is also needed, no?
compute_embeddings is only used inside the torch.autocast context manager, so I assume we don't need it?
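For context, a minimal sketch of what passing `device` / `dtype` explicitly would look like (the helper name is hypothetical, not the actual PR code):

```python
import torch

def make_rope_indices(embed_height, embed_width, device=None, dtype=None):
    # Hypothetical helper: builds row/column index grids for RoPE-style
    # embeddings. Passing device/dtype here avoids materializing the
    # tensors on CPU in float32 and relying on autocast to cast them later.
    ones = torch.ones(embed_height, embed_width, device=device, dtype=dtype)
    i_indices = ones.cumsum(0).unsqueeze(-1)  # 1-based row index per position
    j_indices = ones.cumsum(1).unsqueeze(-1)  # 1-based column index per position
    return i_indices, j_indices
```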
    emb = torch.zeros(1, embed_height, embed_width, hidden_size // 2)
same here for zeros
@qubvel gentle bump! I answered your comment
Thanks for the ping! Missed it. I still think we should pass `device` / `dtype`.
Addressed in 3fa6b94
run-slow: efficientloftr

This comment contains run-slow, running the specified jobs: models: ['models/efficientloftr']
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
You can rerun the slow tests, it should be fixed. I forgot that the features are aggregated before the attention, so the embeddings need to be of the correct size.
run-slow: efficientloftr

This comment contains run-slow, running the specified jobs: models: ['models/efficientloftr']

[For maintainers] Suggested jobs to run (before merge): run-slow: efficientloftr
      if self.hidden_size != self.out_features[-1]:
          raise ValueError(
-             f"hidden_size should be equal to the last value in out_features. hidden_size = {self.hidden_size}, out_features = {self.stage_out_channels}"
+             f"hidden_size should be equal to the last value in out_features. hidden_size = {self.hidden_size}, out_features = {self.out_features[-1]}"
fix attribute name
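A self-contained sketch of the corrected check (the class name and field set are reduced for illustration, not the actual config class):

```python
class EfficientLoFTRConfigSketch:
    # Reduced config: only the two fields involved in the validation.
    def __init__(self, hidden_size=128, out_features=(32, 32, 128)):
        self.hidden_size = hidden_size
        self.out_features = list(out_features)
        if self.hidden_size != self.out_features[-1]:
            # The fix: interpolate out_features[-1] into the message,
            # not the unrelated stage_out_channels attribute.
            raise ValueError(
                f"hidden_size should be equal to the last value in out_features. "
                f"hidden_size = {self.hidden_size}, out_features = {self.out_features[-1]}"
            )
```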
      out_features: list[int] = [32, 32, 128],
      stage_stride: list[int] = [2, 1, 2],
      q_aggregation_kernel_size: int = 1,
      kv_aggregation_kernel_size: int = 1,
      q_aggregation_stride: int = 1,
      kv_aggregation_stride: int = 1,
      num_attention_layers: int = 2,
      num_attention_heads: int = 8,
-     hidden_size: int = 64,
+     hidden_size: int = 128,
that's for FA2 tests to pass
How didn't I catch this one before? 🤔 But thanks for taking care of it!
That's skipped locally in case you don't have FA2 installed 😄
run-slow: efficientloftr

This comment contains run-slow, running the specified jobs: models: ['models/efficientloftr']
What does this PR do?
Reverted EfficientLoFTR RoPE computation back to inference time to allow usage with different image sizes.
Decorated the computation with an LRU cache, similar to the DinoV3 implementation.
Who can review?
@qubvel