[superglue] Fixed the way batch mask was applied to the scores before match assignment computation by sbucaille · Pull Request #39968 · huggingface/transformers · GitHub

Conversation

@sbucaille
Contributor

@sbucaille sbucaille commented Aug 6, 2025

What does this PR do?

Fixes the way the mask is applied to the scores in SuperGlue.
I realized that, in some cases not covered by the tests, I end up with the following error:

self = SuperGlueImageProcessor {
  "do_grayscale": true,
  "do_rescale": true,
  "do_resize": true,
  "image_processor_type":...ssor",
  "resample": 2,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "height": 480,
    "width": 640
  }
}

outputs = ModelOutput([('matches', tensor([[[ -1,  -1,  -1,  ...,  -1,  -1,  -1],
         [ -1, 125, 137,  ...,  -1,  -1,  -1]]...0, 0.0000]]]])), ('mask', tensor([[[1, 1, 1,  ..., 1, 1, 1],
         [1, 1, 1,  ..., 0, 0, 0]]], dtype=torch.int32))])
target_sizes = [[(768, 1025), (1026, 768)]], threshold = 0.0001

    def post_process_keypoint_matching(
        self,
        outputs: "KeypointMatchingOutput",
        target_sizes: Union[TensorType, list[tuple]],
        threshold: float = 0.0,
    ) -> list[dict[str, torch.Tensor]]:
        """
        Converts the raw output of [`KeypointMatchingOutput`] into lists of keypoints, scores and descriptors
        with coordinates absolute to the original image sizes.
        Args:
            outputs ([`KeypointMatchingOutput`]):
                Raw outputs of the model.
            target_sizes (`torch.Tensor` or `list[tuple[tuple[int, int]]]`, *optional*):
                Tensor of shape `(batch_size, 2, 2)` or list of tuples of tuples (`tuple[int, int]`) containing the
                target size `(height, width)` of each image in the batch. This must be the original image size (before
                any processing).
            threshold (`float`, *optional*, defaults to 0.0):
                Threshold to filter out the matches with low scores.
        Returns:
            `list[Dict]`: A list of dictionaries, each dictionary containing the keypoints in the first and second image
            of the pair, the matching scores and the matching indices.
        """
        if outputs.mask.shape[0] != len(target_sizes):
            raise ValueError("Make sure that you pass in as many target sizes as the batch dimension of the mask")
        if not all(len(target_size) == 2 for target_size in target_sizes):
            raise ValueError("Each element of target_sizes must contain the size (h, w) of each image of the batch")
    
        if isinstance(target_sizes, list):
            image_pair_sizes = torch.tensor(target_sizes, device=outputs.mask.device)
        else:
            if target_sizes.shape[1] != 2 or target_sizes.shape[2] != 2:
                raise ValueError(
                    "Each element of target_sizes must contain the size (h, w) of each image of the batch"
                )
            image_pair_sizes = target_sizes
    
        keypoints = outputs.keypoints.clone()
        keypoints = keypoints * image_pair_sizes.flip(-1).reshape(-1, 2, 1, 2)
        keypoints = keypoints.to(torch.int32)
        results = []
        for mask_pair, keypoints_pair, matches, scores in zip(
            outputs.mask, keypoints, outputs.matches[:, 0], outputs.matching_scores[:, 0]
        ):
            mask0 = mask_pair[0] > 0
            mask1 = mask_pair[1] > 0
            keypoints0 = keypoints_pair[0][mask0]
            keypoints1 = keypoints_pair[1][mask1]
            matches0 = matches[mask0]
            scores0 = scores[mask0]
    
            # Filter out matches with low scores
            valid_matches = torch.logical_and(scores0 > threshold, matches0 > -1)
            matched_keypoints0 = keypoints0[valid_matches]
>           matched_keypoints1 = keypoints1[matches0[valid_matches]]
E           IndexError: index 561 is out of bounds for dimension 0 with size 561

src/transformers/models/superglue/image_processing_superglue.py:406: IndexError

This means that a keypoint in image 0 was assigned a match to a nonexistent keypoint in image 1. Index 561 should not appear in the matches, since image 1 has only 561 valid keypoints (indices 0 to 560). The way the scores are filled using the mask here is invalid:

if mask is not None:
    mask = mask.reshape(batch_size, 2, num_keypoints)
    mask0 = mask[:, 0].unsqueeze(-1).expand(-1, -1, num_keypoints)
    scores = scores.masked_fill(mask0 == 0, -1e9)

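Take a keypoints tensor padded to a max size of 5, with 2 valid keypoints in the first image and 4 in the second. Since only the first image's mask is expanded, only the rows get masked, and the column of the padded keypoint in the second image stays open. The resulting mask is the following:

1, 1, 1, 1, 1
1, 1, 1, 1, 1
0, 0, 0, 0, 0
0, 0, 0, 0, 0
0, 0, 0, 0, 0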

where it should have been :

1, 1, 1, 1, 0
1, 1, 1, 1, 0
0, 0, 0, 0, 0
0, 0, 0, 0, 0
0, 0, 0, 0, 0
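
A minimal sketch of the failure on this toy pair, applying the masking snippet exactly as quoted. The score values are made up, and the per-row `argmax` is only a stand-in for the model's actual match assignment; both are assumptions for illustration.

```python
import torch

batch_size, num_keypoints = 1, 5
# One pair padded to 5 keypoints: 2 valid in the first image, 4 in the second.
mask = torch.tensor([[[1, 1, 0, 0, 0], [1, 1, 1, 1, 0]]])

# Made-up scores where the last column (the padded slot of the second image)
# is the highest in every row.
scores = torch.arange(25.0).reshape(batch_size, num_keypoints, num_keypoints)

# Row-only masking, as in the quoted snippet.
mask0 = mask[:, 0].unsqueeze(-1).expand(-1, -1, num_keypoints)
row_masked = scores.masked_fill(mask0 == 0, -1e9)

# Stand-in for match assignment: best column for each valid row.
matches0 = row_masked.argmax(dim=-1)[0][mask[0, 0] > 0]

# Post-processing keeps only the valid keypoints of the second image,
# so the padded index 4 no longer exists there.
keypoints1 = torch.arange(10).reshape(5, 2)[mask[0, 1] > 0]  # shape (4, 2)
try:
    keypoints1[matches0]
    raised = False
except IndexError:
    raised = True

print(matches0.tolist(), raised)  # [4, 4] True
```

Both valid keypoints of the first image pick the padded slot, which is the same out-of-bounds situation as index 561 in the traceback.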

The following fixes the issue:

if mask is not None:
    mask = mask.reshape(batch_size, 2, num_keypoints)
    mask0 = mask[:, 0].unsqueeze(2)
    mask1 = mask[:, 1].unsqueeze(1)
    mask = torch.logical_and(mask0, mask1)
    scores = scores.masked_fill(mask == 0, torch.finfo(scores.dtype).min)
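
To check the fix on the 5-keypoint example, here is a small standalone sketch. The score values and the per-row `argmax` are assumptions for illustration and are not the model's actual matching step; `pair_mask` is a name chosen here to avoid shadowing `mask`.

```python
import torch

batch_size, num_keypoints = 1, 5
# Flattened mask for one pair: 2 valid keypoints in the first image, 4 in the second.
mask = torch.tensor([[1, 1, 0, 0, 0, 1, 1, 1, 1, 0]])
# Made-up scores where the padded last column is the highest in every row.
scores = torch.arange(25.0).reshape(batch_size, num_keypoints, num_keypoints)

mask = mask.reshape(batch_size, 2, num_keypoints)
mask0 = mask[:, 0].unsqueeze(2)  # (batch, N, 1): rows are keypoints of the first image
mask1 = mask[:, 1].unsqueeze(1)  # (batch, 1, N): columns are keypoints of the second image
pair_mask = torch.logical_and(mask0, mask1)  # broadcasts to (batch, N, N)
masked_scores = scores.masked_fill(pair_mask == 0, torch.finfo(scores.dtype).min)

print(pair_mask[0].int().tolist())
# [[1, 1, 1, 1, 0], [1, 1, 1, 1, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

# The best column for each valid row now stays inside the 4 valid keypoints.
best = masked_scores[0, :2].argmax(dim=-1)
print(best.tolist())  # [3, 3]
```

Using `torch.finfo(scores.dtype).min` keeps the fill value representable for whatever dtype the scores use, whereas `-1e9` lies outside the float16 range.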

I've added tests to make sure there are no matches that fall beyond the scope of the mask.
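
As a rough idea of what such a check can look like (this is not the test added in the PR), here is a sketch with a hypothetical helper `matches_within_mask`. It assumes `matches` holds, for each keypoint of the first image, an index into the second image or -1 for no match.

```python
import torch


def matches_within_mask(matches: torch.Tensor, mask: torch.Tensor) -> bool:
    """Hypothetical helper: True if every match of a valid first-image keypoint
    points at a valid (unpadded) keypoint of the second image.

    matches: (batch, num_keypoints), index into the second image or -1.
    mask:    (batch, 2, num_keypoints), 1 for valid keypoints, 0 for padding.
    """
    for pair_matches, pair_mask in zip(matches, mask):
        matched = pair_matches[pair_mask[0] > 0]  # drop padded first-image keypoints
        matched = matched[matched > -1]           # drop unmatched keypoints
        if not bool((pair_mask[1][matched] > 0).all()):
            return False
    return True


mask = torch.tensor([[[1, 1, 0, 0, 0], [1, 1, 1, 1, 0]]])
ok = matches_within_mask(torch.tensor([[3, -1, -1, -1, -1]]), mask)
bad = matches_within_mask(torch.tensor([[4, 1, -1, -1, -1]]), mask)
print(ok, bad)  # True False
```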

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@qubvel

@github-actions
Contributor

github-actions bot commented Aug 6, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: superglue

@sbucaille sbucaille force-pushed the fix-superglue-mask branch from f1787b6 to fbbde68 Compare August 6, 2025 20:09
@sbucaille sbucaille mentioned this pull request Aug 6, 2025
5 tasks
Contributor

@qubvel qubvel left a comment

Thanks for the fix

@qubvel
Contributor

qubvel commented Aug 7, 2025

run-slow: superglue

@github-actions
Contributor

github-actions bot commented Aug 7, 2025

This comment contains run-slow, running the specified jobs:

models: ['models/superglue']
quantizations: [] ...

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qubvel qubvel merged commit cdeaad9 into huggingface:main Aug 7, 2025
20 checks passed
@qubvel qubvel added the Vision label Aug 7, 2025
@sbucaille sbucaille deleted the fix-superglue-mask branch August 9, 2025 17:37

3 participants