[sparse] Fix semi-structured sparse shape mismatch bug by jcaip · Pull Request #110420 · pytorch/pytorch · GitHub

Conversation

@jcaip
Contributor

@jcaip jcaip commented Oct 2, 2023

Stack from ghstack (oldest at bottom):

Summary:

Fixes: #110664

Currently, PyTorch incorrectly calculates the size of the returned
matrix when we pass a non-contiguous batched (>2D) input to the
semi-structured sparse subclass.

This is most common in MLP layers, where we have two linear layers back to back.
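
As a rough, hypothetical repro sketch (shapes chosen to mirror the error message below; the `random_24_sparse` helper and the exact dimensions are made up for illustration, and a CUDA GPU with 2:4 sparse kernel support is required):

```python
import torch
import torch.nn.functional as F
from torch.sparse import to_sparse_semi_structured

def random_24_sparse(out_features, in_features):
    # Zero 2 of every 4 elements along the last dim so the weight satisfies
    # the 2:4 semi-structured sparsity pattern.
    weight = torch.randn(out_features, in_features, dtype=torch.float16, device="cuda")
    mask = torch.tensor([1, 1, 0, 0], device="cuda").tile(out_features, in_features // 4).bool()
    return weight * mask

w1 = to_sparse_semi_structured(random_24_sparse(3072, 768))  # 768 -> 3072
w2 = to_sparse_semi_structured(random_24_sparse(768, 3072))  # 3072 -> 768

x = torch.randn(20, 64, 64, 768, dtype=torch.float16, device="cuda")  # batched (>2D) input
h = F.linear(x, w1)  # first sparse linear
y = F.linear(h, w2)  # second sparse linear: this is where the shape mismatch surfaced
```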

This will lead to an error like the following:

```
RuntimeError: shape '[20, 64, 64, 3072]' is invalid for input of size 62914560
```

Here the size of the sparse matmul result is off because we infer the
output shape from a stale (un-updated) tensor shape.
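
(For reference, 62914560 = 20 × 64 × 64 × 768, i.e. the element count of a (20, 64, 64, 768) input, while the requested view (20, 64, 64, 3072) would need 3072 / 768 = 4 times as many elements, which is why the result looks off by a factor of 4.)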

This happens because of a bug where we did not update the subclass
tensor shape when transposing: for semi-structured sparsity, a transpose
is a no-op that just flips a boolean flag, but we forgot to also update
the tensor shape.
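
As a toy illustration of this failure mode (a made-up stand-in class, not PyTorch's actual SparseSemiStructuredTensor implementation): when transpose only flips a flag, the reported shape has to be swapped too, otherwise anything that infers output sizes from `.shape` sees stale values.

```python
import torch

class ToySparseTensor:
    """Toy stand-in for a subclass where .t() is 'free': it only flips a flag."""

    def __init__(self, packed: torch.Tensor, transposed: bool = False):
        self._packed = packed
        self._transposed = transposed
        # The fix amounts to this line: report the *swapped* shape when the
        # transposed flag is set, instead of always reporting packed.shape.
        self.shape = torch.Size(reversed(packed.shape)) if transposed else packed.shape

    def t(self):
        return ToySparseTensor(self._packed, transposed=not self._transposed)

w = ToySparseTensor(torch.randn(3072, 768))
print(w.shape, w.t().shape)  # torch.Size([3072, 768]) torch.Size([768, 3072])
```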

Note that this error goes away in inference mode, since we avoid
decomposing the aten.linear op and handle shape folding ourselves,
which changes the execution path.

An alternative fix is to set TORCH_FLATTEN_LINEAR_3D=True, which also
avoids this error.
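
(If going that route, a minimal sketch, assuming the flag is read from the environment under the spelling used here, is to set it before the process hits F.linear:)

```python
import os

# Assumption: the flag name/value below are exactly as written in this summary
# and PyTorch picks them up from the environment; set before the first linear call.
os.environ["TORCH_FLATTEN_LINEAR_3D"] = "True"
```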

Test Plan:

```
python test/test_sparse_semi_structured.py -k test_mlp
```

Reviewers:

Subscribers:

Tasks:

Tags:

@pytorch-bot

pytorch-bot bot commented Oct 2, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110420

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit e8ee6f0 with merge base a3e5ec4:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

UNSTABLE - The following jobs failed but were likely due to flakiness present on trunk and have been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jcaip added a commit that referenced this pull request Oct 2, 2023
Summary:

Currently, PyTorch incorrectly calculates the size of the returned
matrix when we pass a non-contiguous batched (>2d) input to the
semi-structured sparse subclass.

This is most common in MLP layers, where we have 2 linear layers back to back.

This will lead to an error like the following:
```
RuntimeError: shape '[20, 64, 64, 3072]' is invalid for input of size
62914560

```
Where the size of the sparse matmul result is off by a factor of 4. I'm
not sure exactly where this bug comes from, but I traced it to [this](https://github.com/pytorch/pytorch/blob/01b2f25ebda85d307b27847ad67efe2b5bb54265/aten/src/ATen/native/LinearAlgebra.cpp#L1959) function.

Note that this error goes away in inference mode, since we avoid
decomposing the aten.linear op.

This fix overloads __torch_function__, specifically for the F.linear op.
The goal is to implement our own fold-to-2D / unfold code so that
we can avoid running into this issue.
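
Roughly, the fold/unfold idea looks like this (a hypothetical sketch with made-up names, not the actual __torch_function__ override in this PR; `weight` stands in for the sparse subclass):

```python
import torch
import torch.nn.functional as F

def linear_with_2d_folding(x: torch.Tensor, weight, bias=None):
    # Fold all leading batch dims into one, run a plain 2D linear, then unfold.
    batch_dims = x.shape[:-1]
    x2d = x.reshape(-1, x.shape[-1]).contiguous()       # (*, in) -> (N, in)
    out2d = F.linear(x2d, weight, bias)                  # 2D matmul path
    return out2d.reshape(*batch_dims, out2d.shape[-1])   # (N, out) -> (*, out)
```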

An alternative way to fix this issue is to set
TORCH_FLATTEN_LINEAR_3D=True, which will also fix this error.

Test Plan:
```
python test/test_sparse_semi_structured.py -k test_mlp

```

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 82c08a0
Pull Request resolved: #110420
jcaip added a commit that referenced this pull request Oct 4, 2023
@cpuhrsch cpuhrsch requested a review from albanD October 4, 2023 02:41
@cpuhrsch
Contributor

cpuhrsch commented Oct 4, 2023

cc @albanD since I think this is a curious Tensor subclass "edge case"

@alexsamardzic
Collaborator

LGTM (aside from the test failing at the moment because of an import).

@albanD
Collaborator

albanD commented Oct 4, 2023

I'm confused. Is this just a bug in the at::linear op? Why not just fix that instead of doing this?

@jcaip
Contributor Author

jcaip commented Oct 4, 2023

cc @albanD I can create an issue to triage this more precisely and fix it, but this issue only arises for semi-structured sparse tensors, and I'm not sure if it's a bug or a special case that needs to be handled. This fix seemed easier.

jcaip added a commit that referenced this pull request Oct 5, 2023
@jcaip
Contributor Author

jcaip commented Oct 6, 2023

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict gh/jcaip/43/orig returned non-zero exit code 1

warning: skipped previously applied commit c0e6a7f34a6
hint: use --reapply-cherry-picks to include skipped commits
hint: Disable this message with "git config advice.skippedCherryPicks false"
Rebasing (1/1)
Auto-merging test/test_sparse_semi_structured.py
Auto-merging torch/sparse/semi_structured.py
CONFLICT (content): Merge conflict in torch/sparse/semi_structured.py
error: could not apply de30920a330... [sparse] Fix semi-structured sparse shape mismatch bug
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply de30920a330... [sparse] Fix semi-structured sparse shape mismatch bug

Raised by https://github.com/pytorch/pytorch/actions/runs/6434285998

@jcaip
Contributor Author

jcaip commented Oct 9, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: linux-binary-libtorch-cxx11-abi / libtorch-cpu-shared-with-deps-cxx11-abi-build / build

Details for Dev Infra team: raised by workflow job

@jcaip
Contributor Author

jcaip commented Oct 10, 2023

@pytorchbot merge -f "unrelated failures"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@facebook-github-bot facebook-github-bot deleted the gh/jcaip/43/head branch October 13, 2023 14:23
jcaip added a commit that referenced this pull request Oct 23, 2023
pytorchmergebot pushed a commit that referenced this pull request Oct 27, 2023
jcaip added a commit that referenced this pull request Nov 3, 2023
@atalman atalman modified the milestones: 2.1.1, 2.1.2 Nov 8, 2023
jcaip added a commit that referenced this pull request Nov 27, 2023
@atalman atalman removed this from the 2.1.2 milestone Dec 14, 2023

Labels

ciflow/trunk (Trigger trunk jobs on your pull request) · Merged · release notes: sparse (release notes category) · topic: bug fixes (topic category)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants