[MPS] Fix conv backward pass for channels last by malfet · Pull Request #141009 · pytorch/pytorch · GitHub


@malfet
Contributor

@malfet malfet commented Nov 19, 2024

This looks like a regression caused by use of the strided API. Adding a test revealed (at least in CI) that on Ventura the operation ran but returned garbage results. The fix removes all of the channels-last logic: it is irrelevant in the strided API case, and the placeholder already turns the tensor into a correct one.
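To illustrate why explicit channels-last handling becomes unnecessary once shape and strides are both passed along, here is a minimal pure-Python sketch. It is not the MPS code from this PR; the helper names (`contiguous_strides`, `channels_last_strides`, `all_offsets`) are hypothetical and exist only for this example. It shows that one logical (N, C, H, W) shape can describe two physical layouts purely through its strides.

```python
import itertools


def contiguous_strides(shape):
    """Row-major strides (in elements) for a dense tensor of the given shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def channels_last_strides(shape):
    """Strides for a logical (N, C, H, W) tensor stored physically as (N, H, W, C)."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def all_offsets(shape, strides):
    """Memory offset of every logical index, in logical iteration order."""
    return [
        sum(i * s for i, s in zip(index, strides))
        for index in itertools.product(*(range(size) for size in shape))
    ]


shape = (2, 3, 4, 5)                 # logical N, C, H, W
nchw = contiguous_strides(shape)     # default layout
nhwc = channels_last_strides(shape)  # channels-last layout

# Same logical shape, different strides; both address every element exactly once.
print(shape, nchw, sorted(all_offsets(shape, nchw)) == list(range(120)))
print(shape, nhwc, sorted(all_offsets(shape, nhwc)) == list(range(120)))
```

Because the strides alone encode the layout, a consumer that accepts shape plus strides should not need to permute the shape again; doing both would describe the wrong memory.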

This also allows one to remove mem_format_key and ns_shape_key. The latter was redundant even before this change, as mem_format_key + getTensorsStringKey(grad_output_t) already uniquely identified the operation.
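The redundancy argument can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `tensor_key` is a hypothetical stand-in for a per-tensor key string and does not reproduce the actual format of `getTensorsStringKey`, and the two key builders only mimic the idea of a graph cache key. The point is that once the per-tensor key already encodes the shape, a separate shape component cannot distinguish any additional configurations.

```python
import itertools


def tensor_key(dtype, shape):
    # Hypothetical per-tensor key; NOT PyTorch's real key format.
    return f"{dtype}[{','.join(map(str, shape))}]"


def key_with_shape(mem_format, dtype, shape):
    # Mimics a key that carries an extra, separate shape component.
    shape_key = ",".join(map(str, shape))
    return f"{mem_format}:{shape_key}:{tensor_key(dtype, shape)}"


def key_without_shape(mem_format, dtype, shape):
    # Same key with the separate shape component dropped.
    return f"{mem_format}:{tensor_key(dtype, shape)}"


configs = list(
    itertools.product(
        ["contiguous", "channels_last"],
        ["float32", "float16"],
        [(1, 3, 8), (2, 3, 8), (1, 3, 8, 8)],
    )
)

with_shape = {key_with_shape(*cfg) for cfg in configs}
without_shape = {key_without_shape(*cfg) for cfg in configs}

# Both schemes separate the configurations equally well.
print(len(configs), len(with_shape), len(without_shape))
```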

Fixes #140902

@malfet malfet requested a review from kulinseth as a code owner November 19, 2024 08:04
@pytorch-bot

pytorch-bot bot commented Nov 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141009

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (1 Unrelated Failure)

As of commit ed563b4 with merge base a440a01:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the ciflow/mps (Run MPS tests, subset of trunk) and release notes: mps (release notes category) labels Nov 19, 2024
Looks like a regression caused by use of the strided API.
Fixed by ignoring memory layout, since it should not matter for the strided API, should it?

Fixes #140902
@malfet malfet force-pushed the malfet/fix-conv-backward-cl branch from b4c3618 to d902d9a Compare November 20, 2024 16:32
@malfet malfet added the topic: bug fixes (topic category) label Nov 20, 2024
@manuelcandales manuelcandales self-requested a review November 20, 2024 18:20
@malfet
Contributor Author

malfet commented Nov 20, 2024

@pytorchbot merge -f "Lint + MPS tests are green"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
Fixes pytorch#140902

Pull Request resolved: pytorch#141009
Approved by: https://github.com/manuelcandales
@github-actions github-actions bot deleted the malfet/fix-conv-backward-cl branch December 22, 2024 02:10

Labels

ciflow/mps (Run MPS tests, subset of trunk)
Merged
release notes: mps (release notes category)
topic: bug fixes (topic category)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

View size is not compatible, using Conv1d on channels-last Tensor, reported during backward on mps.

3 participants