[MPS] Speedup torch.full for 1-byte types by malfet · Pull Request #158874 · pytorch/pytorch · GitHub
Conversation

@malfet
Contributor

@malfet malfet commented Jul 22, 2025

Stack from ghstack (oldest at bottom):

Use fillBuffer:range:value: instead of an MPSGraph op. The blit fill should be faster and does not have the INT_MAX limit.

This in turn fixes the test_index_put_accumulate_large_tensor_mps test.
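
As a rough illustration of why this fast path applies to 1-byte types only: fillBuffer:range:value: takes a single uint8_t fill value, so it can only reproduce element patterns that are one byte wide. The sketch below emulates that on a plain Python bytearray. `byte_pattern` and `fill_bytes` are hypothetical helper names, not part of PyTorch or Metal, and this is not the code in the PR.

```python
import struct


def byte_pattern(value, fmt):
    """Return the single byte encoding `value` for a 1-byte struct format.

    `fmt` is a struct format code such as "B" (uint8), "b" (int8) or "?" (bool).
    Raises ValueError for wider element types, which a byte fill cannot express.
    """
    raw = struct.pack(fmt, value)
    if len(raw) != 1:
        raise ValueError("only 1-byte element types can use a byte fill")
    return raw[0]


def fill_bytes(buffer, offset, length, value, fmt="B"):
    """Emulate a blit-style fill: write one byte over buffer[offset:offset+length]."""
    b = byte_pattern(value, fmt)
    buffer[offset:offset + length] = bytes([b]) * length
    return buffer


buf = bytearray(8)
fill_bytes(buf, 2, 4, -1, fmt="b")  # int8 value -1 is the byte 0xFF
```

A 4-byte type such as float32 falls outside this scheme because its fill value generally does not consist of one repeated byte, so `byte_pattern(1.0, "f")` raises ValueError here.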

[ghstack-poisoned]
@malfet malfet requested a review from kulinseth as a code owner July 22, 2025 21:59
@pytorch-bot

pytorch-bot bot commented Jul 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/158874

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit ce64e3f with merge base ddd74d1:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added ciflow/mps Run MPS tests (subset of trunk) release notes: mps Release notes category labels Jul 22, 2025
malfet added a commit that referenced this pull request Jul 22, 2025
By using [`fillBuffer:range:value:`](https://developer.apple.com/documentation/metal/mtlblitcommandencoder/fillbuffer:range:value:?language=objc)
rather than MPSGraph op, which should be faster and also does not have
INT_MAX limit

ghstack-source-id: dd69275
Pull Request resolved: #158874
@malfet malfet added the topic: improvements topic category label Jul 22, 2025
malfet added a commit that referenced this pull request Jul 22, 2025
By using [`fillBuffer:range:value:`](https://developer.apple.com/documentation/metal/mtlblitcommandencoder/fillbuffer:range:value:?language=objc)
rather than MPSGraph op, which should be faster and also does not have
INT_MAX limit

ghstack-source-id: 5332cf5
Pull Request resolved: #158874
@malfet malfet requested a review from dcci July 22, 2025 23:44
def test_index_put_accumulate_large_tensor(self, device):
    if device.startswith("mps"):
        raise unittest.SkipTest("Crash with max number of dimentions")
    # if device.startswith("mps"):
Member

can we just remove it instead of commenting?

Contributor Author

That's the plan, but I strongly suspect I'll have to leave the skip for macOS 13, where 4GB tensors are a big taboo.
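
One possible shape for such a version-gated skip, as a sketch only: the helper names, the example test class, the skip message, and the "older than macOS 14" cutoff are all assumptions based on the comment above, not what the PR ends up doing. It relies on platform.mac_ver() from the standard library rather than any PyTorch test utility.

```python
import platform
import unittest


def macos_major_version():
    """Return the macOS major version as an int, or None when not on macOS."""
    release = platform.mac_ver()[0]  # empty string on non-macOS systems
    return int(release.split(".")[0]) if release else None


def should_skip_large_tensor_test(device, macos_major):
    """Skip only for MPS devices on macOS releases older than 14 (assumed cutoff)."""
    return (
        device.startswith("mps")
        and macos_major is not None
        and macos_major < 14
    )


class ExampleLargeTensorTest(unittest.TestCase):  # hypothetical test class
    def test_large_tensor(self, device="mps"):
        if should_skip_large_tensor_test(device, macos_major_version()):
            raise unittest.SkipTest("4GB tensors are not supported here")
        # The real test body would go here.
```

Passing the version in as an argument keeps the skip decision easy to check on any platform.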

malfet added a commit that referenced this pull request Jul 23, 2025
By using [`fillBuffer:range:value:`](https://developer.apple.com/documentation/metal/mtlblitcommandencoder/fillbuffer:range:value:?language=objc)
rather than MPSGraph op, which should be faster and also does not have
INT_MAX limit

ghstack-source-id: 98be968
Pull Request resolved: #158874
@malfet malfet added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 23, 2025
@malfet
Contributor Author

malfet commented Jul 23, 2025

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

pytorchmergebot pushed a commit that referenced this pull request Jul 24, 2025
Though testing is a lie and dependent on #153835

Fixes #153789
Pull Request resolved: #158888
Approved by: https://github.com/albanD
ghstack dependencies: #158874
yangw-dev pushed a commit that referenced this pull request Aug 1, 2025
Though testing is a lie and dependent on #153835

Fixes #153789
Pull Request resolved: #158888
Approved by: https://github.com/albanD
ghstack dependencies: #158874
@github-actions github-actions bot deleted the gh/malfet/445/head branch August 23, 2025 02:12
Labels

ciflow/mps Run MPS tests (subset of trunk) ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: mps Release notes category topic: improvements topic category

3 participants