[DTensor] Support matmul in inference_mode #142197
Dr. CI: See artifacts and rendered test results at hud.pytorch.org/pr/142197
✅ No failures as of commit ac1c07d with merge base 61dc5e9.
LGTM except the var naming.
    dx = distribute_tensor(x, device_mesh, [Replicate()])
    dA = distribute_tensor(A, device_mesh, [Shard(0)])
I remember people complaining that variable names like `dx` and `dA` suggest gradients. Names such as `x_dist` and `A_dist` are preferred. cc @awgu
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours).
Thanks. I learned how to let DTensor decompose through this diff.
Fixes #142190. The solution is to add a `decompose_handler` for `aten.matmul`, similar to how we handle `aten.linear`. With the decomposition, `aten.matmul` becomes `aten.mm`, which has a sharding strategy registered with DTensor. Pull Request resolved: #142197 Approved by: https://github.com/XilunWu, https://github.com/wz337
Stack from ghstack (oldest at bottom):
Fixes #142190.

The solution is to add a `decompose_handler` for `aten.matmul`, similar to how we handle `aten.linear`. With the decomposition, `aten.matmul` becomes `aten.mm`, which has a sharding strategy registered with DTensor.

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o
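
For readers unfamiliar with the pattern, here is a toy sketch of the idea described above: an op with no strategy of its own is routed through a decomposition handler, which rewrites it into an op that does have one. This is not DTensor's actual dispatcher. Every name in it (`SHARDING_STRATEGIES`, `DECOMPOSE_HANDLERS`, `dispatch`, `decompose_matmul`) is hypothetical, and plain nested lists stand in for tensors.

```python
# Toy model of the decompose-then-dispatch idea. NOT DTensor's real code:
# all names below are hypothetical and nested lists stand in for tensors.


def mm(a, b):
    """Plain 2-D matrix multiply on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Ops the toy dispatcher can run directly (stand-in for "has a sharding strategy").
SHARDING_STRATEGIES = {"mm": mm}


def decompose_matmul(a, b):
    """Rewrite a 2-D matmul as mm and send it back through dispatch."""
    return dispatch("mm", a, b)


# Ops that are rewritten into simpler ops instead of being run directly.
DECOMPOSE_HANDLERS = {"matmul": decompose_matmul}


def dispatch(op, *args):
    # Decomposition handlers are checked first, so "matmul" never needs
    # a strategy of its own.
    if op in DECOMPOSE_HANDLERS:
        return DECOMPOSE_HANDLERS[op](*args)
    if op in SHARDING_STRATEGIES:
        return SHARDING_STRATEGIES[op](*args)
    raise NotImplementedError(f"no sharding strategy registered for {op}")


result = dispatch("matmul", [[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

In this sketch, removing the `"matmul"` entry from `DECOMPOSE_HANDLERS` makes `dispatch("matmul", ...)` raise `NotImplementedError`, which loosely mirrors the kind of failure the fix addresses.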