Fix tensor subclass + dynamic shapes in torch.compile + aot autograd #125941
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125941
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit c9db0c5 with merge base 3b0f393.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Fixes issue: #124619

Changes

This PR addresses a bug in tensor subclasses and symbolic execution. For each subclass, it appends the sizes to the list of arguments and returns the computed shapes at runtime.

Most of the changes are in the `unwrap_tensor_subclasses` function. This function now takes two extra flags: `append_extra` and `is_runtime`. While tracing, if `append_extra` is true and we are tracing the forward graph, extra arguments are added.

An extra field (`flat_tensor_extra_sizes_offset`) is introduced to `SubclassCreationMeta`. This field stores the offset, counted from the right, of the sizes associated with a tensor subclass. To compute the sizes at runtime, we can use `args[len(args) - offset : len(args) - offset + n_sizes]`, where `offset` is the extra field and `n_sizes` is the number of sizes for the given subclass.

Test plan

Add tests for two different subclasses: `TwoTensor` and `DoubleTensor`. The latter is a wrapper that behaves as if the inner tensor were twice its original size. The set of tests is composed of functions that return a mix of subclasses and plain tensors.
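For concreteness, here is a minimal sketch of the size-recovery arithmetic described above; `recover_sizes` and its arguments are illustrative names, not the actual AOTAutograd code:

```python
# Minimal sketch of the scheme: each subclass's (symbolic) sizes are
# appended to the flat argument list at trace time, then recovered at
# runtime by slicing with a right-to-left offset.
def recover_sizes(args, offset, n_sizes):
    # `offset` is the distance from the end of `args` to where this
    # subclass's sizes start; they occupy `n_sizes` consecutive slots.
    start = len(args) - offset
    return args[start : start + n_sizes]

# Example: two plain tensors followed by the appended sizes of two
# subclasses, (2, 3) and (4, 5).
args = ["t0", "t1", 2, 3, 4, 5]
assert recover_sizes(args, offset=4, n_sizes=2) == [2, 3]
assert recover_sizes(args, offset=2, n_sizes=2) == [4, 5]
```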
What exactly is the algorithmic strategy here?
@IvanKobzarev, did you use any code to benchmark #138498? If so, can you share it with me?
```diff
 if subclass_metas is None:
-    xs_inner.extend(get_plain_tensors(typing.cast(Tensor, x)))
+    get_plain_tensors(typing.cast(Tensor, x), out_append_list=xs_inner)
```
```diff
-    subclass: Tensor, out_append_list: Optional[List[Tensor]] = None
-) -> List[Tensor]:
+    subclass: Tensor, out_append_list: Optional[List[Union[Tensor, int, SymInt]]] = None
+) -> List[Union[Tensor, int, SymInt]]:
```
hmm, the type signature here is a bit confusing, since we never actually append ints/SymInts to the list in this function. I guess you needed this because in the out_append_list= case, the list we pass in might have symints in it already?
Instead, what do you think of: just refactoring this function to always accept an output list to append to, and mandating that anybody using this API must pass in their own list (from a quick grep there are only 2 call sites of this function, both within AOTAutograd)
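A sketch of that suggested refactor, under stated assumptions: `get_plain_tensors_into` and `PlainArg` are hypothetical names (the real function is AOTAutograd's `get_plain_tensors`), and the body is a simplified rendering of recursive subclass flattening rather than the actual implementation:

```python
from typing import List, Union

import torch
from torch.utils._python_dispatch import is_traceable_wrapper_subclass

# Element type widened to match what callers actually store in the list.
PlainArg = Union[torch.Tensor, int, torch.SymInt]


def get_plain_tensors_into(subclass: torch.Tensor, out: List[PlainArg]) -> None:
    # Recurse through traceable wrapper subclasses, appending each plain
    # inner tensor to the caller-owned accumulator.
    if not is_traceable_wrapper_subclass(subclass):
        out.append(subclass)
        return
    attrs, _ctx = subclass.__tensor_flatten__()
    for name in attrs:
        get_plain_tensors_into(getattr(subclass, name), out)


# Call-site style: the caller always allocates the list (possibly
# pre-seeded with SymInts) and threads it through every unwrap call.
xs_inner: List[PlainArg] = []
```

With a mandatory output parameter, the Union element type lives with the caller that mixes in SymInts, and no intermediate list is allocated per call.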
left a few more comments, but otherwise I think this is ready to land. Thanks for all the hard work!
Hi, at the moment, for profiling I just use manual average counting of time.time_ns() in a global variable inside unwrap_tensor_subclasses() in runtime_wrappers.py.
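For reference, a rough standalone sketch of that manual-averaging approach; the decorator and globals here are hypothetical, not the actual instrumentation:

```python
import time

_total_ns = 0
_calls = 0


def timed(fn):
    # Wrap a hot function and accumulate wall-clock deltas in globals.
    def wrapper(*args, **kwargs):
        global _total_ns, _calls
        t0 = time.time_ns()
        try:
            return fn(*args, **kwargs)
        finally:
            _total_ns += time.time_ns() - t0
            _calls += 1

    return wrapper


def average_ns():
    # Running average over all calls observed so far.
    return _total_ns / _calls if _calls else 0.0
```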
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Fix tensor subclass + dynamic shapes in torch.compile + aot autograd (pytorch#125941)
Pull Request resolved: pytorch#125941
Approved by: https://github.com/bdhirsh
ghstack dependencies: pytorch#133337
@guilhermeleobas, can that call be removed now? I think it's still there, with a note that it can be removed after this PR is closed.
Oh @mlazos, our current hypothesis is that this context manager was only needed because there were some tests that did
Awesome, I will try that.
Stack from ghstack (oldest at bottom):
- outer_size/outer_stride #133337