Fix tensor subclass + dynamic shapes in torch.compile + aot autograd by guilhermeleobas · Pull Request #125941 · pytorch/pytorch · GitHub

Conversation

@pytorch-bot

pytorch-bot bot commented May 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125941

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit c9db0c5 with merge base 3b0f393:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

guilhermeleobas added a commit that referenced this pull request May 10, 2024
@guilhermeleobas
Collaborator Author

guilhermeleobas commented May 10, 2024

Pretty much a work in progress. I just want to see what is currently breaking.

Fixes issue: #124619

Changes

This PR addresses a bug in the interaction between tensor subclasses and symbolic execution. For each subclass, the sizes are appended to the list of graph arguments, and the computed shapes are returned at runtime.

Most of the changes are in the unwrap_tensor_subclasses function, which now takes two extra flags: append_extra and is_runtime. While tracing the forward graph, the extra size arguments are added when append_extra is true.

An extra field (flat_tensor_extra_sizes_offset) is introduced on SubclassCreationMeta. It stores the offset, counted from the right end of the argument list, of the sizes associated with a tensor subclass. The sizes can then be recovered at runtime with args[len(args) - offset : len(args) - offset + num_sizes], where offset is the new field and num_sizes is the number of sizes for the given subclass.
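
As a rough illustration of that offset arithmetic (hypothetical helper, not the actual AOTAutograd code):

```python
# Hypothetical sketch of the "offset from the right" lookup described above.
# `args` is the flattened runtime argument list; the sizes for a subclass were
# appended to it while tracing the forward graph.
def get_subclass_sizes(args, offset, num_sizes):
    start = len(args) - offset
    return args[start:start + num_sizes]

# Example: args = [t0, t1, s0, s1], offset = 2, num_sizes = 2  ->  [s0, s1]
```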

Test plan

Add tests for two different subclasses: TwoTensor and DoubleTensor. The
latter is a wrapper that behaves as if the inner tensor were twice its
original size.

The set of tests is composed of functions that return a mix of subclasses
and plain tensors.
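
For context, a rough sketch of what a DoubleTensor-style wrapper might look like (an illustrative approximation, not the subclass used in the tests):

```python
import torch
from torch.utils._pytree import tree_map_only

# Illustrative approximation: a wrapper subclass whose reported sizes are
# twice those of the inner tensor.
class DoubleTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, inner):
        outer_size = tuple(2 * s for s in inner.shape)
        return torch.Tensor._make_wrapper_subclass(
            cls, outer_size, dtype=inner.dtype, device=inner.device
        )

    def __init__(self, inner):
        self.inner = inner

    def __tensor_flatten__(self):
        # The inner tensor is the only plain tensor AOTAutograd needs to see.
        return ["inner"], None

    @staticmethod
    def __tensor_unflatten__(inner_tensors, meta, outer_size, outer_stride):
        return DoubleTensor(inner_tensors["inner"])

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # Unwrap, run the op on the inner tensors, and rewrap the outputs.
        out = func(
            *tree_map_only(DoubleTensor, lambda t: t.inner, args),
            **tree_map_only(DoubleTensor, lambda t: t.inner, kwargs),
        )
        return tree_map_only(torch.Tensor, DoubleTensor, out)
```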

guilhermeleobas added a commit that referenced this pull request May 16, 2024
@guilhermeleobas guilhermeleobas requested a review from bdhirsh May 16, 2024 16:30
@guilhermeleobas guilhermeleobas marked this pull request as ready for review May 16, 2024 16:30
@ezyang
Contributor

ezyang commented May 17, 2024

What exactly is the algorithmic strategy here?

guilhermeleobas added a commit that referenced this pull request May 23, 2024
@guilhermeleobas guilhermeleobas marked this pull request as draft May 23, 2024 13:30
@pytorch-bot pytorch-bot bot added the "release notes: fx" (release notes category) label May 28, 2024
guilhermeleobas added a commit that referenced this pull request May 29, 2024
guilhermeleobas added a commit that referenced this pull request May 30, 2024
@guilhermeleobas
Collaborator Author

> ah yes that would be great. @IvanKobzarev has been looking into subclass runtime overhead, and it would be nice if we can avoid this PR making it too much worse

@IvanKobzarev, did you use any code to benchmark #138498? If so, can you share it with me?


    if subclass_metas is None:
-       xs_inner.extend(get_plain_tensors(typing.cast(Tensor, x)))
+       get_plain_tensors(typing.cast(Tensor, x), out_append_list=xs_inner)

-    subclass: Tensor, out_append_list: Optional[List[Tensor]] = None
-) -> List[Tensor]:
+    subclass: Tensor, out_append_list: Optional[List[Union[Tensor, int, SymInt]]] = None
+) -> List[Union[Tensor, int, SymInt]]:

hmm, the type signature here is a bit confusing, since we never actually append ints/SymInts to the list in this function. I guess you needed this because, in the out_append_list= case, the list we pass in might already have SymInts in it?

Instead, what do you think of just refactoring this function to always accept an output list to append to, and mandating that anybody using this API must pass in their own list? (From a quick grep there are only two call sites of this function, both within AOTAutograd.)
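
A minimal sketch of that shape, assuming the helper walks traceable wrapper subclasses via __tensor_flatten__ (hypothetical code, not the final implementation):

```python
from typing import List, Union

from torch import SymInt, Tensor
from torch.utils._python_dispatch import is_traceable_wrapper_subclass

# Hypothetical sketch of the suggested refactor: the caller always owns the
# output list, and the helper only ever appends to it.
def get_plain_tensors(subclass: Tensor, out: List[Union[Tensor, int, SymInt]]) -> None:
    if not is_traceable_wrapper_subclass(subclass):
        out.append(subclass)
        return
    attrs, _ = subclass.__tensor_flatten__()
    for attr in attrs:
        # Recurse so nested subclasses are flattened down to plain tensors.
        get_plain_tensors(getattr(subclass, attr), out)

# Both call sites would then pass in (and keep ownership of) the list:
#   xs_inner: List[Union[Tensor, int, SymInt]] = []
#   get_plain_tensors(x, xs_inner)
```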

@bdhirsh bdhirsh left a comment

left a few more comments, but otherwise I think this is ready to land. Thanks for all the hard work!

guilhermeleobas added a commit that referenced this pull request Oct 25, 2024
ghstack-source-id: 39bcc6e
Pull Request resolved: #125941
guilhermeleobas added a commit that referenced this pull request Oct 28, 2024

ghstack-source-id: aaa59b9
Pull Request resolved: #125941
@IvanKobzarev
Contributor

> ah yes that would be great. @IvanKobzarev has been looking into subclass runtime overhead, and it would be nice if we can avoid this PR making it too much worse
>
> @IvanKobzarev, did you use any code to benchmark #138498? If so, can you share it with me?

Hi,
Sorry for the delay in replying.

At the moment I use two things for profiling:
1/ The not-yet-landed PR #136478, which uses James's profiling; in the test I can then take the times from the logger.
2/ Manual averaging of time.time_ns() accumulated in a global variable inside unwrap_tensor_subclasses() in runtime_wrappers.py.
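
Something along these lines, as a rough sketch of the manual time.time_ns() averaging (hypothetical names, not the exact instrumentation):

```python
import time

# Module-level accumulators for total wall time and call count.
_unwrap_total_ns = 0
_unwrap_calls = 0

def timed(fn):
    """Wrap a function (e.g. unwrap_tensor_subclasses) and accumulate its wall time."""
    def wrapper(*args, **kwargs):
        global _unwrap_total_ns, _unwrap_calls
        start = time.time_ns()
        try:
            return fn(*args, **kwargs)
        finally:
            _unwrap_total_ns += time.time_ns() - start
            _unwrap_calls += 1
    return wrapper

def report():
    if _unwrap_calls:
        avg_us = _unwrap_total_ns / _unwrap_calls / 1e3
        print(f"unwrap_tensor_subclasses: {avg_us:.1f} us/call over {_unwrap_calls} calls")
```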

guilhermeleobas added a commit that referenced this pull request Oct 28, 2024
ghstack-source-id: 058fa7e
Pull Request resolved: #125941
@guilhermeleobas
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the "ciflow/trunk" (Trigger trunk jobs on your pull request) label Oct 28, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@mlazos
Contributor

mlazos commented Nov 5, 2024

> Hi @mlazos, it is. But there's one test that is failing if I remove the maybe_enable_thunkify call. I'll sync with @bdhirsh tomorrow.

@guilhermeleobas can that call be removed now? I think it's still there, with a note that it can be removed after this PR is closed.

@bdhirsh
Contributor

bdhirsh commented Nov 5, 2024

oh @mlazos, our current hypothesis is that this context manager was only needed because there were some tests that did tensor * nested_int compute in a compiled region, which @jbschlosser has since banned as part of #138496 (independently of this PR). So I think it's worth a try to kill that code and see if CI is green.

@mlazos
Contributor

mlazos commented Nov 5, 2024

Awesome, I will try that.

@github-actions github-actions bot deleted the gh/guilhermeleobas/48/head branch December 6, 2024 02:12

Labels

ciflow/inductor, ciflow/trunk (Trigger trunk jobs on your pull request), Merged, module: cpu (CPU specific problem, e.g., perf, algorithm), module: dynamo, module: inductor, oncall: distributed (Add this issue/PR to distributed oncall triage queue), open source, release notes: fx (release notes category)


Development

Successfully merging this pull request may close these issues.

torch.compile + dynamic shapes + tensor subclass graph output is broken

9 participants