partial attempt at stopping non-differentiable values from being materialized #110592
Conversation
# If it is a Tensor, what the dynamic dims are (otherwise is None)
dynamic_dims: Optional[Set[int]]
# requires_grad
requires_grad: bool
nit: ViewAndMutationMeta.requires_grad_info already has this info, since it is of length (# mutated inputs + # user outputs).
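A minimal, self-contained sketch of the layout that nit refers to, using a simplified stand-in for ViewAndMutationMeta (the class and field names here are illustrative, not the real definitions):

```python
from typing import List, NamedTuple

# Toy stand-in: requires_grad_info is a single flat list of length
# (# mutated inputs + # user outputs), with the mutated-input entries first.
class Meta(NamedTuple):
    num_mutated_inputs: int
    requires_grad_info: List[bool]

meta = Meta(num_mutated_inputs=2, requires_grad_info=[True, False, True, True])

# Entries for the mutated inputs come first...
mutated_input_requires_grad = meta.requires_grad_info[: meta.num_mutated_inputs]
# ...and the remaining entries correspond to the user forward outputs.
fw_output_requires_grad = meta.requires_grad_info[meta.num_mutated_inputs :]

assert mutated_input_requires_grad == [True, False]
assert fw_output_requires_grad == [True, True]
```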
Any reason you prefer to have a separate object rather than having it on here? My intuition was that the fewer lists we have with implicitly matching order, the better.
Yeah, this is a reasonable question; let me give some context. There are a few other places where we need to distinguish which tensors require gradients:
(1) Affects both mutated inputs and fw outputs (code): At runtime, the compiled forward graph returns both any updated inputs and any user fw outs. For both of these groups of tensors, we need to know which ones do not require gradients, so we can mark them as non_differentiable (see the sketch after this list).
(2) Affects just mutated inputs (code): During tracing of the joint, if we have a mutated input that requires gradients, we need to clone() it in the forward, so that we can autograd.grad() w.r.t. the input pre-mutation (this is actually kind of sub-optimal since the clone() affects runtime perf and is just to appease the autograd engine, but this case should be rare).
(3) Affects just fw outputs (code): At runtime after our CompiledFunction.apply() finishes, we need to regenerate fw outs that alias inputs. We need to know if the output alias no longer requires grad, which implies that a detach() happened during tracing (so we need to reapply that detach at runtime).
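To make point (1) concrete, here is a hedged toy sketch (not AOTAutograd's actual CompiledFunction) of why the wrapper needs per-output requires_grad information: it has to mark the non-differentiable outputs so autograd doesn't try to backprop through them.

```python
import torch

# Toy autograd.Function: one differentiable output, one output that can
# never require grad, which the wrapper marks as non-differentiable.
class ToyCompiledFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        fw_out = x * 2           # differentiable user output
        aux = torch.argmax(x)    # integer-valued output: never differentiable
        ctx.mark_non_differentiable(aux)
        return fw_out, aux

    @staticmethod
    def backward(ctx, grad_fw_out, grad_aux):
        # grad_aux corresponds to the non-differentiable output and is ignored.
        return grad_fw_out * 2

x = torch.randn(4, requires_grad=True)
out, aux = ToyCompiledFn.apply(x)
print(out.requires_grad, aux.requires_grad)  # True False
out.sum().backward()
print(x.grad)  # all 2s
```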
If we do what you put above and move the requires_grad info directly onto OutputInfo, then we'd have to either:
(a) Also put requires_grad info on InputInfo objects; but then every piece of code in AOTAutograd that cares about requires_grad information on inputs would have to manually filter down to the input infos that correspond to mutated inputs, or
(b) Add another piece of metadata that corresponds to just the "mutated inputs", and tack requires_grad info onto that.
Let me know what you think. (a) actually might make things easier to reason about (even if it's a bit more boilerplate), but it would be a pretty annoying refactor.
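For concreteness, a rough sketch of option (a) versus the current flat list; the classes below are simplified stand-ins, not the real AOTAutograd definitions:

```python
from dataclasses import dataclass
from typing import List

# Option (a): requires_grad lives directly on each per-tensor info object,
# so no separate list has to stay in implicitly matching order.
@dataclass
class InputInfoSketch:
    mutates_data: bool
    requires_grad: bool

@dataclass
class OutputInfoSketch:
    requires_grad: bool

# The current alternative: one flat requires_grad_info list laid out as
# [mutated inputs..., user outputs...]. Any code that only cares about one
# group (cases (1)-(3) above) has to slice or filter it back out.
@dataclass
class MetaSketch:
    input_info: List[InputInfoSketch]
    output_info: List[OutputInfoSketch]
    requires_grad_info: List[bool]

    def mutated_input_requires_grad(self) -> List[bool]:
        num_mutated = sum(i.mutates_data for i in self.input_info)
        return self.requires_grad_info[:num_mutated]
```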
for o, info in zip(flat_f_outs, output_info)
if info.output_type in [OutputType.non_alias, OutputType.unsafe_view_alias, OutputType.custom_function_view]
and issubclass(info.raw_type, torch.Tensor)
and info.requires_grad
I think you can equivalently check this with:
for o, info, requires_grad_info in zip(flat_f_outs, output_info, output_requires_grad_info):
...
and requires_grad_info
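A hedged, runnable sketch of what that suggested filter could look like end to end. Everything outside the comprehension is toy scaffolding (the OutputType / OutputInfo stubs are simplified stand-ins for the real definitions), output_requires_grad_info is assumed to be a list of bools parallel to output_info, and the loop variable is renamed from the suggestion's requires_grad_info to requires_grad for readability:

```python
import torch
from dataclasses import dataclass
from enum import Enum, auto
from typing import Type

# Simplified stand-ins so the filter below can actually run.
class OutputType(Enum):
    non_alias = auto()
    unsafe_view_alias = auto()
    custom_function_view = auto()
    alias_of_input = auto()

@dataclass
class OutputInfo:
    output_type: OutputType
    raw_type: Type

flat_f_outs = [torch.randn(2), torch.randn(2)]
output_info = [
    OutputInfo(OutputType.non_alias, torch.Tensor),
    OutputInfo(OutputType.alias_of_input, torch.Tensor),
]
output_requires_grad_info = [True, False]

# The suggested form: zip in the parallel requires_grad list and check the
# zipped bool instead of info.requires_grad.
fw_outs_requiring_grad = [
    o
    for o, info, requires_grad in zip(flat_f_outs, output_info, output_requires_grad_info)
    if info.output_type in [OutputType.non_alias, OutputType.unsafe_view_alias, OutputType.custom_function_view]
    and issubclass(info.raw_type, torch.Tensor)
    and requires_grad
]
print(len(fw_outs_requiring_grad))  # 1
```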
return flat_fn(*unpacked_args)

- if config.debug_assert:
+ if config.debug_assert and False:
Yeah, these debug asserts are a pain, but I think they're pretty useful. The idea is that after each layer (removing duplicate inputs, replacing aliased inputs with synthetic bases), the metadata from our analysis pass is slightly different.
We could just re-run the analysis pass, but that would require multiple trips through the user forward with our tracing infra (maybe... fake tensor caching is now fast enough that we can just do this? idk). Instead, there are some helper functions that try to convert the metadata manually, and these debug asserts are used to make sure that we actually got the metadata correct.
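A hedged, self-contained sketch of that pattern; all names here are illustrative stand-ins, not the real AOTAutograd helpers. The metadata is converted manually across a "remove duplicate inputs" layer, and the debug assert checks the conversion against metadata recomputed from scratch:

```python
from dataclasses import dataclass
from typing import List

DEBUG_ASSERT = True  # stand-in for config.debug_assert

@dataclass
class Meta:
    requires_grad_info: List[bool]

def analyze(args) -> Meta:
    # "Analysis pass": recompute metadata directly from the (deduped) args.
    return Meta(requires_grad_info=[a["requires_grad"] for a in args])

def convert_meta_after_dedup(meta: Meta, keep_idx: List[int]) -> Meta:
    # Cheap manual conversion: drop entries for duplicate args instead of
    # re-running the analysis pass.
    return Meta(requires_grad_info=[meta.requires_grad_info[i] for i in keep_idx])

args = [{"requires_grad": True}, {"requires_grad": True}, {"requires_grad": False}]
keep_idx = [0, 2]  # pretend args[1] is a duplicate of args[0]
deduped_args = [args[i] for i in keep_idx]

converted = convert_meta_after_dedup(analyze(args), keep_idx)
if DEBUG_ASSERT:
    # The debug assert: the manual conversion must match a fresh analysis.
    assert converted == analyze(deduped_args)
```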
Closed in favor of #110721
Stack from ghstack (oldest at bottom):