KEMBAR78
Remove replace_all and make VTs mutable by mlazos · Pull Request #113725 · pytorch/pytorch · GitHub
Skip to content

Conversation

@mlazos
Copy link
Contributor

@mlazos mlazos commented Nov 15, 2023

  1. Removes calls to replace_all and clone and makes VTs mutable.
  2. Properly handles Tuple Iterator mutation. Previously TupleIterator variables would only be properly reconstructed if they were advanced at least once in a frame. On calls to next, the source information would be lost (due to constructing a new iterator without using builder), which would ensure that during codegen the variable would be reconstructed from scratch. Now that VTs are mutated, the source is never lost, so we need to properly track mutation and handle it by replaying calls to next at the end of the modified bytecode.
  3. Added test for checking iadd side effects, this was missing in our unit test coverage.
  4. Fixed two incorrect sources, DelayGraphBreakVariable, and UserMethodVariable both relied on setting the source to AttrSource(parent, name) at the callsite of var_getattr.
  5. Fixed a bug in inplace adding for lists, it would set the resulting VariableTracker's source to None which would utilize a different reconstruct path in codegen. Now this is handled explicitly by reconstructing vars when allow_cache=False, so that during side effect replay, the mutated var is correctly updated.

In subsequent PRs:

  • Refactoring side effect tracking to be significantly simpler (I think we only need an is_modified flag)
  • Refactor next_variables iterator to match the signature of next
  • Remove all references to options in the code
  • Refactor VTs representing mutable collections to implement their own mutation update handling
  • Remove clone and/or make it specific to lists for creating slices
  • Add mutation tracking/replay for sets
  • Add mutation tracking/replay for iter.py
  • Removing setting source in builder (it's set at the top level after a var is returned)

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @aakhundov @kadeng

@pytorch-bot
Copy link

pytorch-bot bot commented Nov 15, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/113725

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit a9de523 with merge base e1370ff (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@mlazos mlazos changed the title Remove replace_all and make VTs mutable [NOT READY FOR REVIEW] Remove replace_all and make VTs mutable Nov 15, 2023
@mlazos mlazos marked this pull request as ready for review November 15, 2023 00:36
@mlazos mlazos added ciflow/trunk Trigger trunk jobs on your pull request release notes: dynamo labels Nov 15, 2023
@mlazos mlazos changed the title [NOT READY FOR REVIEW] Remove replace_all and make VTs mutable Remove replace_all and make VTs mutable Nov 17, 2023
@mlazos mlazos requested a review from jansel November 17, 2023 19:48
Copy link
Contributor

@jansel jansel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. How are we ensuring that there are no stray clones or manual copies left? Can we remove the clone APIs and audit usage of apply? Add some asserts?
  2. I believe this may be blocked on #113235 -- the checkpointing in higher_order_ops won't work right with this change.

StrongerXi added a commit that referenced this pull request Oct 15, 2024
…ck_replace_is_safe`

This method was no longer needed after #113725; the checking logic is
now in `SideEffects.check_allowed_side_effect`.

ghstack-source-id: 1bbdddf
Pull Request resolved: #137906
StrongerXi added a commit that referenced this pull request Oct 15, 2024
…ck_replace_is_safe`

This method was no longer needed after #113725; the checking logic is
now in `SideEffects.check_allowed_side_effect`.

ghstack-source-id: daf4c41
Pull Request resolved: #137906
StrongerXi added a commit that referenced this pull request Oct 17, 2024
…ck_replace_is_safe`

This method was no longer needed after #113725; the checking logic is
now in `SideEffects.check_allowed_side_effect`.

ghstack-source-id: 51bbdbd
Pull Request resolved: #137906
StrongerXi added a commit that referenced this pull request Oct 18, 2024
…ck_replace_is_safe`

This method was no longer needed after #113725; the checking logic is
now in `SideEffects.check_allowed_side_effect`.

ghstack-source-id: 3be7f75
Pull Request resolved: #137906
pytorchmergebot pushed a commit that referenced this pull request Oct 18, 2024
…ck_replace_is_safe` (#137906)

This method was no longer needed after #113725; the checking logic is
now in `SideEffects.check_allowed_side_effect`.

Pull Request resolved: #137906
Approved by: https://github.com/Skylion007, https://github.com/anijain2305
ghstack dependencies: #137905
pytorchmergebot pushed a commit that referenced this pull request Oct 21, 2024
…ck_replace_is_safe` (#137906)

This method was no longer needed after #113725; the checking logic is
now in `SideEffects.check_allowed_side_effect`.

Pull Request resolved: #137906
Approved by: https://github.com/Skylion007, https://github.com/anijain2305
ghstack dependencies: #137905
StrongerXi added a commit that referenced this pull request Nov 2, 2024
This effectively undoes #115095, which is not longer be needed after #113725.

Why did we need #115095? Prior to #113725, the immutable
`VariableTracker` [didn't always propagate
`source`](https://github.com/pytorch/pytorch/pull/113725/files#diff-622913fdb49db90d6f3a8ab225b4badb7996023e6498e9f7c6d03fe9f32d0986L1477-L1482)
at the `VariableTracker` level (instead we [propagated via
`mutable_local`](https://github.com/pytorch/pytorch/pull/113725/files#diff-244e66f264abc5a8fa2f4b695b33d7dfdde022d6515ab99437106cc0c03d23d1L328-L332)).
So we couldn't rely on the `if value.source is not None` check in
`PyCodegen`. Thus we need to use this `mutable_side_effects_from_source`
flag to force codegen based on `source`.

Note that both previously and now, the `SideEffects` handle this case by
explicitly calling
[cg(var.source)](https://github.com/pytorch/pytorch/blob/092fe2f4224d94db6219cc189f56b8825012bc2e/torch/_dynamo/side_effects.py#L450)
and [cg(var,
allow_cache=False)](https://github.com/pytorch/pytorch/blob/092fe2f4224d94db6219cc189f56b8825012bc2e/torch/_dynamo/side_effects.py#L552)
So `SideEffects` never needed special treatment, only `restore_stack`
did.

ghstack-source-id: ccac027
Pull Request resolved: #139538
StrongerXi added a commit that referenced this pull request Nov 4, 2024
This effectively undoes #115095, which is not longer be needed after #113725.

Why did we need #115095? I went back in history and found that [this line](https://github.com/pytorch/pytorch/pull/113725/files#diff-0bb1756725c4426408938314b0c9d3988ae5bf49994892d7038ad7746e209e9fR86)
actually fixed what #115095 fixed. Specifically, without the
`allow_cache` check for the "dup_top" optimization, we could incorrectly
codegen based on source, despite `codegen_update_mutated` requested to
codegen from value, for updates to pre-existing lists, etc.

However, #115442 introduced a `value_from_source` flag which didn't
account for the `mutable_side_effects_from_source` branch. So this patch
adds an extra check to keep existing behavior for export, and leaves a
TODO for investigating what exactly export wants from codegen, when it
comes to side effects and sources.

ghstack-source-id: 45a1454
Pull Request resolved: #139538
StrongerXi added a commit that referenced this pull request Nov 4, 2024
This patch
1. Adds documentation to `PyCodegen.__call__`, `PyCodegen.tempvars` and
   the `allow_cache` flag.
2. Merges a few existing code paths in `PyCodegen.__call__`.
3. removes the `elif var in cg.tempvars` code path in
   `codegen_save_tempvars`, because it's no longer needed after #113725,
   as we have up-to-date `VariableTracker.source` now.

ghstack-source-id: e718ea1
Pull Request resolved: #139670
StrongerXi added a commit that referenced this pull request Nov 6, 2024
This patch
1. Adds documentation to `PyCodegen.__call__`, `PyCodegen.tempvars` and
   the `allow_cache` flag.
2. Merges a few existing code paths in `PyCodegen.__call__`.
3. removes the `elif var in cg.tempvars` code path in
   `codegen_save_tempvars`, because it's no longer needed after #113725,
   as we have up-to-date `VariableTracker.source` now.

ghstack-source-id: 87bbf04
Pull Request resolved: #139670
StrongerXi added a commit that referenced this pull request Nov 6, 2024
…e code paths"

This patch
1. Adds documentation to `PyCodegen.__call__`, `PyCodegen.tempvars` and
   the `allow_cache` flag.
2. Merges a few existing code paths in `PyCodegen.__call__`.
3. removes the `elif var in cg.tempvars` code path in
   `codegen_save_tempvars`, because it's no longer needed after #113725,
   as we have up-to-date `VariableTracker.source` now.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

[ghstack-poisoned]
StrongerXi added a commit that referenced this pull request Nov 6, 2024
This patch
1. Adds documentation to `PyCodegen.__call__`, `PyCodegen.tempvars` and
   the `allow_cache` flag.
2. Merges a few existing code paths in `PyCodegen.__call__`.
3. removes the `elif var in cg.tempvars` code path in
   `codegen_save_tempvars`, because it's no longer needed after #113725,
   as we have up-to-date `VariableTracker.source` now.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

[ghstack-poisoned]
StrongerXi added a commit that referenced this pull request Nov 6, 2024
…th `MutableSideEffects`"


This effectively undoes #115095, which is not longer be needed after #113725.

Why did we need #115095? I went back in history and found that [this line](https://github.com/pytorch/pytorch/pull/113725/files#diff-0bb1756725c4426408938314b0c9d3988ae5bf49994892d7038ad7746e209e9fR86)
actually fixed what #115095 fixed. Specifically, without the
`allow_cache` check for the "dup_top" optimization, we could incorrectly
codegen based on source, despite `codegen_update_mutated` requested to
codegen from value, for updates to pre-existing lists, etc. Since #113725 added
the `allow_cache` check, we no longer need the `mutable_side_effects_from_source`
code path from #115095.

However, #115442 introduced a `value_from_source` flag which didn't
account for the `mutable_side_effects_from_source` branch. So this patch
adds an extra check to keep existing behavior for export, and leaves a
TODO for investigating what exactly export wants from codegen, when it
comes to side effects and sources.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

[ghstack-poisoned]
StrongerXi added a commit that referenced this pull request Nov 6, 2024
…ffects`"


This effectively undoes #115095, which is not longer be needed after #113725.

Why did we need #115095? I went back in history and found that [this line](https://github.com/pytorch/pytorch/pull/113725/files#diff-0bb1756725c4426408938314b0c9d3988ae5bf49994892d7038ad7746e209e9fR86)
actually fixed what #115095 fixed. Specifically, without the
`allow_cache` check for the "dup_top" optimization, we could incorrectly
codegen based on source, despite `codegen_update_mutated` requested to
codegen from value, for updates to pre-existing lists, etc. Since #113725 added
the `allow_cache` check, we no longer need the `mutable_side_effects_from_source`
code path from #115095.

However, #115442 introduced a `value_from_source` flag which didn't
account for the `mutable_side_effects_from_source` branch. So this patch
adds an extra check to keep existing behavior for export, and leaves a
TODO for investigating what exactly export wants from codegen, when it
comes to side effects and sources.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Nov 7, 2024
…39538)

This effectively undoes #115095, which is not longer be needed after #113725.

Why did we need #115095? I went back in history and found that [this line](https://github.com/pytorch/pytorch/pull/113725/files#diff-0bb1756725c4426408938314b0c9d3988ae5bf49994892d7038ad7746e209e9fR86)
actually fixed what #115095 fixed. Specifically, without the
`allow_cache` check for the "dup_top" optimization, we could incorrectly
codegen based on source, despite `codegen_update_mutated` requested to
codegen from value, for updates to pre-existing lists, etc. Since #113725 added
the `allow_cache` check, we no longer need the `mutable_side_effects_from_source`
code path from #115095.

However, #115442 introduced a `value_from_source` flag which didn't
account for the `mutable_side_effects_from_source` branch. So this patch
adds an extra check to keep existing behavior for export, and leaves a
TODO for investigating what exactly export wants from codegen, when it
comes to side effects and sources.

Pull Request resolved: #139538
Approved by: https://github.com/jansel
pytorchmergebot pushed a commit that referenced this pull request Nov 7, 2024
This patch
1. Adds documentation to `PyCodegen.__call__`, `PyCodegen.tempvars` and
   the `allow_cache` flag.
2. Merges a few existing code paths in `PyCodegen.__call__`.
3. removes the `elif var in cg.tempvars` code path in
   `codegen_save_tempvars`, because it's no longer needed after #113725,
   as we have up-to-date `VariableTracker.source` now.

Pull Request resolved: #139670
Approved by: https://github.com/jansel
ghstack dependencies: #139538
atalman pushed a commit to atalman/pytorch that referenced this pull request Nov 11, 2024
…torch#139538)

This effectively undoes pytorch#115095, which is not longer be needed after pytorch#113725.

Why did we need pytorch#115095? I went back in history and found that [this line](https://github.com/pytorch/pytorch/pull/113725/files#diff-0bb1756725c4426408938314b0c9d3988ae5bf49994892d7038ad7746e209e9fR86)
actually fixed what pytorch#115095 fixed. Specifically, without the
`allow_cache` check for the "dup_top" optimization, we could incorrectly
codegen based on source, despite `codegen_update_mutated` requested to
codegen from value, for updates to pre-existing lists, etc. Since pytorch#113725 added
the `allow_cache` check, we no longer need the `mutable_side_effects_from_source`
code path from pytorch#115095.

However, pytorch#115442 introduced a `value_from_source` flag which didn't
account for the `mutable_side_effects_from_source` branch. So this patch
adds an extra check to keep existing behavior for export, and leaves a
TODO for investigating what exactly export wants from codegen, when it
comes to side effects and sources.

Pull Request resolved: pytorch#139538
Approved by: https://github.com/jansel
atalman pushed a commit to atalman/pytorch that referenced this pull request Nov 11, 2024
This patch
1. Adds documentation to `PyCodegen.__call__`, `PyCodegen.tempvars` and
   the `allow_cache` flag.
2. Merges a few existing code paths in `PyCodegen.__call__`.
3. removes the `elif var in cg.tempvars` code path in
   `codegen_save_tempvars`, because it's no longer needed after pytorch#113725,
   as we have up-to-date `VariableTracker.source` now.

Pull Request resolved: pytorch#139670
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#139538
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
…torch#139538)

This effectively undoes pytorch#115095, which is not longer be needed after pytorch#113725.

Why did we need pytorch#115095? I went back in history and found that [this line](https://github.com/pytorch/pytorch/pull/113725/files#diff-0bb1756725c4426408938314b0c9d3988ae5bf49994892d7038ad7746e209e9fR86)
actually fixed what pytorch#115095 fixed. Specifically, without the
`allow_cache` check for the "dup_top" optimization, we could incorrectly
codegen based on source, despite `codegen_update_mutated` requested to
codegen from value, for updates to pre-existing lists, etc. Since pytorch#113725 added
the `allow_cache` check, we no longer need the `mutable_side_effects_from_source`
code path from pytorch#115095.

However, pytorch#115442 introduced a `value_from_source` flag which didn't
account for the `mutable_side_effects_from_source` branch. So this patch
adds an extra check to keep existing behavior for export, and leaves a
TODO for investigating what exactly export wants from codegen, when it
comes to side effects and sources.

Pull Request resolved: pytorch#139538
Approved by: https://github.com/jansel
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
This patch
1. Adds documentation to `PyCodegen.__call__`, `PyCodegen.tempvars` and
   the `allow_cache` flag.
2. Merges a few existing code paths in `PyCodegen.__call__`.
3. removes the `elif var in cg.tempvars` code path in
   `codegen_save_tempvars`, because it's no longer needed after pytorch#113725,
   as we have up-to-date `VariableTracker.source` now.

Pull Request resolved: pytorch#139670
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#139538
fmo-mt pushed a commit to fmo-mt/pytorch that referenced this pull request Dec 11, 2024
…torch#139538)

This effectively undoes pytorch#115095, which is not longer be needed after pytorch#113725.

Why did we need pytorch#115095? I went back in history and found that [this line](https://github.com/pytorch/pytorch/pull/113725/files#diff-0bb1756725c4426408938314b0c9d3988ae5bf49994892d7038ad7746e209e9fR86)
actually fixed what pytorch#115095 fixed. Specifically, without the
`allow_cache` check for the "dup_top" optimization, we could incorrectly
codegen based on source, despite `codegen_update_mutated` requested to
codegen from value, for updates to pre-existing lists, etc. Since pytorch#113725 added
the `allow_cache` check, we no longer need the `mutable_side_effects_from_source`
code path from pytorch#115095.

However, pytorch#115442 introduced a `value_from_source` flag which didn't
account for the `mutable_side_effects_from_source` branch. So this patch
adds an extra check to keep existing behavior for export, and leaves a
TODO for investigating what exactly export wants from codegen, when it
comes to side effects and sources.

Pull Request resolved: pytorch#139538
Approved by: https://github.com/jansel
fmo-mt pushed a commit to fmo-mt/pytorch that referenced this pull request Dec 11, 2024
This patch
1. Adds documentation to `PyCodegen.__call__`, `PyCodegen.tempvars` and
   the `allow_cache` flag.
2. Merges a few existing code paths in `PyCodegen.__call__`.
3. removes the `elif var in cg.tempvars` code path in
   `codegen_save_tempvars`, because it's no longer needed after pytorch#113725,
   as we have up-to-date `VariableTracker.source` now.

Pull Request resolved: pytorch#139670
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#139538
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants