KEMBAR78
[dynamo] Track from registered tensor hooks in `prune_dead_object_new` by StrongerXi · Pull Request #140435 · pytorch/pytorch · GitHub
Skip to content

Conversation

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Nov 12, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/140435

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

✅ No Failures

As of commit b21abe9 with merge base f98c601 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@StrongerXi
Copy link
Contributor Author

Previously Dynamo would incorrectly prune away some NewCellVariable from SideEffects, which means we won't establish its source in codegen_save_tempvars, causing us to hit NewCellVariable.reconstruct down the line, which triggers Unsupported: reconstruct: NewCellVariable(), and Dynamo will restart & skip the function it was inlining.

Now this patch fixes the pruning, but thereby is exposing a new issue (previously it was silent/"fixed" by graph break and letting CPython run the failing portion).

@StrongerXi
Copy link
Contributor Author

Previously Dynamo would incorrectly prune away some NewCellVariable from SideEffects, which means we won't establish its source in codegen_save_tempvars, causing us to hit NewCellVariable.reconstruct down the line, which triggers Unsupported: reconstruct: NewCellVariable(), and Dynamo will restart & skip the function it was inlining.

Now this patch fixes the pruning, but thereby is exposing a new issue (previously it was silent/"fixed" by graph break and letting CPython run the failing portion).

This turns out to be conveniently fixed by #140436, I'll reorder the stack, add a regression test and update commit message there.

[ghstack-poisoned]
[ghstack-poisoned]
@StrongerXi
Copy link
Contributor Author

Rebase.

[ghstack-poisoned]
@StrongerXi StrongerXi requested review from jansel and zou3519 November 13, 2024 12:46
[ghstack-poisoned]
[ghstack-poisoned]
smalltalkman pushed a commit to smalltalkman/pytorch that referenced this pull request Nov 15, 2024
In addition to `NewCellVariable`, Dynamo has 3 ways of modeling cell objects:
1. For cells captured and created by the root frame, represent them as
   their contents in `root_tx.symbolic_locals`, which `LOAD_DEREF` and
   `STORE_DEREF` update directly, without going through `SideEffects`.
2. `ClosureVariable`: this is created when cells from (1) are captured
   by a newly created function Dynamo is about to inline. It's a handle
   with a name that redirects `LOAD_DEREF` and `STORE_DEREF` back (1),
   to make `root_tx.symbolic_locals` up-to-date.
3. For cells that are captured by both the root frame and some
   pre-existing function Dynamo is about to inline, represent those
   cells as contents, and do not allow writes to them.

Note that (2) and (3) are mainly to conform with (1) -- to make sure
Dynamo has a consistent modeling of cells for the same cell objects.

In this patch, we represent all of these cells as `NewCellVariable`. The
main new code paths introduced are:
- using `NewCellVariable` to model cell objects created by the root
  frame (the cells are passed in as input to `InstructionTranslator`),
  this is what allows us to get rid of all 3 legacy paths above.
- adding a new `AutoDerefLocalSource` to deal with the python-code
  level (guards) and bytecode level (codegen) auto-dereferencing
  behavior, when accessing pre-existing python cells. This also
  involves a tiny update to guard manager generation.
- plumbing some extra info into `LocalSource` and `CellVariable` so that
  we can still emit `LOAD_DEREF`, `STORE_DEREF`, `LOAD_CLOSURE` (instead
  of `make_cell`, `cell_contents` attribute access, and `LOAD_FAST`),
  which is important for readability, performance, and some
  assumptions `bytecode_transformation.py` makes.

As a result, this patch removes a lot of the now-dead code paths and
TODOs. Notably, it significantly simplified the `prune_dead_locals`
function, which was duplicating a lot of the logic from
`prune_dead_object_new`; this conveniently closes pytorch#137123.

Pull Request resolved: pytorch#140153
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435
smalltalkman pushed a commit to smalltalkman/pytorch that referenced this pull request Nov 15, 2024
…140154)

Now that all cells are modeled as `NewCellVariable` in Dynamo, we no
longer need to put cell variables into this special `closure_cells`,
rather we just merge `closure_cells` with `symbolic_locals`.

This allows us to merge and remove some code paths, notably make
`LOAD_CLOSURE` the same as `LOAD_FAST`, and `LOAD_DEREF` & `STORE_DEREF`
the same for inlining or regular `InstructionTranslator`.

Pull Request resolved: pytorch#140154
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435, pytorch#140153
smalltalkman pushed a commit to smalltalkman/pytorch that referenced this pull request Nov 15, 2024
…ytorch#140155)

This is no longer needed now that we've replaced `ClosureVariable` with
`NewCellVariable`, i.e., Dynamo now treats `LOAD_CLOSURE` the same as
`LOAD_FAST`.

Pull Request resolved: pytorch#140155
Approved by: https://github.com/jansel, https://github.com/williamwen42
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435, pytorch#140153, pytorch#140154
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
pytorch#140435)

Registed tensor hooks contain `NestedUserFunctionVariable` which might
capture a `NewCellVariable` for cell objects created during Dynamo
tracing, so we must make sure it doesn't get pruned away.

Pull Request resolved: pytorch#140435
Approved by: https://github.com/jansel, https://github.com/zou3519
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
In addition to `NewCellVariable`, Dynamo has 3 ways of modeling cell objects:
1. For cells captured and created by the root frame, represent them as
   their contents in `root_tx.symbolic_locals`, which `LOAD_DEREF` and
   `STORE_DEREF` update directly, without going through `SideEffects`.
2. `ClosureVariable`: this is created when cells from (1) are captured
   by a newly created function Dynamo is about to inline. It's a handle
   with a name that redirects `LOAD_DEREF` and `STORE_DEREF` back (1),
   to make `root_tx.symbolic_locals` up-to-date.
3. For cells that are captured by both the root frame and some
   pre-existing function Dynamo is about to inline, represent those
   cells as contents, and do not allow writes to them.

Note that (2) and (3) are mainly to conform with (1) -- to make sure
Dynamo has a consistent modeling of cells for the same cell objects.

In this patch, we represent all of these cells as `NewCellVariable`. The
main new code paths introduced are:
- using `NewCellVariable` to model cell objects created by the root
  frame (the cells are passed in as input to `InstructionTranslator`),
  this is what allows us to get rid of all 3 legacy paths above.
- adding a new `AutoDerefLocalSource` to deal with the python-code
  level (guards) and bytecode level (codegen) auto-dereferencing
  behavior, when accessing pre-existing python cells. This also
  involves a tiny update to guard manager generation.
- plumbing some extra info into `LocalSource` and `CellVariable` so that
  we can still emit `LOAD_DEREF`, `STORE_DEREF`, `LOAD_CLOSURE` (instead
  of `make_cell`, `cell_contents` attribute access, and `LOAD_FAST`),
  which is important for readability, performance, and some
  assumptions `bytecode_transformation.py` makes.

As a result, this patch removes a lot of the now-dead code paths and
TODOs. Notably, it significantly simplified the `prune_dead_locals`
function, which was duplicating a lot of the logic from
`prune_dead_object_new`; this conveniently closes pytorch#137123.

Pull Request resolved: pytorch#140153
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
…140154)

Now that all cells are modeled as `NewCellVariable` in Dynamo, we no
longer need to put cell variables into this special `closure_cells`,
rather we just merge `closure_cells` with `symbolic_locals`.

This allows us to merge and remove some code paths, notably make
`LOAD_CLOSURE` the same as `LOAD_FAST`, and `LOAD_DEREF` & `STORE_DEREF`
the same for inlining or regular `InstructionTranslator`.

Pull Request resolved: pytorch#140154
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435, pytorch#140153
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
…ytorch#140155)

This is no longer needed now that we've replaced `ClosureVariable` with
`NewCellVariable`, i.e., Dynamo now treats `LOAD_CLOSURE` the same as
`LOAD_FAST`.

Pull Request resolved: pytorch#140155
Approved by: https://github.com/jansel, https://github.com/williamwen42
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435, pytorch#140153, pytorch#140154
fmo-mt pushed a commit to fmo-mt/pytorch that referenced this pull request Dec 11, 2024
pytorch#140435)

Registed tensor hooks contain `NestedUserFunctionVariable` which might
capture a `NewCellVariable` for cell objects created during Dynamo
tracing, so we must make sure it doesn't get pruned away.

Pull Request resolved: pytorch#140435
Approved by: https://github.com/jansel, https://github.com/zou3519
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436
fmo-mt pushed a commit to fmo-mt/pytorch that referenced this pull request Dec 11, 2024
In addition to `NewCellVariable`, Dynamo has 3 ways of modeling cell objects:
1. For cells captured and created by the root frame, represent them as
   their contents in `root_tx.symbolic_locals`, which `LOAD_DEREF` and
   `STORE_DEREF` update directly, without going through `SideEffects`.
2. `ClosureVariable`: this is created when cells from (1) are captured
   by a newly created function Dynamo is about to inline. It's a handle
   with a name that redirects `LOAD_DEREF` and `STORE_DEREF` back (1),
   to make `root_tx.symbolic_locals` up-to-date.
3. For cells that are captured by both the root frame and some
   pre-existing function Dynamo is about to inline, represent those
   cells as contents, and do not allow writes to them.

Note that (2) and (3) are mainly to conform with (1) -- to make sure
Dynamo has a consistent modeling of cells for the same cell objects.

In this patch, we represent all of these cells as `NewCellVariable`. The
main new code paths introduced are:
- using `NewCellVariable` to model cell objects created by the root
  frame (the cells are passed in as input to `InstructionTranslator`),
  this is what allows us to get rid of all 3 legacy paths above.
- adding a new `AutoDerefLocalSource` to deal with the python-code
  level (guards) and bytecode level (codegen) auto-dereferencing
  behavior, when accessing pre-existing python cells. This also
  involves a tiny update to guard manager generation.
- plumbing some extra info into `LocalSource` and `CellVariable` so that
  we can still emit `LOAD_DEREF`, `STORE_DEREF`, `LOAD_CLOSURE` (instead
  of `make_cell`, `cell_contents` attribute access, and `LOAD_FAST`),
  which is important for readability, performance, and some
  assumptions `bytecode_transformation.py` makes.

As a result, this patch removes a lot of the now-dead code paths and
TODOs. Notably, it significantly simplified the `prune_dead_locals`
function, which was duplicating a lot of the logic from
`prune_dead_object_new`; this conveniently closes pytorch#137123.

Pull Request resolved: pytorch#140153
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435
fmo-mt pushed a commit to fmo-mt/pytorch that referenced this pull request Dec 11, 2024
…140154)

Now that all cells are modeled as `NewCellVariable` in Dynamo, we no
longer need to put cell variables into this special `closure_cells`,
rather we just merge `closure_cells` with `symbolic_locals`.

This allows us to merge and remove some code paths, notably make
`LOAD_CLOSURE` the same as `LOAD_FAST`, and `LOAD_DEREF` & `STORE_DEREF`
the same for inlining or regular `InstructionTranslator`.

Pull Request resolved: pytorch#140154
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435, pytorch#140153
fmo-mt pushed a commit to fmo-mt/pytorch that referenced this pull request Dec 11, 2024
…ytorch#140155)

This is no longer needed now that we've replaced `ClosureVariable` with
`NewCellVariable`, i.e., Dynamo now treats `LOAD_CLOSURE` the same as
`LOAD_FAST`.

Pull Request resolved: pytorch#140155
Approved by: https://github.com/jansel, https://github.com/williamwen42
ghstack dependencies: pytorch#140330, pytorch#140152, pytorch#140436, pytorch#140435, pytorch#140153, pytorch#140154
@github-actions github-actions bot deleted the gh/StrongerXi/34/head branch December 19, 2024 02:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants