-
Notifications
You must be signed in to change notification settings - Fork 25.7k
[Dynamo] Implement graph region tracking for deduplication #141381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Fixes for bfs Initial region tracking tests hashing + test fixes fix for hash more tests more fixes region tracker updates Update tests Update tests2 [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141381
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 Unrelated Failure)As of commit f475866 with merge base 117b6c3 ( UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This was referenced Nov 22, 2024
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 [ghstack-poisoned]
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 Edge cases to add more testing for in future PRs (in progress): * multiple nodes on the same line * dynamic shapes banning * ensure we don't expand regions where it will create a cycle during subgraph replacement * ensure outputs are always tensors * out of order kwargs, unevenly nested kwargs [ghstack-poisoned]
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 Edge cases to add more testing for in future PRs (in progress): * multiple nodes on the same line * dynamic shapes banning * ensure we don't expand regions where it will create a cycle during subgraph replacement * ensure outputs are always tensors * out of order kwargs, unevenly nested kwargs [ghstack-poisoned]
zou3519
reviewed
Nov 26, 2024
zou3519
reviewed
Nov 26, 2024
zou3519
reviewed
Nov 26, 2024
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 Edge cases to add more testing for in future PRs (in progress): * multiple nodes on the same line * dynamic shapes checking (need to verify symbolic inputs are the same across subgraphs) * ensure we don't expand regions where it will create a cycle during subgraph replacement * ensure outputs are always tensors (or tuples of tensors iirc) * out of order kwargs, unevenly nested kwargs [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Nov 28, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Nov 28, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 Edge cases to add more testing for in future PRs (in progress): * multiple nodes on the same line * dynamic shapes checking (need to verify symbolic inputs are the same across subgraphs) * ensure we don't expand regions where it will create a cycle during subgraph replacement * ensure outputs are always tensors (or tuples of tensors iirc) * out of order kwargs, unevenly nested kwargs [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Nov 28, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Nov 28, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 Edge cases to add more testing for in future PRs (in progress): * multiple nodes on the same line * dynamic shapes checking (need to verify symbolic inputs are the same across subgraphs) * ensure we don't expand regions where it will create a cycle during subgraph replacement * ensure outputs are always tensors (or tuples of tensors iirc) * out of order kwargs, unevenly nested kwargs [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Dec 2, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Dec 2, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 Edge cases to add more testing for in future PRs (in progress): * ~~multiple nodes on the same line~~ (implemented) * ~~dynamic shapes checking (need to verify symbolic inputs are the same across subgraphs)~~ (implemented) * ensure we don't expand regions where it will create a cycle during subgraph replacement * ensure outputs are always tensors (or tuples of tensors iirc) * ~~out of order kwargs, unevenly nested kwargs~~ (implemented) * input aliasing - TBD, we may add support for this in `invoke_subgraph` or reuse the aliasing analysis here to not form regions with these properties * ~~all global state~~ (implemented) Other followups: * consolidate global state checking across all caching infra [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Dec 10, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Dec 10, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 Edge cases to add more testing for in future PRs (in progress): * ~~multiple nodes on the same line~~ (implemented) * ~~dynamic shapes checking (need to verify symbolic inputs are the same across subgraphs)~~ (implemented) * ensure we don't expand regions where it will create a cycle during subgraph replacement * ensure outputs are always tensors (or tuples of tensors iirc) * ~~out of order kwargs, unevenly nested kwargs~~ (implemented) * input aliasing - TBD, we may add support for this in `invoke_subgraph` or reuse the aliasing analysis here to not form regions with these properties * ~~all global state~~ (implemented) Other followups: * consolidate global state checking across all caching infra [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Dec 10, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Dec 10, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows: `GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes. Once graph capture is complete, these nodes are organized into region groups. A region group looks like this: [[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The `get_identical_regions` function is the main entry point which will be used by the graph replacement algorithm in #141383 Edge cases to add more testing for in future PRs (in progress): * ~~multiple nodes on the same line~~ (implemented) * ~~dynamic shapes checking (need to verify symbolic inputs are the same across subgraphs)~~ (implemented) * ensure we don't expand regions where it will create a cycle during subgraph replacement * ensure outputs are always tensors (or tuples of tensors iirc) * ~~out of order kwargs, unevenly nested kwargs~~ (implemented) * input aliasing - TBD, we may add support for this in `invoke_subgraph` or reuse the aliasing analysis here to not form regions with these properties * ~~all global state~~ (implemented) Other followups: * consolidate global state checking across all caching infra [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Dec 10, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
mlazos
added a commit
that referenced
this pull request
Dec 10, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames [ghstack-poisoned]
pytorchmergebot
pushed a commit
that referenced
this pull request
Dec 11, 2024
This PR adds debug logging for the region expansion algorithm. Pull Request resolved: #141382 Approved by: https://github.com/williamwen42 ghstack dependencies: #141381
pytorchmergebot
pushed a commit
that referenced
this pull request
Dec 11, 2024
This PR implements the deduplication pass (blocked by config currently) for dynamo where identical regions from #141381 are replaced with a common subgraph. The two phases of deduplication are explained below. **Subgraph creation**: Subgraph creation works by taking one representative region from each region group and creating a subgraph from it, which will then be used to replace all regions in the group. This is implemented by first copying all nodes of the region to the new subgraph and then finding all inputs which are not within the region and creating placeholders for them. For the outputs, all regions in a region group need to be scanned to ensure the largest set of outputs is found, and then an output node is created which returns a tuple of all outputs. **Graph replacement**: To replace each region with the extracted subgraph, the node index in the region and argument index within the node's flattened args and kwargs are recorded once during subgraph creation. This allows us to determine which (external to the region) nodes and in which order these nodes are passed as inputs. For the outputs, getitem nodes are created for each output, and all nodes in the region with external outputs are replaced by the proper getitem node. Finally, all original nodes are erased (there should be no uses of these left in the graph). Pull Request resolved: #141383 Approved by: https://github.com/zou3519 ghstack dependencies: #141381, #141382
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows:
GraphRegionTrackertracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes.Once graph capture is complete, these nodes are organized into region groups. A region group looks like this:
[[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]] and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at time from args and kwargs of the node in each region provided that for all regions in the group, the nodes being added are also identical (ie have the same key computed above). The
get_identical_regionsfunction is the main entry point which will be used by the graph replacement algorithm in #141383Edge cases to add more testing for in future PRs (in progress):
multiple nodes on the same line(implemented)dynamic shapes checking (need to verify symbolic inputs are the same across subgraphs)(implemented)out of order kwargs, unevenly nested kwargs(implemented)invoke_subgraphor reuse the aliasing analysis here to not form regions with these propertiesall global state(implemented)Other followups:
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames