Add OptimType.NONE in SplitTBE (defuse bwd and optim) #1819
Conversation
This pull request was exported from Phabricator. Differential Revision: D44392172

Summary:
Pull Request resolved: pytorch#1819

This diff is the **backend** part.

This diff introduces `OptimType.NONE`. Unlike other `OptimType`s, `OptimType.NONE` does not perform the optimizer step during SplitTBE's backward pass. With `OptimType.NONE`, SplitTBE deduplicates output gradients in the backward pass and generates a sparse gradient tensor (PyTorch's `sparse_coo_tensor`) for the device's weight (FQN: `weights_dev`).

Currently, `OptimType.NONE` only supports the case where the embedding dimensions of all embedding tables are identical.

Differential Revision: D44392172
fbshipit-source-id: b1264e5a5032ebad051d5c5b739dd9ffec1d8a92
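For readers unfamiliar with the feature, here is a minimal usage sketch (not part of this PR's diff). It assumes the public `SplitTableBatchedEmbeddingBagsCodegen` constructor and import path from FBGEMM at the time, a CUDA device, and that with `OptimType.NONE` the sparse gradient is exposed as `weights_dev.grad`; exact names and signatures may differ across versions.

```python
# Hypothetical sketch of using OptimType.NONE (defused backward/optimizer).
# Assumes fbgemm_gpu is installed with a CUDA device available; the import
# path and constructor arguments follow the FBGEMM API of this era.
import torch
from fbgemm_gpu.split_table_batched_embeddings_ops import (
    ComputeDevice,
    EmbeddingLocation,
    OptimType,
    SplitTableBatchedEmbeddingBagsCodegen,
)

E, D = 1000, 128  # rows and embedding dim; all tables must share the same D
emb = SplitTableBatchedEmbeddingBagsCodegen(
    embedding_specs=[
        (E, D, EmbeddingLocation.DEVICE, ComputeDevice.CUDA),
        (E, D, EmbeddingLocation.DEVICE, ComputeDevice.CUDA),
    ],
    optimizer=OptimType.NONE,  # skip the fused optimizer step in backward
)

# CSR-style lookup: batch size 1 per table, two indices per bag.
indices = torch.tensor([3, 7, 3, 9], dtype=torch.int64, device="cuda")
offsets = torch.tensor([0, 2, 4], dtype=torch.int64, device="cuda")

out = emb(indices, offsets)  # shape: (B, num_tables * D)
out.sum().backward()

# Per the summary, backward leaves a deduplicated sparse COO gradient on
# the device weights instead of applying an optimizer update in place.
grad = emb.weights_dev.grad  # assumption: exposed via .grad on weights_dev
print(grad.layout)           # expected: torch.sparse_coo
# An external optimizer (e.g., torch.optim.SGD over [emb.weights_dev])
# could now consume this sparse gradient.
```

Splitting the optimizer out of the backward kernel makes it possible to apply an arbitrary PyTorch optimizer to the embedding weights, at the cost of materializing the sparse gradient tensor.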
This pull request has been merged in edc57b1.