[TP] Refactor style to make it work with torch.compile #111625
Conversation
[ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/111625
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 7718059 with merge base 0617f7f.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Please see inline suggestions.
[ghstack-poisoned]
lgtm, thanks for refactoring this!
    if is_seq_parallel
    else PrepareModuleInput(input_layouts=Replicate())
)
no_input_prepare_colwise_style = ColwiseParallel(input_layouts=None)
hmmm I don't think our input_layouts accepts None, since it's a Union[Placement, Tuple[Placement]]?
We should probably just do a check inside prepare_input_fn to ensure input_layouts matches the passed-in DTensor; this can be done in follow-up PRs.
sure, let me merge this first and do that in a follow-up PR.
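For illustration, a hedged sketch of the kind of check being suggested; `_check_input_layouts` is a hypothetical helper, not the actual prepare_input_fn inside torch.distributed.tensor.parallel:

```python
from torch.distributed._tensor import DTensor


def _check_input_layouts(input_tensor, input_layouts):
    # Hypothetical helper: if the input is already a DTensor, verify that its
    # placements agree with the layouts declared on the parallel style instead
    # of silently assuming they match.
    if isinstance(input_tensor, DTensor):
        declared = input_layouts if isinstance(input_layouts, tuple) else (input_layouts,)
        if tuple(input_tensor.placements) != tuple(declared):
            raise ValueError(
                f"input_layouts {declared} does not match DTensor placements "
                f"{tuple(input_tensor.placements)}"
            )
    return input_tensor
```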
    torch.empty_like(tensor) for _ in range(self.world_size)
]
dist.all_gather(gathered_tensors, tensor)
gathered_tensors = torch.cat(gathered_tensors, dim=0).contiguous()
nit: we can just use a functional collective instead of manually creating the gathered tensors. Can fix this later.
oh ok, sure. Will do in a follow-up PR, where I will also try to change the test.
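For reference, a hedged sketch of what the suggested functional-collective version could look like (not the code in this PR); it assumes the default process group is initialized and `tensor` is this rank's local shard:

```python
import torch.distributed as dist
import torch.distributed._functional_collectives as funcol

# Gather every rank's shard along dim 0 in a single call; no manual list of
# empty buffers or torch.cat is needed.
gathered_tensors = funcol.all_gather_tensor(tensor, gather_dim=0, group=dist.group.WORLD)
```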
| "mlp_0.net2": RowwiseParallel(), | ||
| "mlp_1.net1": ColwiseParallel(), | ||
| "mlp_1.net2": RowwiseParallel(), | ||
| "mlp_0": module_prepare_input, |
i'm not sure what this does: does it mean that the input is already coming in as local tensors (maybe sharded or replicated) and this wraps them in DTensors? Or does this actually 'shard' the inputs (assuming whole inputs come in first)?
"input is already coming as local tensors (may be sharded or replicated) and this wraps them in DTensors"

Yes. What I see in xlformer is that it has two or three column-wise linears. Instead of registering a hook on each nn.Linear and ending up calling all-gather multiple times, we use module_prepare_input to register it only once on the parent module.
@wconstab it's coming in as local tensors; then in this PrepareModuleInput we mark it as a DTensor and do an all-gather (redistribute).
@fduwjj another follow-up we should probably do is to remove the default layouts for PrepareModuleInput/Output and require the user to set them explicitly, so that users know what is being done under the hood.
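To make the discussion concrete, here is a hedged sketch of the pattern being described, built from the styles visible in the diff above; the "mlp_*" names, `model`, and the mesh setup are illustrative assumptions, not the exact test code:

```python
import torch.distributed as dist
from torch.distributed._tensor import DeviceMesh, Replicate
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    PrepareModuleInput,
    RowwiseParallel,
    parallelize_module,
)

# Assumes torch.distributed is already initialized and `model` has an `mlp_0`
# submodule containing a column-wise `net1` followed by a row-wise `net2`.
mesh = DeviceMesh("cuda", list(range(dist.get_world_size())))

plan = {
    # One hook on the parent module: the local input is wrapped as a DTensor
    # (and redistributed/all-gathered once, e.g. in the sequence-parallel case) ...
    "mlp_0": PrepareModuleInput(input_layouts=Replicate()),
    # ... so the column-wise child skips its own input preparation,
    "mlp_0.net1": ColwiseParallel(input_layouts=None),
    # ... and the row-wise child handles its output as usual.
    "mlp_0.net2": RowwiseParallel(),
}
model = parallelize_module(model, mesh, plan)
```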
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
We are refactoring the parallel styles to accomplish the following:
1. Further simplify the code logic to make it more readable for users.
2. Remove the tuple check so that it works with dynamo for now. Ideally dynamo should support this case, and we will fix that in parallel.
3. Add tests for the newly added parallel styles in the unit tests and the torch.compile test so that we can catch regressions caused by code changes.
4. Move the placements early-return check into DTensor, since it is bypassed by dynamo.
5. Remove PairwiseParallelStyle from the unit tests and use the new Col/Rowwise parallel styles instead.

Pull Request resolved: pytorch#111625
Approved by: https://github.com/wanchaol
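As a minimal, hedged illustration of the torch.compile coverage mentioned in point 3 (not the actual test added by this PR), `tp_model` below stands in for a module already parallelized with parallelize_module, and the input shape is purely illustrative:

```python
import torch

# Assumes `tp_model` is an nn.Module that has already been parallelized with
# parallelize_module(...) and lives on this rank's GPU.
compiled_model = torch.compile(tp_model, fullgraph=True)

# The compiled module should match eager execution; comparing the two is the
# kind of regression check the description refers to.
x = torch.randn(8, 16, device="cuda")  # illustrative batch and feature sizes
torch.testing.assert_close(compiled_model(x), tp_model(x))
```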
Stack from ghstack (oldest at bottom):
We are refactoring the parallel styles to accomplish the following: