[DTensor] implement histc #158298
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/158298
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 Cancelled Jobs, 1 Unrelated Failure as of commit e98bc45 with merge base 194539e.
CANCELLED JOBS - The following jobs were cancelled. Please retry.
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
out_full = out_dt.full_tensor()
self.assertEqual(global_bins, out_full)

# TODO: support backward
```
I think histc might not have a backward
Well that would be convenient 😅
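For what it's worth, a quick sanity check one could run (plain torch, no DTensor needed) to confirm there is nothing to support; the exact error text may vary by build:

```python
import torch

x = torch.randn(8, requires_grad=True)
# Bin-counting is piecewise-constant in the input, so there is no
# meaningful gradient to define for histc.
y = torch.histc(x, bins=4, min=-1.0, max=1.0)
try:
    y.sum().backward()
except RuntimeError as err:
    # Expected to fail either as "does not require grad" or as a
    # "derivative ... is not implemented" error, depending on how the
    # op is registered with autograd.
    print(err)
```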
LGTM! One comment: the `len(op_schema.args_schema) == 4` condition may be too strict.
I did a test to confirm. If I call `histc(tensor)`, then `len(args_schema)` is 1, i.e. default values do not show up. If I call `histc(tensor, min=1, max=2)`, `len(args_schema)` is 4, i.e. even though the `bins` arg was not specified by the user, its default value got "promoted" to look like it was passed explicitly. So I think `== 4` is correct, not too strict. I'm not an expert on how the argument passing works, though, so let me know if I missed something.
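To make that observation concrete, here is a minimal self-contained sketch; `FakeOpSchema` is a mock stand-in for illustration, not the DTensor class:

```python
from dataclasses import dataclass
from typing import Any

# Mock stand-in for DTensor's OpSchema, just to illustrate the check.
@dataclass
class FakeOpSchema:
    args_schema: tuple[Any, ...]

def user_specified_min_max(schema: FakeOpSchema) -> bool:
    # histc(Tensor self, int bins=100, Scalar min=0, Scalar max=0):
    # len == 1 means histc(t) with all defaults elided; passing any of
    # bins/min/max materializes all four slots, hence len == 4.
    return len(schema.args_schema) == 4

print(user_specified_min_max(FakeOpSchema(("t",))))            # False
print(user_specified_min_max(FakeOpSchema(("t", 100, 1, 2))))  # True
```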
```python
# TODO what is this for?
# schema_info=RuntimeSchemaInfo(1),
```
bins, min, and max do not affect the strategy, so we can leave RuntimeSchemaInfo unspecified.
Actually, min and max do affect the strategy; bins does not. So I think I need to set it to 4?
I am glad you raised this comment because I forgot about my TODO.
Following up: I need to specify '2' to indicate the first non-tensor arg that needs to be hashed, not 4 to indicate the last.
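A toy model of why 2 is the right value (assumed semantics of the index, inferred from this thread; the real `RuntimeSchemaInfo` lives in DTensor's op registry):

```python
# Args at index >= static_argnum are hashed by value into the
# sharding-propagation cache key; tensor args before that index are
# keyed by their placements elsewhere.
def strategy_cache_key(args: tuple, static_argnum: int) -> tuple:
    return args[static_argnum:]

# histc args: (self, bins, min, max) -> min is index 2, max is index 3.
args = ("input_spec", 100, 1.0, 2.0)
# (1.0, 2.0): min/max participate in the key, bins (index 1) does not.
print(strategy_cache_key(args, 2))
```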
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
LGTM
```python
self.assertEqual(comm_mode.get_total_counts(), 0)

out_full = out_dt.full_tensor()
self.assertEqual(global_bins, out_full)
```
Can also try using the util `self._test_op_on_dtensor` added in #158112, but this would change the comm_mode numbers because it adds an extra `full_tensor()`.
Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
@pytorchbot merge -i
Merge started. Your change will be merged while ignoring the following 2 checks: pull / cuda12.8-py3.10-gcc9-sm75 / test (pr_time_benchmarks, 1, 1, linux.g4dn.metal.nvidia.gpu, unstable) and trunk / linux-jammy-rocm-py3.10 / build. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed, first few of them are: trunk / linux-jammy-rocm-py3.10 / test (default, 1, 2, linux.rocm.gpu.2). Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -f
❌ 🤖 pytorchbot command failed: Try …
@pytorchbot merge -f"broken infra for rocm?" |
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @d4l3k
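For readers landing here later, a hedged usage sketch of what this PR enables. The mesh size, placement, and gloo/cpu setup are illustrative, not taken from the PR's test file:

```python
# Run with: torchrun --nproc-per-node=2 histc_dtensor_demo.py
import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard

dist.init_process_group("gloo")
mesh = init_device_mesh("cpu", (dist.get_world_size(),))

torch.manual_seed(0)  # same global tensor on every rank
x = torch.rand(16)
dx = distribute_tensor(x, mesh, [Shard(0)])

# With explicit min/max, each rank can histogram its local shard and the
# per-bin counts reduce across ranks, so histc works as a DTensor op.
hist = torch.histc(dx, bins=4, min=0.0, max=1.0)
print(hist.full_tensor())

dist.destroy_process_group()
```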