[CUDA] fix nansum in non-JIT build by pzzp · Pull Request #158633 · pytorch/pytorch · GitHub

Conversation

@pzzp
Contributor

@pzzp pzzp commented Jul 18, 2025

This change fixes a crash in the following snippet:
```
import torch
a = torch.tensor([[1, 2]], dtype=torch.complex32).to('cuda')
b = torch.nansum(a, dim=0)
print(b)
```
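
As an aside (not part of this PR): the same reduction in complex64, where the accumulate and output types already coincide, is not expected to hit this crash, which points at the complex-half path. A contrast snippet, assuming a CUDA device is available:

```
import torch

# complex64 contrast case: scalar_t, acc_t, and out_scalar_t are all complex
# float here, so the output-type mismatch fixed by this PR cannot occur.
a64 = torch.tensor([[1, 2]], dtype=torch.complex64).to('cuda')
print(torch.nansum(a64, dim=0))  # expected: tensor([1.+0.j, 2.+0.j], device='cuda:0')
```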
@pzzp pzzp requested review from eqy and syed-ahmed as code owners July 18, 2025 09:57
@pytorch-bot

pytorch-bot bot commented Jul 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/158633

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 7eb797b with merge base 32aade9:

UNSTABLE - One job is marked as unstable, possibly due to flakiness on trunk.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla

linux-foundation-easycla bot commented Jul 18, 2025

CLA Signed


The committers listed above are authorized under a signed CLA.

@malfet
Contributor

malfet commented Jul 18, 2025

@pzzp can you sign the CLA please?

@pzzp
Contributor Author

pzzp commented Jul 18, 2025

@malfet ✅️

@thenumberouscode
Contributor

Hi, I’m just curious why changing acc_t to scalar_t fixes this bug.

@pzzp
Contributor Author

pzzp commented Jul 21, 2025

@thenumberouscode
The intended flow is: read the input as scalar_t (complex half), accumulate with acc_t (complex float), and write the result as out_scalar_t (complex half). The bug was that out_scalar_t was set to the wrong type, so the output size was computed incorrectly, which resulted in unaligned memory access.

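For illustration, a minimal post-fix sanity check along these lines (assuming a CUDA device is available; this snippet is not taken from the PR's test changes):

```
import torch

# After the fix: the kernel reads complex32 (scalar_t), accumulates in complex
# float (acc_t), and writes complex32 (out_scalar_t), so the output keeps the
# input dtype and the values match a complex64 reference.
a = torch.tensor([[1, 2]], dtype=torch.complex32, device='cuda')
out = torch.nansum(a, dim=0)
ref = torch.nansum(a.to(torch.complex64), dim=0)

assert out.dtype == torch.complex32
torch.testing.assert_close(out.to(torch.complex64), ref)
print(out)
```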

@pzzp
Contributor Author

pzzp commented Jul 27, 2025

@ngimel hi, how can I merge this change?

@ngimel
Collaborator

ngimel commented Jul 28, 2025

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Jul 28, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

yangw-dev pushed a commit that referenced this pull request Aug 1, 2025
This change fixes a crash in the following snippet:
```
import torch
a = torch.tensor([[1, 2]], dtype=torch.complex32).to('cuda')
b = torch.nansum(a, dim=0)
print(b)
```

Pull Request resolved: #158633
Approved by: https://github.com/ngimel

Labels

ciflow/trunk · Merged · open source · release notes: cuda


6 participants