fix torch.linalg.norm and torch.norm for torch.complex32 datatype by jiayisunx · Pull Request #133661 · pytorch/pytorch · GitHub

Conversation

@jiayisunx
Collaborator

@jiayisunx jiayisunx commented Aug 16, 2024

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Aug 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133661

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9cb69b2 with merge base 028c5d3:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Aug 16, 2024
jiayisunx added a commit that referenced this pull request Aug 16, 2024
@jiayisunx jiayisunx requested a review from mingfeima August 16, 2024 06:40
@jiayisunx
Collaborator Author

@mingfeima, could you please review this PR? Thanks!

@Skylion007
Collaborator

We probably need to add these dtypes to our current testing framework.
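
For context, a minimal sketch of the kind of coverage being discussed, assuming a build with this fix in place (shapes, ords, and tolerances are illustrative assumptions, not the PR's actual test values):

import torch

# Sketch only: exercise torch.linalg.norm on a torch.complex32 input and compare
# against a complex64 reference computed from the same values.
x = torch.randn(10, dtype=torch.complex32)   # ComplexHalf support is experimental
x_cfloat = x.to(torch.complex64)

for ord in (None, float("inf"), 1, 2):
    res = torch.linalg.norm(x, ord)
    ref = torch.linalg.norm(x_cfloat, ord)
    # The chalf result comes back in half precision, so cast both before comparing.
    torch.testing.assert_close(
        res.to(torch.float32), ref.to(torch.float32), rtol=1e-2, atol=1e-2
    )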

Collaborator

@mingfeima mingfeima left a comment

LGTM, just remember to update test cases.

Comment on lines +1666 to +1668
for ord in vector_ords:
    res = torch.linalg.norm(x, ord, keepdim=keepdim)
    res_float = torch.linalg.norm(x_cfloat, ord, keepdim=keepdim)
Collaborator

Isn't this op tested with OpInfos? Shall we add tests for the CPU there instead?

Collaborator Author

@jiayisunx jiayisunx Aug 20, 2024

I got this warning when this op was tested with OpInfos; ComplexHalf doesn't seem to be fully supported yet:

/home/jiayisun/pytorch/torch/testing/_creation.py:233: UserWarning: ComplexHalf support is experimental and many operators don't support it yet. (Triggered internally at /home/jiayisun/pytorch/aten/src/ATen/EmptyTensor.cpp:46.)
  result = torch.empty(shape, device=device, dtype=dtype)
Some dtypes for norm on device type cpu are only partially supported!
The following dtypes only worked on some samples during backward: {torch.complex32}.
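
The warning itself comes from tensor creation rather than from norm. A quick way to reproduce it (a sketch, assuming a recent build with experimental ComplexHalf support):

import torch
from torch.testing import make_tensor

# Creating a complex32 tensor through the same helper the OpInfo machinery uses
# triggers the experimental-support warning quoted above.
t = make_tensor((3, 3), device="cpu", dtype=torch.complex32)
print(t.dtype)  # torch.complex32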

@lezcano lezcano removed their request for review August 19, 2024 09:32
@jiayisunx jiayisunx requested a review from mruberry as a code owner August 19, 2024 14:07
jiayisunx added a commit that referenced this pull request Aug 19, 2024
jiayisunx added a commit that referenced this pull request Aug 20, 2024
@jiayisunx jiayisunx requested a review from nikitaved August 20, 2024 08:11
sample_inputs_func=sample_inputs_norm,
dtypes=floating_and_complex_types_and(torch.float16, torch.bfloat16),
dtypes=floating_and_complex_types_and(torch.float16, torch.bfloat16, torch.chalf),
dtypesIfCUDA=floating_and_complex_types_and(torch.float16, torch.bfloat16),
Collaborator

@Skylion007 Skylion007 Aug 21, 2024

Can we just add chalf support to CUDA too while we are at it? We already support complex types on GPU, so it should be an easy enough change. Just add kComplexHalf to the CUDA operator as well and delete this line. :)

AT_DISPATCH_FLOATING_AND_COMPLEX_TYPES(iter.input_dtype(), "norm_cuda", [&] {

Collaborator Author

@jiayisunx jiayisunx Aug 22, 2024

No, issue #132634 also exists on CUDA; the GPU kernel has the same bug. It runs through here: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/cuda/ReduceNormKernel.cu#L35, so we cannot fix it just by modifying this line. I am not familiar with the GPU kernels; maybe you can open a new PR to fix it.
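
For anyone checking the CUDA side, a hypothetical probe (not part of this PR, which only touches the CPU kernel):

import torch

# Hypothetical CUDA parity probe: this PR only changes the CPU kernel, so on GPU a
# complex32 input still goes through the unchanged ReduceNormKernel.cu path and the
# two values below may disagree (or the call may error) until that is fixed.
if torch.cuda.is_available():
    x = torch.randn(16, dtype=torch.complex32, device="cuda")
    got = torch.linalg.norm(x)
    ref = torch.linalg.norm(x.to(torch.complex64))
    print(got.item(), ref.item())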

Collaborator Author

@Skylion007, can you please review this PR again? Thanks!

}

AT_DISPATCH_FLOATING_AND_COMPLEX_TYPES(iter.input_dtype(), "norm_cpu", [&] {
AT_DISPATCH_FLOATING_AND_COMPLEX_TYPES_AND3(kHalf, kBFloat16, kComplexHalf, iter.input_dtype(), "norm_cpu", [&] {
Collaborator

Are the kHalf and kBFloat16 types in the AND3 redundant? It seemed to support those types before, or does this suggest we aren't testing the right thing?

Collaborator

This is pretty complicated logic; let's update the GPU operator for chalf at the same time, for parity.

Collaborator Author

Are the kHalf and kBFloat16 types in the AND3 redundant? It seemed to support those types before, or does this suggest we aren't testing the right thing?

Because I modified the logic above, kHalf and kBFloat16 inputs may now also reach this dispatch (the case where iter.input_dtype() == kHalf/kBFloat16 and iter.dtype(0) != kFloat).
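
A small sketch of the distinction being described (my reading of the code paths, with illustrative calls; which branch each call hits is an assumption, not something stated in the PR): a reduced-precision input whose output stays in the same dtype versus one explicitly upcast to float.

import torch

# Illustrative only: a float16 input returning a float16 result vs. the same
# reduction with an explicit float32 output dtype.
x = torch.randn(32, dtype=torch.float16)

out_same = torch.linalg.vector_norm(x)                         # float16 in, float16 out
out_upcast = torch.linalg.vector_norm(x, dtype=torch.float32)  # computed and returned in float32
print(out_same.dtype, out_upcast.dtype)  # torch.float16 torch.float32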

Collaborator Author

I checked that the UT is for this operator, but I got this warning when running the UT; ComplexHalf doesn't seem to be fully supported yet:

/home/jiayisun/pytorch/torch/testing/_creation.py:233: UserWarning: ComplexHalf support is experimental and many operators don't support it yet. (Triggered internally at /home/jiayisun/pytorch/aten/src/ATen/EmptyTensor.cpp:46.)
  result = torch.empty(shape, device=device, dtype=dtype)
Some dtypes for norm on device type cpu are only partially supported!
The following dtypes only worked on some samples during backward: {torch.complex32}.

jiayisunx added a commit that referenced this pull request Aug 22, 2024
@jiayisunx jiayisunx requested a review from Skylion007 August 22, 2024 07:42
@jiayisunx
Collaborator Author

Hi @Skylion007, could you please review this PR again? Thanks!

jiayisunx added a commit that referenced this pull request Nov 6, 2024
@jiayisunx
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 7, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here


Labels

ciflow/trunk, Merged, module: cpu, open source, release notes: linalg_frontend
