Reland #50999 (Added pow() on CPU for float16 & bfloat16) by imaginary-person · Pull Request #55280 · pytorch/pytorch · GitHub

Conversation

@imaginary-person
Contributor

@imaginary-person imaginary-person commented Apr 3, 2021

Reason for relanding

Line 1607 of torch/testing/_internal/common_methods_invocations.py of #50999 had dtype instead of dtype=torch.bool, so 4 of the 9 sample inputs for bool had incorrect dtype. This bug was caught by #54949.
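
For illustration only, the effect of that typo was roughly the following (a hypothetical reconstruction; the actual line isn't reproduced here):

```python
import torch

# Hypothetical reconstruction of the bug, not the actual line from
# common_methods_invocations.py: using the sample's generic `dtype` where
# torch.bool was intended yields an exponent tensor of the wrong dtype.
dtype = torch.float32                                   # the dtype the OpInfo is being sampled for
buggy = torch.tensor([0, 1, 1], dtype=dtype)            # what plain `dtype` produced
intended = torch.tensor([0, 1, 1], dtype=torch.bool)    # what `dtype=torch.bool` produces
print(buggy.dtype, intended.dtype)                      # torch.float32 torch.bool
```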

Summary

  1. Added support for pow() on CPU for float16 (Half) and bfloat16 types.
    Both pow(Tensor, Scalar) and pow(Tensor, Tensor) are now supported for these types; a minimal usage sketch follows this list.
    However, autograd isn't supported for float16 on CPU yet, as log_vml_cpu can't be enabled for it.
  2. @heitorschueroff added pow_tensor_scalar_optimized_kernel to refactor & simplify PowKernel.cpp.
    It provides a common path for all the complex types & floating point types (except Float16, due to lack of complete AVX2 vectorization support for it). It replaced code that had previously been duplicated for (float, double) and complex types,
    so PowKernel.cpp looks a lot cleaner now.
  3. Enabled (unskipped) some tests for erf, erfc, erfinv, tan and linalg.vector.norm which had been skipped because pow() was not implemented for float16 & bfloat16.
  4. Added an OpInfo for pow() & enabled some test cases for pow().
  5. Extended the coverage of existing tests for pow in test_binary_ufuncs.py to enable comparison with numpy even with discontiguous tensors, and added a test ensuring that pow's inplace variant raises a runtime error if invoking it would require resizing the base tensor (see the second sketch after this list).
  6. Added float16 & bfloat16 to square's dtype lists in its UnaryUfuncInfo.
  7. Removed redundant dtypesIfCPU and dtypesIfCUDA from OpInfos where they are equal to dtypes.
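
A minimal usage sketch of the new CPU support from item 1 (assuming a build that includes this PR; the tensor values are illustrative):

```python
import torch

# pow(Tensor, Scalar) and pow(Tensor, Tensor) on CPU for the newly supported dtypes.
base = torch.tensor([1., 2., 3., 4.], dtype=torch.float16)    # Half tensor on CPU
exp = torch.tensor([2., 0.5, 3., 1.], dtype=torch.float16)

print(torch.pow(base, 2))                                      # pow(Tensor, Scalar)
print(torch.pow(base, exp))                                    # pow(Tensor, Tensor)
print(torch.pow(base.to(torch.bfloat16), exp.to(torch.bfloat16)))

# Autograd for float16 pow on CPU remains unsupported (log_vml_cpu can't be enabled for it).
```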
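And a simplified sketch of the kind of checks described in item 5 (not the actual test code; tolerances and values are illustrative):

```python
import numpy as np
import torch

# Compare torch.pow against numpy on a discontiguous (strided) view.
base = torch.tensor([0.25, 0.5, 1.0, 2.0, 4.0, 8.0], dtype=torch.bfloat16)[::2]
assert not base.is_contiguous()                 # every-other-element view is discontiguous

actual = torch.pow(base, 3)
expected = np.power(base.to(torch.float64).numpy(), 3)
assert torch.allclose(actual.to(torch.float64), torch.from_numpy(expected),
                      rtol=1e-2, atol=1e-2)

# pow_ raises if broadcasting would require resizing the base tensor in place.
small_base = torch.ones(1, dtype=torch.bfloat16)
exponents = torch.tensor([1.0, 2.0, 3.0], dtype=torch.bfloat16)
try:
    small_base.pow_(exponents)                  # base would need to grow to shape (3,)
except RuntimeError as err:
    print("RuntimeError:", err)
```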

@facebook-github-bot
Contributor

facebook-github-bot commented Apr 3, 2021

💊 CI failures summary and remediations

As of commit f47c554 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details). Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@imaginary-person imaginary-person changed the title from "Reland #50999" to "Reland #50999 (Added pow() on CPU for float16 & bfloat16)" Apr 3, 2021
@imaginary-person imaginary-person marked this pull request as draft April 3, 2021 19:25

@imaginary-person imaginary-person marked this pull request as ready for review April 3, 2021 21:10
@heitorschueroff
Contributor

@imaginary-person Could you rebase this to get clean signals, now that the upstream failure has been fixed?

@heitorschueroff heitorschueroff self-requested a review April 5, 2021 15:19
@imaginary-person
Contributor Author

Hello @heitorschueroff, I'll do that asap. Sorry for the inconvenience!

@imaginary-person
Contributor Author

imaginary-person commented Apr 5, 2021

Hello @heitorschueroff, the 3 failed CI checks are related to PyTorch docs and were fixed upstream earlier today. Please let me know if any further changes are required.

All other CI checks passed.
I apologize again for the inconvenience of having to reland this PR.
Thank you!

Get fixes for docs CI checks
@codecov

codecov bot commented Apr 6, 2021

Codecov Report

Merging #55280 (f47c554) into master (c7312f5) will increase coverage by 2.79%.
The diff coverage is 81.64%.

@@            Coverage Diff             @@
##           master   #55280      +/-   ##
==========================================
+ Coverage   74.66%   77.46%   +2.79%     
==========================================
  Files        1896     1896              
  Lines      187857   188005     +148     
==========================================
+ Hits       140268   145631    +5363     
+ Misses      47589    42374    -5215     

@facebook-github-bot
Contributor

@heitorschueroff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@heitorschueroff heitorschueroff left a comment


LGTM, landing now.

@imaginary-person
Contributor Author

imaginary-person commented Apr 6, 2021

Hello @heitorschueroff, a recently landed PR has introduced a conflict related to test_out that wasn't present earlier. Please let me know whether I should resolve it now or after the FB internal tests complete. I suspect that adding another commit to this PR might end the FB internal tests, so I'm currently hesitant to do so. Thanks!

EDIT: Now that I think of it, the FB internal tests keep running even after another commit is added, so I resolved the conflicts that GitHub was warning me about. The internal FB tests that were running are still running here: https://github.com/pytorch/pytorch/runs/2278377934

@facebook-github-bot
Contributor

@heitorschueroff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@mruberry mruberry mentioned this pull request Apr 7, 2021
@facebook-github-bot
Contributor

@heitorschueroff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@imaginary-person
Contributor Author

imaginary-person commented Apr 9, 2021

Hello @heitorschueroff, please let me know if any further changes are required on this PR. Thanks!

EDIT: I made the following changes:

  1. Revised the sample inputs so that the inplace variant has its own, different sample inputs.
  2. Unskipped test_variant_consistency_eager for float32.
  3. Removed the skip for the previously failing ROCm test, as ROCm CI tests now run on ROCm 4.1.
  4. In make_tensor, the parameter discontiguous was renamed to noncontiguous by #55743 (Ports logdet from method_tests() to op_db), so it had to be changed in test_binary_ufuncs.py as well (see the sketch after this list).
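
A small sketch of the renamed keyword (assuming the helper's internal location at the time; recent releases also expose it publicly as torch.testing.make_tensor):

```python
import torch
from torch.testing._internal.common_utils import make_tensor  # assumed location at the time

# The keyword is now `noncontiguous` (previously `discontiguous`).
t = make_tensor((5, 5), device='cpu', dtype=torch.float16, noncontiguous=True)
assert not t.is_contiguous()
```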


Contributor

@heitorschueroff heitorschueroff left a comment


LGTM

@facebook-github-bot
Contributor

@heitorschueroff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@heitorschueroff merged this pull request in aceceb3.

krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
…orch#55280)

Pull Request resolved: pytorch#55280

Reviewed By: jbschlosser

Differential Revision: D27591772

Pulled By: heitorschueroff

fbshipit-source-id: c7420811b32595bb3353149a61e54a73f2eb352b