Add BF16 in FP8 quantize ops #1961
Conversation
This pull request was exported from Phabricator. Differential Revision: D47904459
Summary: Pull Request resolved: pytorch#1961

- Added output_dtype for half, bfloat16, and float as output in the dequantization functions; currently it's an integer value defined by Sparse_dtype (float: 0, half: 1, bfloat16: 5).
- Added type conversion in the quant and dequant kernels by using native CUDA/HIP functions for half-to-float conversion and writing everything explicitly.

Reviewed By: jianyuh

Differential Revision: D47904459

fbshipit-source-id: 41d3f0c50365d0482aab912c202f458a787419d8
This pull request has been merged in 4920770.