Add BF16 in FP8 quantize ops #1961
Conversation
This pull request was exported from Phabricator. Differential Revision: D47904459
Summary: Pull Request resolved: pytorch#1961

- Added output_dtype for half, bfloat16, and float as output in the dequantization functions; currently it's an integer value defined by Sparse_dtype (float: 0, half: 1, bfloat16: 5).
- Added type conversion in the quant and dequant kernels by using native CUDA/HIP functions for half-to-float conversion and writing everything explicitly.

Reviewed By: jianyuh

Differential Revision: D47904459

fbshipit-source-id: 41d3f0c50365d0482aab912c202f458a787419d8
This pull request has been merged in 4920770.