Add support for float8_e4m3fnuz and _e5m2fnuz #107586
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/107586
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit 89852c6 with merge base 79e3833.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@albanD @seemethere – would it be possible to get an exception for the 2000 LOC limit? I'm only 6 lines over, and quite a lot of it is just registering the types.
I've removed my changes to add these and the existing types to TypeInfo.
Thanks for working on this! At a high level this looks good, and we will need to do a more detailed review. Given that the branch cut for v2.1 is only a few days away, the high LOC of this change, and our experience with landing previous "new dtype" PRs, I would expect PyTorch v2.2 to be a reasonable target for eventually getting this in.
c10/util/Float8_e4m3fnuz.h
This looks reasonable. Just curious, what made you choose a LUT over bit shifting?
Also, do we expect the hardware to support an accelerated version of these?
There's no particular reason to use a LUT here, I can change to bit shifting if needed!
Yep, Graphcore's C600 hardware has dedicated instructions which can be used to convert to and from both of these FP8 types.
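For context on the bit-shifting alternative being discussed, a standalone decode routine along those lines might look like the sketch below. This is illustrative only (the function name and structure are mine, not the PR's code), assuming the e4m3fnuz layout of 1 sign / 4 exponent / 3 mantissa bits, exponent bias 8, no infinities, and 0x80 as the only NaN encoding.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Decode one float8_e4m3fnuz byte to a 32-bit float.
float fp8e4m3fnuz_to_fp32(uint8_t x) {
  if (x == 0x80) {
    return std::nanf(""); // the single NaN encoding
  }
  const uint32_t sign = (x >> 7) & 0x1;
  const uint32_t exponent = (x >> 3) & 0xF; // 4 exponent bits, bias 8
  const uint32_t mantissa = x & 0x7;        // 3 mantissa bits

  if (exponent == 0) {
    if (mantissa == 0) {
      return 0.0f; // 0x00 is the only zero (0x80 was handled above)
    }
    // Subnormal: (mantissa / 8) * 2^(1 - bias) = mantissa * 2^-10
    const float value = std::ldexp(static_cast<float>(mantissa), -10);
    return sign ? -value : value;
  }

  // Normal: rebias the exponent from 8 (fp8) to 127 (fp32) and shift
  // the fields into their fp32 positions.
  const uint32_t f_bits =
      (sign << 31) | ((exponent - 8 + 127) << 23) | (mantissa << (23 - 3));
  float result;
  std::memcpy(&result, &f_bits, sizeof(result));
  return result;
}
```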
Did we run any perf benchmarks, or consider the binary size impact, of having this lookup table embedded over and over in every op that needs to convert F8E4M3FNUZ to float?
I will move the LUT into the .cpp file.
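As an aside, the usual way to avoid embedding the table in every translation unit is to declare it extern in the header and define it once in a .cpp file. A rough sketch follows; the namespace and table name are illustrative, not the PR's actual identifiers, and only the first two table entries are shown.

```cpp
// Float8_e4m3fnuz.h (sketch): declare the table so every includer shares one copy.
namespace c10 {
namespace detail {
extern const float e4m3fnuz_to_fp32_table[256];
} // namespace detail
} // namespace c10

// Float8_e4m3fnuz.cpp (sketch): the single definition of the table.
namespace c10 {
namespace detail {
const float e4m3fnuz_to_fp32_table[256] = {
    0.0f,          // 0x00 -> +0
    0.0009765625f, // 0x01 -> smallest subnormal, 2^-10
    // ... remaining 254 entries elided ...
};
} // namespace detail
} // namespace c10
```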
The dtype pieces look reasonable to me! Wondering if you could share some context on which hardware supports these float8 flavors now, and which hardware is expected to support them in the future? cc @malfet, would you be up for a more detailed review of the framework pieces and figuring out the best way to land this?
Thanks for the review @vkuzo! I'll make the change to pull that list of dtypes for the test parameters out into a constant.
Graphcore's current C600 card supports these types at the hardware level. It has instructions in the Tile ISA to perform common operations directly on FP8 data, as well as to convert between types.
I've rebased and added back my TypeInfo changes, as I see that TypeInfo support has been added for the other FP8 types too. This might put the PR over the LOC limit again if the number of lines removed is also included in that count.
if (f_bits >= fnuz_max) {
  // NaN -- sign bit set to 1, rest 0s
  return 0x80;
The table here clips float values greater than FNUZ_MAX to FLT_MAX.
https://onnx.ai/onnx/technical/float8.html#cast
Is there a reason behind using NaNs?
The reason is that the existing casting code for the e5m2 and e4m3fn types is also implemented without any saturation, so I chose to do the same to match that behaviour.
I fear it would lead to having more NaNs when doing inference. Ideally there should be a flag for users to set which behaviour they want.
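If a saturating mode were ever added behind such a flag, the overflow branch quoted above might look roughly like the sketch below. All names here (handle_overflow, fnuz_max, saturate) are illustrative, not the PR's actual code; 0x7F is the largest finite e4m3fnuz encoding (|x| = 240).

```cpp
#include <cstdint>

// Sketch: how the overflow branch of a float -> float8_e4m3fnuz cast could
// optionally saturate instead of producing NaN. `f_bits` is the input float's
// bit pattern with the sign stripped, `fnuz_max` is the smallest bit pattern
// that no longer fits in e4m3fnuz, and `sign` is the extracted sign bit.
uint8_t handle_overflow(uint32_t f_bits, uint32_t fnuz_max, uint8_t sign, bool saturate) {
  if (f_bits >= fnuz_max) {
    if (saturate) {
      // Clamp to the largest finite e4m3fnuz value (|x| = 240, encoded 0x7F),
      // keeping the original sign.
      return static_cast<uint8_t>((sign << 7) | 0x7F);
    }
    return 0x80; // NaN -- sign bit set to 1, rest 0s
  }
  // ... normal conversion path elided in this sketch ...
  return 0;
}
```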
A couple of comments to help with your ROCm CI failure. Hope this helps! :)
@pytorchbot merge -i
Merge started. Your change will be merged while ignoring the following 1 check: pull / linux-focal-cuda11.8-py3.10-gcc9 / test (distributed, 1, 3, linux.8xlarge.nvidia.gpu)
Follow up to #107586. Pull Request resolved: #115214 Approved by: https://github.com/peterbell10, https://github.com/malfet
This PR relates to the feature described in this feature submission. It is based on #104242, which adds similar float8 types.
The new types added in this PR are described in the paper at https://arxiv.org/abs/2206.02915. A brief description and comparison of these types with other float8 types can also be found in the OpenXLA RFC.
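For readers unfamiliar with the fnuz variants: relative to float8_e4m3fn and float8_e5m2, they drop infinities, keep a single NaN encoding (0x80), and raise the exponent bias by one (to 8 and 16 respectively). The small sketch below, derived from those format parameters rather than from the PR itself, shows the resulting maximum finite values.

```cpp
#include <cmath>
#include <cstdio>

int main() {
  // Largest finite values, derived from mantissa width, exponent bias, and
  // which encodings each format reserves for NaN/infinity.
  const double e4m3fn_max   = std::ldexp(1.75, 8);   // 448:   bias 7;  exponent 15 with mantissa 111 is NaN
  const double e4m3fnuz_max = std::ldexp(1.875, 7);  // 240:   bias 8;  only 0x80 is NaN, no infinities
  const double e5m2_max     = std::ldexp(1.75, 15);  // 57344: bias 15; exponent 31 reserved for inf/NaN
  const double e5m2fnuz_max = std::ldexp(1.75, 15);  // 57344: bias 16; the extra exponent extends the small end instead
  std::printf("e4m3fn=%g e4m3fnuz=%g e5m2=%g e5m2fnuz=%g\n",
              e4m3fn_max, e4m3fnuz_max, e5m2_max, e5m2fnuz_max);
  return 0;
}
```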
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @EikanWang @albanD