make torch.amp.autocast more generic by guangyey · Pull Request #125103 · pytorch/pytorch · GitHub

Conversation

guangyey (Collaborator) commented Apr 27, 2024

Stack from ghstack (oldest at bottom):

Motivation

As discussed in #124479, torch.amp.autocast cannot be completely equivalent to torch.cuda.amp.autocast and torch.cpu.amp.autocast, because torch.amp.autocast does not provide the per-backend default dtype (torch.bfloat16 for CPU and torch.float16 for CUDA). We would like torch.amp.autocast to be more generic so that developers and customers can write device-agnostic code, since there is not enough justification to add a device-specific torch.xxx.amp.autocast for every device backend.

Solution

When None is passed as dtype, we use torch.get_autocast_dtype to obtain the appropriate dtype for each backend. torch.get_autocast_dtype also needs to be supported in the JIT path for backward compatibility.

Additional Context

With this PR, torch.amp.autocast(device_type='cuda') is equivalent to torch.cuda.amp.autocast.
Two new unit tests are added to cover this change in the eager and JIT paths, respectively.
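
For illustration, a minimal sketch of the device-agnostic usage this enables (the model and shapes here are placeholders; the defaults follow the description above):

```python
import torch

# Pick whichever backend is available; the same code path works for both.
device_type = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(8, 8).to(device_type)
x = torch.randn(4, 8, device=device_type)

# dtype is left as None, so autocast falls back to the backend's default
# autocast dtype (torch.float16 on CUDA, torch.bfloat16 on CPU).
with torch.amp.autocast(device_type=device_type):
    y = model(x)

# The default can also be queried directly.
print(torch.get_autocast_dtype(device_type))
```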

cc @mcarilli @ptrblck @leslie-fang-intel @jgong5 @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang @voznesenskym @penguinwu @EikanWang @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng

pytorch-bot commented Apr 27, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125103

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e11d24b with merge base 5007312:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@guangyey guangyey changed the title make torch.amp.autocast more generic [WIP] make torch.amp.autocast more generic Apr 27, 2024
@guangyey guangyey marked this pull request as draft April 27, 2024 15:42
@guangyey guangyey added the ciflow/trunk Trigger trunk jobs on your pull request label Apr 27, 2024
@pytorch-bot pytorch-bot bot added the release notes: jit release notes category label Apr 28, 2024
guangyey added a commit that referenced this pull request Apr 28, 2024
ghstack-source-id: 4e58007
Pull Request resolved: #125103
@guangyey guangyey added the topic: improvements topic category label May 6, 2024
guangyey added a commit that referenced this pull request May 6, 2024
ghstack-source-id: 31ac5e5
Pull Request resolved: #125103
@guangyey guangyey changed the title [WIP] make torch.amp.autocast more generic make torch.amp.autocast more generic May 6, 2024
@guangyey guangyey marked this pull request as ready for review May 6, 2024 05:52
)
global_state["grad_enabled"] = (torch.set_grad_enabled, torch.is_grad_enabled())

def autocast_specific_backend(
guangyey (Collaborator, Author) commented:

code improvements.

guangyey (Collaborator, Author) commented May 7, 2024

@albanD This PR intends to make torch.amp.autocast more generic so that developers can use it to write device-agnostic code instead of torch.cuda.amp.autocast or torch.cpu.amp.autocast. Is this reasonable?

Comment on lines +210 to +211
if dtype is None:
dtype = torch.get_autocast_dtype(device_type)
Collaborator:
We should update the doc to mention the new default value for this arg?
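
For illustration, a small sketch of the new default behavior for this argument (assuming the backend defaults have not been changed; the dtypes follow the PR description above):

```python
import torch

# With dtype omitted (None), the context manager resolves the backend's
# default autocast dtype via torch.get_autocast_dtype, as in the snippet above.
assert torch.get_autocast_dtype("cpu") == torch.bfloat16

with torch.amp.autocast(device_type="cpu"):  # dtype=None -> torch.bfloat16
    pass

# Equivalent to passing the default explicitly.
with torch.amp.autocast(device_type="cpu", dtype=torch.bfloat16):
    pass
```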

) if torch.amp.is_autocast_available(device) else contextlib.nullcontext()
with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), \
recompute_context:
with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined]
Collaborator:

Oh, we gather and restore both the CPU context and another device's context here? This makes the code a bit weird, but that sounds fair. We definitely don't want to change the behavior here.

cc @soulitzer in case this is something you want to clean up for AC in general in a follow-up, now that we have the nice API.

guangyey (Collaborator, Author):

We don't change the behavior here; we just use torch.amp.autocast to make the code more generic and leave the logic as it is.

Collaborator:

yep, perfect!
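
For reference, a minimal sketch of the pattern discussed in this thread, using the generic API (this is not the actual checkpoint implementation; the helper name make_autocast_contexts is purely illustrative):

```python
import contextlib
import torch

# Build an autocast context for a non-CPU device via the generic API, falling
# back to a no-op context when the backend has no autocast support, and pair
# it with the CPU autocast context, mirroring the snippet above.
def make_autocast_contexts(device_type: str):
    device_ctx = (
        torch.amp.autocast(device_type=device_type)  # dtype=None -> backend default
        if torch.amp.is_autocast_available(device_type)
        else contextlib.nullcontext()
    )
    cpu_ctx = torch.amp.autocast(device_type="cpu")
    return device_ctx, cpu_ctx

# Usage: both contexts are entered together around the recomputation.
device_ctx, cpu_ctx = make_autocast_contexts("cuda" if torch.cuda.is_available() else "cpu")
with device_ctx, cpu_ctx:
    pass  # recomputation would run here
```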

albanD (Collaborator) left a comment

nit in doc, sounds good otherwise.

Default: ``True``
dtype(torch_dtype, optional): Whether to use torch.float16 or torch.bfloat16.
dtype(torch_dtype, optional): Data type for ops run in autocast. It uses the default value
(``torch.float16`` for CUDA and ``torch.bfloat16`` for CPU, by default), given by
Collaborator:
Suggested change
(``torch.float16`` for CUDA and ``torch.bfloat16`` for CPU, by default), given by
(``torch.float16`` for CUDA and ``torch.bfloat16`` for CPU), given by

guangyey (Collaborator, Author):

updated.

guangyey added a commit that referenced this pull request May 8, 2024
ghstack-source-id: c05a9e0
Pull Request resolved: #125103
guangyey (Collaborator, Author) commented May 8, 2024

@pytorchbot merge

pytorchmergebot commented:
Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

guangyey added 2 commits May 8, 2024 10:19
facebook-github-bot pushed a commit to pytorch/benchmark that referenced this pull request May 9, 2024
X-link: pytorch/pytorch#125103
Approved by: https://github.com/albanD, https://github.com/jgong5, https://github.com/gujinghui

Reviewed By: izaitsevfb

Differential Revision: D57138276

fbshipit-source-id: 17f883924e43f68dd6836d99b06fe8a47cfccbf6
guangyey added a commit that referenced this pull request May 13, 2024
# Motivation
We generalized the device-agnostic API `torch.amp.autocast` in [#125103](#125103). After that,
- `torch.cpu.amp.autocast(args...)` is completely equivalent to `torch.amp.autocast('cpu', args...)`, and
- `torch.cuda.amp.autocast(args...)` is completely equivalent to `torch.amp.autocast('cuda', args...)`

in both eager mode and JIT mode.
Based on this, we would like to deprecate `torch.cpu.amp.autocast` and `torch.cuda.amp.autocast` to **strongly recommend** that developers use the device-agnostic `torch.amp.autocast` instead.
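
For illustration, a minimal migration sketch from the deprecated device-specific context managers to the device-agnostic API (the tensors and ops are placeholders):

```python
import torch

x = torch.randn(4, 4)

# Before (deprecated):
#   with torch.cuda.amp.autocast(): ...
#   with torch.cpu.amp.autocast(): ...

# After (device-agnostic):
with torch.amp.autocast(device_type="cpu"):
    y = x @ x  # runs in torch.bfloat16 on CPU by default

if torch.cuda.is_available():
    xc = x.cuda()
    with torch.amp.autocast(device_type="cuda"):
        z = xc @ xc  # runs in torch.float16 on CUDA by default
```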

cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10

[ghstack-poisoned]
guangyey added a commit that referenced this pull request May 14, 2024
guangyey added a commit that referenced this pull request May 15, 2024
guangyey added a commit that referenced this pull request May 15, 2024
guangyey added a commit that referenced this pull request May 15, 2024
guangyey added a commit that referenced this pull request May 15, 2024
pytorchmergebot pushed a commit that referenced this pull request May 16, 2024
Pull Request resolved: #126062
Approved by: https://github.com/eqy, https://github.com/albanD
ZelboK pushed a commit to ZelboK/pytorch that referenced this pull request May 19, 2024
@github-actions github-actions bot deleted the gh/guangyey/26/head branch June 8, 2024 01:56