OpInfo: `zero_` by krshrimali · Pull Request #58731 · pytorch/pytorch · GitHub

Conversation

@krshrimali (Contributor)

See #54261

@facebook-github-bot (Contributor)

facebook-github-bot commented May 21, 2021

💊 CI failures summary and remediations

As of commit 79781e3 (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 1/2 non-scanned failure(s)

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build docker-pytorch-linux-xenial-py3.6-clang7 (1/1)

Step: "Check if image should be built" (full log | diagnosis details | 🔁 rerun)

ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/pytorch-linux-xenial-py3.6-clang7:2152ba6309b12251994fcdcd4ff234bfbe646d06
unsupported manifest format: &{{{2 application/vnd.docker.distribution.manifest.v2+json} {application/vnd.docker.container.image.v1+json 25631 sha256:98cde2da1075205bab984ae16bb71248bdf53d3406c8e91346cd830b28d32a78 [] map[] <nil>} [{application/vnd.docker.image.rootfs.diff.tar.gzip 46254451 sha256:92473f7ef45574f608989888a6cfc8187d3a1425e3a63f974434acab03fed068 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 851 sha256:fb52bde70123ac7f3a1b88fee95e74f4bdcdbd81917a91a35b56a52ec7671947 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 527 sha256:64788f86be3fd71809b5de602deff9445f3de18d2f44a49d0a053dfc9a2008ae [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 169 sha256:33f6d5f2e001ababe3ddac4731d9c33121e1148ef32a87a83a5b470cb401abef [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 1390 sha256:ed9900d32ae781b33daf435f0133a89f512e19232f9cfc61bbfd99168591262d [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 241532322 sha256:7813b469b93fa781c42122f271b265b0080cf399716afae3eeca88f62fe6caf0 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 809 sha256:a6b44d42c4c4855b0c54aec5ca5aa8a1e3f78fa76d124bb6287b4ca20b1279f1 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 64903309 sha256:bac60e667e389be9028f2aca24f78def91dd821475cd027d55288a25bc9e4fbf [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 408 sha256:d66ba4a66f6a91a00e123f5b683aae0417a14355fff5054da54ec0e3961bc0ee [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 32 sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 107 sha256:8c5de0402787795ea46c6246b0e1dd2eef28e5ddfb7d0ac164a02b2abdf3c0ca [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 361 sha256:6069f28cdeff49b9650af7fb5a957841984c5f9de13d773d8e1f30a0f9de58ec [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 1154 sha256:17909504ffa51d49dae642ea2460e5d8fc588203d42ff72ca4d6be58ae403d66 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 407 sha256:38ed1a4ccda8be1f22832cf05543049781c2b4b82f5857d5d2c4726d04f2d248 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 106 sha256:9b52e58a368e65b9ea16d5324d16ecd0582b42fed0706e153a6631c6332b200c [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 1965 sha256:94fce42716a874c973a30dd96754ddab6f656631cf4581a2bf0ee7687d1d7d9d [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 1245973803 sha256:0bbcf9b41635af953e95d2682508175fcbb34dbb61c3c2f5e5f3e0cfa335c6d1 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 440 sha256:e38ce17947a8e5c33aa3af336ea07d57e391fc2eb837470aeab6b27a041bc7b1 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 102 sha256:cb18224d181d07322ce87b6877e179e51767b3735d9c3c4eb3acbfcf8fef3b2e [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 231 sha256:cd8b5f60cf974e6b6fcb4af6af4ae2f1dd014077378a7a2b5d104140f15f6510 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 987375 sha256:f7e5cc4ba07fc190e24f7f8e5b0775bf72c870d1d990ad2625a14db1c348b0ce [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 914 sha256:436c832c4c47bef7b5fc231f07ca9a89e836f3b1ab959a6da2761b8b8714b28e [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 2007661 sha256:5a73a911fe436975bb865209a6411000e8856025e8c7e6adb1be9f380139360b [] map[] <nil>} 
{application/vnd.docker.image.rootfs.diff.tar.gzip 109 sha256:5945f0ebca6c82de0749ca0f8716860bb3bce7f5a489ea8c2c962cfc2e85bed3 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 875 sha256:d007f582ecf863cd0f528c2fd225bbfa385e138e7e6b6e1d9251dc4b043159d6 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 988062 sha256:159337f3e54858592f2eb89882be8d34476617408569ead4536cc660f56bf9f9 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 103 sha256:eedbb2fa6a0a759935d61e00ac9f39ffd2f77f06cf26dc72b1f1378e166dc6e8 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 861 sha256:239b2bf0e12bc4725688abe5fa531965d4bd43e58f1efa50f6e29debd4fb2c32 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 96973035 sha256:1be86537bd9316f33b62b3fc5ec5808ce1d672ce779a411aa0cb127b3b472bee [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 107 sha256:1d858cc47c7f5c551a73cadea9a519e64c6fef6f843628d6203fd69745686d9d [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 1292 sha256:67d965b9db7bea40827815509462732ae1ead923fb6f3542b99aa5b80edfc5cd [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 151 sha256:1b35e2ff20fb17f39b8866b45ec335ebcb612007114fdc77588eaf7918f57786 [] map[] <nil>} {application/vnd.docker.image.rootfs.diff.tar.gzip 642 sha256:d21740c6aafa264b6aa3a9cfb514e7ef2e39f3a93e8d9105af0e54082324b547 [] map[] <nil>} {applica
++ git merge-base HEAD aaccdc39965ade4b61b7852329739e777e244c25
+ git rev-parse aaccdc39965ade4b61b7852329739e777e244c25:.circleci/docker
2152ba6309b12251994fcdcd4ff234bfbe646d06
+++ git merge-base HEAD aaccdc39965ade4b61b7852329739e777e244c25
++ git rev-parse aaccdc39965ade4b61b7852329739e777e244c25:.circleci/docker
+ PREVIOUS_DOCKER_TAG=2152ba6309b12251994fcdcd4ff234bfbe646d06
+ [[ 2152ba6309b12251994fcdcd4ff234bfbe646d06 = \2\1\5\2\b\a\6\3\0\9\b\1\2\2\5\1\9\9\4\f\c\d\c\d\4\f\f\2\3\4\b\f\b\e\6\4\6\d\0\6 ]]
+ echo 'ERROR: Something has gone wrong and the previous image isn'\''t available for the merge-base of your branch'
ERROR: Something has gone wrong and the previous image isn't available for the merge-base of your branch
+ echo '       contact the PyTorch team to restore the original images'
       contact the PyTorch team to restore the original images
+ exit 1


Exited with code exit status 1


ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@kshitij12345 (Collaborator) left a comment

Thanks @krshrimali. I've asked a few questions.

       sample_inputs_func=sample_inputs_xlogy),
OpInfo('zero_',
       dtypes=all_types_and_complex_and(torch.bfloat16, torch.half, torch.bool),
       supports_autograd=False,
Collaborator:

This should support autograd; zero_ has an entry in derivatives.yaml:

- name: zero_(Tensor(a!) self) -> Tensor(a!)

Contributor Author:

This is the error I got for almost all the tests:

RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

Do you have any idea what the solution to this is? @kshitij12345
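
For context, a minimal repro of that error (shapes illustrative): in-place ops are rejected on leaf tensors that require grad, but are allowed on a non-leaf clone.

import torch

x = torch.randn(3, requires_grad=True)  # a leaf tensor
try:
    x.zero_()  # in-place op directly on a leaf that requires grad
except RuntimeError as e:
    # "a leaf Variable that requires grad is being used in an in-place operation."
    print(e)

# Cloning first produces a non-leaf tensor, so the in-place op is allowed.
y = x.clone().zero_()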

Collaborator:

Do you still get an error after wrapping the operator in a lambda?

@kshitij12345 (Collaborator), May 21, 2021:

I think the tests always assume that the operator's function variant doesn't modify the input in place (which was true for every operator until this one).

The eager consistency test has some mechanism to deal with this:

cloned = clone_input_helper(sample.input) if variant in inplace_ops else sample.input

Need to dig a bit further.
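
For illustration, a simplified sketch of what the eager variant-consistency check does with that line (clone_input_helper is paraphrased here as a plain clone; names follow test_ops.py but the code is illustrative):

import torch

def check_variant_consistency(op, variants, inplace_ops, sample):
    expected = op(sample.input)
    for variant in variants:
        # An in-place variant would clobber sample.input, so test it on a clone.
        cloned = sample.input.clone() if variant in inplace_ops else sample.input
        actual = variant(cloned)
        assert torch.equal(actual, expected)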

Contributor Author:

Re: #58731 (comment)

Yes, still getting the same errors after wrapping the operator in lambda.

@kshitij12345 (Collaborator), May 21, 2021:

Inspired by

cloned = clone_input_helper(sample.input) if variant in inplace_ops else sample.input

I tried:

OpInfo('zero_',
       op=lambda x: torch.zero_(x.clone()),
       dtypes=all_types_and_complex_and(torch.bfloat16, torch.half, torch.bool),
       supports_autograd=True,
       supports_out=False,
       sample_inputs_func=sample_inputs_zero_),

which works for all tests except jit and eager consistency.
The jit consistency test has issues with the op being wrapped in a lambda.

The eager test fails for more subtle reasons.

method_variant = getattr(torch.Tensor, name, None)
# attributes like real, imag are not callable
self.method_variant = method_variant if callable(method_variant) else None
inplace_name = name + "_"
self.inplace_variant = getattr(torch.Tensor, inplace_name, None)

We get the method variant and inplace variant in the above way.

For this case, the method variant is Tensor.zero_ and the inplace variant is None.
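
Concretely, following the lookup scheme above:

import torch

name = 'zero_'
method_variant = getattr(torch.Tensor, name, None)         # Tensor.zero_, which is actually in-place
inplace_variant = getattr(torch.Tensor, name + '_', None)  # 'zero__' doesn't exist, so None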

Thus in the eager tests, at the point below (test/test_ops.py, lines 321 to 323 at d88d321):

cloned = clone_input_helper(sample.input) if variant in inplace_ops else sample.input
if variant in inplace_ops and sample.broadcasts_input:

We incorrectly don't clone the inputs for the method variant (which is actually the inplace variant), and hence we get:

RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

To mitigate this, I propose allowing method_variant and inplace_variant to be set via the OpInfo constructor.
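
A rough sketch of what that could look like (the sentinel and fallback logic here are illustrative, not the final implementation):

import torch

_NOTHING = object()  # sentinel meaning "argument not passed"

class OpInfo:
    def __init__(self, name, *, method_variant=_NOTHING, inplace_variant=_NOTHING, **kwargs):
        self.name = name
        # Fall back to attribute lookup only when the caller didn't override the variant.
        if method_variant is _NOTHING:
            method_variant = getattr(torch.Tensor, name, None)
        # Attributes like real, imag are not callable.
        self.method_variant = method_variant if callable(method_variant) else None
        if inplace_variant is _NOTHING:
            inplace_variant = getattr(torch.Tensor, name + '_', None)
        self.inplace_variant = inplace_variant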

cc: @mruberry

@krshrimali, can you take a look at how this is handled by method_tests? Thanks!

Contributor Author:

Thanks, @kshitij12345, for digging into this. Looking at test/test_autograd.py to understand how autograd is tested for method_tests, it seems requires_grad is reset to False for inplace ops, and there is a comment noting that this needs to be fixed. Please see the relevant snippet below.

is_magic_method = name[:2] == '__' and name[-2:] == '__'
is_inplace = name[-1] == "_" and not is_magic_method
self_variable = create_input((self_size,), dtype=dtype, device=device)[0][0]
# FixMe: run grad checks on inplace self
if is_inplace:
    self_variable.requires_grad = False

Reference: https://github.com/pytorch/pytorch/blob/master/test/test_autograd.py#L5474

@krshrimali (Contributor Author), May 21, 2021:

For the scope of this PR, I suggest skipping the jit and eager consistency checks; we can file an issue (on allowing method_variant and inplace_variant to be set via the OpInfo constructor) separately and take the discussion there.

Looking forward to hearing your views, @mruberry and @kshitij12345.
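
A sketch of what those skips could look like in the OpInfo entry (test-class and test names follow test_ops.py at the time, but treat the snippet as illustrative):

OpInfo('zero_',
       dtypes=all_types_and_complex_and(torch.bfloat16, torch.half, torch.bool),
       supports_autograd=False,
       sample_inputs_func=sample_inputs_zero_,
       skips=(
           # The in-place semantics confuse the variant-consistency tests; see discussion above.
           SkipInfo('TestCommon', 'test_variant_consistency_eager'),
           SkipInfo('TestCommon', 'test_variant_consistency_jit'),
       )),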

Collaborator:

Skipping those tests and following up by allowing the method and inplace variants to be specified sounds good.

cases_scalar = ((),)
cases_non_scalar = ((S, S, S),
                    (S, S),
                    (S, ))
Collaborator:

I think it is okay merge these two. It won't be hard to read.

@krshrimali krshrimali requested a review from mruberry May 21, 2021 11:55

@H-Huang H-Huang added the triaged ("This issue has been looked at by a team member, and triaged and prioritized into an appropriate module") label May 21, 2021
def sample_inputs_zero_(op_info, device, dtype, requires_grad, **kwargs):
    make_arg = partial(make_tensor, device=device, dtype=dtype, requires_grad=requires_grad)

    cases = ((), (S, S, S), (S, S), (S,))
Collaborator:

Nit: this probably doesn't need the (S, S) case

@mruberry (Collaborator) left a comment

Overall this looks good, but per the discussion some follow-up is needed.

For now, actually, I'd keep "supports_autograd=False" and add a comment linking to an issue explaining that this isn't correct, but that our tests don't correctly deal with this case.

Then I'd file another issue for setting the method variant and inplace variants for operators directly, AND we should support ops that don't have function variants but only have method and/or inplace variants, too. Solving this latter issue should let us address the first issue.

How does that sound @krshrimali, @kshitij12345?

@kshitij12345 (Collaborator)

> For now, actually, I'd keep "supports_autograd=False" and add a comment linking to an issue explaining that this isn't correct, but that our tests don't correctly deal with this case.

I think that is fine, since even method_tests don't do the grad tests.

> Then I'd file another issue for setting the method variant and inplace variants for operators directly, AND we should support ops that don't have function variants but only have method and/or inplace variants, too. Solving this latter issue should let us address the first issue.

On this: it actually has a function variant, torch.zero_, which is weird and very rarely used. Should it be deprecated? It has very few actual usages across the code base, and they should probably be replaced with tensor.zero_().

[Screenshots: code-search results showing the handful of torch.zero_ usages]

@mruberry what do you think?
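
For reference, both spellings do the same thing, zeroing the tensor in place:

import torch

t = torch.ones(2, 2)
torch.zero_(t)      # the rarely-used function variant
assert t.eq(0).all()

t2 = torch.ones(2, 2)
t2.zero_()          # the idiomatic method form
assert t2.eq(0).all()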

@mruberry (Collaborator)

> For now, actually, I'd keep "supports_autograd=False" and add a comment linking to an issue explaining that this isn't correct, but that our tests don't correctly deal with this case.
>
> I think that is fine, since even method_tests don't do the grad tests.
>
> Then I'd file another issue for setting the method variant and inplace variants for operators directly, AND we should support ops that don't have function variants but only have method and/or inplace variants, too. Solving this latter issue should let us address the first issue.
>
> On this: it actually has a function variant, torch.zero_, which is weird and very rarely used. Should it be deprecated? It has very few actual usages across the code base, and they should probably be replaced with tensor.zero_().
>
> [Screenshots: code-search results showing the handful of torch.zero_ usages]
>
> @mruberry what do you think?

Unfortunately, we're inconsistent today in how the in-place versions of operators are exposed. Your question is a good one, but this is one discrepancy we probably don't want to worry about at the moment, as it depends on some future decisions about how we represent operators.

@krshrimali (Contributor Author)

Updates:

Adding method_variant and inplace_variant args to the OpInfo class is being done in #59138. I think this PR can wait for a few days until that PR is merged. The jit issue will still be there, though, and an issue will be raised for it soon. Thanks for the patience!

@kshitij12345 (Collaborator) left a comment

LGTM! Thanks!

OpInfo('zero_',
       op=lambda x: torch.zero_(x.clone()),
       method_variant=None,
       inplace_variant=torch.Tensor.zero_,
Collaborator:

Nice use of the new architecture

@mruberry (Collaborator) left a comment

Nice work, @krshrimali!

@facebook-github-bot (Contributor)

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor)

@mruberry merged this pull request in ef40757.

deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
Summary:
See pytorch#54261

Pull Request resolved: pytorch#58731

Reviewed By: ngimel

Differential Revision: D28784083

Pulled By: mruberry

fbshipit-source-id: f06de8045afd3728b1fedc014c091d8fd1955a9f

Labels: cla signed, Merged, open source, triaged