OpInfo: split, split_with_sizes by kshitij12345 · Pull Request #58184 · pytorch/pytorch · GitHub

Conversation

@kshitij12345
Collaborator

Reference: #54261

@facebook-github-bot
Contributor

facebook-github-bot commented May 12, 2021

💊 CI failures summary and remediations

As of commit 0554930 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.


@kshitij12345 kshitij12345 requested a review from mruberry May 12, 2021 20:25
(True, 'aten::split_with_sizes'), [1]),
('split_with_sizes', (S, S, S), ([int(S / 3), S - int(S / 3) * 2, int(S / 3)],), '', (True,)),
('split_with_sizes', (S, S, S), ([int(S / 3), S - int(S / 3), 0],), 'size_0', (True, )),
('split_with_sizes', (S, S, S), ([int(S / 3), S - int(S / 3) * 2, int(S / 3)],), 'dim', (True, ), [1]),
Collaborator Author

@kshitij12345 kshitij12345 May 12, 2021


I assume that this case was supposed to add a dim to the arguments as well (otherwise it is the same as the first). I have added one in sample_inputs_split_with_sizes.
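
For reference, a minimal sketch of what such a sample-inputs function could look like; the helper below is hypothetical, and the PR's actual sample_inputs_split_with_sizes may differ in naming and in the cases it covers:

import torch
from torch.testing._internal.common_methods_invocations import SampleInput

def sample_inputs_split_with_sizes(op_info, device, dtype, requires_grad, **kwargs):
    # Hypothetical sketch; the PR's actual helper may differ.
    def make_arg(shape):
        return torch.randn(shape, device=device, dtype=dtype,
                           requires_grad=requires_grad)

    # (input shape, split_sizes, dim); split_sizes must sum to the dim's length.
    cases = (
        ((5, 5, 5), [2, 1, 2], 0),   # basic split along dim 0
        ((5, 5, 5), [2, 3, 0], 0),   # a size-0 chunk is allowed
        ((5, 5, 5), [2, 1, 2], 1),   # the non-default dim discussed here
    )
    return [SampleInput(make_arg(shape), args=(sizes, dim))
            for shape, sizes, dim in cases]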

Collaborator


I think the test generator from method_tests() understood the dim argument and would programmatically generate some options for it.

Collaborator Author


I don't think that is the case.
To check, I added the following snippet after the linked line:

if 'split_with' in test_name:
    print(test_name, new_args)

new_args = tuple(new_args)

Output for the given sample (note the args are identical across all variants; no dim options were generated):

test_split_with_sizes_dim_neg0 ([1, 3, 1],)
test_split_with_sizes_dim ([1, 3, 1],)
test_split_with_sizes_dim_neg0_complex ([1, 3, 1],)
test_split_with_sizes_dim_complex ([1, 3, 1],)

Collaborator


Interesting; thanks for investigating. It's a good thing method_tests() will soon be gone and we won't have to wonder at its mysteries anymore.

@kshitij12345 kshitij12345 changed the title OpInfo: split, split_with_sizes [WIP] OpInfo: split, split_with_sizes May 12, 2021
@kshitij12345 kshitij12345 removed the request for review from mruberry May 12, 2021 20:32
@kshitij12345 kshitij12345 marked this pull request as draft May 12, 2021 20:33
variant_test_name='list_args',
dtypes=all_types_and_complex_and(torch.bfloat16, torch.half, torch.bool),
sample_inputs_func=partial(sample_inputs_split, list_args=True),
supports_out=False),
Collaborator Author


Looking at these entries,

('split', (S, S, S), ([int(S / 3), S - int(S / 3) * 2, int(S / 3)],), 'size_list',
            (True, 'aten::split_with_sizes')),
 ('split', (S, S, S), ([int(S / 2), S - int(S / 2) * 2, int(S / 2)], 2), 'size_list_dim',
            (True, 'aten::split_with_sizes'), [1]),

and

# (should_autodiff_node[bool], nonfusible_nodes, fusible_nodes) for autodiff, // optional

I tried adding the following to the entry with list_args:

assert_autodiffed=True,
autodiff_nonfusible_nodes=['aten::split_with_sizes']

but the JIT consistency test failed with:

Failure in testing nodes' autodifferentiation. One or more nodes were expected to be autodiffed, but were not found in specified fusible/nonfusible DifferentiableGraph groups. 
Specifically:
  ['aten::split_with_sizes'] were not in one of the DifferentiableGraphs when they were expected to be. Did you intend for these nodes to be autodiffed? If not, remove them from the list of nonfusible nodes.
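
For reference, a minimal sketch of a sample_inputs_split covering both variants; the names below are hypothetical and the PR's actual implementation may differ. list_args mirrors the partial(sample_inputs_split, list_args=True) in the entry quoted above:

import torch
from torch.testing._internal.common_methods_invocations import SampleInput

def sample_inputs_split(op_info, device, dtype, requires_grad, *, list_args=False, **kwargs):
    # Hypothetical sketch; the PR's actual helper may differ.
    def make_arg(shape):
        return torch.randn(shape, device=device, dtype=dtype,
                           requires_grad=requires_grad)

    if list_args:
        # torch.split(t, [sizes], dim) -- the aten::split_with_sizes path
        cases = (((5, 5, 5), ([2, 1, 2],)),
                 ((5, 5, 5), ([2, 1, 2], 1)))
    else:
        # torch.split(t, int, dim) -- equal chunks (the last may be smaller)
        cases = (((5, 5, 5), (2,)),
                 ((5, 5, 5), (2, 1)))
    return [SampleInput(make_arg(shape), args=args) for shape, args in cases]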

Collaborator


Great attention to detail as always, @kshitij12345

I'm not sure what's going on here. I made a sample program:

import torch

S = 5
t = torch.randn((S, S, S), requires_grad=True)

def foo(t):
    S = 5
    # With S = 5 the size list is [2, 1, 2], i.e. the list-of-sizes variant
    return t.split([int(S / 2), S - int(S / 2) * 2, int(S / 2)])

# Script and trace the same function, then inspect both graphs
scripted_foo = torch.jit.script(foo)
scripted_foo.graph_for(t)

traced_foo = torch.jit.trace(foo, (t,))
traced_foo.graph

And I can see "split" becomes "split_with_sizes" in the traced graph, but not the scripted graph, and it doesn't look like it's part of a differentiable subgraph in either case.
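
One concrete way to check that (a sketch building on the program above) is to search the graph dumps for the relevant strings:

# Sketch: look for the op and subgraph names in the graph text dumps.
print('aten::split_with_sizes' in str(traced_foo.graph))           # True per the observation above
print('aten::split_with_sizes' in str(scripted_foo.graph_for(t)))  # False per the observation above
print('DifferentiableGraph' in str(scripted_foo.graph_for(t)))     # expected False: no differentiable subgraph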

Looking at the code it does seem like there's some special analysis that might convert split to split_with_sizes, but I can't see it being triggered, either:

"aten::split(Tensor self, int[] split_sizes, int dim=0) -> Tensor[]"),

I wonder if the autodifferentiation part of method_tests() is even running? Would you submit a PR that just deletes the (True, 'aten::split_with_sizes') from split and see if anything fails? I tried to find the tests for it in the code and couldn't spot them.
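
For background, torch.split dispatches on the type of its second argument, and only the list form corresponds to aten::split_with_sizes; a quick illustration:

import torch

t = torch.randn(5, 5, 5)

# Int argument: equal chunks of that size (the last may be smaller).
print([c.shape[0] for c in t.split(2)])          # [2, 2, 1]

# List argument: explicit per-chunk sizes; this is the form that
# shows up as aten::split_with_sizes in the IR.
print([c.shape[0] for c in t.split([2, 1, 2])])  # [2, 1, 2]

# The underlying op is also callable directly as a Tensor method.
print([c.shape[0] for c in t.split_with_sizes([2, 1, 2])])  # [2, 1, 2]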

Collaborator Author


Verified locally with the following diff:

+        ('split', (S, S, S), ([int(S / 3), S - int(S / 3) * 2, int(S / 3)],), 'size_list',),
+        ('split', (S, S, S), ([int(S / 2), S - int(S / 2) * 2, int(S / 2)], 2), 'size_list_dim', (), [1]),

The tests pass locally.

python test/test_autograd.py -k _split
/home/kshiteej/.conda/envs/pytorch-cuda-dev/lib/python3.8/site-packages/torch/backends/cudnn/__init__.py:73: UserWarning: PyTorch was compiled without cuDNN/MIOpen support. To use cuDNN/MIOpen, rebuild PyTorch making sure the library is visible to the build system.
  warnings.warn(
................../home/kshiteej/.conda/envs/pytorch-cuda-dev/lib/python3.8/site-packages/torch/backends/cudnn/__init__.py:73: UserWarning: PyTorch was compiled without cuDNN/MIOpen support. To use cuDNN/MIOpen, rebuild PyTorch making sure the library is visible to the build system.
  warnings.warn(
..................ssssssssssssssssss
----------------------------------------------------------------------
Ran 54 tests in 2.998s

OK (skipped=18)

Will submit a PR to verify the same on CI.
Thanks for digging into it, and for the suggestion :)

@kshitij12345 kshitij12345 marked this pull request as ready for review May 12, 2021 20:42
@kshitij12345 kshitij12345 requested a review from mruberry May 12, 2021 20:42
@kshitij12345 kshitij12345 changed the title [WIP] OpInfo: split, split_with_sizes OpInfo: split, split_with_sizes May 12, 2021
@agolynski agolynski added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) May 13, 2021
@codecov

codecov bot commented May 13, 2021

Codecov Report

Merging #58184 (0554930) into master (82d7149) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #58184      +/-   ##
==========================================
- Coverage   76.81%   76.79%   -0.02%     
==========================================
  Files        1986     1986              
  Lines      198192   198208      +16     
==========================================
- Hits       152232   152212      -20     
- Misses      45960    45996      +36     

@kshitij12345
Collaborator Author

@mruberry gentle ping :)

@mruberry mruberry self-requested a review May 23, 2021 05:12
Collaborator

@mruberry mruberry left a comment


Cool! Good investigation with this one.

@facebook-github-bot
Contributor

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mruberry merged this pull request in ee3ea31.


Labels: cla signed · Merged · open source · triaged
