`addmv`: port to structured kernels, improve error checks by ngimel · Pull Request #55746 · pytorch/pytorch · GitHub

Conversation

@ngimel (Collaborator) commented Apr 11, 2021

Per title
I've revamped the size checks a bit to provide a better error message if `self` is of the wrong size, and added a check that the inplace variant receives a correctly sized `self`.

Ref: #55070
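For context, a minimal sketch of what these checks can look like in a structured meta function; names, messages, and layout are illustrative and may differ from the exact code in this PR:

```cpp
// Illustrative sketch of addmv's structured meta function with the improved
// shape checks (exact messages/structure in the PR may differ).
TORCH_META_FUNC(addmv)(const Tensor& self, const Tensor& mat, const Tensor& vec,
                       const Scalar& beta, const Scalar& alpha) {
  TORCH_CHECK(mat.dim() == 2 && vec.dim() == 1 && self.dim() <= 1,
      "vector + matrix @ vector expected, got ",
      self.dim(), ", ", mat.dim(), ", ", vec.dim());
  TORCH_CHECK(mat.size(1) == vec.size(0) &&
                  (mat.size(0) == self.numel() || self.numel() == 1),
      "size mismatch, got input (", self.numel(), "), mat (",
      mat.size(0), "x", mat.size(1), "), vec (", vec.size(0), ")");
  // For the inplace variant (addmv_), self is also the output, so a
  // broadcasted or wrongly sized self must be rejected rather than resized.
  std::array<int64_t, 1> out_size{mat.sizes()[0]};
  set_output(out_size, mat.options());
}
```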

@facebook-github-bot (Contributor) commented Apr 11, 2021

💊 CI failures summary and remediations

As of commit 57ee700 (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 1/2 non-scanned failure(s)

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_windows_vs2019_py36_cuda10.1_test2 (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

RuntimeError: test_autograd failed!
  File "test_autograd.py", line 32, in <module>
    from torch.testing._internal.common_cuda import TEST_CUDA, _get_torch_cuda_version
  File "C:\Users\circleci\project\build\win_tmp\build\torch\testing\_internal\common_cuda.py", line 16, in <module>
    TEST_CUDNN = TEST_CUDA and torch.backends.cudnn.is_acceptable(torch.tensor(1., device=CUDA_DEVICE))
RuntimeError: CUDA error: unspecified launch failure
Traceback (most recent call last):
  File "run_test.py", line 1101, in <module>
    main()
  File "run_test.py", line 1080, in main
    raise RuntimeError(err_message)
RuntimeError: test_autograd failed!

(base) C:\Users\circleci\project\test>if ERRORLEVEL 1 exit /b 1 
+ cleanup
+ retcode=1
+ set +x


Exited with code exit status 1


This comment was automatically generated by Dr. CI.

std::array<int64_t, 1> out_size{mat.sizes()[0]};
set_output(out_size, mat.options());
auto result = maybe_get_output();
// this check can fire only for the inplace op; for all other variants, result is guaranteed to be the correct size
ngimel (Collaborator, Author):
Note for reviewers: maybe we should update the inplace structured codegen to perform this check. TensorIterator already checks that an inplace output is not broadcast, so for TI-based structured kernels there would be duplication.

Contributor:
Hmm, I think I agree; I'd prefer just checking that self and set_output have the same size. Should be easy enough to add, and similar to the out tests we added earlier. I definitely don't like the maybe_get_output call here (it is in fact guaranteed to give an output because you called set_output earlier, and this isn't actually part of the intended public contract).
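A hedged sketch of that alternative, with a hypothetical helper name: validate the user-supplied inplace output directly against the size the meta function computed, rather than fetching it back through maybe_get_output():

```cpp
// Hypothetical sketch of what the inplace structured wrapper (codegen) could
// assert: the user-supplied self/output must already have the size the meta
// function computed, since an inplace op cannot resize its output.
// `structured_addmv_inplace_check` and `computed_size` are illustrative names.
void structured_addmv_inplace_check(const Tensor& self, IntArrayRef computed_size) {
  TORCH_CHECK(self.sizes() == computed_size,
      "output with shape ", self.sizes(),
      " doesn't match the expected shape ", computed_size);
}
```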


Tensor &mv_out(const Tensor &self, const Tensor &vec, Tensor& result) {
return native::addmv_out(result, self, vec, 0, 1, result);
return at::addmv_out(result, result, self, vec, 0, 1);
ngimel (Collaborator, Author):
this adds another dispatch, but I don't see how it can be avoided.

Contributor:
Don't worry, weren't you and mruberry planning to fix this? ;) #50953

@ezyang (Contributor) commented Apr 11, 2021

> When I tried that, however, I couldn't send a const Tensor result argument to at::native::copy_ or at::native::mul_out and was getting error: binding reference of type ‘at::Tensor&’ to ‘const at::Tensor’ discards qualifiers (rightly).

Just use a const_cast here
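A small sketch of that suggestion, with a hypothetical helper name; only the const_cast call pattern is the point. The structured kernel's output arrives as `const Tensor&`, while some at::native helpers still take `Tensor&`:

```cpp
// Sketch of the const_cast workaround: the cast is fine because the
// underlying tensor is mutable; only the reference handed to the structured
// impl is const-qualified. (Hypothetical helper name for illustration.)
void copy_self_into_output(const Tensor& result, const Tensor& self) {
  // Passing `result` directly would fail with:
  //   binding reference of type 'at::Tensor&' to 'const at::Tensor'
  //   discards qualifiers
  at::native::copy_(const_cast<Tensor&>(result), self, /*non_blocking=*/false);
}
```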

@ngimel force-pushed the ngimel/mv_structured branch from 196101f to 8dbafcc on April 13, 2021 04:23
if (betaval == 0.0) {
  result.zero_();
} else {
  at::mul_out(
ngimel (Collaborator, Author):
neither at::native::mul_out nor at::cpu::mul_out work here, so another dispatch it is

Contributor:
what about at::cuda::mul_out 8)

ngimel (Collaborator, Author):
"at::cuda" has no member "mul_out" - do I need some includes?


-void propagate_names_for_addmv(
-    Tensor& result,
+std::vector<Dimname> propagate_names_for_addmv(
Contributor:
consider using a dimvector here

ngimel (Collaborator, Author):
That would require changing all functions that use unify_from_right (because it returns std::vector); better done as a separate PR.
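A hypothetical illustration of the suggestion; `NameVector` is an alias invented here for illustration, not necessarily an existing ATen type:

```cpp
// A small-buffer vector of names avoids a heap allocation for typical ranks.
// Since unify_from_right (and its existing callers) returns
// std::vector<Dimname>, switching the type touches more code than this PR.
#include <ATen/core/Dimname.h>
#include <c10/util/SmallVector.h>

using NameVector = c10::SmallVector<at::Dimname, 5>;  // hypothetical alias
```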

// in addmv, because addmv expects self to satisfy proper conditions;
// to avoid this, supply a correctly sized self; its contents don't matter because beta is 0
if (result.dim() > 1 || (result.numel() != self.size(0) || result.numel() != 1)) {
  Tensor self_addmv = at::empty({self.size(0)}, self.options());
Contributor:
Ick, we probably shouldn't implement it this way (preexisting condition, carry on...)
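For readers following the thread, a hedged reconstruction of how the surrounding mv_out fallback reads with this workaround; the exact code may differ slightly:

```cpp
// result is user-supplied and may be the wrong size, which is fine for an
// out= argument (it can be resized) but a hard error for addmv's self
// argument. So a correctly sized dummy self is passed instead; its contents
// don't matter because beta is 0.
Tensor& mv_out(const Tensor& self, const Tensor& vec, Tensor& result) {
  if (result.dim() > 1 ||
      (result.numel() != self.size(0) || result.numel() != 1)) {
    Tensor self_addmv = at::empty({self.size(0)}, self.options());
    return at::addmv_out(result, self_addmv, self, vec, 0, 1);
  }
  return at::addmv_out(result, result, self, vec, 0, 1);
}
```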

@ezyang (Contributor) left a comment
This is a great job, thank you!

@facebook-github-bot (Contributor):
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor):
@ngimel merged this pull request in 3fbca31.

krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
Summary:
Per title
I've revamped the size checks a bit to provide a better error message if `self` is of the wrong size, and added a check that the inplace variant has a correctly sized `self`.

Ref: pytorch#55070

Pull Request resolved: pytorch#55746

Reviewed By: ezyang

Differential Revision: D27782980

Pulled By: ngimel

fbshipit-source-id: 6ba949b682b8fd1170d0304da0ed348dd1a7b8c7
@bdhirsh changed the title from "port addmv to structured kernels" to "addmv: port to structured kernels" on May 24, 2021
@bdhirsh changed the title from "addmv: port to structured kernels" to "addmv: port to structured kernels, improve error checks" on May 24, 2021
facebook-github-bot pushed a commit that referenced this pull request Jul 16, 2021
…xed before (#61579)

Summary:
This PR:

1. Removes `test_out` skip: it's not needed anymore after it was fixed in #55746. This should also close #55589.
2. Removes `test_variant_consistency_eager` skip, it was added by mistake in #55771.
3. Refines the `sample_inputs_addmv` function; the updated function should now be cleaner and easier to read.

cc: mruberry

Pull Request resolved: #61579

Reviewed By: gchanan

Differential Revision: D29709674

Pulled By: mruberry

fbshipit-source-id: 9b975c024777efdd33c6b9444b0b36e0eab85c03
@ngimel ngimel deleted the ngimel/mv_structured branch December 26, 2021 06:45