[structured] Preserve computed elements from meta func to impl #61746
Conversation
🔗 Helpful links
💊 CI failures summary and remediations (as of commit 7d56ae6; more details on the Dr. CI page):
🕵️ 1 new failure recognized by patterns. The following CI failures do not appear to be due to upstream breakages.
aten/src/ATen/native/ReduceOps.cpp (Outdated)

TORCH_META_FUNC2(all, dim)(const Tensor& self, int64_t dim, bool keepdim) {
  check_allany_for_meta(*this, "all", self, dim, keepdim);
  this->dim = maybe_wrap_dim(dim, self.dim());
mmmm. Let's enforce the convention that all members have trailing underscores; it will make it more obvious what's going on here.
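For concreteness, a minimal sketch of the suggested convention (hypothetical names, not code from this PR): giving precomputed members a trailing underscore makes a line like `this->dim_ = ...` read unambiguously as a store into member state rather than something involving the local argument.

```cpp
#include <cstdint>

// Hypothetical sketch of the trailing-underscore convention being proposed.
struct PrecomputedAllState {
  int64_t dim_;  // member: trailing underscore

  void set_dim(int64_t dim) {
    // Member vs. argument is obvious at a glance: dim_ is state, dim is input.
    dim_ = dim;
  }
};
```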
Some more philosophical questions that I've wondered about for this task (they may be insoluble, just trying to put some seeds in your brain):
aten/src/ATen/TensorMeta.h (Outdated)

#define TORCH_META_FUNC2(name, overload) void structured_##name##_##overload::meta

#define TORCH_PRECOMPUTE_META_FUNC(name) structured_##name::precompute_out structured_##name::meta
#define TORCH_PRECOMPUTE_META_FUNC2(name, overload) structured_##name##_##overload::precompute_out structured_##name##_##overload::meta
I wonder if there actually is any reason to change the naming convention here, since, IIUC, precompute meta and meta are mutually exclusive (so it wouldn't be like you would have a naming conflict)
reupping
Which naming convention did I change? New macros are required because the return type of the meta function is different for kernels that have precomputed elements.
oh true. I guess we could unconditionally define structured_##name::precompute_out (maybe as return_type) and then just typedef it to void when it's not precomputed he he
After your last comment, I don't feel too strongly about this, and wouldn't block on this
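To visualize the alternative floated above, here is a hedged standalone sketch (not the actual TensorMeta.h; the alias name and the macro at the bottom are hypothetical): every structured class could expose a return-type alias that is `void` for ordinary kernels and the precompute struct for precompute-enabled ones, so a single macro could spell the return type uniformly.

```cpp
#include <cstdint>

// Ordinary structured kernel: meta returns nothing.
struct structured_plain {
  using meta_return_t = void;  // hypothetical alias name
  meta_return_t meta(int64_t /*dim*/) {}
};

// Precompute-enabled kernel: meta returns the precomputed elements.
struct structured_precomputed {
  struct precompute_out {
    int64_t kH;
    int64_t kW;
  };
  using meta_return_t = precompute_out;

  meta_return_t meta(int64_t kH, int64_t kW) {
    precompute_out out;
    out.kH = kH;
    out.kW = kW;
    return out;
  }
};

// A single hypothetical macro could then name the return type uniformly:
//   #define TORCH_META_FUNC_UNIFORM(name) \
//     structured_##name::meta_return_t structured_##name::meta
```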
}

return {
  kH=kH,
huh, the dots are optional? Should we have a preference for whether people specify the dot or not?
No, I think what's happening is that the kH declared above is being assigned to itself and then used to make the struct. The dots are definitely required. But on that note, I can't reproduce the error that is supposed to be emitted with this syntax: https://godbolt.org/z/Y6dxh1Eqd
I remember clearly that we did manage to write code that produces that error during a VC, but I can't seem to do it again. I tried gcc and clang both.
blegh. Well, I guess we'll just have to write a little lint rule for this. Maybe clang-tidy can catch it.
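For reference, a standalone reduction of the pitfall discussed above (my own sketch, not from the PR): without the leading dot, `kH = kH` is an ordinary self-assignment expression whose value initializes the struct positionally, so the names written in the braces have no effect and a mismatch in ordering would go unnoticed.

```cpp
#include <cstdint>
#include <iostream>

struct Precomputed {
  int64_t kH;
  int64_t kW;
};

Precomputed make(int64_t kH, int64_t kW) {
  // Intended: designated initializers, i.e. {.kH = kH, .kW = kW}.
  // As written, each `name = name` is a self-assignment whose value fills the
  // next field *positionally*; the names on the left do not pick fields.
  return {
      kH = kH,
      kW = kW,
  };
}

int main() {
  Precomputed p = make(3, 5);
  // Prints "3 5", but only because the argument order happens to match the
  // field order; clang's -Wself-assign (if enabled) may be the only hint.
  std::cout << p.kH << " " << p.kW << "\n";
}
```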
Okay, so I tried the fluent approach with a templated class. It went pretty smoothly for the most part, but I commented on the diff with some questions I have. Here's an example of what the generated code looks like (I left the other setters out, but I think you get the idea).
| meta_return = "void" | ||
| precomputed = g.out.precomputed if g.structured else None | ||
|
|
||
| if precomputed: |
The code added below looks really ugly:
struct TORCH_API structured_avg_pool2d : public at::impl::MetaBase {
template <bool A = false, bool B = false, bool C = false, bool D = false, bool E = false, bool F = false>
struct TORCH_API precompute_out {
precompute_out<true, B, C, D, E, F> set_kH(int64_t value) {
TORCH_INTERNAL_ASSERT(A == false);
precompute_out<true, B, C, D, E, F> ret;
ret.kH = value;
ret.kW = this->kW;
ret.dH = this->dH;
ret.dW = this->dW;
ret.padH = this->padH;
ret.padW = this->padW;
return ret;
}
Is there a way to indent these blocks properly (other than adding tab characters manually)?
no, sorry :( I kind of want to run all of the generated code through an autoformatter but never have gotten around to it.
@SplitInfinity has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Internal tests on D30407831 look good, so that means Windows is happy.
What do the error messages look like when you miss a parameter?
I removed one of the setters. Hmm, if I didn't know what I know about the internals of structured kernels, I wouldn't know what to do here without digging into the autogenerated code. The codegen (purposely) writes out the return type of the meta as
Actually, it looks like this doesn't work, because we need to set the return type of the function directly.
else:
    construction_stmts.append(f"ret.{elem.name} = this->{elem.name};")

construction_stmts.append("return ret;")
Do you know if the compiler is smart enough to optimize all of the repeated reconstruction of the precompute struct here?
No; how would I go about determining that?
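One low-effort way to check (my suggestion, not from the thread): paste a reduced version of the generated pattern into godbolt, or compile it locally with `-O2 -S`, and see whether the chained setter copies collapse into direct stores.

```cpp
#include <cstdint>

// Reduced version of the generated fluent-setter pattern: each setter copies
// the whole struct into a new instantiation with one more flag set to true.
template <bool KH = false, bool KW = false>
struct PrecomputeOut {
  int64_t kH = 0;
  int64_t kW = 0;

  PrecomputeOut<true, KW> set_kH(int64_t value) {
    PrecomputeOut<true, KW> ret;
    ret.kH = value;
    ret.kW = this->kW;
    return ret;
  }

  PrecomputeOut<KH, true> set_kW(int64_t value) {
    PrecomputeOut<KH, true> ret;
    ret.kH = this->kH;
    ret.kW = value;
    return ret;
  }
};

// Inspect the assembly of this function at -O2: if the optimizer sees through
// the intermediate copies, it reduces to two stores into the return slot.
PrecomputeOut<true, true> build() {
  return PrecomputeOut<>().set_kH(3).set_kW(5);
}
```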
# Generate a string containing declarations of all precomputed elements.
precomputed_elements_with_cpp_types = [
    structured.argument_type(elem, binds=elem.name)
If someone ever precomputes tensors (or other non-value types), we will have to be a bit careful here, because reference types (which can be produced by argument_type) are not assignable, so this will choke to death. Technically, we probably need another api module for this, and use translate to get from the argument type to the struct type, but I guess for now this is OK.
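A minimal illustration of the failure mode described above (a sketch using a plain reference in place of `const Tensor&`): a reference member cannot be default-constructed or assigned after the fact, so the generated `ret.x = this->x;` pattern would stop compiling.

```cpp
#include <cstdint>

// Value-typed precomputed element: the generated pattern works fine.
struct ByValue {
  int64_t kH;
};

// Reference-typed precomputed element, analogous to precomputing a Tensor
// and storing it as `const Tensor&`.
struct ByReference {
  const int64_t& kH;
};

int64_t three = 3;

ByValue make_by_value() {
  ByValue ret;     // default-constructible
  ret.kH = three;  // assignable member
  return ret;
}

ByReference make_by_reference() {
  // The generated pattern breaks down here:
  //   ByReference ret;   // error: a reference member must be initialized
  //   ret.kH = three;    // error: cannot assign through a const reference
  // The reference has to be bound at construction instead:
  return ByReference{three};
}
```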
You already have my approval, but the modified API looks great!
…impl" **Summary** This commit introduces a new feature for structured kernels that allows kernels to declare quantities as "precomputed" in `native_functions.yaml`, compute them once in the `meta` function and reuse them again in the `impl`. The names and types of these quantities are used to generate code for a struct containing them that the `meta` function must return. In the case of a handful of surveyed kernels (`all,`, `any`, `avg_pool2d`), these quantities that are used both in the `meta` and `impl` have the same meaning as certain kernel arguments and in fact supersede them. Accordingly, the correspondence between a kernel argument and the precomputed elements that supersede it is also captured in `native_functions.yaml`. This information is used to unpack the struct returned by `meta` and pass its contents correctly to the `impl` function. The primary goal is to avoid recompute and enhance developer experience (e.g. sometimes people can forget to compute these elements while porting a kernel). Differential Revision: [D30407831](https://our.internmc.facebook.com/intern/diff/D30407831) [ghstack-poisoned]
**Summary** This commit introduces a new feature for structured kernels that allows kernels to declare quantities as "precomputed" in `native_functions.yaml`, compute them once in the `meta` function and reuse them again in the `impl`. The names and types of these quantities are used to generate code for a struct containing them that the `meta` function must return. In the case of a handful of surveyed kernels (`all,`, `any`, `avg_pool2d`), these quantities that are used both in the `meta` and `impl` have the same meaning as certain kernel arguments and in fact supersede them. Accordingly, the correspondence between a kernel argument and the precomputed elements that supersede it is also captured in `native_functions.yaml`. This information is used to unpack the struct returned by `meta` and pass its contents correctly to the `impl` function. The primary goal is to avoid recompute and enhance developer experience (e.g. sometimes people can forget to compute these elements while porting a kernel). ghstack-source-id: 1a553d2 Pull Request resolved: #61746
|
@SplitInfinity has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
…impl" **Summary** This commit introduces a new feature for structured kernels that allows kernels to declare quantities as "precomputed" in `native_functions.yaml`, compute them once in the `meta` function and reuse them again in the `impl`. The names and types of these quantities are used to generate code for a struct containing them that the `meta` function must return. In the case of a handful of surveyed kernels (`all,`, `any`, `avg_pool2d`), these quantities that are used both in the `meta` and `impl` have the same meaning as certain kernel arguments and in fact supersede them. Accordingly, the correspondence between a kernel argument and the precomputed elements that supersede it is also captured in `native_functions.yaml`. This information is used to unpack the struct returned by `meta` and pass its contents correctly to the `impl` function. The primary goal is to avoid recompute and enhance developer experience (e.g. sometimes people can forget to compute these elements while porting a kernel). Differential Revision: [D30407831](https://our.internmc.facebook.com/intern/diff/D30407831) [ghstack-poisoned]
**Summary** This commit introduces a new feature for structured kernels that allows kernels to declare quantities as "precomputed" in `native_functions.yaml`, compute them once in the `meta` function and reuse them again in the `impl`. The names and types of these quantities are used to generate code for a struct containing them that the `meta` function must return. In the case of a handful of surveyed kernels (`all,`, `any`, `avg_pool2d`), these quantities that are used both in the `meta` and `impl` have the same meaning as certain kernel arguments and in fact supersede them. Accordingly, the correspondence between a kernel argument and the precomputed elements that supersede it is also captured in `native_functions.yaml`. This information is used to unpack the struct returned by `meta` and pass its contents correctly to the `impl` function. The primary goal is to avoid recompute and enhance developer experience (e.g. sometimes people can forget to compute these elements while porting a kernel). ghstack-source-id: 6159291 Pull Request resolved: #61746
|
@SplitInfinity has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
|
@SplitInfinity merged this pull request in 968d7ee.
Stack from ghstack:

Summary

This commit introduces a new feature for structured kernels that allows kernels to declare quantities as "precomputed" in `native_functions.yaml`, compute them once in the `meta` function, and reuse them in the `impl`. The names and types of these quantities are used to generate code for a struct containing them that the `meta` function must return. In a handful of surveyed kernels (`all`, `any`, `avg_pool2d`), the quantities used in both the `meta` and the `impl` have the same meaning as certain kernel arguments and in fact supersede them. Accordingly, the correspondence between a kernel argument and the precomputed elements that supersede it is also captured in `native_functions.yaml`. This information is used to unpack the struct returned by `meta` and pass its contents correctly to the `impl` function.

The primary goal is to avoid recomputation and to improve the developer experience (e.g. people can sometimes forget to compute these elements while porting a kernel).
Differential Revision: D30407831
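To tie the description together, here is a hedged end-to-end mock of the flow (standalone C++, not the generated code; the exact `native_functions.yaml` syntax is intentionally not reproduced): `meta` computes the elements once and returns them in a struct, and generated glue unpacks that struct and forwards the values to `impl` in place of the original arguments.

```cpp
#include <cstdint>

// Precomputed elements that supersede the original kernel_size argument.
struct Precomputed {
  int64_t kH;
  int64_t kW;
};

// "meta": computes the precomputed elements exactly once (the shape/dtype
// checks a real meta function performs are omitted here).
Precomputed meta_avg_pool2d(int64_t kernel_size_h, int64_t kernel_size_w) {
  Precomputed p;
  p.kH = kernel_size_h;
  p.kW = kernel_size_w;
  return p;
}

// "impl": receives the precomputed values instead of recomputing them.
void impl_avg_pool2d(int64_t kH, int64_t kW) {
  (void)kH;
  (void)kW;
}

// Generated-style glue: run meta, unpack the struct, forward to impl.
void avg_pool2d_out(int64_t kernel_size_h, int64_t kernel_size_w) {
  Precomputed p = meta_avg_pool2d(kernel_size_h, kernel_size_w);
  impl_avg_pool2d(p.kH, p.kW);
}
```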