-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Expose functions used in custom backend in torch_python dll #148213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148213
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit e0370b0 with merge base 4216478 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
7f63068 to
b463f81
Compare
|
@pytorchbot label "topic: not user facing" |
| TORCH_PYTHON_API c10::DispatchKey get_default_dispatch_key(); | ||
| TORCH_PYTHON_API at::Device get_default_device(); | ||
|
|
||
| // Gets the ScalarType for the default tensor type. | ||
| at::ScalarType get_default_scalar_type(); | ||
| TORCH_PYTHON_API at::ScalarType get_default_scalar_type(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should test if this change affects torch_npu cc: @FFFrog
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it will just bring some symbols back. Those symbols used to exist but got removed ~3 months ago.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No effect, but adding TORCH_PYTHON_API for private functions is a bit weird to me :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. So those are moved to public as discussed in #148208.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's wrong to say that private class method should be accessible from the extensions. (The rest of the changes are fine)
torch/csrc/utils/python_arg_parser.h
Outdated
| TORCH_PYTHON_API at::Tensor tensor_slow(int i); | ||
| TORCH_PYTHON_API at::Scalar scalar_slow(int i); | ||
| TORCH_PYTHON_API at::Scalar scalar_slow(PyObject* arg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels wrong: we should not make private functions part of the API...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my PR message on the top. Those are called in (performance critical) inline functions in .h file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO two potential solutions: Add TORCH_PYTHON_API to the entire class, which will make all its methods symbols visible or make those methods public, rather than private to the class implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try the 2nd solution, which looks more aligned with the goal of reducing API surace.
| inline bool isNone(int i); | ||
| inline std::optional<c10::DispatchKeySet> toDispatchKeySetOptional(int i); | ||
|
|
||
| private: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not part of the functions that used to be public but moved to private 3 months ago? Why are you adding them here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you now who moved them to private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "moved to private" I mention here is the PR from 3 months ago hiding the symbols. Not moving these to be on this class "private:" tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what do you mean by public. Do you mean public C++ member function or symbols available in DLL? Our custom backend links to these functions when building with PyTorch. Those functions are accessible in PyTorch DLL ~ 1 year ago, so they were exposed in DLL before.
Q: Why are those APIs are tagged with TORCH_PYTHON_API?
A: These private functions are called inline functions in .h file. Whoever includes that .h file and indirectly calls these private functions will need these symbols when linking to torch_python DLL.
Q: Why is C++ private keyword removed?
A: It's a consensus in #148208. We can change tensor_slow back to private C++ member function, but you will see
...
private:
TORCH_PYTHON_API ... tensor_slow(...)
which is strange because private function becomes an API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, just curious, would TORCH_PYTHON_API the entire class will work? (That would transitively apply to all methods?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tagging TORCH_PYTHON_API at structure level works for my case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's land it this way then
Fix indent
e8faba8 to
e0370b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for the updates
|
@pytorchbot merge -f "Everything looks green" |
Merge startedYour change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Fixes #148208. There are solutions for exposing symbols implicitly from inline functions (i.e., inline function A calls non-inline function B in foo.h. Code includes foo.h has to see the symbol B in DLL).
Solution 1: tag the entire struct where the inline functions are defined as member functions with TORCH_PYTHON_API --- this PR does this for python_arg_parser.h. An alternative solution exists but will slow down dispatching a lot --- drop inline keyword and move implementation to .cc file.
Solution 2: tag individual functions with TORCH_PYTHON_API. This PR does this for python_tensor.h.
Related discussion about hiding torch_python symbols: #142214