-
Notifications
You must be signed in to change notification settings - Fork 25.7k
Hide torch_python symbols #142214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hide torch_python symbols #142214
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/142214
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (3 Unrelated Failures)As of commit 20fae6f with merge base e0bdae7 ( BROKEN TRUNK - The following jobs failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
✅ Deploy Preview for chimerical-cranachan-793287 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
8b584ca
to
525351e
Compare
525351e
to
acc57b4
Compare
What's the point of doing this and why weren't we hiding symbols in it before? Does this guard against any invalidly compiled PyTorch extensions silently failing? |
The torch_python exposed all symbols before. Recently it was wrapped into torch_compile_options with default visibility. This PR tries to mark public symbols explicitly. We should have a clear definition of public python API. Some extensions using private symbols will see linker failures. |
I'm not sure how to test this :/ |
It is #136743 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of risky but I guess we can give it a try.
if(NOT WIN32) | ||
target_compile_options(torch_python PRIVATE | ||
$<$<COMPILE_LANGUAGE:CXX>: -fvisibility=default>) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
torch_compile_options uses private visibility for torch targets. Line 313 is from merge conflicting that is now solved.
}; | ||
|
||
extern PyTypeObject THPLayoutType; | ||
TORCH_PYTHON_API extern PyTypeObject THPLayoutType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you decide which 3 APIs to start labeling with TORCH_PYTHON_API
?
Will this change break extensions that use other APIs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I built it locally and fixed all import errors. The 3 APIs were found during that time.
Curious what is the plan to test this to make sure we didn't miss others? Did you try on a set of third party extensions? |
Tried it on hugging face libraries and torchvison, it works fine. |
@pytorchmergebot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: trunk / macos-py3-arm64 / test (default, 1, 3, macos-m1-stable) Details for Dev Infra teamRaised by workflow job |
c8f3628
to
20fae6f
Compare
@pytorchmergebot merge -i |
Merge startedYour change will be merged while ignoring the following 3 checks: periodic / linux-focal-cuda12.1-py3-gcc9-slow-gradcheck / test (default, 2, 8, linux.g5.4xlarge.nvidia.gpu, module:slowgradcheck), periodic / linux-focal-cuda11.8-py3.9-gcc9 / test (multigpu, 1, 1, lf.linux.g5.12xlarge.nvidia.gpu, oncall:distributed), periodic / linux-focal-cuda11.8-py3.10-gcc9-debug / test (default, 3, 5, lf.linux.4xlarge.nvidia.gpu, oncall:debug-build) Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
I actually would like to argue to revert this one again. I am afraid there are a lot more of these APIs that would need to be marked and as you mentioned in the other PR, we don't have a good way right now to make sure we caught all of them. |
add_library(torch_python SHARED ${TORCH_PYTHON_SRCS}) | ||
torch_compile_options(torch_python) # see cmake/public/utils.cmake | ||
if(NOT WIN32) | ||
if(APPLE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this check here? Imo symbol visibility on Linux and Mac should be identical, shouldn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apple has some hard to find linking errors that lead to test failures
Wow, I'm surprised we didn't do it before. I'm very supportive of the change, but also a bit scared of the things it could break. (Though, why keep default visibility on Mac?)
and https://github.com/pytorch/pytorch/commits/release/2.6/torch/CMakeLists.txt |
I think this change makes sense directionally, but making a public symbol private is a breaking change so we should be careful about it. I think it would be good to revert and come up with a plan for how to roll it out. |
Another way is to gradually discover those symbols and mark them public. |
The problem with this approach @cyyever is that users based on the release will have a 3months delay where they're unable to use PyTorch until the next release comes out. Which is really not acceptable. |
@albanD Fully understand, seems that it should be reverted. |
@albanD @malfet I confirm this is not part of the release 2.6: |
Fixes #148208. There are solutions for exposing symbols implicitly from inline functions (i.e., inline function A calls non-inline function B in foo.h. Code includes foo.h has to see the symbol B in DLL). Solution 1: tag the entire struct where the inline functions are defined as member functions with TORCH_PYTHON_API --- this PR does this for python_arg_parser.h. An alternative solution exists but will slow down dispatching a lot --- drop inline keyword and move implementation to .cc file. Solution 2: tag individual functions with TORCH_PYTHON_API. This PR does this for python_tensor.h. Related discussion about hiding torch_python symbols: #142214 Pull Request resolved: #148213 Approved by: https://github.com/malfet
Change symbols in torch_python to invisible by default on platforms other than Apple.
cc @albanD