KEMBAR78
Add TORCH_API annotations to c10d by lw · Pull Request #59695 · pytorch/pytorch · GitHub
Skip to content

Conversation

@lw
Copy link
Contributor

@lw lw commented Jun 9, 2021

Stack from ghstack:

c10d apparently doesn't have hidden visibility by default, but libtorch has, so in order to prepare c10d to be merged into libtorch we need to add these annotations. I've done this "as needed", seeing what would fail and fixing it. Windows is much more picky about it than Linux, hence there's more annotations in the Windows-only files (e.g., Gloo, as compared to NCCL).

Differential Revision: D28987578

c10d apparently doesn't have hidden visibility by default, but libtorch has, so in order to prepare c10d to be merged into libtorch we need to add these annotations. I've done this "as needed", seeing what would fail and fixing it. Windows is much more picky about it than Linux, hence there's more annotations in the Windows-only files (e.g., Gloo, as compared to NCCL).

Differential Revision: [D28987578](https://our.internmc.facebook.com/intern/diff/D28987578/)

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Jun 9, 2021

💊 CI failures summary and remediations

As of commit 8383207 (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_windows_vs2019_py36_cuda11.1_build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

FAILED: bin/TCPStoreTest.exe
c10d.lib(ProcessGroupWrapper.cpp.obj) : error LNK2019: unresolved external symbol "struct at::indexing::EllipsisIndexType const at::indexing::Ellipsis" (?Ellipsis@indexing@at@@3UEllipsisIndexType@12@B) referenced in function "public: __cdecl at::indexing::TensorIndex::TensorIndex(char const *)" (??0TensorIndex@indexing@at@@QEAA@PEBD@Z)
c10d.lib(Utils.cpp.obj) : error LNK2001: unresolved external symbol "struct at::indexing::EllipsisIndexType const at::indexing::Ellipsis" (?Ellipsis@indexing@at@@3UEllipsisIndexType@12@B)
c10d.lib(ProcessGroup.cpp.obj) : error LNK2001: unresolved external symbol "struct at::indexing::EllipsisIndexType const at::indexing::Ellipsis" (?Ellipsis@indexing@at@@3UEllipsisIndexType@12@B)
c10d.lib(ProcessGroupGloo.cpp.obj) : error LNK2001: unresolved external symbol "struct at::indexing::EllipsisIndexType const at::indexing::Ellipsis" (?Ellipsis@indexing@at@@3UEllipsisIndexType@12@B)
bin\FileStoreTest.exe : fatal error LNK1120: 2 unresolved externals
[5560/5664] C:\Users\circleci\project\build\win_tmp\bin\sccache-cl.exe   /TP -DBUILDING_TESTS -DFMT_HEADER_ONLY=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTHP_BUILD_MAIN_LIB -DTH_BLAS_MKL -DUSE_C10D -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_CUDNN -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -Dtorch_python_EXPORTS -Iaten\src -I..\aten\src -I. -I..\ -I..\cmake\..\third_party\benchmark\include -I..\cmake\..\third_party\cudnn_frontend\include -Icaffe2\contrib\aten -I..\third_party\onnx -Ithird_party\onnx -I..\third_party\foxi -Ithird_party\foxi -I..\torch\.. -I..\torch\..\aten\src -I..\torch\..\aten\src\TH -Icaffe2\aten\src -Ithird_party -I..\torch\..\third_party\valgrind-headers -I..\torch\..\third_party\gloo -I..\torch\..\third_party\onnx -I..\torch\csrc -I..\torch\csrc\api\include -I..\torch\lib -I..\torch\lib\libshm_windows -I..\torch\csrc\api -I..\c10\.. -Ithird_party\ideep\mkl-dnn\include -I..\third_party\ideep\mkl-dnn\src\..\include -I..\c10\cuda\..\.. -I..\third_party\fmt\include -I..\torch\lib\c10d\.. -Ithird_party\gloo -I..\cmake\..\third_party\gloo -I..\cmake\..\third_party\googletest\googlemock\include -I..\cmake\..\third_party\googletest\googletest\include -I..\third_party\protobuf\src -Iwin_tmp\mkl\include -I..\third_party\XNNPACK\include -I..\third_party -I..\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -I..\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -Iwin_tmp\magma\include -I..\third_party\ideep\mkl-dnn\include -I..\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/Users/circleci/project/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DHAVE_AVX_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\jit\passes\onnx\peephole.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c ..\torch\csrc\jit\passes\onnx\peephole.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29337 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

[5561/5664] cmd.exe /C "cd . && C:\Jenkins\Miniconda3\Library\bin\cmake.exe -E vs_link_exe --intdir=caffe2\lib_c10d\test\CMakeFiles\TCPStoreTest.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe --manifests  -- C:\PROGRA~2\MICROS~2\2019\BUILDT~1\VC\Tools\MSVC\1428~1.293\bin\Hostx64\x64\link.exe  caffe2\lib_c10d\test\CMakeFiles\TCPStoreTest.dir\TCPStoreTest.cpp.obj  /out:bin\TCPStoreTest.exe /implib:lib\TCPStoreTest.lib /pdb:bin\TCPStoreTest.pdb /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /INCREMENTAL:NO /subsystem:console  lib\c10d.lib  lib\gtest_main.lib  lib\torch.lib  lib\torch_cuda.lib  lib\torch_cuda_cu.lib  lib\torch_cuda_cpp.lib  lib\torch_cpu.lib  lib\libprotobuf.lib  win_tmp\mkl\lib\mkl_intel_lp64.lib  win_tmp\mkl\lib\mkl_intel_thread.lib  win_tmp\mkl\lib\mkl_core.lib  win_tmp\mkl\lib\libiomp5md.lib  lib\dnnl.lib  -INCLUDE:?warp_size@cuda@at@@YAHXZ  lib\c10_cuda.lib  lib\c10.lib  "C:\Program Files\NVIDIA Corporation\NvToolsExt\lib\x64\nvToolsExt64_1.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cufft.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\curand.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cublas.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudnn.lib"  -INCLUDE:?searchsorted_cuda@native@at@@YA?AVTensor@2@AEBV32@0_N1@Z  lib\gloo_cuda.lib  lib\gloo.lib  C:\Jenkins\Miniconda3\Library\lib\uv.lib  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudart.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudart_static.lib"  lib\gtest.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
FAILED: bin/TCPStoreTest.exe 
cmd.exe /C "cd . && C:\Jenkins\Miniconda3\Library\bin\cmake.exe -E vs_link_exe --intdir=caffe2\lib_c10d\test\CMakeFiles\TCPStoreTest.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe --manifests  -- C:\PROGRA~2\MICROS~2\2019\BUILDT~1\VC\Tools\MSVC\1428~1.293\bin\Hostx64\x64\link.exe  caffe2\lib_c10d\test\CMakeFiles\TCPStoreTest.dir\TCPStoreTest.cpp.obj  /out:bin\TCPStoreTest.exe /implib:lib\TCPStoreTest.lib /pdb:bin\TCPStoreTest.pdb /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /INCREMENTAL:NO /subsystem:console  lib\c10d.lib  lib\gtest_main.lib  lib\torch.lib  lib\torch_cuda.lib  lib\torch_cuda_cu.lib  lib\torch_cuda_cpp.lib  lib\torch_cpu.lib  lib\libprotobuf.lib  win_tmp\mkl\lib\mkl_intel_lp64.lib  win_tmp\mkl\lib\mkl_intel_thread.lib  win_tmp\mkl\lib\mkl_core.lib  win_tmp\mkl\lib\libiomp5md.lib  lib\dnnl.lib  -INCLUDE:?warp_size@cuda@at@@YAHXZ  lib\c10_cuda.lib  lib\c10.lib  "C:\Program Files\NVIDIA Corporation\NvToolsExt\lib\x64\nvToolsExt64_1.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cufft.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\curand.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cublas.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudnn.lib"  -INCLUDE:?searchsorted_cuda@native@at@@YA?AVTensor@2@AEBV32@0_N1@Z  lib\gloo_cuda.lib  lib\gloo.lib  C:\Jenkins\Miniconda3\Library\lib\uv.lib  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudart.lib"  "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudart_static.lib"  lib\gtest.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
LINK: command "C:\PROGRA~2\MICROS~2\2019\BUILDT~1\VC\Tools\MSVC\1428~1.293\bin\Hostx64\x64\link.exe caffe2\lib_c10d\test\CMakeFiles\TCPStoreTest.dir\TCPStoreTest.cpp.obj /out:bin\TCPStoreTest.exe /implib:lib\TCPStoreTest.lib /pdb:bin\TCPStoreTest.pdb /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /INCREMENTAL:NO /subsystem:console lib\c10d.lib lib\gtest_main.lib lib\torch.lib lib\torch_cuda.lib lib\torch_cuda_cu.lib lib\torch_cuda_cpp.lib lib\torch_cpu.lib lib\libprotobuf.lib win_tmp\mkl\lib\mkl_intel_lp64.lib win_tmp\mkl\lib\mkl_intel_thread.lib win_tmp\mkl\lib\mkl_core.lib win_tmp\mkl\lib\libiomp5md.lib lib\dnnl.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ lib\c10_cuda.lib lib\c10.lib C:\Program Files\NVIDIA Corporation\NvToolsExt\lib\x64\nvToolsExt64_1.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cufft.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\curand.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cublas.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudnn.lib -INCLUDE:?searchsorted_cuda@native@at@@YA?AVTensor@2@AEBV32@0_N1@Z lib\gloo_cuda.lib lib\gloo.lib C:\Jenkins\Miniconda3\Library\lib\uv.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudart.lib C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib\x64\cudart_static.lib lib\gtest.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:bin\TCPStoreTest.exe.manifest" failed (exit code 1120) with the following output:
Microsoft (R) Incremental Linker Version 14.28.29337.0
Copyright (C) Microsoft Corporation.  All rights reserved.

torch_cpu.lib(torch_cpu.dll) : error LNK2005: "public: __cdecl at::Tensor::Tensor(class at::Tensor &&)" (??0Tensor@at@@QEAA@$$QEAV01@@Z) already defined in c10d.lib(ProcessGroupWrapper.cpp.obj)
torch_cpu.lib(torch_cpu.dll) : error LNK2005: "public: __cdecl c10::IValue::IValue(struct c10::IValue const &)" (??0IValue@c10@@QEAA@AEBU01@@Z) already defined in c10d.lib(ProcessGroupWrapper.cpp.obj)
torch_cpu.lib(torch_cpu.dll) : error LNK2005: "public: __cdecl c10::IValue::IValue(struct c10::IValue &&)" (??0IValue@c10@@QEAA@$$QEAU01@@Z) already defined in c10d.lib(ProcessGroupWrapper.cpp.obj)
torch_cpu.lib(torch_cpu.dll) : error LNK2005: "public: __cdecl c10::IValue::~IValue(void)" (??1IValue@c10@@QEAA@XZ) already defined in c10d.lib(ProcessGroupWrapper.cpp.obj)
torch_cpu.lib(torch_cpu.dll) : error LNK2005: "public: struct c10::IValue & __cdecl c10::IValue::operator=(struct c10::IValue const &)& " (??4IValue@c10@@QEGAAAEAU01@AEBU01@@Z) already defined in c10d.lib(ProcessGroupWrapper.cpp.obj)

1 failure not recognized by patterns:

Job Step Action
GitHub Actions Windows CI (pytorch-win-vs2019-cpu-py3) / render_test_results Download PyTorch Test Reports 🔁 rerun

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

c10d apparently doesn't have hidden visibility by default, but libtorch has, so in order to prepare c10d to be merged into libtorch we need to add these annotations. I've done this "as needed", seeing what would fail and fixing it. Windows is much more picky about it than Linux, hence there's more annotations in the Windows-only files (e.g., Gloo, as compared to NCCL).

Differential Revision: [D28987578](https://our.internmc.facebook.com/intern/diff/D28987578/)

[ghstack-poisoned]
Copy link
Contributor

@cbalioglu cbalioglu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

lw added 2 commits June 10, 2021 02:01
c10d apparently doesn't have hidden visibility by default, but libtorch has, so in order to prepare c10d to be merged into libtorch we need to add these annotations. I've done this "as needed", seeing what would fail and fixing it. Windows is much more picky about it than Linux, hence there's more annotations in the Windows-only files (e.g., Gloo, as compared to NCCL).

Differential Revision: [D28987578](https://our.internmc.facebook.com/intern/diff/D28987578/)

[ghstack-poisoned]
c10d apparently doesn't have hidden visibility by default, but libtorch has, so in order to prepare c10d to be merged into libtorch we need to add these annotations. I've done this "as needed", seeing what would fail and fixing it. Windows is much more picky about it than Linux, hence there's more annotations in the Windows-only files (e.g., Gloo, as compared to NCCL).

Differential Revision: [D28987578](https://our.internmc.facebook.com/intern/diff/D28987578/)

[ghstack-poisoned]
@lw
Copy link
Contributor Author

lw commented Jun 10, 2021

It actually turns out that it's complicated to use TORCH_API while building a library that is not libtorch: by default that macro would translate to dllimport, which is not what we need because in fact here we need to export those symbols. But if we change that macro to become dllexport then it would look like c10d is (re-)exporting all the symbols of libtorch and thus create conflicts.

The proper solution would be to introduce a temporary C10D_API macro of sorts. However, instead, I'm just going to merge this PR into #59563. This will solve the issue.

@lw lw closed this Jun 10, 2021
@facebook-github-bot facebook-github-bot deleted the gh/lw/212/head branch July 10, 2021 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed oncall: distributed Add this issue/PR to distributed oncall triage queue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants