KEMBAR78
Centralize setting messageId in RequestCallback by lw · Pull Request #57848 · pytorch/pytorch · GitHub
Skip to content

Conversation

@lw
Copy link
Contributor

@lw lw commented May 7, 2021

Stack from ghstack:

This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.

Differential Revision: D28224477

This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.

Differential Revision: [D28224477](https://our.internmc.facebook.com/intern/diff/D28224477/)

[ghstack-poisoned]
@facebook-github-bot facebook-github-bot added oncall: distributed Add this issue/PR to distributed oncall triage queue cla signed labels May 7, 2021
@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented May 7, 2021

💊 CI failures summary and remediations

As of commit d564aa9 (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_windows_vs2019_py36_cuda11.1_build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

FAILED: caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/shared/cudnn.cpp.obj
caused by: Failed to read response header
caused by: failed to fill whole buffer
[5617/5654] C:\Users\circleci\project\build\win_tmp\bin\sccache-cl.exe   /TP -DBUILDING_TESTS -DFMT_HEADER_ONLY=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTHP_BUILD_MAIN_LIB -DTH_BLAS_MKL -DUSE_C10D -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_CUDNN -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -Dtorch_python_EXPORTS -Iaten\src -I..\aten\src -I. -I..\ -I..\cmake\..\third_party\benchmark\include -I..\cmake\..\third_party\cudnn_frontend\include -Icaffe2\contrib\aten -I..\third_party\onnx -Ithird_party\onnx -I..\third_party\foxi -Ithird_party\foxi -I..\torch\.. -I..\torch\..\aten\src -I..\torch\..\aten\src\TH -Icaffe2\aten\src -Ithird_party -I..\torch\..\third_party\valgrind-headers -I..\torch\..\third_party\gloo -I..\torch\..\third_party\onnx -I..\torch\csrc -I..\torch\csrc\api\include -I..\torch\lib -I..\torch\lib\libshm_windows -I..\torch\csrc\api -I..\c10\.. -Ithird_party\ideep\mkl-dnn\include -I..\third_party\ideep\mkl-dnn\src\..\include -I..\c10\cuda\..\.. -I..\third_party\fmt\include -I..\torch\lib\c10d\.. -Ithird_party\gloo -I..\cmake\..\third_party\gloo -I..\cmake\..\third_party\googletest\googlemock\include -I..\cmake\..\third_party\googletest\googletest\include -I..\third_party\protobuf\src -Iwin_tmp\mkl\include -I..\third_party\XNNPACK\include -I..\third_party -I..\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -I..\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -Iwin_tmp\magma\include -I..\third_party\ideep\mkl-dnn\include -I..\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/Users/circleci/project/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DHAVE_AVX_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\cuda\python_comm.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c ..\torch\csrc\cuda\python_comm.cpp
FAILED: caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/python_comm.cpp.obj 
C:\Users\circleci\project\build\win_tmp\bin\sccache-cl.exe   /TP -DBUILDING_TESTS -DFMT_HEADER_ONLY=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTHP_BUILD_MAIN_LIB -DTH_BLAS_MKL -DUSE_C10D -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_CUDNN -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -Dtorch_python_EXPORTS -Iaten\src -I..\aten\src -I. -I..\ -I..\cmake\..\third_party\benchmark\include -I..\cmake\..\third_party\cudnn_frontend\include -Icaffe2\contrib\aten -I..\third_party\onnx -Ithird_party\onnx -I..\third_party\foxi -Ithird_party\foxi -I..\torch\.. -I..\torch\..\aten\src -I..\torch\..\aten\src\TH -Icaffe2\aten\src -Ithird_party -I..\torch\..\third_party\valgrind-headers -I..\torch\..\third_party\gloo -I..\torch\..\third_party\onnx -I..\torch\csrc -I..\torch\csrc\api\include -I..\torch\lib -I..\torch\lib\libshm_windows -I..\torch\csrc\api -I..\c10\.. -Ithird_party\ideep\mkl-dnn\include -I..\third_party\ideep\mkl-dnn\src\..\include -I..\c10\cuda\..\.. -I..\third_party\fmt\include -I..\torch\lib\c10d\.. -Ithird_party\gloo -I..\cmake\..\third_party\gloo -I..\cmake\..\third_party\googletest\googlemock\include -I..\cmake\..\third_party\googletest\googletest\include -I..\third_party\protobuf\src -Iwin_tmp\mkl\include -I..\third_party\XNNPACK\include -I..\third_party -I..\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -I..\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -Iwin_tmp\magma\include -I..\third_party\ideep\mkl-dnn\include -I..\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/Users/circleci/project/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DHAVE_AVX_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\cuda\python_comm.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c ..\torch\csrc\cuda\python_comm.cpp
error: failed to execute compile
caused by: error reading compile response from server
caused by: Failed to read response header
caused by: failed to fill whole buffer
[5618/5654] C:\Users\circleci\project\build\win_tmp\bin\sccache-cl.exe   /TP -DBUILDING_TESTS -DFMT_HEADER_ONLY=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTHP_BUILD_MAIN_LIB -DTH_BLAS_MKL -DUSE_C10D -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_CUDNN -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -Dtorch_python_EXPORTS -Iaten\src -I..\aten\src -I. -I..\ -I..\cmake\..\third_party\benchmark\include -I..\cmake\..\third_party\cudnn_frontend\include -Icaffe2\contrib\aten -I..\third_party\onnx -Ithird_party\onnx -I..\third_party\foxi -Ithird_party\foxi -I..\torch\.. -I..\torch\..\aten\src -I..\torch\..\aten\src\TH -Icaffe2\aten\src -Ithird_party -I..\torch\..\third_party\valgrind-headers -I..\torch\..\third_party\gloo -I..\torch\..\third_party\onnx -I..\torch\csrc -I..\torch\csrc\api\include -I..\torch\lib -I..\torch\lib\libshm_windows -I..\torch\csrc\api -I..\c10\.. -Ithird_party\ideep\mkl-dnn\include -I..\third_party\ideep\mkl-dnn\src\..\include -I..\c10\cuda\..\.. -I..\third_party\fmt\include -I..\torch\lib\c10d\.. -Ithird_party\gloo -I..\cmake\..\third_party\gloo -I..\cmake\..\third_party\googletest\googlemock\include -I..\cmake\..\third_party\googletest\googletest\include -I..\third_party\protobuf\src -Iwin_tmp\mkl\include -I..\third_party\XNNPACK\include -I..\third_party -I..\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -I..\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -Iwin_tmp\magma\include -I..\third_party\ideep\mkl-dnn\include -I..\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/Users/circleci/project/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DHAVE_AVX_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\cuda\shared\cudnn.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c ..\torch\csrc\cuda\shared\cudnn.cpp
FAILED: caffe2/torch/CMakeFiles/torch_python.dir/csrc/cuda/shared/cudnn.cpp.obj 
C:\Users\circleci\project\build\win_tmp\bin\sccache-cl.exe   /TP -DBUILDING_TESTS -DFMT_HEADER_ONLY=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTHP_BUILD_MAIN_LIB -DTH_BLAS_MKL -DUSE_C10D -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_CUDNN -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -Dtorch_python_EXPORTS -Iaten\src -I..\aten\src -I. -I..\ -I..\cmake\..\third_party\benchmark\include -I..\cmake\..\third_party\cudnn_frontend\include -Icaffe2\contrib\aten -I..\third_party\onnx -Ithird_party\onnx -I..\third_party\foxi -Ithird_party\foxi -I..\torch\.. -I..\torch\..\aten\src -I..\torch\..\aten\src\TH -Icaffe2\aten\src -Ithird_party -I..\torch\..\third_party\valgrind-headers -I..\torch\..\third_party\gloo -I..\torch\..\third_party\onnx -I..\torch\csrc -I..\torch\csrc\api\include -I..\torch\lib -I..\torch\lib\libshm_windows -I..\torch\csrc\api -I..\c10\.. -Ithird_party\ideep\mkl-dnn\include -I..\third_party\ideep\mkl-dnn\src\..\include -I..\c10\cuda\..\.. -I..\third_party\fmt\include -I..\torch\lib\c10d\.. -Ithird_party\gloo -I..\cmake\..\third_party\gloo -I..\cmake\..\third_party\googletest\googlemock\include -I..\cmake\..\third_party\googletest\googletest\include -I..\third_party\protobuf\src -Iwin_tmp\mkl\include -I..\third_party\XNNPACK\include -I..\third_party -I..\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -I..\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -Iwin_tmp\magma\include -I..\third_party\ideep\mkl-dnn\include -I..\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/Users/circleci/project/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DHAVE_AVX_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\cuda\shared\cudnn.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c ..\torch\csrc\cuda\shared\cudnn.cpp
error: failed to execute compile
caused by: error reading compile response from server
caused by: Failed to read response header
caused by: failed to fill whole buffer
[5619/5654] C:\Users\circleci\project\build\win_tmp\bin\sccache-cl.exe   /TP -DBUILDING_TESTS -DFMT_HEADER_ONLY=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTHP_BUILD_MAIN_LIB -DTH_BLAS_MKL -DUSE_C10D -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_CUDNN -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -Dtorch_python_EXPORTS -Iaten\src -I..\aten\src -I. -I..\ -I..\cmake\..\third_party\benchmark\include -I..\cmake\..\third_party\cudnn_frontend\include -Icaffe2\contrib\aten -I..\third_party\onnx -Ithird_party\onnx -I..\third_party\foxi -Ithird_party\foxi -I..\torch\.. -I..\torch\..\aten\src -I..\torch\..\aten\src\TH -Icaffe2\aten\src -Ithird_party -I..\torch\..\third_party\valgrind-headers -I..\torch\..\third_party\gloo -I..\torch\..\third_party\onnx -I..\torch\csrc -I..\torch\csrc\api\include -I..\torch\lib -I..\torch\lib\libshm_windows -I..\torch\csrc\api -I..\c10\.. -Ithird_party\ideep\mkl-dnn\include -I..\third_party\ideep\mkl-dnn\src\..\include -I..\c10\cuda\..\.. -I..\third_party\fmt\include -I..\torch\lib\c10d\.. -Ithird_party\gloo -I..\cmake\..\third_party\gloo -I..\cmake\..\third_party\googletest\googlemock\include -I..\cmake\..\third_party\googletest\googletest\include -I..\third_party\protobuf\src -Iwin_tmp\mkl\include -I..\third_party\XNNPACK\include -I..\third_party -I..\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -I..\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -Iwin_tmp\magma\include -I..\third_party\ideep\mkl-dnn\include -I..\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/Users/circleci/project/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DHAVE_AVX_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\lib\c10d\logger.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c ..\torch\lib\c10d\logger.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29337 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

[5620/5654] C:\Users\circleci\project\build\win_tmp\bin\sccache-cl.exe   /TP -DBUILDING_TESTS -DFMT_HEADER_ONLY=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTHP_BUILD_MAIN_LIB -DTH_BLAS_MKL -DUSE_C10D -DUSE_C10D_GLOO -DUSE_CUDA -DUSE_CUDNN -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_OPENMP_NOFORCE_MANIFEST -Dtorch_python_EXPORTS -Iaten\src -I..\aten\src -I. -I..\ -I..\cmake\..\third_party\benchmark\include -I..\cmake\..\third_party\cudnn_frontend\include -Icaffe2\contrib\aten -I..\third_party\onnx -Ithird_party\onnx -I..\third_party\foxi -Ithird_party\foxi -I..\torch\.. -I..\torch\..\aten\src -I..\torch\..\aten\src\TH -Icaffe2\aten\src -Ithird_party -I..\torch\..\third_party\valgrind-headers -I..\torch\..\third_party\gloo -I..\torch\..\third_party\onnx -I..\torch\csrc -I..\torch\csrc\api\include -I..\torch\lib -I..\torch\lib\libshm_windows -I..\torch\csrc\api -I..\c10\.. -Ithird_party\ideep\mkl-dnn\include -I..\third_party\ideep\mkl-dnn\src\..\include -I..\c10\cuda\..\.. -I..\third_party\fmt\include -I..\torch\lib\c10d\.. -Ithird_party\gloo -I..\cmake\..\third_party\gloo -I..\cmake\..\third_party\googletest\googlemock\include -I..\cmake\..\third_party\googletest\googletest\include -I..\third_party\protobuf\src -Iwin_tmp\mkl\include -I..\third_party\XNNPACK\include -I..\third_party -I..\cmake\..\third_party\eigen -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\lib\site-packages\numpy\core\include -I..\cmake\..\third_party\pybind11\include -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -Iwin_tmp\magma\include -I..\third_party\ideep\mkl-dnn\include -I..\third_party\ideep\include -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/Users/circleci/project/build/win_tmp/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DHAVE_AVX_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /MD /O2 /Ob2 /DNDEBUG /w /bigobj -DNDEBUG -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DUSE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD /EHsc /DNOMINMAX /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -std:c++14 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\distributed\c10d\python_comm_hook.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c ..\torch\csrc\distributed\c10d\python_comm_hook.cpp

1 failure not recognized by patterns:

Job Step Action
CircleCI pytorch_xla_linux_bionic_py3_6_clang9_build Build 🔁 rerun

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

lw added 2 commits May 17, 2021 03:53
This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.

Differential Revision: [D28224477](https://our.internmc.facebook.com/intern/diff/D28224477/)

[ghstack-poisoned]
This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.

Differential Revision: [D28224477](https://our.internmc.facebook.com/intern/diff/D28224477/)

[ghstack-poisoned]
return future;
} catch (std::exception& e) {
return asFuture(handleError(e, messageType, messageId));
// Pass a dummy message ID since it will be overwritten anyways.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we remove message id from handleError argument list?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please feel free to do this in followup PRs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My long-term idea is to remove the ID entirely from Message. I believe the ID is only used by the RPC agent to match requests with responses, so in a sense it's an "external" information (it shouldn't be inside the Message, it should be a "tag" that's attached to it).

message->setId(id);
return message;
},
c10::getCustomClassType<c10::intrusive_ptr<Message>>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this does some type conversion + map lookup + branching. I wonder if it make sense to keel this c10::ClassTypePtr in Message implementation and reuse it in other places?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't worry about performance here. It's true that we need to do a map lookup and some bookkeeping, but only the first time, as the result is then cached for subsequent uses. See the static local variable here

template <typename T>
const c10::ClassTypePtr& getCustomClassType() {
// Classes are never unregistered from getCustomClassTypeMap and the
// hash lookup can be a hot path, so just cache.
// For the same reason, it's fine If this ends up getting duplicated across
// DSO boundaries for whatever reason.
static c10::ClassTypePtr cache = getCustomClassTypeImpl<T>();
return cache;
}

Moreover, since that function is templated, the type will be resolved at compile time. Hence a reasonable compiler will inline that code and this callsite will basically just become one memory access (plus a check if the cache has been initialized).

Copy link
Contributor

@mrshenli mrshenli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

lw added 3 commits May 18, 2021 02:35
This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.

Differential Revision: [D28224477](https://our.internmc.facebook.com/intern/diff/D28224477/)

[ghstack-poisoned]
This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.

Differential Revision: [D28224477](https://our.internmc.facebook.com/intern/diff/D28224477/)

[ghstack-poisoned]
This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.

Differential Revision: [D28224477](https://our.internmc.facebook.com/intern/diff/D28224477/)

[ghstack-poisoned]
This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.

Differential Revision: [D28224477](https://our.internmc.facebook.com/intern/diff/D28224477/)

[ghstack-poisoned]
dgl-intel pushed a commit to dgl-intel/pytorch that referenced this pull request May 21, 2021
Pull Request resolved: pytorch#57848

This PR looks large, but all it does is add a dozen lines and remove a lot of other ones.

One first advantage of using Futures is that we can easily chain some "post-processing" to them. Until now we needed to pass the message ID around everywhere because it was set separately by each method. Instead, we could simply add a follow-up step to the final future which sets this ID, and remove all the former logic.
ghstack-source-id: 129567065

Differential Revision: [D28224477](https://our.internmc.facebook.com/intern/diff/D28224477/)
@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 4ac18f6.

@facebook-github-bot facebook-github-bot deleted the gh/lw/162/head branch May 25, 2021 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed Merged oncall: distributed Add this issue/PR to distributed oncall triage queue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants