Description
🐛 Bug
ccache no longer caches the PyTorch build after #57361 (9354a68).
Before the PR, a fully cached build with ccache took 2 minutes on my machine. After the PR, it takes 35 minutes, because every CUDA object and many CPU objects are recompiled.
A full log is attached, so that you can check which objects are cached and which aren't.
To Reproduce
I wrote a bash script to reproduce this.
First, install ccache using the instructions here: https://github.com/pytorch/pytorch/blob/d7d0fa20698127c2eb113d86a4b6de22a098d179/CONTRIBUTING.md#use-ccache
Then run this script:
#!/bin/bash
set -ex
TORCH_SOURCE_DIR=~/Developer/pytorch
COMMIT=9354a68e7d8c4680a115b70b9b14565cd42cb03f
build() {(
    cd $TORCH_SOURCE_DIR
    git checkout "$1"
    git clean -ffd
    for i in `seq 5`; do
        python setup.py clean
        pip uninstall torch -y
    done
    git submodule update --init --recursive --force
    start_time=`date +%s%N`
    if command -v 'ts' &> /dev/null; then
        TORCH_CUDA_ARCH_LIST='6.1 7.5' USE_SYSTEM_NCCL=1 python setup.py develop --user | ts
        ret=${PIPESTATUS[0]}
    else
        TORCH_CUDA_ARCH_LIST='6.1 7.5' USE_SYSTEM_NCCL=1 python setup.py develop --user
        ret=$?
    fi
    end_time=`date +%s%N`
    time_cost=`echo "scale=3;($end_time-$start_time)/1000000000" | bc -l`
    echo '****** git commit' `git rev-parse HEAD` ', compile time' $time_cost 'second, compile exit code' $ret ' ******'
    return $ret
)}
# build the first time to populate ccache
build "$COMMIT~1"
# build the second time, expect ccache to work, the full build time should be less than 5 minutes
build "$COMMIT~1"
# build the first time to populate ccache
build $COMMIT
# build the second time, expect ccache to work, the full build time should be less than 5 minutes
# but it takes 35 minutes and does a full build on my system
build $COMMIT
$ grep 'git commit' a.log
****** git commit 0db33eda2a2ca813ee00162ba062ce31d564f8f4 , compile time 124.966 second, compile exit code 0 ******
****** git commit 0db33eda2a2ca813ee00162ba062ce31d564f8f4 , compile time 122.681 second, compile exit code 0 ******
****** git commit 9354a68e7d8c4680a115b70b9b14565cd42cb03f , compile time 2067.030 second, compile exit code 0 ******
****** git commit 9354a68e7d8c4680a115b70b9b14565cd42cb03f , compile time 2055.776 second, compile exit code 0 ******
(The first build takes only about 120 seconds because that commit was already in my ccache cache.)
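To compare the runs at a glance, the summary lines can be condensed with a small helper. This is only a sketch, assuming the log contains the `****** git commit ... ******` lines exactly as the script above prints them; `summarize_builds` is a hypothetical name and not part of the original script.

```bash
#!/bin/bash
# Hypothetical helper: condense the '****** git commit ... ******' summary
# lines from a build log into "short-hash minutes (exit code)".
# Assumed field layout of a summary line (whitespace-separated):
#   $4 = commit hash, $8 = compile time in seconds, $13 = exit code
summarize_builds() {
    awk '$1 == "******" && $2 == "git" && $3 == "commit" {
        printf "%s %.1f min (exit %s)\n", substr($4, 1, 7), $8 / 60, $13
    }' "$1"
}
```

Calling `summarize_builds a.log` prints one line per build, so a jump from a couple of minutes to over half an hour between two builds of the same commit is easy to spot.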
Expected behavior
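ccache should work: the second build of the same commit should be served from the cache and finish in under 5 minutes.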
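One way to confirm whether the cache is actually being hit, rather than inferring it from wall-clock time, is to zero the ccache statistics before a build and print them afterwards. The wrapper below is a rough sketch, not something from the original report: `with_ccache_stats` is a hypothetical name, and it assumes `ccache` is on `PATH` and relies only on its `-z` (zero statistics) and `-s` (show statistics) options.

```bash
#!/bin/bash
# Hypothetical wrapper: run a command with fresh ccache statistics and
# print the statistics afterwards so hits and misses can be inspected.
with_ccache_stats() {
    if ! command -v ccache > /dev/null 2>&1; then
        echo "ccache not found on PATH" >&2
        return 127
    fi
    ccache -z > /dev/null      # zero the statistics counters
    local ret=0
    "$@" || ret=$?             # run the wrapped command, keep its exit code
    ccache -s                  # show statistics accumulated during the run
    return $ret
}
```

For example, `with_ccache_stats python setup.py develop --user` on the second build of a commit should report mostly cache hits if ccache is working; a run dominated by misses would match the behavior described in this issue.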
Environment
N/A
Additional context
N/A
cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @malfet @seemethere @walterddr @ptrblck @zasdfgbnm