ccache no longer caches pytorch build after PR 57361 · Issue #58796 · pytorch/pytorch · GitHub
ccache no longer caches pytorch build after PR 57361 #58796

@xwang233

🐛 Bug

ccache no longer caches the PyTorch build after #57361 (commit 9354a68).

Before the PR, a fully cached build with ccache took 2 minutes on my machine. After the PR, it takes 35 minutes, because every CUDA object and many CPU objects are recompiled.

A full log is attached, so that you can check which objects are cached and which aren't.
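
Besides reading the log, the hit rate can be checked from ccache's own counters. The following is a minimal sketch, not part of the original report: `with_ccache_stats` is a hypothetical helper name, and it relies only on `ccache -z` (zero the statistics) and `ccache -s` (show the statistics).

```shell
#!/bin/bash
# with_ccache_stats is a hypothetical helper, not part of the original script.
# It zeroes the ccache counters, runs the given command, and then prints
# the counters so that hits and misses for that one build are visible.
with_ccache_stats() {
    local ret
    if ! command -v ccache > /dev/null 2>&1; then
        echo "ccache not found on PATH" >&2
        return 127
    fi
    ccache -z > /dev/null   # reset the statistics counters
    "$@"                    # run the build command passed as arguments
    ret=$?
    ccache -s               # show the statistics for this run
    return "$ret"
}

# Usage (assumed invocation, run from the PyTorch source directory):
#   with_ccache_stats python setup.py develop --user
```

On a fully cached rebuild the statistics should show almost only hits; a large number of misses on the second build of the same commit would confirm the behavior described here.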

To Reproduce

I wrote a bash script to reproduce this.

First, install ccache using the instructions here: https://github.com/pytorch/pytorch/blob/d7d0fa20698127c2eb113d86a4b6de22a098d179/CONTRIBUTING.md#use-ccache

Then run this script:

#!/bin/bash

set -ex

TORCH_SOURCE_DIR=~/Developer/pytorch
COMMIT=9354a68e7d8c4680a115b70b9b14565cd42cb03f

build() {(
    cd "$TORCH_SOURCE_DIR"
    git checkout "$1"
    git clean -ffd
    # repeat clean/uninstall several times (presumably to remove any leftover torch installs)
    for i in $(seq 5); do
        python setup.py clean
        pip uninstall torch -y
    done

    git submodule update --init --recursive --force

    start_time=$(date +%s%N)

    # pipe the build output through ts (from moreutils) for timestamps when it is available
    if command -v ts &> /dev/null; then
        TORCH_CUDA_ARCH_LIST='6.1 7.5' USE_SYSTEM_NCCL=1 python setup.py develop --user | ts
        ret=${PIPESTATUS[0]}
    else
        TORCH_CUDA_ARCH_LIST='6.1 7.5' USE_SYSTEM_NCCL=1 python setup.py develop --user
        ret=$?
    fi

    end_time=$(date +%s%N)

    # elapsed wall-clock time in seconds (timestamps are in nanoseconds)
    time_cost=$(echo "scale=3;($end_time-$start_time)/1000000000" | bc -l)
    echo "****** git commit $(git rev-parse HEAD) , compile time $time_cost second, compile exit code $ret  ******"

    return $ret
)}

# build the first time to populate ccache
build "$COMMIT~1"
# build the second time, expect ccache to work, the full build time should be less than 5 minutes
build "$COMMIT~1"


# build the first time to populate ccache
build "$COMMIT"
# build the second time, expect ccache to work, the full build time should be less than 5 minutes
# but it takes 35 minutes and does a full build on my system
build "$COMMIT"

Full log: https://gist.githubusercontent.com/xwang233/6090715539e2a2eb6794a22523a56df1/raw/7c82f36d8afcce07f91ed0a8b1286a729c1c5105/58796.log

$ grep 'git commit' a.log
****** git commit 0db33eda2a2ca813ee00162ba062ce31d564f8f4 , compile time 124.966 second, compile exit code 0  ******
****** git commit 0db33eda2a2ca813ee00162ba062ce31d564f8f4 , compile time 122.681 second, compile exit code 0  ******
****** git commit 9354a68e7d8c4680a115b70b9b14565cd42cb03f , compile time 2067.030 second, compile exit code 0  ******
****** git commit 9354a68e7d8c4680a115b70b9b14565cd42cb03f , compile time 2055.776 second, compile exit code 0  ******

(The first build of the parent commit takes only about 120 seconds because I already had it in my cache.)
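
To compare the builds at a glance, the summary lines can be condensed into minutes per build. This is a small sketch under the assumption that the log keeps the exact `****** git commit ...` line format printed by the script; `summarize_builds` is a hypothetical helper name, not something from the original script.

```shell
#!/bin/bash
# summarize_builds is a hypothetical helper, not part of the original script.
# It reads the "****** git commit ..." summary lines printed by build() and
# reports the short commit hash, the build time in minutes, and the exit code.
summarize_builds() {
    awk '$1 ~ /^\*+$/ && $2 == "git" && $3 == "commit" {
        # $4 is the full commit hash, $8 the time in seconds, $13 the exit code
        printf "%s %.1f min exit=%s\n", substr($4, 1, 7), $8 / 60, $13
    }' "$@"
}

# Usage, assuming the log was saved as a.log:
#   summarize_builds a.log
```

For the four lines above this reports roughly 2.1 and 2.0 minutes for the two builds of 0db33ed, and 34.5 and 34.3 minutes for the two builds of 9354a68.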

Expected behavior

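ccache should work: a second, fully cached build of 9354a68 should take about 2 minutes, as it does for the parent commit, instead of 35 minutes.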

Environment

N/A

Additional context

N/A

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @malfet @seemethere @walterddr @ptrblck @zasdfgbnm

Labels: high priority, module: build, triage review, triaged
