[TorchRec][PT2 compile] enable dynamo in _get_user_embeddings #136798

TroyGarden · 2024-09-26T21:52:21Z

Summary:

context

enable the _get_user_embeddings function
run failed at P1610151892

  torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
  GuardOnDataDependentSymNode: Could not guard on data-dependent expression u22 <= 0 (unhinted: u22 <= 0).  (Size-like symbols: u22)

  ATTENTION: guard_size_oblivious would fix the error, evaluating expression to False.
  Maybe you need to add guard_size_oblivious to framework code, see doc below for more guidance.

  Potential framework code culprit (scroll up for full backtrace):
    File "/data/users/hhy/fbsource/buck-out/v2/gen/fbcode/38472faba4e3e6c1/aps_models/ads/icvr/__icvr_launcher_live__/icvr_launcher_live#link-tree/torch/_decomp/decompositions.py", line 1692, in native_layer_norm_backward
      if M <= 0 or N <= 0:

    N = prod(inner_dims)  # type: ignore[arg-type]
    M = prod(outer_dims)  # type: ignore[arg-type]
    if M <= 0 or N <= 0:
        return (
            input.new_zeros(input_shape) if output_mask[0] else None,
            input.new_zeros(input_shape[axis:]) if output_mask[1] else None,
            input.new_zeros(input_shape[axis:]) if output_mask[2] else None,
        )

changes

use guard_size_oblivious since the new_zeros return is kind of optimization, shouldn't impact the correctness of the follow up code logic.
the size ret[i][j] could be zero, so the change in V1 isn't valid
for more details: post

    from torch.fx.experimental.symbolic_shapes import guard_size_oblivious
    if guard_size_oblivious(M <= 0) or guard_size_oblivious(N <= 0):

past

found u22 was introduced at

    def _wait_impl(self) -> List[List[int]]:
        # Can not use is_torchdynamo_compiling(), as every such condition should be independent for compilation with graph breaks.
        if isinstance(self._splits_awaitable, dist.Work):
            self._splits_awaitable.wait()

        ret = self._output_tensor.view(self.num_workers, -1).T.tolist()  # <------ u22 introduced here

        if not torch.jit.is_scripting() and is_torchdynamo_compiling():
            for i in range(len(ret)):
                for j in range(len(ret[i])):
                    torch._check_is_size(ret[i][j])   # <----------  my question: why the _check_is_size isn't enough??
                    torch._check(ret[i][j] > 0)   # <------ added by diff V1

Test Plan:

run command

TORCH_SHOW_CPP_STACKTRACES=1 TORCHDYNAMO_EXTENDED_DEBUG_CPP=1 TORCH_LOGS="+graph_code,output_code,dynamic,aot,guards,verbose_guards,recompiles,graph_breaks" TORCH_TRACE=/var/tmp/tt buck2 run fbcode//mode/opt fbcode//aps_models/ads/icvr:icvr_launcher_live -- mode=fmc/local_ig_fm_v4_mini training.pipeline_type=pt2 2>&1 | tee -a `tagT`.`tagH`.log

results

before
without enabling _get_user_embeddings
14 Failures and Restarts
log: P1610151892
{F1889387940}
V1
enable _get_user_embeddings
with torch._check(ret[i][j] > 0)
13 Failures and Restarts
{F1889388378}
V2
enable _get_user_embeddings
with if guard_size_oblivious(M <= 0) or guard_size_oblivious(N <= 0):
tlparse
if guard_size_oblivious(M <= 0) or guard_size_oblivious(N <= 0):

Differential Revision: D63424929

pytorch-bot · 2024-09-26T21:52:24Z

This appears to be a diff that was exported from phabricator, but the PR author does not have sufficient permissions to run CI. @TroyGarden, please do step 2 of internal wiki to get write access so you do not need to get CI approvals in the future. If you think this is a mistake, please contact the Pytorch Dev Infra team.

pytorch-bot · 2024-09-26T21:52:27Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136798

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7f7f670 with merge base a02093e ():
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-09-26T21:52:37Z