
Regression in the compilation of the torch.all operation in PyTorch version 2.6.0 compared to 2.5.1 #145220

@wdziurdz


🐛 Describe the bug

There is a tracing issue after upgrading from PyTorch 2.5.1 to 2.6.0. It appears to be a regression in compiling the torch.all operation when an out= tensor is passed.
In PyTorch 2.5.1, the code below compiles without any graph breaks:

import torch

@torch.compile(backend="inductor")
def compiled_fn(input_tensor: torch.Tensor):
    output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)
    result = torch.all(input_tensor, dim=2, out=output_tensor)
    return result


if __name__ == "__main__":
    input_tensor = torch.randint(0, 2, (2, 3, 4), dtype=torch.bool, device="cpu")

    output = compiled_fn(input_tensor)
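
For context, the repro relies on eager-mode out= semantics: torch.all may resize a zero-element out tensor to the reduced shape. A minimal eager check (no torch.compile involved) illustrating this:

import torch

input_tensor = torch.randint(0, 2, (2, 3, 4), dtype=torch.bool)
out = torch.empty((0,), dtype=torch.bool)

# torch.all resizes the zero-element out tensor to the reduced shape (2, 3)
result = torch.all(input_tensor, dim=2, out=out)
assert result is out
assert result.shape == (2, 3)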

The code compiles to the following FX graph in PyTorch 2.5.1:

V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code] TRACED GRAPH
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]  ===== __compiled_fn_1 =====
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]  /home/user1/venv1/lib/python3.10/site-packages/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]     def forward(self, L_input_tensor_: "b8[2, 3, 4][12, 4, 1]cpu"):
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         l_input_tensor_ = L_input_tensor_
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]          # File: tests/compile/test_all.py:5 in compiled_fn, code: output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         empty: "b8[2, 3][3, 1]cpu" = torch.empty((0,), dtype = torch.bool)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         output_tensor: "b8[2, 3][3, 1]cpu" = empty.to(device(type='cpu'));  empty = None
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]          # File: tests/compile/test_all.py:6 in compiled_fn, code: result = torch.all(input_tensor, dim=2, out=output_tensor)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         result: "b8[2, 3][3, 1]cpu" = torch.all(l_input_tensor_, dim = 2, out = output_tensor);  l_input_tensor_ = output_tensor = None
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         return (result,)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code] 

After upgrading to PyTorch 2.6.0, however, the same code no longer compiles to that graph: tracing hits a graph break at the torch.all call (reason: 'out variants with resizing on graph inputs') and only a truncated graph is emitted:

V0120 14:57:46.684000 74548 torch/_dynamo/output_graph.py:972] [0/0_1] COMPILING GRAPH due to GraphCompileReason(reason='out variants with resizing on graph inputs', user_stack=[<FrameSummary file tests/compile/test_all.py, line 6 in compiled_fn>], graph_break=True)
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1615] [0/0_1] REMOVE UNUSED GRAPHARG L['input_tensor']
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code] TRACED GRAPH
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]  ===== __compiled_fn_2 =====
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]  /home/user1/venv1/lib/python3.10/site-packages/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]     def forward(self):
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]          # File: tests/compile/test_all.py:5 in compiled_fn, code: output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]         empty: "b8[0][1]cpu" = torch.empty((0,), dtype = torch.bool)
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]         output_tensor: "b8[0][1]cpu" = empty.to(device(type='cpu'));  empty = None
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]         return (output_tensor,)
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]         
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code] 

Please investigate this regression.
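
As a possible workaround on 2.6.0 (untested here), the graph break can likely be avoided by not relying on the resize: either drop out= entirely, or pre-allocate the out tensor with the already-reduced shape so torch.all has nothing to resize. A sketch, assuming static input shapes (compiled_fn_workaround is a hypothetical name for illustration):

import torch

@torch.compile(backend="inductor")
def compiled_fn_workaround(input_tensor: torch.Tensor):
    # Allocate out with the reduced shape (dim=2 removed) so no resizing occurs
    output_tensor = torch.empty(input_tensor.shape[:-1], dtype=torch.bool, device=input_tensor.device)
    return torch.all(input_tensor, dim=2, out=output_tensor)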
Full logs 2.5.1:

V0120 14:51:10.919000 72022 torch/_dynamo/convert_frame.py:864] [0/0] torchdynamo start compiling compiled_fn tests/compile/test_all.py:3, stack (elided 5 frames):
V0120 14:51:10.919000 72022 torch/_dynamo/convert_frame.py:864] [0/0]   File "tests/compile/test_all.py", line 14, in <module>
V0120 14:51:10.919000 72022 torch/_dynamo/convert_frame.py:864] [0/0]     output = compiled_fn(input_tensor)
V0120 14:51:10.919000 72022 torch/_dynamo/convert_frame.py:864] [0/0] 
I0120 14:51:10.920000 72022 torch/_dynamo/utils.py:859] [0/0] ChromiumEventLogger initialized with id 11952b32-9bff-4a1f-ae82-08757a4285ab
I0120 14:51:10.921000 72022 torch/_dynamo/logging.py:57] [0/0] Step 1: torchdynamo start tracing compiled_fn tests/compile/test_all.py:3
V0120 14:51:10.922000 72022 torch/fx/experimental/symbolic_shapes.py:2498] [0/0] create_env
V0120 14:51:10.939000 72022 torch/_dynamo/symbolic_convert.py:865] [0/0] [__trace_source] TRACE starts_line tests/compile/test_all.py:5 in compiled_fn (compiled_fn)
V0120 14:51:10.939000 72022 torch/_dynamo/symbolic_convert.py:865] [0/0] [__trace_source]         output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)
V0120 14:51:10.940000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_GLOBAL torch []
V0120 14:51:10.941000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_ATTR empty [PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:51:10.942000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_CONST (0,) [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f188888aa20>)]
V0120 14:51:10.942000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_GLOBAL torch [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f188888aa20>), TupleVariable(length=1)]
V0120 14:51:10.943000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_ATTR bool [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f188888aa20>), TupleVariable(length=1), PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:51:10.944000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_CONST ('dtype',) [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f188888aa20>), TupleVariable(length=1), ConstantVariable()]
V0120 14:51:10.944000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE CALL_FUNCTION_KW 2 [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f188888aa20>), TupleVariable(length=1), ConstantVariable(), TupleVariable(length=1)]
V0120 14:51:10.947000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_ATTR to [TensorVariable()]
V0120 14:51:10.947000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_FAST input_tensor [GetAttrVariable()]
V0120 14:51:10.948000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_ATTR device [GetAttrVariable(), LazyVariableTracker()]
V0120 14:51:10.948000 72022 torch/_dynamo/output_graph.py:2107] [0/0] create_graph_input L_input_tensor_ L['input_tensor']
V0120 14:51:10.949000 72022 torch/_dynamo/variables/builder.py:2702] [0/0] wrap_to_fake L['input_tensor'] (2, 3, 4) StatefulSymbolicContext(dynamic_sizes=[<DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>], dynamic_strides=[<DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>], constraint_sizes=[None, None, None], constraint_strides=[None, None, None], view_base_context=None, tensor_source=LocalSource(local_name='input_tensor', cell_or_freevar=False), shape_env_to_source_to_symbol_cache={}) <class 'torch.Tensor'>
V0120 14:51:10.951000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE CALL_FUNCTION 1 [GetAttrVariable(), ConstantVariable()]
V0120 14:51:10.952000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE STORE_FAST output_tensor [TensorVariable()]
V0120 14:51:10.953000 72022 torch/_dynamo/symbolic_convert.py:865] [0/0] [__trace_source] TRACE starts_line tests/compile/test_all.py:6 in compiled_fn (compiled_fn)
V0120 14:51:10.953000 72022 torch/_dynamo/symbolic_convert.py:865] [0/0] [__trace_source]         result = torch.all(input_tensor, dim=2, out=output_tensor)
V0120 14:51:10.953000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_GLOBAL torch []
V0120 14:51:10.953000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_ATTR all [PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:51:10.954000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_FAST input_tensor [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f188888aa20>)]
V0120 14:51:10.954000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_CONST 2 [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f188888aa20>), TensorVariable()]
V0120 14:51:10.955000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_FAST output_tensor [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f188888aa20>), TensorVariable(), ConstantVariable()]
V0120 14:51:10.955000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_CONST ('dim', 'out') [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f188888aa20>), TensorVariable(), ConstantVariable(), TensorVariable()]
V0120 14:51:10.956000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE CALL_FUNCTION_KW 3 [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f188888aa20>), TensorVariable(), ConstantVariable(), TensorVariable(), TupleVariable(length=2)]
V0120 14:51:10.959000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE STORE_FAST result [TensorVariable()]
V0120 14:51:10.960000 72022 torch/_dynamo/symbolic_convert.py:865] [0/0] [__trace_source] TRACE starts_line tests/compile/test_all.py:7 in compiled_fn (compiled_fn)
V0120 14:51:10.960000 72022 torch/_dynamo/symbolic_convert.py:865] [0/0] [__trace_source]         return result
V0120 14:51:10.960000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE LOAD_FAST result []
V0120 14:51:10.960000 72022 torch/_dynamo/symbolic_convert.py:888] [0/0] [__trace_bytecode] TRACE RETURN_VALUE None [TensorVariable()]
I0120 14:51:10.961000 72022 torch/_dynamo/logging.py:57] [0/0] Step 1: torchdynamo done tracing compiled_fn (RETURN_VALUE)
V0120 14:51:10.961000 72022 torch/_dynamo/symbolic_convert.py:2971] [0/0] RETURN_VALUE triggered compile
V0120 14:51:10.961000 72022 torch/_dynamo/output_graph.py:1004] [0/0] COMPILING GRAPH due to GraphCompileReason(reason='return_value', user_stack=[<FrameSummary file tests/compile/test_all.py, line 7 in compiled_fn>], graph_break=False)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code] TRACED GRAPH
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]  ===== __compiled_fn_1 =====
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]  /home/user1/venv1/lib/python3.10/site-packages/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]     def forward(self, L_input_tensor_: "b8[2, 3, 4][12, 4, 1]cpu"):
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         l_input_tensor_ = L_input_tensor_
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]          # File: tests/compile/test_all.py:5 in compiled_fn, code: output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         empty: "b8[2, 3][3, 1]cpu" = torch.empty((0,), dtype = torch.bool)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         output_tensor: "b8[2, 3][3, 1]cpu" = empty.to(device(type='cpu'));  empty = None
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]          # File: tests/compile/test_all.py:6 in compiled_fn, code: result = torch.all(input_tensor, dim=2, out=output_tensor)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         result: "b8[2, 3][3, 1]cpu" = torch.all(l_input_tensor_, dim = 2, out = output_tensor);  l_input_tensor_ = output_tensor = None
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         return (result,)
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code]         
V0120 14:51:10.966000 72022 torch/_dynamo/output_graph.py:1371] [0/0] [__graph_code] 
I0120 14:51:10.968000 72022 torch/_dynamo/logging.py:57] [0/0] Step 2: calling compiler function inductor
V0120 14:51:12.792000 72022 torch/fx/experimental/symbolic_shapes.py:5201] [0/0] eval True == True [statically known]
I0120 14:51:22.070000 72022 torch/fx/experimental/symbolic_shapes.py:3646] [0/0] produce_guards
W0120 14:51:22.072000 72022 torch/_inductor/debug.py:434] [0/0] model__0_inference_0 debug trace: /home/user1/qnpu/env_name/src/torch_compile_debug/run_2025_01_20_14_51_10_921557-pid_72022/torchinductor/model__0_inference_0.0
I0120 14:51:22.076000 72022 torch/_dynamo/logging.py:57] [0/0] Step 2: done compiler function inductor
I0120 14:51:22.080000 72022 torch/fx/experimental/symbolic_shapes.py:3646] [0/0] produce_guards
V0120 14:51:22.080000 72022 torch/fx/experimental/symbolic_shapes.py:3830] [0/0] track_symint L['input_tensor'].size()[0] 2 None
V0120 14:51:22.081000 72022 torch/fx/experimental/symbolic_shapes.py:3830] [0/0] track_symint L['input_tensor'].size()[1] 3 None
V0120 14:51:22.081000 72022 torch/fx/experimental/symbolic_shapes.py:3830] [0/0] track_symint L['input_tensor'].size()[2] 4 None
V0120 14:51:22.081000 72022 torch/fx/experimental/symbolic_shapes.py:3830] [0/0] track_symint L['input_tensor'].stride()[0] 12 None
V0120 14:51:22.082000 72022 torch/fx/experimental/symbolic_shapes.py:3830] [0/0] track_symint L['input_tensor'].stride()[1] 4 None
V0120 14:51:22.082000 72022 torch/fx/experimental/symbolic_shapes.py:3830] [0/0] track_symint L['input_tensor'].stride()[2] 1 None
V0120 14:51:22.082000 72022 torch/fx/experimental/symbolic_shapes.py:3830] [0/0] track_symint L['input_tensor'].storage_offset() 0 None
V0120 14:51:22.083000 72022 torch/fx/experimental/symbolic_shapes.py:3998] [0/0] Skipping guard L['input_tensor'].size()[0] == 2
V0120 14:51:22.083000 72022 torch/fx/experimental/symbolic_shapes.py:3998] [0/0] Skipping guard L['input_tensor'].size()[1] == 3
V0120 14:51:22.084000 72022 torch/fx/experimental/symbolic_shapes.py:3998] [0/0] Skipping guard L['input_tensor'].size()[2] == 4
V0120 14:51:22.084000 72022 torch/fx/experimental/symbolic_shapes.py:3998] [0/0] Skipping guard L['input_tensor'].stride()[0] == 12
V0120 14:51:22.085000 72022 torch/fx/experimental/symbolic_shapes.py:3998] [0/0] Skipping guard L['input_tensor'].stride()[1] == 4
V0120 14:51:22.085000 72022 torch/fx/experimental/symbolic_shapes.py:3998] [0/0] Skipping guard L['input_tensor'].stride()[2] == 1
V0120 14:51:22.085000 72022 torch/fx/experimental/symbolic_shapes.py:3998] [0/0] Skipping guard L['input_tensor'].storage_offset() == 0
V0120 14:51:22.086000 72022 torch/_dynamo/guards.py:2314] [0/0] [__guards] GUARDS:
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] 
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] TREE_GUARD_MANAGER:
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] +- RootGuardManager
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | +- DEFAULT_DEVICE: utils_device.CURRENT_DEVICE == None                           # _dynamo/output_graph.py:471 in init_ambient_guards
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | +- GLOBAL_STATE: ___check_global_state()
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | +- TORCH_FUNCTION_MODE_STACK: ___check_torch_function_mode_stack()
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | +- GuardManager: source=L['input_tensor'], accessed_by=DictGetItemGuardAccessor(input_tensor)
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | +- TENSOR_MATCH: check_tensor(L['input_tensor'], Tensor, DispatchKeySet(CPU, BackendSelect, ADInplaceOrView, AutogradCPU), torch.bool, device=None, requires_grad=False, size=[2, 3, 4], stride=[12, 4, 1])  # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | +- NO_HASATTR: hasattr(L['input_tensor'], '_dynamo_dynamic_indices') == False  # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | +- GuardManager: source=G, accessed_by=GlobalsGuardAccessor
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | +- GuardManager: source=G['torch'], accessed_by=DictGetItemGuardAccessor(torch)
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | | +- ID_MATCH: ___check_obj_id(G['torch'], 139743351173376)                  # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | | +- GuardManager: source=G['torch'].all, accessed_by=GetAttrGuardAccessor(all)
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | | | +- ID_MATCH: ___check_obj_id(G['torch'].all, 139743348124352)              # result = torch.all(input_tensor, dim=2, out=output_tensor)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:6 in compiled_fn
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | | +- GuardManager: source=G['torch'].bool, accessed_by=GetAttrGuardAccessor(bool)
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | | | +- EQUALS_MATCH: G['torch'].bool == torch.bool                                 # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | | +- GuardManager: source=G['torch'].empty, accessed_by=GetAttrGuardAccessor(empty)
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] | | | | +- ID_MATCH: ___check_obj_id(G['torch'].empty, 139743348128512)            # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:51:22.087000 72022 torch/_dynamo/guards.py:2280] [0/0] [__guards] 
V0120 14:51:22.088000 72022 torch/_dynamo/convert_frame.py:1234] skipping: _fn (reason: in skipfiles, file: /home/user1/venv1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py)
V0120 14:51:22.089000 72022 torch/_dynamo/convert_frame.py:1234] skipping: _maybe_set_eval_frame (reason: in skipfiles, file: /home/user1/venv1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py)
V0120 14:51:22.089000 72022 torch/_dynamo/convert_frame.py:1234] skipping: justknobs_check (reason: in skipfiles, file: /home/user1/venv1/lib/python3.10/site-packages/torch/_utils_internal.py)

Full logs 2.6.0:

V0120 14:57:46.629000 74548 torch/_dynamo/convert_frame.py:1345] skipping: _is_skip_guard_eval_unsafe_stance (reason: in skipfiles, file: /home/user1/venv1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py)
I0120 14:57:46.631000 74548 torch/_dynamo/utils.py:1162] [0/0] ChromiumEventLogger initialized with id 9bec8ac0-9067-4f58-ba32-04edd2949f59
V0120 14:57:46.632000 74548 torch/_dynamo/convert_frame.py:930] [0/0] torchdynamo start compiling compiled_fn tests/compile/test_all.py:3, stack (elided 5 frames):
V0120 14:57:46.632000 74548 torch/_dynamo/convert_frame.py:930] [0/0]   File "tests/compile/test_all.py", line 14, in <module>
V0120 14:57:46.632000 74548 torch/_dynamo/convert_frame.py:930] [0/0]     output = compiled_fn(input_tensor)
V0120 14:57:46.632000 74548 torch/_dynamo/convert_frame.py:930] [0/0] 
I0120 14:57:46.633000 74548 torch/_dynamo/symbolic_convert.py:2706] [0/0] Step 1: torchdynamo start tracing compiled_fn tests/compile/test_all.py:3
I0120 14:57:46.634000 74548 torch/fx/experimental/symbolic_shapes.py:3192] [0/0] create_env
V0120 14:57:46.637000 74548 torch/_dynamo/symbolic_convert.py:932] [0/0] [__trace_source] TRACE starts_line tests/compile/test_all.py:5 in compiled_fn (compiled_fn)
V0120 14:57:46.637000 74548 torch/_dynamo/symbolic_convert.py:932] [0/0] [__trace_source]         output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)
V0120 14:57:46.638000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_GLOBAL torch []
V0120 14:57:46.640000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR empty [PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:57:46.641000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_CONST (0,) [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>)]
V0120 14:57:46.642000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_GLOBAL torch [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>), TupleVariable(length=1)]
V0120 14:57:46.642000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR bool [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>), TupleVariable(length=1), PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:57:46.643000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_CONST ('dtype',) [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>), TupleVariable(length=1), ConstantVariable(dtype: torch.bool)]
V0120 14:57:46.643000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE CALL_FUNCTION_KW 2 [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>), TupleVariable(length=1), ConstantVariable(dtype: torch.bool), TupleVariable(length=1)]
V0120 14:57:46.655000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR to [TensorVariable()]
V0120 14:57:46.655000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_FAST input_tensor [GetAttrVariable(TensorVariable(), to)]
V0120 14:57:46.656000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR device [GetAttrVariable(TensorVariable(), to), LazyVariableTracker()]
V0120 14:57:46.656000 74548 torch/_dynamo/variables/builder.py:2853] [0/0] wrap_to_fake L['input_tensor'] (2, 3, 4) StatefulSymbolicContext(dynamic_sizes=[<DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>], dynamic_strides=[<DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>], constraint_sizes=[None, None, None], constraint_strides=[None, None, None], view_base_context=None, tensor_source=LocalSource(local_name='input_tensor', is_input=True, is_derefed_cell_contents=False), shape_env_to_source_to_symbol_cache={}) <class 'torch.Tensor'>
V0120 14:57:46.658000 74548 torch/_dynamo/output_graph.py:2156] [0/0] create_graph_input L_input_tensor_ L['input_tensor'] FakeTensor(..., size=(2, 3, 4), dtype=torch.bool) at debug_level 0 before=False
V0120 14:57:46.659000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), to), ConstantVariable(device: device(type='cpu'))]
V0120 14:57:46.660000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE STORE_FAST output_tensor [TensorVariable()]
V0120 14:57:46.661000 74548 torch/_dynamo/symbolic_convert.py:932] [0/0] [__trace_source] TRACE starts_line tests/compile/test_all.py:6 in compiled_fn (compiled_fn)
V0120 14:57:46.661000 74548 torch/_dynamo/symbolic_convert.py:932] [0/0] [__trace_source]         result = torch.all(input_tensor, dim=2, out=output_tensor)
V0120 14:57:46.661000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_GLOBAL torch []
V0120 14:57:46.662000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR all [PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:57:46.662000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_FAST input_tensor [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>)]
V0120 14:57:46.663000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_CONST 2 [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>), TensorVariable()]
V0120 14:57:46.663000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_FAST output_tensor [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>), TensorVariable(), ConstantVariable(int: 2)]
V0120 14:57:46.664000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_CONST ('dim', 'out') [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>), TensorVariable(), ConstantVariable(int: 2), TensorVariable()]
V0120 14:57:46.664000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE CALL_FUNCTION_KW 3 [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>), TensorVariable(), ConstantVariable(int: 2), TensorVariable(), TupleVariable(length=2)]
V0120 14:57:46.668000 74548 torch/_dynamo/symbolic_convert.py:435] [0/0] [__graph_breaks] Graph break in user code at tests/compile/test_all.py:6
V0120 14:57:46.668000 74548 torch/_dynamo/symbolic_convert.py:435] [0/0] [__graph_breaks] Reason: Unsupported: out variants with resizing on graph inputs
V0120 14:57:46.668000 74548 torch/_dynamo/symbolic_convert.py:435] [0/0] [__graph_breaks] User code traceback:
V0120 14:57:46.668000 74548 torch/_dynamo/symbolic_convert.py:435] [0/0] [__graph_breaks]   File "tests/compile/test_all.py", line 6, in compiled_fn
V0120 14:57:46.668000 74548 torch/_dynamo/symbolic_convert.py:435] [0/0] [__graph_breaks]     result = torch.all(input_tensor, dim=2, out=output_tensor)
V0120 14:57:46.668000 74548 torch/_dynamo/symbolic_convert.py:435] [0/0] [__graph_breaks] 
I0120 14:57:46.668000 74548 torch/_dynamo/convert_frame.py:755] [0/0] Restarting analysis due to _dynamo/symbolic_convert.py:161 in fail_and_restart_analysis
I0120 14:57:46.669000 74548 torch/_dynamo/symbolic_convert.py:2706] [0/0_1] Step 1: torchdynamo start tracing compiled_fn tests/compile/test_all.py:3
I0120 14:57:46.670000 74548 torch/fx/experimental/symbolic_shapes.py:3192] [0/0_1] create_env
V0120 14:57:46.671000 74548 torch/_dynamo/symbolic_convert.py:932] [0/0_1] [__trace_source] TRACE starts_line tests/compile/test_all.py:5 in compiled_fn (compiled_fn)
V0120 14:57:46.671000 74548 torch/_dynamo/symbolic_convert.py:932] [0/0_1] [__trace_source]         output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)
V0120 14:57:46.671000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_GLOBAL torch []
V0120 14:57:46.672000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_ATTR empty [PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:57:46.672000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_CONST (0,) [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>)]
V0120 14:57:46.673000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_GLOBAL torch [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>), TupleVariable(length=1)]
V0120 14:57:46.673000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_ATTR bool [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>), TupleVariable(length=1), PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:57:46.674000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_CONST ('dtype',) [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>), TupleVariable(length=1), ConstantVariable(dtype: torch.bool)]
V0120 14:57:46.674000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE CALL_FUNCTION_KW 2 [TorchInGraphFunctionVariable(<built-in method empty of type object at 0x7f144a228020>), TupleVariable(length=1), ConstantVariable(dtype: torch.bool), TupleVariable(length=1)]
V0120 14:57:46.675000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_ATTR to [TensorVariable()]
V0120 14:57:46.676000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_FAST input_tensor [GetAttrVariable(TensorVariable(), to)]
V0120 14:57:46.676000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_ATTR device [GetAttrVariable(TensorVariable(), to), LazyVariableTracker()]
V0120 14:57:46.677000 74548 torch/_dynamo/variables/builder.py:2853] [0/0_1] wrap_to_fake L['input_tensor'] (2, 3, 4) StatefulSymbolicContext(dynamic_sizes=[<DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>], dynamic_strides=[<DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>], constraint_sizes=[None, None, None], constraint_strides=[None, None, None], view_base_context=None, tensor_source=LocalSource(local_name='input_tensor', is_input=True, is_derefed_cell_contents=False), shape_env_to_source_to_symbol_cache={}) <class 'torch.Tensor'>
V0120 14:57:46.678000 74548 torch/_dynamo/output_graph.py:2156] [0/0_1] create_graph_input L_input_tensor_ L['input_tensor'] FakeTensor(..., size=(2, 3, 4), dtype=torch.bool) at debug_level 0 before=False
V0120 14:57:46.679000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), to), ConstantVariable(device: device(type='cpu'))]
V0120 14:57:46.680000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE STORE_FAST output_tensor [TensorVariable()]
V0120 14:57:46.681000 74548 torch/_dynamo/symbolic_convert.py:932] [0/0_1] [__trace_source] TRACE starts_line tests/compile/test_all.py:6 in compiled_fn (compiled_fn)
V0120 14:57:46.681000 74548 torch/_dynamo/symbolic_convert.py:932] [0/0_1] [__trace_source]         result = torch.all(input_tensor, dim=2, out=output_tensor)
V0120 14:57:46.681000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_GLOBAL torch []
V0120 14:57:46.681000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_ATTR all [PythonModuleVariable(<module 'torch' from '/home/user1/venv1/lib/python3.10/site-packages/torch/__init__.py'>)]
V0120 14:57:46.682000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_FAST input_tensor [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>)]
V0120 14:57:46.682000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_CONST 2 [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>), TensorVariable()]
V0120 14:57:46.683000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_FAST output_tensor [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>), TensorVariable(), ConstantVariable(int: 2)]
V0120 14:57:46.683000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE LOAD_CONST ('dim', 'out') [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>), TensorVariable(), ConstantVariable(int: 2), TensorVariable()]
V0120 14:57:46.684000 74548 torch/_dynamo/symbolic_convert.py:955] [0/0_1] [__trace_bytecode] TRACE CALL_FUNCTION_KW 3 [TorchInGraphFunctionVariable(<built-in method all of type object at 0x7f144a228020>), TensorVariable(), ConstantVariable(int: 2), TensorVariable(), TupleVariable(length=2)]
V0120 14:57:46.684000 74548 torch/_dynamo/output_graph.py:972] [0/0_1] COMPILING GRAPH due to GraphCompileReason(reason='out variants with resizing on graph inputs', user_stack=[<FrameSummary file tests/compile/test_all.py, line 6 in compiled_fn>], graph_break=True)
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1615] [0/0_1] REMOVE UNUSED GRAPHARG L['input_tensor']
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code] TRACED GRAPH
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]  ===== __compiled_fn_2 =====
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]  /home/user1/venv1/lib/python3.10/site-packages/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]     def forward(self):
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]          # File: tests/compile/test_all.py:5 in compiled_fn, code: output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]         empty: "b8[0][1]cpu" = torch.empty((0,), dtype = torch.bool)
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]         output_tensor: "b8[0][1]cpu" = empty.to(device(type='cpu'));  empty = None
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]         return (output_tensor,)
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code]         
V0120 14:57:46.689000 74548 torch/_dynamo/output_graph.py:1353] [0/0_1] [__graph_code] 
I0120 14:57:46.691000 74548 torch/_dynamo/output_graph.py:1458] [0/0_1] Step 2: calling compiler function inductor
W0120 14:57:48.602000 74548 torch/_inductor/debug.py:435] [0/0_1] model__0_inference_0 debug trace: /home/user1/qnpu/env_name/src/torch_compile_debug/run_2025_01_20_14_57_46_633319-pid_74548/torchinductor/model__0_inference_0.0
I0120 14:57:48.606000 74548 torch/_dynamo/output_graph.py:1463] [0/0_1] Step 2: done compiler function inductor
I0120 14:57:48.611000 74548 torch/fx/experimental/symbolic_shapes.py:4547] [0/0_1] produce_guards
V0120 14:57:48.612000 74548 torch/fx/experimental/symbolic_shapes.py:4755] [0/0_1] track_symint L['input_tensor'].size()[0] 2 None
V0120 14:57:48.612000 74548 torch/fx/experimental/symbolic_shapes.py:4755] [0/0_1] track_symint L['input_tensor'].size()[1] 3 None
V0120 14:57:48.612000 74548 torch/fx/experimental/symbolic_shapes.py:4755] [0/0_1] track_symint L['input_tensor'].size()[2] 4 None
V0120 14:57:48.613000 74548 torch/fx/experimental/symbolic_shapes.py:4755] [0/0_1] track_symint L['input_tensor'].stride()[0] 12 None
V0120 14:57:48.613000 74548 torch/fx/experimental/symbolic_shapes.py:4755] [0/0_1] track_symint L['input_tensor'].stride()[1] 4 None
V0120 14:57:48.613000 74548 torch/fx/experimental/symbolic_shapes.py:4755] [0/0_1] track_symint L['input_tensor'].stride()[2] 1 None
V0120 14:57:48.614000 74548 torch/fx/experimental/symbolic_shapes.py:4755] [0/0_1] track_symint L['input_tensor'].storage_offset() 0 None
V0120 14:57:48.614000 74548 torch/fx/experimental/symbolic_shapes.py:4958] [0/0_1] Skipping guard L['input_tensor'].size()[0] == 2
V0120 14:57:48.615000 74548 torch/fx/experimental/symbolic_shapes.py:4958] [0/0_1] Skipping guard L['input_tensor'].size()[1] == 3
V0120 14:57:48.615000 74548 torch/fx/experimental/symbolic_shapes.py:4958] [0/0_1] Skipping guard L['input_tensor'].size()[2] == 4
V0120 14:57:48.616000 74548 torch/fx/experimental/symbolic_shapes.py:4958] [0/0_1] Skipping guard L['input_tensor'].stride()[0] == 12
V0120 14:57:48.616000 74548 torch/fx/experimental/symbolic_shapes.py:4958] [0/0_1] Skipping guard L['input_tensor'].stride()[1] == 4
V0120 14:57:48.616000 74548 torch/fx/experimental/symbolic_shapes.py:4958] [0/0_1] Skipping guard L['input_tensor'].stride()[2] == 1
V0120 14:57:48.617000 74548 torch/fx/experimental/symbolic_shapes.py:4958] [0/0_1] Skipping guard L['input_tensor'].storage_offset() == 0
V0120 14:57:48.617000 74548 torch/_dynamo/guards.py:2364] [0/0_1] [__guards] GUARDS:
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] 
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] TREE_GUARD_MANAGER:
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] +- RootGuardManager
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | +- DEFAULT_DEVICE: utils_device.CURRENT_DEVICE == None                           # _dynamo/output_graph.py:493 in init_ambient_guards
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | +- GLOBAL_STATE: ___check_global_state()
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | +- TORCH_FUNCTION_MODE_STACK: ___check_torch_function_mode_stack()
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | +- GuardManager: source=L['input_tensor'], accessed_by=DictGetItemGuardAccessor('input_tensor')
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | +- TENSOR_MATCH: check_tensor(L['input_tensor'], Tensor, DispatchKeySet(CPU, BackendSelect, ADInplaceOrView, AutogradCPU), torch.bool, device=None, requires_grad=False, size=[2, 3, 4], stride=[12, 4, 1])  # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | +- NO_HASATTR: hasattr(L['input_tensor'], '_dynamo_dynamic_indices') == False  # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | +- GuardManager: source=G, accessed_by=GlobalsGuardAccessor
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | +- GuardManager: source=G['torch'], accessed_by=DictGetItemGuardAccessor('torch')
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | | +- ID_MATCH: ___check_obj_id(G['torch'], 139725124415584)                  # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | | +- GuardManager: source=G['torch'].all, accessed_by=GetAttrGuardAccessor(all)
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | | | +- ID_MATCH: ___check_obj_id(G['torch'].all, 139725121374464)              # result = torch.all(input_tensor, dim=2, out=output_tensor)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:6 in compiled_fn
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | | +- GuardManager: source=G['torch'].bool, accessed_by=GetAttrGuardAccessor(bool)
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | | | +- EQUALS_MATCH: G['torch'].bool == torch.bool                                 # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | | +- GuardManager: source=G['torch'].empty, accessed_by=GetAttrGuardAccessor(empty)
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] | | | | +- ID_MATCH: ___check_obj_id(G['torch'].empty, 139725121378624)            # output_tensor = torch.empty((0,), dtype=torch.bool).to(input_tensor.device)  # qnpu/env_name/src/pytorch-integration/tests/pytest_working/any_mode/test_hpu_all_any.py:5 in compiled_fn
V0120 14:57:48.618000 74548 torch/_dynamo/guards.py:2321] [0/0_1] [__guards] 
V0120 14:57:49.619000 74548 torch/_dynamo/guards.py:2346] [0/0_1] [__guards] Guard eval latency = 0.76 us
I0120 14:57:49.620000 74548 torch/_dynamo/pgo.py:636] [0/0_1] put_code_state: no cache key, skipping
V0120 14:57:49.626000 74548 torch/_dynamo/convert_frame.py:1345] skipping: _fn (reason: in skipfiles, file: /home/user1/venv1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py)
V0120 14:57:49.627000 74548 torch/_dynamo/convert_frame.py:1345] skipping: _callback_from_stance (reason: in skipfiles, file: /home/user1/venv1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py)
V0120 14:57:49.627000 74548 torch/_dynamo/convert_frame.py:1345] skipping: _maybe_set_eval_frame (reason: in skipfiles, file: /home/user1/venv1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py)
V0120 14:57:49.628000 74548 torch/_dynamo/convert_frame.py:1345] skipping: justknobs_check (reason: in skipfiles, file: /home/user1/venv1/lib/python3.10/site-packages/torch/_utils_internal.py)
V0120 14:57:49.629000 74548 torch/_dynamo/convert_frame.py:930] [1/0] torchdynamo start compiling torch_dynamo_resume_in_compiled_fn_at_6 tests/compile/test_all.py:6, stack (elided 5 frames):
V0120 14:57:49.629000 74548 torch/_dynamo/convert_frame.py:930] [1/0]   File "tests/compile/test_all.py", line 14, in <module>
V0120 14:57:49.629000 74548 torch/_dynamo/convert_frame.py:930] [1/0]     output = compiled_fn(input_tensor)
V0120 14:57:49.629000 74548 torch/_dynamo/convert_frame.py:930] [1/0]   File "/home/user1/venv1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 574, in _fn
V0120 14:57:49.629000 74548 torch/_dynamo/convert_frame.py:930] [1/0]     return fn(*args, **kwargs)
V0120 14:57:49.629000 74548 torch/_dynamo/convert_frame.py:930] [1/0] 
I0120 14:57:49.630000 74548 torch/_dynamo/symbolic_convert.py:2706] [1/0] Step 1: torchdynamo start tracing torch_dynamo_resume_in_compiled_fn_at_6 tests/compile/test_all.py:6
I0120 14:57:49.631000 74548 torch/fx/experimental/symbolic_shapes.py:3192] [1/0] create_env
V0120 14:57:49.632000 74548 torch/_dynamo/symbolic_convert.py:932] [1/0] [__trace_source] TRACE starts_line tests/compile/test_all.py:6 in torch_dynamo_resume_in_compiled_fn_at_6 (compiled_fn)
V0120 14:57:49.632000 74548 torch/_dynamo/symbolic_convert.py:932] [1/0] [__trace_source]         result = torch.all(input_tensor, dim=2, out=output_tensor)
V0120 14:57:49.632000 74548 torch/_dynamo/symbolic_convert.py:955] [1/0] [__trace_bytecode] TRACE LOAD_FAST ___stack0 []
V0120 14:57:49.633000 74548 torch/_dynamo/symbolic_convert.py:955] [1/0] [__trace_bytecode] TRACE JUMP_ABSOLUTE 42 [LazyVariableTracker()]
V0120 14:57:49.633000 74548 torch/_dynamo/symbolic_convert.py:955] [1/0] [__trace_bytecode] TRACE STORE_FAST result [LazyVariableTracker()]
V0120 14:57:49.634000 74548 torch/_dynamo/variables/builder.py:2853] [1/0] wrap_to_fake L['___stack0'] (2, 3) StatefulSymbolicContext(dynamic_sizes=[<DimDynamic.STATIC: 2>, <DimDynamic.STATIC: 2>], dynamic_strides=[<DimDynamic.INFER_STRIDE: 4>, <DimDynamic.INFER_STRIDE: 4>], constraint_sizes=[None, None], constraint_strides=[None, None], view_base_context=None, tensor_source=LocalSource(local_name='___stack0', is_input=True, is_derefed_cell_contents=False), shape_env_to_source_to_symbol_cache={}) <class 'torch.Tensor'>
V0120 14:57:49.635000 74548 torch/_dynamo/output_graph.py:2156] [1/0] create_graph_input L_stack0_ L['___stack0'] FakeTensor(..., size=(2, 3), dtype=torch.bool) at debug_level 0 before=False
V0120 14:57:49.637000 74548 torch/_dynamo/symbolic_convert.py:932] [1/0] [__trace_source] TRACE starts_line tests/compile/test_all.py:7 in torch_dynamo_resume_in_compiled_fn_at_6 (compiled_fn)
V0120 14:57:49.637000 74548 torch/_dynamo/symbolic_convert.py:932] [1/0] [__trace_source]         return result
V0120 14:57:49.637000 74548 torch/_dynamo/symbolic_convert.py:955] [1/0] [__trace_bytecode] TRACE LOAD_FAST result []
V0120 14:57:49.637000 74548 torch/_dynamo/symbolic_convert.py:955] [1/0] [__trace_bytecode] TRACE RETURN_VALUE None [TensorVariable()]
V0120 14:57:49.638000 74548 torch/_dynamo/convert_frame.py:768] [1/0] Skipping frame because no content in function call torch_dynamo_resume_in_compiled_fn_at_6                     tests/compile/test_all.py 6
I0120 14:57:49.638000 74548 torch/_dynamo/pgo.py:636] [1/0] put_code_state: no cache key, skipping
I0120 14:57:49.644000 74548 torch/_dynamo/eval_frame.py:398] TorchDynamo attempted to trace the following frames: [
I0120 14:57:49.644000 74548 torch/_dynamo/eval_frame.py:398]   * compiled_fn tests/compile/test_all.py:3
I0120 14:57:49.644000 74548 torch/_dynamo/eval_frame.py:398] ]
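
For reference, logs of this kind can be re-captured via PyTorch's logging controls, e.g. torch._logging.set_logs (or the equivalent TORCH_LOGS environment variable); a minimal sketch, assuming these artifact names are available in the installed build:

import torch

# Enable the log artifacts shown above (graph code, graph breaks, guards, source/bytecode traces)
torch._logging.set_logs(graph_code=True, graph_breaks=True, guards=True,
                        trace_source=True, trace_bytecode=True)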

Versions

Collecting environment information...
PyTorch version: 2.6.0a0+gitc15b011
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: 14.0.5
CMake version: version 3.31.2
Libc version: glibc-2.35

Python version: 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.0-127-generic-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 43 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Gold 6132 CPU @ 2.60GHz
CPU family: 6
Model: 85
Thread(s) per core: 1
Core(s) per socket: 6
Socket(s): 2
Stepping: 0
BogoMIPS: 5187.81
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xsaves arat pku ospke md_clear flush_l1d arch_capabilities
Virtualization: VT-x
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 384 KiB (12 instances)
L1i cache: 384 KiB (12 instances)
L2 cache: 12 MiB (12 instances)
L3 cache: 38.5 MiB (2 instances)
NUMA node(s): 1
NUMA node0 CPU(s): 0-11
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled
Vulnerability L1tf: Mitigation; PTE Inversion; VMX flush not necessary, SMT disabled
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Mitigation; IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; IBRS; IBPB conditional; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI SW loop, KVM SW loop
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] nvidia-cublas-cu12==12.4.5.8
[pip3] nvidia-cuda-cupti-cu12==12.4.127
[pip3] nvidia-cuda-nvrtc-cu12==12.4.127
[pip3] nvidia-cuda-runtime-cu12==12.4.127
[pip3] nvidia-cudnn-cu12==9.1.0.70
[pip3] nvidia-cufft-cu12==11.2.1.3
[pip3] nvidia-curand-cu12==10.3.5.147
[pip3] nvidia-cusolver-cu12==11.6.1.9
[pip3] nvidia-cusparse-cu12==12.3.1.170
[pip3] nvidia-nccl-cu12==2.21.5
[pip3] nvidia-nvjitlink-cu12==12.4.127
[pip3] nvidia-nvtx-cu12==12.4.127
[pip3] torch==2.6.0a0+gitc15b011
[pip3] torch_tb_profiler==0.4.0
[pip3] triton==3.1.0

cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @amjames
