Performance regression when the GIL is not released in cpp wrapper

### 🐛 Describe the bug

After https://github.com/pytorch/pytorch/pull/122554, the cpp wrapper no longer releases the GIL.
When using [ThroughputBenchmark](https://github.com/pytorch/pytorch/blob/77681facac264866388eb27ed413a075d6edef80/torch/utils/throughput_benchmark.py#L59) to validate model performance with multiple calling threads, holding the GIL makes the benchmark very slow.

![image](https://github.com/pytorch/pytorch/assets/54701539/10841865-a09a-4ae1-bc36-822f532fee21)
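
To illustrate why holding the GIL hurts a multi-threaded benchmark, here is a minimal standard-library sketch. It is not Inductor code: `fake_gil` is an ordinary lock standing in for the GIL, and `native_kernel`, `call_holding_gil`, and `call_releasing_gil` are hypothetical names that model a wrapper keeping or dropping the lock around native work.

```python
import threading
import time

# Assumption: a plain process-wide lock stands in for the GIL.
fake_gil = threading.Lock()


def native_kernel(seconds):
    """Hypothetical stand-in for compiled kernel work that needs no Python state."""
    time.sleep(seconds)


def call_holding_gil(seconds):
    # Models a wrapper that keeps the lock for the whole native call,
    # so concurrent callers run one after another.
    with fake_gil:
        native_kernel(seconds)


def call_releasing_gil(seconds):
    # Models a wrapper that only holds the lock for brief bookkeeping
    # and drops it before the native call, so callers overlap.
    with fake_gil:
        pass
    native_kernel(seconds)


def wall_time(fn, num_threads, seconds):
    """Run fn(seconds) on num_threads threads and return the elapsed wall time."""
    threads = [
        threading.Thread(target=fn, args=(seconds,)) for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


held = wall_time(call_holding_gil, 8, 0.05)
released = wall_time(call_releasing_gil, 8, 0.05)
print(f"lock held: {held:.3f}s, lock released: {released:.3f}s")
```

In this model the "held" run takes roughly the per-call time multiplied by the thread count, while the "released" run takes roughly one per-call time. The same serialization is what I suspect shows up with `num_calling_threads=24` in the repro below.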


### Minified repro

```python
# bench_gil.py
import torch
from torch._inductor import config as inductor_config
from torch.utils import ThroughputBenchmark

# Generate the C++ wrapper instead of the Python wrapper.
inductor_config.cpp_wrapper = True


class SimpleM(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = torch.nn.Linear(100, 100)
        # linear2 is defined but not used in forward().
        self.linear2 = torch.nn.Linear(100, 100)

    def forward(self, x, y):
        return self.linear1(x) + self.linear1(y)


model = torch.compile(SimpleM().bfloat16())
x1 = torch.randn(100, 100).bfloat16()
x2 = torch.randn(100, 100).bfloat16()

# Call the model twice up front so compilation happens before benchmarking.
with torch.no_grad():
    y = model(x1, x2)
    y = model(x1, x2)

# Benchmark the compiled model from 24 concurrent calling threads.
bench = ThroughputBenchmark(model)
bench.add_input(x1, x2)
with torch.no_grad():
    stats = bench.benchmark(
        num_calling_threads=24,
        num_warmup_iters=100,
        num_iters=2400,
    )
print(stats)
```

```
TORCH_COMPILE_DEBUG=1 OMP_NUM_THREADS=1 numactl -C 0-23 -m 0 python bench_gil.py
```

### Output
```
3 ms -> 6.8 ms on my system
```

### Versions

The regression appears after commit https://github.com/pytorch/pytorch/commit/537cd66e73e8a9b33c843d55d546471f3074a390#diff-050ffbf46a890cee19edf85f989025805a1b5b20b26dc5cbe719f9539051b6bb.

cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang @desertfire @chenyang78
