Detect torch function in lists as well #160256
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160256
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: there is 1 currently active SEV. If your PR is affected, please view it below.
✅ No Failures as of commit 0e7dbe4 with merge base 8171d60.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
I have reviewed it and some of the code is bad but it "works". Need to improve some performance characteristics for it.
  return false;
  bool has_torch_func = false;

  for (long idx = 0; idx < size; idx++) {
The iteration here is the perf problem. Ideally we delay checking the insides until we are parsing. But this may result in a more involved change upstream as we typically assume by the time we parse TF cannot occur.
You should use c10::irange here, right?
That's just the color of the shed; the real problem is I'm adding O(n) extra CPython probes for int list arguments. I need to check to see if the overhead is perceptible.
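For reference, a minimal sketch of what the c10::irange version of this scan might look like, reusing the obj / size / tuple names visible in the hunks of this diff and the check_has_torch_function helper used elsewhere in this file (not the PR's exact code):

#include <c10/util/irange.h>

// Walk the list/tuple once and record whether any element carries
// __torch_function__; variable names are assumed from the diff context.
bool has_torch_func = false;
for (const auto idx : c10::irange(size)) {
  PyObject* iobj =
      tuple ? PyTuple_GET_ITEM(obj, idx) : PyList_GET_ITEM(obj, idx);
  if (check_has_torch_function(iobj)) {
    has_torch_func = true;
    break;  // one overloaded element is enough to trigger dispatch
  }
}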
    PyObject* obj,
    int broadcast_size,
-   int64_t* failed_idx = nullptr) {
+   int64_t* failed_idx = nullptr,
Any reasons we want ptrs here instead of optional reference? Nullptr seems more error prone, especially when wrapping an integer type. We can statically guard against invalid std::optional accesses.
Pre-existing condition.
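To illustrate the suggestion with a hypothetical helper (not PyTorch code), the optional-based shape would look roughly like this:

#include <cstdint>
#include <optional>

// Hypothetical example: report the offending index through std::optional
// instead of a nullable out-pointer, so callers can't forget the nullptr check.
static bool all_nonnegative(
    const int64_t* data,
    int64_t size,
    std::optional<int64_t>& failed_idx) {
  failed_idx = std::nullopt;
  for (int64_t i = 0; i < size; ++i) {
    if (data[i] < 0) {
      failed_idx = i;  // set only on failure
      return false;
    }
  }
  return true;
}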
static bool is_scalar_list(
    PyObject* obj,
    std::vector<PyObject*>* overloaded_args = nullptr) {
  auto tuple = six::isTuple(obj);
Six? Uh we missed this in the upgrade didn't we... just use pybind11 handle APIs
Better to do this separately
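For reference, the pybind11-handle version would be roughly this (a sketch, assuming six::isTuple is just a PyTuple_Check wrapper left over from the Python 2 days):

#include <pybind11/pybind11.h>
namespace py = pybind11;

// Sketch: replace the six:: shim with a plain pybind11/CPython tuple check.
static bool is_tuple_like(PyObject* obj) {
  return py::isinstance<py::tuple>(py::handle(obj));
  // equivalently: return PyTuple_Check(obj) != 0;
}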
Some not very scientific benchmarking suggests this is something like 40ns overhead per call, where the calls end to end take 2000ns (so like 2% regression or something).
Perf hit sounds fair for the benefit!
test/test_overrides.py
Outdated
        # Fallback
        return torch.tensor(42.0)

    def __index__(self):
I guess __index__ implies __int__ ?
this is an LLM test, I can delete it lol
test/test_overrides.py
Outdated
            return torch.ones_like(args[0])
        return torch.tensor(42.0)

    def __float__(self):
What happens if this doesn't implement __float__ ?
Same question for the __int__ types?
Both when they're first and not first.
I would add error cases for these.
  for (long idx = 0; idx < size; idx++) {
    PyObject* iobj =
        tuple ? PyTuple_GET_ITEM(obj, idx) : PyList_GET_ITEM(obj, idx);
Not sure how to solve this one, but Sam is going to hunt you down: https://py-free-threading.github.io/porting-extensions/#unsafe-apis-returning-borrowed-references
The tuple side is fine, but on the list side you should use PyList_GetItemRef.
But then you need conditional decref and handle early exit properly.
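A sketch of what the PyList_GetItemRef pattern could look like here (assumes CPython 3.13+ or the pythoncapi-compat shim, plus the obj / size / tuple / has_torch_func names from the diff):

for (Py_ssize_t idx = 0; idx < size; idx++) {
  PyObject* iobj = tuple
      ? PyTuple_GET_ITEM(obj, idx)    // borrowed ref; fine, tuples are immutable
      : PyList_GetItemRef(obj, idx);  // new ref; safe under free-threading
  if (!tuple && iobj == nullptr) {
    PyErr_Clear();
    break;  // list shrank concurrently; bail out conservatively
  }
  const bool found = check_has_torch_function(iobj);
  if (!tuple) {
    Py_DECREF(iobj);  // conditional decref: only the list branch owns a reference
  }
  if (found) {
    has_torch_func = true;
    break;
  }
}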
Will you get mad if I just use PySequence LOL
BTW we got a lot of these. Maybe I can ask Codex to fix them:
(pytorch-tmp2) ezyang-mac:pytorch-tmp2 ezyang$ git grep PyList_GET_ITEM
functorch/csrc/dim/dim.cpp: PyObject** begin = &PyList_GET_ITEM(tv.ptr(), 0);
functorch/csrc/dim/minpybind.h: return PyList_GET_ITEM(ptr(), i);
torch/_inductor/codecache.py: void* elem = PyCapsule_GetPointer(PyList_GET_ITEM(pyvec, i), NULL);
torch/_inductor/codegen/cpp_wrapper_cpu.py: lines += f"{output_arg} = reinterpret_cast<AtenTensorHandle>(PyCapsule_GetPointer(PyList_GET_ITEM(py_{buf_name}.get(), {idx}), NULL));\n" # noqa: B950
torch/csrc/autograd/init.cpp: if (!THPVariable_Check(PyList_GET_ITEM(o, i))) {
torch/csrc/autograd/init.cpp: PyList_GET_ITEM(o, i), visit_tensor)) {
torch/csrc/autograd/python_variable.h: PyObject* item = PyList_GET_ITEM(pyresult, i);
torch/csrc/dynamo/python_compiled_autograd.cpp: py::cast<c10::SymInt>(PyList_GET_ITEM(pyresult, idx++)));
torch/csrc/dynamo/python_compiled_autograd.cpp: py::cast<c10::SymInt>(PyList_GET_ITEM(fake_ivalue_args, i)));
torch/csrc/dynamo/python_compiled_autograd.cpp: py::cast<c10::SymFloat>(PyList_GET_ITEM(fake_ivalue_args, i)));
torch/csrc/fx/node.cpp: PyObject* elem = PyList_GET_ITEM(a, i); // borrowed ref
torch/csrc/jit/passes/onnx/shape_type_inference.cpp: auto list_elem = PyList_GET_ITEM(output_obj, 0);
torch/csrc/jit/passes/onnx/shape_type_inference.cpp: list_elem = PyList_GET_ITEM(output_obj, i);
torch/csrc/jit/passes/onnx/shape_type_inference.cpp: PyList_GET_ITEM(output_obj, i),
torch/csrc/jit/passes/onnx/shape_type_inference.cpp: PyList_GET_ITEM(unrolled_dict.ptr(), i),
torch/csrc/python_dimname.cpp: tuple ? PyTuple_GET_ITEM(obj, 0) : PyList_GET_ITEM(obj, 0);
torch/csrc/utils.cpp: tuple ? PyTuple_GET_ITEM(arg, i) : PyList_GET_ITEM(arg, i);
torch/csrc/utils.cpp: tuple ? PyTuple_GET_ITEM(source, idx) : PyList_GET_ITEM(source, idx);
torch/csrc/utils.cpp: tuple ? PyTuple_GET_ITEM(source, idx) : PyList_GET_ITEM(source, idx);
torch/csrc/utils/python_arg_parser.cpp: tuple ? PyTuple_GET_ITEM(obj, idx) : PyList_GET_ITEM(obj, idx);
torch/csrc/utils/python_arg_parser.cpp: tuple ? PyTuple_GET_ITEM(obj, idx) : PyList_GET_ITEM(obj, idx);
torch/csrc/utils/python_arg_parser.cpp: tuple ? PyTuple_GET_ITEM(obj, idx) : PyList_GET_ITEM(obj, idx);
torch/csrc/utils/python_arg_parser.cpp: is_tuple ? PyTuple_GET_ITEM(obj, idx) : PyList_GET_ITEM(obj, idx);
torch/csrc/utils/python_arg_parser.h: : PyList_GET_ITEM(arg.get(), idx);
torch/csrc/utils/python_arg_parser.h: : PyList_GET_ITEM(arg.get(), idx);
torch/csrc/utils/python_arg_parser.h: : PyList_GET_ITEM(arg.get(), idx);
torch/csrc/utils/python_arg_parser.h: : PyList_GET_ITEM(arg.get(), idx);
torch/csrc/utils/python_arg_parser.h: tuple ? PyTuple_GET_ITEM(arg, idx) : PyList_GET_ITEM(arg, idx);
torch/csrc/utils/python_arg_parser.h: tuple ? PyTuple_GET_ITEM(arg, idx) : PyList_GET_ITEM(arg, idx);
torch/csrc/utils/python_arg_parser.h: tuple ? PyTuple_GET_ITEM(arg, idx) : PyList_GET_ITEM(arg, idx);
torch/csrc/utils/python_arg_parser.h: tuple ? PyTuple_GET_ITEM(arg, idx) : PyList_GET_ITEM(arg, idx);
> Will you get mad if I just use PySequence LOL

As long as you only call it for list and tuple, sounds ok to me :)

> BTW we got a lot of these.

Tuple is ok as they're immutable :)
The list ones I thought I went through but I guess I missed them :(
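For completeness, the PySequence variant mentioned above would look something like this (sketch; PySequence_GetItem returns a new reference for both lists and tuples, at the cost of an extra C-API call per element):

PyObject* iobj = PySequence_GetItem(obj, idx);  // new ref for list and tuple alike
if (iobj == nullptr) {
  PyErr_Clear();
  break;  // concurrent mutation or index error; give up conservatively
}
const bool found = check_has_torch_function(iobj);
Py_DECREF(iobj);  // unconditional decref, no tuple/list special-casing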
I chatted with @colesbury about this and he said there's basically three ways we can do it:
- Don't worry about it. (Pretty good option imo)
- Use PyList_GetItemRef instead of PyList_GET_ITEM and handle the refcounting
- Lock the list with Py_BEGIN_CRITICAL_SECTION(pyvec); at the beginning
(1) is the easiest
(3) is probably the most correct because you get a consistent view of the list including the size
For the code here, where we're already penny pinching nanoseconds, it's probably better to do (1)
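Option (3) above would look roughly like this (sketch; Py_BEGIN_CRITICAL_SECTION is a CPython 3.13+ API and essentially compiles away on non-free-threaded builds):

Py_BEGIN_CRITICAL_SECTION(obj);
// While the list is locked, the size and the borrowed item pointers stay
// consistent, so the existing PyList_GET_ITEM loop can remain as-is.
const Py_ssize_t locked_size = PyList_GET_SIZE(obj);
for (Py_ssize_t idx = 0; idx < locked_size; idx++) {
  PyObject* iobj = PyList_GET_ITEM(obj, idx);  // borrowed ref, safe under the lock
  if (check_has_torch_function(iobj)) {
    has_torch_func = true;
    break;
  }
}
Py_END_CRITICAL_SECTION();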
Ok!
  for (Py_ssize_t idx = 0; idx < size; idx++) {
    PyObject* item_ptr =
        is_tuple ? PyTuple_GET_ITEM(obj, idx) : PyList_GET_ITEM(obj, idx);
Same thread safety issue
  auto* obj = PyTuple_GetItem(index_tup.ptr(), i);
- is_tensor_and_append_overloaded(obj, &overridable_args);
+ auto r = is_tensor_and_append_overloaded(obj, &overridable_args);
+ if (!r && PySequence_Check(obj)) {
Are we guaranteed we can't get a Tensor in here? :)
Seems pretty guaranteed to me?
bool is_tensor_and_append_overloaded(
    PyObject* obj,
    std::vector<PyObject*>* overloaded_args) {
  if (THPVariable_CheckExact(obj)) {
    // torch.Tensor instances (not subclasses, except for Parameter)
    return true;
  }
  if (check_has_torch_function(obj, /*ignore_mode*/ true)) {
    // tensor subclasses and unrelated objects with __torch_function__
    append_overloaded_tensor(overloaded_args, obj);
    return true;
  } else if (THPVariable_Check(obj)) {
    // tensor subclasses without __torch_function__
    return true;
  }
  return false;
}
This is waiting for final approval!
Ok!
Let's mark this as BC-breaking so we can nicely track it in the release notes.
@pytorchbot merge

Merge started: Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
  const bool is_tuple = PyTuple_Check(obj);
  const auto size = is_tuple ? PyTuple_GET_SIZE(obj) : PyList_GET_SIZE(obj);
perf nitpick: if we're doing this optimization here we should've hoisted it up to lines 965-968 so we don't hit PySequence_Size. Also should only call PyTuple_Check once total.
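Roughly the shape of the change the nitpick describes (a sketch; the actual follow-up landed as #161998):

// Branch on the container type once, then reuse the result for both the
// size query and the per-element access (avoids PySequence_Size and a
// second PyTuple_Check).
const bool is_tuple = PyTuple_Check(obj);
if (!is_tuple && !PyList_Check(obj)) {
  return false;
}
const Py_ssize_t size =
    is_tuple ? PyTuple_GET_SIZE(obj) : PyList_GET_SIZE(obj);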
sent #161998
This function has come up in DTensor perf work, and I had a nitpick on #160256 so here it is. I have neither compiled nor measured this, but am reasonably confident it's better nonetheless. [ghstack-poisoned]
This function has come up in DTensor perf work, and I had a nitpick on #160256 so here it is. I have neither compiled nor measured this, but am reasonably confident it's better nonetheless. Pull Request resolved: #161998 Approved by: https://github.com/ezyang
This was done exclusively with claude code and I haven't reviewed it yet Signed-off-by: Edward Yang <ezyang@meta.com> ghstack-source-id: f48741d Pull-Request: pytorch#160256
We basically follow the same pattern we do for tensor arguments. The major downside is we now have to traverse the entirety of the int list / etc where previously we didn't have. Benchmark suggests 2% regression for relevant things. Signed-off-by: Edward Yang <ezyang@meta.com> Pull Request resolved: pytorch#160256 Approved by: https://github.com/albanD
Stack from ghstack (oldest at bottom):
We basically follow the same pattern we do for tensor arguments. The major downside is that we now have to traverse the entirety of the int list (etc.), where previously we didn't have to. Benchmarking suggests a 2% regression for the relevant calls.
Signed-off-by: Edward Yang ezyang@meta.com
cc @gchanan