[mypyc] Speed up native-to-native calls using await by JukkaL · Pull Request #19398 · python/mypy · GitHub

JukkaL
Collaborator

@JukkaL JukkaL commented Jul 8, 2025

When calling a native async function using await, e.g. await foo(), avoid raising StopIteration to pass the return value, since this is expensive. Instead, pass an extra PyObject ** argument to the generator helper method and use that to return the return value. This is mostly helpful when there are many calls using await that don't block (e.g. there is a fast path that is usually taken that doesn't block). When awaiting from non-compiled code, the slow path is still taken.
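For context, here is a minimal sketch in plain Python of the standard protocol that the slow path relies on: a coroutine's return value reaches its caller as the `value` attribute of a `StopIteration` exception. The helper name `run_nonblocking` is hypothetical and used only for illustration; it is not part of mypyc, and the sketch does not show the `PyObject **` fast path itself, which lives in the generated C code.

```python
from typing import Any, Coroutine


async def inc(x: int) -> int:
    return x + 1


def run_nonblocking(coro: Coroutine[Any, Any, int]) -> int:
    """Drive a coroutine that never suspends (hypothetical helper)."""
    try:
        # Resuming a coroutine that finishes without awaiting anything
        # blocking raises StopIteration carrying the return value.
        coro.send(None)
    except StopIteration as exc:
        return exc.value
    # The coroutine suspended instead of finishing; this sketch does not
    # handle that case, so clean up and report it.
    coro.close()
    raise RuntimeError("coroutine suspended; only non-blocking calls are handled")


print(run_nonblocking(inc(41)))
```

Constructing, raising, and catching that exception on every call is the cost that the optimization aims to avoid for native-to-native calls.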

This builds on top of #19376.

This PR makes this microbenchmark about 3x faster, which is about the ideal scenario for this optimization:

```
import asyncio
from time import time

async def inc(x: int) -> int:
    return x + 1


async def bench(n: int) -> int:
    x = 0
    for i in range(n):
        x = await inc(x)
    return x

asyncio.run(bench(1000))

t0 = time()
asyncio.run(bench(1000 * 1000 * 200))
print(time() - t0)
```

@JukkaL JukkaL merged commit 4a427e9 into master Jul 8, 2025
13 checks passed
@JukkaL JukkaL deleted the mypyc-await-optimize-2 branch July 8, 2025 16:46
JukkaL added a commit that referenced this pull request Jul 9, 2025
Call the generator helper method directly instead of calling
`PyIter_Next` when calling a native generator from a native function.
This way we can avoid raising StopIteration when the generator is
exhausted. The approach is similar to what I used to speed up calls
using await in #19398. Refer to that PR for a more detailed explanation.

This helps mostly when a generator produces a small number of values,
which is quite common.

This PR improves the performance of this microbenchmark, which is
close to the ideal use case, by about 2.6x (now 5.7x faster than
interpreted):
```
from typing import Iterator

def foo(x: int) -> Iterator[int]:
    for a in range(x):
        yield a

def bench(n: int) -> None:
    for i in range(n):
        for a in foo(1):
            pass

from time import time
bench(1000 * 1000)
t0 = time()
bench(50 * 1000 * 1000)
print(time() - t0)
```
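
As a rough illustration of why short generators benefit most, the sketch below spells out in plain Python what a `for` loop over a generator does under the standard iterator protocol: every exhausted generator ends with one `StopIteration`. The function `consume` is a hypothetical name chosen for this example; it mirrors the interpreted behavior, not the code mypyc generates.

```python
from typing import Iterator, List


def foo(x: int) -> Iterator[int]:
    for a in range(x):
        yield a


def consume(gen: Iterator[int]) -> List[int]:
    """Manual equivalent of `for item in gen` (hypothetical helper)."""
    items: List[int] = []
    while True:
        try:
            item = next(gen)
        except StopIteration:
            # Exhaustion is signalled by an exception, once per generator.
            break
        items.append(item)
    return items


print(consume(foo(1)))
```

With `foo(1)`, the loop yields one value and then pays for one `StopIteration`, so the exception makes up a large share of the per-generator cost; with longer generators that cost is spread over more values.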