KEMBAR78
[mypyc] Faster creation of a tuple from list/tuple by 97littleleaf11 · Pull Request #10217 · python/mypy · GitHub
Skip to content

Conversation

97littleleaf11
Copy link
Collaborator

Description

mypyc: mypyc/mypyc#769

@97littleleaf11
Copy link
Collaborator Author

97littleleaf11 commented Mar 16, 2021

I went through translate_list_comprehension and comprehension_helper. I should write similar code to fill in the blank tuple after I get tuple_op, is it correct?

@JukkaL
Copy link
Collaborator

JukkaL commented Mar 16, 2021

I went through translate_list_comprehension and comprehension_helper. I should write similar code to fill in the blank tuple after I get tuple_op, is it correct?

That's right! It's also possible that you can reuse ForSequence (or ForEnumerate / ForRange) to simplify the generation of the for loop (not sure though whether it's worth it).

@97littleleaf11
Copy link
Collaborator Author

There is another implementation on a new personal branch: 97littleleaf11@4f0b92a


def set_tuple_item() -> None:
e = builder.accept(gen.left_expr)
index_val = builder.accept(index)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this works as expected. The naming is pretty confusing, but I think that this is actually a sequence item (x in for x in my_list), not the index. Actually for_loop_helper doesn't seem to give you access to the sequence index. That probably means that for_loop_helper isn't a good match for this use case, unfortunately.

Directly using ForEnumerate or maybe ForRange could be better options. If using ForEnumerate, you'll have to create a new lvalue for index1 (a register). If using ForRange, you'll need to use something similar to what transform_assignment_stmt does to assign the sequence item to the lvalue(s). The key part is this:

    target = builder.get_assignment_target(lvalue)
    builder.assign(target, rvalue_reg, line)

Copy link
Collaborator Author

@97littleleaf11 97littleleaf11 Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have checked the generated IR, it seems that it dose work. I will add a few IR test later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed above, can you check that this works for types such as List[str]?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I didn't consider that. > <
I would add a new helper function similar to for_loop_helper.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that should work. One option would be to have a similar helper that gives you access both to the index (starting from 0) and the item value. You don't need to support everything that for_loop_helper supports since this helper would be used in more specialized cases.

r1 = PyList_New(0)
r2 = box(tuple[bool, bool, bool, bool], source)
r3 = PyObject_GetIter(r2)
L1:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bugs here.
I found that tuple source would be recognized as tuple instead of builtin_tuple, thus failed in is_tuple_rprimitive (rtype) in specialize.py

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above type annotation should help, but it would be better if the specialization also can be made to work with fixed-length tuples.

You could add a check for RTuple (fixed-length tuple) and also handle that, instead of just supporting variable-length tuples and lists. You may need to coerce it to a variable-length tuple explicitly first, using coerce.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I support RTuple in this PR?

@97littleleaf11
Copy link
Collaborator Author

IR test failed on 32bit

@TH3CHARLie
Copy link
Collaborator

c_pyssize_t has platform-specific size so we use native_int in IR output and it will be replaced with either int32 or int64 in test textual matching. So a quick fix is to replace every int64 in your code with native_int. Later you can try to improve mypyc/mypyc#776 to solve this some-what ugly workflow

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments (not a full review).

Also, can you evaluate the performance impact using a microbenchmark? For example, test something like this (call f many times in a loop):

def f(a: List[int]) -> Tuple[int, ...]):
    return tuple(x + 1 for x in a)

You can also check if this speeds up the nqueens benchmark in https://github.com/mypyc/mypyc-benchmarks. It's okay if there's no difference here, but it would be great if there is.

The above repository has a benchmark runner that can make it easier to measure performance. Just add a microbenchmark and run it using the runbench.py tool (no need to create a PR, you can experiment locally).

r1 = PyList_New(0)
r2 = box(tuple[bool, bool, bool, bool], source)
r3 = PyObject_GetIter(r2)
L1:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above type annotation should help, but it would be better if the specialization also can be made to work with fixed-length tuples.

You could add a check for RTuple (fixed-length tuple) and also handle that, instead of just supporting variable-length tuples and lists. You may need to coerce it to a variable-length tuple explicitly first, using coerce.

@TH3CHARLie
Copy link
Collaborator

I kind of feel this issue is less good-first-issue. But that's good, you get to know more aspects about the codebase and workflow :D

@97littleleaf11 97littleleaf11 marked this pull request as ready for review March 18, 2021 12:28
@97littleleaf11 97littleleaf11 requested a review from JukkaL March 18, 2021 12:54
@97littleleaf11
Copy link
Collaborator Author

97littleleaf11 commented Mar 18, 2021

@benchmark
def tuple_build_test() -> None:
    a = []
    for i in range(1000):
        a.append(i)
    for i in range(1000):
        b = tuple(x + 1 for x in a)

this branch:

interpreted: 0.059748s (avg of 16 iterations; stdev 1.2%)
compiled: 0.017971s (avg of 16 iterations; stdev 3.6%)

compiled is 3.325x faster

master:

interpreted: 0.058382s (avg of 17 iterations; stdev 0.96%)
compiled: 0.025852s (avg of 17 iterations; stdev 2.5%)

compiled is 2.258x faster

@TH3CHARLie
Copy link
Collaborator

you can use the resolve conversation button to make the PR page more concise

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! The performance improvement is great, and we could get even more performance by inlining some more operations (no need to do it now, this is already god). This is almost ready.

Left a few more comments, mostly minor.

return r11


[case testTupleBuiltFromList]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add a test where we execute the code, in mypyc/test-data/run-tuples.test. You can add more sub-cases to the existing testTupleOps test. Note that it uses the default driver, so you can just add test_<whatever> function(s). Test iterating over a list, a tuple and an unsupported iterable (e.g. set).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @JukkaL ! I just add a test function including iterating over a list, a tuple and a RTuple(unsupported).

@97littleleaf11 97littleleaf11 requested a review from JukkaL March 21, 2021 17:29
Copy link
Collaborator

@TH3CHARLie TH3CHARLie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With all minor comments addressed, I think it's LGTM now! Thanks!

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the final updates!

@JukkaL JukkaL merged commit b594e6b into python:master Mar 25, 2021
@97littleleaf11 97littleleaf11 deleted the faster-creation-of-tuple-from-list branch March 26, 2021 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants