GH-111429: Speed up `pathlib.PurePath.[is_]relative_to()` by barneygale · Pull Request #111431 · python/cpython · GitHub

Conversation

@barneygale
Contributor

@barneygale barneygale commented Oct 28, 2023

Avoid unnecessary calls to with_segments(). This makes both is_relative_to() and relative_to() faster when passed a PurePath object, and makes relative_to() faster when passed another kind of path-like object (like a str).

Also, use _from_parsed_parts() in relative_to() to return a pre-parsed path. Operations like str(p.relative_to(q)) are faster as a result.
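For readers unfamiliar with the two methods, here is a minimal sketch of the call patterns this description refers to. It uses only the public `pathlib` API and shows behaviour, not the internal change; `PurePosixPath` is chosen here (an assumption for illustration) so that the output is the same on every platform.

```python
from pathlib import PurePosixPath

p = PurePosixPath("foo/bar")

# Case 1: the argument is already a PurePath object.
# The description says both methods get faster for this kind of argument.
q = PurePosixPath("foo")
print(p.is_relative_to(q))    # True
print(str(p.relative_to(q)))  # bar

# Case 2: the argument is another path-like object, here a plain str.
# The description says relative_to() gets faster for this kind of argument.
print(p.is_relative_to("foo"))    # True
print(str(p.relative_to("foo")))  # bar

# A path that is not under the other one: is_relative_to() returns False,
# while relative_to() raises ValueError.
print(p.is_relative_to("baz"))    # False
```

The results are identical in both cases; only the amount of work done to produce them differs.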

@Jason-Y-Z
Contributor

Thanks for the change! Overall LGTM. One small suggestion: would you mind profiling these two functions before and after the change, so that we can better understand its effect?
Something like timeit might be helpful for that.

@barneygale
Contributor Author

barneygale commented Oct 29, 2023

The improvement depends on the type of the argument, the number of segments in each path, and, in the case of relative_to(), how the result is used. So take these numbers with a pinch of salt; the important bit is that some things are faster and nothing is slower:

$ ./python -m timeit \
    -s 'from pathlib import Path; p0 = Path("foo/bar"); p1 = Path("foo")' \
    'str(p0.relative_to(p1))'
10000 loops, best of 5: 20.3 usec per loop  # before
50000 loops, best of 5: 9.13 usec per loop  # after

$ ./python -m timeit \
    -s 'from pathlib import Path; p0 = Path("foo/bar"); p1 = "foo"' \
    'str(p0.relative_to(p1))'
10000 loops, best of 5: 20.4 usec per loop  # before
20000 loops, best of 5: 14.5 usec per loop  # after

$ ./python -m timeit \
    -s 'from pathlib import Path; p0 = Path("foo/bar"); p1 = Path("foo")' \
    'p0.is_relative_to(p1)'
50000 loops, best of 5: 9.01 usec per loop  # before
50000 loops, best of 5: 4.15 usec per loop  # after

$ ./python -m timeit \
    -s 'from pathlib import Path; p0 = Path("foo/bar"); p1 = "foo"' \
    'p0.is_relative_to(p1)'
50000 loops, best of 5: 9.04 usec per loop  # before
50000 loops, best of 5: 8.91 usec per loop  # after
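The four measurements above can also be collected from a single script using the standard-library `timeit` module. This is only a sketch: the `bench` helper, the `cases` dictionary, and the loop counts are assumptions made for illustration, and the absolute timings will vary by machine and Python build. Run it once on each build to get a before/after comparison.

```python
import timeit
from pathlib import Path

p0 = Path("foo/bar")
p1_path = Path("foo")  # argument that is already a path object
p1_str = "foo"         # argument that is a plain str

# The same four statements that were timed on the command line.
cases = {
    "str(p0.relative_to(Path))": lambda: str(p0.relative_to(p1_path)),
    "str(p0.relative_to(str))": lambda: str(p0.relative_to(p1_str)),
    "p0.is_relative_to(Path)": lambda: p0.is_relative_to(p1_path),
    "p0.is_relative_to(str)": lambda: p0.is_relative_to(p1_str),
}


def bench(func, number=2000, repeat=5):
    """Return the best time per call in microseconds (hypothetical helper)."""
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e6


for label, func in cases.items():
    print(f"{label}: {bench(func):.2f} usec per loop")
```

Taking the minimum over several repeats mirrors the "best of 5" figure that `python -m timeit` reports.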

@Jason-Y-Z
Contributor

@pitrou Sorry for tagging, but based on contributor history, would you mind giving a quick review?

@barneygale barneygale merged commit d7cef7b into python:main Nov 12, 2023
@barneygale
Contributor Author

Thanks for reviewing @Jason-Y-Z!


Labels: performance, topic-pathlib
