gh-117636: Remove redundant type check in `os.path.join()` by nineteendo · Pull Request #117638 · python/cpython · GitHub

Conversation

@nineteendo
Contributor

@nineteendo nineteendo commented Apr 8, 2024

Benchmark:

ntpath.py

script
::test.bat
@echo off
echo 1 item && python -m timeit -s "import before.ntpath" "before.ntpath.join('foo')" && python -m timeit -s "import after.ntpath" "after.ntpath.join('foo')"
echo 10 items && python -m timeit -s "import before.ntpath; paths = ['foo'] * 10" "before.ntpath.join(*paths)" && python -m timeit -s "import after.ntpath; paths = ['foo'] * 10" "after.ntpath.join(*paths)"
echo 100 items && python -m timeit -s "import before.ntpath; paths = ['foo'] * 100" "before.ntpath.join(*paths)" && python -m timeit -s "import after.ntpath; paths = ['foo'] * 100" "after.ntpath.join(*paths)"
1 item
500000 loops, best of 5: 699 nsec per loop # before
500000 loops, best of 5: 648 nsec per loop # after
# -> 1.08x faster
10 items
50000 loops, best of 5: 5.48 usec per loop # before
50000 loops, best of 5: 5.52 usec per loop # after
# -> no difference
100 items
5000 loops, best of 5: 54.1 usec per loop # before
5000 loops, best of 5: 54 usec per loop # after
# -> no difference

posixpath.py

script
# test.sh
echo 1 item && python -m timeit -s "import before.posixpath" "before.posixpath.join('foo')" && python -m timeit -s "import after.posixpath" "after.posixpath.join('foo')"
echo 10 items && python -m timeit -s "import before.posixpath; paths = ['foo'] * 10" "before.posixpath.join(*paths)" && python -m timeit -s "import after.posixpath; paths = ['foo'] * 10" "after.posixpath.join(*paths)"
echo 100 items && python -m timeit -s "import before.posixpath; paths = ['foo'] * 100" "before.posixpath.join(*paths)" && python -m timeit -s "import after.posixpath; paths = ['foo'] * 100" "after.posixpath.join(*paths)"
1 item
1000000 loops, best of 5: 335 nsec per loop # before
1000000 loops, best of 5: 271 nsec per loop # after
# -> 1.24x faster
10 items
50000 loops, best of 5: 3.43 usec per loop # before
100000 loops, best of 5: 3.35 usec per loop # after
# -> 1.02x faster
100 items
10000 loops, best of 5: 34.4 usec per loop # before
10000 loops, best of 5: 34.1 usec per loop # after
# -> no difference
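The speedup figures quoted above are the ratio of the "before" time to the "after" time. A minimal sketch of that arithmetic, using the timings copied from the output above (the `timings` table and the `speedup` helper are illustrative names, not part of the PR):

```python
# Timings copied from the timeit output above, converted to nanoseconds per loop.
# Keys are (module, number of path items); values are (before, after).
timings = {
    ("ntpath", 1): (699, 648),
    ("ntpath", 10): (5480, 5520),
    ("ntpath", 100): (54100, 54000),
    ("posixpath", 1): (335, 271),
    ("posixpath", 10): (3430, 3350),
    ("posixpath", 100): (34400, 34100),
}


def speedup(before, after):
    """Return how many times faster 'after' is than 'before'."""
    return before / after


for (module, items), (before, after) in timings.items():
    print(f"{module}, {items} item(s): {speedup(before, after):.2f}x")
```

Ratios this close to 1.00 are likely within run-to-run noise for `timeit`, which is consistent with the "no difference" labels above.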

@nineteendo nineteendo marked this pull request as ready for review April 9, 2024 05:52
@nineteendo
Contributor Author

@serhiy-storchaka, you're the one who added this check, so you can probably best evaluate if it's still needed.

@serhiy-storchaka
Member

The first check was for cases like join(None) or join(['a', 'b']). The second check was for cases like join('', b'a'), join(b'', 'a'), join(None, 'a'), or something similar.

It seems that after adding fspath() they are no longer needed.
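A quick way to see this is to run the cases listed above and confirm that each one is rejected with a `TypeError`. The sketch below uses `posixpath` directly so it behaves the same on any platform; the `raises_type_error` helper is a hypothetical name introduced only for this illustration, and the exact error messages may differ between Python versions.

```python
import os
import posixpath


def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError, else False."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


# os.fspath() itself rejects anything that is not str, bytes or os.PathLike.
print(raises_type_error(os.fspath, None))
print(raises_type_error(os.fspath, ["a", "b"]))

# The cases mentioned in the comment above.
cases = [
    (None,),
    (["a", "b"],),
    ("", b"a"),
    (b"", "a"),
    (None, "a"),
]
for args in cases:
    print(args, raises_type_error(posixpath.join, *args))

# Well-typed input still joins normally.
print(posixpath.join("a", "b"))
```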

@erlend-aasland erlend-aasland linked an issue Apr 9, 2024 that may be closed by this pull request
@nineteendo
Contributor Author

nineteendo commented Apr 10, 2024

Apparently this unnecessary concatenation has been here since the first iteration of posixpath! So it wasn't even intended as a type check:

cpython/Lib/posixpath.py

Lines 11 to 14 in c636014

def cat(a, b):
    if b[:1] = '/': return b
    if a = '' or a[-1:] = '/': return a + b
    return a + '/' + b
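The quoted snippet appears to use the very early Python syntax in which `=` also served as the comparison operator, so it will not run on a modern interpreter. As a rough transliteration into current syntax (an assumption about the intended behaviour, not code from the repository), it could be written as:

```python
def cat(a, b):
    """Join two POSIX path components, mirroring the historic helper."""
    # An absolute second component replaces the first one entirely.
    if b[:1] == '/':
        return b
    # No separator is needed when 'a' is empty or already ends with one.
    # When 'a' is empty, 'a + b' is the concatenation that does no real work.
    if a == '' or a[-1:] == '/':
        return a + b
    return a + '/' + b


print(cat('usr', 'lib'))
print(cat('usr/', 'lib'))
print(cat('usr', '/lib'))
print(cat('', 'lib'))
```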

Contributor

@hauntsaninja hauntsaninja left a comment

Thank you!

@nineteendo
Contributor Author

@serhiy-storchaka, can this be merged?

@hauntsaninja hauntsaninja merged commit 9ee94d1 into python:main Apr 14, 2024
@nineteendo nineteendo deleted the speedup-os.path.join branch April 14, 2024 21:11
diegorusso pushed a commit to diegorusso/cpython that referenced this pull request Apr 17, 2024
Development

Successfully merging this pull request may close these issues.

Remove redundant type check in os.path.join()

3 participants