performance: Update io.DEFAULT_BUFFER_SIZE to make python IO faster? · Issue #117151 · python/cpython · GitHub


@morotti

Bug report

Bug description:

Hello,

I was doing some benchmarking of python and package installation.
That got me down a rabbit hole of buffering optimizations between pip, requests, urllib and the cpython interpreter.

TL;DR I would like to discuss updating the value of io.DEFAULT_BUFFER_SIZE. It has been set to 8192 for 16 years.
original commit: https://github.com/python/cpython/blame/main/Lib/_pyio.py#L27

It was a reasonable size given the hardware and operating systems of the time, but it's far from optimal today.
Remember, in 2008 you'd run a 32-bit operating system with less than 2 GB of memory, shared between all running applications.
Buffers had to be small, a few kB; a buffer measured in whole MB was inconceivable.

I will attach benchmarks in the next messages showing 3 to 5 times write performance improvement when adjusting the buffer size.
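
In the meantime, here is a minimal sketch of how the effect of the buffer size on writes could be timed. It is not the benchmark referred to above; the function name `time_small_writes`, the chunk size and the write count are assumptions chosen for illustration, and the numbers will vary with hardware, filesystem and OS cache state.

```python
import os
import tempfile
import time


def time_small_writes(buffer_size, chunk=b"x" * 1024, count=100_000):
    """Return the seconds taken to write `count` chunks to a temporary file
    opened with an explicit buffer of `buffer_size` bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        start = time.perf_counter()
        # buffering=N (N > 1) asks open() for a buffered writer with an N-byte buffer.
        with open(path, "wb", buffering=buffer_size) as f:
            for _ in range(count):
                f.write(chunk)
        return time.perf_counter() - start
    finally:
        os.remove(path)
```

Calling it in a loop over, say, 4096, 8192, 65536 and 262144 gives a rough comparison. It only measures buffered writes that land in the OS page cache, so it says nothing about fsync or slow devices.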

I think the python interpreter can adopt a default buffer size somewhere between 64k and 256k.
I think 64k is the minimum for python and should be a safe value to move to.
Higher is better for performance in most cases, though there may be some cases where it's unwanted
(seeks and small reads/writes, unwanted triggering of write ahead, slow devices with throughput measured in kB/s where you don't want to block for long).

In addition, I think there is a bug in open() on Linux.
open() sets the buffer size to the device block size on Linux when available (st_blksize, 4k on most disks), instead of io.DEFAULT_BUFFER_SIZE=8k.
I believe this is unwanted behavior: the block size is the minimal size for IO operations on the device, not the optimal size, and it should not be preferred.
I think open() on Linux should be corrected to use a default buffer size of max(st_blksize, io.DEFAULT_BUFFER_SIZE) instead of st_blksize?
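
To make the proposal concrete, a small sketch of the suggested rule written as user-level Python. The helper name `proposed_buffer_size` is hypothetical; this is not how open() is implemented, only the max() rule from the line above.

```python
import io
import os


def proposed_buffer_size(path):
    """Buffer size under the proposed rule: max(st_blksize, io.DEFAULT_BUFFER_SIZE)."""
    # st_blksize is not available on every platform (e.g. Windows),
    # so fall back to 0 and let io.DEFAULT_BUFFER_SIZE win.
    blksize = getattr(os.stat(path), "st_blksize", 0)
    return max(blksize, io.DEFAULT_BUFFER_SIZE)
```

Until the default changes, the same value can be passed explicitly, e.g. `open(path, "rb", buffering=proposed_buffer_size(path))`.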

Related, the doc might be misleading for saying st_blksize is the preferred size for efficient I/O. https://github.com/python/cpython/blob/main/Doc/library/os.rst#L3181
The GNU doc was updated to clarify: "This is not guaranteed to give optimum performance" https://www.gnu.org/software/gnulib/manual/html_node/stat_002dsize.html

Thoughts?

Annex: some historical context and technical considerations around buffering.

On the hardware side:

  • HDDs historically had 512-byte blocks, then moved to 4096-byte blocks in the 2010s.
  • SSDs have 4096-byte blocks as far as I know.

On filesystems:

  • buffer size should never be smaller than device and filesystem blocksize
  • I think ext3, ext4, xfs, ntfs, etc. follow the device block size of 4k by default, though they can be configured for other block sizes.
  • NTFS is capped at a 16TB maximum disk size with 4k blocks.
  • Microsoft recommends a 64k block size for Windows Server 2019+ and larger disks https://learn.microsoft.com/en-us/windows-server/storage/file-server/ntfs-overview
  • RAID setups and similar with zfs/btrfs/xfs can have a custom block size, I think anywhere from 4kB to 1MB. I don't know if there is any consensus; I think 16k, 32k, 64k and 128k can all be seen in the wild.

On network filesystems:

  • shared network home directories are common on linux (NFS share) and windows (SMB share).
  • enterprise storage vendors like Pure/Vast/NetApp recommend 524288 or 1048576 bytes for IO.
  • see rsize wsize in mount settings:
  • host:path on path type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,acregmin=60,acdirmin=60,hard,proto=tcp,nconnect=8,mountproto=tcp, ...)
  • for windows I cannot find documentation for network clients, though the windows server should have the NTFS filesystem with at least 64k block size as per microsoft recommendation above.
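
As an illustration of how an application could line its buffer up with the mount settings shown above, here is a rough sketch that pulls wsize out of a mount option string. The helper names `parse_mount_options` and `nfs_write_buffer_size` are made up for this example, and how the option string is obtained (e.g. from the output of mount) is left out.

```python
def parse_mount_options(options):
    """Parse a comma-separated mount option string into a dict.

    Flags without a value (e.g. "rw") map to None.
    """
    parsed = {}
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else None
    return parsed


def nfs_write_buffer_size(options, default=8192):
    """Return wsize from the mount options as an int, or `default` if it is absent."""
    wsize = parse_mount_options(options).get("wsize")
    return int(wsize) if wsize else default
```

The result can then be passed to open(), for example `open(path, "wb", buffering=nfs_write_buffer_size(opts))`.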

On pipes:

  • buffering is used by pipes and for interprocess communications. see subprocess.py
  • POSIX guarantees that writes to pipes are atomic up to PIPE_BUF bytes. PIPE_BUF is 4096 on Linux, and POSIX guarantees it is at least 512.
  • Python had a default of io.DEFAULT_BUFFER_SIZE=8192 so it never benefitted from that atomic property :D
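
For reference, a short sketch of what staying within the atomic limit looks like from Python. `select.PIPE_BUF` exists on Unix only, so the 512-byte fallback and the helper name `split_for_atomic_pipe_writes` are assumptions of this example.

```python
import select

# select.PIPE_BUF is only defined on Unix; 512 is the POSIX minimum, used as a fallback.
PIPE_BUF = getattr(select, "PIPE_BUF", 512)


def split_for_atomic_pipe_writes(data, limit=PIPE_BUF):
    """Split bytes into pieces of at most `limit` bytes, so that each piece
    fits in a single write of PIPE_BUF bytes or less."""
    return [data[i:i + limit] for i in range(0, len(data), limit)]
```

The atomicity only holds if each piece is handed to the kernel in its own write call, e.g. `os.write(fd, piece)` on the pipe's file descriptor; a buffered writer may merge or split pieces again.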

on compression code, they probably all need to be adjusted:

On network IO:

  • On Linux, TCP read and write buffers were a minimum of 16k historically. The read buffer was increased to 64k in kernel v4.20 (2018).
  • the buffer is resized dynamically with the TCP window, up to 4MB for writes and 6MB for reads. Let's not get into TCP. See sysctl_tcp_rmem and sysctl_tcp_wmem.
  • linux code: https://github.com/torvalds/linux/blame/master/net/ipv4/tcp.c#L4775
  • commit Sep 2018: torvalds/linux@a337531
  • I think socket buffers are managed separately by the kernel. io.DEFAULT_BUFFER_SIZE matters when you read a file and write to the network, or read from the network and write to a file.
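
The file-to-network (or network-to-file) case boils down to a copy loop where the chunk size is the knob. A minimal sketch, assuming any pair of binary file-like objects; the name `copy_stream` and the 64k floor are choices made for this example, not an existing API.

```python
import io


def copy_stream(src, dst, chunk_size=max(io.DEFAULT_BUFFER_SIZE, 64 * 1024)):
    """Copy everything from `src` to `dst` in reads of `chunk_size` bytes.

    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total
```

The standard library offers the same loop as `shutil.copyfileobj(fsrc, fdst, length)`, where `length` plays the role of `chunk_size`.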

on HTTP, a large subset of networking:

note to self: remember to publish code and result in next message

CPython versions tested on:

3.11

Operating systems tested on:

Other

Labels

performance, stdlib, topic-IO
