Off-by-one memory error in a string fastsearch since 3.11 · Issue #105235 · python/cpython · GitHub
Off-by-one memory error in a string fastsearch since 3.11 #105235

@bacher09

Bug report

This bug occurs at Objects/stringlib/fastsearch.h:589 while matching the last symbol of the searched buffer. In some cases it causes a crash, but it is hard to reproduce: for the crash to happen, the last symbol must be the last byte of its memory page, and the next page must either not be readable or not be mapped contiguously with the previous one.
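
To illustrate the kind of read involved, here is a minimal Python sketch of a skip loop that peeks at the element just past the current window without a bounds check. It is a simplified model, not CPython's code: `max_index_read` is a hypothetical helper, a plain `set` stands in for the bloom mask used by `STRINGLIB_BLOOM`, and candidate matches are handled more simply than in the real loop.

```python
def max_index_read(haystack: bytes, needle: bytes) -> int:
    """Return the highest haystack index a simplified skip loop touches.

    Hypothetical model for illustration only; an index equal to
    len(haystack) means the loop looked one element past the end.
    """
    n, m = len(haystack), len(needle)
    w = n - m                      # last valid window start
    in_needle = set(needle)        # stand-in for the bloom mask
    highest = -1
    i = 0
    while i <= w:
        # The window haystack[i : i + m] is always in bounds.
        highest = max(highest, i + m - 1)
        if haystack[i:i + m] == needle:
            return highest
        # Peek at the element right after the window (ss[i+1] in the C code).
        # Nothing here checks that i + m < n.
        highest = max(highest, i + m)
        peek = haystack[i + m] if i + m < n else None
        if peek not in in_needle:
            i += m                 # the peeked element cannot start a match
        i += 1
    return highest


print(max_index_read(bytes(8), b"fo"))  # 8, one past the last valid index 7
print(max_index_read(bytes(7), b"fo"))  # 5, stays in bounds
```

In this model, with an all-zero haystack, the peek lands on index `n` exactly when `n - m` is a multiple of `m + 1`, which holds for `n = 8388608` and `m = 2`.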

The simplest script that reproduces the bug for me is:

import mmap

def bug():
    with open("file.tmp", "wb") as f:
        # this is the smallest size (8 MiB) that triggers the bug for me
        f.write(bytes(8388608))

    with open("file.tmp", "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as fm:
            with open("/proc/self/maps", "rt") as f:
                print(f.read())

            # this triggers the bug
            res = fm.find(b"fo")

if __name__ == "__main__":
    bug()

But since the result of this script depends on the file system, the kernel, and perhaps even the moon phase 😄, here's a much more reliable way to reproduce it:

import mmap

def read_maps():
    with open("/proc/self/maps", "rt") as f:
        return f.read()

def bug():
    prev_map = frozenset(read_maps().split('\n'))
    new_map = None
    for i in range(0, 2049):
        # guard mapping: an inaccessible memory region (prot=0)
        with mmap.mmap(0, 4096 * (i + 1), flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, prot=0) as guard:
            with mmap.mmap(0, 8388608 + 4096 * i, flags=mmap.MAP_ANONYMOUS | mmap.MAP_PRIVATE, prot=mmap.PROT_READ) as fm:
                new_map = frozenset(read_maps().split('\n'))
                for diff in new_map.difference(prev_map):
                    print(diff)

                prev_map = new_map
                # this crashes
                fm.find(b"fo")
                print("---")

if __name__ == "__main__":
    bug()

This reproduces the bug in every Linux environment that I've tried. It uses a trick with an inaccessible memory region to increase the chances of hitting the bug, and it uses no files, to speed things up.
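
To check whether a given address is readable according to `/proc/self/maps`, something like the following could be used. This is only a sketch: `is_readable` is a hypothetical helper, and the sample text is made up to resemble a readable 8 MiB mapping followed directly by a guard region (the real output will have other addresses and fields).

```python
def is_readable(maps_text: str, addr: int) -> bool:
    """Return True if addr lies inside a mapping whose permissions include 'r'."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip empty or malformed lines
        # First field is "start-end" in hex, second is the permission string.
        start, end = (int(x, 16) for x in fields[0].split("-"))
        if start <= addr < end:
            return "r" in fields[1]
    return False  # address is not mapped at all


# Made-up sample: readable region, then an inaccessible guard right after it.
sample = (
    "7ffff6a00000-7ffff7200000 r--p 00000000 00:00 0\n"
    "7ffff7200000-7ffff7201000 ---p 00000000 00:00 0\n"
)

print(is_readable(sample, 0x7ffff71fffff))  # True: last byte of the readable region
print(is_readable(sample, 0x7ffff7200000))  # False: first byte of the guard
```
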
Here's some extra info from GDB:

Program received signal SIGSEGV, Segmentation fault.
0x000055555570ba81 in stringlib_default_find (s=0x7ffff6a00000 "", n=8388608, p=0x7ffff745a3e0 "fo", m=2, maxcount=-1, mode=1)
    at Objects/stringlib/fastsearch.h:589
589                 if (!STRINGLIB_BLOOM(mask, ss[i+1])) {
(gdb) pipe info proc mappings | grep -A 1 -B 1 file.tmp
      0x555555cb4000     0x555555d66000    0xb2000        0x0  rw-p   [heap]
      0x7ffff6a00000     0x7ffff7200000   0x800000        0x0  r--s   /home/slava/src/cpython/python_bug/file.tmp
      0x7ffff7400000     0x7ffff7600000   0x200000        0x0  rw-p   
(gdb) p &ss[i]
$1 = 0x7ffff71fffff ""
(gdb) p &ss[i + 1]
$2 = 0x7ffff7200000 <error: Cannot access memory at address 0x7ffff7200000>
(gdb) p i
$3 = 8388606
(gdb) p ss
$4 = 0x7ffff6a00001 ""
(gdb) p s
$5 = 0x7ffff6a00000 ""
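
The addresses above can be checked with a little arithmetic. The sketch below assumes one-byte elements and that `ss` is `s + m - 1`, which is consistent with the `p ss` and `p s` output; `address_read` is a hypothetical helper name.

```python
def address_read(s: int, m: int, i: int) -> int:
    """Address of ss[i + 1], assuming ss = s + m - 1 and 1-byte elements."""
    ss = s + m - 1
    return ss + i + 1


# Values taken from the GDB session above.
s = 0x7ffff6a00000   # start of the buffer
n = 8388608          # buffer length
m = 2                # pattern length
i = 8388606          # loop index at the time of the crash

addr = address_read(s, m, i)
print(hex(addr))      # 0x7ffff7200000
print(addr == s + n)  # True: exactly one byte past the end of the buffer
```

The result equals the end address of the `file.tmp` mapping shown by `info proc mappings`, so the read falls on the first byte after the mapped region.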

Your environment

  • CPython 3.11.3
  • OS: Linux 6.1 (but it should be OS independent)

I've also tried a slightly modified version of the script on OS X, and it crashes there as well.

cc @sweeneyde (since you are the author of d01dceb and 6ddb09f).

Labels

  • 3.11: only security fixes
  • 3.12: only security fixes
  • 3.13: bugs and security fixes
  • interpreter-core: (Objects, Python, Grammar, and Parser dirs)
  • type-bug: An unexpected behavior, bug, or error