Python's pathlib module
Trey Hunner
12 min. read • Python 3.9—3.13 • Nov. 18, 2024
Does your Python code need to work with file paths?
You should consider using pathlib.
I now use pathlib for nearly all file-related code in Python, especially when I need to
construct or deconstruct file paths or ask questions of file paths.
I'd like to make the case for Python's pathlib module... but first let's look at a cheat
sheet of common path operations.
A pathlib cheat sheet
Below is a cheat sheet table of common pathlib.Path operations.
The variables used in the table are defined here:
>>> from pathlib import Path
>>> path = Path("/home/trey/proj/readme.md")
>>> relative = Path("readme.md")
>>> base = Path("/home/trey/proj")
>>> new = Path("/home/trey/proj/sub")
>>> home = Path("/home/")
>>> target = path.with_suffix(".txt")               # .md -> .txt
>>> pattern = "*.md"
>>> name = "sub/f.txt"
Path-related task          pathlib approach          Example
Read all file contents     path.read_text()          'Line 1\nLine 2\n'
Write file contents        path.write_text('new')    Writes new to file
Get absolute path          relative.resolve()        Path('/home/trey/proj/readme.md')
Get the filename           path.name                 'readme.md'
Get parent directory       path.parent               Path('/home/trey/proj')
Get file extension         path.suffix               '.md'
Get suffix-free name       path.stem                 'readme'
Ancestor-relative path     path.relative_to(base)    Path('readme.md')
Verify path is a file      path.is_file()            True
Verify path is directory   path.is_dir()             False
Make new directory         new.mkdir()               Makes new directory
Get current directory      Path.cwd()                Path('/home/trey/proj')
Get home directory         Path.home()               Path('/home/trey')
Get ancestor paths         path.parents              [Path('/home/trey/proj'), ...]
List files/directories     home.iterdir()            [Path('/home/trey')]
Find files by pattern      base.glob(pattern)        [Path('/home/trey/proj/readme.md')]
Find files recursively     base.rglob(pattern)       [Path('/home/trey/proj/readme.md')]
Join path parts            base / name               Path('/home/trey/proj/sub/f.txt')
Get file size (bytes)      path.stat().st_size       14
Walk the file tree         base.walk()               Iterable of (path, subdirs, files)
Rename path                path.rename(target)       Path object for new path
Remove file                path.unlink()
Note that iterdir, glob, rglob, and walk all return iterators. The examples above
show lists for convenience.
The open function accepts Path objects
What does Python's open function accept?
You might think open accepts a string representing a filename. And you'd be right.
The open function does accept a filename:
filename = "example.txt"
with open(filename) as file:
    contents = file.read()
But open will also accept pathlib.Path objects:
from pathlib import Path
path = Path("example.txt")
with open(path) as file:
    contents = file.read()
Although the specific example here could be replaced with a method call instead of
using open at all:
from pathlib import Path
path = Path("example.txt")
contents = path.read_text()
A pathlib.Path object represents a file path. In my humble opinion, you
should use Path objects anywhere you work with file paths. Python's Path objects
make it easier to write cross-platform compatible code that works well with filenames in
various formats (both / and \ are handled appropriately).
Why use a pathlib.Path instead of a string?
Why use a Path object to represent a file path instead of using a string?
Well, consider these related questions:
    Why use a datetime.timedelta object instead of an integer?
    Why use a datetime.datetime object instead of a string?
    Why use True and False instead of 1 and 0?
    Why use a decimal.Decimal object instead of an integer?
    Why use a class instance instead of a tuple?
Specialized objects exist to make specialized operations easier.
Python's pathlib.Path objects make performing many common path-related
operations easy.
Which operations are easier? That's what the rest of this article is all about.
The basics: constructing paths with pathlib
You can turn a string into a pathlib.Path object by passing it to the Path class:
>>> filename = "/home/trey/.my_config.toml"
>>> path = Path(filename)
>>> path
PosixPath('/home/trey/.my_config.toml')
But you can also pass a Path object to Path:
>>> file_or_path = Path("example.txt")
>>> path = Path(file_or_path)
>>> path
PosixPath('example.txt')
So if you'd like your code to accept both strings and Path objects, you can normalize
everything to pathlib land by passing the given string/path to pathlib.Path.
Note: The Path class returns an instance of
either PosixPath or WindowsPath depending on whether your code is running on
Windows.
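As a minimal sketch of that normalization, a function can accept either type and convert once at the top. The function name read_config and the file name used here are hypothetical, made up purely for illustration:

```python
from pathlib import Path
import tempfile


def read_config(file_or_path):
    """Return the text of a config file given a string or a Path."""
    path = Path(file_or_path)  # Works whether we were given a str or a Path
    return path.read_text()


# Demonstrate with a throwaway file (hypothetical name and contents)
with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "example.toml"
    config.write_text("debug = true\n")
    from_str = read_config(str(config))
    from_path = read_config(config)
```

Both calls return the same text, so callers never need to care which type they pass in.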
Joining paths
One of the most common path-related operations is to join path fragments together.
For example, we might want to join a directory path to a filename to get a full file path.
There are a few different ways to join paths with pathlib.
We'll look at each, by using this Path object which represents our home directory:
>>> from pathlib import Path
>>> home = Path.home()
>>> home
PosixPath('/home/trey')
The joinpath method
You can join paths together using the joinpath method:
>>> home.joinpath(".my_config.toml")
PosixPath('/home/trey/.my_config.toml')
The / operator
Path objects also override the / operator to join paths:
>>> home / ".my_config.toml"
PosixPath('/home/trey/.my_config.toml')
The Path initializer
Passing multiple arguments to the Path class will also join those paths together:
>>> Path(home, ".my_config.toml")
PosixPath('/home/trey/.my_config.toml')
This works for both strings and Path objects. So if the object I'm working with could be
either a Path or a string, I'll often join by passing all objects into Path instead:
>>> config_location = "/home/trey"
>>> config_path = Path(config_location, ".my_config.toml")
>>> config_path
PosixPath('/home/trey/.my_config.toml')
I prefer the / operator over joinpath
Personally, I really appreciate the / operator overloading. I pretty much never
use joinpath, as I find using / makes for some pretty readable code (once you get
used to it):
>>> BASE_PATH = Path.cwd()
>>> BASE_PATH / "templates"
PosixPath('/home/trey/proj/templates')
If you find the overloading of the / operator odd, stick with the joinpath method.
Which you use is a matter of personal preference.
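Here's a small sketch showing that the three joining approaches above produce equal paths. It uses PurePosixPath (a pathlib class that never touches the filesystem) only so that the result looks the same on every operating system; the nested path is a made-up example:

```python
from pathlib import PurePosixPath

# PurePosixPath keeps the output identical on Windows, Linux, and Mac
home = PurePosixPath("/home/trey")

with_method = home.joinpath(".my_config.toml")
with_operator = home / ".my_config.toml"
with_initializer = PurePosixPath(home, ".my_config.toml")

# Several segments can be joined in a single expression
nested = home / "proj" / "templates" / "home.html"
print(with_method == with_operator == with_initializer)
print(nested)
```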
Current working directory
Need to get the current working directory? Path() and Path('.') both work:
>>> Path()
PosixPath('.')
>>> Path(".")
PosixPath('.')
However, I prefer Path.cwd(), which is a bit more explicit and it returns an absolute
path:
>>> Path.cwd()
PosixPath('/home/trey')
Absolute paths
Many questions you might ask of a path (like getting its directory) require absolute
paths. You can make your path absolute by calling the resolve() method:
>>> path = Path("example.txt")
>>> full_path = path.resolve()
>>> full_path
PosixPath('/home/trey/example.txt')
There's also an absolute method, but it doesn't transform .. parts into references to
the parent directory or resolve symbolic links:
>>> config_up_one_dir = Path("../.editorconfig")
>>> config_up_one_dir.absolute()
PosixPath('/home/trey/../.editorconfig')
>>> config_up_one_dir.resolve()
PosixPath('/home/trey/.editorconfig')
Most of the time you find yourself using absolute(),
you probably want resolve() instead.
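To see the difference without depending on your own directory layout, here's a rough sketch that builds a path containing .. inside a temporary directory (the directory and file names are assumptions for the demo):

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    (base / "sub").mkdir()
    dotted = base / "sub" / ".." / "example.txt"

    kept = dotted.absolute()       # The ".." part is left in place
    collapsed = dotted.resolve()   # The ".." part is resolved away

    print(kept)
    print(collapsed)
```

The exact paths printed will depend on where your system keeps temporary files.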
Splitting up paths with pathlib
We commonly need to split up a filepath into parts.
>>> full_path
PosixPath('/home/trey/example.txt')
Need just the filename for your filepath? Use the name attribute:
>>> full_path.name
'example.txt'
Need to get the directory a file is in? Use the parent attribute:
>>> full_path.parent
PosixPath('/home/trey')
Need a file extension? Use the suffix attribute:
>>> full_path.suffix
'.txt'
Need the part of a filename that doesn't include the extension? There's a stem attribute:
>>> full_path.stem
'example'
But if you're using stem to change the extension, use the with_suffix method instead:
>>> full_path.with_suffix(".md")
PosixPath('/home/trey/example.md')
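One detail worth knowing: suffix and stem only consider the last extension. The sketch below uses a hypothetical backup.tar.gz filename to show that, along with the suffixes attribute and the with_name method, which weren't shown above:

```python
from pathlib import PurePosixPath

archive = PurePosixPath("/home/trey/backup.tar.gz")

last_extension = archive.suffix      # Only the final extension
all_extensions = archive.suffixes    # Every extension, in order
stem = archive.stem                  # Name minus the final extension only

renamed = archive.with_suffix(".zip")     # Replaces just the final extension
sibling = archive.with_name("notes.txt")  # Swaps out the whole filename
```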
Listing files in a directory
Need to list the files in a directory? Use the iterdir method, which returns a
lazy iterator:
>>> project = Path('/home/trey/proj')
>>> for path in project.iterdir():
...     if path.is_file():
...         # All files (not directories) in this directory
...         print(path)
...
/home/trey/proj/readme.md
/home/trey/proj/app.py
Need to find files in a directory that match a particular pattern (e.g. all .py files)? Use
the glob method:
>>> taxes = Path('/home/trey/Documents/taxes')
>>> for path in taxes.glob("*.py"):
...     print(path)
...
/home/trey/Documents/taxes/reconcile.py
/home/trey/Documents/taxes/quarters.py
Need to look for files in deeply-nested subdirectories as well?
Use the rglob method to recursively find files/directories:
>>> for path in taxes.rglob("*.csv"):
...     print(path)
...
/home/trey/Documents/taxes/2030/raw/bank_dividends.csv
/home/trey/Documents/taxes/2031/raw/bank_dividends.csv
/home/trey/Documents/taxes/2031/raw/stocks.csv
Want to look at every single file in the current directory and all subdirectories?
The walk method (added in Python 3.12, mirroring os.walk) works for that. Note that
it yields the subdirectory and file names as strings, not as Path objects:
>>> for path, sub_directories, files in taxes.walk():
...     if any(Path(name).stem.lower() == "readme" for name in files):
...         print("Has readme file:", path.relative_to(taxes))
...
Has readme file: organizer
Has readme file: archived/2017/summarizer
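The walk method only exists on Python 3.12 and above. If you need to support older versions, one possible fallback is a thin wrapper around os.walk. The walk_paths helper below is hypothetical (it is not part of pathlib), and the directory names are made up for the demo:

```python
import os
import tempfile
from pathlib import Path


def walk_paths(top):
    """Yield (directory, subdirectory names, file names) for a file tree.

    Hypothetical helper: wraps os.walk so the first item is a Path.
    The subdirectory and file names remain plain strings.
    """
    for dirpath, dirnames, filenames in os.walk(top):
        yield Path(dirpath), dirnames, filenames


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "organizer").mkdir()
    (root / "organizer" / "README.md").write_text("hi\n")
    (root / "empty").mkdir()

    # Collect every directory that contains a readme file
    with_readme = sorted(
        directory.relative_to(root).as_posix()
        for directory, subdirs, files in walk_paths(root)
        if any(Path(name).stem.lower() == "readme" for name in files)
    )
```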
Navigating up a file tree, instead of down? Use the parents attribute of
your pathlib.Path object:
>>> for directory in taxes.parents:
...     print("Possible .editorconfig file:", directory / ".editorconfig")
...
Possible .editorconfig file: /home/trey/Documents/.editorconfig
Possible .editorconfig file: /home/trey/.editorconfig
Possible .editorconfig file: /home/.editorconfig
Possible .editorconfig file: /.editorconfig
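Building on that loop, here's a sketch of a helper that returns the nearest matching file while walking up the tree. The find_upwards name is hypothetical, and unlike the loop above it also checks the starting directory itself:

```python
from pathlib import Path
import tempfile


def find_upwards(start, filename):
    """Return the nearest `filename` in `start` or its ancestors, else None."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    nested = root / "Documents" / "taxes"
    nested.mkdir(parents=True)
    (root / ".editorconfig").write_text("root = true\n")

    found = find_upwards(nested, ".editorconfig")
    missing = find_upwards(nested, "no_such_file_for_this_demo.cfg")
```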
Reading and writing a whole file
Need to read your whole file into a string?
You could open the file and use the read method:
with open(path) as file:
    contents = file.read()
But pathlib.Path objects also have a read_text method that makes this common
operation even easier:
contents = path.read_text()
For writing the entire contents of a file, there's also a write_text method:
path.write_text("The new contents of the file.\n")
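Both methods accept an encoding argument, just like open does. Here's a small round-trip sketch in a temporary directory (the notes.txt filename and its contents are assumptions for the demo), along with read_bytes for when you want the raw bytes:

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"

    # write_text returns the number of characters written
    written = path.write_text("café\n", encoding="utf-8")
    contents = path.read_text(encoding="utf-8")

    # read_bytes returns the raw bytes, with no decoding
    raw = path.read_bytes()
```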
Many common operations are even easier
The pathlib module makes so many common path-related operations both easier to
discover and easier to read.
Paths relative to other paths
Want to see your path relative to a specific directory?
Use the relative_to method:
>>> BASE_PATH = Path.cwd()
>>> home_path = Path("/home/trey/my_project/templates/home.html")
>>> home_path
PosixPath('/home/trey/my_project/templates/home.html')
>>> print(home_path.relative_to(BASE_PATH))
templates/home.html
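Keep in mind that relative_to raises a ValueError when the path isn't inside the given directory. This sketch (using PurePosixPath and made-up paths so it runs the same everywhere) shows that, plus the is_relative_to method available on Python 3.9 and above:

```python
from pathlib import PurePosixPath

base = PurePosixPath("/home/trey/my_project")
inside = PurePosixPath("/home/trey/my_project/templates/home.html")
outside = PurePosixPath("/etc/hosts")

relative = inside.relative_to(base)

try:
    outside.relative_to(base)
except ValueError:
    outside_is_relative = False  # Not under base, so relative_to refuses
else:
    outside_is_relative = True

# is_relative_to answers the same question without a try/except
inside_check = inside.is_relative_to(base)
outside_check = outside.is_relative_to(base)
```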
Checking for file/directory existence
Need to see if a file/directory exists? There's an exists method, but
the is_file and is_dir methods are more explicit, so they're usually preferable.
>>> templates = Path.cwd() / "templates"
>>> templates.exists()
False
>>> templates.is_dir()
False
>>> templates.is_file()
False
Ensuring directory existence
Need to make a new directory if it doesn't already exist? Use the mkdir method
with exist_ok set to True:
>>> templates.mkdir(exist_ok=True)
>>> templates.is_dir()
True
Need to automatically create parent directories of a newly created directory? Pass
the parents argument to mkdir:
>>> css_directory = Path.cwd() / "static/css"
>>> css_directory.mkdir(exist_ok=True, parents=True)
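Here's a self-contained sketch of the same idea that runs inside a temporary directory, so it won't touch your project. It also shows what happens when exist_ok is left off:

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    css_directory = Path(tmp) / "static" / "css"

    existed_before = css_directory.exists()
    css_directory.mkdir(parents=True, exist_ok=True)
    css_directory.mkdir(parents=True, exist_ok=True)  # Safe to repeat

    try:
        css_directory.mkdir()  # No exist_ok: fails on an existing directory
    except FileExistsError:
        raised = True
    else:
        raised = False

    is_dir_after = css_directory.is_dir()
```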
The home directory
Need to check a config file in your home directory? Use Path.home().
>>> user_gitconfig = Path.home() / ".gitconfig"
>>> user_gitconfig
PosixPath('/home/trey/.gitconfig')
Has the user passed in a path that might have a ~ in it? Call expanduser on it!
>>> path = Path(input("Enter path to new config file: "))
Enter path to new config file: ~/.my_config
>>> path
PosixPath('~/.my_config')
>>> path.expanduser()
PosixPath('/home/trey/.my_config')
Directory of the current Python file
Here's a trick that's often seen in Django settings modules:
BASE_DIR = Path(__file__).resolve().parent.parent
This will set BASE_DIR to the directory just above the one that contains the settings module.
Before pathlib, that line used to look like this:
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
I find the pathlib version much more readable.
No need to worry about normalizing paths
All file paths in pathlib land are stored using forward slashes. Why forward slashes?
Well, they work pretty universally on Windows, Linux, and Mac and they're easier to
write in Python (backslashes need to be escaped to use them in strings).
>>> documents1 = Path("C:/Users/Trey/Documents")  # preferred
>>> documents2 = Path(r"C:\Users\Trey\Documents")  # this works also though
>>> documents1
WindowsPath('C:/Users/Trey/Documents')
>>> documents2
WindowsPath('C:/Users/Trey/Documents')
When a file path is actually used, if you're on Windows those forward slashes will all be
converted to backslashes by default:
>>> print(documents1)
C:\Users\Trey\Documents
You might ask "why even care about / vs \ on Windows if / pretty much always
works"? Well, if you're mixing and matching \ and /, things can get weird... especially
if you're comparing two paths to see whether they're equal!
With the automatic normalization done by pathlib.Path objects, you'll never need to
worry about issues with mixing and matching / and \ on Windows.
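If you're not on Windows, you can still try this out with PureWindowsPath, which applies Windows path rules on any operating system without touching the filesystem. A quick sketch:

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("C:/Users/Trey/Documents")
backward = PureWindowsPath(r"C:\Users\Trey\Documents")
mixed = PureWindowsPath(r"C:\Users/Trey\Documents")

# The raw strings differ, but the normalized paths compare as equal
strings_equal = "C:/Users/Trey/Documents" == r"C:\Users\Trey\Documents"
paths_equal = forward == backward == mixed

as_string = str(forward)        # Uses backslashes
as_posix = forward.as_posix()   # Uses forward slashes
```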
Built-in cross-platform compatibility
Pretty much any sort of splitting or segment-modification operation you can imagine
with a file path is relatively simple with pathlib.Path objects. Using Path objects for
these operations also results in code that is self-descriptive and cross-platform
compatible.
Cross-platform compatible?
Why not just use the split and join string methods to split and join
with / or \ characters?
Because manually handling / and \ characters in paths correctly is a huge pain.
Compare this code:
directory = input("Enter the project directory: ")
# Normalize slashes and remove possible trailing /
directory = directory.replace("\\", "/")
directory = directory.removesuffix("/")
readme_filename = directory + "/readme.md"
To this pathlib.Path-using code:
from pathlib import Path
directory = input("Enter the project directory: ")
readme_path = Path(directory, "readme.md")
If you don't like pathlib for whatever reason, at least use the various utilities in
Python's much older os.path module.
A pathlib conversion cheat sheet
Have the various os.path and os path-handling approaches in muscle memory?
Here's a pathlib cheat sheet, showing the new pathlib way and the old os.path, os,
or glob equivalent for many common operations.
Path-related task          pathlib approach            Old approach
Read all file contents     path.read_text()            open(path).read()
Get absolute file path     path.resolve()              os.path.abspath(path)
Get the filename           path.name                   os.path.basename(path)
Get parent directory       path.parent                 os.path.dirname(path)
Get file extension         path.suffix                 os.path.splitext(path)[1]
Get extension-less name    path.stem                   os.path.splitext(path)[0]
Ancestor-relative path     path.relative_to(parent)    os.path.relpath(path, parent)*
Verify path is a file      path.is_file()              os.path.isfile(path)
Verify path is directory   path.is_dir()               os.path.isdir(path)
Make directory & parents   path.mkdir(parents=True)    os.makedirs(path)
Get current directory      pathlib.Path.cwd()          os.getcwd()
Get home directory         pathlib.Path.home()         os.path.expanduser("~")
Find files by pattern      path.glob(pattern)          glob.iglob(pattern)
Find files recursively     path.rglob(pattern)         glob.iglob(pattern, recursive=True)
Normalize slashes          pathlib.Path(name)          os.path.normpath(name)
Join path parts            parent / name               os.path.join(parent, name)
Get file size              path.stat().st_size         os.path.getsize(path)
Walk the file tree         path.walk()                 os.walk(path)
Rename file to new path    path.rename(target)         os.rename(path, target)
Remove file                path.unlink()               os.remove(path)
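[*]: The relative_to method isn't identical to os.path.relpath. Without walk_up=True
(added in Python 3.12), relative_to raises a ValueError when the path isn't inside
the given parent, whereas os.path.relpath returns a path with .. segments. I rarely
find that I need that functionality.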
Note that a somewhat similar comparison table exists in the pathlib documentation.
What about things pathlib can't do?
There's no method on pathlib.Path objects for listing all subdirectories under a given
directory.
That doesn't mean you can't do this with pathlib. It just means you'll need to write
your own function, just as you would have before pathlib:
def subdirectories_of(path):
    return (
        subpath
        for subpath in path.iterdir()
        if subpath.is_dir()
    )
Working with pathlib is often easier than working with alternative Python tools.
Finding help on pathlib features is as simple as passing a pathlib.Path object to the
built-in help function.
Should strings ever represent file paths?
So pathlib seems pretty great... but do you ever need to use a string to represent a file
path?
Nope. Pretty much never.
Nearly every utility built into Python which accepts a file path will also accept
a pathlib.Path object.
For example the shutil library's copy and move functions
accept pathlib.Path objects:
>>> import shutil
>>> path = Path("readme.txt")
>>> new_path = path.with_suffix(".md")
>>> shutil.move(path, new_path)
Note: in Python 3.14, pathlib.Path objects will also gain copy and move methods, so
you won't even need to import the equivalent shutil functions!
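Here's a runnable sketch of moving a file with shutil.move and Path objects, using a temporary directory so nothing on your system is changed (the file contents are made up):

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "readme.txt"
    path.write_text("# Title\n")

    new_path = path.with_suffix(".md")
    shutil.move(path, new_path)  # Path objects are accepted directly

    old_exists = path.exists()
    new_contents = new_path.read_text()
```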
Even subprocess.run will accept Path objects:
import subprocess
import sys
subprocess.run([sys.executable, Path("my_script.py")])
You can use pathlib everywhere.
If you really need a string representing your file path, pathlib.Path objects can be
converted to strings either using the str function or an f-string:
>>> str(path)
'readme.txt'
>>> f"The full path is {path.resolve()}"
'The full path is /home/trey/proj/readme.txt'
But I can't remember the last time I needed to explicitly convert
a pathlib.Path object to a string.
Note: Technically we're supposed to use os.fspath to convert Path objects to
strings if we don't trust that the given object is actually a Path. See this discussion
from Brett Cannon. If you're writing a third-party library that accepts Path objects,
use os.fspath(path) instead of str(path) so that any path-like object will be accepted
per PEP 519.
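A short sketch of why that distinction matters: os.fspath accepts strings and Path objects but rejects anything that isn't path-like, while str happily converts any object. The describe function is a hypothetical stand-in for library code:

```python
import os
from pathlib import Path


def describe(path_like):
    """Return the string form of any path-like object (hypothetical helper)."""
    return os.fspath(path_like)


from_path = describe(Path("readme.txt"))
from_str = describe("readme.txt")

try:
    describe(42)  # Not path-like, so os.fspath refuses it
except TypeError:
    rejected = True
else:
    rejected = False

# By contrast, str() silently accepts anything
silently_converted = str(42)
```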
Unless I'm simply passing a single string to the built-in open function, I pretty much
always use pathlib when working with file paths.
Use pathlib for readable cross-platform code
Python's pathlib library makes working with file paths far less cumbersome than the
various os.path, glob, and os equivalents.
Python's pathlib.Path objects make it easy to write cross-platform path-handling
code that's also very readable.
I recommend using pathlib.Path objects anytime you need to perform manipulations
on a file path or anytime you need to ask questions of a file path.