How To Manage Temporary Files with Pytest tmp_path

Temporary directories play a vital role in testing by providing a controlled environment to execute and validate code.

However, managing temp directories and files can be challenging and often complex.

How do you handle issues like directory cleanup and portability across different platforms? What about ensuring isolation between your tests?

What if multiple tests need to operate on the same directories?

The answer - Pytest’s tmp_path and tmp_path_factory fixtures.

Pytest has some powerful features that address all of these concerns.

In this article, we’ll explore the significance of temporary directories in testing and delve into the common pain points encountered when working with them.

We’ll also uncover how Pytest tmp_path and tmp_path_factory fixtures address these challenges with ease, making your testing experience cleaner, smoother and more efficient.

Let’s get started!

Link to the Example Code if you just wanna dive in.

What You’ll Learn

By reading this article you should:

  • Gain a clear understanding of how Pytest’s tmp_path fixture simplifies the management of temporary directories.
  • Understand how tmp_path contributes to cleaner and more readable test code by abstracting directory management.
  • Learn how Pytest’s tmp_path ensures cross-platform compatibility, making it easier to share and collaborate on test code across different operating systems.
  • Recognize the time-saving benefits of using tmp_path for automatic directory creation and cleanup.
  • Learn how tmp_path_factory fixture allows you to easily share temporary directory information across tests.

Local Setup

Here is the project structure of our example code which might be helpful to set up beforehand:

pytest-tmp-path-repo

To get started, clone the GitHub repo here.

We’ll explore the code in detail as we go along.

Prerequisites:

  • Python 3.11+
  • Virtualenv

Create a virtual environment and install any requirements by using the command below:

pip install -r requirements.txt

The Problem With Using Real Directories in Testing

Using real directories in unit testing presents the following challenges and drawbacks.

Data Interference

Real directories can accumulate data from previous tests or even unrelated processes, leading to data interference.

This means that one test may unintentionally influence the results of another, making it challenging to isolate and reproduce test cases reliably.

Data Residue

Tests that rely on real directories may leave behind residual files or artifacts, even after the test has been completed.

These leftovers can clutter your file system, increase the risk of false positives/negatives in subsequent tests, and require manual cleanup efforts.

Platform Dependency

Real directories might have different paths and naming conventions across various operating systems.

This inconsistency can lead to platform-specific code, reducing the portability and maintainability of your test suite.

Think Windows vs macOS vs Linux file system handling.

Complex Setup and Teardown

When using real directories, you need to set up the directory structure, create files, and ensure proper cleanup at the beginning and end of each test.

This additional setup and teardown code can make your test cases more complex and harder to maintain.

Although Pytest fixtures provide automatic setup and teardown functionality, they don’t automatically take care of directory management.
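To make that overhead concrete, here is a minimal sketch of what hand-rolled directory management can look like using only the standard library. The function name and the file name are hypothetical and are not part of the example repo.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_manual_temp_dir() -> str:
    """Create a scratch directory by hand, use it, and always clean it up."""
    # Setup: we are responsible for creating the directory ourselves
    work_dir = Path(tempfile.mkdtemp(prefix="manual_test_"))
    try:
        # The part we actually care about: write a file and read it back
        report = work_dir / "report.txt"
        report.write_text("hello")
        return report.read_text()
    finally:
        # Teardown: forget this step and the directory is left behind
        shutil.rmtree(work_dir, ignore_errors=True)


print(run_with_manual_temp_dir())
```

Most of the function is bookkeeping around two lines of real work, and every test that touches the file system would need the same scaffolding.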

Test Parallelization Challenges

In scenarios where tests are run in parallel, real directories can become a bottleneck.

Multiple tests trying to access and modify the same directories concurrently can lead to race conditions and test failures.

Performance Overheads

Real I/O operations, like creating and deleting directories and files, can introduce performance overhead, especially when running a large number of tests.

This can slow down your test suite and increase the time required for test execution.

Debugging Complexity

Identifying the source of test failures and issues can be more challenging when using real directories.

Tracking down the impact of a single test on the file system and resolving issues can be a time-consuming process.

Solution?

In contrast, Pytest fixtures such as tmp_path (or the legacy tmpdir) provide a controlled, isolated, and platform-agnostic approach to managing temporary directories for testing.

They ensure that each test operates within a clean, predictable environment, eliminating the problems associated with real directories and contributing to more reliable, maintainable, and efficient testing practices.

Introducing Pytest tmp_path and tmp_path_factory Fixtures

Pytest offers a valuable solution to the challenges of managing temporary directories in testing through the tmp_path and tmp_path_factory fixtures.

The Pytest tmp_path fixture is a built-in feature that provides a temporary directory path for each test function.

It abstracts the complexities of directory creation, cleanup, and management, ensuring that each test operates within an isolated and pristine environment.

One of the most significant advantages of using tmp_path is that it automates the creation and cleanup of temporary directories.

This eliminates the need to write manual setup and teardown code, reducing the potential for errors and saving valuable development time.

Pytest’s tmp_path abstracts away platform-specific path issues.

It provides a consistent directory path regardless of the operating system, ensuring test portability and cross-platform compatibility.

By isolating directory management from the test code, tmp_path enhances test code readability and maintainability.

Test cases become more focused and easier to understand, as they no longer contain extraneous directory-related operations.

tmp_path Fixture

tmp_path is a built-in Pytest fixture that provides a pathlib.Path object.

Note on Pathlib:

The Python pathlib module provides a user-friendly and platform-independent way to work with file system paths and directories.

It simplifies path manipulation, making code more readable and maintainable.

pathlib offers benefits such as concise syntax, automatic selection of the right path type for the current operating system (PosixPath or WindowsPath), and improved cross-platform compatibility, enhancing file and directory handling in Python.
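As a quick illustration of that syntax, the sketch below exercises a few common pathlib operations. It uses tempfile only so that it runs outside of Pytest, and the directory and file names are made up for the example.

```python
import tempfile
from pathlib import Path

# tempfile is used here only so the sketch runs outside of Pytest
with tempfile.TemporaryDirectory() as raw_dir:
    base = Path(raw_dir)

    # The / operator joins path segments with the right separator for the OS
    notes = base / "reports" / "notes.txt"

    # Create the parent directory, then write and read text in one call each
    notes.parent.mkdir(parents=True, exist_ok=True)
    notes.write_text("pathlib is handy")

    file_name = notes.name          # "notes.txt"
    suffix = notes.suffix           # ".txt"
    content = notes.read_text()     # "pathlib is handy"
    children = sorted(p.name for p in base.iterdir())  # ["reports"]
```

Because tmp_path is itself a Path, all of these operations are available on it directly.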

Let’s look at some example code to see the Pytest tmp_path fixture in action.

tests/test_tmp_path.py

def test_create_and_check_file(tmp_path):
    # Use tmp_path to create a temporary directory
    temp_dir = tmp_path / "my_temp_dir"
    temp_dir.mkdir()

    # Create a file inside the temporary directory
    temp_file = temp_dir / "test_file.txt"
    temp_file.write_text("Hello, pytest!")

    # Check if the file exists
    assert temp_file.is_file()

    # Read the file's contents
    assert temp_file.read_text() == "Hello, pytest!"

Let’s break down the code snippet to understand what’s happening.

This code snippet defines a Pytest test function named test_create_and_check_file.

It utilizes the tmp_path fixture provided by Pytest to create a temporary directory.

Inside this directory, a new file, “test_file.txt,” is created and filled with the text “Hello, pytest!”

The test then checks if the file exists using temp_file.is_file().

Subsequently, it reads the file’s content using temp_file.read_text() and asserts that the content matches the expected value, “Hello, pytest!”

This example illustrates how Pytest’s tmp_path fixture simplifies temporary directory and file management for testing, ensuring isolation and ease of use.

We can run the test as follows:

pytest tests/test_tmp_path.py -v -s

pytest-tmp-path-test
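To see the isolation claim in practice, here is a small sketch with two hypothetical tests (not part of the example repo) that write to the same file name. Since tmp_path is created per test function, each test should start from an empty directory and neither should see the other's file.

```python
from pathlib import Path


def test_first_writer(tmp_path: Path):
    # This test gets its own empty directory from Pytest
    assert list(tmp_path.iterdir()) == []
    (tmp_path / "shared_name.txt").write_text("written by the first test")


def test_second_writer(tmp_path: Path):
    # Same file name, but a different directory, so nothing leaks in
    assert list(tmp_path.iterdir()) == []
    target = tmp_path / "shared_name.txt"
    target.write_text("written by the second test")
    assert target.read_text() == "written by the second test"
```

The order in which the two tests run does not matter, which is exactly the property you want from isolated tests.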

tmp_path_factory Fixture

The pytest tmp_path_factory fixture is useful for creating arbitrary temporary directories from any other fixture or test.

It allows you to create a large number of temporary directories to use across your test suite or to easily share directories across multiple tests.
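Before the full example, here is a minimal sketch of the factory's two main methods, mktemp() and getbasetemp(). The test name and directory names are illustrative only.

```python
def test_factory_creates_separate_dirs(tmp_path_factory):
    # mktemp() creates a new directory under the session's base temp directory
    inputs = tmp_path_factory.mktemp("inputs")
    outputs = tmp_path_factory.mktemp("outputs")

    # Reusing a basename is fine: by default Pytest appends a number to keep
    # each directory unique
    more_inputs = tmp_path_factory.mktemp("inputs")

    assert inputs.is_dir() and outputs.is_dir() and more_inputs.is_dir()
    assert inputs != more_inputs

    # getbasetemp() returns the directory that holds all of them
    assert inputs.parent == tmp_path_factory.getbasetemp()
```

A single test can request as many directories as it needs, and the same calls work inside a fixture of any scope.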

Let’s look at a Live Example to understand how tmp_path_factory works.

Live Example

Our code example is a simple image processor using the Pillow library to download an image and generate its thumbnail.

image_processor/core.py

import requests
from PIL import Image
import os


def download_image(image_url: str, file_path: str) -> bool:
    """
    Downloads an image from the specified URL and saves it to the given file path.

    Parameters
    ----------
    image_url : str
        The URL of the image to download.
    file_path : str
        The path where the downloaded image will be saved.

    Returns
    -------
    bool
        True if the image was downloaded successfully, False otherwise.
    """
    # Send an HTTP GET request to the image URL
    response = requests.get(image_url)

    # Check if the request was successful (HTTP status code 200 indicates success)
    if response.status_code == 200:
        # Get the content of the response, which contains the image data
        image_data = response.content

        # Open a file and write the image data to it
        with open(file_path, "wb") as image_file:
            image_file.write(image_data)

        print(f"Image downloaded and saved as '{file_path}'")
        return True
    else:
        print(f"Failed to download image. Status code: {response.status_code}")
        return False


def generate_thumbnail(image_path: str) -> str:
    """
    Generates a thumbnail for the specified image using the Pillow library.

    Args:
        image_path (str): Path to the input image.

    Returns:
        str: The path of the generated thumbnail, or an error message if the
            input image does not exist.
    """
    size = (128, 128)

    # Check if the input image exists
    if not os.path.exists(image_path):
        return f"Image not found: '{image_path}'"

    # Generate a thumbnail
    try:
        with Image.open(image_path) as im:
            im = im.resize(size)  # Resize to a fixed size (disregards aspect ratio)
            file, ext = os.path.splitext(image_path)
            thumbnail_path = file + "_thumbnail.png"
            im.thumbnail(size)
            im.save(thumbnail_path, "PNG")
            print(f"Thumbnail generated for '{image_path}' as '{thumbnail_path}'")
            return thumbnail_path
    except Exception as e:
        raise e

Although we’re only performing one operation, you may want to perform several operations on the image e.g. apply vision analysis, play with the pixels, do Machine Learning analysis, image tagging etc.

Now let’s look at how to efficiently test it.

First, we define our image_file fixture in conftest.py.

If you’re not familiar with conftest, here’s a good introduction to conftest.py and how to use it.

tests/conftest.py

import os
import pytest
from image_processor.core import download_image

@pytest.fixture(scope="session")
def image_file(tmp_path_factory):
    image_url = "https://unsplash.com/photos/Ejpx_sdKEKo/download?ixid=M3wxMjA3fDB8MXxhbGx8fHx8fHx8fHwxNjk4NzI2MDIzfA&force=true"
    # Download the image
    print("Downloading image...")
    tmp_file_path = tmp_path_factory.mktemp("data") / "img.png"
    download_image(image_url=image_url, file_path=tmp_file_path)
    yield tmp_file_path
    print("Removing image...")
    os.remove(tmp_file_path)

In this code snippet, we define an image URL that contains the image we want to download.

We then use the inbuilt tmp_path_factory fixture to make a temporary directory.

Then we use that temp directory to download the image.

Note the use of the yield statement for teardown.

Pytest setup and teardown is an important concept and is covered in detail in this article.
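If the yield pattern is new to you, the sketch below mimics it with a plain generator driven by hand. This is a simplification of the idea and not how Pytest is implemented internally. The image_file_like function and the events list are hypothetical names used only for illustration.

```python
import tempfile
from pathlib import Path

events = []


def image_file_like(base_dir: Path):
    """Mimics the shape of a yield-style fixture as a plain generator."""
    # Setup: runs before the test body
    path = base_dir / "img.bin"
    path.write_bytes(b"fake image bytes")
    events.append("setup")

    yield path  # the test runs while the generator is paused here

    # Teardown: runs after the test body has finished
    path.unlink()
    events.append("teardown")


with tempfile.TemporaryDirectory() as raw_dir:
    gen = image_file_like(Path(raw_dir))
    resource = next(gen)            # run setup and receive the yielded value
    events.append("test body")
    existed_during_test = resource.exists()
    next(gen, None)                 # resume the generator so teardown runs
    exists_after_teardown = resource.exists()
```

With a real fixture, Pytest performs the two next() steps for you, and with scope="session" the teardown only runs once all tests in the session have finished.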

Now it’s time to use this fixture.

tests/test_image_processor.py

import os
from PIL import Image
from image_processor.core import generate_thumbnail


def test_generate_thumbnail(image_file):
    temp_file_path = image_file
    print("Generating thumbnail...")
    thumbnail_path = generate_thumbnail(image_path=temp_file_path)
    print(thumbnail_path)

    # Check if the thumbnail file exists
    assert os.path.exists(thumbnail_path)

    # Check content type
    assert thumbnail_path.endswith("_thumbnail.png")

    img = Image.open(thumbnail_path)

    # get width and height
    width, height = img.size

    # assert width and height
    assert width == 128
    assert height == 128

In our test, we use the image_file fixture that we defined in conftest.py.

Because the fixture is session-scoped, the image is downloaded only once and shared across any number of consumer tests.

This is what makes tmp_path_factory so powerful: every test that requests the fixture receives the same temp path without repeating the download.

We check that the thumbnail file exists at the expected path and, lastly, assert its dimensions.

Neat, right?

Let’s run the test:

pytest tests/test_image_processor.py -v -s

pytest-tmp-path-factory-test

Difference Between tmp_path and tmp_path_factory Fixtures

The main difference between tmp_path and tmp_path_factory fixtures in Pytest is their scope and purpose:

tmp_path Fixture:

  • Scope: Function-level (local).
  • Purpose: Provides a temporary directory path for individual test functions.
  • Use: Typically used for scenarios where each test function needs its isolated temporary directory.

tmp_path_factory Fixture:

  • Scope: Session-level (global).
  • Purpose: Offers a factory for creating temporary directories, allowing the generation of multiple temporary directories within a single test session.
  • Use: Useful for more complex test scenarios where multiple tests or fixtures require their dedicated temporary directories, often within the same test suite.

In summary, tmp_path is used for creating a single temporary directory per test function, whereas tmp_path_factory allows you to create and manage multiple temporary directories for different purposes within a test session.

The choice between them depends on the specific testing needs and scope of your project.

Where does Pytest create tmp_path directories?

Pytest creates tmp_path directories in a temporary directory specific to the operating system on which the tests are run.

The exact location of these temporary directories may vary depending on the OS, but Pytest abstracts these details, making it easy to work in a platform-independent way.

  1. Linux/Unix: On Linux and Unix-based systems, pytest typically creates the tmp_path directory under /tmp, in a path of the form /tmp/pytest-of-<username>/pytest-<N>/.
  2. Windows: On Windows, pytest typically creates the tmp_path directory under the system’s temporary directory, which is usually located at C:\Users\<username>\AppData\Local\Temp.

The actual path to the temporary directory can be accessed through the tmp_path fixture.

It’s important to note that you don’t need to worry about the specific location, as Pytest abstracts this and provides you with a consistent and reliable way to work with temporary directories across different platforms.

This platform independence is one of the benefits of using tmp_path in your tests.
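If you are curious where the directory lands on your machine, a small test like the hypothetical one below prints the location. It compares against tempfile.gettempdir(), which reports the temp directory Python detects for the current system.

```python
import tempfile
from pathlib import Path


def test_show_tmp_path_location(tmp_path: Path):
    # Run with -s to see the output
    print(f"OS temp root: {tempfile.gettempdir()}")
    print(f"tmp_path for this test: {tmp_path}")

    # Whatever the platform, the fixture hands back an existing, absolute directory
    assert tmp_path.is_absolute()
    assert tmp_path.is_dir()
```

The printed paths will differ between machines, but the assertions should hold on any platform.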

Overriding the Default Base Temporary Directory

As programmers, we’re often stubborn, or we simply have a genuine need to override the default base temporary directory in Pytest.

You can use the --basetemp command-line option followed by the desired directory path.

For example:

pytest --basetemp=mydir

By specifying the --basetemp option, you set a custom base temporary directory.

It’s important to note that the contents of the specified directory (in this case, ‘mydir’) will be completely removed before each test run.

Therefore, it’s essential to use this directory exclusively for temporary test-related data, as its contents will be cleared each time Pytest is executed.

This customization allows you to have control over where temporary directories are created and ensures that only the temporary directories from the most recent run are retained.

For example:

pytest tests/test_image_processor.py -v -s --basetemp=eric_test

This gives us a temp directory named eric_test.

pytest-tmp-path-basetemp

How Many Test Runs Does Pytest Keep in tmp_path?

Pytest typically keeps up to 3 test runs (each run is a test session) in tmp_path by default.

This means that the temporary directories created for previous test runs are retained, but if there are more than three, the oldest ones are removed to maintain a maximum of three temporary directories.

Combining Pytest Fixtures, Parameters and tmp_path

Here’s an example of how you can combine Pytest fixtures, parameters, and tmp_path.

Suppose you have a function multiply that multiplies two numbers and you want to test it with different inputs while storing the results in temporary files.

tests/test_fixture_param_tmp_file.py

import pytest
import os

# Function to be tested
def multiply(a, b):
    return a * b

# Define a fixture for creating a temporary directory
@pytest.fixture
def temp_dir(tmp_path):
    return tmp_path

# Define a parameterized test that takes different input values
@pytest.mark.parametrize("a, b, expected", [(2, 3, 6), (4, 5, 20), (0, 7, 0)])
def test_multiply(temp_dir, a, b, expected):
    result = multiply(a, b)

    # Create a temporary file to store the result
    result_file = temp_dir / f"result_{a}_{b}.txt"

    with open(result_file, 'w') as f:
        f.write(str(result))

    assert result == expected

# Clean up: Remove the temporary files after the tests
@pytest.fixture(autouse=True)
def cleanup_temp_files(temp_dir):
    yield
    for file in temp_dir.iterdir():
        if file.is_file():
            os.remove(file)

# This will run three parameterized tests, storing results in temporary files

We define a multiply function that multiplies two numbers.

We create a temp_dir fixture that provides a temporary directory using the tmp_path fixture. This directory is used to store the results of the tests.

We use the @pytest.mark.parametrize decorator to specify different input values for the test_multiply function.

If you need a refresher on Pytest parameters, this article on Pytest parameterized testing is a great read.

This creates three test cases with different inputs.

Inside the test_multiply function, we calculate the result, store it in a temporary file within the temp_dir fixture, and then assert that the result matches the expected value.

We define an autouse fixture called cleanup_temp_files to automatically clean up the temporary files created during the tests.

It removes files in the temp_dir after the tests are executed.

When you run this test, it will perform the multiplication for different input values, store the results in temporary files, and ensure that the results match the expected values.

The temporary files are cleaned up after the tests are complete.

pytest tests/test_fixture_param_tmp_file.py -v -s

pytest-tmp-path-fixture-param
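Since tmp_path is created per test function, each parametrized case already receives its own fresh directory. If you don’t need explicit cleanup, a leaner variant of the test above might look like the sketch below. The test name is hypothetical, and the sketch also reads the value back from disk so that the temporary file takes part in the assertion.

```python
import pytest


def multiply(a, b):
    return a * b


@pytest.mark.parametrize("a, b, expected", [(2, 3, 6), (4, 5, 20), (0, 7, 0)])
def test_multiply_writes_result(tmp_path, a, b, expected):
    # Each parametrized case receives its own fresh tmp_path
    assert list(tmp_path.iterdir()) == []

    result_file = tmp_path / f"result_{a}_{b}.txt"
    result_file.write_text(str(multiply(a, b)))

    # Read the value back from disk and compare it with the expectation
    assert int(result_file.read_text()) == expected
```

Whether to keep the wrapper fixture and the cleanup fixture is a design choice: they make the lifecycle explicit, while this version leans on Pytest’s own handling of temporary directories.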

Conclusion

To conclude, in this article, we explored the importance of temporary directories in the context of testing and the common challenges associated with managing them.

First, we introduced Pytest’s tmp_path and tmp_path_factory fixtures as powerful tools to simplify the handling of temporary directories.

We detailed the problems with using real directories in testing, including data interference, data residue, platform dependency, complex setup and teardown, and other issues.

The tmp_path fixture is a way to create a temporary directory for individual test functions, automating directory creation and cleanup and offering cross-platform compatibility.

You also learned about the tmp_path_factory fixture with an image_processor example.

The article also showed you how to override the default base temporary directory using the --basetemp option.

Lastly, you learnt how to combine tmp_path, fixtures and parameters.

Overall, Pytest’s tmp_path and tmp_path_factory fixtures offer a robust solution for managing temporary directories in testing, providing a more reliable, maintainable, and efficient testing environment.

If you have ideas for improvement or would like me to cover anything specific, please send me a message via Twitter, GitHub or Email.

Till the next time… Cheers!

Additional Learning

How to use temporary directories and files in tests - pytest documentation
How to Effortlessly Generate Unit Test Cases with Pytest Parameterized Tests
What is Setup and Teardown in Pytest? (Importance of a Clean Test Environment)
Pytest Conftest With Best Practices And Real Examples
A Step-by-Step Guide To Using Pytest Fixtures With Arguments
3 Simple Ways To Define Your Pytest Environment Variables With Examples