How To Practice Test-Driven Development In Python? (Deep Dive)
Have you heard about Test-Driven Development (TDD) but have no idea what it means?
Maybe you know what it is in theory, but never applied it in practice.
Does the expectation of writing tests before code make you anxious that you’re wasting time and losing productivity?
What about if you’re working at a start-up and the code will be irrelevant in a few months? — Should you still practice TDD?
If you feel this way, you’re not alone. A large proportion of developers feel the same way about TDD.
Companies salivate at the idea of well-tested code churned out via TDD best practices and try to enforce its universal adoption.
TDD has its place in the development cycle and in this article, we’ll break down the mystery behind it, its pros and cons, and when to use it and when not.
We’ll go through a real example of how to get started step by step with TDD so it becomes second nature (where required) and you don’t have to debate this within your team or yourself to get started.
In the end, you’ll gain enough knowledge to decide if you should work with TDD practices or the old-school way (write tests after code), depending on your project, requirements, deliverables, type of company, and expectations.
So let’s get into it.
What You’ll Learn
In this article, you’ll learn
- What is TDD and how to use it so that it becomes second nature
- The Red-Green-Refactor concept
- How to decide whether you should use TDD or not
- Pros and Cons of using TDD
- Practical Example of TDD in action
- How to write maintainable and modularised code using TDD practices
What Is Test-Driven Development (TDD)?
TDD is a software development methodology where tests are written before the actual code.
In TDD, you first write a test for a new feature, then write the minimal amount of code needed to pass that test, and finally refactor the code (from minimal to efficient) to meet the necessary standards.
This cycle of “Red-Green-Refactor” ensures that software is designed to be testable and modular, and encourages simple designs and high-quality output.
According to a study by Microsoft and IBM engineers, the pre-release defect density of products was reduced by 40–90% relative to similar projects that did not use the TDD practice.
The caveat: the teams estimated that initial development time increased by 15–35%.
TDD is a highly controversial topic in the software community, so let’s understand it for what it is and let you make your own judgment call.
Test-Driven Development — Origins
Test-driven development (TDD) traces its roots back to the late 1990s as part of the agile programming movement, with Kent Beck often credited as a key proponent and developer of the technique.
TDD emerged from the need to improve code quality and responsiveness to changing requirements in a fast-paced development environment.
The approach was a shift from traditional coding practices, emphasizing writing tests before the actual software code.
This methodology encouraged developers to think about the desired functionality and the conditions under which the code would operate before implementation.
By doing so, TDD aimed to reduce the debugging time and enhance the reliability of the software, fostering a development process that could more readily adapt to change while maintaining high quality.
The TDD Process (Red-Green-Refactor)
OK now that we’ve got the background, let’s understand TDD in its practical form - Red-Green-Refactor.
Write the Test (Red)
The first step is to write a test.
Now you may ask — “what test, I haven’t written any code”.
That’s exactly it. You import the hypothetical classes or methods you intend to create and write the test. It fails as expected.
Start with only 1 test. You shouldn’t have more than 1 failing test at a time.
Write the Minimum Code (Green)
The next step is to write the minimum amount of code required to pass the test.
Not optimized code, not the fastest loop or most efficient data structure, but the bare minimum.
Run The Test (Green)
Now run your test.
If it passes, great. If it doesn’t, improve your code such that the test passes with the bare minimum code.
Refactor/Improve Code (Refactor)
At this stage, you have a choice.
Go back and refactor the code: maybe you know a better way that consumes fewer resources (think Big O notation).
Perhaps a better algorithm, more efficient data structure, less looping, and so on.
Now the good news is, you already have a working test as a net to let you play around and catch you if you fall.
As long as your test passes, you’re free to experiment and refactor the code to improve efficiency.
The other choice is to leave the bare minimum code and move on to the next feature and test.
Repeat
The last stage in the process is the repeat stage.
You take what you learned in the previous stages and repeat, one test at a time.
Write a failing test, write bare minimum code that passes the test, refactor the code for efficiency, make sure the test passes, and move on.
In the end, you’ll end up with a robust test suite making sure that any refactoring or modifications will always check existing functionality.
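To make the cycle concrete, here is a minimal sketch on a hypothetical count_vowels function (the name and behaviour are assumptions for illustration, not part of the String Manipulator example used later). The same test runs against a bare-minimum “green” version and a refactored version, which is what lets you refactor with confidence.

```python
# Hypothetical example: count_vowels is an illustrative name, not from the article.


def count_vowels_green(text: str) -> int:
    """Bare-minimum version written just to make the test pass."""
    count = 0
    for char in text:
        if char in "aeiouAEIOU":
            count += 1
    return count


def count_vowels_refactored(text: str) -> int:
    """Refactored version: same behaviour, more compact."""
    return sum(1 for char in text.lower() if char in "aeiou")


def check_count_vowels(func) -> None:
    """The test written first; it does not care how func is implemented."""
    assert func("Pytest") == 1
    assert func("TDD") == 0
    assert func("Red Green Refactor") == 6


# The same test acts as a safety net for both versions.
check_count_vowels(count_vowels_green)
check_count_vowels(count_vowels_refactored)
```

If the refactored version had changed behaviour, the same assertions would have failed immediately.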
To recap, I’ve shamelessly included the TDD rules from the Clean Architectures in Python Book by Leonardo Giordani (fantastic read).
TDD Rules
- Test first, code later
- Add the bare minimum amount of code you need to pass the tests
- You shouldn’t have more than one failing test at a time
- Write code that passes the test. Then refactor it.
- A test should fail the first time you run it. If it doesn’t, ask yourself why you are adding it.
- Never refactor without tests.
Having learned the theory, let’s look at how to apply TDD in practice.
I will try and make a video in the future to explain it better but for now, let’s do this in written form.
Real Example
Our code example today is deliberately kept simple to focus on the TDD principles and avoid complex classes and methods.
We’ll be developing a simple String Manipulator with the following requirements.
"""
Build a String Manipulator with the following requirements
- convert to lower case
- remove a defined pattern from the string
"""
*Always get clear requirements before writing a single line of code; this is fundamental to TDD and to any software development practice.
Setup Local Environment
Let’s set up your local environment so you can follow along.
Create a repository or clone the example code here.
Basic Python knowledge is recommended, you can use any recent version.
Set up the virtual environment using conda, pipenv, venv or the package manager of your choice.
pipenv shell
pipenv install --dev pytest
For a guide on how to use Pipenv please check out this article.
Now let’s move on to the Red Stage.
Feature 1
Convert string to lowercase.
Write Test
Create a test file with the following test.
tests/test_string_manipulator.py
from string_manipulator.string_manipulator import StringManipulator


def test_convert_lower_case():
    sm = StringManipulator()
    res = sm.to_lower_case("PYTEST")
    assert res == "pytest"
I know what you’re thinking. “There is NO StringManipulator class, what am I doing here?” Bear with me, please :)
This is on purpose.
We’ve written our first test.
- Initialized a StringManipulator object.
- Called a method to_lower_case with an argument.
- Assert our result is as expected.
Now run the test.
Run Test (Red)
pytest
You can see our test fails miserably with an ImportError.
This is OK and expected as there is no code.
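Depending on your Python version and project layout, the failure may be reported as ModuleNotFoundError rather than ImportError. That is the same kind of failure: ModuleNotFoundError is a subclass of ImportError. A small sketch to confirm this, using a deliberately made-up module name:

```python
import importlib

# "module_that_does_not_exist_yet" is a made-up name standing in for code
# that has not been written yet.
try:
    importlib.import_module("module_that_does_not_exist_yet")
except ImportError as exc:
    # Catching ImportError also catches its subclass ModuleNotFoundError.
    error_name = type(exc).__name__
else:
    error_name = None

print(error_name)
print(issubclass(ModuleNotFoundError, ImportError))
```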
Write Code (Green)
Let’s now write the bare minimum code to pass this test.
Create a file under string_manipulator/string_manipulator.py
class StringManipulator:
    pass
Let’s run the test.
Now we have a different error.
AttributeError: 'StringManipulator' object has no attribute 'to_lower_case'
Great! We’re getting closer.
Let’s add a to_lower_case method.
string_manipulator/string_manipulator.py
class StringManipulator:
    def to_lower_case(self):
        pass
Cool. We now have a different error.
TypeError: StringManipulator.to_lower_case() takes 1 positional argument but 2 were given
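The “2 were given” part can look odd, since the test passes only one argument. Python passes the instance itself as the first argument (self), so "PYTEST" arrives as a second one. A minimal sketch reproducing this outside pytest (the exact wording of the message can vary between Python versions):

```python
class StringManipulator:
    def to_lower_case(self):  # accepts only the implicit `self`
        pass


sm = StringManipulator()
message = ""
try:
    # Equivalent to StringManipulator.to_lower_case(sm, "PYTEST"):
    # two positional arguments for a method that accepts one.
    sm.to_lower_case("PYTEST")
except TypeError as exc:
    message = str(exc)

print(message)
```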
Now let’s add the my_string argument.
string_manipulator/string_manipulator.py
class StringManipulator:
    def to_lower_case(self, my_string: str):
        pass
Result
Failed Assertion!
OK now let’s finish it off with the correct method.
string_manipulator/string_manipulator.py
class StringManipulator:
    def to_lower_case(self, my_string: str):
        return my_string.lower()
And finally success!
This is the bare minimum code you wrote to pass the test.
Congratulations!
You just wrote your first piece of code via TDD.
Refactor/Improve Code
So, are you happy with the code? Or can we refactor it to be more robust?
How about edge cases — if we pass an int or bool datatype to the string manipulator?
Let’s define these tests.
tests/test_string_manipulator.py
from string_manipulator.string_manipulator import StringManipulator


def test_convert_lower_case():
    sm = StringManipulator()
    res = sm.to_lower_case("PYTEST")
    assert res == "pytest"


def test_convert_lower_case_input_type_int():
    sm = StringManipulator()
    res = sm.to_lower_case(123)
    assert res == "Invalid input"
Now we have 2 tests, the second test checks if our function returns an “Invalid input” message if we pass an integer.
Let’s run the test.
pytest tests/test_string_manipulator.py::test_convert_lower_case_input_type_int
We get the error
FAILED tests/test_string_manipulator.py::test_convert_lower_case_input_type_int - AttributeError: 'int' object has no attribute 'lower'
Let’s go ahead and fix it.
string_manipulator/string_manipulator.py
class StringManipulator:
    def to_lower_case(self, my_string: str):
        # Reject anything that is not a string before calling .lower()
        if not isinstance(my_string, str):
            return "Invalid input"
        return my_string.lower()
We’ve added an if condition to check whether the input is of string type; if not, we return the expected “Invalid input” message.
Let’s run the test again.
And it passes.
Let’s quickly write a similar test with a boolean.
tests/test_string_manipulator.py
def test_convert_lower_case_input_type_bool():
    sm = StringManipulator()
    res = sm.to_lower_case(True)
    assert res == "Invalid input"
This test passes out of the box too.
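This is worth a pause, because the TDD rules listed earlier say a test should fail the first time you run it. The bool test passes immediately because the isinstance check added for the int case already rejects every non-string value. A quick sketch of the checks involved (is_invalid is a made-up helper name that mirrors that guard):

```python
def is_invalid(value) -> bool:
    """Hypothetical helper mirroring the guard added for the int case."""
    return not isinstance(value, str)


print(is_invalid(123))       # True
print(is_invalid(True))      # True: a bool is not a str
print(is_invalid("PYTEST"))  # False

# bool is a subclass of int, not of str.
print(issubclass(bool, int))  # True
print(issubclass(bool, str))  # False
```

Keeping the test is still reasonable as documentation of the intended behaviour, but note that it did not drive any new code.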
Is this all?
What about None values or an empty string? Does the code handle these edge cases?
Let’s add a test.
tests/test_string_manipulator.py
def test_convert_lower_case_empty_string():
    sm = StringManipulator()
    res = sm.to_lower_case("")
    assert res == "String is empty"
We expect the method to tell us the string is empty and hence it cannot convert it to lower case.
Let’s run this.
As expected, we get a failure. AssertionError: assert '' == 'String is empty'
Returning to our source code,
string_manipulator/string_manipulator.py
class StringManipulator:
    def to_lower_case(self, my_string: str):
        if not my_string:
            return "String is empty"
        if not isinstance(my_string, str):
            return "Invalid input"
        return my_string.lower()
We’ve added a condition to check if the string value exists.
Run this and boom!
Adding another test for input type None
tests/test_string_manipulator.py
def test_convert_lower_case_none_string():
    sm = StringManipulator()
    res = sm.to_lower_case(None)
    assert res == "String is empty"
Nice, all 5 pass.
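One subtlety that deserves a test of its own: because the “if not my_string” check runs before the type check, any falsy value (0, False, an empty list) is reported as “String is empty” rather than “Invalid input”. The sketch below copies the method as it stands and shows this. Whether that is the behaviour you want is a requirements question, and the reordered variant is only one possible answer (the class name StrictStringManipulator is made up for illustration).

```python
class StringManipulator:
    """The method as written in the article at this point."""

    def to_lower_case(self, my_string: str):
        if not my_string:
            return "String is empty"
        if not isinstance(my_string, str):
            return "Invalid input"
        return my_string.lower()


class StrictStringManipulator:
    """Hypothetical variant: only None and "" count as empty."""

    def to_lower_case(self, my_string: str):
        if my_string is None or my_string == "":
            return "String is empty"
        if not isinstance(my_string, str):
            return "Invalid input"
        return my_string.lower()


sm = StringManipulator()
print(sm.to_lower_case(0))      # String is empty
print(sm.to_lower_case(False))  # String is empty
print(sm.to_lower_case(True))   # Invalid input

strict = StrictStringManipulator()
print(strict.to_lower_case(0))     # Invalid input
print(strict.to_lower_case(None))  # String is empty
print(strict.to_lower_case(""))    # String is empty
```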
You can go on and on improving your source code.
Now you see how TDD forces you to write modular code, think about code structure and testability, how functions are called and optimize input arguments (including only necessary ones), and so on.
Let’s move on to the next feature.
Feature 2
Remove a defined pattern from a string.
Write Test
Let’s now write a test that removes a pattern from a string.
tests/test_string_manipulator.py
def test_remove_pattern():
    sm = StringManipulator()
    res = sm.remove_pattern("Pytest with Eric", "Eric")
    assert res == "Pytest with "
For the sake of brevity, I’m going to skip some steps and not go through every iteration of TDD.
string_manipulator/string_manipulator.py
class StringManipulator:
    # ... to_lower_case as before ...
    def remove_pattern(self, my_string: str, pattern: str):
        pass
Let’s run it.
Run Test (Red)
We get an assert error as our function doesn’t return anything. Let’s fix this.
Write Code (Green)
string_manipulator/string_manipulator.py
class StringManipulator:
    # ... to_lower_case as before ...
    def remove_pattern(self, my_string: str, pattern: str):
        new_string = my_string.replace(pattern, "")
        return new_string
The bare minimum code is this. We will come back and improve it later.
Let’s run it.
Nice, it passes.
This isn’t perfect, and you need to handle a variety of scenarios and inputs.
Refactor/Improve Code
Let’s write a couple of tests to make the remove_pattern method robust.
tests/test_string_manipulator.py
def test_remove_pattern_empty_string():
    sm = StringManipulator()
    res = sm.remove_pattern("", "Eric")
    assert res == "String is empty"


def test_remove_pattern_none_string():
    sm = StringManipulator()
    res = sm.remove_pattern(None, "Eric")
    assert res == "String is empty"
Similar to the previous example, we’ll modify our source code.
string_manipulator/string_manipulator.py
class StringManipulator:
    # ... to_lower_case as before ...
    def remove_pattern(self, my_string: str, pattern: str):
        if not my_string:
            return "String is empty"
        if not isinstance(my_string, str):
            return "Invalid input"
        new_string = my_string.replace(pattern, "")
        return new_string
Everything passes.
Another simple test I quickly thought of was — what if the pattern doesn’t exist? Can we test it?
tests/test_string_manipulator.py
def test_remove_pattern_doesnt_exist():
    sm = StringManipulator()
    res = sm.remove_pattern("Pytest with Eric", "John")
    assert res == "Pytest with Eric"
This test passes, because Python’s str.replace method returns the string unchanged when the pattern isn’t found.
But what would be useful to the user is to know their pattern doesn’t exist in the string.
When Refactor Breaks Previous Tests
Let’s look at an interesting example: if the pattern is not found in the string, let’s return a message saying so.
tests/test_string_manipulator.py
def test_remove_pattern_doesnt_exist():
    sm = StringManipulator()
    res = sm.remove_pattern("Pytest with Eric", "John")
    assert res == "Pattern not found"
Modifying our source code.
string_manipulator/string_manipulator.py
class StringManipulator:
    # ... to_lower_case as before ...
    def remove_pattern(self, my_string: str, pattern: str):
        if pattern not in my_string:
            return "Pattern not found"
        if not my_string:
            return "String is empty"
        if not isinstance(my_string, str):
            return "Invalid input"
        new_string = my_string.replace(pattern, "")
        return new_string
Let’s run it.
Looks like this test passed; however, the change broke 2 existing tests.
This is EXACTLY the kind of danger that TDD saves you from.
We added “new functionality” (simple message return) — but broke existing functionality.
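To see why those two tests broke, it helps to evaluate the new first condition on the inputs they use. A minimal sketch (the helper name pattern_missing is made up for illustration):

```python
def pattern_missing(my_string, pattern: str) -> bool:
    """The condition that now runs first in remove_pattern."""
    return pattern not in my_string


# Empty string: "Eric" is not in "", so the method returns
# "Pattern not found" before it ever reaches the empty-string check.
print(pattern_missing("", "Eric"))  # True

# None: the `in` operator is not supported on None, so the method
# raises instead of returning "String is empty".
try:
    pattern_missing(None, "Eric")
except TypeError as exc:
    print(type(exc).__name__)  # TypeError
```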
Now let’s fix it.
string_manipulator/string_manipulator.py
class StringManipulator:
    # ... to_lower_case as before ...
    def remove_pattern(self, my_string: str, pattern: str):
        if not my_string:
            return "String is empty"
        if pattern not in my_string:
            return "Pattern not found"
        if not isinstance(my_string, str):
            return "Invalid input"
        new_string = my_string.replace(pattern, "")
        return new_string
Checking for None or an empty string before checking whether the pattern exists makes the tests pass.
While this code may not be perfect or the most efficient — it highlights an example of how you can use TDD to test new functionality without breaking existing functionality.
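One gap the current tests do not cover: a non-empty, non-string input such as 123 reaches the “pattern not in my_string” check before the isinstance check, which raises a TypeError instead of returning “Invalid input”. In TDD terms, the next step would be a failing test for that case, followed by a small reorder. A sketch of one possible ordering, written as a standalone function (remove_pattern_checked is a made-up name, not the article’s code):

```python
def remove_pattern_checked(my_string, pattern: str):
    """Hypothetical reordering: emptiness, then type, then pattern."""
    if my_string is None or my_string == "":
        return "String is empty"
    if not isinstance(my_string, str):
        return "Invalid input"
    if pattern not in my_string:
        return "Pattern not found"
    return my_string.replace(pattern, "")


# Existing expectations from the article's tests still hold.
assert remove_pattern_checked("Pytest with Eric", "Eric") == "Pytest with "
assert remove_pattern_checked("", "Eric") == "String is empty"
assert remove_pattern_checked(None, "Eric") == "String is empty"
assert remove_pattern_checked("Pytest with Eric", "John") == "Pattern not found"

# The new case: an int no longer raises TypeError.
assert remove_pattern_checked(123, "Eric") == "Invalid input"
```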
Benefits/Drawbacks/Limitations Of TDD
Now that you’ve had some hands-on experience with TDD, you’re better positioned to understand the benefits and limitations or drawbacks of a TDD-based approach.
Benefits
- TDD forces you to detail your application requirements before writing a line of code and makes sure there is no ambiguity in specifications.
- It makes your code easily maintainable — this means when you’re testing new code, having tests for older code in place gives you confidence that the older code is still working as intended and there are no side effects.
- It forces you to think about the modularization and testability of your code i.e. should you use an object-oriented approach or basic functions, is your function doing multiple operations or just one, do you need to pass n number of arguments or can set a few as default.
- It promotes writing only the code necessary to pass tests, which often results in simpler, cleaner code designs.
- It fits well with continuous integration practices, as tests are run frequently, allowing teams to detect and fix integration errors early.
- Writing tests first gives you confidence that your code meets the required specifications and behaves as expected under various conditions.
Drawbacks or Limitations
- It takes longer to write tests before the code.
- It makes little sense to invest in TDD when writing potentially disposable code for unproven features e.g. when you’re working at an early-stage start-up.
- There is a learning curve in using and getting comfortable with TDD practices and it also involves upskilling on the test framework e.g. Pytest.
- TDD requires a consistent, disciplined approach to writing tests before any new code. Skipping this step can diminish the benefits of TDD, and maintaining discipline can be challenging, especially in fast-paced or high-pressure environments.
- There’s a risk of writing too many tests or testing trivial aspects of the code, leading to wasted effort and over-engineering.
- Applying TDD to existing legacy codebases can be challenging, as these systems were not designed with testing in mind. Retrofitting tests can be a significant and sometimes impractical effort.
- Poorly designed or too many tests can lead to a false sense of security, leading to less effort invested in integration or e2e testing.
- As the codebase grows, so does the test suite. Maintaining this suite, ensuring tests are up-to-date and relevant, and managing deprecated tests can add overhead.
Recommendations and Best Practices
So you may ask, what are some of the TDD best practices?
Here are the most important ones and you may not be able to implement each one but that’s fine.
- Separate testing from development — This means having somebody else on your team test your code or even write part of it, it helps reduce blindspots.
- Write other important tests like integration, end-to-end, regression, performance, behavior-driven tests and so on.
- Isolate tests — each test should run in isolation and not share the state of data structures or databases amongst tests. Fixture Setup and Teardown can greatly help you out here.
- Tests should be fast — Make sure each test does one small job making them fast and efficient to run. Avoid setup teardown or other complex operations in each test, handle that externally.
- Tests should be idempotent: your tests should produce the same outcome every time you run them, i.e. they should be deterministic. Property-based testing is a partial exception because its inputs are generated, but the same principle applies.
- There’s no need to test standard Python libraries, you can be sure they are well-tested and fairly robust.
- Red-Green-Refactor: Stick to the TDD mantra of writing a failing test (Red), writing the minimal code to pass the test (Green), and then refactoring both code and tests for clarity and efficiency.
- Test the Behavior, Not the Implementation: Focus on what your code should do, not how it does it. This approach makes your tests more resilient to changes in the codebase.
- Mock or Patch External systems: Utilize mocks, stubs, and fakes to isolate the piece of code you’re testing, especially when dealing with external systems or complex dependencies.
- Encourage Collaboration: TDD is not just a solo activity. Pair programming and code reviews can help spread TDD best practices and improve the quality of both code and tests.
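As an illustration of the mocking recommendation, here is a minimal sketch using unittest.mock from the standard library. The function fetch_username and the client’s get_user method are hypothetical stand-ins for code that talks to an external system; only Mock itself is a real API here.

```python
from unittest.mock import Mock


def fetch_username(client, user_id: int) -> str:
    """Hypothetical code under test: looks up a user via an external client."""
    user = client.get_user(user_id)
    return user["name"].lower()


def test_fetch_username_lowercases_name():
    # Replace the external client with a Mock so no real call is made.
    fake_client = Mock()
    fake_client.get_user.return_value = {"name": "ERIC"}

    result = fetch_username(fake_client, 42)

    # Assert on behaviour (the returned value) ...
    assert result == "eric"
    # ... and that the dependency was called as expected.
    fake_client.get_user.assert_called_once_with(42)


test_fetch_username_lowercases_name()
```

Because the test depends only on the client’s interface, it stays fast and isolated, and it keeps passing if the real client’s internals change.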
Startup Life vs Big Company
An important topic to address for which only you know the answer.
How do TDD practices matter when you’re working at a start-up or a large organization?
Start-up companies (early stage typically) will often build products or features, test them with their target userbase, and keep the ones that stick, and throw away the ones that don’t.
Or they may even pivot entirely from what they set out to do.
Let’s say you’re a developer at such a company, should you still practice TDD?
If you’re somewhat confident that the product you’re working on is here to stay and you need it to work robustly in a production environment, yes it’s recommended to develop using TDD principles.
If not, then I’d be inclined to say it’s probably not worth it at this stage and you should just focus on agile methodology and delivering as many working features to test as fast as possible.
Now, in large organizations things work differently.
They often have a stable product that likely won’t be thrown away, so it makes absolute sense to work with TDD practices and produce the best code you can.
Conclusion
So, we’re here at the end. I enjoyed writing this article but it only scratches the surface.
I’ll do a video in the future to explain the concepts in greater detail as I feel it’s much easier to explain the TDD iterations via video than text.
In this article, you explored what TDD is, where it came from, and the Red-Green-Refactor concept.
You also learned how it works in practice using a simple String Manipulator example.
We also saw a test where a small refactor broke existing functionality which is a perfect demonstration of how this works in practice.
Lastly, you explored the pros, cons, and best practices to ultimately help you decide if you should adopt it or not, based on your current situation and projects.
So go ahead and try it out with some of your examples, it’s the best way to learn and will give you immense insight into how the TDD practice works.
If you have ideas for improvement or would like me to cover anything specific, please send me a message via Twitter, GitHub or Email.
Till the next time… Cheers!
Additional Learning
This article was made possible due to the below amazing resources from their wonderful creators.
Example Code Used
TDD in Python with pytest - Part 1
Clean Architectures in Python - Book
Test Driven Development: By Example - Kent Beck
Arjan Codes - Test-Driven Development In Python // The Power of Red-Green-Refactor
How to Use Hypothesis and Pytest for Robust Property-Based Testing in Python
What is Setup and Teardown in Pytest? (Importance of a Clean Test Environment)
Python Unit Testing Best Practices For Building Reliable Applications
A Complete Guide To Behavior-Driven Testing With Pytest BDD