Test-Driven Development – TDD – is a simple, quick and easy-to-learn technique for developing high quality code by inserting simple but detailed tests directly into the code.
It tests code at the most detailed level, and aims to answer questions such as:
- Does this calculation give the right result?
- Does this class I have just written work the way I think it should?
- If I pass these values to this method, does it return the results I’m expecting?
- Do all the branches of this CASE statement work?
- And so on.
Self-testing code has been a routine development method at least since the beginning of high-level programming languages, but increasingly TDD creates a more structured process and uses specialised tools, notably the xUnit frameworks.
TDD is a ‘white box’ testing technique – i.e., the tester checks how the code works, not just what it does (‘black box’, functional testing). It’s very widely used in Agile for testing code during development.
TDD is not restricted to new applications. Although it requires considerable investment before TDD substantially increases productivity, there is no reason why the maintenance and support of legacy systems could not be converted to TDD.
Scope of TDD
TDD is frequently referred to as a kind of ‘unit’ testing. This is misleading, although the situation is not helped by the TDD frameworks as a whole generally being called ‘xUnit’.
The main differences between TDD and unit testing are:
- TDD tests are not written after the code is written, but carried out continuously while the code is being written. Unit tests are generally written separately from, and after, code.
- Although TDD originated with coding and unit testing, there is in principle no reason it should not also be used at other levels – with integration or acceptance testing, for database development or for building and provisioning environments. Some xUnit-style frameworks are currently available (e.g., DbUnit) for these purposes.
On the other hand, TDD cannot replace all other forms of test and verification.
- See Limits of TDD for things TDD cannot do or replace.
- TDD cannot be used to implement your entire Definition of Done. Other types of verification and validation will still be needed.
The Red-Green-Refactor cycle
This core TDD cycle is often summarised like this:
- Red – write a test that will fail unless the code is correctly updated.
- Green – write code that will pass the test.
- Refactor – ensure that the code is not just functional but of good quality.
As this suggests, what you don’t do is start with code and then, when you think the code’s OK, write your unit test and run the test on the code to identify defects. Testing and developing are merged into a single action, with testing coming first.
This is the most innovative change TDD makes to development: testing starts from the very beginning of coding. From before beginning, in fact:
- Start with a test that describes the capability you plan to create.
- Check that the test does indeed fail.
- Write the code needed to pass that test.
- Check that this code does indeed pass the test.
- When the code has passed the test, optimise its quality.
- Add another test, then write the code needed to pass it
- While still passing the previous test(s).
- Continue like this until:
- You have defined all the tests the code needs to pass.
- The code passes them all.
- The user is still happy with what you’ve built.
- All code is of high quality.
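The cycle above can be sketched with Python’s unittest (one of the xUnit frameworks discussed later). This is a minimal illustration, not a recipe: the add_vat function, its name and the VAT figures are all invented for the example.

```python
import unittest

# Red: the test below is written first, and fails while add_vat does not exist.
# Green: this function is the minimum code needed to make the test pass.
# Refactor: once green, the code can be cleaned up while the tests keep passing.

def add_vat(net_price, rate=0.20):
    """Return the gross price: net price plus VAT at the given rate."""
    return round(net_price * (1 + rate), 2)

class TestAddVat(unittest.TestCase):
    def test_standard_rate(self):
        # Does this calculation give the right result? (20% VAT on 100.00)
        self.assertEqual(add_vat(100.00), 120.00)

    def test_zero_rate(self):
        # A second test, added only after the first one passed, per the cycle.
        self.assertEqual(add_vat(100.00, rate=0.0), 100.00)
```

Running the suite (for example with python -m unittest) then tells you whether you are still green after every small change.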
TDD, BDD and ATDD
TDD began life as a developer technique, but it was quickly recognised that a TDD-like approach could be used at other levels:
- Unit-level, white-box, technical testing – the original version of TDD.
- BDD (Behaviour-Driven Development):
- This is the level on which the user is likely to think: it asks what the application does.
- It consists of defining tests to satisfy the standard Agile story structure:
- As a… [role]
- I want… [function]
- So that… [outcome/benefit]
- ATDD (Acceptance Test-Driven Development):
- ATDD tests the software from the business’s point of view.
- Like BDD it asks what the application does, but now sets its tests at the level of the original story’s acceptance criteria:
- Given that…
In the industry as a whole, BDD and ATDD are often treated as synonyms, or the definitions overlap confusingly. So at Agile201 we have tried to clarify these definitions.
Although these are useful distinctions, because the heart of Agile is a user and a developer collaborating to build code, in practice TDD, BDD and ATDD quickly merge. So at Agile201 we tend to treat all of these as simply different aspects of the same thing: TDD.
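To show how a story maps onto a story-shaped test, here is a minimal BDD-flavoured sketch in Python’s unittest. The shopper story, the apply_discount function and the SAVE10 code are all invented purely for illustration:

```python
import unittest

# Story (invented for illustration):
#   As a shopper, I want to apply a discount code, so that I pay less at checkout.

def apply_discount(total, code):
    """Hypothetical checkout function: 'SAVE10' takes 10% off the total."""
    if code == "SAVE10":
        return round(total * 0.90, 2)
    return total

class TestDiscountStory(unittest.TestCase):
    def test_shopper_pays_less_with_valid_code(self):
        # Given a basket totalling 50.00
        total = 50.00
        # When the shopper applies the code SAVE10
        discounted = apply_discount(total, "SAVE10")
        # Then they pay 10% less
        self.assertEqual(discounted, 45.00)

    def test_unknown_code_changes_nothing(self):
        self.assertEqual(apply_discount(50.00, "BOGUS"), 50.00)
```

The Given/When/Then comments keep the test readable by the user as well as the developer, which is the point of testing at this level.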
Start with a test? Really?
The idea that development should start with writing a test rather than code may seem counter-intuitive – especially writing a test that you fully expect to fail. But there are good reasons for this odd step:
- You check that the test harness is working properly.
- You derive your tests directly from the demands of the user (or other) story.
- You’re not tempted to define a test that simply validates the code you’ve already written.
- If you focus solely on passing the test, you only write the minimum code you need to write to pass it.
- This avoids the mistake of inventing things the code ‘might’ need to do.
- These things are usually either not needed or not needed in the form that is invented in advance.
- This is called the YAGNI principle – ‘You ain’t gonna need it’.
- By describing a test that defines the developer’s understanding of the story, the user can confirm that the developer understands the story properly before they try to code for it.
- If the code passes the test but the user is dissatisfied, you need more/better tests.
- If the test passes even though no code has been written, this will show one of:
- A function for the task in question already exists.
- The test is invalid.
- The best time to write a test is when the developer and user are thinking about the story and the code at the same time.
- When written first, a test is effectively a detailed design that the user can evaluate before coding starts.
- Then subtleties will emerge that might otherwise be forgotten until later testing shows that your code is wrong.
- Your code is more likely to be generally testable.
- A test exists for every feature and function.
Other benefits of TDD are defined here.
Although the principle is simple, getting the detailed logic of TDD right can be quite complex. For still more detail, see Test-Driven Development (practice).
Good practice in TDD
There are plenty of things you can do to make TDD easier.
The development process
- Code/test in very small increments – 1 to 10 edits before running the tests.
- Write only the code you need to pass tests.
- Although the tests must eventually cover the story as a whole, test failure is not the only reason to do more development. Even if all tests pass, the user may still be dissatisfied; this outcome too should lead to new or improved tests.
- All good software engineering practices should be maintained, even if some (such as certain aspects of object-oriented design) can reduce the visibility of objects.
- If your code calls external objects:
- Distinguish clearly between the integration aspect of this relationship (i.e., a successful call) and its functional aspect (i.e., how it contributes to the functionality under test).
- Test and develop progressively along the chain of modules – otherwise the location of any broken link can be very hard to establish.
- Be careful how you use substitute objects – dummies, mocks, stubs, etc.
- Unless they are themselves subject to TDD-style development, these objects should absolutely minimise the functionality they embed.
- When replaced by real objects, the TDD tests should be thoroughly reviewed.
- If you are calling external objects (network resources, code libraries, etc.), avoid making and testing changes to the code that simply exercise different aspects of the library (unless you believe the library to be faulty or inadequate).
- It’s often quicker to undo/revert a failed change than to debug it.
- The depth of testing applied by TDD is open to debate.
- Each function and feature the user explicitly describes should have an associated test.
- So there should be at least one test for every path through your application, every use of each polymorphic interface, etc.
- Below this level, however:
- Some TDD experts argue that individual statements are implementation details, and should not be expected to remain the same indefinitely. So TDD tests should not be written at this level.
- Others argue that TDD methods are equally valuable at this level. So statement-level tests should also be included.
- The correct answer to this question is likely to be unique to your product and organisation. It’s generally less risky to limit testing to higher-level aspects of the code where:
- The story is not critical to the success of the product.
- The user and developer are highly experienced in TDD.
- The user and developer have extensive experience of the product.
- The solution is not very innovative technically (to the individual developer, not the team or the product).
- Code coverage should not be less than 90%.
- Avoid tests that access or rely on complex or dynamic external objects such as databases or networks.
- They will generally slow and complicate testing and deter developers from executing the whole test suite.
- Make sure your tests run independently of one another (so code is readily restructured, cascading false negatives are avoided, etc.).
- Your test data should be:
- Real (or at least reliably realistic).
- Easy to access.
- Available at the same level of granularity as your tests.
- Easy to interpret (e.g., not complex, dynamically changing data structures).
- Test code should always be removed from production code.
- If not using specialised TDD tools (e.g., xUnit frameworks), this may require additional development steps that themselves create (usually small) risks to the integrity of the application code.
- Install the appropriate xUnit (or similar) test framework.
- Make sure that the same framework is shared by all developers working on the same codebase.
- Make sure your environment (compiler, test suite, etc.) is fast enough to allow code to be tested whenever a developer wants – dozens of times each day, if needed.
- Set up a continuous integration server, to ensure that the impact of your changes on the rest of your team is also checked and communicated.
- Build and maintain a comprehensive regression test suite.
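As a sketch of two of the practices above – substituting a minimal mock for an external object, and keeping tests free of real databases and networks – here is a hypothetical example using Python’s unittest.mock. The build_report function and its data are invented for illustration:

```python
import unittest
from unittest.mock import Mock

# Hypothetical code under test: a report builder that calls an external
# database object. In the test, the database is replaced by a Mock, so the
# test is fast, independent, and never touches a real database or network.

def build_report(db):
    """Return a one-line summary using whatever 'db' object is passed in."""
    rows = db.fetch_orders()                  # integration aspect: the call itself
    total = sum(r["amount"] for r in rows)    # functional aspect: the logic under test
    return f"{len(rows)} orders, total {total}"

class TestBuildReport(unittest.TestCase):
    def test_summarises_orders(self):
        fake_db = Mock()
        # The stub embeds the absolute minimum of functionality: canned data.
        fake_db.fetch_orders.return_value = [{"amount": 10}, {"amount": 5}]
        # Functional aspect: the summary logic is correct.
        self.assertEqual(build_report(fake_db), "2 orders, total 15")
        # Integration aspect, checked separately: the call was made exactly once.
        fake_db.fetch_orders.assert_called_once()
```

When the mock is eventually replaced by the real database object, the test should be reviewed, as noted above.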
The benefits of TDD
TDD brings a lot of benefits:
- See here and here for the benefits of a test-first approach.
- By embedding tests directly in the code, you continuously check whether later changes break the code you have already written, other parts of the application, etc.
- By forcing the developer to embed the test first, the developer is made to focus on what the code is for (the user’s perspective) rather than how it works (the developer’s more usual point of view).
- This creates an extra layer of protection against misunderstanding the need.
- The developer develops a better understanding of what the code ‘means’.
- Confidence in the software is established from the start.
- If the code passes its tests, yet the user is not satisfied with the result, a test-first approach compels the user to explain what they want explicitly.
- Because code is debugged as it is written, later bugs are generally:
- easily identified.
- easily fixed.
- TDD is fast compared to traditional methods of code testing.
- Setup, execution, validation and teardown times are often measurable in seconds or minutes, not hours or days.
- Tests are saved with the code, creating a built-in, constantly updated regression test pack.
- Unlike debugging, most of whose elements (watching variables, echo/print statements, break points) need to be deleted once a bug has been removed, the tests created for TDD are a permanent improvement to the code that will benefit all future developers.
- Extremely high levels of code coverage are realistically achievable.
- TDD ensures that code is highly testable (by other methods besides TDD, that is).
- Developers and support teams often don’t read formal documentation, so embedded test-first tests are effectively a (very low-level) design specification, with the additional advantage that it is always fully synchronised with the production code.
- The problem of tracking configurations of code, stubs, test data, etc. is eliminated.
- Small tests are easier to understand than large, complex test cases.
- TDD tends to encourage the development of well-structured code, with robust interfaces, clear modularisation, etc.
- TDD makes refactoring quicker and safer.
- Tests are created for all features and functions.
- Running the tests takes only a couple of clicks.
- Feedback is very rapid.
Limits of TDD
Test-driven development isn’t always the way to go for every kind of development.
Or more correctly, TDD is never the only sort of testing you should apply. Apart from problems it shares with other tests – cost of maintenance, hard-coding, mocking, weak tests, inter-test dependencies, etc. – here are some of the issues specific to TDD:
- As the coder and tester are the same individual, coding and testing may share identical blind spots, false assumptions, misinterpretations, etc.
- If your system needs documentation (e.g., business, operational and user documentation, or because it is subject to detailed regulation), TDD assertions are no substitute for more formal or higher-level specifications, manuals, etc.
- Certain kinds of problem are not well tested using TDD:
- Evaluating the ‘look and feel’ of user interfaces: this is a qualitative issue, requiring professional judgement rather than objective testing (although TDD is fine for checking system control, flow and behaviour).
- Verifying multi-threading and parallelism.
- Anticipating the full range of uses in polymorphic code.
- This may be extremely difficult to define, and legitimate uses may not be tested for.
- This in turn may lead to many false positives and negatives.
- Detailed execution timing or overall performance.
- Making sure you have really covered all the detail of, for example, regulatory requirements or detailed technical architectures.
- Developing complex algorithms or architectural elements that cannot be reduced to a number of discrete tests.
- Functionality that relies on the precise performance of external systems (databases, networks, etc.).
- Testing for multiple platforms and configurations.
- Testing non-functional features and attributes.
- Working in exploratory mode (experiments, prototyping, etc.).
- Changes to legacy systems can be hard to do by TDD, because a huge amount of related functional code is not self-testing.
- TDD is no substitute for user approval.
- No matter how well the code does what the story or the user has told the developer is needed, both are fallible.
- The final say lies with the user judging the final product, not with TDD tests.
- TDD cannot avoid some of the usual problems with testing:
- There’s a problem with the test – it doesn’t work properly or it doesn’t test what you think it tests.
- The code does things it shouldn’t but errors in the test mean that it passes the test anyway.
- Hard-coded tests (e.g., test strings) may themselves create problems for later coding cycles or production code.
- If TDD isn’t generally accepted in the organisation, other stakeholders and functions may insist on additional tests to re-check what TDD has already verified, so effort is wasted.
- Likewise, it’s harder to maintain TDD tests if TDD isn’t a widely used practice in the organisation. TDD tests may even become actively counter-productive.
- It is as essential to continuously refactor TDD tests as to refactor the code.
- This adds complexity, risk and cost to the development process.
- By ensuring rapid feedback, TDD may actually make coding cheaper. But if it doesn’t, the additional cost needs to be quantified and added to coding budgets.
- This creates complications for existing planning and budgeting methods and tools.
- Changes can easily invalidate TDD tests in code that is not otherwise affected by the change.
- So detecting and updating the affected tests can be difficult.
- Scaling is an issue:
- Large, complex test suites are unlikely to be fast enough to routinely use TDD.
- TDD is not suited to load or performance testing.
- However, a well-structured, carefully partitioned combination of program and test suite will allow partial running (of either side), and so speed up execution.
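As a sketch of the partial running just mentioned, Python’s unittest can load and run only one partition of a larger suite rather than the whole (slow) regression pack. The test classes here are invented placeholders:

```python
import unittest

# Two partitions of a hypothetical larger suite.
class FastChecks(unittest.TestCase):
    def test_trivial(self):
        self.assertTrue(1 + 1 == 2)

class SlowChecks(unittest.TestCase):
    def test_pretend_heavy(self):
        # Stands in for a slow, environment-dependent test we want to skip.
        self.assertTrue(True)

# Build a suite containing only the fast partition, and run just that.
suite = unittest.TestLoader().loadTestsFromTestCase(FastChecks)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun)   # only the partitioned subset ran
```

The same partitioning can be done from the command line, e.g. by naming a single module or class after python -m unittest.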
One reason why TDD is simple and easy to do is that it is widely supported by tools, many integrated directly into standard development environments.
Similarly, many modern programming languages include TDD functions – you don’t need to learn a new language just to test.
Most of these test tools are referred to as ‘xUnit testing frameworks’, though there are competing, non-xUnit approaches. They exist for all major programming languages and tend to have names like JUnit (for Java), CppUnit (for C++), and unittest (for Python). They all share a common structure that descends from a simple framework (now called SUnit) created in 1994 for Smalltalk by Kent Beck (an original signatory of the Agile Manifesto, who also created Extreme Programming, or XP – another cornerstone of Agile).
How xUnit frameworks work
xUnit frameworks are based on a small number of shared ideas.
The most central of these is the idea of ‘assertion’.
An assertion is a statement that something is true. The point of TDD is to put assertions into the code: if an assertion is true, it evaluates silently and without comment; if it is false, the code throws up an error. For instance, I may want to assert that:
- A specified variable has a specified value.
- Variable A is equal to twice variable B.
- An object is null.
- An object is not null.
And so on. xUnit frameworks provide assertions for all these options and many more.
Conversely, if an assertion fails, the code is wrong, and you should stop and fix it. And from Agile’s point of view (‘debug first’), fix it now. After all, who knows what the effect of ignoring this defect might be – or how long it will be before it surfaces again (quite possibly fatally).
So that’s what TDD encourages you to do: it simply stops execution and leaves it to you to do something about it. It won’t force you to do anything, but on the other hand it will break every time you reach this point (if, that is, you have testing switched on). It does not handle the error in any way, offer helpful diagnostics or ‘manage’ the defect as though it were an exception – it just stops.
The test code is stripped out during compilation, so it will not clutter up your production code.
Some programming languages, like Java, have the assert keyword built in.
Others need a unit testing library to be included. Both allow you to put in a simple statement, usually something like this:
assert a == b + c
If it turns out that variable a really is equal to variable b plus variable c then execution passes on to the next line of code without pausing. If it doesn’t, an error occurs.
This is a very trivial example, of course. Beyond simple assertions, you can also have things like:
- assertArrayEquals (which compares arrays).
- assertNull and assertNotNull.
- assertSame and assertNotSame (which test whether two references relate to the same underlying object).
- assertTrue and assertFalse (which test Booleans).
And so on.
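A sketch of how several of these variants look in Python’s unittest, which spells some of them differently (assertIs/assertIsNot rather than assertSame/assertNotSame, and assertIsNone rather than assertNull):

```python
import unittest

class TestAssertionVariants(unittest.TestCase):
    def test_variants(self):
        a, b = [1, 2, 3], [1, 2, 3]
        self.assertEqual(a, b)        # same contents (like assertArrayEquals)
        self.assertIsNot(a, b)        # but not the same object (assertNotSame)
        c = a
        self.assertIs(a, c)           # the same underlying object (assertSame)
        self.assertIsNone(None)       # assertNull
        self.assertIsNotNone(a)       # assertNotNull
        self.assertTrue(len(a) == 3)  # assertTrue
        self.assertFalse(a == [])     # assertFalse
```

The names vary between frameworks, but the underlying vocabulary of equality, identity, nullness and truth is common to the whole xUnit family.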
That’s all there is to it – that is the test. The test runs silently: if it passes, nothing happens; if it fails, execution stops. Nothing more.
However, just as a programming language typically has very few ‘words’, but what can be ‘said’ with it is immensely complex, so TDD frameworks offer the ability to use a very small vocabulary of tests to ask a lot of very insightful questions.
The specifics can vary from language to language. For example, Objective-C has NSAssert instead, and sometimes you can add an error message or a log message to be output when failure occurs.