Tests vs. Test Specifications

When you are first introduced to the idea of test-driven development, it may seem strange that tests become such a central focus. Sure, testing is important, and without it you wouldn’t catch unintended behavior before it reaches customers. But why should tests be the driving concern of development? Why are they the driver instead of, say, a design document, or user stories?

In order to understand this, we need to draw an important distinction. I’ll start with an example. Consider the following sequence of instructions:

1: Place 4 cups of water in a large pot
2: Bring the water to a boil, then reduce to a simmer
3: Place 2 cups of white rice into the water, and cover
4: Cook for 25 minutes

Are you looking at food? No, you’re looking at a recipe for food. The recipe is a sequence of instructions that, if followed, will result in food. You can’t eat a recipe. You can follow the recipe, and then eat the food that gets created.

When we talk about “tests” in test-driven development (TDD), we’re not actually talking about the act of “testing”. We’re actually talking about the recipes for testing. When a developer who writes “automated” tests hears the word “test”, he most likely thinks of something like this:

@Test
public void testSomeBehavior() {
    // Arrange: set up the state the test depends on
    prepareFixturesForTest();
    SomeClass objectUnderTest = createOUT();
    Entity expectedResult = createExpectedResult();

    // Act: perform the behavior being specified
    Entity actualResult = objectUnderTest.doSomeBehavior();

    // Assert: compare what happened against what was expected
    Assert.assertEquals(expectedResult, actualResult);
}

That sequence of instructions is what we mean when we say “test”. But calling this a “test” is potentially confusing, because it would be like calling the recipe I printed above “food”. The “test”, meaning the process that is performed and ends with a “success” or “failure”, is what happens when we follow the instructions in this block of code. The code itself is the instructions for how to run the test. A more accurate term for it is a “test recipe”, or test specification. It is the specification of how to run a test. Testing means actually executing this code, either by compiling it and running it on a machine, or by having a human perform each step “manually”.

Before “automated” tests, which developers write in the same (or a similar) language as their production code, testers wrote documents in English describing what to do when it was time to test a new version. The only difference is the language. Both of these are test specifications: the instructions followed when doing the actual testing.
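
For example, a manual test script for the behavior in the Java snippet above might have read something like this (a hypothetical sketch):

1: Set up the data and state that the behavior depends on
2: Perform the behavior on the object being tested
3: Compare the actual result to the expected result; mark the test “pass” if they match and “fail” if they don’t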

When we say “test-driven development”, we’re not talking about the driving force being the act of running tests on an application. We’re really talking about the creation of test specifications. We really mean “test-specification-driven development”. Once that is clear, it starts to make sense why it is so effective for test specifications to be the driver.

The full, explicit realization of what test specifications actually are is, arguably, the defining characteristic of “behavior-driven development” (BDD). Building on top of TDD, BDD recognizes that tests (really, test specifications) are the most thorough, accurate and meaningful form in which the specification of behavior/requirements exists. After all, what is the difference between a “story”, or “design spec”, or some other explanation of what a piece of software is supposed to do, and the instructions for how to validate whether it actually does that or not? The answer is… nothing! Well, the real difference is that stories or design specs can be vague, ambiguous, missing details, etc., and it’s not obvious. When you interpret a design spec as the step-by-step instructions for how to validate the behavior, so exact and detailed that a machine can understand them, suddenly those missing details become obvious, and they’ll need to be filled in.
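
As a sketch of what this looks like in practice (the class and method names here are hypothetical, and a JUnit-style framework is assumed), take a story that says “users can search for products by name”. Writing the validation as something a machine can execute forces out the details the story never mentioned:

// Is matching case-insensitive? The story doesn't say; the test has to decide.
@Test
public void searchMatchesRegardlessOfCase() {
    ProductCatalog catalog = new ProductCatalog();
    catalog.add(new Product("Blue Widget"));

    List<Product> results = catalog.searchByName("blue widget");

    Assert.assertEquals(1, results.size());
}

// What does an empty query return? Again, the test cannot leave the question open.
@Test
public void emptyQueryReturnsNoResults() {
    ProductCatalog catalog = new ProductCatalog();
    catalog.add(new Product("Blue Widget"));

    Assert.assertTrue(catalog.searchByName("").isEmpty());
}

Either answer might be the right one; the point is that the executable form won’t let the question go unasked.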

Before the underlying equivalence of design spec and test spec was properly understood, testers often became the ones who filled in the missing details, as they turned vague spec requirements into fleshed-out test scripts (whether they were writing them down, or just storing them in their heads). The testers were, in effect, the true product owners. They dictated the minute details of behavior in the app, by defining exactly what behavior is a “pass”, and what is a “fail”. Of course, a necessary step in releasing software is that it “passes” QA. When the software ends up in the hands of product owners and they aren’t happy with what they see despite it “passing” the tests (or, the opposite, they are happy with what they see but QA insists it “failed” the tests), it creates a lot of confusing noise in the development pipeline, in the form of undocumented change requests (which typically re-trigger confusion on future releases) or bogus bug reports. Furthermore, developers won’t really know if they coded what they were supposed to until after they send something to the testers and get the feedback. In the “classic”, more siloed shops, with less communication between the “dev” org and the “QA” org, devs often wouldn’t see the test scripts QA was using, and would have to gradually discover what QA considered “correct” behavior through a game of back-and-forth of releasing, failing, re-releasing, etc.

TDD and BDD are the solution to these problems. Even if the developers who will eventually implement the behavior are not the ones who write the tests for that behavior (one of the common objections to TDD is that coders and testers should be different people, and they still are: automated tests are run by machines, not by the coders), they at least have access to those tests and use them as the basis for what code to write and for deciding when it is satisfactorily complete. The creation of a test specification is correctly placed at the beginning, rather than the end, of the development cycle, and is actively used by the developers as a guide for implementation. This is exactly what they used to do, except they used the “design spec” or “story acceptance criteria” instead of the exact sequence of steps, plus the exact definition of “pass” and “fail”, that the testers will eventually use to validate it.

The alternative to TDD is “X-driven development”, where X is whatever form a design requirement takes in the hands of developers as they develop it. Whatever that requirement is, the testers also use it to produce the test script. The error in this reasoning is failing to understand that when the testers do this, they are actually completing the “design spec”, which is really an incomplete, intermediate form of a behavioral requirement. TDD, and especially BDD, move this completion step to where it belongs (at the beginning), and involve all the parties who should be in attendance (most importantly the product owners and the development team).

Also note that while the creation of the test spec moves to the beginning of development, the passing execution of the test is still at the end, where it obviously must be (another major benefit of TDD is adding test executions earlier, when they are supposed to fail, which tests the test and ensures it’s actually a valid test). The last step is still to run the test and see it pass. Understanding this requires explicitly separating what we typically call “tests” (which are actually test specifications) from the act of running tests.
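
A minimal sketch of that “see it fail first” step, again with hypothetical names and a JUnit-style framework assumed:

@Test
public void totalIncludesTenPercentTax() {
    // Written before the tax behavior exists, so this fails (or doesn't even
    // compile) at first, which is exactly what proves it can actually fail.
    Invoice invoice = new Invoice();
    invoice.addLineItem(new LineItem("Widget", 10000)); // price in cents

    // Fails while totalInCents() returns only 10000; passes once the 10% tax is implemented.
    Assert.assertEquals(11000, invoice.totalInCents());
}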

With this clarified, hopefully developers will acquire the appropriate respect for the tests in their codebase. They aren’t just some “extra” thing that gets used at the end as a double-check. They are your specifications. In your tests lies the true definition of what your code is supposed to do. They are the best form of design specification and code documentation that could possibly exist (a test that can be read to understand exactly what will make it pass, and then actually run to confirm that it does pass, is much better than a code comment explaining the author’s intent in often vague words). That’s why they are arguably more important than the production code itself, and why a developer who has truly been touched by the TDD Angel and “seen the light” will regard the tests as his true job, and the production code as the secondary step.
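
To make that comparison concrete with a small, hypothetical sketch: a comment can only gesture at the rule, while a test states it exactly and can be re-run to confirm it still holds.

// A comment might say: "rounds the amount appropriately for display". Appropriately how?
// The test (DisplayFormatter is a hypothetical name) answers precisely:
@Test
public void amountsRoundHalfUpToTwoDecimalPlaces() {
    Assert.assertEquals("2.35", DisplayFormatter.forDisplay("2.345"));
}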

This, I believe, is the underlying force that also makes TDD a very effective tool for discovering the best design for code, which I think is its most valuable feature. Well-designed code emerges from a thorough understanding of exactly what problem you are trying to solve. Writing unit tests helps you discover this design earlier than you otherwise would (by writing a version of it, experiencing the pain points of the initial design firsthand, and refactoring in response to them), because tests (test specifications) are specifications placed on every level, in every corner, of the codebase.

Code that is “not testable” is code whose behavior cannot be properly specified. The reason “badly designed” code is “bad” is that it cannot be made sense of (if it works, it’s a happy, and typically quite temporary, accident). Specifying behavior down to the unit level requires making sense of the code, which quickly reveals the design forces that make it un-sensible. This is really the same thing that happens at the product level. Instead of waiting until a defective product is out to discover the misunderstandings, the misunderstandings get resolved in the communal development of the behaviors. Likewise, developers who use TDD to drive design, which is when development truly becomes test-driven, don’t have to wait until a problem is solved to realize that the solution is problematic. Those design defects get discovered and corrected early on.
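
As a sketch of what “not testable” tends to look like (hypothetical names, with constructor injection as one possible remedy): a class that constructs its own collaborator and reaches out to a live system cannot have its behavior specified in isolation, while the same logic with the dependency passed in can.

// Hard to specify: the collaborator is created internally and talks to a real pricing system.
public class PriceChecker {
    public boolean isDiscounted(String sku) {
        PricingService service = new PricingService(); // hidden dependency, live calls
        return service.currentPrice(sku) < service.listPrice(sku);
    }
}

// Easy to specify: the dependency is passed in, so a test can hand in a fake PricingService.
public class RefactoredPriceChecker {
    private final PricingService service;

    public RefactoredPriceChecker(PricingService service) {
        this.service = service;
    }

    public boolean isDiscounted(String sku) {
        return service.currentPrice(sku) < service.listPrice(sku);
    }
}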

What’s driving development in TDD isn’t the act of validating whether the code is correct. It is the act of precisely defining what correctness means that drives development.