Sunday 19 October 2014

Types don't substitute for tests

When reading discussions about the benefits of types in software construction, I've come across the following claim:
"When I use types, I don't need as many unit tests."
This statement is not consistent with my understanding of either types or test-driven design. When I've inquired into the reasoning behind the claim, it often boils down to the following:
"Types provide assurance over all possible arguments (universal quantification). Unit tests provide assurance only for specific examples (existential quantification). Therefore, when I have a good type system, I don't need to rely on unit tests."
This argument does not hold in my experience, because I use types and unit tests to establish different kinds of properties about a program.

Types prove that functions within a program will terminate successfully for all possible inputs (I'm ignoring questions of totality for the sake of simplifying the discussion).

Unit tests demonstrate that functions yield the correct result for a set of curated inputs. The practice of test-driven design aims to provide confidence that the inputs are representative of the function's behaviour through the discipline of expanding a function's definition only in response to an example that doesn't yet hold.

All of the examples that I use in my practice of test-driven design are well-typed, whether or not I use a type system. I do not write unit tests that exercise the behaviour of the system in the presence of badly-typed input, because in an untyped programming language it would be a futile exercise and in a typed programming language such tests would be impossible to write.

If I write a program using a type system, I still require just as many positive examples to drive my design and establish that the generalisations I've created are faithful to the examples that drove them. Simply put, I can't think of a unit test that I would write in the absence of a type system that I would not have to write in the presence of one.

I don't use a type system to prove that my functions return the output I intend for all possible inputs. I use a type system to prove that there does not exist an input for which my functions fail to terminate successfully (again, sidestepping the issue of non-total functions). In other words, a type checker proves the absence of certain undesirable behaviours, but it does not prove the presence of the specific desirable behaviours that I require.
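As a minimal sketch of that distinction, here are two Python functions with type hints. The function names are hypothetical, and I'm assuming the annotations are checked by an external tool such as mypy. Both share the same signature and so are equally well-typed, yet only one returns the result I intend. A single example is enough to tell them apart.

```python
from typing import Callable, List


def largest(xs: List[int]) -> int:
    """Intended behaviour: return the largest element of a non-empty list."""
    return max(xs)


def largest_buggy(xs: List[int]) -> int:
    """Same signature, so just as well-typed, but it returns the smallest element."""
    return min(xs)


def agrees_with_example(f: Callable[[List[int]], int]) -> bool:
    # A unit-test-style example: the largest element of [3, 1, 2] should be 3.
    return f([3, 1, 2]) == 3
```

Here the annotations rule out calls such as `largest("abc")`, but only the example distinguishes `largest` from `largest_buggy`.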

Type systems are becoming more sophisticated and are capable of proving increasingly interesting properties about programs. In particular, dependently typed programming languages like Idris can be used to establish, for example, that a list is never empty, or properties about the parity of a sum.

But unless the type system proves that there is exactly one inhabitant of a particular type, I still require a positive example to check that I've implemented the right well-typed solution. And even if the type provably has only one inhabitant, I would still likely write a unit test to help explain to myself how the abstract property enforced by the type system manifests itself.
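To illustrate with a hypothetical Python sketch (the alias `BinaryOp`, the `CANDIDATES` table and the helper `consistent_with` are names I've made up for this example), the type "takes two integers and returns an integer" has many inhabitants. The type alone cannot say which one I meant; a positive example narrows the field.

```python
from typing import Callable, Dict, List, Tuple

# Every function below inhabits this one type.
BinaryOp = Callable[[int, int], int]


def add(a: int, b: int) -> int:
    return a + b


def subtract(a: int, b: int) -> int:
    return a - b


def multiply(a: int, b: int) -> int:
    return a * b


CANDIDATES: Dict[str, BinaryOp] = {
    "add": add,
    "subtract": subtract,
    "multiply": multiply,
}


def consistent_with(args: Tuple[int, int], expected: int) -> List[str]:
    """Return the names of the candidates that agree with one example."""
    return [name for name, f in CANDIDATES.items() if f(*args) == expected]
```

The example `(2, 3) -> 5` leaves only `add` standing, whereas `(2, 2) -> 4` is satisfied by both `add` and `multiply`, which is one reason the choice of representative examples matters.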

A type system is complementary to unit tests produced by test-driven design. The presence of a type system provides additional confidence as to the correctness of a program, but as I write software it does not reduce the need for examples in the form of unit tests.

5 comments:

  1. I take it you're using "unit tests" in the original narrow sense of a test that exercises a single unit of functionality in isolation? I'm not sure if you're right in this post, but it seems to me that your argument is strongest when it comes to this kind of test.

    Or, to put it another way: It seems to me that your argument gets weaker as we consider tests more broadly.

    I don't know about you, but *I* need to write *some* functional and integration tests in dynamic languages that are *not* needed in a language with a sensible type system. This is because I make all kinds of brain-dead stupid errors when my components start interacting even when all the components are working well in isolation (as confirmed by unit tests). I pass an int where a string was required, etc. Stuff breaks at the boundaries of components, in their interactions. When I go to work and write Python, I just have to test for goofy stuff like that (along multiple code paths). When I come home and play around with Haskell, I just don't: I line up a bunch of functions and the compiler tells me right away that I did something stupid.

    Anyway, it seems to me that there *is* a class of tests that a good type system saves us from writing, and that these tests are not likely to be unit tests strictly speaking. It also seems that people often, though not always, have *these* tests in mind when they brag about the tests that their type system saved them from writing. I wonder if people are talking past each other when they argue about types and tests because they have different tests in mind.

    To turn this into a question: Do you think your thesis that types and tests are complementary gets weaker when we start to consider tests that aren't pure unit tests? Do you think some people making claims about types as a replacement for some tests have different kinds of tests in mind (i.e., not unit tests) than the ones you're talking about in this post?

    1. Hear, hear.

    2. Good point, Chris. I was specifically thinking of unit tests.

      I've had feedback from you and others about the value of integration/component tests that check contracts at a high level, and about how these tests establish type-related properties of data.

      A large number of these integration tests can be a sign of too few unit tests or code that's hard to reason about, but that's another argument.

  2. When you're working in an untyped language, you write all the unit tests you would write in a typed language, sure, and establish the same properties. But what about the additional properties that you would have established via the type system? How do you establish those properties in the untyped language? Either those properties are not valuable to establish, not something you make mistakes about - in which case typing is useless. Or they are valuable, which means you do need to establish them in the untyped languages. In which case, what other method can you use to do so except by writing more unit tests that you wouldn't have written in the typed language?

    1. In my experience, writing in an untyped language means doing without the kind of guarantees a type system provides. Having strong testing in other ways makes up for that to some extent, but I don't think that users of untyped languages replace type safety.

      Having said that, property-based testing looks like a really interesting way to check things about a system, even things that are hard to express in a type system.
