Friday, 5 November 2010

Communicative testing

A couple of weeks ago I proposed that tests could be thought of as facts that have to be 'explained' by code. In a comment on that post, p.j. hartlieb pointed out that this paradigm relies on the tests being highly dependable. And @hlangeveld suggested that test runs should be seen as analogous to experiments.

p.j. hartlieb and @hlangeveld help drive home the point that the purpose of tests is to provide information. If your tests aren't telling you anything, they're useless.

Normative tests

Test runs tell you whether you've finished new features and if you've broken old ones. I would call that normative information, because it reports on conformance to requirements. That kind of knowledge can answer questions like "Is this change ready to commit?" or "Can we go live on Monday?".

Management love normative information because it helps them make decisions and measure progress. This naturally leads to an over-emphasis on tests' role as a source of normative information.

Informative tests

Good tests are also informative. They explain the meaning of failures and communicate intent. Tests can serve as alternative requirements documentation. Indeed, systems like FitNesse unify the two concepts by converting requirements into executable acceptance tests.

The audience for informative tests is almost exclusively the development team. Informative tests provide an intimate perspective on the system's concepts that's necessary to work with the software on a daily basis. This is not information required by management, so the impetus to improve the tests' informative qualities needs to come from the development team themselves.

A Selenium system test that reports failure by dumping a raw exception stacktrace serves its normative function perfectly well. There has been a regression. We are not ready to release. Someone tell management so that they can manage the client's expectations. From Issue 658 in the Selenium bug tracker:

org.openqa.selenium.ElementNotVisibleException: Element is not currently visible and so may not be clicked
System info: os.name: 'Mac OS X', os.arch: 'x86_64', os.version: '10.6.1', java.version: '1.6.0_15'
Driver info: driver.version: remote
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:94)
 at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:327)
 at org.openqa.selenium.firefox.FirefoxDriver.execute(FirefoxDriver.java:191)
 at org.openqa.selenium.remote.RemoteWebElement.execute(RemoteWebElement.java:186)
 at org.openqa.selenium.remote.RemoteWebElement.click(RemoteWebElement.java:55)
 at org.openqa.selenium.internal.seleniumemulation.Click.handleSeleneseCommand(Click.java:33)
 at org.openqa.selenium.internal.seleniumemulation.Click.handleSeleneseCommand(Click.java:23)
 at org.openqa.selenium.internal.seleniumemulation.SeleneseCommand.apply(SeleneseCommand.java:30)
 at org.openqa.selenium.WebDriverCommandProcessor$1.call(WebDriverCommandProcessor.java:271)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:637)

If this were all that appeared in your test log, it would be very difficult to interpret the failure. There is no context. It's not apparent what functionality the user has lost, whether the error was handled gracefully or even whether the problem is a conflict between user stories.

One way to make the result above more informative would be to catch the exception and log a message like "Error when an administrator attempted to reactivate a blocked account". Product owners don't care about the presence of divs. They care about functionality.
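As a minimal sketch of that idea, the wrapper below runs one test step and, if it fails, rethrows the failure with a description written in the product owner's terms, keeping the original exception as the cause so the stack trace is not lost. The names `CommunicativeStep`, `step` and `StepFailedException` are hypothetical, not part of Selenium, and the `IllegalStateException` is only a stand-in for whatever the driver would really throw. It uses lambdas, so it assumes Java 8 or later.

```java
public class CommunicativeStep {

    /** Hypothetical exception type: a failure described in user-level terms. */
    public static class StepFailedException extends RuntimeException {
        public StepFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs one test step. If the step throws, the failure is rethrown with
     * the business-level description in front and the low-level exception
     * attached as the cause.
     */
    public static void step(String description, Runnable action) {
        try {
            action.run();
        } catch (RuntimeException e) {
            String message = "Error when " + description
                    + " (cause: " + e.getClass().getSimpleName()
                    + ": " + e.getMessage() + ")";
            throw new StepFailedException(message, e);
        }
    }

    public static void main(String[] args) {
        try {
            step("an administrator attempted to reactivate a blocked account", () -> {
                // Stand-in for a browser interaction that fails, such as
                // clicking an element that is not visible.
                throw new IllegalStateException(
                        "Element is not currently visible and so may not be clicked");
            });
        } catch (StepFailedException e) {
            // The log now leads with the lost functionality, not the div.
            System.out.println(e.getMessage());
        }
    }
}
```

The low-level detail is still there for whoever has to debug the failure, but the first thing anyone reads is which piece of functionality stopped working.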

Communicative tests

Testing consumes a lot of effort. The return for that investment is readily available information on the state of the software. The more useful and accessible that information is, the more valuable the tests are.

Donald Knuth's description of literate programming is even more pertinent to testers than other programmers because the only purpose of tests is "explaining to human beings what we want a computer to do."

Blunt quantitative statements are sufficient to communicate normative information to people outside the development team. But to fulfill their potential within the team, test results must be qualitative, explanatory and communicative.

2 comments:

  1. The main problem is finding the time to turn normative tests into informative tests.

    I can tell you from the stack trace that the script failed because it has tried to interact with a hidden element, so it is reasonably informative.

    Will management provide us with the time and resource to take this information and make it even more informative so that you don't need a tester who understands the nuts and bolts of the system to interpret it? Or will the testers be given a whole load of new work to test, on the assumption that code shouldn't really break so we don't need to worry about informative reporting until it does break (let's hope the guy who wrote that script is still working for us when it does break...)?

    The question is how do we convince those higher up the chain that the extra time and investment to make the tests informative is worthwhile?

  2. It's a similar dilemma to convincing managers that code quality is important (or that it even exists).

    I suppose the answer is that teams have to earn enough trust that they are empowered to choose the practices that will make them most productive.

    If people outside the team are making inappropriately detailed decisions, they'll probably get them wrong.

    Sometimes extra investment in testing makes sense, and sometimes it doesn't. The people best placed to make that call are the developers themselves.
