Tuesday, 14 December 2010

Coding with ghosts

Software development is one of the most intensely collaborative processes I know. Yet programming is stereotypically seen as an activity for loners.

The majority of collaboration between developers is hidden because it happens across months and years. When I write code, I am working incredibly closely both with the original architect and with the future employee who hasn't graduated yet. I am communicating with people I may never meet, and also with past and future versions of myself.

When I leave work, most of the collaboration that I have participated in that day hasn't even happened yet.

Sunday, 5 December 2010

Wikileaks is Napster

Remember Napster, the first widely successful peer-to-peer file-sharing service? In 2001, the Recording Industry Association of America won a lawsuit that killed Napster as a free service.

Lawsuits against Napster did not shut down peer-to-peer file-sharing because, like Wikileaks, Napster was neither the source nor the ultimate destination of the information flowing through it.

Wikileaks is a peer-to-peer file-sharing service. Its current architecture is centralised, which is a weakness that it shares with Napster. But if the central node is taken out, it won't be long before a new service with a decentralised architecture springs up.

The only thing that shutting down Wikileaks could possibly achieve is to make whistle-blowing slightly less convenient. For about a month.

Sunday, 28 November 2010

Architecture is the opposite of surprise

Architects often disagree on technical matters. But there's also a surprising amount of disagreement on what software architecture actually is. Here are a few definitions that I've come across:
  • an abstract description of the system
  • work done by developers with the title "architect"
  • that diagram with clouds and arrows that somebody put on the network drive at the start of the project before we really knew what we were building
  • that document signed by the client at the start of the project before we really knew what we were building
The problem with all these definitions is that they imply that systems that don't have architects don't have architectures. Any definition that can't cope with the majority of applications that have emerged without considered architectural analysis isn't very useful.

Martin Fowler's definition is more useful and widely applicable. He suggests that software architecture is the set of “things that people perceive as hard to change.” [PDF] This characterisation is successful because it shifts the focus from the production to the consumption of architecture.

Fowler's definition challenges the intentional fallacy, the idea that the meaning of a text belongs to its author, as it applies to software architecture. Fowler's architecture can therefore include elements that were never deliberately envisaged by an architect, which in turn lets us consider systems that never had an architect.

A similar idea was advanced in the essay The Death of the Author by the literary critic Roland Barthes:
As soon as a fact is narrated no longer with a view to acting directly on reality but intransitively, that is to say, finally outside of any function other than that of the very practice of the symbol itself, this disconnection occurs, the voice loses its origin, the author enters into his own death, writing begins.
Substitute "architect" for "author" and "development begins" for "writing begins" and Barthes could be talking about what happens when a carefully prepared architecture document is handed over for implementation.

Claude E. Shannon's information theory formally analyses the consumption of texts. He measured the information content of written English by showing test subjects a truncated piece of English text and asking them to guess what letter would come next. They guessed correctly about half the time (which means that English contains roughly 1 bit of information per letter).

The beauty of this experiment is that Shannon didn't need a model of his subjects' knowledge of English. All he had to do was observe what happened when they applied that knowledge.
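A toy version of Shannon's guessing game is easy to run yourself. In the sketch below (illustrative only, and a crude simplification of Shannon's method) a bigram table stands in for a human subject: it guesses the most likely next letter at each position, and the guess accuracy is converted into a rough bits-per-letter figure.

```python
from collections import Counter, defaultdict
from math import log2

def guessing_game(text):
    """Train a bigram 'subject' on text, then replay Shannon's game:
    at each position, guess the most likely next letter given the
    previous one, and record whether the guess was right."""
    successors = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        successors[a][b] += 1

    correct = 0
    for a, b in zip(text, text[1:]):
        guess = successors[a].most_common(1)[0][0]
        correct += (guess == b)

    p = correct / (len(text) - 1)
    # If the subject guesses right with probability p, a correct guess
    # carries about -log2(p) bits: a (very) rough entropy proxy.
    return p, -log2(p)

p, bits_per_letter = guessing_game("the quick brown fox jumps over the lazy dog " * 50)
```

A bigram model is a far weaker predictor than Shannon's human subjects, so the figure it produces is an upper bound at best; the point is the shape of the experiment, not the number.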

Implicit in Shannon's experiment is the idea that English is the sum of all cues that inform speakers as to what could come next. Following his approach, I would define architecture as the sum of all cues that suggest to a developer how a feature should be implemented in a particular system.

These cues can take many forms. Perhaps the arrows and clouds diagram tells a developer in which tier of the system to put a particular piece of logic. But developers are also guided by the language used by stakeholders, organisational structure and the culture of the technology stack.

Under this definition, the more prescriptive a system's architecture, the less information developers need to absorb in order to understand a given feature. In other words, architecture is the opposite of surprise.

Wednesday, 17 November 2010

Russell on programming language design

A good notation has a subtlety and suggestiveness which make it seem, at times, like a live teacher. 
- Bertrand Russell, in the introduction to Ludwig Wittgenstein's Tractatus Logico-Philosophicus 

Friday, 5 November 2010

Communicative testing

A couple of weeks ago I proposed that tests could be thought of as facts that have to be 'explained' by code. In a comment on that post, p.j. hartlieb pointed out that this paradigm relies on tests being highly dependable. And @hlangeveld suggested that test runs should be seen as analogous to experiments.

p.j. hartlieb and @hlangeveld help drive home the point that the purpose of tests is to provide information. If your tests aren't telling you anything, they're useless.

Normative tests

Test runs tell you whether you've finished new features and if you've broken old ones. I would call that normative information, because it reports on conformance to requirements. That kind of knowledge can answer questions like "Is this change ready to commit?" or "Can we go live on Monday?".

Management love normative information because it helps them make decisions and measure progress. This naturally leads to an over-emphasis on tests' role as a source of normative information.

Informative tests

Good tests are also informative. They explain the meaning of failures and communicate intent. Tests can serve as alternative requirements documentation. Indeed, systems like Fitnesse unify the two concepts by converting requirements into executable acceptance tests.

The audience for informative tests is almost exclusively the development team. Informative tests provide an intimate perspective on the system's concepts that's necessary to work with the software on a daily basis. This is not information required by management, so the impetus to improve the tests' informative qualities needs to come from the development team themselves.

A Selenium system test that reports failure by dumping a raw exception stacktrace serves its normative function perfectly well. There has been a regression. We are not ready to release. Someone tell management so that they can manage the client's expectations. From Issue 658 in the Selenium bug tracker:

org.openqa.selenium.ElementNotVisibleException: Element is not currently visible and so may not be clicked
System info: os.name: 'Mac OS X', os.arch: 'x86_64', os.version: '10.6.1', java.version: '1.6.0_15'
Driver info: driver.version: remote
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:94)
 at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:327)
 at org.openqa.selenium.firefox.FirefoxDriver.execute(FirefoxDriver.java:191)
 at org.openqa.selenium.remote.RemoteWebElement.execute(RemoteWebElement.java:186)
 at org.openqa.selenium.remote.RemoteWebElement.click(RemoteWebElement.java:55)
 at org.openqa.selenium.internal.seleniumemulation.Click.handleSeleneseCommand(Click.java:33)
 at org.openqa.selenium.internal.seleniumemulation.Click.handleSeleneseCommand(Click.java:23)
 at org.openqa.selenium.internal.seleniumemulation.SeleneseCommand.apply(SeleneseCommand.java:30)
 at org.openqa.selenium.WebDriverCommandProcessor$1.call(WebDriverCommandProcessor.java:271)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:637)

If this were all that appeared in your test log, it would be very difficult to interpret the failure. There is no context. It's not apparent what functionality the user has lost, whether the error was handled gracefully, or even whether the problem is a conflict between user stories.

One way to make the result above more informative would be to catch the exception and log a message like "Error when an administrator attempted to reactivate a blocked account". Product owners don't care about the presence of divs. They care about functionality.

Communicative tests

Testing consumes a lot of effort. The return for that investment is readily available information on the state of the software. The more useful and accessible that information is, the more valuable the tests are.

Donald Knuth's description of literate programming is even more pertinent to testers than to other programmers, because the only purpose of tests is "explaining to human beings what we want a computer to do."

Blunt quantitative statements are sufficient to communicate normative information to people outside the development team. But to fulfill their potential within the team, test results must be qualitative, explanatory and communicative.

Friday, 29 October 2010

Against technical debt

Technical debt is a very useful concept for explaining the consequences of dirty code to management. However, there is a problem that I have with the debt metaphor. The phrase technical debt implies that it's possible to avoid the debt. If I don't write shoddy code today, I won't have to pay for it tomorrow.

This obscures the fact that though dirty code costs more than clean code, every line of code impedes your agility. Sometimes product owners ask for features that compromise a system's architecture or domain model. When I've tried to describe the technical debt that will be incurred by an awkward feature, I've (quite reasonably) been asked how much effort it would take to "do it properly". I'm stumped, because no matter how thoroughly I implement the feature, it will still cause problems down the line.

Sometimes I fall back on depreciation, which I can use to explain anything that reduces the system's ability to meet future needs. Unlike debt, depreciation isn't automatically reversible. I've also considered that fear-driven estimation might produce estimates that more accurately reflect the long-term cost of a story.

I don't want to see the technical debt analogy deprecated, but I do want to encourage people to think critically about how they use it, because all metaphors have their limits.

Sunday, 24 October 2010

As a stakeholder

A common template for user stories is "As a user, I want". This forces stakeholders to make the business value of the story explicit and encourages consistency.

However, there are some stories that this doesn't make sense for, including ones that are to the business' advantage and the users' detriment. Stating all stories in terms of users' wants can result in bizarre stories that conceal who has a stake in their completion:
As a user, I want my DVDs to not work in other regions, so that I have to buy them again if I move countries.
As much as we focus on users, we don't build commercial software for them. It just so happens that satisfying users is a necessary part of achieving our other aims - like making money.

Users are stakeholders, but they aren't the only stakeholders. If we revise the template to "As a stakeholder, I want", then we're able to state anti-user stories much more naturally:
As the sales department, I want to prevent DVDs bought in one region from being played in another, so that I can release and price DVDs in different markets independently.
Thanks to @MrsSarahJones for pointing this out to me.

Saturday, 16 October 2010

Tests are facts. Code is theory.

Programmers have turned to science to help resolve the software crisis. But they're doing it wrong.

Science envy

Programmers have science envy. We feel that, unlike much of our code, science works. Scientists have spent hundreds of years honing a methodology that helps them assimilate new knowledge and correct error, while we have spent decades frantically accumulating complexity that we can't handle. Strangely, scientific theories become more accurate over time, whereas software systems often decay.

The software industry has tried to learn from science and engineering's success. We call our programming degrees "Computer Science" and "Software Engineering", though they are neither. "Computer Science" students do almost no experiments. The "Software Engineering" concept of exhaustive up-front design has become so discredited that even those who can't imagine any other way feel obliged to pretend that they "don't do Waterfall".

Of course, science and engineering are just analogies when applied to programming. They are meant to be useful ways of imagining our profession, not to be literally true. But in their naive form, I don't think analogies between programming and science are very useful. If we want to benefit from scientific rigour, we need to be more rigorous in how we appropriate scientific concepts.

Scientific testing

Some software testers have used the scientific method as a way of framing their testing activities. For example, David Saff, Marat Boshernitsan and Michael D. Ernst explicitly cite Karl Popper and the scientific method in their paper on test theories. Test theories are invariant properties possessed by a piece of code which Saff et al attempt to falsify over a wide range of data points with an extension to the JUnit testing framework.

I find the reciprocal of this approach useful when debugging. I start with a defect, form a theory as to its cause, then design a test to try and falsify that theory. If I suspect that the issue is caused by rogue javascript, I'll disable javascript and attempt to reproduce the issue. If I can, I've disproved my theory and I need to find another explanation. This helps me to eliminate false causes and gradually home in on the bug.
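That loop of proposing a theory and trying to falsify it over many data points can be sketched in a few lines (the theories here are toy examples, not real debugging sessions):

```python
import itertools

def falsify(theory, data_points):
    """Return the first counterexample to `theory`, or None if the
    theory survives every data point we throw at it."""
    for x in data_points:
        if not theory(x):
            return x   # theory falsified by this counterexample
    return None        # theory survives (for now)

# A theory that survives: reversing a list twice restores it.
double_reverse = lambda xs: list(reversed(list(reversed(xs)))) == xs
survivor = falsify(double_reverse,
                   [list(t) for t in itertools.product([0, 1], repeat=3)])

# A theory that gets falsified: squaring never shrinks a number.
counterexample = falsify(lambda x: x * x >= x, [0.0, 2.0, 0.5, 7.0])
```

A surviving theory isn't proven, only not yet disproven, which is exactly the status of a bug hypothesis you haven't managed to reproduce away.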

The problem with analogies that treat tests as theories and code as a phenomenon is that they tell us nothing about how to write code. The software under test is like gravity, a chemical reaction or the weather. It may or may not have an underlying structure and beauty, but any insights we gain during testing are inevitably after-the-fact.

Worse, they are static models. When software changes over time, the knowledge gathered through "scientific testing" may no longer apply. The scope of scientific testing is confined to a specific version of the software. For example, a tested and verified "theory" about the memory profile of an application may become invalid when a programmer makes a small change to a caching policy.

Tests are facts. Code is theory.

Science's strength is its ability to assimilate new discoveries. If we want to share in its success, a scientific model of software development needs to preserve science's adaptability.

We can go some way to achieving this by reversing the roles of testing and coding in the scientific testing model. Tests are facts. Code's role is as a theory that explains those facts as gracefully and simply as possible.

New requirements mean new tests. New tests are newly discovered facts that must be incorporated into the code's model of reality. Software can be seen as a specialised theory that attempts to embody what the stakeholders want the application to do.

How does that help us?

Once we accept code as a theory, we are in a position to justify employing the most powerful weapon in science's armoury: Occam's razor. Our role is to write the simplest possible code that is consistent with the facts/tests/requirements. Whenever we have the opportunity to eliminate concepts from our code, we should.

Simple code isn't just cheaper. It's more valuable too, because it's easier to change and extend. We can justify this with reference to scientists' experience that the simplest theory is the most likely to survive subsequent discoveries.

As new requirements arrive and our understanding of the domain deepens, we have the opportunity to refactor. Refactoring isn't rework or throwing away effort. Refactoring is enhancing code's value by incorporating new knowledge of what we want our software to do. This could mean adding functionality or reducing complexity. Either makes the software as a whole more valuable.

Science celebrates refactoring. Each new piece of evidence clarifies scientists' understanding of phenomena and helps yield more useful theories. Often these refinements are small, but occasionally an Einstein will have an insight that supersedes Newton's laws of motion. Domain-driven design founder Eric Evans describes such pivotal moments on software projects as "breakthroughs".

Non-developers often assume an application is invariably more valuable with a feature than without it. Yet the example of special relativity allows us to explain otherwise. Newton's laws of motion are perfectly adequate for ordinary use. Unless we are interested in bodies moving close to the speed of light, it's not worth bothering with the additional complexity Einstein's theories bring.

If stakeholders are willing to accept that the application targets the common case and excludes troublesome edge cases, they will enjoy software that is simpler and therefore cheaper and more valuable. Sometimes, there is value in absent features. Always, there is value in simpler code.

Crave simplicity. Celebrate deletion. If science responded to new information by adding special cases then science would be in as big a mess as the software industry. As you incorporate new requirements, attempt to refine your code so that it remains flexible enough to accommodate tomorrow's requirements. Otherwise, your code will become less and less fit for its purpose, which is to provide business value.


When John Maynard Keynes was attacked for repeatedly revising his economic theories, he said, "When the facts change, I change my mind – what do you do, sir?" Take the same attitude with your code, but treat requirements and tests as your facts. And remember, your code is just your best approximation of what your stakeholders want it to be.

Saturday, 2 October 2010

My agile canon

As much as I enjoy reading blog posts, they cannot match the sustained argument from a well-written book. Fortunately, the agile movement has many articulate advocates who have been busy committing their thoughts to paper (and eReader) over the past few years.

Here are some agile books that I recommend:
The exciting thing about reading these books is that they are part of an ongoing conversation. We haven't worked out how to build software well yet, but I think that these books bring us a little closer.

Tuesday, 24 August 2010

Tolstoy on iterative development

But their mother country was too far off, and a man who has six or seven hundred miles to walk before reaching his destination must be able to put his final goal out of his mind and say to himself that he will 'do thirty miles today and then spend the night somewhere'; and during this first stage of the journey that resting-place for the night eclipses the image of his ultimate goal and absorbs all his hopes and desires.

- Leo Tolstoy, War and Peace

Monday, 16 August 2010

Fear-based estimation

I prefer agile development because early feedback mechanisms like TDD, pair-programming and frequent releases reduce my fear of the unknown. I don't like feeling afraid, but when I am I try to take notice, because it's usually an indication that something is not right.

A great agile tradition is acknowledging the human element in software development. In that spirit, I'd like to suggest that developers' fear can be harnessed to improve estimation.

Humans are notoriously bad at temporal reasoning and programmers are even-more-notoriously bad at guessing how long a given piece of work will take. Agile teams work around this by estimating effort in 'story points' that measure the comparative size of tasks. We might not be sure how long designing the new schema will take, but we assign it 2 story points because we know it will take twice as long as another task we gave 1 point.

The rate at which a team completes points is known as its 'velocity' and can be determined by examining the team's progress over time. It's a robust, self-correcting system, but I've observed two drawbacks on projects I've worked:
  • The confidence of estimates isn't captured.
  • The technical debt incurred by a particular piece of work is ignored altogether.
A story that has a chance of going horribly wrong and that will in any case cripple the system's architecture could receive a low story point score because it (probably) won't take long to implement. Confronted with such a story, developers are terrified - and helpless.

I propose that we estimate in terms of 'fear points' to give product owners a disincentive to prioritise dangerous stories. The question we ask in estimation meetings shouldn't be "How big is this task?" but "How afraid does this task make you?".

So long as the developers have an appropriate level of professional cowardice, a story's fear points will reflect the danger the story represents to the project's current and future timelines. And I think that 'danger' is as useful to communicate to management as 'size'.

If product owners want to negotiate a reduction in a story's fear points then they need to reduce uncertainty, remove risky features and compromise on ideas that would incur a lot of technical debt. They could also prioritise other stories that improve the team's confidence in the development process, like improved Continuous Integration.

Whenever I have seen developers afraid of a feature, it has turned out badly. The feature ends up buggy, expensive to change and probably doesn't serve its intended purpose. By gauging developers' fear, agile organisations have an opportunity to avoid trouble before it happens.

If your developers are afraid of a feature - be very afraid.

Monday, 17 May 2010

Safe habits

When I was growing up, I found some of my parents' habits irritatingly conservative. Examples included:
  • Always lock doors from the outside using the key
  • Always turn on the cold tap before the hot tap
The idea was that following these habits guaranteed you could never burn yourself or lock yourself out. But I found these precautions frustrating because I knew that it was unlikely that I'd forget my keys or put my hand in the scalding water.

However, I was mistaking the most likely outcome for the entire distribution. In Life's Grandeur, Stephen Jay Gould calls this "reification" - fixating on the average case and ignoring variation and atypical outcomes.

On any given occasion, my cavalier attitude would probably suffice. But sooner or later, if I locked doors by setting the snib and pulling them shut, my keys would not be in my pocket.

My parents understood my fallibility better than I did. And they were in a better position to appreciate the myriad of door-locking scenarios that would confront me over my lifetime.

That's why I like to think that if my mother were a programmer (and she'd make a good one) she would advocate TDD, which goes as follows:
  1. Write a failing test
  2. Write the minimum of code to pass the test
  3. Refactor
If you make TDD a permanent habit then you are inevitably at most 5 or 10 minutes away from working, tested code. And you know that every bit of code you write has a corresponding test.
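The three steps above can be sketched in miniature (the slugify example is hypothetical; unittest stands in for whatever framework you use). The test is written first and fails, then the minimum of code makes it pass, then you refactor under its protection:

```python
import unittest

# Step 1: write a failing test. At this point slugify doesn't exist,
# so the test fails -- that failure is the point.
class TestSlugify(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("clean code"), "clean-code")

# Step 2: write the minimum of code to pass the test.
def slugify(title):
    return title.replace(" ", "-")

# Step 3: refactor, with the test as a safety net. Any tidying that
# breaks the test is caught within minutes, not days.
```

Each cycle is small enough that an interruption, or a mistake, costs you minutes of work rather than an afternoon.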

Be mindful of your development habits, and design them for the bad days, not the average days.

Monday, 15 February 2010

Descartes on Unit Testing

For he who attempts to view a multitude of objects with the one and the same glance, sees none of them distinctly; and similarly the man who is wont to attend to many things at the same time by a single act of thought is confused in mind.

- René Descartes, 'Rules for the direction of the mind' in Key Philosophical Writings

Saturday, 6 February 2010

Deconstructing The Savoy

The software industry has long looked to the construction industry for inspiration.

We appropriate its vocabulary - programmers "build" software designed by "architects". We draw on its ideas - the seminal Gang of Four book adapted architect Christopher Alexander's concept of design patterns for use in software construction. And the discipline of software engineering was founded on a desire to employ civil engineering practices to help us build complex software systems.

For me, the most striking similarity between the two industries is the frequency of budget blowouts and schedule overruns. The great thing about this for software developers is that it gives us a tangible way of describing our otherwise inexplicable travails and catastrophes to ordinary people.

Yesterday in The London Evening Standard I read an article about the renovations of The Savoy, the famous London hotel. It read almost word-for-word like a story about an overly ambitious IT migration project.
When The Savoy closed on 15 December 2007 for a planned 16-month, £100 million makeover, it was hoped the hotel would quickly resume its status, buffed and restored to its former glory.
In the planning stages, optimism rules. The project is so large and complex that we can't possibly plan accurately, so in the absence of evidence one way or the other we assume the best.
The original £100 million estimate for the work has been ripped up and although operators Fairmont Hotels and Resorts will not disclose the actual figure, the 15 months of work suggests it could be close to double that sum.
Once work begins, the fragility of the initial estimates is exposed. Often, the 'estimate' of how much a project will cost is as much based on the depth of the client's pockets as the actual effort required to get the job done (which of course no one knows in advance anyway).
Part of the problem was The Savoy's unplanned, organic growth.
We have to live with the sins of those who came before us. In my experience, the quality of a system's legacy code base has more impact on a project than the inherent difficulty of the project in question.
Although we had done two years of planning and tried to assess the level of issues behind the walls, it's only when you close the doors and open it up that you realise the amount of work is much more serious and extensive than first envisaged.
Unsurprisingly, once you are up to your elbows in a system's viscera, you have a much better idea about what you're in for. Exploratory surgery is the only way to be certain of how long changes will take.
"What was an open courtyard suddenly became a room, with a mix of internal and external walls."
A system accrues idiosyncrasies because it is inevitably patched, hacked and enhanced. Modules that were designed with one use-case in mind are re-purposed as business needs change. And scar tissue accumulates.
Digging up the roadway in Savoy Place, off the Strand entrance - still the only place in the UK where one must legally drive on the right - he found a huge gulley running around the perimeter, instead of a solid foundation. "We don't even know what it's for."
Users will adapt to visible peculiarities. They may even grow attached to them, even if the rationale for them has become obsolete (cars drive on the right in Savoy Place so that hansom cab drivers could open the door for their customers without leaving their seat).

But much more frightening are the dull, blank blocks of obsolete code that loom like monoliths erected by a vanished civilisation. No one remembers what they were originally intended to do, and you can never be quite sure that the earth won't mysteriously stop turning if they are ever removed.
The huge expense and loss of revenue mean The Savoy has to "hit the ground running" when it reopens if the money is ever to be recouped.
Counter intuitively, the level of optimism rises as the schedule slips. It's very tempting to think that although we fell behind in phase 1, we can make up the time in phase 2. However, it's much more likely that if one part of a project runs into trouble, the rest will too.
As one hotel manager put it: "Hotel travellers are very promiscuous, they will, as it were, sleep around. While you are off the scene many will have happily moved on and it could take years to get them back."
Software users are even more promiscuous, especially on the web. They will cheerfully see your competitors behind your back, even in the good times. They will not tolerate a prolonged outage and they will complain loudly if your service is unavailable even for an hour.

But I am not entirely pessimistic about the software development process. There is one important attribute that software has that buildings don't - malleability.

We are able to follow agile methodologies and incrementally improve our programs. We don't have to follow The Savoy's example and attempt to implement an enormous modification in one go. We can embrace change and use the information we gather along the way to improve the end product. And if we refactor as we develop, we can reduce the amount of technical debt we bequeath to our successors.

Thursday, 14 January 2010

Clean code

At the beginning of Clean Code, Uncle Bob Martin enlists various well-respected programmers to explain what code cleanliness means to them. These luminaries include Bjarne Stroustrup (inventor of C++) and Ward Cunningham (inventor of the wiki).

Here is my definition:
Clean code imposes minimal impedance between the reader and the intent of the author. It contains little accidental complexity and its meaning can be easily understood, verified and manipulated.
When I work with clean code I have a sensation of reaching through the code to directly engage with the system's concepts.

When I work with unclean code my vision is clouded by weak naming, murky structure, inadequate commenting, convoluted dependencies and duplicated logic. Unclean code makes me afraid, because I cannot predict or understand the consequences of my changes.
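A contrived before-and-after illustrates the difference (both functions behave identically; the names are invented for the example). Only the second lets the reader reach through the code to the system's concepts:

```python
# Unclean: weak naming and murky structure hide a simple intent.
def f(d, t):
    r = []
    for x in d:
        if x[1] > t:
            r.append(x[0])
    return r

# Clean: the same behaviour, with the intent on the surface.
def names_of_accounts_over_limit(accounts, limit):
    return [name for name, balance in accounts if balance > limit]
```

Nothing about the program's behaviour changed, yet the impedance between reader and author has all but disappeared.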