Unit Tests As Throw Away Design
Unit tests are brittle: if you change the class under test there’s a better than average chance that you will have to change a load of unit tests as well. On the other hand, unit tests help you think about design on a micro level. A test shows what a method is supposed to do, without the room for interpretation errors you get when using abstractions as design.
So, should we use unit tests or not?
A few weeks ago we had a discussion about unit tests at one of our knowledge exchange sessions. The problem is that tests are brittle because they depend strongly on the code being tested. If you change the code under test it’s highly likely the test will break. If the test has intimate knowledge of the code being tested, this is what we would expect. If you take the test-driven paradigm to the max, a unit test is supposed to test only a single class, so within the test the class under test may not depend on any other code. A class that is independent of all other code is a rare thing, so we need a trick to ensure independence in tests. The standard solution is to mock all classes the class under test depends on. Because mocks and unit tests make assumptions explicit, they are a good way to design systems. Test code not only ensures the business code works, it also documents the business code. Making a change in this setup, however, may lead to quite a lot of work that doesn’t directly provide business value. The tests and mocks will not make it into production; they are only needed to create the real thing.
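To make the mocking trick concrete, here is a minimal sketch in plain Java, without JUnit or a mocking library. All names (TaxRateProvider, InvoiceCalculator, the fixed 21 percent rate) are hypothetical and only serve as illustration. The mock is written by hand: it returns a canned answer and records how it was called.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Collaborator the class under test depends on (hypothetical). */
interface TaxRateProvider {
    int percentFor(String country);
}

/** Class under test: computes a gross amount in cents. */
class InvoiceCalculator {
    private final TaxRateProvider rates;

    InvoiceCalculator(TaxRateProvider rates) {
        this.rates = rates;
    }

    long grossCents(long netCents, String country) {
        return netCents + netCents * rates.percentFor(country) / 100;
    }
}

/** Hand-written mock: returns a canned rate and records every request. */
class MockTaxRateProvider implements TaxRateProvider {
    final List<String> requestedCountries = new ArrayList<>();

    @Override
    public int percentFor(String country) {
        requestedCountries.add(country);
        return 21; // assumed rate, for illustration only
    }
}

class InvoiceCalculatorTest {
    public static void main(String[] args) {
        MockTaxRateProvider mock = new MockTaxRateProvider();
        InvoiceCalculator calculator = new InvoiceCalculator(mock);

        long gross = calculator.grossCents(10_000, "NL");

        // The assumption about the result is explicit...
        if (gross != 12_100) {
            throw new AssertionError("unexpected gross amount: " + gross);
        }
        // ...and so is the assumption about how the collaborator is used.
        if (!mock.requestedCountries.equals(Collections.singletonList("NL"))) {
            throw new AssertionError("unexpected calls: " + mock.requestedCountries);
        }
        System.out.println("ok");
    }
}
```

The second check is where the brittleness comes from: the test knows that the calculator asks for the rate exactly once per call. If the calculator starts caching rates, or fetches them some other way, the test has to change even though the result stays the same.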
No problem so far, but let’s go back in time to the heyday of CASE tools and code generators, just before the Internet hype hit us and made us all forget most of what we had learned before. Back when life was easy and you never got hit by unpredictable loads and split-second marketing decisions.
Way back, at the end of the eighties, systems were mainly designed in two phases. During design the system was specified using entity-relationship diagrams and functional decomposition. The entities ended up as tables in a database. The functions were transformed into either reports or input forms. Rules about the data were specified as constraints that got translated to client- and server-side code in some database dialect. The transformation was mostly a manual process, with one exception: translating an ERD into a database schema is usually trivial and could therefore be automated easily. The input screen definitions specified data usage: what a user can change in the database using the input screen. Because constraints were specified on the data, it was easy to generate code to check the constraints in the input form as well.
But that was basically as far as you got. Code generators got you roughly halfway there and everything else was craftsmanship with vi or Emacs. Tools got better in the days of Windows 3.11 and 95, but we were still writing most code using a text editor. The consequence was that the original design became less and less valuable over time: the details were added to the code while the design remained a stack of paper in a binder. Under such circumstances there is hardly any value in changing the design while coding, and consequently the design slowly gathers dust on a shelf.
On a side note: we used to say designs were meant to convey the meaning of the system to end users. They were supposed to be understandable for customers. In my experience this is not the case. As IT personnel we live in a parallel universe. Any form of abstraction is just too much for other people. If you don’t work in the IT industry, you can tell whether the system does what you want it to do once you see it, and not earlier. No amount of ERDs or class diagrams or whatever is going to help, ever. Only a working system will.
Back to unit tests. The parallel should be clear: logically, Java code follows JUnit tests like Oracle Forms follow function diagrams. Unit tests don’t make it into production, just like ERDs. When you change a unit test, the code has to change as well. Here the analogy starts to come apart: if you changed the ERD in the eighties, typically nothing would break. Worse, you could have one group of people happily changing code while another group was changing the binders with ERDs, and both would be happy living in parallel universes until the system went live.
Using unit tests as design amounts to acknowledging the side note above: if we can’t make end users understand us anyway, why bother them and ourselves with abstractions nobody uses? We might as well write test code that at least is useful during development.
The question is whether it is necessary to keep all this test code around even after we’re done with the part of the system the test code is validating. After a while the test becomes a nuisance: something you have to change while you actually want to change code. Fist-thick binders full of functional decompositions spring to mind.
So how many unit tests do we actually need? My answer is that unit tests are required for the interfaces of a component or module. I define a module as a set of 10-15 classes accessible through an interface.
Since one of the most important requirements of unit tests is that they should be very fast, we need mocks if the module connects to slow or heavyweight resources like web services or databases. Mocking a database might be considered a bridge too far: databases are fast and easily available, so often you might as well use the real thing instead of a mock.
You should start development of the module by writing a test for the interface. It is perfectly acceptable to develop code inside the module by writing a unit test for a method of a class inside the module, but you should treat this as throw-away code. Once you’re done you can delete the test. Or you can let it lie around to be deleted when it gets in the way. The really important test is the test at the module level. This test has to be maintained along with the production code. All other tests are like the scraps of stone left behind after carving a statue.
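As a rough sketch of what this could look like, take a tiny hypothetical module in plain Java. The names (Checkout, BulkDiscount, CheckoutModule) and the discount rule are made up for illustration. CheckoutModuleTest is the test to keep: it only knows the interface. BulkDiscountScratchTest is the kind of test you write while carving out an internal class and delete later.

```java
import java.util.Arrays;
import java.util.List;

/** The module's interface (hypothetical): all the durable test knows about. */
interface Checkout {
    long totalCents(List<Long> itemPricesCents);
}

/** Internal class: free to change or disappear without touching the module test. */
class BulkDiscount {
    long apply(long subtotalCents, int itemCount) {
        // Assumed rule for illustration: 10% off for three or more items.
        return itemCount >= 3 ? subtotalCents * 90 / 100 : subtotalCents;
    }
}

/** Internal implementation behind the interface. */
class DefaultCheckout implements Checkout {
    private final BulkDiscount discount = new BulkDiscount();

    @Override
    public long totalCents(List<Long> itemPricesCents) {
        long subtotal = 0;
        for (long price : itemPricesCents) {
            subtotal += price;
        }
        return discount.apply(subtotal, itemPricesCents.size());
    }
}

/** Entry point that hands out the interface and hides the implementation. */
final class CheckoutModule {
    private CheckoutModule() {
    }

    static Checkout create() {
        return new DefaultCheckout();
    }
}

/** Module-level test: maintained along with the production code. */
class CheckoutModuleTest {
    public static void main(String[] args) {
        Checkout checkout = CheckoutModule.create();
        check(checkout.totalCents(Arrays.asList(1_000L, 2_000L)) == 3_000,
                "no discount below three items");
        check(checkout.totalCents(Arrays.asList(1_000L, 2_000L, 3_000L)) == 5_400,
                "10% off from three items");
        System.out.println("module test passed");
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}

/** Throw-away test for an internal class: delete it when it gets in the way. */
class BulkDiscountScratchTest {
    public static void main(String[] args) {
        CheckoutModuleTest.check(new BulkDiscount().apply(6_000, 3) == 5_400,
                "discount applied at three items");
        System.out.println("scratch test passed");
    }
}
```

If BulkDiscount is later merged into DefaultCheckout or replaced by something else, only the scratch test breaks, and that one can simply go. The module test keeps passing as long as the behaviour behind the interface stays the same.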