20111213

Why I Created Daily Feed Recycler

The motivation behind Daily Feed Recycler is that there is too much content on the Internet: good and bad. Once I find a good source, I do not have enough time to actually read everything. I just can’t read several 5k-word articles in a row. Even if I had the time, I don’t have the mental energy. It’s not fun.


Feeds are a wonderful tool for authors and readers. They allow readers to stay informed about changes to a page. Feed aggregators are part of this reading experience. They let you manage feeds: read articles, mark them as read, subscribe and aggregate feeds. Applications creating feeds get this for free, and there is a multitude of feed aggregators available.

Feeds help with the time problem to some extent, but - wonderful as they are - they have a dark side. Instead of solving the problem of organizing content so that you can read the really good stuff, they organize content so that you can read the latest stuff. Yes, the new content is cool, but the classics are here to stay. You will not get a tweet from Goethe. With Daily Feed Recycler the good stuff is on an equal footing with the latest stuff. The content for every day is presented as new content in your feed aggregator.

Content has to be presented in a digestible manner: deep reading, not just skimming to get an overview. Time and mental energy have to be organized in a way that allows reading. Daily Feed Recycler is meant as a way to break content down into smaller parts and add a reminder that there’s still good content to read. Using your feed aggregator you can decide when to read it - or whether to read it at all. Feed aggregators are quite good nowadays and their flexibility is useful here.

You can now create your own channel of daily content: the full bash reference, the major Linux man pages or the list of all decision biases from Wikipedia. Everything you like. I’m often looking for daily feeds, but there are not that many of them out there, because creating one is a lot of work and depends on a certain level of expertise. Daily Feed Recycler is not a competitor to the existing curated daily feeds. A curated feed can have a much better quality through expert selection and logical ordering of content.

Daily Feed Recycler follows the “Release early, Release often” philosophy. There are bugs, missing features and rough edges. I hope you find it useful nonetheless.

20110715

Find cruft with a new Mercurial extension

After some fun with the quick and untested shell script that finds the oldest code in a Subversion repository, the next step is to write a Mercurial extension. The simple Mercurial extension cruft does basically the same job as the shell scripts for Subversion. Being an extension, it’s nicely integrated into Mercurial like the other extensions.


Python and Mercurial are relatively easy to get into. Mercurial provides the Developer Info page, which is really good. Additionally, there’s a guide on how to write a Mercurial extension. The guide is a good start for Mercurial development. The rest can be easily picked up by reading the code of other commands and extensions.
The code is readable and there are no big hurdles.

The only thing I missed while writing the extension is type information in method signatures. As much as I like Python, it’s ridiculous to write the type information in the pydoc and let the developer figure out the types. This is one of the trade-offs you have to live with.

Testing Mercurial extensions


It suffices to understand the integration test tool Mercurial uses to test the extension itself. There’s some documentation for this as well. The basic idea behind Cram is to start a process and check its output against the expected output.

The integration test tool defines a small language. All lines that have no indentation are comments. Indented lines starting with $ are executed and all other indented lines are the expected output. For example, a test looks like this:

init

 $ hg init

 $ cat <<EOF >>a
 > c1
 > c2
 > EOF
 $ hg ci -A -m "commit 0"
 adding a

cruft

 $ hg cruft
 0 a c1
 0 a c2

First a repository is initialized: a file called a with the content (c1, c2) is committed and then Mercurial is run with the cruft command. Without options the cruft extension prints all lines, newest lines first. The expected output is (0 a c1, 0 a c2), which means revision 0, file a, line c1; revision 0, file a, line c2.

It’s fairly easy to get started with this tool. The only downside in my tests is that they reuse the same test fixture and do not reset it for each test. They are not executed in isolation, which causes a whole range of problems - redundancy and readability for example - but I didn’t feel that it was worth the effort to structure the tests otherwise.

Installing the extension


The easiest way to install the extension is to download cruft.py to a local folder and add a link to the extension file in the .hgrc file.

[extensions]
cruft=~/.hgext/cruft.py

Using the extension


After the installation you can execute pretty much the same commands as with the shell script version.

hg help cruft

hg cruft

(no help text available)

options:

-l --limit VALUE   oldest lines taken into account
-c --changes       biggest change sets
-f --files         biggest changes per file
-X --filter VALUE  filter lines that match the regular expression
    --mq            operate on patch repository

use "hg -v help cruft" to show global options

Here I use the Quickcheck source code to show some sample output.

hg cruft -l 5 -X "^(\s*}\s*|\s*/.*|\s*[*].*|\s*|\s*@Override\s*|.*class.*|import.*|package.*)$" quickcheck-core/src/main
This finds the 5 oldest lines using a Java-specific exclusion pattern (closing braces, comments, imports, class definitions etc.) for the quickcheck-core/src/main folder. The output contains the revision number, source file and source code line.

5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java public Object[] next() {
5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java ArrayList<Object> next = new ArrayList<Object>(generators.length);
5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java for (Generator<?> gen : generators) {
5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java next.add(gen.next());
5 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/TupleGenerator.java return next.toArray();
You can also find the biggest change sets for the 500 oldest lines.

hg cruft -X "^(\s*}\s*|\s*/.*|\s*[*].*|\s*|\s*@Override\s*|.*class.*|import.*|package.*)$" -l 500 -c quickcheck-core/src/main
This prints the revision number, the number of changed lines and the commit comment of the change set.

49 41 removed getClassification method from Property interface
moved Classification into quickcheck.property package
177 43 MutationGenerator, CloningMutationGenerator and CloningGenerator added
139 50 fixed generic var arg array problems
5 53 initial check in

Finally, you can find the files with the most lines changed by a single change set (again with the filter and for the 500 oldest lines).

hg cruft -X "^(\s*}\s*|\s*/.*|\s*[*].*|\s*|\s*@Override\s*|.*class.*|import.*|package.*)$" -l 500 -f quickcheck-core/src/main
This prints the revision number, file name, number of changes and change set commit comment.

176 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/AbstractTreeGenerator.java 27 added tree generator
177 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/CloningGenerator.java 28 MutationGenerator, CloningMutationGenerator and CloningGenerator added
139 quickcheck-core/src/main/java/net/java/quickcheck/generator/support/DefaultFrequencyGenerator.java 36 fixed generic var arg array problems
49 quickcheck-core/src/main/java/net/java/quickcheck/characteristic/Classification.java 41 removed getClassification method from Property interface
moved Classification into quickcheck.property package

Conclusion


Developing a Mercurial extension is relatively easy given Python, the good Mercurial documentation, the good readability of the code and the integration test tool. If you’re using Mercurial you should give extension development a try. I’ve only recently read into Python again, so this is a Python beginner’s version of a Mercurial extension. Help to improve the implementation is always appreciated.

Learning Python and seeing how things are implemented there is fun. Looking at the PEPs and the associated process, they feel much more accessible and open than JSRs. The PEPs are also a track record of the advances the language makes and the problems it tries to solve one after the other. There’s stuff in Python that you’ll probably never see in Java, like generator expressions. Everyone who has had to replace an internal loop with an iterator will understand that this is not a toy. The language features add up quite nicely and result in a productive environment. As always, some things are unfamiliar or missing, but there’s no perfect platform.

20110530

Find cruft in your source code repository

Michael Feathers wrote in his blog post “The Carrying-Cost of Code: Taking Lean Seriously”
that it is necessary to remove old code from your product to be able to add new features. His argument is that you get a better understanding of your production code this way. Rewriting your code constantly leads to more readable and compact code.


"There are many places in the industry where existing mountains of code are a drag on progress.
[..]
Younger organizations without as much software infrastructure often have a competitive advantage provided they can ramp up to a base feature set quickly and provide value that more encumbered software-based companies can't.  It's a scenario that plays out over and over again, but people don't really talk about it.
[..]
I'd like to have code base where every line of code written disappears exactly three months after it is written.
[..]
I have the suspicion that a company could actually do better over the long term doing that, and the reason is because the costs of carrying code are real, but no one accounts for them."

His reasoning goes so far as to ask product owners to remove features that are not needed. Software size seems to increase strictly monotonically. This makes maintenance harder and more costly. I’m not sure you have to follow the advice strictly to improve your situation. Before you start arguing with your boss about removing features, it is a good idea to look for low-hanging fruit first: the oldest lines.

Metric


The heuristic comes from the observation that a) software has bugs and b) if the software is actually used, bugs will be found and fixed. Fixing bugs leads to new code, as do changes in coding style, new APIs etc. Old unchanged code is either bug-free, feature-complete and state-of-the-art, or something nobody cares about. I’d say the metric is not too bad for finding some victims. (A metric like this should be a tool to find problems, not an absolute measurement. Metrics should not be taken too seriously and nobody should be tempted to cheat.)

To put the idea into practice I’ve hacked up some scripts that find suspects in a Subversion repository. The scripts are:
  • find the oldest lines in your repository
  • find biggest change sets in your repository considering the oldest lines
  • find files that are changed the most by a change set considering the n oldest lines

To get some data I’ll use the legacy Subversion repository of the Quickcheck project. Quickcheck moved to Mercurial some time ago. It’s a test to see if something significant can be found with this metric.

The readme contains instructions on how to run the scripts with your Subversion repository. The scripts are based on a local repository mirror to speed up the analysis. The analysis can be executed on any subtree of the repository.

Oldest lines


Finding the oldest lines is quite simple: first get all file names with svn list and then use svn blame to get the revision of every line. This output is sorted by the revision (descending) of each line.

The output of oldest_lines.sh is unfiltered. To extract useful information it has to be filtered. filter.sh does this for Java source code: it removes empty lines, single closing braces, package declarations, imports and comments.

These are the last lines of the filtered output for Quickcheck:
$ ./filter.sh | tail -5

6     blob79    public int compare(Pair<Object, Double> o1, Pair<Object, Double> o2) { File: characteristic/Classification.java Line: 162
6     blob79    next.add(gen.next()); File: generator/support/TupleGenerator.java Line: 34
6     blob79    ArrayList<Pair<Object, Double>> toSort) { File: characteristic/Classification.java Line: 150
6     blob79    @SuppressWarnings("unchecked") File: generator/CombinedGenerators.java Line: 126
6     blob79     Object[] next = generator.next(); File: generator/CombinedGenerators.java Line: 128

A potential victim here is the Classification class. It’s a rudiment from the original Quickcheck implementation but was never used heavily. It’s a nice idea to do statistical testing, but Classification could be removed from Quickcheck without losing a significant feature.

Biggest change sets


The second script, top_change_sets.sh, finds the biggest change sets considering only the n oldest lines. This results in an interesting output for the code base (oldest 1500 lines, top 5 change sets):

$ ./top_change_sets.sh 1500 5

r182 | blob79 | 2007-12-19 19:15:24 +0100 (Wed, 19 Dec 2007) | 1 line
basic failed test instances serialization feature implementation
136 changes

r270 | blob79 | 2009-06-03 18:52:52 +0200 (Wed, 03 Jun 2009) | 1 line
added pojo (a.k.a object) generator for interfaces
104 changes

r6 | blob79 | 2007-07-07 07:29:14 +0200 (Sat, 07 Jul 2007) | 1 line
initial check in
68 changes

r204 | blob79 | 2008-03-23 19:29:28 +0100 (Sun, 23 Mar 2008) | 1 line
added svn keyword id
52 changes

r198 | blob79 | 2008-03-23 18:29:26 +0100 (Sun, 23 Mar 2008) | 3 lines
fixed logging for serializing and deserializing runner
mandate a user set characteristic name (for serialization of test values)
added system property for number of runs
48 changes

Revisions 182 and 198 were commits related to the obscure test data serialization and deserialization scheme - something I’ve already removed in the latest release. The two changes resulted in 184 lines still present in the current source.

Revision 270 is no less obscure. It’s a declarative POJO object generator. The revision is so high on the list because it forced a lot of changes. This is not a good sign: an obscure feature and lots of changes. That’s something worth investigating.

Revision 6 is the initial check in. So this should be okay.

The last open issue, revision 204, is the attack of the code formatters. They should be used with prudence as long as the down-stream tools can’t handle the changes properly. (Source control systems should understand the AST of the source language.)

File changes


Now we can take a look at the files with the most changes from a single revision. If you execute the script top_changes_in_file.sh (500 oldest lines, top 5) for the Quickcheck source code you’ll see:

$ ./top_changes_in_file.sh 500 5
r182 | blob79 | 2007-12-19 19:15:24 +0100 (Wed, 19 Dec 2007) | 1 line
basic failed test instances serialization feature implementation
34 changes | file: RunnerImpl.java

r180 | blob79 | 2007-12-07 18:59:59 +0100 (Fri, 07 Dec 2007) | 3 lines
MutationGenerator, CloningMutationGenerator and CloningGenerator added
26 changes | file: generator/support/CloningGenerator.java

r182 | blob79 | 2007-12-19 19:15:24 +0100 (Wed, 19 Dec 2007) | 1 line
basic failed test instances serialization feature implementation
24 changes | file: SerializingRunnerDecorator.java

r6 | blob79 | 2007-07-07 07:29:14 +0200 (Sat, 07 Jul 2007) | 1 line
initial check in
22 changes | file: characteristic/Classification.java

r179 | blob79 | 2007-10-13 09:14:30 +0200 (Sat, 13 Oct 2007) | 1 line
added tree generator
22 changes | file: generator/support/AbstractTreeGenerator.java

Besides the usual suspects - serialization support and the Classification class - two new suspects emerge: the mutation generator and the tree generator. In their favor, the mutation generator and tree generator implementations might be useful, but they aren’t widely used, so this is something worth looking at.

Conclusion


The metric found multiple source files that are worth investigating: one feature that is already removed (serialization support), one likely victim (classification) and multiple places that are worth checking (mutation generator, tree generator, declarative POJO generator). The metric seems to find the unloved children in the code that are good candidates for removal or implementation improvements.

I always like to remove code. Fewer lines of code mean fewer spots where problems may emerge. Nobody can argue that it’s better to keep unused code than to remove it - even if it’s tested and production-quality. That’s something like a reverse YAGNI. If you really care, the code will never disappear: you can find it in your source code management system. You should be okay with the fact that the old code will lose its relevance due to changes to the production system implementation. It can be an inspiration for how it could be done if the world hadn’t changed. The burden of these changes is also the reason why it’s better to remove the code in the first place. Dragging it along without any gain is plain waste.

20110509

Quickcheck 0.6 Release

Version 0.6 of Quickcheck is ready to use. The main features are support for deterministic execution of generators and improvements to the generators. The JUnit runner support was removed in this release.

You can read a detailed description of deterministic execution in this blog post.

The 0.6 release adds the following generators (a short usage sketch follows the list):
  - map generator maps(Generator<K> keys, Generator<V> values)
  - subset generator sets(Set<T>)
  - submap generator maps(Map<K, V>)
  - unique generator using a Comparator<T> to decide if two values are considered equivalent: uniqueValues(Generator<T>, Comparator<? super T>)
  - excluding generator based on a collection of input values and a collection of excluded values: excludeValues(Collection<T> values, Collection<T> excluded)
  - content generator type parameters are now co-variant for lists, iterators, sets and object arrays to allow creation of super-type container generators (like Generator<List<Object>> = lists(integers()))
  - PrimitiveGenerator added generators:
    - generator for java.lang.Object instances
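
A minimal usage sketch of some of these generators. The method signatures are taken from the list above; the factory classes (CombinedGenerators, PrimitiveGenerators) and the strings() generator are assumptions of this sketch, so check the javadoc of your Quickcheck version.

import static net.java.quickcheck.generator.CombinedGenerators.excludeValues;
import static net.java.quickcheck.generator.CombinedGenerators.maps;
import static net.java.quickcheck.generator.CombinedGenerators.uniqueValues;
import static net.java.quickcheck.generator.PrimitiveGenerators.integers;
import static net.java.quickcheck.generator.PrimitiveGenerators.strings;

import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;

import net.java.quickcheck.Generator;

public class NewGeneratorsSketch {
    public static void main(String[] args) {
        // map generator: keys and values come from other generators
        Generator<Map<String, Integer>> mapGenerator = maps(strings(), integers());

        // unique values, equivalence decided by a Comparator
        Generator<Integer> uniqueInts = uniqueValues(integers(), new Comparator<Integer>() {
            public int compare(Integer a, Integer b) { return a.compareTo(b); }
        });

        // pick values from a collection, excluding some of them
        Generator<Integer> remainingInts = excludeValues(Arrays.asList(1, 2, 3, 4), Arrays.asList(2, 3));

        System.out.println(mapGenerator.next());
        System.out.println(uniqueInts.next());
        System.out.println(remainingInts.next());
    }
}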

The dropped JUnit runner support means that the @ForAll annotation is no longer supported. Until lambda expressions are (hopefully) supported in Java, the Iterable adapter is a good workaround that allows executing tests without too much boiler-plate. If you need all features, an inner class will work just fine. Inner classes will become much better with the SAM-type conversion in Java 8, which is part of the language changes in Project Lambda.
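
A minimal sketch of the Iterable adapter in a plain JUnit test. The adapter method and its location (Iterables.toIterable in net.java.quickcheck.generator.iterable) are assumptions here, so check the javadoc of your version:

import static net.java.quickcheck.generator.PrimitiveGenerators.integers;
import static net.java.quickcheck.generator.iterable.Iterables.toIterable;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class IterableAdapterSketch {

    // Iterate over a number of generated values with a plain for-each loop
    // instead of the removed @ForAll runner.
    @Test public void absoluteValueIsNonNegative() {
        for (Integer any : toIterable(integers())) {
            assertTrue(Math.abs(any.longValue()) >= 0);
        }
    }
}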

The general development direction and the main theme I’ve been working on besides the release is support for generator expressions. This is a good way to implement tests for equals methods, where an equals method should return false when one of the significant attributes of an object is not equal. Right now you have to write a lot of boiler-plate to test a simple statement like: “This is not equal if one of the attributes is not equal”. With generator expressions this should become much easier.
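
To make the boiler-plate concrete, here is a hedged sketch using a hypothetical Point value class with two significant attributes; today every attribute needs its own hand-written test:

import static net.java.quickcheck.generator.PrimitiveGenerators.integers;
import static org.junit.Assert.assertFalse;

import org.junit.Test;

public class PointEqualsSketch {

    // Hypothetical value class with two significant attributes.
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }
        @Override public int hashCode() { return 31 * x + y; }
    }

    // One test per significant attribute...
    @Test public void notEqualWhenXDiffers() {
        for (int i = 0; i < 100; i++) {
            int x = integers().next(), y = integers().next(), otherX = integers().next();
            if (otherX == x) continue; // skip the rare collision
            assertFalse(new Point(x, y).equals(new Point(otherX, y)));
        }
    }

    // ...and the same boiler-plate repeated for y and every further attribute.
}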

It’s quite tricky to create a nice API for expressions. One cul-de-sac was to implement it as a builder with a fluent interface. Method chaining is not adequate here: the chaining forces you to linearize the definition of the expression. This does not fit well into the world of generators, where delegation and nesting are natural concepts. I burned some time before I got that the underlying problem cannot be fixed with a clever API. I hope the current approach terminates and the expression support is something you can work with in the 0.7 release.

20110322

Using deterministic generators with Quickcheck

The 0.6 release of Quickcheck supports deterministic generators. The goal is to be able to make the generation of values reproducible. This is useful when you are working with a bug from your favourite continuous integration server or when you would like to run a piece of code in the debugger repeatedly with the same results.

A non-goal of the support is to remove the random nature of Quickcheck. Values are still random to allow good coverage, but reproducibility is supported when needed. This way you have the best of both worlds.

Quickcheck internally uses the linear congruential random number generator (RNG) implemented in Java’s Random class. The interesting property of the RNG in the context of reproducible values is stated in the javadoc:

If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers.
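
This property is easy to check directly with java.util.Random (a tiny, self-contained demo, independent of Quickcheck):

import java.util.Random;

public class SameSeedDemo {
    public static void main(String[] args) {
        Random a = new Random(42L);
        Random b = new Random(42L);
        for (int i = 0; i < 3; i++) {
            // both instances return the identical sequence of numbers
            System.out.println(a.nextInt() == b.nextInt()); // prints: true
        }
    }
}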

You can configure the seed used by Quickcheck with the RandomConfiguration class. It’s important to set the seed for every individual test method, otherwise the RNG’s return values depend on the execution order of the test methods. If you run different tests, add new tests or execute the tests in a different order, other values will be generated.

The seed is generated randomly for the normal execution. This is the result of the RandomConfiguration.initSeed method call. This way Quickcheck still produces random values. Use the setSeed method to set the seed for a test method.

Instead of using the RandomConfiguration directly you should use the SeedInfo JUnit method rule that runs with every test method. Additionally, it adds the seed information needed to reproduce the problem to the AssertionError thrown.

The SeedInfo rule can be used like every other JUnit method rule: it’s added as a member of the test class. The example generates values in a way that the assertion always fails.

@Rule public SeedInfo seed = new SeedInfo();

@Test public void run(){
  Generator<Integer> unique = uniqueValues(integers());
  assertEquals(unique.next(), unique.next());
}

An example error message is:
java.lang.AssertionError: expected:<243172514> but was:<-917691317> (Seed was 3084746326687106280L.)
You can also use the SeedInfo instance to set the seed for a test method to reproduce the problem from the AssertionError.

@Rule public SeedInfo seed = new SeedInfo();

@Test public void restore(){
  seed.restore(3084746326687106280L);
  Generator<Integer> unique = uniqueValues(integers());
  assertEquals(unique.next(), unique.next());
}

Instead of setting the seed for individual tests you can also set the initial seed once for the random generator used by the JVM. If you run the test example from above (without the SeedInfo method rule member) with the configuration -Dnet.java.quickcheck.seed=42:

@Test public void run(){
   Generator<Integer> unique = uniqueValues(integers());
   assertEquals(unique.next(), unique.next());
}

You should get the result:
java.lang.AssertionError: expected:<977378563> but was:<786938819>

The configuration of seed values replaces the serialization and deserialization support of earlier Quickcheck versions. Setting the seed is a much simpler way to reproduce values over multiple JVM executions.

20110321

Revert - Sometimes going back is the way forward

Revert is the reverse gear of your version control software. It removes all local changes and brings the local workspace back to the clean state of a committed revision. It is an important tool in the revision control tool box. Once in a while there is no way forward, so you have to go backward to make progress.

This may sound unintuitive. We are trying to make a change, not reverse it. Why should the tool that destroys all this hard work be the best option in some circumstances? Firstly, you do not lose everything. Even if you revert everything you gain some knowledge - at least that this exact way does not work. This is a good data point. Secondly, and more obviously, revert lets you start with a fresh state. More often than not we are able to reach a working state again. Removing everything is the fastest way to get there.




I see mainly two scenarios for the revert command: the planned mode and the accidental mode.

Planned mode revert

 

Starting with a working state of your committed source code you can do some exploratory work. Find out what you were looking for and revert.

Now you can start the work in an informed way from a working state. The artifacts of the exploration are removed. After reverting you know that the state you are starting from works. To verify that a workspace state works you need tools to catch the problems: decent test coverage and other quality assurance measures.

A corollary is that because you are planning to revert anyway you can change your workspace in every way you need for the exploration.

Accidental mode revert

 

The first scenario was a bit too idyllic: you started your work with an exploratory mindset, found the precious information and cleaned up after yourself. Everything is planned, clean and controlled. This scenario is valid - you can do the exploratory work deliberately. More often it is the case that you have dug yourself in and need to find a way out.

Is this a hole or the basement?

 

The first issue is to know when you’re in a hole and there is little chance to get out.

Say you commit roughly every hour. Now you have not committed for four hours. Your change set becomes bigger and bigger. You see no way to get your tests running again. Different tests are broken after multiple attempts to fix everything. You’re in a hole.

You made a change and it resulted in absolutely unexpected problems. Your tests are broken. You do not know why. There are red lights all over the place. You’re in a hole.

You made small, controlled, incremental changes for some time without committing. You did not bother to commit because everything was so simple. Now that the changes have become bigger you would like to commit, but you can’t because you can’t bring the whole system to run again. You are in a hole.

The commonality of the three examples is that you’re not in control of the process. The world you created determines your next steps. This happens to everyone. It’s normal. It happens all the time. Otherwise our work would be predictable day in and day out - how boring. (I would go so far as to say that in other circumstances it’s a good sign that you can follow the inherent conclusions of your system: it’s productive to be guided by the conclusions of your system as long as it is consistent.)

If there is such a thing as experience in hole digging, it’s to see the problem coming and to stop early. If it has happened often enough to you, you know the signs. You’ll know that knee-deep holes are deep enough to stop and that it’s not necessary to disappear completely.

Ways out

 

Now, after you’ve found out that you have a problem, all energy should be put into solving it. Don’t try to be too smart. Solve this one problem. You have two options to get out of the hole: fixing the current state or reverting.

Fixing the current state can work. You find enough information to fix the problem. You’ll lose some time but nothing of your work. Once the current state works it’s a good idea to commit right away. This creates a save point. If there are more problems lurking down the road you can always come back to this state. The problem is that you might not find the cause. Finding a way out now is hard. Your change set adds to the complexity of the underlying problem. Your changes obfuscate the problem and make it harder to analyze. Everything you do will increase the change set complexity further.

When fixing the current state is too hard, you have to revert your work to keep up the pace. Now you have the problem that you have already sunk so much time into it, and the next step is to roll everything back to the state you started from. This does not feel pleasant. The upside is that even though you reverted the code, not everything is lost. You still have more knowledge about the problem. This knowledge can be used for the second and hopefully last attack. Make notes if you need them to remember the information you gathered.

The first attempt was in the wrong direction and/or too big. It is a good idea to make smaller steps with interim commits to create save points you can revert to. This creates a safety net if you bump into the problems again. You can revert repeatedly to chop off smaller portions of the problem until it is solved. You decrease the size of the changes until you can understand the problem. Once in a while strange things happen and a single-line change has crazy effects. After removing such road blocks you can make bigger steps again.

There is of course a middle way: trying to revert only partially. Without creating and applying patches you have only one direction to go (revert) and you'll swiftly have to revert everything (because your change history is lost). I’ll come back to an approach to use diff and patch to do partial reverts in a controlled way later.

Bringing the costs of reverts down


The problem with reverts is that they are expensive. Work you've already done is removed from the source tree. Not something we are especially proud of.

The problem is only as big as the change set that is flushed down the toilet. You should commit as often as your infrastructure allows: the execution time of the tests and the integration costs are the main factors here. (You can move some of the cost into the continuous integration environment you’re using.) As always, this is a trade-off between the work lost and the overhead created by frequent commits. Committing every hour is probably a good idea. Just do whatever fits your needs best.

The other factor is the right attitude towards the revert operation. If you have already spent a lot of time on a problem and could not find a fix, it’s likely you won’t find it in this direction and a fresh approach is needed. You can actually save a lot of effort by aborting this failed attempt. This will also bring down the total cost of an inevitable later revert.

Conclusion


Failed attempts are not the problem. We have to learn from our failures. They are just too frequent and too valuable to lose. Failing is okay. Samuel Beckett put it nicely:

Ever tried. Ever failed. No matter. Try again. Fail again. Fail better.

20110207

Members in unit tests considered harmful

I’ve been reading Domain Specific Languages by Martin Fowler. It is a good book to formalize API design and it gives you a new perspective to think about your software. You can buy this book even if you’re not planning to implement DSLs in the future. A DSL is a differently marketed API anyway - but that’s another blog post.

I don’t like the style of the tests presented in the book. Martin covers testing of DSLs at the beginning of the book (3.6 Testing DSLs, page 53). The tests use members heavily. I’ve already seen this style in Robert C. Martin’s Clean Code. Both are TDD pundits, so I thought it’s not too widely known that you can write tests with better readability.


The code that is tested is a controller that uses a state machine to react to events with transitions to the next state.

Use of members in unit tests


The testing style I object to looks like this:

@Test public void event_causes_transition(){
    fire(trigger_a);
    assertCurrentState(a);
}

Sorry, this looks nice but it’s not readable. What is the fixture? What is the class under test and which methods are called? What are the parameters of the call and what is the result? What does the assertion check? You can figure it out, but it’s not easy and you have to navigate a lot in the source file.

No members in unit tests


Now contrast that with this member-less style:

@Test public void event_causes_transition(){
    Event trigger_a = anyEvent();
    State a = anyState();
    Controller controller = controller(stateMachine(transition(trigger_a, a)));

    controller.handle(trigger_a.getId());

    assertEquals(a, controller.getCurrentState());
}

It’s longer than the original version, but everything is in one place. You still set up an initial state, but it is completely local. The naming is kept in sync with the original version so you can correlate the implementations. The naming is local to the test, so in the member-less style it could be more specific: "a" should rather be a "state" and "trigger_a" a "trigger".

As you can see in the test, it creates an event and a state. The details of both are not significant for the test. This insignificance is stated explicitly in the test. Then a controller is set up with a state machine that has one transition. After you have done the setup, you can actually call the method you want to test and check the result.

Every test method following this style has the sequence: setup, execute, assert while keeping everything in the scope of the test: no members!

Code organization and readability


After contrasting the two implementation styles let’s try to explain why the member-less style is more readable.

If tests make heavy use of members, they look nice at the surface: no redundancy, nice method names and helper methods to keep the intent clear. Why is this less readable than the member-less style? The problem is the missing context and the navigation necessary in the test code to get that context before you can understand what is actually tested.

If you recorded the navigation of a reader you would see something similar to the following:
  • the setup methods
    • memorize setup members
  • test method: execute statement (fire)
  • execute helper implementation
    • recall setup members
    • memorize result members
  • test method: assert statement (assertCurrentState)
  • assert statement implementation
    • recall result members

As you can see, the reader has to navigate and remember a lot to get the initial setup of the test and keep track of the transition triggered by the execution.

In every place where the reader has to remember some information there’s the potential that an additional navigation is necessary, because he could simply not remember enough context. This becomes even harder when you start to organize the tests into test class hierarchies. The context to remember is then the context of all super class members, execution method implementations and assertion method implementations. Naturally, our ability to store information is limited. Less context information to remember is better for the readability of a test.

So the nice test from above actually looks like this in reality:

@Test public void event_causes_transition(){
    // a lot of navigation and remembering
    fire(trigger_a);
    // ditto
    assertCurrentState(a);
    // ditto
}

Ideal organization of code


This is not readable. Ideally, something readable is a linear text laid out exactly in the way you need it to work on your current problem. You build up the context as you go without losing time navigating. A second ideal of readability is that every piece of necessary information is in one spot. All your context is visible to you at the same time.

The two ideals affect each other. Given a problem complex enough, you cannot have a compact linear representation. Either it’s linear or it’s compact.

Additionally, you make trade-offs to remove redundancy from code. If you are asking yourself whether to reorganize to remove redundancy, don’t forget that the most important feature of a test is readability. Introduce some redundancy as long as it helps to understand your test.

Organize to allow effective navigation


You cannot organize code in a way it can be read linearly for all purposes, but you can organize it in a way that the navigation capabilities of IDEs allow effective navigation. Every navigation starts in the test method and is done by looking up the definition of a helper method. The full context is kept in local variables, parameters and return values.

You can start to read a test class from every test method without losing vital information. No navigation to members is necessary:

Event trigger = anyEvent();
State state = anyState();
Controller controller = controller(stateMachine(transition(trigger, state)));

The member-less style makes the setup explicit. Try to name all setup helper methods following the Principle of Least Astonishment. They should be nicely named and state what their individual guarantee is. Accordingly, a method that creates a valid event without further guarantees is named anyEvent().

controller.handle(trigger.getId());

Now the actual method to test can be executed. The member-less style shows exactly what is going on: it makes crystal clear which parameters are used and which result value is received. There is no navigation overhead. The instances from the setup are used to run the actual test. If there is an execution result, it is stored in the local context of the test method.

assertEquals(state, controller.getCurrentState());

After the execution you can do the assertion on the resulting state or result. This again uses only instances from the method scope. Sometimes it is useful to introduce custom assertions to remove redundancy. In this style the assertion method works only on the parameters given, e.g. assertState(state, controller).

If you tracked the navigation for the member-less test style you would get something like this:
  • test
  • setup methods
    • memorize state
  • test
    • recall setup state
    • execute method
    • assert method


The result is not the ideal of a linearly readable test, but the navigation to the definition and back to the test is relatively swift. You can be sure that you did not miss vital information as everything is in one place. Removing uncertainty is the key to a simpler reasoning about the test at hand.

Avoid test class hierarchies


The member-less test setup also helps to avoid test class hierarchies that are only introduced to allow different fixtures for sets of tests. Sometimes this is avoided in member-style tests by doing some of the setup globally in members and some in the test method, but this complicates the reasoning even more. With a member-less test you can have a different setup for every test method. Normally, you organize the setup methods in a way that they build on each other to avoid redundancy.
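
As an illustration, here is a hedged, self-contained sketch of such setup helpers building on each other. All types (Event, State, Transition, StateMachine, Controller) are hypothetical stand-ins; only the helper signatures mirror the example test above:

import static org.junit.Assert.assertEquals;

import java.util.HashMap;
import java.util.Map;

public class ControllerTestSupport {

    // Hypothetical minimal types, just enough to make the helpers compile.
    static class Event {
        private final String id;
        Event(String id) { this.id = id; }
        String getId() { return id; }
    }

    static class State { }

    static class Transition {
        final Event trigger;
        final State target;
        Transition(Event trigger, State target) { this.trigger = trigger; this.target = target; }
    }

    static class StateMachine {
        final Map<String, State> targetByTrigger = new HashMap<String, State>();
        StateMachine(Transition... transitions) {
            for (Transition t : transitions) targetByTrigger.put(t.trigger.getId(), t.target);
        }
    }

    static class Controller {
        private final StateMachine machine;
        private State current;
        Controller(StateMachine machine) { this.machine = machine; }
        void handle(String triggerId) { current = machine.targetByTrigger.get(triggerId); }
        State getCurrentState() { return current; }
    }

    // The helpers build on each other: transition() takes the results of
    // anyEvent()/anyState(), stateMachine() takes transition(), and
    // controller() takes stateMachine(). Each test composes exactly the
    // fixture it needs from these building blocks.
    static Event anyEvent() { return new Event("any-event-id"); }
    static State anyState() { return new State(); }
    static Transition transition(Event trigger, State target) { return new Transition(trigger, target); }
    static StateMachine stateMachine(Transition... transitions) { return new StateMachine(transitions); }
    static Controller controller(StateMachine machine) { return new Controller(machine); }

    // A custom assertion in the same style works only on its parameters.
    static void assertState(State expected, Controller controller) {
        assertEquals(expected, controller.getCurrentState());
    }
}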

Avoid isolation from the API


If you put the method execution and assertion into separate methods - like the member-heavy style does - you introduce the danger of hiding API problems. One argument for TDD is to experience your API from the point of view of a user. When the actual method calls are in helper methods, you isolate yourself from the API reality your users are facing. The API should be nice enough to do the setup, execution and assertion directly with it. Otherwise you have a design problem. The test code should be as analogous as possible to the code users of the API have to write.

Conclusion


I hope I provided a new perspective on the organization of unit tests. If you have objections to this approach feel free to add a comment. If you are interested in this and other aspects of testing software, you can have a look at Test Principles from my co-worker Gaetano Gallo. He covered this idea under "Self containment" in his blog post.