
20121219

Analogous function testing

For a long time I've wanted to write about a testing technique I call analogous function testing. Some time ago I wrote about the inverse function test pattern, and the analogous function testing pattern is also mentioned in my piece about functional hash functions.

Instead of describing synthetic examples I can now present the pattern with a real-world example. I wrote tests for a Java port of a small C library. The phonet library returns the same string for words with the same pronunciation in German. In this way it's similar to the Soundex and Metaphone algorithms for English.

The test I'm going to present found bugs in the Java port. More surprisingly, it also found bugs in the C library (which is more than 10 years old) and in the JNI bridge code we had used for years.

If you have a look at the C code you'll find out that it is not trivial. Well, that's the most polite way to say that it's devilishly hard to read. It's beyond my means to read code that initializes 10 static variables and has 26 automatic variables in the main function. It's not obvious how you can port this code to Java. Luckily, there was already a Java port that is more or less analogous to the C code - it's not very idiomatic Java.

My goal was to fix the port. I ignored performance and maintainability issues.

How do you figure out that this rather complex function behaves the same way in the original and in the ported version?

The trivial test approach of checking explicit values is costly, and it's hard to get good code coverage.

assertEquals("a", phonet("a"));

This way you need a lot of test code before you get decent coverage. Line and branch coverage might be easy to attain, but the phonet library has 200+ rules that are evaluated by precedence. Are we going to write x * 200 tests to check that the right values are returned? Even if we cover all rules, there are still all the characters that have to be processed consistently. And once we are done with that, the phonet rules could change and we would have to redo some of the tests. Not very exciting.

The trivial test approach is simply too expensive (and boring). Let’s try another approach that reduces our work to one line of code: write the test as an analogous function test.

The analogous test ensures that f(x) = g(x) for all x. f(x) is the C version and g(x) is the Java version. The only thing left is to generate x and see if f(x) == g(x) holds.

That’s the single line of code we have to write to get the test coverage. Okay, we have to write one line to check that f(x) = g(x) and some code to generate x. It all boils down to comparing the output of the C and the Java versions.

assertEquals(in, cPhonet.phonet(in), jphonet.code(in));

Unfortunately, we cannot create all values with brute force. The challenge is to find ways to generate the right input values; the quality of the test depends on them. I wrote the same test with multiple approaches, each one useful for a different facet of input values:
  • There’s a Quickcheck-based random generator test for large input
  • There’s a test with German dictionaries of known values
  • There's an exhaustive test that generates all possible values up to a certain (small) size
As an example I'll show you the exhaustive test for small strings. It uses Permutation.kPermutationWithRepetition(all, STRING_ADD, size) to create an Iterable that returns all permutations of size k from a set of input values, with repetition of values (e.g. for the input AB and size k = 2 it generates AA, AB, BA, BB).
@Test public void phonetDefinedCharacters() {
    Coder jphonet = createCoder();
    CPhonet cPhonet = new CPhonet();

    for (int size = 1; size <= 2; size++) {
        for (String in : kPermutationWithRepetition(
                CharEncoding.allChars(), STRING_ADD, size)) {
            assertEquals(in, cPhonet.phonet(in, rules()), jphonet.code(in));
        }
    }
}

You can have a look at the full implementation in the CPhonetVsPhonetikTest class.

As you can see from the test code, the idea is simple. Following this approach will not guarantee 100% test coverage as long as you cannot generate all values. The hard part is to come up with the right test data. A pure Quickcheck-based random approach is a good start, but it might need to run too long to find critical bugs, so additional tests are necessary.
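
For illustration, a random-input variant of the same comparison could look like the following sketch. It assumes the same cPhonet/jphonet setup as the exhaustive test above; someStrings() stands for a generated Quickcheck iterable of random strings and is an assumption, not taken from the actual test class.

@Test public void phonetRandomStrings() {
    Coder jphonet = createCoder();
    CPhonet cPhonet = new CPhonet();

    // someStrings() is assumed to be a generated iterable of random string values
    for (String in : someStrings()) {
        assertEquals(in, cPhonet.phonet(in, rules()), jphonet.code(in));
    }
}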

Porting is only one class of tests that analogous function testing is useful for. Another class are tests where you already have a simple implementation whose correctness is easy to verify. The simple implementation is then used to test the harder implementation (the one with the preferable non-functional characteristics). It is also a good approach if you do not trust a piece of code but are unable to read all of its source code: just check it against a simple implementation.
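
As a sketch of this idea (not from the original post): a hypothetical optimized sort implementation could be checked against the JDK's Arrays.sort, with someIntArrays() assumed to be a generated Quickcheck iterable of random int arrays and mySort() the hypothetical implementation under test.

@Test public void sortBehavesLikeJdkSort() {
    // someIntArrays() is an assumed generated iterable of random int[] values
    for (int[] input : someIntArrays()) {
        int[] expected = input.clone();
        java.util.Arrays.sort(expected);            // the simple, trusted implementation
        assertArrayEquals(expected, mySort(input)); // mySort is the harder implementation under test
    }
}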

A blind spot of an analogous function test is that bugs in f(x) and g(x) can mask each other. If you implement the same bug in both versions the test will not indicate the problem.

The moral of this story is that black-box tests with generated values can be a valuable tool alongside classical tests and code reviews - but like everything in life, they're not perfect. It's only one tool. Quality is still expensive.

20121212

Testing: Start with the result

If you write tests based on generated values, it's incredible how often it is easier to start from the result value and work your way backwards to the input value (the input that leads to that return value).

One part of the story is that you should not repeat the algorithm in the test. Otherwise the only thing you're testing is that you can write the same code in two places, and every error in your reasoning is repeated. It's natural instead to start with a known return value and create input values that lead to this result.

I’ll demonstrate the principle with a functionality I recently used: k-permutation with repetition. (k defines the size of the output values. Permutation indicates that we care about order. Repetition says that output values can be created by reusing input values.) For example the k-permutation with repetition for the input set {A B} and k = 3 is {AAA, AAB, ABA, ABB, BAA, BAB, BBA, BBB}. We know that the size of the permutation is n^k. (n is the size of the input set.)

@Test public void kPermutationWithRepetitions() {
    int maxSize = 6; 
    
    for(String expected : toIterable(strings(1, maxSize))) {
        Set<Character> allowed =
            newHashSet(Chars.asList(expected.toCharArray()));
        Set<Character> additional =
            anySet(characters(), 0, maxSize - allowed.size());
        allowed.addAll(additional);
    
    
        int k = expected.length();
        Iterable<String> actual =
            kPermutationWithRepetition(allowed, STRING_ADD, k);
        
        assertEquals(
            Math.pow(allowed.size(), k), Iterables.size(actual), 0);
        assertTrue(Iterables.contains(actual, expected));
    }
}

The test uses Quickcheck and Guava. It starts by generating one expected result value. The interesting input value is the set of allowed characters. This set contains all characters of the result string plus some additional characters. We have to take care that the input set is small enough, otherwise the result size is too large to run the test in practice. With this in place we can call kPermutationWithRepetition that generates the actual result. The test checks that the expected value is in the result and that the size of the permutation is correct.

This example demonstrates that it can be much easier to start with an expected return value. Depending on the situation it's appropriate to create the entire result or just aspects of it. If you can generate at least one result value, the generated values are evenly distributed over all possible results, and you know the total result size, your test is sufficient.

20121010

My best Python HTTP test server so far

I’ve implemented a bunch of test HTTP servers in Python and now I think I’ve got the implementation right:
  • Test client code doesn’t have to care about the specific HTTP port. Any free port can be used by the test without interference with other running processes.
  • The HTTP server is up and running when the test is ready.
  • Resource handling uses the with statement. The HTTP server is shut down at the end of the test.
  • The concrete request URLs (host and port) are transparent to the test.
  • Tests can run fully in memory. The only resource allocated is the HTTP socket.
The actual test is brief. We call http_server with an HTTP handler and a url function is returned. The test can use the url function to create request URLs. This is handy because the allocated port of the HTTP server is not fixed. In the test below we check that the returned content matches.
with http_server(Handler) as url:
    assert list(urllib2.urlopen(url("/resource"))) == [content]
To run this test we need an HTTP handler implementation. You could use the SimpleHTTPServer.SimpleHTTPRequestHandler that comes with Python and work with files served from a directory. This is a good point to start, but setting up a test folder with the necessary content is cumbersome and inflexible.

This handler runs in memory without any additional setup. It always returns the 200 response code and writes the content into the response.
code, content = 200, "Ok"
class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
  def do_GET(self):
      self.send_response(code)
      self.wfile.write("\n" + content )
The http_server implementation starts a thread, opens a socket and yields the url function. The HTTP request handler runs in the spawned thread.
@contextlib.contextmanager
def http_server(handler):
  def url(port, path):
      return 'http://%s:%s%s' % (socket.gethostname(), port, path)
  httpd = SocketServer.TCPServer(("", 0), handler)
  t = threading.Thread(target=httpd.serve_forever)
  t.setDaemon(True)
  t.start()
  port = httpd.server_address[1]
  yield functools.partial(url, port)
  httpd.shutdown()
I leave it as an exercise to you to write an implementation that reuses the HTTP server in multiple tests. This could be necessary if the overhead of allocating ports dominates the test running time.

20110322

Using deterministic generators with Quickcheck

The 0.6 release of Quickcheck supports deterministic generators. The goal is to be able to make the generation of values reproducible. This is useful when you are working with a bug from your favourite continuous integration server or when you would like to run a piece of code in the debugger repeatedly with the same results.

A non-goal of the support is to remove the random nature of Quickcheck. Values are still random to allow good coverage, but reproducibility is supported when needed. This way you have the best of both worlds.

Quickcheck internally uses the linear congruential random number generator (RNG) implemented in Java's Random class. The property of the RNG that matters in the context of reproducible values is stated in the javadoc:

If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers.

You can configure the seed used by Quickcheck with the RandomConfiguration class. It's important to set the seed for every individual test method, otherwise the RNG's return values depend on the execution order of the test methods. If you run different tests, add new tests or execute the tests in a different order, other values will be generated.

For a normal execution the seed is generated randomly (the result of the RandomConfiguration.initSeed method call), so Quickcheck still produces random values. Use the setSeed method to fix the seed for a test method.
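
A minimal sketch of using RandomConfiguration directly (method names as described above, exact signatures assumed):

@Test public void reproducibleValues() {
    // fix the seed for this test method so the generated values are reproducible
    RandomConfiguration.setSeed(42L);
    Generator<Integer> numbers = PrimitiveGenerators.integers();
    Integer first = numbers.next();
    // re-running the test with the same seed and the same call sequence yields the same values
    assertNotNull(first);
}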

Instead of using the RandomConfiguration directly you should use the SeedInfo JUnit method rule that runs with every test method. Additionally, it adds the seed information that is needed to reproduce the problem to the AssertionError thrown.

The SeedInfo rule can be used like every other JUnit method rule. It's added as a member of the test class. The example generates values in a way that the assertion always fails.

@Rule public SeedInfo seed = new SeedInfo();

@Test public void run(){
  Generator<Integer> unique = uniqueValues(integers());
  assertEquals(unique.next(), unique.next());
}

An example error message is:
java.lang.AssertionError: expected:<243172514> but was:<-917691317> (Seed was 3084746326687106280L.)
You can also use the SeedInfo instance to set the seed for a test method to reproduce the problem from the AssertionError.

@Rule public SeedInfo seed = new SeedInfo();

@Test public void restore(){
  seed.restore(3084746326687106280L);
  Generator<Integer> unique = uniqueValues(integers());
  assertEquals(unique.next(), unique.next());
}

Instead of setting the seed for individual tests you can also set the initial seed once for the random generator used in the JVM execution. If you run the test example from above (without the SeedInfo method rule member) with the configuration -Dnet.java.quickcheck.seed=42:

@Test public void run(){
   Generator<Integer> unique = uniqueValues(integers());
   assertEquals(unique.next(), unique.next());
}

You should get the result:
java.lang.AssertionError: expected:<977378563> but was:<786938819>

The configuration of seed values replaces the serialization and deserialization support of earlier Quickcheck versions. Setting the seed is a much simpler way to reproduce values over multiple JVM executions.

20110207

Members in unit tests considered harmful

I've been reading Domain Specific Languages by Martin Fowler. It is a good book to formalize API design and it gives you a new perspective to think about your software. You can buy this book even if you're not planning to implement DSLs in the future. A DSL is a differently marketed API anyway - but that's another blog post.

I don't like the style of the tests presented in the book. Martin covers testing of DSLs at the beginning of the book (3.6 Testing DSLs, page 53). The tests use members heavily. I've already seen this style in Robert C. Martin's Clean Code. Both are TDD pundits, so I suspect it's not widely known that you can write tests with better readability.


The code that is tested is a controller that uses a state machine to react to events with transitions to the next state.

Use of members in unit tests


The testing style I object to looks like this:

@Test public void event_causes_transition(){
    fire(trigger_a);
    assertCurrentState(a);
}

Sorry, this looks nice but it's not readable. What is the fixture? What is the class under test and which methods are called? What are the parameters of the call and the result? What does the assertion check? You can figure it out, but it's not easy and you have to navigate a lot in the source file.

No members in unit tests


Now contrast that with this member-less style:

@Test public void event_causes_transition(){
    Event trigger_a = anyEvent();
    State a = anyState();
    Controller controller = controller(stateMachine(transition(trigger_a, a)));

    controller.handle(trigger_a.getId());

    assertEquals(a, controller.getCurrentState());
}

It's longer than the original version, but everything is in one place. You still set up an initial state, but it is completely local. The naming is kept in sync with the original version so you can correlate the implementations. The naming is local to the test, so it could be more specific in the member-less style: "a" should be a "state" and "trigger_a" a "trigger".

As you can see in the test, it creates an event and a state. The details of both do not seem to be significant for the test, and this insignificance is stated explicitly. Then a controller is set up with a state machine that has one transition. After the setup is done, you can actually call the method you want to test and check the result.

Every test method following this style has the sequence setup, execute, assert, while keeping everything in the scope of the test: no members!

Code organization and readability


After contrasting the two implementation styles let’s try to explain why the member-less style is more readable.

If tests make heavy use of members, they look nice on the surface: no redundancy, nice method names and helper methods to keep the intent clear. Why is this less readable than the member-less style? The problem is the missing context and the navigation in the test code that is necessary to get this context before you can understand what is actually tested.

If you recorded the navigation of a reader you would see something similar to the following:
  • the setup methods
    • memorize setup members
  • test method: execute statement (fire)
  • execute helper implementation
    • recall setup members
    • memorize result members
  • test method: assert statement (assertCurrentState)
  • assert statement implementation
    • recall result members

As you can see, the reader has to navigate and remember a lot to get the initial setup of the test and keep track of the transition triggered by the execution.

In every place where the reader has to remember some information, there's the potential that additional navigation is necessary, because he may simply not remember enough context. This becomes even harder when you start to organize the tests into test class hierarchies. The context to remember then consists of all super class members, execution method implementations and assertion method implementations. Naturally, our ability to store information is limited. Less context information to remember is better for the readability of a test.

So the nice test from above actually looks like this in reality:

@Test public void a(){
    //a lot of navigation and remembering
    b();
    //ditto
    c();
    //ditto
}

Ideal organization of code


This is not readable. Ideally, something readable is a linear text laid out exactly in the way you need it to work on your current problem. You build up the context as you go without losing time navigating. A second ideal of readability is that all the necessary information is in one spot. All your context is visible to you at the same time.

The two ideals affect each other. Given a problem complex enough, you cannot have a compact linear representation: either it's linear or it's compact.

Additionally, you make trade-offs to remove redundancy from code. If you are asking yourself whether to reorganize to remove redundancy, don't forget that the most important feature of a test is readability. Introduce some redundancy as long as it helps the understanding of your test.

Organize to allow effective navigation


You cannot organize code in a way that it can be read linearly for all purposes, but you can organize it in a way that the navigation capabilities of IDEs allow effective navigation. Every navigation starts in the test method and is done by looking up the definition of a helper method. The full context is kept in local variables, parameters and return values.

You can start to read a test class from every test method without losing vital information. No navigation to members is necessary:

Event trigger = anyEvent();
State state = anyState();
Controller controller = controller(stateMachine(transition(trigger, state)));

The member-less style makes the setup explicit. Try to name all setup helper methods following the Principle of Least Astonishment. They should be nicely named and state what their individual guarantee is. Accordingly, a method that creates a valid event without further guarantees is named anyEvent().
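
For illustration, a hedged sketch of what such helpers might look like. Event, State, Transition, StateMachine and Controller are the (hypothetical) types of the example, and their constructors are assumptions:

// hypothetical helper factories for the member-less style
static Event anyEvent() { return new Event("event-" + UUID.randomUUID()); }
static State anyState() { return new State("state-" + UUID.randomUUID()); }
static Transition transition(Event trigger, State target) { return new Transition(trigger, target); }
static StateMachine stateMachine(Transition... transitions) { return new StateMachine(asList(transitions)); }
static Controller controller(StateMachine machine) { return new Controller(machine); }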

controller.handle(trigger.getId());

Now the actual method to test can be executed. The member-less style shows exactly what is going on. It makes crystal clear which parameters are used and which result value is received. There is no navigation overhead. The instances from the setup are used to run the actual test. If there is an execution result, it is stored in the local context of the test method.

assertEquals(state, controller.getCurrentState());

After the execution you can do the assertion on the resulting state or result. This again uses only instances from the method scope. Sometimes it is useful to introduce custom assertions to remove redundancy. In this style the assertion method works only on the parameters given, e.g. assertState(state, controller).

If you tracked the navigation for the member-less test style you would get something like this:
  • test
  • setup methods
    • memorize state
  • test
    • recall setup state
    • execute method
    • assert method


The result is not the ideal of a linearly readable test, but the navigation to the definition and back to the test is relatively swift. You can be sure that you did not miss vital information, as everything is in one place. Removing uncertainty is the key to simpler reasoning about the test at hand.

Avoid test class hierarchies


The member-less test setup also helps to avoid test class hierarchies that are only introduced to allow different fixtures for sets of tests. Sometimes this is avoided in member-style tests by doing some setup globally in members and some in the test method, but that complicates the reasoning even more. With a member-less test you can have a different setup for every test method. Normally, you organize the setup methods in a way that they build on each other to avoid redundancy.

Avoid isolation from the API


If you put the method execution and assertion into separate methods - like the member-heavy style does - you introduce the danger of hiding API problems. One argument for TDD is to experience your API from the point of view of a user. When the actual method calls are in helper methods, you isolate yourself from the API reality your users are facing. The API should be nice enough to do the setup, execution and assertion directly with it; otherwise you have a design problem. The test code should be as analogous as possible to the code users of the API have to write.

Conclusion


I hope I provided a new perspective to look at the organization of unit tests. If you have objections to this approach feel free to add a comment. If you are interested in this and other aspects of testing software, you can have a look at Test Principles from my co-worker Gaetano Gallo. He covered this idea under "Self containment" in his blog.

20101207

Inverse Function Test Pattern in Quickcheck

I already mentioned the inverse function test pattern in the introduction to specification-based testing using Quickcheck. Now I would like to present the pattern in more detail with an example.

The basic idea of testing with inverse functions is simple. We have a function f and its inverse function f⁻¹. If we apply f to an input value and then apply f⁻¹ to the result, we should get the input value back: f⁻¹(f(x)) = x.



This test pattern is applicable to all kinds of functions, for example compression and encryption. In the business application domain, create and delete are examples of operations that can be tested this way. Another example is the do and undo operations of command objects. In both examples, doing and undoing an action leaves the state of the world unchanged.
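
For instance, an encode/decode pair can be checked with a one-line round trip. The sketch below uses the JDK's Base64 codec as f and f⁻¹ and assumes a generated someByteArrays() iterable of random input data:

@Test public void base64RoundTrip() {
    // someByteArrays() is an assumed generated iterable of random byte[] values
    for (byte[] data : someByteArrays()) {
        byte[] roundTripped = Base64.getDecoder().decode(Base64.getEncoder().encode(data));
        assertArrayEquals(data, roundTripped);
    }
}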

There are some constraints on the applicability of the test pattern. The function f has to have an inverse function, i.e. it has to be bijective, and at least one of the functions f or f⁻¹ has to be tested on its own. Otherwise the intermediate result could be invalid: for example, if you test a create and a delete operation only together, the inverse function test also passes if both operations do nothing.

The example used to illustrate the inverse function testing is a simple square function.

public double square(double a) {
 double square = a * a; 
 return square;
}

The inverse function test can be implemented with concrete values. We use the Math.sqrt() implementation as the inverse function.

@Test public void squareConcreteValues() {
  double a = 2;
  assertEquals(square(Math.sqrt(a)), a, precission);
}

This is okay, but defining input values manually is not very productive, not very readable and does not provide sufficient test coverage. You can instead employ some computing power to generate the values using Quickcheck.

Firstly, the square function is not bijective, as square(-x) = square(x). This is a fact we did not express in the example with concrete values; it simply omitted it. To fix this, the result is compared to the absolute value of x. Secondly, the function will overflow, and test output like this:
java.lang.AssertionError: expected:<1.6482368012418589E307> but was:<Infinity>
is typical.

@Test public void squareWithOverflows() {
  for(double a : someDoubles()) {
    assertEquals(abs(a), Math.sqrt(square(a)), a * precission);
  }
}

Again, this aspect was not expressed in the concrete test. Even if you were not aware of the problem, the failing test points you to the overflow. This is a nice example of how Quickcheck can help you find bugs you did not anticipate. I admit that this is a simple example, but give it a try. You'll see that you run into all kinds of problems you did not think of. You have to break your software to know it.

Now we have to fix the overflow problem. Depending on the situation it can be easier to find and test a valid implementation that is more restrictive than theoretically possible but sufficient for your requirements. This is the trade-off between effort now and potential applicability later.

For this example it is easy to define all legal input arguments: the bound is the largest double value that can be squared without overflowing.

@Test public void squareWithBounds() {
  double max = Math.nextAfter(Math.sqrt(Double.MAX_VALUE), 0.0);
  for (double a : someDoubles(-max, max)) {
    assertEquals(abs(a), Math.sqrt(square(a)), a * precission);
  }
}

To finish this example let’s write the test for the illegal input arguments as well. All square arguments that cause an overflow are invalid and should cause an IllegalArgumentException. The invalidArguments double generator defines all invalid values. These are the values greater than the largest valid value max and smaller than the smallest allowed value -max.

@Test public void squareInvalidArguments() {
  double max = Math.sqrt(Double.MAX_VALUE);
  double firstInvalid = Math.nextUp(max);
  Generator<Double> invalidArguments =
    oneOf(doubles(-Double.MAX_VALUE, -firstInvalid))
    .add(doubles(firstInvalid, Double.MAX_VALUE));
    
  for (double a : someEnsureValues(asList(firstInvalid, -firstInvalid), invalidArguments)) {
    try{
      square(a);
      fail();
    }catch(IllegalArgumentException e){ }
  }
}

The implementation passing the tests is:

public double square(double a) {
  double square = a * a;
  if (Double.isInfinite(square)) { throw new IllegalArgumentException(); }
  return square;
}

Testing with inverse functions can solve the dilemma of writing tests without repeating the production code. If you repeat the code in the test, you have an additional check that you wrote the code down correctly; you have to repeat an error in both places to introduce a bug. That is all you can get out of such tests. If you test with an inverse function, the test and the implementation do not share code. This kind of test has the potential to find conceptual problems in your code. (Actually, the reality is not black and white. You can have a test that implements the same operation with a simpler test implementation. This implementation verifies that the more complex production version works. This is the idea of the analogous function test pattern, which I will come to later.)

If the inverse function pattern is applicable it can save you a lot of effort. For example, the result of an operation can be very costly to verify, as with encryption functions. The exact representation may not be of interest for the given domain, or it may change frequently, leading to high maintenance costs with concrete test values. If you can define valid input values and have a tested inverse function, the test effort is reduced to setting up the input value generator. The nice side effect is that you verify that the functions you think are inverse functions really are.

20101110

The easiest way to get started with Quickcheck

You may download the Quickcheck zip file from bitbucket. This zip contains the full Quickcheck distribution. Alternatively, the easiest way to get started is Maven.


Add the following definition to your pom.xml file:

<dependency>
    <groupId>net.java.quickcheck</groupId>
    <artifactId>quickcheck</artifactId>
    <version>0.6</version>
    <scope>test</scope>
</dependency>

Have fun.

20101108

Alternative test approach: Quickcheck

With this article I would like to shed some light on the question: why should I implement tests with the specification-based testing approach proposed by Quickcheck? I will show that the approach yields tests with better readability and more test coverage, and that it helps to avoid mock tests and manually maintained test fixtures.

I'll illustrate the approach with a simple example taken from the Guava library to be able to discuss the pros and cons. The example I picked is the test of the com.google.common.base.Splitter class. I present three implementations of the Splitter tests: the test implementation as it can be found in Guava today (translated to TestNG), a test implementation with a TestNG @DataProvider, and an implementation with Quickcheck for Java.

The following code shows the tests as they are today. The tests check the behavior of simple split operations for different input values. These tests have a lot of code but are structurally very similar.

@Test public void testCharacterSimpleSplit() {
  String simple = "a,b,c";
  Iterable<String> letters = Splitter.on(',').split(simple);
  assertContentsInOrder(letters, "a", "b", "c");
}

@Test public void testCharacterSplitWithDoubleDelimiter() {
  String doubled = "a,,b,c";
  Iterable<String> letters = Splitter.on(',').split(doubled);
  assertContentsInOrder(letters, "a", "", "b", "c");
}

@Test public void testCharacterSplitWithDoubleDelimiterAndSpace() {
  String doubled = "a,, b,c";
  Iterable<String> letters = Splitter.on(',').split(doubled);
  assertContentsInOrder(letters, "a", "", " b", "c");
}

@Test public void testCharacterSplitWithTrailingDelimiter() {
  String trailing = "a,b,c,";
  Iterable<String> letters = Splitter.on(',').split(trailing);
  assertContentsInOrder(letters, "a", "b", "c", "");
}

@Test public void testCharacterSplitWithLeadingDelimiter() {
  String leading = ",a,b,c";
  Iterable<String> letters = Splitter.on(',').split(leading);
  assertContentsInOrder(letters, "", "a", "b", "c");
}

@Test public void testCharacterSplitWithMulitpleLetters() {
  Iterable<String> testCharacteringMotto =
     Splitter.on('-').split("Testing-rocks-Debugging-sucks");
     assertContentsInOrder(
      testCharacteringMotto, "Testing", "rocks", "Debugging", "sucks");
}
 
private void assertContentsInOrder(Iterable<String> actual,
     String... expected) {
  assertEquals(Arrays.asList(expected), Lists.newArrayList(actual));
}

From the readability perspective, the problem with tests specifying concrete values is that the reader has to infer which part of the information is significant and which part is not. For example, taking this test in isolation:

@Test public void testCharacterSimpleSplit() {
  String simple = "a,b,c";
  Iterable<String> letters = Splitter.on(',').split(simple);
  assertContentsInOrder(letters, "a", "b", "c");
}

One could come to the conclusion that:
- split works only with single alphanumeric characters (“aa,bb” would not be a legal input string)
- the separator may not be an alphanumeric character
- split works only for successive char values, as the input is a - b - c
- the behaviour for all other characters is undefined


The reader of the tests has to digest all of them to be able to infer the actual properties of the Splitter. While reading, he also has to discard the insignificant information and eliminate the invalid assumptions. (Actually, the tests say nothing about alphanumeric separators, so this assumption remains open.) As we know what split does this is trivial, but for unknown functionality it is much harder.

The tests can be implemented with a TestNG DataProvider in a more concise way:

static final String SPLIT = "split";

@DataProvider(name = SPLIT)
public Object[][] split() {
  Object[] simpleSplit = {"a,b,c", ",", new String[] { "a", "b", "c" } };
  Object[] splitWithDoubleDelimiter = {
    "a,,b,c", ",", new String[] { "a", "", "b", "c" } };
  Object[] splitWithDoubleDelimiterAndSpace = {
    "a,, b,c", ",", new String[] { "a", "", " b", "c" } };
  Object[] splitWithTrailingDelimiter = {
    "a,b,c,", ",", new String[] { "a", "b", "c", "" } };
  Object[] splitWithLeadingDelimiter = {
    ",a,b,c", ",", new String[] { "", "a", "b", "c" } };
  Object[] splitWithMulitpleLetters = {
    "Testing-rocks-Debugging-sucks", "-", new String[] { "Testing", "rocks", "Debugging", "sucks" } };
  return new Object[][] {
    simpleSplit, splitWithDoubleDelimiter, splitWithDoubleDelimiterAndSpace,
    splitWithTrailingDelimiter, splitWithLeadingDelimiter, splitWithMulitpleLetters };
}

@Test(dataProvider = SPLIT)
public void simpleSplit(String simple, String separator, String[] expected) {
  Iterable<String> letters = Splitter.on(separator).split(simple);
  assertEquals(Arrays.asList(expected), Lists.newArrayList(letters));
}

This test removes a lot of the boilerplate of the original version. It tests the split method repeatedly with a set of input values. The more compact representation of the test helps as long as we keep the test input data and the test in one place. (If you organize your source file in a way that separates the test data from the test, readability decreases significantly.) This test approach still suffers from the readability problems the initial test has: the reader has to derive the properties of the Splitter implementation from the input and expected return values. This is like inferring the properties of multiplication from a set of input and return values.

A completely different approach can be implemented using the Quickcheck for Java API. Quickcheck is built on the basic abstraction of the Generator<T>. The only method of the Generator<T> interface is next(), which returns a new value of type T.

The Splitter test uses Quickcheck functionality to create values:
- PrimitiveGenerators.strings() is a Generator<String> for arbitrary string values
- CombinedGeneratorIterators.someNonEmptyLists() is similar to a Generator<List<T>> but it is adapted to Iterable<List<T>> so that it can be used in a for-each loop. The lists returned have at least one element.

The basic idea is to specify the test as follows: if you take any non-empty list of words, join them with a distinct separator and then split the joined string, the result will be the input list of words again.

@Test public void simpleSplit() {
  for (List<String> words : someNonEmptyLists(strings())) {
    char separator = anyDistinctCharacter(words);
    String input = Joiner.on(separator).join(words);
    Iterable<String> letters = Splitter.on(separator).split(input);
    assertEquals(words, Lists.newArrayList(letters));
  }
}

The test uses the anyDistinctCharacter helper method to create separators. Separators have to be distinct from the characters used in the words, otherwise the splitter would split inside the words. The helper uses the PrimitiveGenerators.characters() factory method to create a Generator<Character> for arbitrary characters. To create the separator values the CombinedGenerators.excludeValues generator is used. It is built on an input generator but skips all generated values that are equal to the excluded values. (Here it skips all characters that occur in the words.)

private char anyDistinctCharacter(List<String> words) {
    char[] notAllowedAsSeperator = Joiner.on("").join(words).toCharArray();
    return excludeValues(characters(), asList(notAllowedAsSeperator)).next();
}

This specification-based test describes the contract and does not enumerate specific test values. Compared to the two other implementations, the way the test is written is more rigid. This is a positive thing: as I already tried to explain, when composing a test of concrete values there is room for misinterpretation. The other effect is that six tests boil down to one.

This test approach of course has its trade-offs: it depends on join and it is not deterministic.
- Join has to be tested for this test to work. I think the join test is much easier than the split test. Once join is tested we can depend on it. By testing join and split together we check that the two are symmetric (f(f⁻¹(x)) = x). Testing symmetry comes for free with this approach.
- The real concern is that we run the test with randomly generated test data, and it is not guaranteed to contain all of the enumerated test cases in every test run.

If non-determinism is your concern there are some ways to address this.
- You can use a different distribution function in your generators. Using different distributions will make certain values more or less likely to occur.
- You can make the values used deterministic by using CombinedGenerators.ensureValues(ensuredValues). This is still better than the tests with concrete values, as you only have to specify the input values and not the expected result values.
- You can combine deterministic and random values: the generator created with CombinedGenerators.ensureValues(ensuredValues, otherValues) uses the enumerated values first and then the otherValues generator after the explicit values are exhausted (see the sketch after this list).
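
A minimal sketch of the last option, assuming static imports from CombinedGenerators, PrimitiveGenerators and java.util.Arrays, and that ensureValues accepts a list of ensured values followed by the fallback generator:

// the enumerated corner cases are produced first, random strings afterwards
Generator<String> inputs = ensureValues(
    asList("a,b,c", "a,,b,c", ",a,b,c", "a,b,c,"),
    strings());
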
The random nature of the generated values has of course an advantage: you may find bugs that you did not anticipate while writing the test.


Primitive type, enumeration and collection generators alone are not very useful. To test a real-world application you have to be able to create values of your custom types. In the Splitter example we already used generators provided by the Quickcheck implementation.

Here is a simple example to get you started.

class Name {
  private final String first;
  private final String last;

  public Name(String first, String last) {
    super();
    this.first = first;
    this.last = last;
  }
  public String getLast() { return last; }
  public String getFirst() { return first; }
}

class NameGenerator implements net.java.quickcheck.Generator<Name>{
  Generator<String> first = PrimitiveGenerators.strings();
  Generator<String> last = PrimitiveGenerators.strings();

  @Override public Name next() {
    return new Name(first.next(), last.next());
  }
}

The code defines the custom type Name and a Generator<Name> using generator implementations provided by the Quickcheck framework. This is analogous to the definition of types in Java: you define your own types based on the primitive and other types provided by the JDK.

The test then checks one property of the Name type. This test fails with the current implementation of Name (it does not override equals), but this is easily fixed.

public class NameTest {
  @Test public void equals(){
    for(Name name : Iterables.toIterable(new NameGenerator())){
      assertEquals(name, new Name(name.getFirst(), name.getLast()));
    }
  }
}

The test uses an adapter to the Iterable interface to be able to use the generator in a for expression. This is one way to run the test. You can also use an inner class with a more capable runner (QuickCheck.forAll(generator, characteristic)). Using the for expression is a trade-off, given that the Java language has no closures yet.

Analogous to your domain object hierarchy you can now build a generator hierarchy that is used to create the domain objects. This clearly adds some overhead as you now have an additional generator for every domain object. But this pays off as you can:
- use generators in unit tests.
- use generated values to replace hard-coded fixtures in integration tests (for example tests that use dbunit).
- omit mock tests that are only in place because the object graph needed to start a test is too hard to set up manually.
- use generated values to write inverse function (f⁻¹(f(x)) = x) and analogous function (f(x) = g(x), where f or g is simpler) tests (for example zip, encryption, read/write, sorting, searching, etc.).

I hope I could convince you that the approach helps you write more expressive tests by leveraging some computing power. You can use the specification-based testing approach to test a range of software, from boring business software to more interesting areas like data structures and algorithms. Maybe you could start by writing a test for the join method using Quickcheck?

20100102

Quickcheck 0.5 release

It's done: Quickcheck version 0.5 is ready. Again one of the nice little releases of Quickcheck.

The 0.5 release added the following features mainly to provide better readability:
  • Support for an adapter from Generator to Iterable to allow the use of Java's for-expression in tests (replacing the inner classes).
  • Generated source code for the @Iterables and @Samples annotations on generator factories. The @Iterables annotation was added to support for-expressions directly with Generators without additional code. The @Samples annotation can be used to generate a factory for sample values from a Generator class.

Minor changes are:

Version 0.6 of Quickcheck will be a major rework, after more than 2 years of experience with the specification-based test approach and its development, mainly to support further development - for example shrinking (simplification) of failed test values, a major feature still missing in Quickcheck version 0.5.

When I started Quickcheck I would not have thought that it would be possible to develop such a tiny little thing for more than 2 years. There are still more ideas and things that could be implemented and written about, and so little time.

Recently, I've used the 0.5 version in the Scala projects I blogged about. I did not mention that I tested them in any way, though. It was a pleasant experience, given that Quickcheck was not designed with Scala's language features in mind. It works with just a small implicit type conversion and adaptation layer (10 lines or so).

Distribution function implementations based on the XorShift random number generator (which I've already worked with) would be nice to have. While I'm adding new distributions anyway, there are still the heuristic-based distribution functions to implement, for example a distribution function based on a boundary heuristic that produces the minimum value, -epsilon, 0, +epsilon and the maximum value before random generation kicks in. Another item on my ever-growing list is the description of patterns related to specification-based testing and Quickcheck. Obviously there is still plenty of work left.

20091019

Support for generator type conversion in Quickcheck

A new utility method is now part of Quickcheck that lets you convert a Generator<A> to a Generator<B> given that A extends B.

This is useful when you have a Generator<Integer> and would like to treat it as a generator of type Generator<Object>.

This can now be done using the Generators.cast method:
Generator<Integer> integers = PrimitiveGenerators.integers();
Generator<Object> objects = Generators.<Object> cast(integers);
This is valid because the Generator type is logically covariant: it only produces values, and the cast does not alter the state of the Generator<T>.

The cast is needed because Java's generic types are invariant. Java cannot perform the (logically valid) implicit conversion from Generator<Integer> to Generator<Object>; this variance information is not present in the Java type system, so the cast has to be performed explicitly. (If the generic type Generator could be declared covariant, this sort of conversion could be supported. Scala supports covariant and contravariant type declarations this way.)
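
To illustrate the difference (a sketch, not from the original post): a use-site wildcard also works, but the wildcard type then spreads through the calling code, whereas the cast yields a plain Generator<Object>.

Generator<? extends Object> wildcard = PrimitiveGenerators.integers();                // allowed, but the wildcard sticks to the variable
Generator<Object> objects = Generators.<Object> cast(PrimitiveGenerators.integers()); // plain target type via the explicit cast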

20091013

Using source code generation to simplify test development with Quickcheck

With Quickcheck version 0.5 it is possible to use generated source code to remove repetitive test code.

The first source code generator removes some redundancy from tests that use only a single sample from a generator. For example, a common source code pattern is:
import static net.java.quickcheck.generator.PrimitiveGenerators.strings;
String sample  = strings().next();
If you're only ever going to use this one string the next() call on the generator is annoying. You're doing it over and over again.

With the @Samples annotation on the PrimitiveGenerators class the PrimitiveGeneratorSamples class will be generated that contains exactly this code. So instead of the code above you can now write:

import static net.java.quickcheck.generator.PrimitiveGeneratorSamples.anyString;
String sample = anyString();

The same @Samples annotation can be used for your own generators. The setup is simple with javac: as long as you have the quickcheck-src-generator.jar in your classpath and did not disable annotation processing in your project, source generation should work out of the box.

Only static methods returning a subtype of Generator<T> are supported. The generated class will be named <YourClass>Samples and the methods any<YourMethod in singular>. Your generator method names should be in the plural, as in the PrimitiveGenerators and CombinedGenerators classes. (The algorithm to create the singular is simple: it removes the last character from the method name. So people() won't work - it will create anyPeopl.)

A new way to state characteristics is part of Quickcheck 0.5. The Iterables.toIterable factory method creates an adapter to the Iterable interface. Iterable instances can then be used in for expressions.
for (Integer any : Iterables.toIterable(PrimitiveGenerators.integers())) {
  assertEquals(any * 2, any + any);
}
With the @Iterables annotation the source code to adapt to the Iterable interface gets generated. The generated adapter factory methods can then be used directly in your tests:
import static net.java.quickcheck.generator.PrimitiveGeneratorsIterables.*;
for (Integer any : someIntegers()) assertEquals(any * 2, any + any);
This source code generator can be used with your own generators, too. The same restrictions apply as for the @Samples annotation. The generated class is named <YourClass>Iterables and the methods some<YourMethodName>.
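
A hedged sketch of what annotating your own factory class might look like; the annotation package, the imports and the exact generated names are assumptions based on the naming rules described above:

@Samples
@Iterables
public class MyGenerators {
    // static factory methods returning a Generator subtype are picked up by the processor
    public static Generator<String> names() {
        return PrimitiveGenerators.letterStrings();
    }
}
// the processor should then generate MyGeneratorsSamples and MyGeneratorsIterables
// classes with sample and iterable factory methods following the rules above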

At the moment the two factory method classes CombinedGenerators and PrimitiveGenerators are annotated with @Samples and @Iterables, so all factory methods now support samples and for expressions.

The combination of source code generation and the Iterables adapter removes some of the pain of the missing closure support in Java. I've tried to work around this problem before; the JUnit runner support is one such attempt. But now, after using it, I do see that it clutters the test code unnecessarily and that it is harder to navigate and to refactor. So this feature will probably be removed in Quickcheck release 0.6, as expressing the characteristic in an inner class or with the new Iterable adapter are the better alternatives.

20090702

Generators

Generators are at the very heart of Quickcheck. They provide the test data the runner executes characteristics with.

Distributions

A distribution is a function that returns values between -1.0 and 1.0. The concrete frequency of values is determined by the distribution function. All generators provided by Quickcheck use an implementation of Distribution to derive their respective values from.

From a technical point of view all generators could be run with the uniform distribution. Other distributions can be used to increase the likelihood of certain values. This is useful to make the generated values more significant for the test at hand.

The supported distributions are: uniform, inverted normal, positive normal and negative normal.



The diagram shows the different distribution functions supported by Quickcheck. It is an example output of an integer generator with values ranging from 0 to 10.

All values of the uniform distribution have the same probability of occurring. The inverted normal distribution is a distribution function whose border values are the most likely. The function is based on the normal distribution, inverted and cut. This provides a pragmatic distribution function to test border cases, where the lowest and highest values are of the most interest. For the positive normal distribution the values near 0.0 are the most probable; for the negative normal distribution it is the values near 1.0.

Primitive generators

Primitive generators are the basic building blocks for custom generators. The static factory methods in the net.java.quickcheck.generator.PrimitiveGenerators class can be used to create primitive generators (String, Integer, Long, Double, Byte, Character, Date, Enum). Typically the factory provides methods to create a generator with minimum value, maximum value and distribution for each primitive type. Additionally, there are convenience methods that use default values.

There are two special factory methods that do not denote a primitive type: fixed values and null. The fixed values generator always returns one of the given values; the null generator always returns null.

Combined Generators

Combined generators use other generators to build more complex generators. As with primitive generators there is the net.java.quickcheck.generator.CombinedGenerators class with static factory methods.

There are a number of generators that create subtypes of java.util.Collection and arrays (List, Set, object arrays, int arrays, byte arrays).

Two generators generate tuples: the pair and triple generators. (Java generics somewhat discourage the creation of larger tuples.)

The ensured values generator can be used to create a deterministic generator. This can be combined with a random generator that kicks in when the deterministic generator runs out of values.

The nullsAnd generator can be used to create a random mixture of null values and values returned by another generator. (The generator has this funny name to form readable statements in test code, for example: nullsAnd(integers()).)

The unique values generator can be used to decorate another generator to ensure that a value is not created repeatedly. When the base generator returns a redundant value it will be retried. (At some point the unique generator will give up and throw an exception.)

The frequency generator can be used to combine multiple generators into one generator. Every added generator has a weight, and values are taken from the generators according to their respective weights. The oneOf generator factory method is a convenience method to create a frequency generator from a set of generators with the same weight.
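
A few of these factories in one sketch (static imports from PrimitiveGenerators and CombinedGenerators; the method names appear in these posts, the exact signatures are assumed):

Generator<Integer> maybeNull = nullsAnd(integers());                        // random mixture of null and integer values
Generator<Integer> unique = uniqueValues(integers());                       // no value is produced twice
Generator<Integer> ensured = ensureValues(asList(0, 1, -1));                // deterministic values
Generator<Double> mixed = oneOf(doubles(-1.0, 0.0)).add(doubles(0.0, 1.0)); // equal-weight mixture of two ranges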

Building custom generators

Armed with this knowledge about primitive and combined generators you can now start to create your own custom generators.

The following example defines a generator for java.io.File objects.
class FileGenerator implements Generator<File> {

  Generator<File> roots = fixedValues(File.listRoots());
  Generator<List<String>> paths = nonEmptyLists(letterStrings());

  @Override
  public File next() {
    File f = roots.next();

    for (String p : paths.next()) f = new File(f, p);
    return f;
   }
}

An implementation of the Generator interface has to define the next method. This method returns the next generated value. It will be called by the Quickcheck runner when a test executes.

It uses the fixedValues generator to create the root node. The number of file system roots depends on the OS: Windows systems can have multiple roots, whereas Unix systems have only one. The path of the file below the root is created from a list of strings. The nonEmptyLists generator creates lists with at least one entry.

The next method implementation creates the file from a File representing the root node and appends the relative path elements generated with the paths generator. The path elements are added one by one in the for loop to create the absolute path of the resulting file.

It is always good practice to keep the base generators as members and reuse them across multiple runs of the next method.
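
A possible use of this generator, for example with the Iterable adapter described in the next post (the assertion is chosen only for illustration):

@Test public void generatedFilesAreAbsolute() {
    for (File file : Iterables.toIterable(new FileGenerator())) {
        // every generated file is rooted in one of the file system roots
        assertTrue(file.isAbsolute());
    }
}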

20090625

Iterable adapter - Ways to define characteristics

The current development version of Quickcheck for Java contains a new Iterable adapter. This adapter lets you convert a Generator implementation into an Iterable, which can then be used in Java for loops. This makes test definitions much shorter, as no inner class has to be used to define a characteristic.

When I started porting Quickcheck from Haskell I thought that there would be some closure-like facility in the Java language within 2 years. (After all, more and more developers seemed to be exposed to the concept of functions and higher-order functions, and most recent languages have some form of language construct to build new control structures in libraries.) So I did not bother too much with the usability problems that arise from the use of inner classes. In the meantime it has become obvious that Java will not have closures soon and maybe never will. So as long as we are using Java we have to work around this.

Quickcheck now has three different ways to express a characteristic: inner classes, annotations and for-loop expressions. Each of these solutions has its advantages and disadvantages. The example shows a characteristic defined with all three approaches.


@RunWith(QuickCheckRunner.class)
public class IterableExample {

  @Test
  public void classic() {
    forAll(PrimitiveGenerators.integers(), new AbstractCharacteristic<Integer>() {

      @Override
      protected void doSpecify(Integer any) throws Throwable {
        assertEquals(any * 2, any + any);
      }
    });
  }

  @ForAll(generatorMethod = "net.java.quickcheck.generator.PrimitiveGenerators.integers")
  public void annotation(Integer any) {
    assertEquals(any * 2, any + any);
  }

  @Test
  public void iterable() {
    for (Integer any : Iterables.toIterable(PrimitiveGenerators.integers())) {
      assertEquals(any * 2, any + any);
    }
  }
}

  • Inner classes are the standard implementation and fine if one can ignore the boilerplate code that is needed to express a characteristic as an inner class.
  • Annotations are not type-safe (and therefore not refactoring-safe), they cannot be navigated in IDEs, and the generator definition is separated from its use.
  • The Iterable adapter is type-safe and allows a concise definition of the characteristic, but it lacks all features of the Quickcheck runner: the Iterator<T> created by the Iterable<T> adapter produces a fixed number of values, and as the plain Java for loop is used, the context information for the test run is missing - setup and tear down have to be called explicitly.

20090618

Defining the Collection.add contract in Quickcheck for Java

The Collection add method defines a contract that is expressed in its javadoc.
It states:
  • A collection implementation can throw an exception. The collection will stay unchanged and contains() will return false.
  • The add method's return value indicates whether the collection changed as a result of the call. The collection will contain the element (based on equals()), but only if the collection changed will the very instance be part of the collection.
This textual form can be expressed as a Quickcheck characteristic.

@Test
public void collectionAdd() {
 for (Pair<Integer, List<Integer>> any : somePairs(integers(), lists(integers()))) {
  Integer element = any.getFirst();
  Collection<Integer> collection = any.getSecond();

  boolean changedCollection = false;
  boolean exceptionThrown = false;
  try {
   changedCollection = collection.add(element);
  } catch (Exception e) {
   assertException(e);
   exceptionThrown = true;
  }
  assertTrue(collection.contains(element) != exceptionThrown);
  assertTrue(changedCollection == containsInstance(collection,
    element));
 }
}

private void assertException(Exception e) {
 assertTrue(e instanceof UnsupportedOperationException
   || e instanceof ClassCastException
   || e instanceof IllegalArgumentException
   || e instanceof IllegalStateException);
}

private boolean containsInstance(Collection<?> collection, Object element) {
 for (Object e : collection) if (e == element) return true;
 return false;
}

In this way a specification can be provided for an interface that some third party has to implement. It's then up to the implementer to provide a generator for the implementation alongside the working code.

This obviously has some overlap with contract-driven development and the assertions facility. One interesting reuse of the characteristic could be to take an implementation that claims to fulfill a given contract, monitor its behaviour at run time (i.e. not only at test time) and report violations of the contract.