Tests as Documentation Workshop Notes

At this year’s XP Agile Universe conference, Brian Marick and I co-hosted a workshop on Tests as Documentation. The underlying theme was: how are tests like documentation, and how can we use tests as project documentation? Can we leverage tests to help minimize wasteful documentation?

Since the most up-to-date information about a product is in the source code itself, how do we translate that into project documentation? In the absence of a tool that can traverse the code, translate it, and generate documentation, are tests a good place to look? Can we simply take the tests we have and use them as documentation, or do we need to design tests in a specific way?

We solicited tests from workshop participants, and had sample tests developed with JUnit and the corresponding Java code, tests developed with test::unit and the corresponding Ruby code, and some FIT tests.
Brian organized the workshop in a format similar to patterns workshops or writers’ workshops. This was done to facilitate interaction and to generate many ideas in a constructive way. Participants divided into pairs and groups to look at the tests and to try to answer the questions from the workshop description. Once the pairs and groups had worked through these questions, they shared their own questions with the room. Here is a summary of some of the questions that were raised:

  1. Should a test be written so that it can be understood by a competent practitioner? (Much like the skills required to read a requirements document.)
  2. How should customer test documentation differ from unit test documentation?
  3. With regard to programmer tests: is it a failure if a reader needs to look at the source code in order to understand the tests?
  4. What is a good test suite size?
  5. How do we write tests with an audience in mind?
  6. What is it that tests document?
  7. How should you order tests?
  8. Should project teams establish a common format for the documentation regardless of test type?
  9. How do we document exception cases?

Groups took up some of these questions, but not all of them. I encourage anyone who is interested to look at examples that might answer some of them and share those examples with the community. While discussion within groups, and with the room as a whole, didn’t provide many answers, the questions themselves are helpful for the community as it thinks about tests as documentation.

Of the ideas shared, there were some clear standouts for me. These involve considering the reader, something I must admit I haven’t spent enough time thinking about.

The Audience

Brian pointed out an important consideration: when writing any kind of documentation, whether it is an article, a book, or project documentation, one needs to write with an audience in mind. When reviewing tests (and as a test writer myself), I notice that I don’t always write tests with an audience in mind. Often I’m thinking more about the test design than about the audience who might be reading the tests. This is an important distinction to keep in mind if we want tests to be used as documentation. Can we write tests with an audience in mind and still have them be effective tests? Will writing tests with an audience in mind help us write better tests? If we don’t write with an audience in mind, the tests won’t work very well as documentation.
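To make the contrast concrete, here is a minimal sketch of the same check written twice in JUnit, once with only the test design in mind and once with a reader in mind. The subject is java.lang.String, chosen only so the example runs as-is; the test names and assertion messages are invented for illustration.

    import junit.framework.TestCase;

    public class SubstringTest extends TestCase {

        // Written with only the test design in mind: it exercises the
        // code, but the name and the bare values tell a reader little.
        public void testSub1() {
            assertEquals("el", "hello".substring(1, 3));
        }

        // The same check written with a reader in mind: the name states
        // the behaviour, and the messages explain the values.
        public void testSubstringIncludesStartIndexButExcludesEndIndex() {
            String word = "hello";
            assertEquals("index 1 is included, index 3 is not",
                    "el", word.substring(1, 3));
            assertEquals("an empty range yields an empty string",
                    "", word.substring(2, 2));
        }
    }

Both tests pass, and both exercise the same code; only the second could stand in for a paragraph of documentation.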

What are We Trying to Say?

Another standout for me was the question of what it is that tests document. We were fortunate to have example tests for people to review. The FIT tests seemed to be easier for non-developers to read, while the developers jumped into the Ruby test::unit and JUnit tests immediately. Some testers who weren’t programmers paired with developers, who explained how to read the tests and what the tests were doing. I enjoyed seeing this kind of collaboration, and it got me thinking. More on that later. The point is, if we are writing a document, we need to have something to say. I’m reminded of high school English classes, of learning how to develop a good thesis statement, and of my teachers telling us we need to find something to say.

Order is Important

Another important point that emerged was the order of the tests. Thinking of tests as documentation means thinking of their order the way we think of chapters in a book or paragraphs in a paper. A logical order is important; without it we can’t get our ideas across clearly to the reader. It is difficult to read something whose ideas are jumbled and that lacks a consistent, orderly flow.
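As a sketch of what chapter-like ordering might look like, here is a JUnit test class ordered from the simplest behaviour to the exceptional case. The subject is java.util.Stack so the example runs as-is. Note that JUnit makes no promise about execution order; the ordering here serves the reader, not the test runner.

    import java.util.EmptyStackException;
    import java.util.Stack;
    import junit.framework.TestCase;

    public class StackStoryTest extends TestCase {

        // Chapter 1: what a new stack looks like.
        public void testANewStackIsEmpty() {
            assertTrue(new Stack().isEmpty());
        }

        // Chapter 2: the core behaviour.
        public void testPopReturnsTheMostRecentlyPushedElement() {
            Stack stack = new Stack();
            stack.push("first");
            stack.push("second");
            assertEquals("second", stack.pop());
            assertEquals("first", stack.pop());
        }

        // Chapter 3: a refinement of the core behaviour.
        public void testPeekLooksAtTheTopWithoutRemovingIt() {
            Stack stack = new Stack();
            stack.push("element");
            assertEquals("element", stack.peek());
            assertFalse(stack.isEmpty());
        }

        // Chapter 4: the exceptional case, placed last the way an
        // appendix would be (see question 9 above).
        public void testPoppingAnEmptyStackFails() {
            try {
                new Stack().pop();
                fail("expected an EmptyStackException");
            } catch (EmptyStackException expected) {
                // this is the documented failure mode
            }
        }
    }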

With regard to the audience, one group identified two different potential audiences among programmers: designers and maintainers. A designer will need a different set of tests than a maintainer, and the order in which the tests are developed will differ depending on which of the two is reading them. There are more audiences on a project than programmers, and these other audiences may require yet another order of tests.

Dealing with the “What is it that tests document?” question, one group felt that different kinds of tests document different things. For example, the unit tests the developers write document the design requirements, while the User Acceptance Tests document the user requirements. The fact that some developers seemed more at home reading the unit tests while some testers were more comfortable reading the FIT tests might lend some credence to this: each group is used to reading different project literature and may be more familiar with one mode than the other.

Another important question was: “How do we define tests and explain what they are supposed to do?” If tests are also meant to serve as project documentation, and not just to exercise the code or describe how to exercise the product in certain ways, then what counts as a test will vary according to how tests are defined for a given project.

Workshop Conclusions

I’m not sure we reached any firm conclusions at the workshop, though the group generated many excellent ideas. One workshop goal was to identify areas for further study, and we certainly met that. One idea that came up, and that I’ve been thinking about for a few months, is to add more verbose meta descriptions to the automated tests: the tests would carry program-describing details within their comments, and a tool such as JavaDoc or RDoc could then generate project documentation from the specially tagged comments. I like this idea, but the maintenance problem is still there. It’s easy for the comments to get out of date, and the approach requires duplication of effort.
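Here is a minimal sketch of that idea in JUnit. The PriceCalculator class, the discount rule, and the requirement id are all invented for illustration; running the standard javadoc tool over the test source, with its -tag option used to declare the custom req.traces tag, would render the comments as browsable project documentation.

    import junit.framework.TestCase;

    /**
     * Documents the (invented) pricing rules. Generate the documentation
     * with, for example:
     *
     *   javadoc -tag req.traces:m:"Traces requirement:" PriceCalculatorTest.java
     */
    public class PriceCalculatorTest extends TestCase {

        /**
         * Orders of ten or more items earn a ten percent volume
         * discount; smaller orders pay full price.
         *
         * @req.traces PRICING-12 (a hypothetical requirement id)
         */
        public void testVolumeDiscountAppliesAtTenItems() {
            PriceCalculator calc = new PriceCalculator();
            assertEquals(50.0, calc.total(5, 10.0), 0.001);  // full price
            assertEquals(90.0, calc.total(10, 10.0), 0.001); // ten percent off
        }

        // A minimal invented implementation so the sketch compiles;
        // on a real project this would live in the production code.
        static class PriceCalculator {
            double total(int quantity, double unitPrice) {
                double gross = quantity * unitPrice;
                return quantity >= 10 ? gross * 0.9 : gross;
            }
        }
    }

The duplication the workshop worried about is visible even here: the discount rule is stated once in the comment and again in the assertions, and nothing but discipline keeps the two in step.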

Most important to me were the questions raised about the audience, and about how to write tests with an audience in mind. It appears that the tests we write today cannot simply be taken on their own and used as documentation the way requirements documents are. None of the tests we looked at sufficiently explained the product; the readers either had to consult the developer or look at the source code. This wasn’t a weakness or shortcoming of the tests, but it showed us that tests as documentation is an area that needs more thought and work.

A couple of other very interesting observations were made. One, from a developer, was that you can tell by reading tests whether or not they were generated by Test-Driven Development (TDD). Another was that needing to consult the source code to figure out what the program is doing while reading the tests might be a testing smell. These observations, coupled with the tester/developer collaboration when reading tests, got me thinking in a different direction.

My Thoughts

At the end of the workshop, I found myself less interested in tests serving as documentation to replace requirements documents, project briefs, or other project documents. Instead, I started thinking about reading tests as a kind of testing technique, and began to imagine a kind of “literary criticism” we could use to test our tests. This is a hard area to get a grip on. How thorough is our test coverage? Are our tests good enough? Are we missing anything? How do we know whether our tests are doing the job they could be? I see a lot of potential in testing our tests by borrowing from literary criticism.

Brian spoke about writers’ workshops as a safe place for practitioners, peers, and colleagues to look over each other’s work before it is published, and potentially savaged by the masses, should it miss something important. This kind of atmosphere helps writers do better work because the criticism is constructive. For a “testing the tests” technique, instead of an us-versus-them relationship built on negative criticism, we could hold test writers’ workshops to critique each other’s tests. The point is to have a safe environment in which to make the tests (and thereby the product) as solid as they can be before they are open to being “…savaged by the masses”: for example, customers finding problems, or faults of omission.

Here are three areas I saw in the workshop that could potentially help in testing the tests:

  1. I saw testers and developers collaborating, and it occurred to me that explaining what you have written (or coded) is one of the best forms of self-critique. When explaining how something works to someone else, I find myself noticing holes in my logic, and the other person may spot holes as well. That editor, or second set of eyes, really helps, as pair programming has demonstrated.
  2. I heard expert developers saying they could read *Unit tests and tell immediately whether or not they were TDD tests. TDD tests, they told us, are richer by nature because they are more tightly coupled to the code. I think there is potential here for senior developers to read the tests, critique them constructively, and find potential weak spots. One could have a developer from outside the pair that did the work read the tests as a type of test audit or editorial review.
  3. The emergence of a possible test smell, “If we have to look at the code to explain the program, are we missing a test?”, prompted me to think about a catalog of test smells that reviewers could draw on. We look for bad “writing smells” using rules of grammar, spelling, and the like; we could develop something similar for this style of test review, complementing the work that has already been done in the test automation area. This could involve reading the tests to find “grammatical” errors in them. (A sketch of this smell follows the list.)
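Here is a minimal sketch of that smell in JUnit, using java.util.Calendar so the example runs as-is. The first test forces the reader back to the source (or the API documentation) to decode the magic numbers; the second states the same fact so the test explains the program rather than the other way around.

    import java.util.Calendar;
    import java.util.GregorianCalendar;
    import junit.framework.TestCase;

    public class CalendarSmellTest extends TestCase {

        // Smell: what is field 2, and why is December 11? The reader
        // has to leave the test to find out.
        public void testGet() {
            Calendar c = new GregorianCalendar(2004, 11, 25);
            assertEquals(11, c.get(2));
        }

        // The same fact, written so the test does the explaining.
        public void testCalendarMonthsAreZeroBased() {
            Calendar christmas =
                    new GregorianCalendar(2004, Calendar.DECEMBER, 25);
            assertEquals("months are zero-based, so December is 11",
                    11, christmas.get(Calendar.MONTH));
        }
    }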

I still think there is a lot of potential in using tests as documentation, but it isn’t as simple as taking the tests we write today and turning them, in their original form, into project documentation. I encourage developers and testers to look at tests as documentation, and to think about how tests might replace wasteful documentation.

I learned a lot from the workshop, and it changed my thinking about tests as documentation. Right now, I’m personally thinking more about the “test the tests” idea than about using tests as project documentation.