Category Archives: testing

Virtual Bugs

Mark McSweeny and I were talking about some of the challenges facing conventional testers on agile projects. One such challenge is what to do with bugs found during development, for example during Test-Driven Development when a tester and developer are pairing. It doesn’t seem fair to the developer to formally log bugs on a story before development on it is complete. Many of those bugs will be moot once the story is done, but some might squeak through. How do we keep track of them in a constructive way?

When they are pairing and developing tests, new test cases are added as they are generated, and code is added to make them pass. Sometimes, though, the bugs found during story development can overwhelm the developer and impede progress. At this point the developer and tester pair can split up, and the developer can pair with another developer to work toward completing the story. As a tester, what do I do with the bugs we discovered but couldn’t get unit tests finished for?

On small teams, I keep a running tab of these bugs in my notes. When the story is complete, I check my notes and test those scenarios first, then log the ones that weren’t fixed during the story as “bug stories”. This works when there are only a few developers and I’m the only tester, but it doesn’t scale well. On slightly larger teams, I have used a wiki to record these bugs, which the other testers also reviewed and used. When they tested a completed story, they would check the wiki first for any of these bugs, and any that weren’t addressed during development were then logged as bug stories or in a fault-tracking system. This creates separate classes of bugs tracked in separate places, which brings its own problems: it was hard to keep the two systems, the wiki and the bug tracker, in sync.

As I was describing some of the problems I’ve come across with bugs found during story development, Mark cut through my usual verbosity and said I was describing “virtual bugs”. This is a lot more concise than my five-minute hand-waving explanation of this different class of bugs.

I have started calling the bugs found during story development “virtual bugs”. My question to other conventional testers on agile projects is: “How do you deal with virtual bugs?” Please share your experiences.

Traits of Good Testers

People frequently ask me what to look for in candidates when hiring testers. Michael Hunter, “The Braidy Tester”, has a great post on The Hallmarks of a Great Tester. He has some good thoughts there, and I recommend checking it out.

I would add honesty, integrity, and courage to the list. Testers seem to end up in situations where ethical concerns arise. A tester needs to know what their ethics are, and to stand by them. As Lesson 227 of Lessons Learned in Software Testing says:

You don’t have to lie. You don’t have to cover up problems. In fact, as a tester, the essence of your job is to expose problems and not cover them up. You don’t have to, and you never should compromise your integrity. Your power comes from having the freedom to say what’s so, to the people who need to hear it. If you compromise your reputation for integrity, you’ll weaken one of the core supports of your power.*

Discovering these traits in a potential candidate can be difficult, but there are effective ways of interviewing to tell if someone is a great tester or not. Check out Johanna Rothman’s Hiring Technical People blog for more information on hiring techies.

*Excerpted from p.200 of Lessons Learned in Software Testing.

MythBusters and Software Testing

I enjoy the Discovery Channel show MythBusters. For those who may not be familiar with it, the show is hosted by two film-industry techies, Adam Savage and Jamie Hyneman, who test popular urban legends to see whether they are plausible. It’s almost like an extreme Snopes. They take a popular urban legend, design a way to recreate it, and test to see whether they can disprove it.

What I like about the show is the process they follow when testing whether a myth is plausible. Once a myth has been selected, they treat it like a theory (or hypothesis) and try to disprove it; if they can’t, the myth holds up. They design and build tools to simulate a particular environment, and use observation to see whether the myth survives. They usually improvise to create the conditions needed to test the myth: they may not have the exact tools or objects at hand, so they design objects to get the desired effect, and measure key elements to make sure those objects will help create the conditions that are needed.

The process the MythBusters follow isn’t perfect, and the simulations they create are not necessarily identical to the original objects in the myth, but the environments and tests they create and execute (as software testers often find on their projects) are generally good enough.

I think they would make great software testers. What they do reminds me of software testing. We often have to build specialized tools to simulate appropriate conditions, and we use observation and measurement tools to test whether a theory can reasonably be shown to be false under certain conditions. If we can demonstrate that the theory is falsifiable,1 we gather the data from our observations and show how the theory was found false under those conditions. For software testers, this data is usually gathered into a bug report or bug story card. On the show, they declare a myth plausible or “busted”.

What’s more, the MythBusters have that certain testing spirit that asks “What would happen if…?” and compels them to push things to the limit, and quite literally blow things up. Testers have fun pushing something to the brink, then giving it that one last shove over the edge where the server dies or the application runs out of memory. We’re having a bit of fun to satisfy curiosity, but we are also gathering important data about what happens when something is pushed to extreme limits. We don’t just enjoy watching things break; we add the observed behavior to our catalog of behaviors, so that when we see it again we can draw on those patterns and predict what might happen. Sometimes we can spot a bug waiting to happen based on what we observe just before the application blows up.

A related testing example might be a development theory we are testing, such as: “the web application can handle 30 concurrent users”. We may have only 8 people on our development team, so we can’t test 30 concurrent users with that few people. Instead, like the MythBusters, we use, develop, or modify a tool to simulate the load. We might write a test that threads several sessions on one machine to simulate the 30 concurrent users, or use a test tool designed for this purpose. If the application repeatably fails to hold up under the simulated conditions, other factors being equal, we have evidence that the theory is falsifiable. If the reverse occurs and the application holds up fine, we know the theory is not likely to be false under the conditions we ran the test. We may then alter other variables to design new test cases and observe what happens. As testers, we usually record only the results that demonstrate falsifiability, which is a bit different from the MythBusters and from others who follow the general pattern of the scientific method.
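As an illustration of that threaded approach, here is a minimal sketch in Java. The URL, the 30-user figure, and the simple pass/fail check are hypothetical placeholders; a purpose-built load-testing tool would collect far richer measurements (response times, throughput, error distributions) than this does.

    import java.net.HttpURLConnection;
    import java.net.URL;

    /** Rough simulation of 30 concurrent users requesting one page of a web application. */
    public class ConcurrentUserSimulation {

        private static final int USERS = 30;  // the number claimed in the theory under test
        private static final String PAGE = "http://localhost:8080/app/login";  // hypothetical URL

        public static void main(String[] args) throws Exception {
            Thread[] sessions = new Thread[USERS];
            final int[] failures = {0};  // shared counter, guarded by synchronization below

            for (int i = 0; i < USERS; i++) {
                sessions[i] = new Thread(new Runnable() {
                    public void run() {
                        try {
                            HttpURLConnection conn =
                                    (HttpURLConnection) new URL(PAGE).openConnection();
                            if (conn.getResponseCode() != 200) {
                                synchronized (failures) { failures[0]++; }
                            }
                        } catch (Exception e) {
                            // Treat any connection error as a failed session.
                            synchronized (failures) { failures[0]++; }
                        }
                    }
                });
            }

            // Start all simulated sessions at roughly the same moment, then wait for them.
            for (Thread t : sessions) t.start();
            for (Thread t : sessions) t.join();

            System.out.println(failures[0] + " of " + USERS + " simulated users failed.");
        }
    }

The design choice here is simply to start all the simulated sessions at roughly the same moment and count failures; a real test would also vary think time, data, and ramp-up to better approximate 30 genuinely concurrent users.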

1 Check out Chapter 2 in Lessons Learned in Software Testing for more on testing and inference theory.

Tests as Documentation Workshop Notes

At this year’s XP Agile Universe conference, Brian Marick and I co-hosted a workshop on Tests as Documentation. The underlying theme was: how are tests like documentation, and how can we use tests as project documentation? Can we leverage tests as project documentation to help minimize wasteful documentation?

Since the most up-to-date information about a product is in the source code itself, how do we translate that into project documentation? In the absence of a tool that traverses the code, translates it, and generates documentation, are tests a good place to look? Can we just take the tests we have and use them as documentation, or do we need to design tests in a specific way?

We solicited tests from workshop participants, and had sample tests developed in JUnit with the corresponding Java code, tests developed with test::unit and the corresponding Ruby code, and some FIT tests.

Brian organized the workshop in a format similar to Patterns or Writers’ workshops. This was done to facilitate interaction and to generate many ideas in a constructive way. Groups divided up to look at the tests and to try to answer the questions from the workshop description. Once the pairs and groups had worked through these questions, they shared their own questions with the group. Here is a summary of some of the questions that were raised:

  1. Should a test be written so that it is understood by a competent practitioner? (Much like skills required to read a requirements document.)
  2. How should customer test documentation differ from unit test documentation?
  3. With regards to programmer tests: Is it a failure if a reader needs to look at the source code in order to understand the tests?
  4. What is a good test suite size?
  5. How do we write tests with an audience in mind?
  6. What is it that tests document?
  7. How should you order tests?
  8. Should project teams establish a common format for the documentation regardless of test type?
  9. How do we document exception cases?

Some of these questions were taken up by groups, but not all of them. I encourage anyone who is interested to look at examples that might answer some of them and to share those examples with the community. While discussion within groups, and with the room as a whole, didn’t provide a lot in the way of answers, the questions themselves are helpful for the community to think about when considering tests as documentation.

Of the ideas shared, there were some clear standouts for me. They involve considering the reader – something I must admit I haven’t spent enough time thinking about.

The Audience

Brian pointed out an important consideration. When writing any kind of documentation, whether it is an article, a book, or project documentation, one needs to write with an audience in mind. When reviewing tests (and as a test writer myself), I notice that I don’t always write tests with an audience in mind. Often I’m thinking more about the test design than about the audience who might be reading the tests. This is an important distinction to keep in mind when writing tests if we want them to be used as documentation. Can we write tests with an audience in mind and still have them be effective tests? Will writing tests with an audience in mind help us write better tests? If we don’t write with an audience in mind, the tests won’t work very well as documentation.

What are We Trying to Say?

Another standout for me was the question of what it is that tests document. We were fortunate to have example tests for people to review. The FIT tests seemed to be easier for non-developers to read, while the developers jumped into the Ruby test::unit and JUnit tests immediately. Some testers who weren’t programmers paired with developers, who explained how to read the tests and what the tests were doing. I enjoyed seeing this kind of collaboration, and it got me thinking. More on that later. The point is, if we are writing a document, we need to have something to say. I’m reminded of high school English classes, learning how to develop a good thesis statement, and my teachers telling us we needed to find something to say.

Order is Important

Another important point that emerged about tests as documentation was the order of the tests. Thinking of tests as documentation means thinking about their order, not unlike chapters in a book or paragraphs in a paper. A logical order is important; without it we can’t get our ideas across clearly to the reader. It is difficult to read something with jumbled ideas and no consistent, orderly flow.

With regard to the audience, one group identified two different potential audiences among programmers: designers and maintainers. A designer will need a different set of tests than a maintainer, and the order in which the tests are developed will differ for each. There are also more audiences on a project than programmers, and those audiences may require a different order of tests as well.

Dealing with the “What is it that tests document?” question, one group felt that different kinds of tests document different things. For example, the unit tests the developers write document the design requirements, while the User Acceptance Tests document the user requirements. The fact that some developers seemed more at home reading the unit tests, and some testers were more comfortable reading the FIT tests, might give some credence to this: each group is used to reading different project literature and may be more comfortable with one mode than another.

Another important question was: “How do we define tests and explain what they are supposed to do?” If tests should also serve as project documentation, and not just exercise the code or describe how to exercise the product in certain ways, then what counts as a test will change according to how tests are defined for a given project.

Workshop Conclusions

I’m not sure we developed any firm conclusions from the workshop, though the group generated many excellent ideas. One workshop goal was to identify areas for further study, and we certainly met that. One idea that came up, which I’ve been thinking about for a few months, is to put more verbose meta descriptions in the automated tests: the tests would carry program-describing details in their comments, and a tool such as JavaDoc or RDoc could generate project documentation from the specially tagged test comments. I like this idea, but the maintenance problem is still there. It’s easy for the comments to get out of date, and they duplicate effort.
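As a rough sketch of what that might look like in JUnit (the ShoppingCart class, its methods, and the behaviour described are all invented for illustration), the test methods carry Javadoc comments that a documentation run over the test sources could publish:

    import junit.framework.TestCase;

    /**
     * Documents the order-total rules for a (hypothetical) ShoppingCart class.
     * Running JavaDoc over the test sources would turn these comments into
     * reader-facing documentation, though they still have to be kept in sync by hand.
     */
    public class ShoppingCartTest extends TestCase {

        /** An empty cart always totals zero; there is no minimum charge. */
        public void testEmptyCartTotalsZero() {
            assertEquals(0.00, new ShoppingCart().total(), 0.001);
        }

        /** The total is the sum of the line-item prices; no tax is applied at this layer. */
        public void testTotalIsSumOfItemPrices() {
            ShoppingCart cart = new ShoppingCart();
            cart.add("book", 10.00);
            cart.add("pen", 2.50);
            assertEquals(12.50, cart.total(), 0.001);
        }

        /** Stand-in for the real production class, included only so the example compiles. */
        static class ShoppingCart {
            private double total = 0.0;
            void add(String item, double price) { total += price; }
            double total() { return total; }
        }
    }

The trade-off is exactly the maintenance problem mentioned above: nothing forces the comments and the assertions to stay in agreement.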

Most important to me were the questions raised about the audience, and how to write tests with an audience in mind. It appears that the tests we are writing today can’t necessarily be taken on their own and used as documentation in the way requirements documents are. None of the tests we looked at sufficiently explained the product; the readers either had to consult the developer or look at the source code. This wasn’t a weakness or shortcoming of the tests, but it showed us that tests as documentation is an area that needs more thought and work.

A couple of other very interesting observations were made. One, from a developer, was that you can tell by reading tests whether or not they were generated by Test-Driven Development (TDD). Another was that having to consult the source code to figure out what the program is doing while reading the tests might be a testing smell. These observations, coupled with the tester/developer collaboration when reading tests, got me thinking in a different direction.

My Thoughts

At the end of the workshop, I found myself less interested in tests serving as documentation to replace requirements documents, project briefs, or other project documents. Instead, I started thinking about reading tests as a kind of testing technique. I started to imagine a kind of “literary criticism” we could use to test our tests. Testing our tests is hard to deal with: How thorough is our test coverage? Are our tests good enough? Are we missing anything? How do we know if our tests are doing the job they could be? I see a lot of potential in testing our tests by borrowing from literary criticism.

Brian spoke about writers’ workshops as a safe place for practitioners, peers, and colleagues to look over each other’s work before it is published. This kind of atmosphere helps writers do better work: it is a safe environment for constructive criticism before the work is published and potentially savaged by the masses if it misses something important. For a “testing the tests” technique, instead of an us-versus-them relationship built on negative criticism, we could hold test writers’ workshops to critique each other’s tests. The point is to have a safe environment in which to make the tests (and thereby the product) as solid as they can be before they are open to being “…savaged by the masses”, for example by customers finding problems or faults of omission.

Here are three areas I saw in the workshop that could potentially help in testing the tests:

  1. I saw testers and developers collaborating, and it occurred to me that explaining what you have written (or coded) is one of the best ways of self-critiquing. When explaining how something works to someone else, I find myself noticing holes in my own logic, and the other person may spot holes in what has been written as well. That editor, or second set of eyes, really helps, as pair programming has demonstrated.
  2. I heard expert developers saying they could read *Unit tests and tell immediately whether they were TDD tests or not. TDD tests, they told us, are richer by nature because they are more tightly coupled to the code. I thought there was potential for senior developers to read the tests, critique them constructively, and find potential weak spots. One could have a developer from outside the pair that did the work read the tests as a kind of test audit or editorial review.
  3. The emergence of a possible test smell (“If we have to look at the code to explain the program, are we missing a test?”) prompted me to think about a catalog of test smells that reviewers could draw on. We look for bad “writing smells” using rules of grammar, spelling, and so on; we could develop something similar for this style of review of our tests, to complement the work that has already been done in the test automation area. This would involve reading the tests to find “grammatical” errors in them (a rough sketch of one such smell follows this list).
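As a hedged illustration of the kind of “grammatical” problem such a review might flag (the Discount class, its 10% rule, and the figures are all invented), compare a test whose intent can only be recovered from the production code with one that states the rule it pins down:

    import junit.framework.TestCase;

    public class DiscountTest extends TestCase {

        // Smell: the bare method name and magic numbers force a reader back to the
        // production code to discover which rule is actually being checked.
        public void testCalc() {
            assertEquals(90.0, Discount.apply(100.0, 3), 0.001);
        }

        // Clearer: the name, the named values, and the message document the rule.
        public void testThreeItemOrdersGetTenPercentDiscount() {
            double orderTotal = 100.0;
            int itemCount = 3;
            assertEquals("10% discount applies from the third item onward",
                    90.0, Discount.apply(orderTotal, itemCount), 0.001);
        }

        /** Stand-in production class, included only so the example compiles. */
        static class Discount {
            static double apply(double total, int itemCount) {
                return itemCount >= 3 ? total * 0.9 : total;
            }
        }
    }

Both tests pass and both exercise the same code; only the second one works as documentation for a reader who never opens the production class.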

I still think there is a lot of potential in using tests as documentation, but it isn’t as simple as taking the tests we write today and turning them into project documentation in their original form. I encourage developers and testers to look at tests as documentation, and to think about how to use them to possibly replace wasteful documentation.

I learned a lot from the workshop, and it changed my thinking about tests as documentation. I’m personally thinking more about the “test the tests” idea than using tests as project documentation right now.

Javan Gargus on Underdetermination

Javan Gargus writes:

I was a bit taken aback by your assertion that the testing team may not have done anything wrong by missing a large defect that was found by a customer. Then, I actually thought about it for a bit. I think I was falling into the trap of considering Testing and Quality Assurance to be the same thing (that is a tricky mindset to avoid!! New testers should have to recite “Testing is not QA” every morning.) Obviously, the testers are no more culpable than the developers (after all, they wrote the code, so blaming the testers is just passing the buck). But similarly, it isn’t fair to blame the developers either (or even the developer who wrote the module), simply because trying to find blame itself is wrongheaded. It was a failure of the whole team. It could be the result of an architecture problem that wasn’t found, or something that passed a code review, after all.

Clearly, there is still something to learn from this situation – there may be a whole category of defect that you aren’t testing for, as you mention. However, this review of process should be performed by the entire team, not just the testing team, since everyone missed it.

Javan raises some good points here, and I think his initial reaction is a common one. The key for me is that people should be blamed last – the first thing to evaluate is the process. I think Javan is right on the money when he says that reviews should be performed by the entire team. After all, as Deming said, quality is everyone’s responsibility. What the development team (testers, developers, and other stakeholders) should strive to become is what I’ve read James Bach call a “self-critical community”. This is what has served the Open Source world so well over the years: the people are self-critical in a constructive sense, and the process they follow flows from how they interact and create working software.