
Designing a Gamification Productivity Tool

Gamification and Software Testing

I haven’t spoken about this project publicly because we never got to a public release. Software testing tools represent a tiny market, so they are incredibly difficult to fund. Some of you have asked me about gamification tools for testing, so I thought I would share this brain dump.

A few years ago, I was asked to help a development team that had significant regulatory issues and frequently accrued testing debt. The product owner’s solution was to periodically have “testing sprints” where other team members helped the overburdened test team catch up. There was just one problem: the developers HATED helping out with 2 weeks of testing, so I was asked to do what I could to help.

A couple of the senior architects at this company were very interested in Session Tester and asked me why I had put game mechanics in a testing tool. I didn’t really realize at the time that I had put game mechanics in; I was just trying to make something useful and engaging for people. So I started talking with them more about game design, and they encouraged me to look into MMOs and co-operative games. The team played games together a great deal, so I learned about the games they enjoyed and tried to incorporate mechanics from them.

I set up a game-influenced process to help structure testing for the developers, and taught them the basics of SBTM. They LOVED it, and started having fun little side contests to try to find bugs in each other’s code. In fact, they were enjoying testing so much, they would complain about having to go back to coding to fix bugs. They didn’t want to do it full time, but a two week testing sprint under a gamified, co-operative model with some structure (and no horrible boring test cases) really did the trick.

Eventually, I worked with some of the team members on a side project, and the team lead proposed creating a tool to capture what I had implemented. This was actually extremely difficult. We started with what had been done with Session Tester and went far beyond that, looking at a full-stack testing productivity tool. One of the key aspects of our approach that differed from the traditional ET and scripted testing approaches was the test quest. As I was designing this test tool, I stumbled on Jane McGonigal’s work and found it really inspiring. She was also a big proponent of the quest as a model for getting things done in the real world. We were also very careful in how we measured testing progress. Bug counts are easily gamed and involve a lot of chance. I have worked in departments that measured on bug counts in the past, and they are depressing if you are working on a mature product while your coworkers are working on a buggy version 1.0.

One thing Cem Kaner taught me was to reward testers based on approach rather than easily counted results, because they can’t control how many bugs there may or may not be in a system. So we set up a system around test quests. Many people find pure exploratory testing (ET) too free-form; it doesn’t provide a sense of completion the way scripted test case management tools do. And when you are in a regulatory environment, you can’t do ET all the time, while test cases are too onerous and narrowly focused. We were doing something else that wasn’t pure ET and wasn’t traditional scripted testing. It turns out the test quest was a perfect repository for everything that needed to be done. Also, you didn’t finish the quest until you cleaned up data, entered bugs and did other things people might find unpleasant after a test session or two. There is more here on quests: Test Quests – Gamification Applied to Software Test Execution
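
To make the quest idea concrete, here is a minimal sketch (in Python) of how a test quest might be modeled. The step names and structure are my own illustration for this post, not the actual design of our tool:

    from dataclasses import dataclass, field

    @dataclass
    class TestQuest:
        """A testing mission bundled with its unglamorous wrap-up work."""
        mission: str
        steps: dict = field(default_factory=lambda: {
            "run test sessions": False,
            "enter bugs": False,
            "clean up test data": False,
            "file session sheets": False,
        })

        def finish(self, step: str) -> None:
            self.steps[step] = True

        def complete(self) -> bool:
            # The quest only counts as done when the cleanup chores are too.
            return all(self.steps.values())

    quest = TestQuest("Explore the billing module for rounding errors")
    quest.finish("run test sessions")
    quest.finish("enter bugs")
    print(quest.complete())  # False – data cleanup and sheet filing remain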

As I point out in that post, Chore Wars is interesting, but it was challenging for sustained testing because of the different personalities and motivations of different people. So we sprinkled some ideas from ARGs (alternate reality games) into our process rather than using them as a foundation. Certain gamer types are attracted to things like Chore Wars, but others are turned off by them, so you have to be careful with a productivity tool.

We set up a reward system that reminded people to do a more thorough job. Was there a risk assessment? Were there coverage outlines? Session sheets? How were they filled out? Were they complete? What about bug reports? Were they complete and clear? I fought with the architects over having a leaderboard, but eventually I relented and we reached a compromise. Superstar testers can dominate a system like this, causing others to feel demoralized and stop trying. We decided to overcome that with chance events, which are a huge part of what makes games fun: no one could stay on top of the testing leaderboard, because they would get knocked to the bottom randomly and have to work their way back up. Unfortunately, we ran into regulatory issues with the leaderboard – while we forbade the practice of ranking employees based on the tool, this sort of thing can run afoul of labor laws in some countries, so we were working on alternatives but ran out of resources before we could get them completed.
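
As a rough sketch of that compromise – the names and the one-in-ten odds are invented for illustration – a chance event might look something like this:

    import random

    def apply_chance_event(leaderboard: list, chance: float = 0.1) -> list:
        """With the given probability, knock the current leader to the bottom."""
        if leaderboard and random.random() < chance:
            leaderboard.append(leaderboard.pop(0))  # leader drops to last place
        return leaderboard

    board = ["alice", "bob", "carol", "dave"]
    for _ in range(20):  # re-evaluate after each testing session
        board = apply_chance_event(board)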

Social aspects of gaming are a massive part of online games in particular, but board games are more fun with more people too. We set up a communication system similar to a company IRC system we had developed in the past. We also designed a way to ask for help and for senior testers to provide mentoring, and like MMOs, we rewarded people who worked together more than if they worked alone. Like developer tools, we set up flags for review to help get more eyes on a problem.

We also set up a voting system so testers could nominate each other for best bug, best bug report, or best bug video, and encouraged sharing bug stories and technical information with each other within the tool.

An important design aspect was interoperability with other tools, so we designed testing artifacts to be easily exported and incorporated into tools people already use. Rather than trying to compete or replace, we wanted to complement what testers were already doing in many organizations, and offer an alternative to the tired and outdated test case management systems. However, if you had one of those systems, we wanted to work with it, rather than against it.

Unfortunately, we ran out of resources and weren’t able to get the tool off the ground. It had the basics of Session Tester embedded in it, with improvements and a lot of game approaches mixed in with testing fundamentals.

We learned three lessons with all of this:

  1. Co-operative game play works well for productivity over a sustained period of time, while competitive game play can be initially productive, but over time it can be destructive. Competition is something that has to be developed within a co-operative structure with a lot of care. Many people shut down when others get competitive, and rewarding for things like bugs found, or bugs fixed causes people to game the system, rather than focus on value.
  2. Each team is different, and there are different personalities and player types. You have to design accordingly and make implementations customizable and flexible. If you design too narrowly, the software testing game will no longer be relevant. If design is more flexible and customizable from the beginning, the tool has a much better chance of sustained use, even if the early champions move on to other companies. I’ve had people ask me for simple approaches and get disappointed when I don’t have a pat answer on how to gamify their testing team approach without observing and working with them first. There is no simple approach that fits all.
  3. Designing a good productivity tool is very difficult, and game approaches are much more complex than you might anticipate. There were unintended consequences when using certain approaches, and we really had to take different personality and player styles into account. (There are also labour and other game-related laws to explore.) Thin layer gamification (points, badges, leaderboards) had limited value over time and only appealed to a narrow group of people.

If some of you are looking at gamification and testing productivity, I hope you find some of these ideas useful. If you are interested in some of the approaches we used, these Gamification Inspiration Cards are a good place to start.

Applying Gamification to Software Testing

I wrote an article for Better Software magazine this month called “Software Testing is a Game”, available here in PDF format. I wrote about using gamification as an approach to analyze and help make software testing more engaging, and I encouraged readers to apply some ideas from gamification to their own testing efforts. Now, why would I do a thing like that? And what do I mean by using game mechanics when we are testing? Games are all well and good, and I may enjoy them, but we are talking about serious work here – why would we make it look like a game?

Let me give you a bit of background information.

I was working with my friends Monroe Thomas and David McFadzean on product strategy when they started bringing up my gamification design ideas. I use gamification in mobile app design to help apps be more engaging for users. That doesn’t mean that I make an app look like a game; it means I use ideas from games to help make the app more interesting and easier to use. However, we weren’t talking about mobile apps, so I was a bit surprised. They pointed out that the same concepts that make gamification work in mobile apps apply to other apps – after all, David and I even wrote an article about using gaming when creating software processes. Why couldn’t I use those ideas in a product strategy meeting for something else?

Good point.

In fact, they even urged me to look at some of my prior app designs; they felt I would find gamification-style aspects in those as well, because I always worry about making apps more engaging. Once I started thinking about the implications of what they were saying, an entirely new world of possibility opened up. I felt like they had just kicked open a big door of perception for me.

But wait a minute. What is this business about games? Well, the thing with gamification is that when I use those tools correctly in an app, you don’t know they are there. I don’t put childish badges and leaderboards in a productivity app and then say: “Look! Gamification at work!” In his blog post Game Mechanics in Gamification, Andrzej Marczewski describes gamification mechanics in terms we can relate to: Desired Behavior, Motivation and Supporters.

Andrzej uses a game format to illustrate his point, but it should be obvious that these three themes are not limited to games. Where game designers shine, and where policy wonks and enterprise or productivity designers tend to fail, is in the structure around desired behavior. Too often, we just expect people to excel in a workplace environment with little support. Games, on the other hand, tickle our emotions, captivate us, and encourage us to work hard at solving problems and reaching goals.

Framing something like software testing in terms of gaming, borrowing some of its ideas and mechanics, then applying them and experimenting can be incredibly worthwhile. After all, as I state in the article, it is difficult to get people involved in software testing, and as technology becomes more pervasive and more enmeshed in our everyday lives, it has more potential to do harm. We need new people, new ideas and new approaches, and I want to figure out how to make testing more engaging for people. Why can’t effective testing be fun?

It can.

If you work on a team with me, you will notice that there is a lot of laughter, a lot of collaboration, a lot of discovery and learning. And everyone tests from time to time. Sometimes, it can be difficult to get the coders to code, the designers to design and the managers to manage, because everyone wants to test. Why is that? Well, gamification can help provide a structure to analyze what we do and learn why some things are fun and help us work hard, while others cause us to avoid them.

Speaking of analyzing something from a gamification perspective, remember in the Better Software article how I described several aspects from gaming and asked you to apply them to your testing work? Prior to writing the article, I did exactly that with a product I designed called Session Tester. Aaron West and I developed a tool to help testers capture information while using an approach called Session-Based Testing. We had high hopes for the project, but after several setbacks, it’s now dormant. However, a back-of-the-napkin analysis of the tool using a gamification approach was incredibly useful. This is what we came up with, using game concepts from Michael Wilson’s “Gamification: You’re Doing it Wrong!” presentation:

  1. Guidelines and Behaviors:
    Context and rules around the tool were hit and miss. The tool enforces the basic form of session-based testing, which helps people learn how to approach testing from this perspective. People are required to fill in the minimum information to create a session sheet. There are strategy ideas readily at hand, and the elements are easily added by using tags. The tool was helpful for teaching beginners the basic form of SBT, but we didn’t enforce the original SBTM rules as set out by James and Jon Bach. This hurt the tool’s effectiveness. While we value the ability for people to modify and adapt, we should have started with the known rules and then provided the ability to adapt, rather than designing from an adapted view. This caused confusion and controversy.
  2. Strategies and Tasks:
    Elisabeth Hendrickson’s ET Heuristics Cheatsheet is provided in the tool to help people think about strategy, and there are oblique strategies to help create test ideas using the Prime Me! button. There could be more resources added to help with strategy, and in fact a lot of the strategy work can be done outside of the tool. We could have done more feature-wise to help with strategy. Tasks can be pre-planned outside of the tool, or done on the fly and recorded with the @tasks tag, which is saved in session sheets. We could also have done more to support tasks.
  3. Risks and Rewards:
    There is a risk that you don’t have a productive session, or your session sheet is woefully inadequate. The timer was a good motivator since you run the risk of running out of time, so there was a bit of a game there with trying to beat the clock and have a focused, productive session. I designed that to be analogous to the “red bar green bar game” used in Test Driven Development tools. There is a reward inherent in getting your mission completed and having a good session sheet you can be proud to share, but it is completely intrinsic. You are also rewarded a bit with the Prime Me! button to help you get a new idea, or break a creativity log jam. We could have done a lot more to help people plan and manage risks, and add features to reward testers for using a good assortment of tags, or a peer-reference or reward system for great testing. The full bar showing once time has run out helps tickle an intrinsic reward of completion. As a tester, I did all I could in that session, and now I can move on to other things.
  4. Skill and Chance Events:
    Skilled testers often like to record what they discover, to have the freedom to investigate areas of high value, and to take pride in having a varied approach to their testing. However, there is no extrinsic reward for completing session sheets. Scoring sheets with more tags higher might have been a good option to add, to help people learn how to improve what they record (see the sketch after this list). Outside of discovering bugs, chance events are brought in by the Prime Me! button. Like rolling dice, people can click the button until an oblique strategy jiggles their brain in a different direction. The Prime Me! button is the most popular feature of the tool and is still demonstrated at testing conferences by people like Jon Bach. People find it fun and useful.
  5. Cheating and Compliance:
    Cheating: Any team that uses a test case management system will see a high degree of cheating. People just get tired of the regression tests they run over and over and start clicking pass or fail to show progress. Those systems are very easy to cheat, but a session-based approach is much more difficult to cheat, because you have to show a description of a testing session. However, there is nothing to prevent people from saving an empty session sheet. I have seen this happen on overworked teams, and it wasn’t discovered for weeks. We could have looked at flagging incomplete or blank session sheets in the system so there is visibility on them prior to an audit, or encouraging people to do something about it within the tool. Compliance was a big miss because we altered the original SBTM rules, which caused a lot of controversy and prevented more widespread adoption. We should have enforced the original rules by supporting the Bach SBTM format first, then added the ability to adapt it instead of approaching it from the other direction.
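
Here is the kind of tag scoring mentioned in point 4, sketched in a few lines of Python. Apart from @tasks, the tag names are made up; the idea is simply that sheets using a wider assortment of tags score higher:

    def score_sheet(sheet_text: str,
                    known_tags=("@tasks", "@bugs", "@issues", "@setup")) -> int:
        """Score a session sheet by how many distinct known tags it uses."""
        return sum(1 for tag in known_tags if tag in sheet_text)

    sheet = "Checked login lockout. @tasks verified timeout. @bugs filed BUG-101."
    print(score_sheet(sheet))  # 2 – room to record issues and setup notes too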

It’s interesting to note that the aspects that made this tool popular and engaging can also be viewed in terms of gaming mechanics. A couple of them were there by design, but the others were just there because I was trying to make the app more engaging. However, if we had used this gamification structure during the design of the tool, we would have had different results, and arguably a better tool, because it provides a more thorough framework. Areas of fun such as the Prime Me! button, and trying to automate some of the processes of SBTM, helped make the experience more enjoyable for our users.

However, if you didn’t look at the tool from a gaming perspective, you wouldn’t notice that there are game mechanics at play within it. This is an example of using a gamification approach that goes beyond superficial leaderboards and rewards, and I encourage you to try it not only with your testing tools, but your processes and practices in testing. Use it as a system to analyze: What is working well? Where are you lacking? It’s a useful, systematic approach.

That analysis doesn’t look like a childish game does it? Bottom line: if you aren’t a gamer, you probably won’t notice the gaming aspects I bring into testing process and tools. If you are a gamer, you’ll notice the parallels right away, and will hopefully appreciate them. For both groups, hopefully gamification will be one tool we can use to help make testing more engaging and fun.

Software Testing Training and Gaming

If you spend time at conferences, or hire a well-known testing consultant to provide some training for your company, it’s likely that one or more of them have used game mechanics as teaching tools. In fact, they probably used them on you. You may not be aware that they did, but they used gaming mechanics to help you learn something important.

James Bach is famous for using magic tricks and puzzle solving as teaching tools. When I spent time with James learning about how to be a more effective trainer, he told me that magic tricks are great teaching tools because we all love to be fooled. When we are fooled by something, we are entertained, and our mind is primed for learning about what we missed during the trick. That is an ideal state for the introduction of new ideas. If you spend any time with James or any of his adherents at a conference or peer workshop, you will likely be inundated with puzzles to solve. There is always a testing lesson to be learned at the end, and it is a novel way of helping people learn through solving a tangible problem. If you love solving puzzles and learning about testing, you’ll enjoy these experiences.

Dorothy Graham has a board game that she developed for testing tutorials. It’s a traditional style game that she created as a training aid, and Dot loves to deliver this course. The tutorial attendees have a lot of fun, and they learn some important lessons, but Dot admits she may even have more fun than they do. Dot loves training, and the game takes the entertainment value of learning up a few notches. I’ve taught next door to Dot and heard attendees as they play the game and learn with her, and I’ve seen their smiling faces during breaks and after the course. There is something inherently positive about using a real, physical game, designed for a specific purpose (and fun) in this way.

Fiona Charles and Michael Bolton also created a board game for a software development game workshop they facilitated in 2006. Fiona says:  “Our experience with the game highlighted the power of games and simulations in teaching: their ability to teach the participants (and the teachers) more than was consciously intended.”

Ben Simo uses a variation on a board game. I’m not going to give it away, since it’s highly effective, but he used it on me when I was moving from a dabbler in performance and load testing to working on some serious projects. Ben is an experienced and talented performance tester, and he has taught a lot of people how to do the job well. Ben spent hours with me using pieces from a board game, and posing problems for me and having me work on solving them. It was highly interactive, was chock full of performance testing analysis lessons, and we enjoyed working together on it. He would set up the scenario, enhanced by the board game, and I would work on approaches to solve it. I had about 15 pages of notes from this game play activity to take back and apply to my work on Monday. After playing this training game with Ben, I had much more confidence and I was able to spot far more performance anomaly patterns than I had prior to working with him. (We worked through this in a hotel lounge, and we got a lot of weird looks. We didn’t care, we were having fun! Besides, channeling Ralph Wiggum: I was “learnding”!)

James Lyndsay developed a fascinating course on exploratory testing, and with it, simple “black box test machines” that he developed in Flash to aid in experiential learning. These machines have no text on them, and they are difficult to start using, because there are no outward signs of what they are for. This is done on purpose, and each machine helps each class participant experience the lesson through their own exploration and discovery. This is one of my favorite game-like experiences in a testing training course. The machine exercises remind me of a puzzle adventure game. One of my favorites of this type of game is Myst. You have to explore and go off of your observations and clues to figure out what to do, and the possibilities for application and experience are wide open. James managed to create 4 incredibly simple programs that can replicate this sort of game experience during training. Simply brilliant.

Those of you who follow Jerry Weinberg, or the many consultants who have been influenced by him have likely worked through simulations during a workshop or tutorial. Much like an RPG (role playing game), attendees are organized around different goals, roles, activities and tasks to create an improvised simulation of a real-life problem. This involves drawing on improvisation, your “pretending” skills and applying your problem solving techniques in a different context than a work context. Many people report having very positive experiences and “aha!” moments when learning from these sorts of activities.

Another theme in Jerry’s work with people is physical activity. Jerry gets people to move around, and he can influence the mood of the room by adding physical activity to a workshop. In the book The Gift of Time, Fiona Charles shares a poignant story about Jerry using a movement activity to calm down a room full of people during a workshop when they first learned about the events of September 11. Michael Bolton has told me several stories of how Jerry changes the learning dynamic by getting people to move and work in different parts of the room, or by grouping people and having them move and work with others in creative combinations. Movement is a huge part of many games, especially sports and outdoor activities, and it gets different parts of our brain working. If you couple movement with learning concepts, it brings together more of your senses to help with concept retention. It is also associated with good health, a sense of well-being and fun.

(Speaking of experiential learning, pretty much everyone I have mentioned here, including me (and a lot more trainers you have heard of), has been influenced either directly or indirectly by Jerry Weinberg’s work on experiential learning. He even has a series of books on the topic on Leanpub. The first one: Experiential Learning: Beginning, the second: Experiential Learning: Inventing and the third: Experiential Learning: Simulation.)

There are other examples of trainers using game structures in software testing, and I’ve probably missed some obvious ones. (I haven’t even told you about the ones I use, but that doesn’t matter.) These are some good examples off the top of my head that demonstrate the use of game mechanics in teaching.

I wanted to point out that each of them uses game mechanics to teach serious lessons. While people may have fun, they come away with real-world skills that they can apply to their work as soon as they are back in the office.

Don’t be turned off by the term “game” when it comes to serious business – if you look at gaming with an open mind, you’ll see that it is all around us, being used in effective ways.

Did I miss a good software testing training gaming example? Please add it in the comments.

Edit: I just discovered an interesting post on games and learning on the testinggeek.com blog: Software Testing Games – Do They Help?

Test Automation Games

As I mentioned in a prior post: Software Testing is a Game, two dominant manual testing approaches to the software testing game are scripted and exploratory testing. In the test automation space, we have other approaches. I look at three main contexts for test automation:

  1. Code context – e.g. unit testing
  2. System context – e.g. protocol or message level testing
  3. Social context – e.g. GUI testing

In each context, the automation approach, tools and styles differ. (Note: I first introduced this idea publicly in my keynote “Test Automation: Why Context Matters” at the Alberta Workshop on Software Testing, May 2005)

In the code context, we are dominated now by automated unit tests written in some sort of xUnit framework. This type of test automation is usually carried out by programmers who write tests to check their code as they develop products, and to provide a safety net to detect changes and failures that might get introduced as the code base changes over the course of a release. We’re concerned that our code works sufficiently well in this context. These kinds of tests are less about being rewarded for finding bugs – “Cool! Discovery!” and more about providing a safety net for coding, which is a different high value activity that can hold our interest.
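
For readers who haven’t seen one, here is what such a safety-net test looks like in Python’s unittest, an xUnit-style framework. The function under test is invented for this example:

    import unittest

    def apply_discount(price: float, percent: float) -> float:
        """Toy function under test – invented for illustration."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        return round(price * (1 - percent / 100), 2)

    class ApplyDiscountTest(unittest.TestCase):
        def test_typical_discount(self):
            self.assertEqual(apply_discount(100.0, 25), 75.0)

        def test_invalid_percent_rejected(self):
            # The safety net: if someone later removes the guard, this fails.
            with self.assertRaises(ValueError):
                apply_discount(100.0, 150)

    if __name__ == "__main__":
        unittest.main()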

In the social context, we are concerned with automating the software from a user’s perspective, which means we are usually creating tests using libraries that drive a user interface, or GUI. This approach to testing is usually dominated by regression testing. People would rather get the tool to repeat the tests than deal with the repetition inherent in regression testing, so they use tools to try to automate that repetition. In other words, regression testing is often outsourced to a tool. In this context, we are concerned that the software works reasonably well for end users in the places that they use it, which are social situations. The software has emergent properties at this level, combining code, system and user expectations and needs. We frequently look to automate away the repetition of manual testing. In video game design terms, we might call repetition that isn’t very engaging “grinding”. (David McFadzean introduced this idea to me during a design session.)

The system context is a bit rarer: we test machine-to-machine interaction, or simulate various messaging or user traffic by sending messages to machines without using the GUI. There are integration paths and emergent properties that we can catch at this level that we would miss with unit testing, and by stripping the UI away, we can create tests that run faster and track down intermittent or other bugs that might be masked by the GUI. In video or online games, some people use tools to help enhance their game play at this level, sometimes circumventing rules. In the software testing world, we don’t have explicit rules against testing at this level, but we aren’t often rewarded for it either. People often prefer we look at the GUI, or the code level of automation. However, you can get a lot of efficiency by testing at this level and cutting out the slow GUI, and we can explore the emergent properties of a system that we don’t see at the unit level.
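
As a sketch of what a system-context check might look like – the endpoint, payload and expected response are all hypothetical – we drive the service at the message level, with no GUI in the way:

    import requests

    def test_order_message_roundtrip():
        """Post a message straight to the service, below the GUI."""
        payload = {"order_id": 42, "quantity": 3}
        response = requests.post("http://test-server.local/api/orders",
                                 json=payload, timeout=5)
        assert response.status_code == 201        # order accepted
        assert response.json()["order_id"] == 42  # echoed back correctly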

We also have other types of automation to consider.

Load and performance testing is a fascinating approach to test automation. As performance thought leaders like Scott Barber will tell you, performance testing is roughly 20% automation code development and load generation work, and 80% interpreting results and finding problem areas to address. It’s a fascinating puzzle to solve – we simulate real-world or error conditions, look at the data, find anomalies and investigate the root cause. We combine a quest with discovery and puzzle-solving game styles.

If we look at Test-Driven Development with xUnit tools, we even get an explicit game metaphor: the “red bar/green bar game.” TDD practitioners I have worked with use this to describe the cycle of red bar (test failed), green bar (test passed) and refactor (improve the design of existing code, using the automated tests as a safety net). I was first introduced to the idea of TDD being a game by John Kordyback. Some people argue that TDD is primarily a design activity, but it also has interesting testing implications, which I wrote about here: Test-Driven Development from a Conventional Software Testing Perspective Part 1, here: Test-Driven Development from a Conventional Software Testing Perspective Part 2, and here: Test-Driven Development from a Conventional Software Testing Perspective Part 3.
(As an aside, the Session Tester tool was inspired by the fun that programmers express while coding in this style.)
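
The rhythm of the game looks something like this, sketched as a single pytest-style file; slugify is a toy example of my own, not something from those articles:

    # Red: write a failing test first. The bar goes red because slugify
    # doesn't behave (or even exist) yet.
    def test_slugify():
        assert slugify("Hello, World!") == "hello-world"

    # Green: write just enough code to make the bar go green.
    import re

    def slugify(text: str) -> str:
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # Refactor: improve the design, re-running the test as a safety net.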

Cem Kaner often talks about high volume test automation, which is another approach to automation. If you automate a particular set of steps, or a path through a system, and run it many times, you will discover information you might otherwise miss. In game design, one way to deal with the boredom of grinding is to add surprises or rewarding behavior when people repeat things. That keeps the repetitiveness from getting boring. In automation terms, high volume test automation is an incredibly powerful tool to help discover important information. I’ve used it particularly in systems that do a lot of transactions. We may run a manual test several dozen times, and maybe an automated test several hundred or a thousand times in a release. With high volume test automation, I will run a test thousands of times a day or overnight. This greatly increases my chance of finding problems that only appear in very rare events, and forces seemingly intermittent problems to show themselves in a pattern. I’ve enhanced this approach by mutating messages in a system using fuzzing tools, which greatly extends my reach as a tester beyond both manual testing and conventional user or GUI-based regression automation.
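
A bare-bones sketch of the approach in Python – process_transaction() is a stand-in for a real system under test, and the mutation is deliberately crude:

    import random
    import string

    def process_transaction(message: str) -> None:
        """Stand-in for the real system under test (illustration only)."""
        if "\x0c" in message:  # pretend a rare control character trips a bug
            raise ValueError("parser choked on a control character")

    def mutate(message: str) -> str:
        """Randomly corrupt one character to probe rare failure paths."""
        i = random.randrange(len(message))
        return message[:i] + random.choice(string.printable) + message[i + 1:]

    failures = []
    base = "TXN|id=1001|amount=25.00|currency=CAD"
    for n in range(10_000):  # run overnight at much higher volumes
        candidate = mutate(base) if n % 2 else base
        try:
            process_transaction(candidate)
        except Exception as exc:
            failures.append((n, candidate, exc))
    print(f"{len(failures)} failures in 10,000 runs")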

Similarly, creating simulators or emulators to generate real-world or error conditions that are impossible to create manually is a powerful way to enhance our testing game play. In fact, some of the other approaches I have written about are also about enhancing manual testing game play. I wrote about “interactive automated testing” in my “Man and Machine” article and in Chapter 19 of the book “Experiences of Test Automation“. This was inspired by looking at alternatives to regression testing that could help testers be more effective in their work.

In many cases, we attempt to automate what the manual testers do, and we fail because the tests are much richer when exercised by humans, because they were written by humans. Instead of getting the computer to do things that we are poor at executing (lots of simple repetition, lots of math, asynchronous actions, etc.), we try to do an approximation of what the humans do. Also, since the humans interact with a constantly changing interface, the dependency of our automation code on a changing product creates a maintenance nightmare. This is all familiar, and I looked at other options. However, another inspiration was code one of my friends wrote to help him play an online game more effectively. He created code to help him do better in gaming activities, and then used that algorithm to create an incredibly powerful business intelligence engine. Code he wrote to enhance his manual game play in an online game turned out to be just as powerful when he applied it to a business context.

Software test automation has a couple of gaming aspects:

  1. Automating parts of the manual software testing game we don’t enjoy
  2. Its own software testing game, based on our perceived benefits and rewards

In number 1 above, it’s interesting to analyze why we are automating something. Is it to help our team and system quality goals, or are we merely trying to outsource something we don’t like to a tool, rather than looking at what alternatives fit our problem best? In number 2, if we look at how we reward people who do automation, and map automation styles and approaches to our quality goals or quality criteria – not to mention helping our teams work more efficiently, discover more important information, and make the lives of our testers better – there are a lot of fascinating areas to explore.

Software Testing is a Game

David McFadzean and I wrote an article for Better Software magazine called The Software Development Game, which was published in the September/October edition. We describe applying game-like structures and approaches to software development teams. I’ve been asked for another article on applying game-like approaches to testing, so look for more in this space.

In the meantime, here are some of my current thoughts.

How is software testing like a game? If we think of software development as a game, or a situation involving cooperation and conflict among different actors with different motivations and goals, software testing fits as a game within the larger software development game (SDG). While the SDG focuses more on policy, practices and determining an ideal mix of process, tools and technology, software testing is often an individual pursuit within a larger game format. There are two distinct styles of playing that game today: scripted testing and exploratory testing.

In the scripted testing game, we are governed by a test plan and a test case management system. We are rewarded for going through a management system, repeating test scripts and marking them as passed or failed. We are measured on our coverage of the tests that are stored in this test case management system. It is very important to stakeholders that we get as close to 100% coverage of the tests in that system as possible. This is a repetitive task, in a highly constricted environment. Anything that threatens the test team goal of high coverage of the tests within a test case management system tends to be discouraged, sometimes just implicitly. The metrics are what matter most, so any activity, even more and better testing, will be discouraged if it takes away from reaching that objective. “You, tester, are rewarded on how many test cases you follow and mark as pass or fail in the system. If there is time after we get this number, you may look at other testing activities.”

In the exploratory testing game, testers are given some degree of self-determination. In fact, the latest version of the definition of exploratory testing emphasizes this:
“a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the quality of his/her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.”

The game player in exploratory testing is rewarded on the quality of their work, their approach, and the quality of the information that they provide. Exploratory testing works by being adaptable and relevant, so traditional ideas of metrics are often downplayed in favor of qualitative information. If a tester changes their mind when they are working on an assignment, and they have good reason to do so, and can make their case defensible, they are rewarded for changing things up because it helps make the product and our approach better. Coverage is important, but testers are rewarded on using multiple models of coverage (and discovering important new ones) rather than following the regression tests and counting that as complete coverage.

When I think about exploratory testing, I am reminded of the game Nomic, where changing the rules is considered a valid move. Instead of a group effort like Nomic, a tester has the freedom to change their own course, and to help improve the approach of the overall team by demonstrating excellent work. “You, tester, are rewarded for excellent testing, and for your skill and ability to find and report important problems. We don’t care if you work through one test, or a thousand, to get that information. The information you provide and how it improves the quality and value of our products is what we value.” It is deeply important for exploratory testers to be able to adapt to changing conditions, to uncover important information to help the team create value with their product, and, for many, to get the endorsement and respect of their peers.

Now think of game play activities. Both scripted and exploratory testing approaches involve repetitive work. In the gamification space, we can explore these activities and how we can apply game concepts to enhance the work. One area we can explore is video games. Imagine we are playing a massively multiplayer online role-playing game (MMORPG). Some players like to perform repeated tasks that are unrelated to individual and shared game objectives. They like to repeat tasks to unlock features related to their character, such as acquiring new outfits or accessories.

Other players are very achievement-oriented – they try to reach individual goals, gain more points and unlock more achievement-based features in the game, and learn that when they co-operate with others, they can achieve even more. One player type gets rewarded more for repeated individual tasks, while the other gets rewarded for trying to create achievement or quest value for themselves and others. The two types blend, as any player will have a subgoal of changing the appearance of their character, or acquiring accessories or currency to enhance quest or other achievement game play. People who are less adventurous and are more caught up in acquiring personal character attributes will also take part in quest events.

If you play these games and try to collaborate, you will often find other players who are at different points on this spectrum. It can be frustrating when your team members are more interested in building up their own character’s numbers than in helping the team as a whole. For example, if you are off on a raid, and one or more team members constantly stop to pick up health or currency points, your team will be let down and may have to regroup. It can be difficult to work together, since the perceived value of the game and the reward structures don’t match. One group wants personal metrics and accessories, while the other cares more about non-character pursuits and the respect of their peers.

In the software testing game, it is difficult to play an exploratory testing game in a system that is rigid, restrictive, and rewards testers for playing a scripted game. Conversely, those who are used to being rewarded for acquiring coverage counts in test case management systems can feel very threatened when they are instead rewarded on the quality of their testing skills and the information they provide. Testing tradition and common practice tend to think of testing as one sort of game, in which a test case management system is central. What they fail to realize is that there are many ways of playing this game, not just one. I want to explore these other alternatives in more detail through a gaming lens.

Testers and test managers, there is a lot more to our world than scripted testing and test case management tools. They are part of an ancient game (from at least the 1970s) which is struggling and showing its age in the face of new technology and the new approaches that mobile technology and other styles of work are bringing. Modern testers work hard to play a game that fits the world around us. Are you measuring and rewarding them for playing the right kind of game, or the one you are most used to? Like the video game character who has gathered metrics and accessories, are you gathering these at the expense of the rest of the team’s goals, or are these measures and rewards helping the entire software development team create a better product?

Watch for more in this space as I explore software testing from the perspective of gamification, game theory and other game activities more deeply.

Creating a Test Lab for Mobile Apps Testing – Part 3

In parts one and two of this series, I introduced some ideas on building a physical test lab for testing mobile apps. In part 3, I’m going to talk about devices. This is all based on my own experience.

Purchasing Devices

This is the most difficult aspect of setting up a test lab. I’m not going to make any hard and fast recommendations, because by tomorrow the market will probably change. You will need to research and adjust depending on what your current needs are.

Emulators are fine for basic functional testing, but you’re going to need to do testing on the real thing, whether your focus is on automated or manual testing. The devices have sensors and touch interaction, as well as network and other physical aspects that need real-world testing.

You will need to consider the following:

  • both smartphones and tablets
  • different hardware and operating systems or versions, per framework
  • devices with different screen sizes
  • different wireless broadband carriers
  • cables and other accessories for charging and syncing (never hurts to have extras)

Each of these areas will yield different results when testing. For example, I frequently test with a smartphone and a tablet at the same time, and it’s amazing how the app can behave or appear differently on one or the other.

Also remember that a platform for testing is a combination of a hardware model and an operating system version – plus, if your app uses network connections, wifi or wireless broadband, the carrier, and any special software carriers install.

Purchasing devices is a balancing act, because the market is so dynamic. You will need to get a reasonable number of devices for testing the real thing, but you can’t buy them all. I look at the following to help guide what I should buy:

  • What platforms are our customers using?
  • What are the most popular smartphones and tablets on the market?
  • What are the most popular mobile browsers?
  • What platforms are known to be problematic? (aka. problem child devices)
  • When is the newest thing coming out? Do we need it for testing?

Figuring out what to do with this information can be challenging. Here is one example. If you don’t know what your target customer base is using, you can start with information you already have. If you already have a web presence, or a web app, find out your web traffic stats. Are people using mobile browsers? What brand is the most popular? Can you get a visual breakdown? This information will usually break down according to brand, such as Android or Apple iOS, but it won’t give you specifics, such as an iPad with 1st gen. hardware running iOS version 5.1.1. But if you know that Apple iOS devices are the most popular, you can find out what the most popular devices on the market are, and assume that your customer breakdown is likely very similar. Look at a chart of popular versions, and try to replicate that in your lab – i.e. stock up on the more popular devices, with fewer of the less popular devices. Not perfect, but it’s a good heuristic.
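
As a tiny sketch of that heuristic – the share numbers below are invented, so substitute your own traffic stats or market data:

    # Turn platform-share percentages into device purchase counts.
    share = {"iOS": 0.55, "Android": 0.35, "Windows Phone": 0.06, "BlackBerry": 0.04}
    budget_devices = 20  # how many devices you can afford this round

    plan = {platform: round(pct * budget_devices) for platform, pct in share.items()}
    print(plan)  # {'iOS': 11, 'Android': 7, 'Windows Phone': 1, 'BlackBerry': 1}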

If you are looking at mass market applications, there is some information out there to help. For example, this chart by Net Marketshare would indicate we should focus a good deal of our effort on iOS device testing. Here is some more information to help with proportions of devices to purchase: comScore Reports May 2012 U.S. Mobile Subscriber Market Share. Akamai has an interesting report on mobile web browser breakdowns here: Akamai IO.

Notice that these reports are already out of date. This information is difficult to get, and I doubt anyone really knows exact numbers. It’s important to research and look at a lot of data to help inform you. When you spot trends, that should help you with proportions.

Some people think that Apple devices are much easier because there are fewer hardware devices, and they are all made by one company. However, even with Apple, you can end up with a lot of platform combinations. Take the iPhone for instance. Let’s say we are going to have 3 devices: an iPhone 3GS, an iPhone 4, and an iPhone 4S. Since Apple encourages people to upgrade their OS, we have decided to stick with iOS 5, the latest major version. At the time of writing, there are 4 iOS 5 versions out there: iOS 5, iOS 5.0.1, iOS 5.1 and iOS 5.1.1. Also assume our app requires an internet connection. If we choose 3G, the most popular wireless broadband networking type, and wifi, then we have 24 platform combinations. What if we need to test on 4G too? Add 12 to the mix for a total of 36. That can be a lot of regression testing. If you want to optimize, you may decide to eliminate everything but the latest version, 5.1.1. If you do that, try to find something that gives you an indication of upgrade proportions, like this information: iOS 5.1.1 Upgrade Stats. You may not want to upgrade all of your devices at once, but keep some back to mimic what is happening in the real world.
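
If you want to check that arithmetic, or enumerate the combinations for a test matrix, a few lines of Python will do it:

    from itertools import product

    devices = ["iPhone 3GS", "iPhone 4", "iPhone 4S"]
    ios_versions = ["5", "5.0.1", "5.1", "5.1.1"]
    networks = ["3G", "wifi"]

    combos = list(product(devices, ios_versions, networks))
    print(len(combos))  # 24; adding "4G" to networks brings it to 36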

Here is a recent plan for an iOS test lab:
Apple iOS (all on the latest, 5.1.1):

  • (1) iPhone 3GS
  • (2) iPhone 4
  • (3) iPhone 4S
  • (1) iPod touch (they aren’t as powerful, and are great for finding problems you don’t see on other devices)
  • (1) iPad first gen
  • (2) iPad second gen
  • (3) iPad third gen
  • (at least 1) whatever the newest iOS device is when it comes out (these are always hugely popular)

Android, Windows Phone and BlackBerry are more difficult, not just because of the different handset manufacturers, but because devices are spread across more versions of each OS, or OS family. For example, Android publishes a nice graphic of its Android OS Distribution. When setting up a lab for Android devices, I would try to replicate that pie chart in my own device distribution. For older platform families like Windows and BlackBerry, you will have even more variations to consider, but temper that with how popular they are in your target market.

For mass market applications, or web testing, Brad Frost has a fabulous blog post, Test on Real Mobile Devices without Breaking the Bank, and this blog post from the Mobile in Higher Ed blog is also very useful.

It sounds expensive, and it can be, but compared to test workstations and servers, it isn’t that bad. You can get multiple test devices for the price of a single test server when you take hardware, OS and required software into account. You will have to research using your favorite search engine, and you’ll need to stay on top of new developments, because you will regularly need to acquire new devices.

Looking for ideas on how to test mobile devices? Check out my I SLICED UP FUN mnemonic to help generate ideas.

Creating a Test Lab for Mobile Apps Testing – Part 2

In Part 1 of this series, I introduced some ideas on building a physical test lab for testing mobile apps. In part 2, I expand on those ideas.

Storage

You will want a clean, dry, secure storage area for devices to be kept in when not in use. I’ve used a locking drawer with toolbox liner at the bottom to protect the cases and keep the devices from moving around. Here is an example: toolbox liner. The devices are small and disappear easily, so secure storage is a must.

Set up a sign-out sheet with each device noted by OS, version and serial number. Also create tags for each power/sync cable for each device so they can be tracked if they leave the test lab. I used a label maker for this. There is nothing worse than losing your hard-earned test devices to other groups, or not having devices around when you need them. Cables disappear the most, so you need to make your lab’s ownership visible. Think of the washroom key at a service station on the side of a busy road – the small key is often attached to something larger to make it more visible and to help prevent it from getting lost. I always chuckle when I see a key on a chain with a hubcap attached, and I try to make my lab cables visible in some equally identifiable way.

Power Station

You will need access to power to charge the devices so they are available for testing when you are. It’s incredibly frustrating when you are getting ready to test on a device and find out that the battery is dead. If you can’t get power into your device storage area, set up a charging station in the lab.

Set aside counter or desk space near a power source, and add the following:

  • at least one power bar to plug in multiple devices
  • several power cables for the devices
  • toolbox liner or something similar on the surface to keep them from moving around

Lab PCs

You will need computers to manage the devices for testing, troubleshooting and other tasks. Consider at least one machine for each platform family. Here are four popular platforms. Your team will have their own mix of devices that they are developing for. Get the most powerful workstation you can afford for each platform; mobile app development tools can be large and complex.

iOS (Apple)

You will need at least one Mac for emulators and device management, and for pulling screenshots and other information from the devices. Install Xcode for emulator use. Also consider installing the Network Link Conditioner to help simulate different network connection conditions with an emulator. Install the iPhone Configuration Utility for getting stack traces, managing licenses, etc.

Android

You will need at least one machine with the Android SDK installed for managing Android devices and emulator use. Many developers use this within the Eclipse IDE, so talk to them about the correct combination of tools for your shop. The operating system doesn’t matter for Android development.

Windows Phone

If you are supporting Windows Phones, you will need a Windows machine with at least Windows 7 and the Windows Phone SDK and related tools.

BlackBerry

The new BlackBerry SDK supports several languages and implementations, so you’ll need to find out which version to install, and what other libraries are needed. I don’t think the OS matters, but I have seen installation instructions recommending at least Windows XP or Vista.

Wireless Broadband

Data plans or other contracts for monthly wireless broadband access are important for testing. I would get a reasonable plan for each device, from at least two local telecom providers. Try a larger provider and a smaller provider (like a discount provider) to get more variation in wireless broadband technology. There is huge diversity in what different carriers use to deliver service to the end user, so try to get some variation.

Other

You will need internal test email addresses, and phone numbers for each smartphone for testing. Also get at least one developer license for each platform your team supports for your test lab. You will need test IDs with the Apple App Store for making purchases or installing apps from the store. You will likely need to do this with other stores as well, such as Google Play, for each device type you have.

Looking for ideas on how to test mobile devices? Check out my I SLICED UP FUN mnemonic to help generate ideas.

Creating a Test Lab for Mobile Apps Testing – Part 1

Some of you have asked me to expand on the physical aspects of mobile app development that I described in my article: Mobile Challenges for Project Management. If we have a testing burden when we develop mobile apps, what sort of considerations should we take into account when we are setting up test labs, when using real devices? Here are some pointers from my own experience.

This is part 1 of a 3 part series.  Feel free to add comments for anything I’ve missed.

The Room

You’ll need space where you can set up PCs for managing devices and using productivity tools, a place to store devices when they aren’t in use, space for charging devices, and plenty of space for people to move around with the devices while testing. (The mobility side of mobile devices needs to be tested.)

Network

Wifi

I prefer to have at least two dedicated test wifi access points to test network connections and transitions between wifi networks, and from wifi to wireless broadband. I prefer small, cheap devices with weaker signal strength, because that makes it easier to set up test conditions for apps that require network connections. I like to have these for test purposes only, and it is better that they have weaker signals (so testers can walk away and get weaker wifi for testing) and can be turned off or unplugged for testing outages, etc. Also, since they are smaller and weaker, you can set up two in a lab and transition between them by moving around the lab.

Simulate Dead Spots

If you don’t have an office dead spot (no wireless broadband or wifi connectivity), or one that is convenient for testing, you may need to set something up yourself. If you have an elevator or other known dead spot near your office, that works well, but in some buildings you won’t have one.

Consider building a Faraday cage to test for dead spots. Try doing network/web searches or submissions within a dead spot, or transitioning into or out of one. You can use an old microwave oven, but make sure you cut off the power cord so it can’t ever be powered on accidentally. They are simple to use: start a web search, a form submission, or any action that requires a network connection, then put the device into the microwave. It will instantly be in a dead spot.

You can also buy them from providers, or make one out of wood and wire mesh:

  • build a box out of 2×4’s, hinges and wire mesh
  • create a door in the front
  • wrap the box with wire or brass mesh
  • attach a ground wire to the box

There are different instructions online on how to make them.

To test outages or performance issues, and to monitor and measure performance, page sizes, security, and other things, you may want your own network for testing. It’s a good idea to have a separate test network – whether it is a physically separate network or a virtual network – so you can quickly create scenarios and outages, or easily monitor mobile app traffic, without interrupting other work. Your administrators can figure out the implementation.

Stay tuned for parts 2 and 3.

Looking for ideas on how to test mobile devices? Check out my I SLICED UP FUN mnemonic to help generate ideas.

The Secrets of Faking a Test Project

Here is the slide deck: Secrets of Faking a Test Project. This is satire, but it is intended to get your attention and help you think about whether you are creating value on your projects, or merely faking it by following convention.

The Back Story

Back in 2007 I got a request to fill in for James Bach at a conference. He had an emergency to attend to and the organizers wondered if I could take his place. One of the requests was that I present James’ “Guide to Faking a Test Project” presentation because it was thought-provoking and entertaining. They sent me the slide deck and I found myself chuckling, but also feeling a bit sad about how common many of the practices are.

I couldn’t just use James’ slides because he has an inimitable style, and I had plenty of my own ideas and experiences to draw on that I wanted to share, so I used James’ slide deck as a guide and created and presented my own version.

This presentation is satirical: we challenge people to think about how they would approach a project where the goal is to release bad software while making it look as if you really tried to test it well. It didn’t take much effort on our part; we just drew on typical, all-too-common practices that are often the staple of how people approach testing projects, and presented them from a different angle.

I decided to release these slides publicly today because, almost 5 years after I first gave that presentation, this sort of thing still goes on. Testers are forced into a wasteful, rigid process of testing that rarely creates value. One of my colleagues contacted me recently; she is on her first software testing project. She kept asking me about practices that seemed to her completely counter-productive to effective testing, and whether that was normal. Much of what she has described about her experiences is straight out of that slide deck. In this case, I think the approach is naive rather than deliberate: these are not blatant charlatans knowingly faking it, just managers far more worried about meeting impossible deadlines than about finding problems that might take more time than is allocated. Sadly, the outcome is the same.

If you haven’t thought about how accepted testing approaches and “best practices” can be viewed from another perspective, I hope this is an eye opener for you. While you might not think it is particularly harmful to a project to merely follow convention, you might be faking testing to the most important stakeholder: you.

How do I Create Value with my Testing?

I wrote an article for EuroStar last year about creating value with testing. Some of you have asked for more specific ideas about determining whether your testing is creating value or not. In the article, I talk about getting feedback from stakeholders, but that isn’t always easy or possible. One of the most important stakeholders on any project is you, so how do you go about satisfying yourself with your testing value?

The easiest way to get feedback is from other stakeholders. What does your manager think about your testing? How about the programmers, business analysts and customers (users) of your software?

The hard part with that answer is that you may not be able to talk to all of those stakeholders. Or they may not know what good testing looks like, so their answers won’t satisfy you. In some cases, the stakeholders around you may have such low expectations that their feedback might not help you at all; they may expect testing work that you would consider shoddy and negligent. In that case, you have to show them what great testing looks like. It’s like graduating from a cheap box of wine to the good stuff: once they’ve tasted the good stuff, it’s hard for them to go back to expecting poor testing.

Even if you have good direction from other stakeholders, I recommend asking yourself some questions to help determine if you are creating value or not. This is hard to do, and will result in work for you over the long-term, much like personal growth endeavours. Don’t expect quick fixes, but if you work in these areas over time, you will see changes in your testing. Here are some things to think about:

  • Is my testing work defensible? (Cem Kaner, a lawyer and teacher of testers, talks a lot about this.) Think of a court case: what would a jury think if you testified and described what you did as a tester, and why? How did you determine priority? Why did you test some things and not others? (100% complete testing is impossible, so you have to make decisions to optimize your work.) Are those decisions well thought out, or mostly subconscious? What might you be missing that you haven’t thought of?
  • Do I have a superficial or thorough approach to testing? James Bach, a teacher of testers, talks about how important it is to have thought-out and varied approaches to testing. What kind of approach do I have? Do I consciously choose a varied approach, using as many models of coverage as I can to discover important information about the product? Do I make the best use of the tools, testing techniques and management approaches available to me? Or do I just do what the programmers or someone else tells me to do?
  • Would someone I respect like what they see? In the absence of getting real feedback from real people on my project, what would happen if a well-known consultant came to visit me? Could I answer their questions about why I chose to test this way? What kinds of holes might they spot in my thinking? Would they see weak spots? More importantly, would I be proud to have Cem Kaner or someone else I look up to see what I actually do? (For example, I often share ideas with Jared Quinert, and if he shoots holes in my approach, I know it is time for a rework.) Have I used ideas from testing thought leaders in my work and found out what works well for me and what might not work so well? Could I communicate my work to an expert outsider clearly and thoughtfully? If so, what might they think?
  • Do I adapt my test plans and strategies from project to project based on the risks and rewards our project environment has at a particular point in time, or do I just copy and paste what I did last time, and repeat the same thing over and over?
  • Do I track down and find repeatable cases for important intermittent bugs, or do I just file them and forget about them?
  • Do I feel energetic, creative and proud of my work as a tester, or do I just feel like I am doing the same boring things over and over and filling in paper work and forms to please a manager?
  • Can I look at a released product and identify ways in which my testing has improved the product experience for our end users?
  • Do others on my team feel better with me around? Do they miss me and my creative input when I am away, or do they welcome the break from my negativity? Do they request that I work with them on other projects?
  • Is my testing service in demand? Am I the person team members come to when they need help solving a particular problem that I am really good at helping solve?
  • Am I aware of other approaches to testing that challenge my favorites? Do I understand approaches that I may not favor or I may even dislike, or do I just dismiss anything unfamiliar and threatening out of hand? Do I have an open-mind and look to challenge my ideas in testing to help improve?
  • Am I learning about different ways I could improve my work? Am I aware of recent changes in testing techniques and tools? Do I know where to find information to learn from?
  • Do I consistently try to do better than I did last time?

These are the kinds of questions I ask myself regularly. I don’t always have the best answers to my own questions, but as time goes on, I feel much more confident about both my own answers to those questions, and more importantly, the value I know my testing work provides.