Category Archives: product management

Hey Educational App Designers, Stop Creating Glorified Worksheets!

Educational app designs need a rethink.

I am a product manager and UX designer in the software industry, and I’m also a parent of an elementary school aged child. Learning applications are a big part of our educational experience, and I find I am constantly frustrated by them. While they often have great promise and claims, and they use modern graphics and game engines, they rarely use the technology to help facilitate learning. In fact, they often put fantastic game engines around worksheets. They have spectacular characters, a wonderful environment and storyline, then for the actual math or literacy, they just display a virtual worksheet to complete. Even worse, if the user gets a question wrong, instead of using the technology to show them how to fix the problem and learn from it, the app just docks points, leaves them to find the correct answer somehow, and at worst, blocks them from progressing. The fun part of the app is often everything around the actual learning, rather than the learning itself being the fun part and main focus. No matter how cool and amazing the application is, if you are just wrapping it around the same old printed worksheets students have used for decades, you really aren’t making good use of the technology.

Educational apps that don’t use game engines or a game format still tend to use game mechanics in their design to help students understand progress, facts they have mastered, concepts and activities they have tried, etc. When you know what to look for, you see the mechanics in virtually every educational app, but no matter how cool, new, flashy or exciting, they tend to devolve into learning as worksheets.

Here are three examples of math apps we have tried in the past.

We downloaded a math app, and it was a sandbox style game with customizable avatars, a rich environment to explore, and lots of clever use of music and animation. When it was time to do math work to earn points to buy things to add to your game environment, the user had to answer a series of questions, in a certain period of time. While the graphics were nice, and there was animation to help make it more engaging, it was still just look at a math equation, enter in the answer, and move on. If you didn’t reach a certain number of points in a certain amount of time, your progress was stuck. You couldn’t do anything more in the game other than wander around until you passed that level. This isn’t much fun, particularly if it is a skill you need to work on. Not only do you need more practice, but the game isn’t fun anymore.

Another math app we tried had an immersive RPG play style. You choose and customize an avatar, and your character does quests, engages with other players, and fights boss battles, among other fun activities. This looks fun! However, when it is time to do math questions, you are literally taken out of your immersive environment and shown a virtual math worksheet, like the kind you print out and fill in by hand. At least with this app you aren’t punished for getting answers wrong; you just need a certain number of points to continue. However, there isn’t a lot of actual math learning going on. You have to practice those skills off screen, then come back to it. My son was so impressed with this game and loved it so much, he dedicated daily homework to develop skills to advance further. We spent three weeks doing daily math practice and worksheets so he could master the level. While we were impressed with how motivated he was, we were baffled as to why the game didn’t provide that practice. There was a little bit of support to show what the correct answers were, but beyond that, it was just doing worksheets.

The worst example of a math app with poor mechanics was one that used no graphics at all. It just had math equations, a timer, and a score. There were no visual indications of how many questions to answer, and many of the questions required off screen work, since there was nothing to do other than fill in the answer and hope for the best. To do the work to solve the problems required several minutes of whiteboard or work on paper off screen, then enter in your answer. If you had a typo, or an off by one error, you not only get a message that your answer is wrong, but your score reduces. If your score doesn’t reach a certain level, you just continue, over and over, until it reaches a certain point. You have no sense of progress, and while the app would show the correct answer, it didn’t do anything to teach the student how to do better. You get points for the correct answer, so you better come to the app with a lot of facts memorized. If you make some mistakes, not only do you not get much feedback, but you get punished. Furthermore, if you are too slow, that also affects your score, dragging the effort out even longer.

There are lots of apps that at least provide visuals and let students change their minds, but they are still mostly virtual worksheets that are trying to get students to enter the correct answer. While there is a place for that, such as dragging letters around to make words, dragging words around to figure out parts of speech, or moving objects into groups to divide or multiply, they are still not utilizing technology to help learn very much.

In math, it is so simple to design a visual calculator, and let people play around with numbers and see how that affects outcomes, and how patterns start to emerge. Once math stops being abstract, and people can play with manipulatives and see what happens, things can really click in a learner’s brain. Math manipulatives such as number blocks, Montessori boards, Cuisenaire rods and more are extremely helpful learning tools, but virtualizing them, adding in animation and allowing safe exploration would be incredibly powerful. Instead of catering to learners who do well with worksheets and flash cards, learners who are struggling to understand a concept should be able to visualize the concept in various ways, play around with inputs and outputs, and see how the concept manifests itself. Not everyone can translate abstract math concepts into visualizations or numbers in their minds. Providing ways to see not just how objects and patterns interact with math, but how those concepts can be applied with virtual tools holds a lot of power. While all the technology is available to us, educational apps tend to fall back on some sort of worksheet, which only appeals to a certain kind of learner. On the other hand, virtual objects you can interact with and learn from are more engaging to every learner, and they can help people actually learn something new.

Use Technology as a Safe Place For Learning from Mistakes

What drives me up the wall with educational apps is they tend to only focus on getting correct answers. Instead, they should provide a space for experimentation. What happens if I play with addends or minuends? What happens if I multiply negative numbers together? What happens if I play with the variables and use huge and small numbers in a multiplication problem? What happens if the divisor is larger than the dividend? What if the divisor is an emoji or a letter? What does it look like if I make a word problem come alive? What happens to a graph if we loop through a huge number of values for x and y in a linear equation? What happens if I watch an animation of a huge range of possible inputs? What if the inputs are at extremes or nonsensical? Imagine how that can be quickly visualized, and different types of inputs can change the outputs, and what patterns arise from different kinds of mathematical concepts.

The beauty of virtual tools is they are SAFE places to make mistakes. You get to put in some inputs, and then watch what happens. In real life, when you make a mistake on a worksheet, you have to erase it and fix it. Virtually, there are no eraser smudges, you just change it. Furthermore, a printed worksheet can’t come alive and show you what happens when a train leaves Philadelphia at 6:00pm and another leaves New York at 7:00pm, or how many ball bearings can fit in the back of a pickup truck. Game engines with virtual tools can. Making mistakes should also be a fun learning experience, rather than a punitive one. Sure, there is a point in learning where precision and the ability to do things by hand are vital. However, playing around with technology and seeing what might happen will help students form a picture in their mind of how a concept works, not just memorize how to get the right answer.

Actually using game engines and game design, and having an understanding of different styles of game play, will help people with different needs learn the concepts in the app itself. Different players will have different needs, and while some people like timed tests and fact based answer seeking, others are vastly different. Andrzej Marczewski makes it easy for us to learn and incorporate Gamer User Types in our designs. For example, a socializer might want to help others learn something they struggled with, and provide a tutorial of something cool they discovered. A griefer might giggle away putting in extreme values. An explorer might try lots of different combinations of things to see how that is visualized or what virtual outcomes might be. An achiever might be motivated more by virtual rewards and determining how many and what kinds of activities to complete. There are a lot of differing user goals and scenarios, and there is a tremendous amount of knowledge and experience in the games industry we can learn from.

A number of years ago, I was asked to do a UX audit of an anatomy app. It had beautiful graphics, and a ton of fantastic information. However, it was really just a digitized version of something like Gray’s Anatomy, the famous anatomy textbook. Sure, you could search, you could look at amazing graphics and click around to help you memorize, but it was not using the technology to help teach. I saw two problems immediately: it was a digitized book or worksheet, and it was static. Anatomy in living organisms is not static. A living organism’s body is in different states at all times. For example, there was no use of technology to show oxygenated vs deoxygenated blood in a circulatory system, or to simulate illness, pathologies, or other things a med student needs to work through to apply their anatomy knowledge. Furthermore, testing was fact based. You needed to memorize facts using the app, and then state those facts in an exam. The learning was about reading and looking and memorizing, not experiencing. When you are a medical professional, one of the most important sources of learning is from mistakes, or from failures. A patient doesn’t respond well either due to a lack of knowledge or the wrong treatment, and you learn what not to do.

My design concept was to digitize that virtual patient experience, as sort of a medical study Tamagotchi. Instead of memorizing a virtual anatomy textbook, why not have a virtual patient to keep alive for a semester? Sure, you have the required anatomy to understand and commit to memory, but you also have a simulated patient who can have certain illnesses, pathologies or states to manage. It sounds crass, but if your virtual patient dies a lot of times, you are going to learn a tremendous amount from that experience that you can use in the real world. It is much safer to learn and fail and see what happens to your virtual patient, rather than memorize and get a poor score if you get things wrong. If you can fail virtually and learn from it, that has a lot of value. I would prefer to play and experiment and be rewarded for learning from mistakes, rather than memorizing textbook facts and being afraid to fail an exam. Exam scores have real world consequences, but playing around in an app and having fun, piquing curiosity to explore “what if” scenarios, or having instructors throw you challenges to keep your virtual patient alive is something we can absolutely do with computers that we just can’t do with dead tree textbooks.

Another area to learn from video games is how they treat failure. To make a game engaging, failure is part of the game, not a punishment. In popular games, their designs never make you feel lost or dumb. You feel like a super hero, and when things go wrong, you can recover and try again. In fact, many games make the failure part incredibly fun and rewarding. Who doesn’t want their avatar to scream in a ridiculous way and burst into flames if they fall off a balance rope in an obstacle course? Some failure modes are so fun and hilarious that people spend more time crashing their characters than completing tasks. Even games that are extremely challenging and are designed to be frustrating are engaging and use that frustration and failure to encourage people to try again. You aren’t left feeling stuck and dumb, you feel like you need to try again, just one more time. Furthermore, if your character crashes and burns, you just respawn and try again. You aren’t stuck unable to play without doing a lot of work outside of the game to continue. The game helps you succeed, and if you are really stuck, game communities are fabulous places of sharing knowledge and helping each other.

The more I experience educational apps with my kid, the more I see that educational app designers completely miss the power of virtual technology and learning. They should design the apps around experimentation and reward failure as a part of learning, but they end up digitizing worksheets. They expect people to know facts; they don’t help pique people’s curiosity in a safe way. They have an extremely narrow view of learning and teaching. Why don’t they support inquiry and experience? Why do they just duplicate books and worksheets, even when they have a fancy MMO or RPG engine around the learning? Virtual learning environments themselves are fantastic places to try out whatever you want to learn. Where else can you safely find out what happens if you feed something poison, or fly the rocket into the ground, or play around with your math question variables, or rearrange your words in a nonsensical way? These are pretty bad ideas in the real world, but great learning experiences in a virtual one. Plus, mistakes can help you learn, and they can be fun and silly. Laughing at a ridiculous mistake on a math concept and visualizing the carnage is a much more effective learning technique than getting a long division problem wrong, after you spent ten minutes solving it off screen.

Using the technology to just digitize the printed worksheets completely misses out on this important approach to learning. Sure, at the end of the experimentation, you want the student to have knowledge and skills and to have learned, but we have game engines, and graphics and powerful machines that can be used to learn what we need them to learn, and we instead just give them worksheets. And in many cases, the worksheets are even worse than a printed one.

Bottom Line: Let students play with the concepts you are trying to teach and let them succeed and fail in a safe way, using everything technology affords us. Stop punishing learners for making mistakes, let them make mistakes and explore the outcomes virtually. Stop taking dead tree technology, digitizing it, and rewarding people for getting the correct answers and calling that an educational experience. Use the technology to show, tell, demonstrate, play with and really get a solid grounding in the concepts without real world consequences. That is the differentiator with learning with technology: you have limitless access to information, and tons of rich tools to virtualize problem solving and learning in stunning ways. Provide structure and opportunities to learn, don’t just expect people to write an answer on a worksheet. Give them more.

UPDATE:

April 24, 2024

I was reading this article about educational apps: The 5 Percent Problem: Online mathematics programs may benefit most the kids who need it least, and there are some thought provoking points. This quote in particular stood out: “…the programs may have been unintentionally designed to fit high achievers better, says Stacy Marple, a researcher at WestEd who has studied several online programs.”

Put another way, if you design apps that expect learners to already have mastery, those learners will tolerate your virtual worksheets because they can easily enter the answers. They have the knowledge, skills, and confidence to grind away to get back to the fun part after they get the math or literacy “worksheet” completed. Learners who don’t already have mastery will be frustrated and stuck, because there aren’t mechanisms in place to help them safely learn and to build their understanding and confidence.

I SLICED UP FUN Mobile Testing Infographic

Twelve years on since I created and shared the mobile testing mnemonic I SLICED UP FUN, I see that people are still using it and finding it valuable. I still use it myself on projects, so I decided to create an infographic to make it more shareable.

I call this a mnemonic because it is a memory aid to help me with my work. A catchy phrase helps me remember everything I need to think about to be thorough when testing mobile apps. Sometimes these are called heuristics, or listicles. Whatever you want to call it, it’s a helpful thinking framework to help quickly generate lots of useful testing ideas.

I SLICED UP FUN is a testing framework for mobile apps, but I use it for more than testing. As a product manager I use it in a generative or creative way as well, not only to help evaluate an existing app design, but to create something new.

If you haven’t used a thinking framework like this before, it’s quite simple to use. Read each section, and determine which ones apply to your product. If a section doesn’t apply, skip it and move to the next. Once you have a list that is applicable to your work, use each item in the list to generate ideas for that category. Once you have a few relevant ideas under that section, move to the next. Then review what you have, and see if there are gaps. Whenever you’re able, include other people to help you generate more and better ideas.

Once you have generated enough ideas, put them into action, whether it is testing, design, or other work you need to do.

You can download the infographic here:
ISLICEDUPFUN mobile testing infographic

Load Testing Your Web Infrastructure: Please Be Careful. Part 4

Earlier, we looked at different ways that load testing can go wrong, if you aren’t informed, or if you don’t know what you’re doing. In part 1, we talked about a well meaning person who inadvertently created meaningless tests. In part 2, we saw the disastrous effects of someone with a little knowledge creating a mess. In part 3, we read about what can happen to a network if you unleash load tests while other people are working. In this section, we will talk a bit about some of the underlying math we need to use with load and performance testing. (On second thought, “underlying” is a bit misleading as a term, it is actually foundational, but it’s also lots of fun. It’s fun, even for math phobics, as long as you get help from time-to-time.)

NOTE: I am simplifying the math descriptions here for brevity. If you are a stats expert, please don’t be offended by my glossing over the details. The point here is to provide a basic amount of information so people get the gist of it.

What? We Need Math?

It’s one thing to generate load and point out potential issues, but the real key to performance and load testing is an understanding of probability and statistics. A lot of problems are uncovered through basic statistical analysis, and reports on this testing are also used to help with forecasting, service commitments and purchase decisions. Communicating anything useful and actionable about performance requires stats and probability knowledge and skill. It’s important to highlight that generating load and successfully taxing a test system is the easy part of load and performance testing. The hard part, and the time consuming part is to figure out what the results data is telling us, or not telling us. This requires a working knowledge of statistics, including:

  • Averages
  • Means, Medians, Modes
  • Standard deviation
  • Confidence intervals
  • Distribution types: normal vs uniform
  • Statistical significance, equivalence, and outliers
  • Percentiles
  • Probability

It’s also important to have a good knowledge of elementary math:

  • Addition and Multiplication
  • Exponentiation
  • Combinatorics

You don’t need deep expertise in these concepts, but a working knowledge is important, as well as the ability to work with these concepts in popular productivity or math tools.
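
As a rough illustration of what a "working knowledge" looks like in practice, here is a minimal Python sketch that summarizes one batch of response times using only the standard library. The numbers are made up for illustration, and the variable name `response_ms` is hypothetical. Note how a single slow outlier drags the mean far away from the median, which is why percentiles are usually reported alongside averages.

```python
import statistics

# Hypothetical response times in milliseconds from one load-test run.
# The last value is a deliberate outlier.
response_ms = [120, 125, 130, 128, 122, 135, 127, 131, 124, 2400]

mean = statistics.mean(response_ms)
median = statistics.median(response_ms)
stdev = statistics.stdev(response_ms)  # sample standard deviation

# quantiles(n=100) returns the 99 cut points between percentiles,
# so index 94 is the 95th percentile.
cuts = statistics.quantiles(response_ms, n=100, method="inclusive")
p95 = cuts[94]

print(f"mean={mean:.1f} median={median:.1f} stdev={stdev:.1f} p95={p95:.1f}")
```

With only ten samples the 95th percentile is an interpolation between the two largest values, so treat it as a demonstration of the calculation rather than a trustworthy estimate.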

It’s one thing to manage the math, it is quite another to communicate what the math means to stakeholders clearly, honestly, and with context. It’s also important to be able to explain the limitations of what your math work has revealed.

While I’m not an expert in probability and statistics, I have worked at conferences and workshops with performance testing luminaries Scott Barber and Ben Simo. I once spent hours in a conference hotel lounge with Ben Simo as he dumped game pieces on the table and asked me to observe and describe what I saw. Little did I know that this data visualization practice would help me track down a nasty performance bug months later. I also took online courses, attended other workshops and talks, and tried out various tools. Once I was comfortable with generating suitable levels of load, working with the numbers started to take precedence in my work.

Basic Math and Exponentiation

Performance and load testing requires dealing with large numbers, and calculating and observing the effects of addition and multiples. While this sounds simple, it can be deceptively complex.

At its simplest, generating load against a test server requires generating multiple simulated users, which in turn requires counting and observing. For example, if you generate 10 simulated users with a testing tool, you need to observe your test environment and see what effect that has on it. Does the machine work harder? What do CPU usage, I/O and other measurable aspects look like? For most systems, ten is a small number and may not even register, so what happens if you simulate 100 users? Furthermore, can the network infrastructure you are using handle that much load, or will it limit traffic in unintended ways?
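
The idea of stepping from 10 to 100 simulated users can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real load tool: `fake_request` is a hypothetical stand-in that only sleeps, so nothing touches a network, and `run_step` is an illustrative helper name. In real work you would use a dedicated load testing tool and watch the server while each step runs.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(user_id: int) -> float:
    """Stand-in for one simulated user's request; returns elapsed seconds."""
    start = time.perf_counter()
    time.sleep(0.01)  # placeholder for real work against a test server
    return time.perf_counter() - start

def run_step(user_count: int) -> list[float]:
    """Run `user_count` simulated users concurrently and collect timings."""
    with ThreadPoolExecutor(max_workers=user_count) as pool:
        return list(pool.map(fake_request, range(user_count)))

# Step the load up and observe after each step, rather than jumping
# straight to the largest number.
for users in (10, 100):
    timings = run_step(users)
    print(users, "users, slowest request:", round(max(timings), 3), "s")
```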

Once you are absolutely sure that yes, your 100 simulated users are exercising the test server more or less like 100 real users would exercise your production server, now you can start to add on more. What happens with the 101st user? Nothing much? Ok, let’s add more and observe. The trick here is to find the point where unintended behaviors start to occur when you add that nth user to the tests. The temptation is to think of this as a linear graph, where each additional user adds the same fixed amount of server utilization, but that isn’t how this tends to work. What often happens is the nth user causes a surge in server activity, which looks like a geometric graph, or a hockey stick shape. Adding that nth user causes I/O to go out of control, or CPU utilization to stay at 100%, or memory to get used up, etc. In other words, that nth test user causes the system to get overwhelmed, rather than incrementing resource usage the way all the previous ones did. This forces us to move on from thinking about addition and multiplication, or simple product calculations, and start looking at exponentiation.
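
One simple way to spot the "hockey stick" in recorded measurements is to compare each step's increase against the increases that came before it. The sketch below does that with invented data; `samples`, `find_knee`, and the `factor` threshold are all hypothetical, and a real analysis would look at several metrics over repeated runs.

```python
# Hypothetical measurements: (simulated users, CPU utilization %).
samples = [(100, 12), (200, 23), (300, 35), (400, 46), (500, 58), (600, 97)]

def find_knee(points, factor=2.0):
    """Return the user count where the per-step increase first exceeds
    `factor` times the average of the earlier increases, or None."""
    increases = []
    for (_, prev), (users, cur) in zip(points, points[1:]):
        step = cur - prev
        if increases and step > factor * (sum(increases) / len(increases)):
            return users
        increases.append(step)
    return None

print(find_knee(samples))
```

In this made-up data, utilization climbs by roughly the same amount per step until the last one, where it jumps by more than three times the earlier average. That last step is the point worth investigating.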

Exponentiation in simplest terms deals with the rapid increase of numbers. This can occur in distributed systems for a lot of different reasons. There can be a massive influx of users for unpredictable reasons, there can be massive increases in utilization of hardware components, there can be data that grows unbearably large quickly … the possibilities are numerous. In other words, something unexpected happens, and suddenly there are huge numbers that are impacting things, and we get called in because these rapid increases upset the status quo, making things worse. This is a complicated topic with lots of discrete math concepts, but it is fun and rewarding to study, as long as you aren’t learning during a production outage.

Even simple product based calculations can be tricky, especially when small numbers can lead to large numbers. Without some thought and analysis, this can lead to poor results. Our brains struggle with large numbers (hence the need to create computers in the first place), and our shorthand for dealing with them can get us in trouble.

How many servers do we need??!??!??!!

One project I worked on required a backend overhaul due to the addition of a suite of mobile apps. The mobile apps used the existing server infrastructure differently than the legacy suite of web apps, and there were some nasty load-related surprises. Trouble was, these surprises were major bugs that required architectural changes in the code base, as well as the server hardware. There was little appetite to address those issues due to cost, and politics, so they were deferred for a later release. In the short term, that meant that they had to severely curtail the estimates of simultaneous users per server with the addition of mobile app usage. (Note, when I say severe, I mean severe, as in a factor of 10 reduction of users.) The thinking was to get a couple of friendly existing customers to take on the mobile app product as beta testers, and then slowly roll on more organizations as the existing code base and infrastructure was updated. Trouble was, some of the sales people weren’t on board with this, because they wanted the potentially lucrative sales and commissions for that now, not months in the future. One salesperson returned from a trip with a friendly, major customer, who had signed up for an early release of our mobile app suite. There was great rejoicing. However …

One of the most important things I do when I take on performance and load testing projects is to read all the published claims about the system. That includes the README files, the release notes, website and other pubs, blog posts, and most importantly, any contracts with user and performance commitments and SLAs (service level agreements). I asked for the contract that the sales people had signed with the customer, and I was horrified. They agreed to an enormous number of licensed users, starting modestly, but increasing at 3 month intervals over two years. The numbers didn’t look too bad at a glance, but when you factored in that they committed to doubling, tripling, quadrupling, etc. over time, it was cause for concern. The lead architect and I spent a few minutes calculating what these commitments looked like in server requirements, and the numbers were insane. If we were to support that number of users without substantial work and massive performance increases, it would require thousands of web servers to support the commitment of one customer.

Getting to the bottom of this required a bit of digging.

It turned out that the lead sales person who had signed the agreement said he had approached QA for information about how many simultaneous users we could support on the test server. He then went to IT and asked how much more powerful the production server was. Since they said it was at least 10x more powerful, he took the QA quote, and multiplied it by 10. He then massaged the numbers to increase to the extreme level to sweeten the sales offer, assuming a massive increase in performance every six months for two years. Of course when he talked to QA and IT, he did not make it clear what he needed the numbers for. We had to explain that you can’t take raw numbers that a server can sustain for a short period of time before crashing, and then multiply it and assume some sort of “half Moore’s law” for the product.
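
To show how quickly this kind of arithmetic gets out of hand, here is a small Python sketch. Every figure in it is invented for illustration; none of them come from the actual contract. It assumes the commitment grows as a multiple of the starting user count at each 3 month interval, and it compares an optimistic per-server capacity (a short-burst test number multiplied by 10) with a much lower sustained capacity.

```python
import math

def servers_needed(users: int, users_per_server: int) -> int:
    """Round up: a partially used server is still a server you must buy."""
    return math.ceil(users / users_per_server)

# All figures below are made up for illustration.
start_users = 20_000         # licensed users at signing
optimistic_capacity = 5_000  # short-burst test figure multiplied by 10
realistic_capacity = 50      # sustained users per server with mobile traffic

for quarter in range(1, 9):  # eight 3-month intervals over two years
    committed = start_users * quarter  # doubling, tripling, quadrupling, ...
    print(
        f"Q{quarter}: {committed:>7} users, "
        f"{servers_needed(committed, optimistic_capacity):>3} servers (optimistic), "
        f"{servers_needed(committed, realistic_capacity):>5} servers (realistic)"
    )
```

The gap between the two columns is the whole story: the same commitment looks like a few dozen servers under the optimistic assumption and thousands under the realistic one.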

In the end, legal and senior managers had to approach the customer and try to salvage the sale. They were able to renegotiate the contract SLA into something achievable and sensible. It wasn’t pretty, and the company lost money, but they thankfully didn’t lose the customer. It could have been a serious outcome though, with lawsuits and other potentially calamitous outcomes.

Calculating and Communicating Probability and Statistics

The real fun of performance and load testing for me is in the various ways we can use math to uncover important problems. It can also get a bit messy, since we aren’t dealing in absolutes, but in likelihoods. There is some experience involved in how to manage the uncertainty, and that comes with risk. Taking some calculated risks with the math you use can help your clients greatly reduce the risk in the operations of their systems. I used to really enjoy that uncertainty, using mathematical tools, observation and background knowledge to help inform recommendations, and seeing those ideas pay off in better customer service. The only downside is that when you have in-depth work in this area, you will yell at your computer screen when you see polling data, media articles or marketing campaigns that get it wrong either purposefully to manipulate, or due to a lack of research.

What metrics can we publish?

One system I was brought in to test was being updated to support a significantly higher number of mobile users. They needed to publish some of their user metrics, especially within contracts that required licenses. They wanted to provide a safe number of simultaneous users for customers who were hosting their solution themselves, so they would know what to expect and plan accordingly. This sounds straightforward, but from a statistics perspective, it adds a lot of complication and time to our work. It is one thing to find problems to fix, and to anticipate what you need for your own systems; it is another to make commitments about that to others. For example, if you have too much traffic on your own system, you can quietly add more capacity and no one needs to know. If a customer who hosts your solution is budgeting for servers, they need to have specifics. Also, if they end up with more traffic than they can handle, you might be on the hook, depending on what claims you have made in your SLA.

Company leadership understood what I needed and were willing to provide everything, including a safe test network. What I had to do was determine safe, but enticing metrics that marketing could use to publish in advertising, and sales could use in service level agreements for contracts. The key was, how many simultaneous users could they safely advertise, and commit to supporting legally? The way forward with this task involved a lot of simulation, and a lot of math.

I started by analyzing their legacy product and their website traffic metrics. Unfortunately, the data seemed to be off somehow. When I asked for more information, it turned out that the data I wanted was from two different sources. To make up for that, IT had been asked to add the two datasets together, and divide by two, providing a sort of average. Unfortunately, this isn’t the way to approach this kind of data. When you are dealing with two separate, but related sets of data, it is sometimes called bivariate data. The reason for this was a bit complicated, but imagine that you could get a dataset for web browsers only, and then a dataset for operating systems only. You can use some deduction on this data to get a better sense of the reality of the metrics. For example, if you are seeing lots of Safari browsers, then you know you are dealing with Apple devices only. But if you are seeing Chrome browsers, they will be Android devices, but can also be Apple and other operating system providers. The “averaged” data provided earlier skewed the data in unintended ways because it didn’t account for those proportions.
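
A small Python sketch shows one common way that "add them together and divide by two" goes wrong. It is not a reconstruction of the actual datasets, which aren't described in detail here. The two sources and their counts are hypothetical, and the sketch assumes the sources do not overlap. When the sources are very different sizes, averaging their percentages gives each one equal weight regardless of how much traffic it represents.

```python
# Hypothetical daily request counts from two separate log sources.
source_a = {"total": 90_000, "mobile": 9_000}  # 10% mobile
source_b = {"total": 10_000, "mobile": 6_000}  # 60% mobile

share_a = source_a["mobile"] / source_a["total"]
share_b = source_b["mobile"] / source_b["total"]

# Naive approach: average the two percentages.
naive = (share_a + share_b) / 2

# Pooled approach: combine the raw counts first, then take the share.
pooled = (source_a["mobile"] + source_b["mobile"]) / (
    source_a["total"] + source_b["total"]
)

print(f"naive average: {naive:.0%}, pooled: {pooled:.0%}")
```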

To cope with the bivariate data, I reviewed Chi-Square analysis from university statistics, and read up on how to analyze bivariate data accurately. I use spreadsheets a lot, so I found some YouTube videos on built-in analysis I could use there. Fortunately, while I was struggling with my calculations, a programmer who had worked with complex statistical systems was sent my way. He happily took over the task and used a more suitable approach. The numbers he generated looked much more realistic. With a bit of research we were able to find the published proportions of mobile operating systems and web browsers, and our analysis revealed something similar in these metrics.
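
As a rough illustration of the kind of check a Chi-Square analysis performs, here is a minimal Python sketch of a test of independence on a browser-by-operating-system table. The counts and the function name are invented for the example, and this is not the approach the programmer ended up using. It only shows why browser and OS figures can't be treated as two interchangeable lists to be averaged.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if OS and browser were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical session counts. Rows: iOS, Android. Columns: Safari, Chrome.
observed = [
    [450, 50],
    [0, 500],
]

stat, dof = chi_square_statistic(observed)
print(f"chi-square = {stat:.2f}, degrees of freedom = {dof}")
```

With these made-up counts the statistic lands far above 3.841, the 5% critical value for one degree of freedom, which says browser and operating system are strongly related in this sample.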

Phew. Our first math problem was out of the way. However, this had implications for our testing. We had to repeat certain tests to increase our confidence in our analysis. I’m simplifying for the sake of brevity here, but essentially, we needed to figure out a realistic sample size, and calculate our margin of error, or confidence interval. It got a bit complex, and meant we had to keep a production snapshot available for a few days while we did nothing but re-run subsets of our load tests on it and analyze the results based on our prior calculations.
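
A simplified sketch of the sample-size and margin-of-error arithmetic, assuming roughly normally distributed timings and a 95% confidence level (z = 1.96). The function names and the sample numbers are illustrative assumptions, not the project's actual calculations.

```python
import math
import statistics


def margin_of_error(samples, z=1.96):
    """Approximate margin of error for the mean of `samples` (z = 1.96 is ~95%)."""
    return z * statistics.stdev(samples) / math.sqrt(len(samples))


def required_sample_size(stdev, target_margin, z=1.96):
    """Smallest number of runs n so that z * stdev / sqrt(n) <= target_margin."""
    return math.ceil((z * stdev / target_margin) ** 2)


# Hypothetical page load times, in seconds, from repeated test runs.
load_times = [1.0, 2.0, 3.0, 4.0, 5.0]
print(f"mean = {statistics.mean(load_times):.2f}s "
      f"+/- {margin_of_error(load_times):.2f}s")

# How many runs for +/- 0.1s if the timings vary with a stdev of about 0.5s?
print(required_sample_size(stdev=0.5, target_margin=0.1))
```

The second function makes the cost of tighter claims visible: halving the target margin roughly quadruples the number of runs.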

Next, we analyzed the new system that would support much more mobile traffic. What might change now that we had better mobile support? Would the proportions of OS/web browser remain the same, only increase in amounts, or would traffic behaviour change completely? Since most people like to use their mobile devices first, we felt that it could have a much larger impact than just increasing the same traffic as the legacy system. The behaviour and type of traffic could change significantly. This was a prediction, or a hypothesis, and we needed to research published metrics of mobile usage when web sites became more mobile friendly to help bolster that prediction.

While we were researching and adapting our tests to better reflect production data, I was extremely fortunate to be on-site during a system outage. I was able to view errors, request snapshots of server logs, server utilization and other metrics, and anything related to data. What are the queues doing? Are there problematic processes, tables filling up, etc.? Also, we were able to gather hardware and network infrastructure information. After the initial problem solving to get the system back up, failure point analysis and bug reports, we were able to pore over the data to get a picture of the weak points in the existing system. This also required some math, since server utilization and other metrics have different formulas. One type of hardware might use one set of metrics, while another might use something that sounds similar, but uses different calculations. In other words, a “one” might be a great measure for one type, while another might use a percentage, like “97% utilization”. Furthermore, “97% utilization” might be a good metric for one service, but a red flag for server CPU usage. Monitoring a web server vs monitoring an RDBMS vs network activity can also be very different, and different applications can behave differently, utilizing different infrastructure and services depending on their unique needs and client load. Context and an understanding of what tools to use and what the metrics mean is vital.

We identified problem areas in the existing system, and then created conditions in the test environment to reproduce them at lighter levels of user load. Then, we used real mobile devices with different OS and web browser combinations and captured their traffic information so we could add those into our load tests. We then used simulated mobile clients to analyze the system and observed how and where the increased mobile clients would impact the servers. Next, we figured out how to artificially create some of these unique conditions in key areas of the system. For example, we created tools to eat up machine memory, or to cause database queries to slow down or even hang. We tried to determine how an influx of mobile users might use the system differently, and created tests based on typical user scenarios mobile users would be interested in. We also determined peaks, such as peak usage by number of simultaneous users, as well as peak usage with regard to system utilization. This is important, since a lot of simultaneous users reading a marketing release is easier to support than fewer users who are taxing the system using applications. From there, we got a good sense of how the system behaved under heavier load vs. lighter load. Once we had a suite of tests that had a good mix of mobile and PC users, doing simple things and more complex things, we were able to simulate how we projected the system would behave once it was released into the wild. We could also force conditions that could be problematic, so we could determine outcomes with various combinations of things going wrong on the back end. For example, what happens if an influx of mobile users all do the most taxing thing that could be done to the system, from a user workflow perspective? In other words, we were modeling expected server behavior based on both web and mobile application usage.

Finally, we worked on what areas we were going to measure. Management had asked for the greatest number of simultaneous users that the system would support, but this is a bit too vague. It is one thing to measure how many users can connect to the home page, versus how many users can use the supported apps, versus a combination of browsing, lightweight processing and apps that require heavy processing. Furthermore, while a server might be able to handle many users without crashing, if the performance is poor, people will get frustrated. Similarly, a server may handle a certain level of traffic for a period of time, and then stop performing adequately, either by slowing down considerably, hanging or crashing, etc. Or, a server may manage many users, but it may become unreliable, also negatively impacting their user experience. To determine what to measure, we needed to utilize the following related testing approaches:

  • Load testing
  • Stress testing
  • Duration testing
  • Performance testing

Load testing is about generating a number of simulated users, and analyzing the system. Stress testing involves simulating enough traffic to push the server to its limits, or to failure, in order to learn limitations, what behavior to be aware of in production, etc. Duration testing involves load testing over time. Finally, performance testing is all about the measurement. It’s one thing to survive load, stress and testing over a duration, but qualitatively, how is the performance? What measures can we do to signify “good”, or “adequate”, or “poor” performance? We determined to measure average times of connections to the website, and the duration of completing the most common tasks in the mobile apps. That meant we did the typical web measure of simultaneous users and page load times, but we also timed how long it would take to do important things. That said, we needed to be wary of averaging these values too quickly, since outliers are important to find and identify the underlying cause. Once we had a reasonable sample size we performed calculations such as standard deviation in addition to spotting outliers and repeating conditions to cause them and verify when they were eliminated. For example, one issue we ran into was a nasty database table that required a lot of processing time to read, write, update, etc, and that could impact the load times at seemingly random points in user workflows. Once we found a fix, a subset of time delays on certain pages were eliminated.
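
To make the outlier step concrete, here is a small Python sketch that flags timings sitting more than a chosen number of standard deviations from the mean. The threshold of 2.0, the function name and the sample timings are assumptions for illustration. It needs at least two data points, and a real analysis would look at why each outlier occurred rather than simply discarding it.

```python
import statistics


def find_outliers(timings, threshold=2.0):
    """Return timings more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(timings)
    stdev = statistics.stdev(timings)
    if stdev == 0:
        return []  # every timing is identical, so nothing stands out
    return [t for t in timings if abs(t - mean) / stdev > threshold]


# Hypothetical page load times in seconds; one run hit a slow database table.
page_load_seconds = [1.1, 1.2, 1.0, 1.3, 1.1, 1.2, 1.0, 1.1, 1.2, 9.5]

print("mean:", round(statistics.mean(page_load_seconds), 2))
print("outliers:", find_outliers(page_load_seconds))
```

Note how the single slow run drags the mean to nearly double the typical load time, which is why averaging too early can hide the very problem you are looking for.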

Next, we analyzed mean, median and mode for each of our measurement points. Mode is one of my favorites for analysis, because it shows the frequency of a result, which can look different when graphed than a mean or median. A mode can show a cluster of points at unexpected parts of a graph, which is a sign that there is a performance problem that needs to be addressed. Once averages of our data are calculated, based on sample sizes that are sufficient, I then use one of my secret weapons: percentiles. Percentiles can be used in several ways with performance testing. A percentile takes a portion of the results, which you can then analyze as a subset of your full set of data. For example, with the 90th percentile, you eliminate the top ten percent of your result set, and look at the remaining 90%. I have found a lot of performance issues in systems using percentiles to analyze and visualize data that weren’t apparent when using the full data set. This works because the top results can skew the overall results, pulling the graph in an area beyond the mode, for example. There are several ways you can use percentiles to find patterns and problems that are shown in test data, but this is one I use a lot to troubleshoot. I often use the 80th, 85th and 90th percentiles in various ways to find unexpected results in the data. Those three work really well for me to find problems that get flattened out when using the 100th. Percentiles are used in other ways in performance testing, but this is a potent analysis tool when you are finding problems.
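
Here is a minimal Python sketch of that percentile trimming, using only the standard library. The response times are invented, and the trimming rule (sort, then keep the lowest N percent of results) is one simple convention assumed for the example. Statistics packages define percentiles in several slightly different ways.

```python
import statistics


def trim_to_percentile(values, percentile):
    """Keep the lowest `percentile` percent of values, dropping the slowest results."""
    ordered = sorted(values)
    keep = max(1, int(len(ordered) * percentile / 100))
    return ordered[:keep]


def summarize(values):
    """Mean, median and mode(s) for a set of measurements."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # requires Python 3.8+
    }


# Hypothetical response times in milliseconds.
response_ms = [200, 210, 210, 220, 230, 210, 240, 250, 900, 3000]

for pct in (100, 90, 80):
    subset = trim_to_percentile(response_ms, pct)
    print(pct, summarize(subset))
```

On this sample the mean drops sharply as the slowest results are trimmed while the mode stays put, which is the kind of gap that hints at a small number of requests behaving very differently from the rest.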

Once the system was tuned, anomalies discovered and reduced, and the response times are fitting in a normal distribution that coincides with mean, median, mode, etc. then we are ready to measure and communicate metrics. First, we need to create a sample set of test results that is reasonably statistically significant. We don’t necessarily need to have a great deal of rigour with these calculations (such as statistical significance), but we need to run the tests enough times to have confidence in them. For example, running the tests once is not enough for a sample set of data. On your project, running them 100 times with the same build, the same equipment and conditions, etc. might be large enough. Or, you may need to run them a thousand times. In general, the larger the sample size, the better, but diminishing returns can kick in too. This requires some experience and judgment. Other projects may budget for the time and expense to do an auditable, full set of statistical calculations. I will use percentile here again, but rather than using it to look for problems, I am using it to assess the validity of the set of test results we are working with. If I find something surprising, then there is either a bug we didn’t encounter, a server misconfiguration, or a problem with the tool or test environment itself. Once we are happy with the sample set data, we can start capturing metrics and generating reports. (Reporting results could take up several blog posts to cover, so I will just touch on it.)

Determining server performance metrics that we want to commit to isn’t an exact science. Our test environment is rarely identical to a production environment, and no matter what we do to distribute simulated test users, etc., we aren’t completely emulating real world conditions. As a result of the statistical calculations, and analyzing the probabilities of events occurring, we tend to deal with percentages. “We guarantee a 99% up time” is a common one we see in marketing materials. They don’t say “100%,” because there are so many factors beyond their control that might temporarily cause down time. Server up time is a pretty simple metric to measure and communicate, whereas performance is even less exact. For example, in testing, 90% of users may experience page load times of a certain average, or falling in a certain range, 90% of the time. Furthermore, the metrics we publish to brag about versus the numbers we are legally required to meet might look very different. For example, we may find that a certain type of server configuration is adequate for performance targets, using a certain number of users. An aggressive approach might be to publicize one particular set of data that is attractive. We reached that level once, so we will tell the world we can do it. When it comes to SLAs though, we will likely be much more conservative. In some cases, an average is determined, and then some breathing room is built into those metrics by diminishing them, just in case of some events in production that weren’t apparent in test.

Communicating and reporting results requires skill and experience. Figuring out what is useful to measure, and how to accurately analyze and interpret those measurements, is part of the picture, but communicating what that means, what the limitations are, and providing advice on how to proceed is much more difficult. It’s one thing to do the math, and it’s altogether another to do something useful and helpful with it.

Lies, Damned Lies and Statistics

One of the great side effects of load and performance testing is how formerly intermittent bugs start to become repeatable. This is due to high volume test automation, one of the most powerful and useful test automation approaches you can use. While it is often unintended, adding load starts to cause problems to bubble up. This is so common, I always recommend teams schedule time around their load and performance testing efforts to deal with the inevitable issues that crop up. This is a good thing, because it helps improve the overall system and the end user experience with your software. In the short term though, it can be frustrating and might threaten schedules. These problems tend to require time and effort to fix, so while testers get excited, project managers start to get nervous.

One performance testing project I worked on had a particularly nasty “unrepeatable bug.” Once in a while, a tester using one of the web apps would experience a crash. This crash would also cause the test web server to hang, requiring a manual restart. No one was able to repeat it, so it was put into the state where bugs go to get forgotten, otherwise known as: “We’ll monitor it.” One day, the QA team installed a major new build. The team was getting ready to release a new version of the software with some new features and important bug fixes included. We started to run our automated tests, and testers began to work through their daily tasks. Suddenly, there was the familiar crash, and the required server restart. We had four test servers at the time, with one dedicated to our load and performance testing, with the other three available for other testing work. The testers moved on to a new server as the frozen one was restarted, and then the bug happened again, a tester saw a crash report, and the server froze up. Now there were two. Once again, a server froze up, and the testers were all on one test server. It crashed, and so did the load testing server. “That’s odd.” At one point, we had all four test servers requiring a restart at the same time, and this was causing serious productivity issues in QA, not to mention the implications for the new release. We raised the issue with the product and project managers, and started to analyze it.

The testers all kept track of what they were doing when they saw the bug, but we quickly set that aside. There was a factor in the system that wasn’t observable through the UI that was the likely culprit. We started to monitor the servers, turned up logging to get more information, and when a crash occurred, we tried to investigate every component of the web infrastructure on that server. We used low level load testing traffic on each of the servers to cause the bug to occur even more frequently. It took a couple of days, but we realized there was a strange race condition, where two services were utilized at exactly the same time. In the previous version of the software, this happened infrequently, but now, it was happening a lot. But, at least we had a repeatable case, and with the aid of our automated tests for load testing, we could repeat it on command, within five minutes. That gave developers the opportunity to run their debugging tools and track the issue down so they could fix it.

Trouble was, the fix was not an easy one, and was extremely political. To fix the problem required some major architectural rework, and re-opened a major debate on the development team. There had been bitter disagreement on a particular direction, and the one that was chosen was not popular. Now that the unpopular architectural decision was shown to be problematic, the issue blew up. There were heated arguments, lots of negative back channel chatter, polarization over possible solution ideas. All of this caused a lot of hurt feelings and resentment on the team. Some minor server setting tweaks were proposed, and each of them helped reduce the frequency of the bug somewhat, but didn’t reduce it enough. The team now had a choice: proceed with the release as-is, delay the release to try to find a temporary fix to reduce the occurrence more, or put the release on hold until the rework could be done to remove the problem for good. I was tasked with coming up with an impact assessment to help management determine a course of action. Here is what we observed, so I recorded it:

“Intermittently, a catastrophic bug causes a web server to crash, requiring a reboot. This means that once the bug occurs, the server is not available for users until it has been restarted. It doesn’t corrupt data, but it deletes the work that the user was currently working on, so they have to start over. The user will see a crash message, and once they refresh and connect to a new server, they have to log in again, and start over. In the meantime, there are fewer servers available, which means that at times, some users are unable to connect until someone else logs off. We found that on average, one in five users who connected to the server would come across this bug. This is a high probability issue, and it affects more than just the person who triggers the crash, the server is now unavailable for anyone until there is IT intervention. It costs time and money, not to mention the extreme frustration of the users who experience this. With self-hosted equipment, there is time required by IT to go and reboot the server, often several times a day. With cloud-hosted infrastructure, moving to new servers could cause expenses to increase significantly.”

Unfortunately, the people with political power did not want to fix the problem, they wanted to release. They took my 1 in 5 occurrence metrics and reframed it. While it wasn’t technically a lie, they greatly minimized the impact of the bug. This is what they told senior management:

“There is a severe bug that QA have found a repeatable case for, but it is going to hold up the release to fix it. The bug only happens 20 percent of the time!”

They also heavily implied that it was happening in the test environment more frequently because the QA team were abusing the system to find more bugs. Technically, we were using load testing tools to generate very light levels of load, but they didn’t say that. “You know how QA are, and they are also running load testing!!!” which made it sound like it would happen more frequently in test than in production. However, we were extremely worried about how often it could occur in production, with thousands of users, instead of the 15 testers and light load we were generating in the lab. Senior management decided to move ahead with the release as it was, and take a risk on the bug not occurring at all, or occurring infrequently. Why did they do this?

A 1 in 5 chance of something occurring is quite high. So is 20%, but twenty percent sounds smaller. If you use that figure without context, and your attitude is to make it seem small and insignificant, people will generally interpret it according to how you spin it. A 1 in 5 chance of the bug occurring in production, could mean that 200 people out of the first 1000 could experience this bug. It wasn’t uncommon for client sites to have dozens or even hundreds of simultaneous users, and our servers would peak at 1000 simultaneous users at times. If you think of 200 people seeing this crash, and then many people having to log in to a new server and start over, until license or server capacity was filled, with the system being unavailable for everyone after them, it starts to look more serious. However, the political players decided to just say “It has a 20% chance of occurring.”
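
The arithmetic behind that scaling can be sketched in a few lines of Python. This assumes each user session has the same independent 1 in 5 chance of hitting the bug, which is a simplification for illustration (a race condition under load will not behave that neatly), and the function names are made up for the example.

```python
def expected_affected(users, rate):
    """Expected number of users who hit the bug at a given per-user rate."""
    return users * rate


def chance_of_at_least_one(users, rate):
    """Probability that at least one of `users` hits the bug, assuming independence."""
    return 1 - (1 - rate) ** users


rate = 1 / 5  # the same figure as "20%"

print(expected_affected(1000, rate))  # users affected out of the first 1000
for users in (1, 5, 15, 100):
    print(users, f"{chance_of_at_least_one(users, rate):.1%}")
```

Under these assumptions, even a lab of 15 testers has a better than 96% chance of seeing at least one crash, and every crash takes a server away from everyone else until it is restarted.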

The product management lead approached me and asked for a second opinion. I had to tread carefully because of the political implications of what they had been told, but I explained that even a 20% chance is sky high. For a bug like this, we could risk a 0.02% (zero point zero two percent) chance. Even a 2 percent chance would result in outages that would anger our customer base. For example, if you were gambling in Vegas, you’d take a 20% all day long. Those are wonderful odds if you are gaming. To hedge their bets, I advised that they create and rehearse a roll back strategy in case the new release was as bad as we expected it to be. Thankfully, the team followed that advice, because the release was a disaster. Every client site had no access at all by mid morning, which meant that our IT and customer support teams were busy 24 hours a day, dealing with extremely angry people. The release was rolled back, and the difficult architectural change was implemented, and the bug disappeared. It was weeks of effort, but if they had decided to wait on their release, they would have been much better off than unleashing something so unstable to the public. They lost a lot of money, they lost face publicly, and they lost some customers. They also lost months of time on their product roadmaps, since everything ground to a halt to address the customer anger and problems, and then efforts were split between support and fixing the problem.

The most expensive combination were the cloud based hosting services of the system, in some cases causing a huge increase in hosting bills. When you couple a frequently occurring server outage and a wish to fix the problem quickly with an extremely easy way to add more servers, you can quickly end up over your hosting limit and incur costs. As you might imagine, there were some extremely angry customers whose IT teams fell into the “just add more” trap to try to minimize the problem.

What went wrong? Someone decided to use metrics to try to spin a narrative that was counter to reality. This happens all the time in the world! It is almost always done by people who want to minimize the problems highlighted by scientific rigour, or to try to maximize public support for unpopular policy. Or it is used by people trying to sell you something. The phrase “lies, damned lies and statistics” describes how metrics can be used to spin a narrative. It’s important to question narratives, especially if they lack context. What can go wrong? Who wins and who loses when a particular course of action is taken? Are methodologies with weaknesses and strengths explained, or are they glossed over? Is the person presenting the data a relevant authority, or are they just a good talker? What happens if you scale up the numbers (if they are small), or scale down the numbers if they are large? Does the message change? These are all important questions to ask yourself when you are shown data that is supposed to convince you of something. The math lesson here is that how you communicate metrics is important. Spin can blunt a serious issue, and problem minimizers can win out if they are clever, albeit dishonest, communicators.

Load Testing Your Web Infrastructure: Please Be Careful. Part 3

In the Part 2 story, we saw what a load testing tool can do when it is used by someone who doesn’t have the right knowledge and skill about the tool and underlying systems. However, you also need to understand the environment where you would need to use the tool. Creating and using test environments that are optimized for load and performance testing is a must. If you use these tools on a regular network, you will likely disrupt everyone else at the office, causing lost productivity and extra work for IT staff. The last thing you want to do is try them out at home, and end up blacklisted by your ISP (internet service provider).

Bye Bye Network!

After a while, I was an old hand at load and performance testing. To bolster my hands-on experience, I attended workshops on how to overcome technical restrictions, how to accurately analyze the data and find problems others would miss, how to write reports and describe risk and problems, and I was adept with a handful of tools. I started to get hired for performance and load testing gigs, and under the right circumstances, I had some rewarding and fun projects. I worked with a lot of talented people with vastly different skills, and learned from each of them.

Since I had a lot of retail and telco experience, a work friend asked me to come in to help him with a large retail system that was going through an upgrade. One of my tasks was to provide load testing help, since they were upgrading all the software and hardware for their back end system. I was given a lot of freedom to choose the tools, to interview everyone I could about any backend system issues, to work out how to simulate credit card processing, and to research and design exactly what they needed. However, I was not given a test network to run the tests, so I never used any load. I verified my load tests would work with only one user.

To find potential areas of concern, we set up monitoring at several key areas on the system, and I had test results output in a format we could utilize with statistical analysis software. We also monitored server utilization, and recommended moving some processes around to better utilize the system. We learned a lot, but I wasn’t ready to unleash full load testing capabilities without a dedicated test network. There was no way I wanted to use this on the corporate network, even though we knew it would only run against our internal test system. I knew from experience that we could overload the internal network and cause problems for others. My friend, the dev manager, ignored my concerns. He was confident that the internal network would handle the extra traffic, since the IT admins had shown him that it was perpetually under-utilized.

Despite my objections, the dev manager insisted I run the load tests on the regular internal network. To start, he wanted to run the tests with 1000 simultaneous users, but I suggested we try something smaller. I wanted to try 10, he insisted we try 100. Still objecting, I hit the “Enter” key on my machine to start the tests. Immediately, a collective howl started to swell across the entire floor of the office. Then people started calling out that they had no network access. The dev manager and the IT manager ran to the server room, and when they unlocked it, all we could see in the dark room was a sea of blinking red and yellow lights. Clearly, my load tests had overwhelmed the entire network, and every piece of hardware was in an error state. No one in the office was able to do work until all of the equipment was restarted. It took about a half hour to get the network up and running again, and the first thing my friend said was: “TRY IT AGAIN!!!!” He insisted the network outage was coincidental.

I refused to run the tests again, and made him tap the button on my machine. No sooner had his hand lifted from my keyboard, when the collective howl swelled again. The IT admin opened the server room door, and again, it was all blinky lights, and no network access for the company. It was remarkable how quickly the network was getting overwhelmed. Technically, the dev manager and IT team felt it was impossible, but they agreed not to run the tests again until we had investigated the source of the problem. Furthermore, permission and a budget for a test network specifically for load and performance testing was immediately approved by stakeholders.

It turned out that it was an extraordinary event that caused the outage, but it was something that would have happened in production without us catching it internally first. In simple terms, the network cards on the new servers had been set by default to broadcast to each other when under load, to try to load balance. This was a new feature that looked good on paper. However, there was already a load balancing system in place, so this was redundant, and harmful. In effect, the servers spammed each other because they were all under load, and the traffic increased exponentially. Machine one would find itself under too much load, so it would message machine two to get it to process the excess. Unfortunately, machine two was also under extreme load and was also messaging machine one, which was messaging machine two for help, as were machines three and four, messaging each other over and over and over with more and more messages.

To visualize what they were trying to process and the traffic they created themselves, imagine a geometric or hockey stick curve on a graph, or an infinite series in mathematics. The load tests were already creating a huge amount of traffic, but the servers themselves were generating more network traffic at an exponential rate. This traffic generation behavior instantly overwhelmed every component in the corporate network. We quickly turned off that setting in the network cards of the test servers, and then waited for a test network we could safely run the tests on.
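
A toy model in Python shows how quickly that kind of feedback loop grows. It assumes four servers and that every help request a server receives causes it to send a fresh request to each of its peers. That rule is a deliberate simplification invented for the example, not a description of how the network cards actually behaved.

```python
def storm_messages(servers, rounds):
    """Messages sent per round if every received request is re-broadcast to all peers."""
    peers = servers - 1
    in_flight = servers * peers  # round 0: every overloaded server asks every peer
    history = []
    for _ in range(rounds):
        history.append(in_flight)
        in_flight *= peers  # each request received triggers a new request to every peer
    return history


for round_number, count in enumerate(storm_messages(servers=4, rounds=8)):
    print(f"round {round_number}: {count} messages")
```

In this model the message count triples every round with four servers, which is the hockey stick curve in miniature, and all of that traffic sits on top of whatever the load test itself is generating.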

The next time we ran the tests, I had several managers breathing down my neck, but the server outages they caused did not cause any network outages. There was no collective howl, no server room full of blinky error lights. We all breathed a sigh of relief, and we went on a find and fix cycle for a few weeks to get the systems ready for a production launch. We were able to ship with a lot of confidence due to this work, and the load tests were part of pre-production tests for years after that launch.

This was a relatively small company, and the impact was fairly low. The entire development team and IT team sat together, and the infrastructure was in a server room on the same floor as the office. We were able to deal with the outages quickly, and the incident became a part of office lore, brought up when a laugh was needed. It wasn’t without political fallout though, since it was disruptive and problematic. Now imagine if this was a larger company, with IT departments in another location, servers at a hosting provider or on the cloud, etc. There could be considerable downtime, and increased costs with hosting providers, etc. While this situation was more lighthearted due to friendships and a tight knit office environment, it could have been extremely serious.

Part 4 story

Load Testing Your Web Infrastructure: Please Be Careful. Part 2

In the Part 1 story, time, money and effort were wasted. This story is much more serious. Load and performance testing tools can be simple to get started on, but that simplicity hides a good deal of complexity. In other words, a little knowledge can be a dangerous thing. While a tool may look simple, as if there isn’t a lot going on, it has a lot of power and can unleash mayhem on a system. To simulate adequate load, the tools generate a lot of traffic, which can have unintended consequences unless you know what you’re doing. Using record/playback can be handy when someone has skill and understanding of what they are doing, but when used by someone who is unskilled, it can unleash absolute misery. Just because you can use a tool and generate load doesn’t mean that you should.

A Complete Clusterfuck

A year after the Part 1 story, I was brought in to work with some Agile teams that were helping an overwhelmed IT department. Load and performance testing were brought up, but since I had been down that road before, I explained the work and potential pitfalls to stakeholders. They agreed we should treat it as a separate project, and use a cross functional team. However, a high powered consultancy had brought in a team who were desperate to show their mettle. They were skilled, they had a great reputation for turning projects around, but they were extremely arrogant. I was pulled into a meeting with sneering programmers who mocked my experience and concerns about load testing without analysis and careful planning. After my treatment in the meeting, my manager told me to decline further invitations, and let them “sink or swim.”

I didn’t hear much about what they were doing for a few weeks, but then one day a concerned executive assistant called the CTO. The CTO called the IT manager, who in turn called the people who were on my team. I was on a small cross-functional team that worked on development projects, but we would get pulled into helping fix any difficult production issues. The problem was that the CEO couldn’t access their work email. After rolling our eyes and asking if they had forgotten their password, we realized that webmail access for the entire company was down. The lead IT Admin and I sat next to each other, and he provided me with a play-by-play of what he was doing. He found that the webmail service was hanging, so he restarted it. Webmail briefly came up again, but the service started to hang again. Then more reports came in of poor performance on the corporate network, and some services becoming unavailable. He had to restart the mail servers, which in a large organization is not a simple task. It requires communication to all staff, timing warnings over a few minutes, doing the restart, communicating and monitoring. Similarly, certain areas of the network seemed to be under some sort of attack. Was it a security breach? Did someone have a virus or trojan horse?

Eventually, we tracked down the excess traffic to a particular machine, and it was one of the staff consultants from the arrogant consultancy. The IT Admin blocked his IP from the network, and we went to management to figure out what to do next. We wandered over and initiated a chat with a now angry group of consultants who were furious that one of their team members had lost network access. After a brief explanation, and a query as to why they were nuking our network, they admitted they had tasked one of their junior consultants with researching load testing tools. He had downloaded an open source tool, recorded HTTP traffic, played it back, and then kept adding more simultaneous users. There were several problems here, and senior management were furious. The consultant was kicked off the project and escorted out the door, and the consultancy was warned that they were in breach of contract. They had ignored several directives that they had pledged to follow when they signed the contract. As time went on, more problems than the CEO not being able to access webmail started to emerge.

Internally, there were formal complaints to the IT team about a lack of access and downtime. IT was in violation of their commitments for network and tool availability, and management had to spend time mollifying angry managers in other groups. You have to imagine what can happen in an internal network when someone starts generating hundreds of simultaneous requests over and over. Devices get saturated and stop functioning, others go into error mode, and everything slows to a crawl. IT technicians need to identify areas of the network that need intervention, and try to remotely restart services. In some cases, they had to physically go and restart network infrastructure manually. This resulted in thousands of dollars worth of lost time that day.

Remember when I said that if you record traffic for a load testing scenario, it will capture ALL the protocol level traffic on your machine? It turns out that this programmer didn’t know that or think of that. Later that day, the consultancy found out that they were locked out of their corporate messaging system. This is a core tool for a company that has most of its employees distributed at various customer sites. The load test against our system included all the instant message traffic that occurred while he was recording the scenario. They were without their system for days, while they negotiated with the vendor and tried to explain why one of their employees had essentially executed a denial of service attack. They were able to reinstate their corporate account, but that employee was banned from using it.

A few weeks went by, and an IT Manager came storming into our development area with a credit card bill. There were several thousand dollars worth of mystery expenses on it. It turned out that the day of the tests, he had given the consultancy his corporate credit card number “to run a few tests”, and assumed that they would let him know what they had done, and he would call to cancel them. The day of the load test disaster, the credit card company called to let him know they had frozen his account, but he assured them it was ok, people were running a few tests. By the time he had approached the staff consultants, the load testing had been stopped. Unfortunately, no one thought to connect the dots and tell him how his corporate card had been used. Thankfully, the credit card company found the problem and shut down his card, but the damage was done. He had to get a new corporate card, and it took time to dispute the payments and get them refunded. All of this cost time and energy, and other managers had to use their cards on his behalf.

In the end, the consultancy lost their MSA with the company, and they lost credibility due to one person ruining it for everyone else. Unfortunately, a consultancy with people who weren’t as skilled was hired instead, but they were much nicer to deal with. Internally in IT we had hoped the prior consultancy would work out, because they had the skills and experience to deliver. Due to their arrogance, we all lost out. Furthermore, IT lost credibility with the business for allowing a consultant to wreak that much havoc. Because of the sudden, repeated excess traffic from that location, even our corporate ISP had flagged us, and that required finessing and promises that it would not happen again. If we suggested a vendor, stories about this ridiculous situation would be recalled, and we would get stuck with less ideal providers that other groups chose for us. All of this, plus thousands of dollars of costs, not to mention all the staff work to clean up the mess, was caused because someone without the knowledge and skills used a tool they didn’t understand and ran it on our network. Depending on who retells this story, it can even sound amusing, but it was extremely serious. This person downloaded an unauthorized tool in violation of the client’s corporate policy, recorded some HTTP traffic, then ran this over and over with various sizes of payloads. A few hours of playing around with something they didn’t understand had extremely serious effects.

Click here for Part 3 of the series.

Load Testing Your Web Infrastructure: Please Be Careful. Part 1

Now that I am on the product management side of software projects, I don’t deal with testing approaches in my day-to-day work very much. I get info about product quality criteria, quality goals and metrics, information on testing status and quality, or show stoppers that require attention. Unless I want to dig deeper, I don’t hear much about the actual testing work. Once in a while though, something big pops up on to my radar, usually because there is a threat to a product release, or there is a political issue at play. In those moments, my background as a software tester comes in handy.

Recently, my testing experience was called into action, because of project controversy about load testing.

There were some problems with a retail system in production, and poor performance was blamed. The tech team did not have the expertise or budget for load testing, and were instead pushing the sales team to take responsibility for that testing. The sales team didn’t have any technically minded people on their team, so they approached marketing. The marketing team has people with more technical skills, so a manager decided to take on that responsibility. They asked the team for volunteers to research load testing, try it out, and report back to the technical team. I happened to overhear this, and began waving my arms like the famous robot from Lost in Space who would warn about impending danger by saying: “Danger, Will Robinson!” This is out of character for me, since I prefer to let the team make technical decisions, and rarely weigh in, so people were shocked by my reaction. I will relay to you what I said to them.

Load testing is an important testing technique, but it needs to be done by people with specialized skills who know exactly what they are doing. It also needs to have test environments, accounts, permissions and third party relationships taken into account.

Load testing is a great way to not only find performance issues with your website or backend servers, it will also cause intermittent bugs to pop up with greater frequency. Problems you might miss with regular use will suddenly appear while under load, due to the high volume of tests that are run during a short period of time. High volume automated testing is extremely effective, and one of my favorite approaches to test automation. To do it correctly and to get utility requires work, environment setup, as well as knowledge and skill. Done well, performance bottlenecks are identified and addressed, intermittent bugs are found and fixed, and a good test environment and test suite helps mitigate risks going forward when there are pushes to production. However, when done poorly, load testing can have dangerous results. Here are some cautionary stories.

The simplest load testing tools involve setting up a recorder on your device to capture the traffic to and from the website you are testing. You start the recorder, execute a workflow test, turn off the recorder, and then use that recorded session for creating load. The load testing tool generates a certain number of unique sessions, and replays that test at the protocol level. In other words, it generates multiple tests, simulating several simultaneous users using the website. However, lots of systems get suspicious of a lot of hits coming from a particular device, and protect against that. Furthermore, internal networks aren’t designed for one machine to send a huge volume of data. If you are working from home, your ISP will get suspicious if you are doing this from your account, fearing that your devices are being used for a Denial of Service attack. Payment processors are especially wary of large amounts of traffic as well. So if you use this method, you need to completely understand the system and the environments where you are performing the tests.
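One way to make that caution concrete is a pre-flight check that refuses to start a replay unless the target and the number of simulated users match what was agreed with the people who run the environment. This is only a rough sketch: the host name, the user cap and the function name check_load_plan are all made up for illustration, and a real setup would need far more than this.

```python
from urllib.parse import urlsplit

# Hypothetical values for illustration only: a real list would come from
# whoever owns the test environment, not from the person running the tool.
ALLOWED_HOSTS = {"loadtest.staging.example.internal"}
MAX_SIMULATED_USERS = 25


def check_load_plan(target_url: str, simulated_users: int) -> None:
    """Raise ValueError if the planned run falls outside the agreed limits."""
    host = urlsplit(target_url).hostname
    if host not in ALLOWED_HOSTS:
        raise ValueError(f"{host!r} is not an approved load test target")
    if not 1 <= simulated_users <= MAX_SIMULATED_USERS:
        raise ValueError(
            f"{simulated_users} simulated users is outside the agreed range "
            f"1..{MAX_SIMULATED_USERS}"
        )


# An approved target with a modest number of users passes silently.
check_load_plan("https://loadtest.staging.example.internal/checkout", 10)
```

A check like this does not replace understanding the system, but it does force the conversation about which environments and volumes are acceptable to happen before any traffic is generated.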

Part 1: Expensive Meaningless Tests

Early in my career, I was working with a popular ecommerce system. They were successful with managing load, but felt their approach was too reactive and possibly a bit expensive. If they could do load and performance testing within the organization rather than deal with complaints and outages, they could also improve customer experience. I was busy with other projects, and I had never worked with load testing tools before. Since I was a senior tester, I was asked to oversee the work by a consultant who was a well known specialist, who also worked for a tool vendor that sold load and performance testing tools. To be completely honest, I was busy, I trusted their expertise, and I didn’t pay a lot of attention to what they were doing. One day, they scheduled a meeting with me, and provided an overview. It all looked impressive, there were charts and graphs, and the consultant had a flashy presentation. They then showed me their load tests, and highlighted that they had found “tons of errors”. He said that his two weeks of work had demonstrated that we clearly needed to buy the tool he was selling. “Look at all the important errors it revealed!”

My heart sank. All they had done was record one scenario on the ecommerce system, and then played that back with various amounts of simultaneous users. They were wise enough not to saturate the local network, so they kept the numbers small, but their tests were all useless because they had no idea or curiosity about how the system actually worked. The first problem was that retail systems don’t have an endless supply of goods. Setting up test environments means you set up fake goods, or copies of production inventories that don’t actually result in a real life sale. To make them realistic, you don’t have an infinite number of widgets, unless you need that for a particular test. These tests didn’t take that into account, and the “important errors” his hard work had revealed with the tool were just standard errors about missing inventory. In other words, there were ten test books for sale, and he was trying to buy the 11th, 12th, 13th books. If he had been a real user using a website, the unavailable inventory messages would have been displayed more clearly. Because he was getting errors from the protocol level, they weren’t as pretty. A two minute chat with an IT person or programmer would have set him straight, but he didn’t look into it. He copied the messages and put them in his report, treating them as bugs, rather than the system working just fine, due to his error.
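The inventory problem is easy to reproduce in a few lines. The sketch below is a toy model, not the actual retail system: the function name simulate_purchases and the numbers are invented to mirror the ten test books described above.

```python
def simulate_purchases(stock: int, buyers: int) -> dict:
    """Replay the same 'buy one item' scenario once per simulated buyer."""
    results = {"sold": 0, "out_of_stock": 0}
    for _ in range(buyers):
        if stock > 0:
            stock -= 1
            results["sold"] += 1
        else:
            # Expected behavior: the system correctly refuses the sale.
            results["out_of_stock"] += 1
    return results


print(simulate_purchases(stock=10, buyers=13))
# {'sold': 10, 'out_of_stock': 3}
```

The three "errors" here are the system doing its job. A load test report that counts them as defects is measuring the test design, not the product.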

Next, they were using a test credit card number that was provided to us by the payment processor. There are lots of rules around usage of these test numbers, and he was completely oblivious to these rules. In his days of so-called analysis of our system, he had not explored this at all. That meant that our test credit card numbers were getting rejected. This was the source of some of the other “important errors” he had found, but not investigated. This was so egregious to me, I had to stop the meeting and talk to our IT accountant who managed our test credit card. My fears were confirmed – these load tests resulted in our test credit card numbers getting flagged due to suspicious activity. That meant none of us could test using the credit card, and we had to have a meeting explaining ourselves and apologizing to get them reinstated.

I got dragged into developing my own load and performance testing skills because of this. The consultant went back to the office, and I inherited these terrible tests. What I found was that while the load testing tool looked impressive, it had this terrible proprietary programming language that created unmaintainable code. While it had impressive charts and graphs, they were extremely basic and could actually mask important problems. Recording HTTP(S) traffic and playing it back could be fraught with peril, because the recorder is going to pick up ALL the HTTP traffic on your machine, including your instant messages, webmail, other websites that are open, and 3rd party services such as a weather plugin or stock ticker. Also, you need a protected test network that prevents you from causing problems and interfering with everyone else’s work. Then, you need to look at your backend and see what is possible. In my case, I worked with the team to create new load test products on the website, but the backend retail system only allowed a maximum of 9999, since it maxed out with a 4 digit integer. We also had to create a system to simulate credit card processing, since the payment processor wasn’t going to allow thousands of test purchases hitting their machine. Furthermore, our servers had DDoS protection, and would flag machines that were hitting them with lots of simultaneous requests and deny access, so we had to distribute tests across multiple machines. (These issues were all a bit more technical than I am recording here, but this should give you an idea.)
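To illustrate the recorder problem, here is a minimal sketch of filtering a recorded session down to the system under test before anything is replayed. The URLs, the host names and the helper keep_target_requests are hypothetical, and real recorders store far more than a list of URLs, so treat this as the idea rather than a working tool.

```python
from urllib.parse import urlsplit


def keep_target_requests(recorded_urls, target_host):
    """Drop every recorded request that was not sent to the system under test."""
    return [url for url in recorded_urls if urlsplit(url).hostname == target_host]


# Hypothetical recording: the recorder saw everything the machine was doing.
recorded = [
    "https://shop.example.test/cart/add?sku=BOOK-1",
    "https://chat.example.net/poll",
    "https://weather.example.org/widget.json",
    "https://shop.example.test/checkout",
]

for url in keep_target_requests(recorded, "shop.example.test"):
    print(url)
# https://shop.example.test/cart/add?sku=BOOK-1
# https://shop.example.test/checkout
```

Without a step like this, the chat and weather requests would be replayed under load along with the checkout flow, aimed at services that never agreed to be tested.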

How much time do you think it took to create the environment for load tests, and then to create good load tests that would actually work?

If you answered: “weeks” with several people working on the testing project, then you are in the ballpark.

We also abandoned the expensive load testing tool, mostly due to it using a vendorscript instead of a real programming language. We used one that was based on the same language the development team used, so I would have support, and other people could maintain the tests over time. It was a bit rudimentary, but we were able to identify problem areas for performance, and address those in production. A happy side effect was the load tests caused intermittent issues that we had missed before to become repeatable cases that could be fixed. It was a lot of work, but it was the start of something useful. The tests were useful, the results were helpful, and we had tests that could be understood, maintained and run by multiple people in the organization.

I was fortunate in this case to be able to work with a great team that was finally empowered to do the right thing for the organization. We were also fortunate in our software architecture and design. We spent the time early on to create something maintainable, with simple tests. As a result, our testing framework was used for years before it required major updates.

Click here for Part 2 of the series.

New Book Published: Tap Into Mobile Application Design

After a long delay due to some health issues, I have finally finished the first version of Tap Into Mobile Application Design. This book is only available in PDF and epub electronic formats which should work on most e-reader apps on most mobile devices. I used Leanpub as the publishing platform, as with Tap Into Mobile Application Testing. I like this platform because it’s faster for me to get content out there, changes and updates are instant, and customers can download updated copies for free.


This book is long and detailed, a result of me trying to capture most of my thoughts on designing software for mobile apps. When I started the book, I had planned for something much shorter, but as I worked through the content, it felt abbreviated and overly simplified. In Tap Into Mobile Application Testing, I made a couple of mistakes trying to overly simplify complex technical issues with regards to wireless technology. I updated some of the content to be more accurate, but I didn’t want to repeat that problem with this book, so I went deeper in some technical areas.

To help illustrate the challenges we have on these projects, I decided to use an example app project in the book. This helps to ground the content, moving from initial idea, to a full user experience design process and ending with user testing. The example project helps illustrate what a real world project can look like, but with the benefit of time I was able to capture many design project issues, rather than the few you encounter on a rapidly developed app. You get it all, including the positives and the negatives of the example app.

Furthermore, the context of app design changed as I was writing, and I felt I should capture some of those changes in the book as well. The legal landscape has changed, and there is a much better awareness of ethics and the long term effect of our designs on people. With the benefit of a side project to use as the example in the book, I was able to capture these issues as they happened on that project. The project had to adjust, and that is reflected in the book. Unexpected issues are common on mobile projects, and the example app shows how we adjusted. Initial attempts often fail due to oversight, legal rulings have an impact, and the timing of what you do on a product is crucial.

The book is also longer because it doesn’t just follow a happy path. There are lots of great books out there that fit that model. Instead, this book covers false starts, changes in direction and a completely reworked interaction design. That’s right, I cover how we almost went to market with one design, hit a snag, and completely redesigned the example app from the ground up. It’s difficult to capture the non-linearity of design in a book, and that results in some awkward flow and a couple of extra long chapters. I apologize for that, but I had hoped this would be an honest and detailed account of what can happen when you are creating an app.

I have also created a book bundle, combining both my “tap into” books called Tap Into Mobile Apps, where you can buy both the books for about the same cost as the full price of either book. The books are very different, but are complementary. In Tap Into Mobile Application Testing, the reader follows Tracy, a tester who is learning about mobile testing approaches. In Tap Into Mobile Application Design, the reader follows an example mobile app project called “Reporter” throughout the book. The design book is more heavy and dense content-wise, and that is reflected in the tone. The testing book is lighter in content and tone and an easier read. Both books cover technical issues to help inform your work. The combination of designing for or testing for people in social contexts, with a deep understanding of the technical underpinnings of the technology, within real world environments is my core differentiator. When I help teams develop that three pronged approach themselves, they build better software and have happier customers.

Both of these books represent my approach to working on mobile apps, which people can utilize as they see fit. These books aren’t for everyone. They are long and detailed and don’t provide easy answers. What they do provide is context and details that are important to understand on mobile projects, especially when you are having trouble. In spite of it representing a more difficult approach to your work, Tap Into Mobile Application Testing has been used by people all over the world, and influenced many mobile projects. It was highly praised when released and even now people contact me to tell me about how much it helped them. People still use it, they still talk about it at conferences and on projects, and several years on, find it relevant and helpful. I hope as many people find the design book to be as useful as they found the testing book.

Dealing with the Business Neg

Have you ever had this kind of interaction? You are talking with someone, and you’re having a good conversation. At some point, seemingly out of nowhere, the other person says something subtly unkind about you. You may not even notice it at first, but then realize it was a backhanded compliment or a personal dig. You might not realize something was wrong until after the conversation has ended, and feel confused. If this sounds familiar to you, you were most likely negged. Wikipedia says it’s a “…remark to another person to undermine their confidence and increase their need of the manipulator’s approval.”

In other words, negging is a manipulation tool used by people to undermine your confidence and make you feel inferior to them. If you feel defensive, or that you need to prove yourself to them, you are now more open to being manipulated. Negging is gross and it’s awful. People use it for various reasons, but it is always some sort of power play. Negging is well known in social situations, but have you ever had this happen in a business context?

While pop culture references negging in social contexts, you will also run into this behaviour in business environments from time-to-time. As you deal with executives, consultants, senior managers, startup founders, influencers and others who have reached some level of success, you are going to run into egos. People with big egos can feel threatened by others around them that have knowledge, skills and abilities that they lack. They bring you in to a business because they are aware of this and they want their product or service to be a success, but that doesn’t stop them from feeling threatened by you, particularly when under stress.

Popular folklore around business is that it is driven by forces of efficiency and demand, and private companies in particular are guided by forces related to their revenue and expenses, and adjust accordingly. In other words, companies operate on a knife edge of market conditions, and are adjusting their sources of revenue and keeping an eye on their expenses, and make decisions based on the best, most up to date data available at the time. That’s folklore.

The reality is that like everything else in life, emotions drive business decisions much more than anything else. When you trigger the emotions of someone who has more power than you, they will respond in surprising ways. One way is to use the neg, to try to assert dominance and let you know who is really in charge.

Sometimes the neg is subtle, but sometimes it is obvious. Usually near the end of a conversation, the person in power gives you “feedback” and describes something you do that they feel needs improvement. It’s almost always about your personality and being, which is hard to cope with, rather than some aspect of your work, which you could easily adjust and improve.

As a consultant, I get brought in by companies who are looking to improve, so part of my job is to understand what the company is doing well, and what areas need improvement. A common reaction when I present my findings to senior managers is for them to get defensive and neg me. Instead of analyzing and reflecting on the insights they are paying me to give them, they might tell me I look too young to be a consultant, or they expected someone taller, or that I don’t know how to communicate.

Why do they neg?

Businesses aren’t a meritocracy. The people in power got there due to circumstances, luck, and timing. They stay there due to their personality and their interpersonal skills, especially their ability to assert leadership and their mastery of group politics. In some organizations, the politics are vicious and cut throat. In others, they are milder, but leadership still requires you navigate them. Some leaders even have personality traits that are dysfunctional interpersonally, but advantageous in business. Leaders respond to threats or perceived threats to their leadership in different ways, but manipulation is less threatening than productive confrontation or straight out intimidation. Negging is a subtle way to undermine someone so they feel like they need to acknowledge that you have power over them. Here are some reasons why people neg in business situations.

They:

  • want to manipulate you and get you to do something for them you might not normally do.
  • assert dominance and “put you in your place”, reminding you of their power over you.
  • feel defensive about your critical feedback (even though they want it and need it).
  • are intimidated by skills you have that they feel they lack.
  • perceive your leadership in the organization as a potential threat.
  • are jealous of you. Jealous of your skills, success, experiences, the way you look, etc.
  • are insecure and run everyone down to make themselves feel better.
  • have a personality disorder.
  • are feeling stressed and desperate and are inadvertently lashing out.
  • like to fuck with people.
  • experience a combination of the above.

There are probably others I have missed. In short, people are people, and there are a lot of reasons why people behave the way they do. Pettiness, childishness and social manipulation do not disappear just because you are in a business context. In fact, these basic negative interpersonal behavior patterns can be amplified in stressful business situations. There are cliques and group patterns that emerge like in every other group, but there are also financial and other rewards at stake, causing people to behave in different ways depending on their hope for reward or fear of punishment. People in leadership like to stay in leadership positions, and the further up the corporate ladder you go, the more stresses there are. The financial rewards are greater, but there are more powerful people with a lot of influence and power that are putting pressure on the leaders. If you add desperation to the mix, then more erratic and dysfunctional behavior will follow.

A simpler way of saying this is schoolyard behavior does not magically disappear when we are adults, or when we are in business situations. In fact, due to enormous stress in work situations, it can be even worse.

Software companies in particular are incredibly difficult business environments due to their fast, hectic pace, fickle consumer markets, disruption from competitors, and the winner takes all environment they are capitalized in. Software founders and dealmakers can have huge egos, and are used to getting their way. These are unique people who are able to create excitement over their ideas, get investment, attract people and create a team around them to see their vision get built into something tangible.

Furthermore, investment firms often like to back leaders with a certain kind of personality, and arrogant, dysfunctional people are often lionized and held up as leaders we should emulate. That said, in a fast paced, ever changing environment, even the most balanced and empathetic leaders will suffer under the strain. None of us are perfect, and under the right conditions we can behave poorly, even when we don’t mean to.

In other cases, the business neg is more insidious. Sometimes people behave this way because they are:

  • sexist. The vast majority of the time it’s a man who doesn’t want to see a female or non-male in a business environment, and/or they are trying to hit on you.
  • racist. They don’t like the color of your skin, where you are from, etc.
  • homophobic. They don’t approve of who you love.
  • transphobic. They don’t like your gender.
  • anti-science. They resent the facts and data you use to make decisions.
  • politically intolerant. They want everyone to believe and vote the way they do.

These are extremely difficult situations to navigate, but they are easier to spot than the previous list. Leaders in companies are used to people agreeing with them, and can find alternative people and viewpoints extremely threatening.

What can you do about it?

To cope with the business neg, you first need to analyze it. Was it reasonable feedback that came at an awkward time, and I’m just feeling defensive about it? Or was it an attempt to manipulate me to change something about my behavior or my work products?

For example, I presented findings for a small software company after a short audit. Audits aren’t pleasant to do, and are unpleasant for the people in the company. After I was finished, the CTO lashed out at how I had presented the information. He was angry and said that he didn’t like the format of my report and wanted it changed. The QA manager responded with a backhanded comment, implying wrongdoing on my part during the audit. I felt taken aback by both comments, and immediately felt defensive. I thanked both for their feedback, then held my tongue and stayed quiet, even though I felt like responding in my own defense. Instead, I waited. The meeting ended, and I had some time to reflect. Was this critical feedback that was poorly timed, or was there a business neg at play here?

In this case it was straightforward. The feedback from the CTO was constructive but awkwardly delivered. Their concerns were easily addressed in my report, and I chalked up their negative behavior as a form of projection. They were upset with the findings and took it out on me.

On the other hand, the QA manager’s response wasn’t critical or useful feedback, it was a statement intended to undermine my credibility. The team was struggling because an Agile consultant had worked with them a year prior, and advised they automate all tests. Now the team were collapsing under the weight of all the tests and couldn’t move forward. They had far more code in automated tests than in the actual product, but fewer people working on that codebase to keep it running. They needed a serious philosophy change, major rearchitecting and refactoring, and to staff their automation efforts adequately.

I provided some actionable approaches to cope with this: use proper software architecture and design with automation code, treat it equally with product code, refactor, don’t mindlessly automate, summarize tests, etc. However, the QA manager was afraid they would get blamed and would lose face, lose opportunities in the organization, or lose their job. If they undermined me, then maybe they could get out of doing the hard work of improving their department. Or, maybe, they could get me to blunt my feedback and alter my report. That’s an even better way to get out of hard work that has high visibility. However, if I altered my report to make them feel better, it would be deeply unethical. Instead, I ignored their statement and moved on.

Another neg came at the end of an interview for a new gig. The company founder had walked me through their pitch, their financials, their organizational structure, their positives and their negatives and we had a constructive session. He asked hard questions about what I would do to help a product move forward, and we talked a lot about processes and people and fit. Then, near the end of the interview, he suddenly turned on me. He stood up, had a defiant look on his face, then towered over me and got quite aggressive. He then started to insult me, berate me, and then concluded the meeting. I knew then that there was a problem with fit, and we weren’t going to work well together. He was pulling a neg, and I wasn’t going to stick around and find out why. If he was going to choose to neg me in a job interview, imagine what it would be like working together?

Here are some options for dealing with the business neg:

  • Ignore it and move on.
  • Acknowledge it and address it.
  • Get help. Talk to a trusted confidant outside of the organization.
  • Run. Seek opportunities elsewhere with more like-minded people.

Sometimes good leaders have some negative traits, and getting negged once in a while is the cost of doing business with them. None of us are perfect, and if it isn’t really harmful and you can cope with it, then ignoring it and moving on might be appropriate. However, what works for you might not work for others, so don’t assume everyone should always put up with it just because you can. One way to determine if this is a trait is to watch and see if they do it to others, and how they respond. If they start to supplicate to the person doing the negging, that is a big warning flag. The person doing the negging probably wants this. If they ignore it and nothing else happens, it is probably a behavior the leader isn’t conscious of.

There are two ways you can acknowledge and address negging. The first is head-on, in the moment. You call out the other person and ask them what they mean and what they hope to get out of the comment. This is high risk, because people don’t like getting confronted about their toxic behavior. It can be effective in stopping it, or in ending a business relationship where there isn’t a good personality fit. In other words, this may very well end a business relationship early on, before it gets to be truly abusive.

The second approach is more time consuming, but allows you time for reflection and strategy. With this approach, acknowledge it after the fact on your own, and address it through your own behavior. You come up with some approaches to deal with the person negging you, and how to cope with the self doubt and emotional responses it causes in you. You figure out ways to protect yourself politically while still doing your work and getting enjoyment from it. Setting it aside in the moment and finding a path forward after the interaction is lower risk. However, putting up with negs over a period of time can have a damaging effect on you. While it is lower risk in the short term, beware of a long term emotional toll.

Bottom line: If you can handle getting a neg once in a while, and the person who negs you doesn’t escalate their behavior, you are probably ok in that situation. If negs are eating at your self esteem, interfering with your life and work, or if your management of them causes people to intensify negative behavior towards you, you need to make a change.

Having negative feelings after an interaction doesn’t necessarily mean we have been negged. Sometimes we get defensive and feel doubt because an interaction felt like a neg, but it was something else. Reflection can help a lot, because negative feedback can hurt, but it can be a super power to deal with it constructively. It’s important to analyze the interaction and see if your confusion persists. Someone might have been awkwardly giving you valuable feedback that is hard to hear. One rule of thumb with telling the difference between feedback and a neg is an ongoing and growing feeling of confusion. After careful reflection, do I still feel confused and hurt, or do I feel like the person was correct and I need to work on something? Do I feel like they are in a position to get something from me, or are they in a position where they are trying to help? Is it something within my control to address and change?

Here are some things to consider when you are wondering if you have been negged:

  • Was it a comment about you and things about you that you can’t change (physical traits, background, etc), or is it related to things you can control such as your behavior or your work?
  • Is it in a context where critical feedback would be appropriate, or did it come out of nowhere?
  • Did you ask for feedback or did they volunteer it without asking?
  • Who is in a position of power here? What kind of power play might this person be pulling on you?
  • Are you feeling defensive and upset, or are you feeling attacked and confused? Do you know down deep that they are right about it, or is this something that feels unfamiliar and hurtful?
  • Does the confusion and discomfort feel better with time and reflection, after the encounter, or do you feel more confused and full of self doubt the more you think about it?

If you find yourself gradually agreeing with the feedback and coming to terms with it rather than feeling defensive, it might have been good critical feedback that was delivered poorly. If it is worthwhile feedback, you will have something you can work on that helps you improve. In that case you can do something to address it, and you’ll grow from it. If it is just a neg, you need to utilize your own self care tools, possibly with the assistance of a counsellor or therapist.

Negging is awful and can be so damaging it tears you apart emotionally, causing you to doubt yourself. Often there is a grain of truth to the neg, which can prey on you and dominate your thoughts and self talk. If it comes from someone in a position of power over you, especially someone you admire, it can be even more difficult to cope with. Getting advice from someone who isn’t in the situation can help a lot.

Recently, I had a startup founder insult me at the end of a meeting. I had spent a lot of time researching their product offering, poring over their business plan, and helping them hone their pitch deck as they tried to raise money. It was positive, but difficult, as these processes are. I have to get up to speed quickly, point out problems and gently suggest actionable tasks that would make a big impact quickly and cheaply. The founder was experienced, well spoken and gracious. They took feedback in stride, and talked a lot about future goals and how they hoped the product would be the cornerstone of several offerings.

Things were going well, and they asked me to take on a role in the organization for the longer term. I wasn’t qualified, and suggested I could help them find the right person. It would have been an amazing opportunity, but ethically I couldn’t take on the role. We finished the chat, and agreed that I would help out as much as I could, and in ways I was comfortable with, and that was that.

The next conversation was fine, but the founder asked me about reconsidering. He was positive and professional, but said it was hard to find someone with my skillsets, and I could learn the rest on the job. I reiterated that I couldn’t do that role for them, but I would help find the right person. The session went off as they usually did, some hard questions, lots of reflection and banter, and a great positive vibe. All of a sudden, at the end of a meeting, the founder insulted me.

I confronted them in real time, and they started making excuses and then blamed themselves. I laughed it off afterwards, but a few days later self doubt started to creep in. I called a friend and professional colleague and shared what had happened with them, and what was said. They were shocked. We walked through the conversations, and realized that me turning down the job offer had probably bothered the founder. This was the likely scenario: The founder had tried several gentle ways of manipulating me to say yes, and when those had failed, had insulted me. They had hoped the neg would cause me to change my mind. Instead, I fulfilled my business obligations to them and then promptly cut off contact.

My colleague helped me unwind what had likely happened. I then decided that putting up with someone who negged wasn’t worth my time and the mental health implications. (Furthermore, they were early stage and likely wanted me to work for free or for options rather than a paycheck.)

If the neg is potentially damaging, you need to get out of the situation. Sometimes you might feel like putting up with the neg for your own reasons, but you will need to prepare and strategize.

It can be hard to walk away. If it is a first encounter, or early in the business relationship, congratulations, you found out early on that this person is harmful to you. Move on to another opportunity. If it is in an established business relationship, moving on might take time and planning. In that case, hunt for a new job or relationship and end things cleanly once you are able to. People who are against who you are as a person can never be satisfied, while others who are deeply dysfunctional will never change. You’ll be a target for not just negging but harassment, bullying and possibly worse. The business neg is the proverbial canary in the coal mine warning of more to come. Sticking around will be damaging to you personally and emotionally, as well as to your career. Your mental health will suffer, and if you are set up to fail, your professional reputation could be harmed.

I have seen business leaders with severe personality disorders absolutely ruin the mental health of those around them. I have seen sexists, racists, and political zealots destroy everything they have built professionally because of their lack of empathy and respect for others. I have seen talented staff driven off because of who they were, not because of their work or interpersonal and communication skills. When someone is completely unreasonable and is angry with your right to exist, you need to get away from them. They won’t even have a sense of self preservation when they are stressed out and desperate.

It’s not just leaders who neg

So far I have described situations where there is a natural leader/subordinate relationship. However, negging can occur in all kinds of business scenarios. Here are some I have witnessed and experienced myself:

  • If you are a potential customer, a salesperson might neg you to try to manipulate you into buying something. They may use a neg to close a sale. Marketing teams might also do this at large scale.
  • A work colleague negs you to try to position themselves politically in an organization.
  • A coworker might be jealous of your success or skills at work, and want to put you off your game.
  • Your manager might neg you because they are insecure in their position, feel threatened, and see you as competition.
  • A manager or coworker may neg to try to get you to agree to doing work that you aren’t able to take on. You might be overloaded already, for example.
  • If you are a contractor or freelancer, an established “big name” person in your field might see you as potential competition for influence, public events or business.
  • A corporate trainer may be using group dynamics to generate further business for themselves. Or they might be toying with a group for their own research or enjoyment. They might split the group to encourage fans who become net promoters, and discourage skeptics.
  • A professional or networking organization may have leaders who want to use your success for their own PR and influence, or to try to control who you work with.
  • An “experienced mentor” may be using you to get new ideas or work leads because they are becoming out of touch.
  • Established people with public platforms and reach (influencers) may want to use your up and coming reputation, skills and energy to try to secretly generate business for themselves.

These are just a few examples. Sadly, the business world is full of them, but often when we complain, we get gaslit. We are told that people somehow behave “professionally” in the office, and we are imagining it or overblowing it.

When I transitioned from full time employee to contractor and then to consultant, I thought I had escaped office politics. The truth was they just changed their form. When you leave a business situation with a clear hierarchy, the hierarchy does not disappear, it just goes underground. Don’t talk yourself out of your feelings and experiences because the business context markets itself differently. Just because you are your own boss, or you are in a flat organization or a distributed autonomous organization, doesn’t mean that basic human behavior changes. It does not.

One rule of thumb to determine whether someone is negging you is to realize that negging is about control. What does this person have to gain from you changing your behavior? What might their motivations be? Do they potentially benefit? Do you? Conversely, what danger are you put in by accepting the manipulation and going along with them? Are you going to burn out at work? Do you expose yourself politically? Are you breaking your sense of morals or ethics to satisfy this person? Are you falling into a pattern of dysfunction that you already know impacts your mental health?

There are situations in business where people deliver you hard truths that are difficult to hear for your own benefit. In my experience though, altruism in business tends to be rare. Also, the most sincere forms of feedback and mentorship tend to come from people who would not be in any kind of competition with you (for status, for work, for money, for online influence, etc.) Or, there is an obvious and mutual benefit to you improving in a situation where you are working together on something tangible, for fair pay.

Another area to be aware of is people who are in less obvious positions of power. When you are in an office, it is obvious. Nowadays though, there are people in power that aren’t so obvious. Their source of power may come from their social media status, or their popularity at conferences and in publications. They might be a “famous expert” in your field. (Or they just have lots of social media followers and a big platform for self promotion.) Outside of an office working environment, you may be blindsided because it doesn’t feel like a formal business situation. However, you will still deal with business negs by people who are clever manipulators.

Meeting well known experts in your field, celebrities, social media influencers and others with a platform that you look up to can leave you feeling a bit star struck. They may use your admiration of them and their work against you, and you may find yourself putting up with behavior you never would put up with normally, or engaging in behavior yourself that you would never do in normal circumstances.

A lot of well known people got there because they are extremely skilled at getting their own way, and aren’t above using their power and influence and star status to push people into things they are uncomfortable with.

Often negs feel subtle, or we overlook bad behavior because we respect an expert or influencer who is suddenly paying attention to us. In other cases, the negging is extremely aggressive and shocking. Business celebrities will often approach someone (maybe you), and isolate them. If it is you they have approached, they may flatter you or love bomb you in private to make you feel special. They will often use a neg and your response to it as a test to see if you are someone they can manipulate and control. If you don’t respond the way they want, they can sometimes become quite aggressive. If you give in and placate them, then you are useful to them. They can dole out attention, public shout outs, or other “rewards” to get you to do what they want. For free! On the other hand, if you stand up to them, they may ramp up the aggression and hurtful behavior to try to push you away, or discourage you from becoming potential competition.

When someone who is well known and respected pays attention to you, don’t get starstruck and ignore red flags. In fact, the more famous they are and the larger their platform, the more careful you need to be. That large reach can be used against you.

Business negs can come from anyone and from anywhere at any time. While the situation might feel casual, and the people negging you might be posing as a mentor, an interested colleague or work friend, they are still negging you and you shouldn’t ignore it.

Note: It’s not just in business where this pattern appears. Nonprofits, volunteer organizations, politics and networking, professional organizations and even informal social groups will also have this dynamic. Even on your own time, when you are engaging on social media, taking part in a hobby or playing a game, someone else might be in a business situation that you aren’t aware of. I have seen this pattern in social media interactions, in hobbies, video games, and in professional development and self help forums where someone was profiting from the platform without your knowledge. They are surreptitiously monetizing their following on a platform, and you just might be a potential source of money for them. The more engaged followers they get who actively refer to and promote the influencer’s work, the more they can grow their influence and their income from the platform and beyond.

Do you neg others?

It’s one thing to see this behavior in other people, but it’s important to analyze our own interactions. We might be the person inflicting the business neg on others. A lot of these behaviors are learned when we are young, and to help us cope in difficult situations. We may be more comfortable being passive aggressive rather than confronting an issue directly. Or we may not be aware we are doing it. It just might be part of our learned communication and interpersonal interactions. If we are rewarded for manipulating people, and we often are in business, then it becomes more entrenched.

If you realize that you neg others, please try to stop. Seek professional help via a counsellor or therapist, and through professional communication and HR training. There are tons of resources out there to help you deal with confrontation better, on how to have difficult conversations and provide feedback in a healthy way, and how to cope with your own emotions so you don’t lash out when you feel threatened. There are better ways to assert your leadership or get things done than using potentially damaging manipulation tactics.

Bottom line: negging is real and it will affect you. A business relationship is no different than any other relationship, no matter how successful or large the company is, or how fancy the office space is. People are people. Always trust your gut. If you think you are being negged, and careful reflection confirms this, you most likely are. Find the best path forward to cope and deal with it effectively. And, if you neg others, please seek help and stop.

The business neg can show up in various ways. Often, it appears along with other red flags. Once you learn to spot it, you see it more often, and you can deal with it rather than feeling troubled by it.

Note: This blog was originally published in July of 2021, but I have updated it due to feedback in December 2021.

When Product Management Goes Wrong – Part 6 – Frozen

In part 1 of this series, we looked at the underminer product manager. In part 2 we talked about the dinosaur. In part 3, we looked at the erratic driver. In part 4, we looked at the micromanager. In part 5, we discussed the Dread Pirate Roberts. In this post, we will explore the frozen product manager.

I thought I had finished this series with my last post, but I realized there was an obvious omission, so I am adding this post to the series. Originally this post was all about fear, but I realized that wasn’t accurate. Fear is a driver of the frozen product manager, but not the only one.

The frozen product manager delays decisions and derails product development due to their own fears, insecurities, biases, and their reaction to political dynamics.

They miss opportunity windows in the market due to delayed decisions, or not making decisions at all.

The Frozen Product Manager:

  • Procrastinates excessively
  • Doesn’t make product decisions until the last minute, or until forced to.
  • Hijacks strategic work and brainstorming in favour of mundane project tasks and busy work.
  • Allows problems to fester until they become damaging to the product and team.
  • Uses processes, tools and other forms of busy work to redirect and consume productive efforts that would be put towards improving the product and innovating.
  • Attacks and thwarts visionary product or feature planning. To excuse their behaviour, they cite scope management, budget, technology or other limitations.
  • Is resistant to any change, no matter how minor.
  • Claims to be in favour of innovation, yet secretly undermines any efforts to do something new and creative.
  • Refuses to acknowledge the elephant(s) in the room, i.e. ignores obvious problems or issues that require attention.
  • Lashes out at suggestions for improvement, new technology observations or at creative solution ideas. This can be overt or covert.
  • Undermines efforts to improve, innovate or grow the product
  • Also undermines the people who point out and try to solve problems that involve hard work, innovation and difficult decisions. Overtly by arguing and escalating conflict, or covertly through endless arguments and discussions that derail any real progress or decision making.
  • Attacks people who highlight any problems on the project (i.e. kills the messenger) and demands that any problem brought forward must be accompanied by a solution.

Fear

Fear is normal for a product manager, but dysfunction sets in when we have unhealthy reactions to it. When fear prevents you from moving forward with new ideas that will improve the product and the experience of your users, then it is getting in the way. Fear can cause you to freeze up and stagnate, or it can be channelled into something positive.

No matter how confident you are and well informed you are in your decisions, things can go wrong. Fear can be a good indicator that you are on the right path because you are out of your comfort zone, and you are taking a risk with the product. For example, needing to pivot a marketing campaign due to current events can have some healthy fear:

“Oh no! Is our messaging wrong for the current state of the market, or due to world events?”

That is reasonable fear that the direction you are going is the wrong one, and you fear the results of being perceived to be tone deaf. A healthy and productive way to deal with this fear is to research and check before proceeding, and adapt as needed. An unhealthy reaction is to freeze up and scuttle the new direction altogether. (Unless it is so inappropriate it needs to go.)

You may also have fear about changing direction quickly to adapt. “What will the marketing people think? Will they agree or will they see me as indecisive or over reacting?” Once you set that fear aside, you may also have some fear about whether they have availability to adjust to better suit the product and environment.

Finally, once you have pivoted to better match the current environment, there will be fear about that new campaign and whether it will work or flop.

This is all normal. Most times, when you confront that initial fear and adapt, your product is now placed to reflect the current reality, even if your campaign isn’t perfect. Your competitors who do not adapt are left behind and need to play catch up.

This scenario plays out in various ways in a product, from marketing, feature development, technology choice and others. When a force causes you to worry about what you are doing, and you are able to pivot and adapt, there will be fear. Addressing the fear will help improve your product. Being ruled by the fear will have negative effects. When you are afraid to deal with problems or opportunities, the product and the team will suffer.

Personality and habits

Some of us are conflict averse, others thrive on it. However, too much in one direction or the other can cause frozen outcomes.

Some of us are decisive, while others are indecisive. Making decisions too quickly without enough information can put a product on the wrong trajectory, while indecision leads to paralysis.

Some of us are procrastinators, while others get things done efficiently. Sometimes procrastination pays off when a solution arrives due to time researching and thinking, but too much can lead to zero productivity.

It’s important to be aware of what your preferences and habits are, and to counter them when they become an impediment. Each of us has a mix of skills and preferences, and while it can be uncomfortable, it is important to have personal tactics to counter them when they get in our way. Experience will help you learn and grow as a person, and to know when to indulge or reject what is comfortable or habitual.

Biases and stereotypes

Our own biases and negative beliefs can also cause frozen behaviour. A product manager may dislike certain technologies and have deeply held beliefs that are based on hearsay or folklore. I worked with a product manager who hated JavaScript and did everything they could to prevent the technical team from using any JavaScript in their projects. When pressed for reasons, they would just repeat flimsy arguments that they had read online that were often outdated, or bordered on conspiracy theories. When pressed and shown that their negative opinions were unfounded, they would respond with anger, shutting down the debate.

They may stereotype other people and groups, such as marketing being out of touch with technical needs, or the technical team being clueless about user experience. If communication with others is driven by stereotypes, it will create tension and destroy cross functional collaboration.

Idealistic philosophies

A lot of people love to subscribe to a particular management approach or philosophy. Frozen managers can be impacted by a blind, zealous adherence to one approach that they will stick to no matter what the outcome. Management styles that focus on building consensus and democratic team decisions can be especially susceptible to frozen behavior.

Product decisions are difficult, and it is extremely hard to get consensus between individuals and teams. I have to make decisions that anger one group or another, because the big picture for the product needs to win out.

When I start working with a new team, I tell different groups that at some point during the project, their group will be upset with me. In fact, when it seems I am getting along well with one group, another is likely upset because of a decision I had to make. Sometimes, everyone is mad at me.

When working with a new startup, I explained this to marketing, sales, business stakeholders and the technical team. For whatever reason, when I was resolving conflict between marketing and the technical team, I tended to side with marketing. The techies were upset, but they were good natured and understood that the decisions helped move the project forward. One day though, the dev manager came to me with a serious problem related to a technical decision. I understood immediately that they needed to change course, but the business stakeholders, marketing and sales, and others were against it. They felt that the tech team were making a capricious decision towards new and shiny tech because they had already settled on a different solution.

We had an emergency meeting, and I sided with the tech team. While it seemed like a distraction, it was necessary due to unforeseen tech limitations that had emerged. It would add an extra couple of weeks to the schedule, but it meant we could actually ship with critical features. The tech team was shocked, and came to talk to me afterwards, thanking me. I reminded them that I had told them that some days they would be angry, other days they would be happy with my decisions. They laughed, and we gained mutual trust.

It isn’t fun to have people and groups be upset with you. However, it is part of the role as a product manager to find a path through competing views on how to move forward. A lot of my work is to step in to help companies when the product manager role is frozen with indecision. Consensus builders struggle when there isn’t a compromise, such as the tech issue described above. You can’t go half way with a tech change mid-project. In that case, the conflict averse product manager will side with the most politically powerful. That might be good for self preservation in an organization, at least in the short term, but it means the product itself will suffer.

The idealism of everyone getting along and building consensus can be a difficult one to maintain, especially when there are time and market pressures. Making a decision and disappointing some for the good of the whole is difficult, but necessary at times.

Politics

Often, there are political forces at play, and a product manager may be pressured by others in the organization who wield influence and power over them in various ways. In some cases, the product manager perceives pressure and undue influence when there is none. Their insecurities get the best of them and they start imagining how other groups would prefer they behave, rather than the obvious and positive value creation activities.

“But our team doesn’t have politics,” people say. That is false. When an individual starts a project and transitions from a sole hustler to growing a team, the politics start as soon as you add a second person. The team of two may appear to be in complete agreement, even when there is conflict and decisions are finally agreed to, but add a third person and the rough edges appear immediately.

It is tough and scary to go against the politically powerful. If you can demonstrate the good of the product, and do it with respect and kindness, you will usually win them over. If you don’t, their own ego and insecurities will prevent product success. If they force you out, it was meant to be because you are attached to a losing proposition.

Intimidated by Others

When you’re good at something technical, it requires a lot of effort and a certain amount of special skills. Techies are often elevated by others who lack those skills, and to be honest, it can feel good to be put on a pedestal. It can be easy for our egos to get out of control when people compliment us all the time, and we are known for solving certain kinds of problems better than those around us. The problem with too much of this is that it gets comfortable and we start to let it get to our heads. An unfortunate outcome is to feel superior to others, and then be excessively threatened when someone else comes along who can do some things better than you can.

Technical teams can be full of people who are intimidated by others who have different perspectives, different skills, and different strengths and weaknesses. Often this fear results in mediocre teams full of people who are all the same. (Sadly, tech teams of white males are extremely common in North America at the time of writing this.) Beyond the visible similarities, those with political power often hire people who have the same level or lesser skills than they do. That way they never feel uncomfortable, and are reinforcing their position and ego. Sadly, this fear always results in an inferior product (not to mention the organizational and societal problems that are created and reinforced). Diversity of people, skills, ideas and strengths always results in better products, even though it can be scary when you aren’t the smartest person in the room anymore.

What do you do instead?

  • Try to make a decision, any decision, when you are scared and frozen. A poor decision is much better than no decision.
  • Ignore your biases and stereotypes. Give people and groups the chance to meet or exceed expectations. If you treat them a particular way and expect them to behave in a particular way, they will likely meet those expectations, be they poor or good.
  • Have courage and follow your instincts. You can always fix a poor decision if you are honest and forthright. You can’t fix something you hide from everyone, and you can’t fix being frozen and doing nothing. The market and your customers will move on.
  • Be careful of letting the political environment dictate your product decisions. When the product suffers, the people you are sucking up to politically will lose respect for you, and your credibility will suffer.
  • Get advice from key, trusted people. You can’t know everything, and their perspectives are valuable when informing you. Once in a while, the ideal solution will appear from another source.
  • Build a team of people who are better than you at things. Don’t be intimidated because they excel in areas where you are weaker. Be drawn to them and use their expertise and ideas to help the team and product succeed.
  • Research, research, research. Stay on top of market forces, customer needs, technology solutions and current events. If you get stuck in your ways and always do the same thing because it feels safe, you will miss out and your product will suffer.
  • Do the opposite of what feels comfortable. When you are frozen with indecision, look at the source of your feelings. What do you wish was different to make things easier? How would you react if things were ideal? Now, look at how this situation is preventing you from doing what is comfortable, and do the opposite of what feels comfortable. For example, if you always push for consensus no matter how long it takes, but you’re unable to get agreement from others, make a snap decision to break the logjam.

Frozen Startup Founders

One of the most common frozen situations I run into is with startups that don’t have an official product manager. The founder tends to play that role in addition to everything else they do. This feels natural because the company exists because they had a great software idea. They were able to identify the need, spin the right narrative about it, sell it to investors and potential customers, and attract and motivate a technical team around them to deliver.

However, they have a lot riding on the success or failure of the project. They have huge financial interests and may be so extended financially that they face personal financial ruin if the project fails. There are also reputation and personal dynamics at play.

Startup founders fear the failure of their product and their personal ramifications so much that they are extremely motivated to succeed. To investors, this is ideal. They feel that this person is so motivated, they will succeed no matter what. Sadly, this is often a false impression. The fear can paralyze the person when it comes to making decisions about the product that they are uncomfortable with.

Startup founders that get frozen are often gifted at finding a market niche, addressing that with a product idea, raising money and assembling a team. However, those skills are not necessarily transferable to product management. Here is why:

  • They may not be up on the latest market trends or recent changes with regards to technology or preferences of people they aren’t used to dealing with. (eg. older founders refusing to learn about younger people and social media marketing effectiveness.)
  • They may insist on using technology that they are familiar with, but it might be outdated or unsuited to the current environment. (eg. refusing to create a mobile app for their solution because they can access a website on any device.)
  • They may be held hostage by the technical team. It is incredibly hard to attract tech talent to startups. Startups are risky and don’t pay very much, so it is tempting to make sacrifices to provide them with alternatives. They may grant people senior titles and roles they don’t have the experience for. They might attract desperate people who have mediocre tech skills who are willing to take on the risk and lower pay. They may supplement pay by providing stock in the company, but that is dangerous because it provides techies with a huge amount of power, especially if they are voting shares.
  • They may be held hostage by investors, many of whom are not current with market, tech and other forces, and will have undue influence on the company because they want to see a return on investment.
  • They may have a massive ego that prevents them from learning about new things, listening to others, and respecting talent and skills that they lack.

When startup founders are feeling extreme fear, they often do the following:

  • Double down on what they have already been doing, even though it isn’t working. (The market is rejecting their idea, or the technology isn’t well suited at the time, etc.)
  • Make a desperate move. When this comes from a place of extreme fear, it is usually foolhardy. I have seen startup founders fire people unnecessarily, pull funding from successful efforts, take out massive personal loans, lie, cheat and steal.

Contrary to what pop culture likes to portray and business folklore loves to perpetuate, desperate last minute moves in the face of certain failure do not tend to pay off. Instead, calm, rational, informed decisions can help save the day.

One of the most difficult parts of my job is helping frozen founders. Their fear of failing can be paralyzing, and overcoming it requires them not only to cope with their startup and all of those pressures, but also to work on trust and personal growth.

A good product manager will provide a sober second view of the founder’s dream, while owning that dream of a successful product themselves. Since they don’t have the weight of the world on their shoulders, they can move forward with a lot less fear than the founder is usually capable of.

Product managers are either vital in helping address the paralyzing fear that prevents product success, or they contribute to it and doom the product to mediocrity or even failure.

As the song says, sometimes the best thing you can do when you and your product are frozen is to let it go.

When Product Management Goes Wrong – Part 5 – the Dread Pirate Roberts

In part 1 of this series, we looked at the underminer product manager. In part 2 we talked about the dinosaur. In part 3, we looked at the erratic driver. In part 4, we looked at the micromanager. In this post, we will discuss the Dread Pirate Roberts product manager.

The Dread Pirate Roberts manager threatens people in a misguided attempt to motivate them.

This is the final post in this series on product management anti-patterns. While it seems outrageous, and maybe amusing, I can assure you it is not the least bit funny to the people who suffer under it.

SPOILER ALERT.

In the book (and movie adaptation) The Princess Bride, there is a character called The Dread Pirate Roberts. Another character, Westley, is captured by the Dread Pirate Roberts. Every night, the Dread Pirate Roberts says to him: “Goodnight, Westley. Good work, sleep well, I’ll most likely kill you in the morning.” The Dread Pirate Roberts never does kill him, but this goes on for three years.

END SPOILER.

Dread Pirate Roberts

The character in the movie is fantastic and likeable, but if you analyze their behaviour, much of it is deplorable. This scene is so memorable that I have named this particular anti-pattern after it.

There are some product managers who think this pattern is a good tool. “We can’t let people get too comfortable, and there is nothing like a threat to motivate them, right?” Well no, fear is a terrible motivator. You’ll get a minimal level of compliance, but you will not motivate people at all. In fact, you’ll do the opposite and demotivate people. (You might also end up being disciplined by HR, or by legal authorities if you do this.)

Here is an example.

I was helping a software company with some process tuning and advisory work. I would drop in for a couple of days a week and help them with whatever came up. They had created a new team to work on emerging technology, and they worked independently from everyone else on a brand new product line. It was an experiment to try to jolt some new thinking into a company that was resting on past success a bit too much. They had assembled a young, inexperienced but highly talented and motivated development team. They also had communications, marketing, PR and visual designers from the rest of the company working on the same floor. The office was in a gorgeous brick building that was a former factory and it was warm and inviting. The team used an open environment that practiced hot-desking, meaning no one had assigned space. You arrived, plugged your laptop into a docking station, and claimed that space for your own that day. The next day, someone else might be sitting there, and you would find an alternative.

The development team had staked their claim on one side of the office and found it more productive to sit together. They would pair, diagram, brainstorm or play with nerf guns or have foosball tournaments to ease some stress. The marketing, sales and PR folks claimed another area on the other side of the office. The kitchen was closer to them, and they laid claim to a couple of vintage arcade games for stress release. Everything else in between was a free for all. There was a bit of an empty gap between the teams, but they tended to sit closer or further away depending on what they were working on, and there was a lot of positive energy and good natured joking.

One day that all changed for the worse.

I walked into the office that morning and instantly felt tension and stress. Heads were down, conversations were short and hushed, and the product owner had stopped working in the dev area and set up shop on the other side with the marketing and sales people. The visual designer dragged herself into the office, clearly fighting the flu. Normally she would have spent the day in bed or worked from home. There was nowhere else to sit, so with reluctance, she set up near the developers. A mountain of kleenexes began to build around her and there were copious amounts of hand sanitizer being used by everyone else. She looked nervous and afraid.

When the daily standup started, the individual reports were short, guarded and the usual joking and camaraderie was completely absent.

The product manager had returned that morning from a two week trip out of town visiting various client sites. Obviously, they had done something to throw off the team. Whatever they had done was clearly having a negative effect on productivity. The build server was quiet and when I checked the feed for source control, there wasn’t the usual pace of checkins. When I checked the productivity software, stories weren’t moving through the process like usual.

This was odd because they were a self-organizing team with a strong DevOps practice and they usually pushed several builds to a staging server every day. There was at least one production push per week that included bug fixes, new features and other things that customers and stakeholders were interested in.

Over the next couple of weeks, there were no pushes to production. I monitored source control and build machines, and found code check-ins to have slowed down considerably. Conversations about new designs lacked the creativity they had before, and the product owner would only show up briefly to clarify or discuss an issue, then would scurry back to the other side of the building. Everyone got sick because they were stressed and working too much, and they would struggle to work from home, or drag themselves in to meetings.

I couldn’t get much out of anyone other than “Talk to the product manager”, so I made sure to track him down. I asked what was going on and said that I had noticed some changes with the team. He explained that senior managers weren’t happy with the team’s productivity, so he called a team meeting and threatened them. He told the team that he had picked each of them for this project, and he could easily get rid of them and replace them with other people who were coming off of other projects in the company, or hire new people. He had created this team and he wasn’t afraid of tearing it down and starting over if they didn’t start working faster.

“Are they afraid for their jobs?” I asked.

“They should be.”

Wow.

You can imagine my reaction. I tried to be calm and not ask if he was ****ing crazy, and if he really wanted the product to fail. Well, I did say it, but in much more diplomatic terms. The product manager told me he was directed by senior management to “put the fear of God in the team” and they wouldn’t relent. He understood it was killing productivity and having a negative effect, but he was just doing what he was told.

The product manager felt the approach was drastic, but appropriate. When I also talked with senior management, they said they had no intention of letting people go, but if they feared for their jobs, they would work harder. No matter that people were so stressed out now that they were getting sick, no matter that productivity had plummeted, no matter that some team members weren’t even hiding the fact they were looking for new jobs, no matter that the development team had packed their belongings in boxes, waiting for the inevitable tap on the shoulder from HR. They felt that eventually people would work faster and do exactly what they wanted.

The opposite occurred. Productivity cratered, and while people did get some work done, the creativity and the unique, amazing solutions disappeared. (There go some of your product differentiators.) After advising that this was abusive and not a suitable way to treat people, I left the project. It seemed that leadership were far more interested in feeling in control than creating a great product and having a productive, happy team. Their egos were stroked by controlling people, so that trumped everything. (This is likely why the Dread Pirate Roberts behaved the way he did in the movie, threatening and toying with a captive.)

As you can imagine, the product didn’t succeed. Yes they managed to force compliance, but they created a weak imitation of their competitors’ offerings and it went in the market. People on the team left the project and most eventually left the company for more suitable working environments. (ie. non-abusive ones.)

As a team member, if you feel a constant threat hanging over you, you are going to feel fearful and not at your best. All this brain power is needed to deal with your fear and worry about what might happen instead of using that brain power for problem solving. It is horrible to experience. In fact, it is a form of abuse. People will respond to abuse in non-deterministic ways. ie. you have no idea how they will react, but it won’t be good. It won’t be good for the individuals, the team, the product or the company. It won’t be good for you either, if you are the abuser.

Threats and fear are tools that will enforce some level of outward compliance, but they are terrible motivators. Managers who think that they will increase productivity or get products or features out the door faster if they use empty threats are abusing their colleagues. Much like in the movie The Princess Bride, they may feel a sense of control and justify using empty threats. That was a movie, though. In real life, pirate behaviour is abuse at best, and criminal at worst.

The Dread Pirate Roberts:

  • Makes threats, either explicit or implicit
  • Never follows through on said threats
  • Believes in antiquated management theory (ie. that fear is a good motivator)
  • Manipulates people to get their way
  • Changes their mind all the time so you never feel like you can make them happy
  • May deny making the threats when confronted, expecting you to doubt your own version of events. (Also known as gaslighting.)
  • Is likely a psychopath, or is directed by one

In another case, I witnessed an accidental Dread Pirate Roberts, but the outcome was the same: demoralized staff who weren’t as productive as they could be.

I had worked with the CTO of a rapidly growing startup at a couple of other organizations. We weren’t close, but we had mutual respect for each other. He told me that their product teams needed some help, but wanted to make sure there was a good fit between me and the VP of Product at his current venture. The VP of Product was a big, good natured man, with full sleeve tattoos and a shaved head with a large beard. He looked intimidating, even though he was kind hearted. He played up the look, wearing black t-shirts, designer work boots and a wallet on a chain. He joked that he was the king hipster of the company. He had a loud voice and a loud laugh, and because he had a nagging shoulder injury, he frequently crossed his arms when talking or listening.

The office, like many software companies, was dog friendly, and he brought his Rottweiler in to work every day. The dog was large and imposing, but quiet and would stare at people. Since it didn’t have much of a tail, it was hard to tell if it was happy to see you or not. It silently followed him around. It was an intimidating creature, especially if you weren’t used to dogs.

Prior to working at this company, I had heard rumours of this “big biker type VP” who would walk around the office with his dog yelling at people. I chalked it up to campfire project horror stories that programmers tell to scare each other. When I met with him and other leaders, I didn’t find him intimidating at all. He was open minded, kind hearted and quick to laugh. Reputations usually have a grain of truth to them though, so I was wary.

It seemed like a good fit, so I was brought in initially to help with User Experience (UX.) The UX folks had done a fantastic job, but the VP of Product had decided to forgo usability testing prior to shipping. The CTO asked me to come in and do an expert review and a heuristic evaluation. My work product was a report, and I made a few recommendations. I was careful to back up all the recommendations with at least two citations from UX experts. The UX team loved the report (I independently verified their concerns), but I was a bit worried when a nervous CTO told me to be careful, the VP of Product might need to be “treated delicately”.

The next morning, the VP of Product marched into my office, with his dog right behind him. His huge frame filled my doorway, and he threw a printed copy of my report on my desk and started yelling at me. His dog peered from behind him, and it felt like the dog was scowling at me too.

Picture in your mind a large, slightly heavyset man with a trucker hat, large framed glasses, Harley Davidson t-shirt, full sleeve tattoos with crossed arms yelling at you. His mouth was open wide and his large beard was wagging. Not to mention both he and his large, intimidating dog are blocking your only exit.

The horror stories were true! I was witnessing it in person. I could hardly believe it and I almost burst out laughing at the absurdity of it.

I can’t remember everything he said, but I got the gist of it: my report was garbage and he was going to rip up my contract and send me on my way. Once I got over my initial shock, I told him that his behavior was unacceptable, and I wasn’t going to put up with it. That escalated the situation, so I invited him in (so he wasn’t blocking my exit) and I asked if he would sit down. He declined, but he did at least come into the office and loom over me. I was rescued by a call on his mobile (which he took and talked at length in front of me) and that seemed to calm him down. When he returned to me, I distracted him by asking about his tattoos. Eventually, we found some common ground, had an awkward conversation about tattoos and then he picked up the report he had thrown on my desk and walked out. The dog gave me a look over its shoulder as it trotted after him. I almost felt like it was trying to get in the last word and give me one last scolding.

I’m not easily intimidated, but it was an incredibly unsettling experience. As a consultant, I had a lot of power and could walk away. What about employees who weren’t used to people who dressed and acted this way? What about subordinates or others who felt they had to do what he said to keep their jobs? This would be frightening to endure.

I immediately walked to the CTO’s office and told him what happened and that behavior was completely inappropriate. I said I would not work in an environment like this. He agreed, and asked what we should do. I said we needed to meet with the VP of Product and see if there was a way we could deal with conflict in a more healthy way.

The next morning I met with both the CTO and a now contrite VP of Product. We explained how his behaviour was intimidating, and that the threats, no matter how empty they might be, were distressing and abusive. We also explained that having his dog around while he behaved like this made the situation worse. It was bad enough for us dog lovers. People from cultures that aren’t as obsessed with dogs as North Americans are found it absolutely terrifying. He was shocked and visibly upset. He said he knew he had a temper problem, and that he was working on it. He said he had become angry because he felt that my feedback had criticized the product, and it was his heart and soul, so he felt personally threatened. We explained that my job was to help the product be better, and not to tear anything down, and I asked if there was anything I could do to help with delivering the information so it felt less threatening. He said he had overreacted, apologized and asked for help.

As we talked, he told us that his father was a successful entrepreneur, and was a big “Theory X Management” believer. His father felt that employees are inherently lazy, and need proper motivation to work for you. That upbringing had rubbed off on him, so he resorted to it at times.

In his father’s factory, in another era, it had worked. (Or so he claimed.) However, software companies tend to be egalitarian, and technical people prefer meritocracies. You might be able to yell and scream and get your way, but it isn’t the only factory in town. They can leave. Conversely, Theory Y management works well in knowledge fields, where you assume people are intrinsically motivated, and with proper encouragement will work hard and thrive. He mimicked his father’s management style, and when you combined that with his large size, loud speaking, intimidating mannerisms (not to mention his large, intimidating sidekick dog), he tended to get his own way a lot.

He was asked to leave his dog at home for a while during office hours, and to not behave aggressively because of the way it made team members feel. If he felt threatened or angry, the CTO said to come into his office and talk to or yell at him first, if needed. Once the emotion was vented, he should then have the uncomfortable conversation with a peer or subordinate.

At first there was a dramatic improvement, and he really tried hard to adjust and grow personally. Eventually, he was able to bring the dog back to the office, but didn’t bring the dog with him to meetings or when he wanted to confront someone.

I would love to tell you that this story had a happy ending, but it doesn’t. Even though he said the right things, and seemed to sincerely want our help, he quickly grew to resent it. As time went on, the behaviour came back as his attitude soured. He had trouble taking responsibility for his actions, and while he would feel terrible after each outburst, he would rationalize it. He said that people knew that he didn’t mean it when he threatened them anyway, it was just his style. But not following through on threats is even worse than following through with something drastic. At least people know where they stand and what to expect. Furthermore, empty threats cause people to lose respect for you and your ability to tell the truth.

He was eventually forced out of the company, and it took a terrible toll on him afterwards. This was a case of someone really needing to grow up; it didn’t seem like the behaviour was intentional. His motivations aside, the company rightfully saw him as a liability, and got rid of him. Sadly “the Veep with the dog who yells at people” story was spread by every employee who left the company. (And for a while, there were a lot of people who did.)

I don’t have much to say about how to deal with the Dread Pirate Roberts product manager, other than to run away when you encounter it. Unlike the movie (or book by Goldman), I have yet to witness a happy ending. (Sorry for the spoiler.) I have seen this sort of abuse many times, and it never ends well for anyone. If you’re a product manager and you ever feel the need to threaten someone, don’t. Just don’t. Stop yourself and do something to distract yourself to get yourself under control. Take a walk. Count to ten. Drink a glass of water. Go home for the day and collect your thoughts. If you do this regularly, get some counselling and take courses on managing people.

Bottom line: this behaviour is abusive and you should never do it, even if directed to by someone else. It isn’t worth the cost to the person you threaten, and you’ll not only destroy the people who you inflict it on, you will eventually destroy yourself and your reputation.