It took me all month to realize that in each case I’d made the same stupid testing mistake: not enough dumb luck.
|I know what you’re thinking, punk. You’re thinking “I’ve tested every possible permutation.” You’ve gotta ask yourself a question: “Do I feel lucky?” Well, do ya, punk?|
Being quality-minded, before considering my work “finished” I had of course written unit tests for a bunch of different cases that I thought would best stress the code. I wrote a few standard cases, edge cases, cases around data limits, cases around bad input, and cases that I thought really pushed my algorithms to their limits. With all of my test cases passing, I confidently handed off the code for real-world use.
Somehow the real world managed to create data states and logic paths that my tests had not exercised, even though I thought I’d been very clever about getting all possible cases into my test suite.
The real world is like that sometimes. Most of the time.
The Problem Behind The Problem
It so happens that in writing these three bits of code I was being about as smart as I know how to be. That is, writing the code required 100% of my IQ.
But figuring out tests that mimic all the possible real-world situations and permutations and complications requires more smarts than writing the code does. Maybe, I dunno, 27.3% more. No matter how hard I try, I’m just not 27.3% smarter than myself.
Solving the Problem (by adding a lot more Problems)
Working alone, I have no one to lean on who is smarter than me. But I do have a pretty good random number generator, and I know how to use it. So here’s what I did in each of the three cases.
First, for each of my projects I wrote test suites that ran the tests against tons of generated data and queries. Whereas the original tests might have had something like a dozen well-designed cases, the new random tests might run 2 million. This was the easy part.
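The “easy part” can be sketched as a small loop: generate a random case, run the code under test, repeat a huge number of times. Everything here is a hypothetical stand-in — `run_one_case` and the data shape are not from any real project, and the case count is scaled down from the millions mentioned above so the sketch finishes quickly.

```python
# Sketch: wrapping code under test in a generator of random inputs.
import random

def run_one_case(data, query):
    # Hypothetical stand-in for the real code under test:
    # count how many items fall below a cutoff.
    return sum(1 for x in data if x < query)

def random_case(rng):
    # Vary size, value range, and duplication -- combinations that a
    # dozen hand-picked cases are unlikely to cover.
    n = rng.randrange(0, 50)
    data = [rng.randrange(-1000, 1000) for _ in range(n)]
    return data, rng.randrange(-1100, 1100)

rng = random.Random(0)
for _ in range(100_000):  # the real runs used millions of cases
    data, query = random_case(rng)
    result = run_one_case(data, query)
    # Cheap sanity bound; full result validation is the hard part.
    assert 0 <= result <= len(data)
```

Note that this loop by itself only checks a trivial invariant; deciding whether each result is actually *correct* is a separate problem.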
Second (the hard part): since the data was now random, I wouldn’t know ahead of time what the correct results should be, so I had to determine at runtime whether the results were correct. This required writing more code than was in the original libraries just to evaluate the results. Fortunately this new result-validation code didn’t need to be fast or clever, just accurate, so it could be written so simplistically that even someone like me would have a hard time getting it wrong. (As a bonus, I stored the random input in a temporary file on each run, so if a case did fail I could see exactly what input had caused the failure.)
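Here is one way the validation idea can look in practice: a deliberately naive reference implementation serves as the oracle, and each random input is dumped to a temporary file before the check runs, so a failure (or crash) leaves the offending case on disk. The function names and the binary-search example are illustrative assumptions, not the code from my actual projects.

```python
# Sketch: naive-oracle validation plus input capture for replay.
import json
import os
import random
import tempfile

def fast_search(haystack, needle):
    # Stand-in for the "clever" code under test: binary search for
    # the first occurrence of needle in a sorted list, or -1.
    lo, hi = 0, len(haystack)
    while lo < hi:
        mid = (lo + hi) // 2
        if haystack[mid] < needle:
            lo = mid + 1
        else:
            hi = mid
    return lo if lo < len(haystack) and haystack[lo] == needle else -1

def naive_search(haystack, needle):
    # The oracle: slow, obvious, and hard to get wrong.
    for i, value in enumerate(haystack):
        if value == needle:
            return i
    return -1

dump_path = os.path.join(tempfile.gettempdir(), "last_random_case.json")
rng = random.Random(42)
for _ in range(10_000):
    n = rng.randrange(0, 40)
    data = sorted(rng.randrange(-10, 10) for _ in range(n))
    target = rng.randrange(-12, 12)
    # Record the input before checking, so even a crash leaves it behind.
    with open(dump_path, "w") as f:
        json.dump({"data": data, "target": target}, f)
    assert fast_search(data, target) == naive_search(data, target)
```

The point of the design is that the two implementations fail independently: the oracle is too dumb to share the clever code’s bugs, so any disagreement flags a real problem in one of them.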
With millions of random cases running, every error that had been reported (and more) quickly popped up, and I could squash them just as quickly. I released my code again, and this time no error reports came back.
I know a lot of people dislike random testing because it is, almost by definition, not reproducible. Those people must be smart enough to devise non-random tests that exercise every part of their code--I’m not.
So in the end I didn’t have to be smarter than myself to test my own code. I just needed to make lots of my own dumb luck!
Today’s Takeaway: When you’re done writing all your best test cases, add a few million random cases for good measure. If you’re anything like me, you’re not smart enough to think of a test for everything.