Beautiful Testing – II

Here’s the first bit of my chapter of Beautiful Testing. I’d be interested in your thoughts …

Peeling the Glass Onion at Socialtext
“I don’t understand why we thought this was going to work in the first place” – James Mathis, 2004

It’s not business … it’s personal
I’ve spent my entire adult life developing, testing, and managing software projects. In those years, I’ve learned a few things about our field:

(1) Software Testing, as it is practiced in the field, bears very little resemblance to how it is taught in the classroom – or even described at some industry presentations
(2) There are multiple perspectives on what good software testing is and how to do it well, which means –
(3) There are no ‘best practices’ – no single way to view testing or do it that will allow you to be successful in all environments – but there are rules of thumb that can guide the learner

Beyond that, in business software development, I would add a few things more. First, there is a sharp difference between checking[1], a sort of clerical, repeatable process to make sure things are fine, and investigating – which is a feedback-driven process.

Checking can be automated, or, at least, parts of it can. With small, discrete units, it is possible for a programmer to select inputs and automatically compare the actual outputs to the expected ones. When we combine those units we begin to see complexity.

Imagine, for example, a simple calculator program that has a very small memory leak every time we press the clear button. It might behave fine if we test each operation independently, but when we try to use the calculator for half an hour it seems to break down without reason.

Checking cannot find those types of bugs. Investigation might. Or, better yet, in this example, a static inspector looking for memory leaks.
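Here is a rough sketch of that calculator, in Python. Everything in it is invented for illustration: the `LeakyCalculator` class, its `_history` list, and the helper functions are hypothetical names, and because Python manages memory for us, the "leak" is simulated as a list that `clear()` keeps appending to and never empties. The two per-operation checks pass, yet the retained data grows with every press of clear.

```python
class LeakyCalculator:
    """Hypothetical calculator that retains a little data on every clear()."""

    def __init__(self):
        self.value = 0
        self._history = []  # stands in for memory that is never released

    def add(self, n):
        self.value += n
        return self.value

    def clear(self):
        # The bug: the old value is stashed away and never discarded.
        self._history.append(self.value)
        self.value = 0
        return self.value

    def retained(self):
        """Number of stale entries still held after all the clears so far."""
        return len(self._history)


def check_add():
    """A check of one operation in isolation: passes."""
    calc = LeakyCalculator()
    return calc.add(2) == 2


def check_clear():
    """Another isolated check: clear() returns the display to zero. Passes."""
    calc = LeakyCalculator()
    calc.add(5)
    return calc.clear() == 0


def retained_after(presses):
    """Simulate a long session and report how much data was never released."""
    calc = LeakyCalculator()
    for _ in range(presses):
        calc.add(1)
        calc.clear()
    return calc.retained()


if __name__ == "__main__":
    print(check_add(), check_clear())  # both isolated checks pass
    print(retained_after(10_000))      # one stale entry per press of clear
```

Neither check looks at what the calculator holds on to, so neither can fail for that reason. Only the long session, or a tool that inspects memory directly, shows the growth.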

And that’s the point. Software exposes us to a variety of risks. We will have to use a variety of techniques to limit those risks. Because there are no “best practices”, I can’t tell you what to do, but I can tell you what we have done, at Socialtext, and why we like it – what makes those practices beautiful to us.

This positions testing as a form of risk management. The company invests a certain amount of time and money in testing in order to get information – which will decrease the chance of a bad release. There is an entire business discipline around risk management; insurance companies practice it every day. It turns out that testing for its own sake meets the exact definition of risk management. We’ll revisit risk management when we talk about testing at Socialtext, but first, let’s talk about beauty.

Tester remains on-stage; enter beauty, stage right
Are you skeptical yet? If you are, I can’t say I blame you. To many people, the word “testing” brings up images of drop-dead simple pointing and clicking, or following a boring script written by someone else. It’s a simple job, best done by simple people who, well, at least you don’t have to pay them much. I think there’s something wrong with that.

Again, that isn’t a picture of critical investigation – it’s checking. And checking certainly isn’t beautiful, by any stretch of the word. And Beauty is important.

Let me explain.

For my formative years as a developer, I found that I had a conflict with my peers and superiors about the way we developed software. Sometimes I attributed this to growing up on the East Coast vs. the Midwest, and sometimes to the fact that my degree was not in Computer Science but Mathematics[2]. So, being young and insecure, I went back to school at night and earned a Master’s Degree in Computer Information Systems to “catch up”, but still I had these cultural arguments about how to develop software. I wanted simple projects, whereas my teammates wanted projects done “right” or “extensible” or “complete.”

Then one day I realized: They had never been taught about beauty, nor that beauty was inherently good. While I had missed a class or two in my concentration in computer science – they also missed something I had learned in Mathematics – an appreciation of aesthetics. Sometime later I read Things a Computer Scientist Rarely Talks About by Dr. Donald Knuth, and found words to articulate this idea. Knuth said that mathematicians and computer scientists need similar basic skills: they need to be able to keep many variables in their head, and they need to be able to jump up and down a chain of abstraction very quickly to solve complex problems. According to Knuth, the mathematician is searching for truth – ideas that are consistently and universally correct – while the computer scientist can simply hack a conditional[3] in and move on.

But mathematics is more than that – to solve any problem in math, you simplify it. Take the simple algebra problem:

2X – 6 = 0

So we add six to each side and get 2X = 6 and we divide by two and get X=3. At every step in the process, we make the equation simpler. In fact, the simplest expression of any formula is the answer. There may be times when you get something like X=2Y; you haven’t solved for X or Y, but you’ve taken the problem down to its simplest possible form and you get full credit. And the best example of solving a problem of this nature I can think of is the proof.

I know, I know, please don’t fall asleep on me here or skip down. To a mathematician, a good proof is a work of art – it’s the stuff of pure logic, distilled into symbols[4]. Two of the highest-division courses I took at Salisbury University were number theory and the history of mathematics from Dr. Homer Austin. They weren’t what you would think. Number theory was basically re-creating the great proofs of history – taking a formula that seemed to make sense and proving it was true for the value one. Then you prove that if it is true for any value N, it is also true for N+1 – which means the next one is true, which means … you get it. That’s called proof by induction. Number theory was trying to understand how the elements of the universe were connected – such as the Fibonacci sequence – which appears in nature on a conch shell – or how to predict what the next prime number will be, or why Pi shows up in so many places.
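To make the induction pattern concrete, here is a short worked proof written in LaTeX. The example is my own choice for illustration, not a reconstruction of anything from the course: the claim that the first n terms of 1/2 + 1/4 + 1/8 + … add up to 1 − 1/2^n.

```latex
\textbf{Claim.} For every integer $n \ge 1$,
\[
  \sum_{k=1}^{n} \frac{1}{2^{k}} = 1 - \frac{1}{2^{n}} .
\]

\textbf{Base case ($n = 1$).}
\[
  \sum_{k=1}^{1} \frac{1}{2^{k}} = \frac{1}{2} = 1 - \frac{1}{2^{1}} .
\]

\textbf{Inductive step.} Assume the claim holds for some $n \ge 1$. Then
\begin{align*}
  \sum_{k=1}^{n+1} \frac{1}{2^{k}}
    &= \left( 1 - \frac{1}{2^{n}} \right) + \frac{1}{2^{n+1}} \\
    &= 1 - \frac{2}{2^{n+1}} + \frac{1}{2^{n+1}} \\
    &= 1 - \frac{1}{2^{n+1}} ,
\end{align*}
which is the claim for $n + 1$. By induction, it holds for every $n \ge 1$.
```

Each line of the inductive step is simpler than the one above it, and because 1/2^n shrinks toward zero as n grows, the sum approaches 1.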

And, every now and again, Dr. Homer Austin would step back from the blackboard, look at the work, and just say “Now … there’s a beautiful equation.” The assertion was simple: Beauty and simplicity were inherently good.

You could tell this in your work because the simplest answer was correct. When you got the wrong answer, your professor could look at your work and show you the ugly line – the hacky line – the one line that looked more complex than the one above it. He might say “Right there Matt – that’s where you went off the rails[5].”

By the end of the semester, we could see it too. For that, I am, quite honestly, in his debt[6].

Of course, you can learn to appreciate beauty from any discipline that deals in abstraction and multiple variables. You could learn it from chess, or chemistry, aerospace engineering, or music and the arts[7]. My experience was that, at least in the 1990s, it was largely missing from computer science. Instead of simplicity, we celebrated complexity. Instead of focusing on value to customers, more senior programmers were writing the complex frameworks and architectures, leaving the junior developers to be mere implementers. The goal was not to deliver value quickly but instead to develop a castle in the sky. We even invented a term, “gold plating”, for when a developer found a business problem too simple and had to add his own bells and whistles to the system, or, perhaps, instead of solving one problem and solving it well, could create an extensible framework to solve a much larger number of generic business problems.

Joel Spolsky[8] would call this person an “architecture astronaut”, in that they get so abstract, they actually “cut off the air supply” of the business. In the back of my mind I could hear the voice of Doctor Austin saying “right there – there – is where your project went off the rails.”

Ten years later, we’ve learned a great deal. We have a growing body of knowledge of how to apply beauty to development – O’Reilly even has a book on the subject. But testing – testing is inherently ugly, right? Aside from developer-facing testing, like TDD, testing is no fun at best and rather – have – a – tooth – pulled – with – no – anesthetic at worst, right?

No, I don’t think so. In math we have this idea of prima facie evidence – that an argument can be true on its face and not require proof. For example, there is no proof that you can add one to both sides of an equation – or double both sides – and the equation remains true. We accept this at face value – prima facie – because it’s obvious. All of our efforts in math build on top of these basic prima facie (or “axiomatic”) arguments[9].

So here’s one for you: Boring, brain-dead, gag-me-with-a-spoon testing is /bad/ testing – it’s merely checking. And it is not beautiful. One thing we know about ugly solutions is that they are wrong; they’ve gone off the rails.

We can do better.

[1] My colleague and friend, Michael Bolton, is the first person I am aware of to make this distinction, and I believe he deserves a fair amount of credit for it.
[2] Strictly speaking, I have a Bachelor’s degree in Mathematics with a concentration in Computer Science. A concentration is more than a minor but less than a major, so you could argue that I’m basically a dual major – or argue that I’m not quite either one. The upshot of that was that I never took compiler construction, and, because of that, had an inferiority complex that fueled a massive amount of time and energy into learning. Overall, I’d say it could be worse.
[3] “Conditional” is a fancy word for an IF/THEN/ELSE statement block.
[4] I am completely serious about the beauty of proofs. For years, I used to ask people I met with any kind of mathematics background what their favorite math proof was. Enough blank stares later and I stopped asking. As for mine, I’m stuck between two: my favorites are the proof of the limit of the sum of 1/2^N for all positive integers, or Newton’s proof of integration, take your pick. (Rob Sabourin is one notable exception. I asked him his favorite, and he said he was stuck between two …)
[5] No pun on Ruby intended. I am a Perl hacker.
[6] That, and Dr. Kathleen Shannon, Dr. Mohammad Mouzzam, Professor Dean Defino, and Professor Maureen Malone.
[7] My co-worker and occasional writing partner, Chris McMahon, has a good bit to say about testing as a performing art. You should check out … oh, wait, he left Socialtext and has his own chapter. All right, then.
[*] I am a member of the context-driven school of software testing, a community of people who align around such ideas, including “there are no best practices”.

7 comments on “Beautiful Testing – II”

  1. Matt,

    I concur with your post. In its purest form, testing is about truth. Following the scientific method we tend to:
    1) Define the question – What can I look at or test, and why do I want to?
    2) Gather information and resources (observe) – via oracles such as our experience, requirements etc…
    3) Form hypothesis – This is our test scenario, when we do x, we expect y
    4) Perform experiment and collect data – Execute the test, noting all observations
    5) Analyze data – both in isolation and with the bigger picture
    6) Interpret data and draw conclusions that serve as a starting point for new hypothesis – Is this what I expected? Did I see something I wasn't testing for? This is where the ideation of defects begins to occur, but also suggestions for enhancements. Not to mention, questions begin to form of whether or not something else should be tested based on these observations. This is the power of rapid software testing. Allowing observations and oracles to guide the effort in a systematic way.
    7) Publish results – via both defect reports and test results/findings
    8) Retest (frequently done by other scientists) – While we may or may not retest to verify what we observed, and unless the software is never used, it most certainly will be retested by the end users.

    *I prefer the term observation to defects as it more accurately reflects notable observations that are not captured in expectations of a test result.

    The beauty in testing is in the thinking, not the checking.

  2. So weird. I was just explaining to a tester on my team how distilling good acceptance tests was similar to simplifying equations.

    On a technical note, I found a couple of typos. 🙂

  3. First, my favorite proofs are the divisibility proofs from modern algebra. They proved:

    1) A number is divisible by three if the sum of the digits is divisible by three.
    2) A number is divisible by nine if the sum of the digits is divisible by nine.

    They're dead simple when you understand all the concepts leading up to them… but that's what made them beautiful.

    Second, I think what you have is pretty correct but would argue that, in most schools at least, they teach nothing about beauty in computer science [1]. Rather, computer science is about understanding concepts and algorithms that are deemed important.

    Misko Hevery recently created a blog entry that compares computer scientists vs. computer engineers. I believe his assertions are true. It takes engineering skill to understand and love the beauty of a simple extensible implementation (or one that can easily be refactored to be extensible if later needed).

    [1] Some schools probably make an effort to teach the students the difference between clean, beautiful code, and code that isn't… but I haven't yet found one.

  4. Thank you, Kaleb. Yes, part of the point was that the reason I learned Beauty was because of my /math/ degree. That it is absent in CS is, I believe, well … sad.

  5. "Checking can not find those types of [memory leak] bugs."

    Not exactly true. I run my unit and integration tests in Valgrind, which detects the tiniest memory leak in my code. So, checking can detect these bugs. Better yet, some unit testing frameworks even have leak detection built into them.

  6. Fair enough. I'm always interested in automating checking-work away. But I suspect the number of dev teams using tools like that is under 10%.
