What happens when this stuff doesn’t work?

Thanks to the internet, people in software delivery seem to actually remember the last decade or two of history, and are less bound to repeat and reinvent it.

Yet the time before that is largely forgotten.

Edward Yourdon’s case is particularly interesting. His book Decline and Fall of the American Programmer, while not wrong, was perhaps four decades too early. I find Death March, his book about projects undertaken against impossible odds, to be his capstone work. In Death March, instead of predicting the future, Yourdon just wrote about what he knew, and he knew a fair bit about a fair number of things that were fairly relevant to my work experience. Perhaps he was better off sticking to what he knew instead of what he predicted.

Sadly, Yourdon did make one major blunder. He completely overreacted to Y2K: the fear that, because computers stored years as two digits, everything would stop working on January 1, 2000.

Now, I’m not saying that the advice to focus on the problem and fix it was bad. That was solid advice. What Yourdon got wrong was the extent of his preparation advice. I’m talking serious prepping: keep a few cases of Meals Ready to Eat (MREs) in your basement, stock guns and ammo, buy silver and gold, get a used car old enough to have no computer chips in it, build a bunker in the woods, power it with a diesel generator and a large diesel storage tank, put a fence around the bunker, dig a well at a natural spring that does not rely on electricity, and so on. I’m really not exaggerating; he published a great deal of material, including a Y2K home preparation guide. In some ways, you might say, he profited off fear.

Don’t get me wrong. I believe Mr. Yourdon had a high degree of personal integrity. He was trying to help people. He was just … wrong. After Y2K turned out to be essentially a non-event, Yourdon published a second edition of Death March and quietly retired to New York to take pictures. When my writing got serious in 2012, he honored me by sending me a connection request on LinkedIn. Sadly, Ed Yourdon passed away in 2016.

The one thing I have mixed feelings about is that principled, and sadly wrong, stand he took on Y2K. Mixed, because I feel called to make a similar stand about the use of AI in software development and, in particular, testing. While I possess neither the age, the gravitas, nor the financial position of Ed Yourdon in 1998, I am secure enough to take some risk. So, if I may, I’m going to go with a quote from James Whittaker here:

“This Shit’s Never Gonna Work”

The Whittaker video is especially interesting: you can watch first the technology demo he was reacting to, then his commentary on it, then his commentary on the future of test. You don’t have to watch the whole thing; just click the link and watch for fifteen minutes.

One of the reasons the video is great is that, like Yourdon, Whittaker took a bold stand: that the future of testing was test sourcing and reusable test assets. That vision hasn’t really been borne out. As it turned out, if we could reuse the tests, we probably could just reuse the software as well. Yet James Whittaker managed to have a dozen more years in software before moving on in 2021 to start a brewery, bar, and grille in Seattle.

So, again, this shit’s never gonna work.

Allow me to be specific.

We have this idea that there will be automated tools that can be given a website, test it, and generate results for us. Likewise, that we can give tools plain English and they will generate a website or mobile app for us. Better yet, we will have multiple agents: a dev agent that talks back and forth with a test agent, which then passes the code to an ops agent, which puts the code in production.
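For the sake of concreteness, here is a minimal sketch of what that promised pipeline would look like. Every name in it (DevAgent, TestAgent, OpsAgent, the hand-off loop) is my own hypothetical illustration, not any real product’s API:

```python
# A hypothetical sketch of the promised "agent pipeline."
# None of these classes exist in any real library; this only
# illustrates the claimed workflow.

class DevAgent:
    def write_code(self, requirement: str) -> str:
        # Supposedly: prompt an LLM, get back working source code.
        return f"# code generated for: {requirement}"

class TestAgent:
    def review(self, code: str) -> bool:
        # Supposedly: generate tests, run them, report pass/fail.
        return "generated" in code

class OpsAgent:
    def deploy(self, code: str) -> None:
        # Supposedly: ship straight to production, no human in the loop.
        print("deployed:", code)

def pipeline(requirement: str) -> None:
    dev, test, ops = DevAgent(), TestAgent(), OpsAgent()
    code = dev.write_code(requirement)
    while not test.review(code):   # dev and test "talk back and forth"
        code = dev.write_code(requirement)
    ops.deploy(code)

pipeline("a login page with password reset")
```

Notice what the sketch hides: all of the interesting work lives inside those three method bodies, and that is precisely the part nobody has actually delivered.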

Once more: this isn’t going to work.

Now, it is entirely possible that for some of the simplest Create-Read-Update-Delete (CRUD) software, with a clearly defined database, you might be able to generate some Ruby on Rails code, or something like it. For sure. But think back to any moderately complex piece of software you ever programmed. Didn’t you have questions? Didn’t the questions cause you to send an email, stop work, and go work on something else? The best agile shops I have ever worked at decreased this significantly with a kickoff meeting. I suppose, on the best day, we could have a back-and-forth conversation with AI that answers the questions, and out pops the software. Sadly, that concept of a “conversation” is not really what is happening when we interact with the LLM.
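To be fair about what “simplest” means here, this is the sort of mechanical code a tool plausibly could generate from a clearly defined schema. A minimal sketch in Python, using the standard-library sqlite3 module; the users table is made up for illustration:

```python
import sqlite3

# Mechanical CRUD code of the kind a tool could plausibly generate,
# given a clearly defined schema. The "users" table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

def create_user(name: str) -> int:
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid

def read_user(user_id: int):
    return conn.execute("SELECT id, name FROM users WHERE id = ?",
                        (user_id,)).fetchone()

def update_user(user_id: int, name: str) -> None:
    conn.execute("UPDATE users SET name = ? WHERE id = ?", (name, user_id))
    conn.commit()

def delete_user(user_id: int) -> None:
    conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
    conn.commit()

uid = create_user("Ada")
print(read_user(uid))  # (1, 'Ada')
```

Code like this is nearly a pure function of the schema, which is exactly why generation works here and breaks down the moment the requirements have questions attached.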

To some extent, Large Language Models (or “LLMs”) are able to spit out code. That code needs to be understood, grounded, and adapted. Even in the rare situation where the new code radically speeds up the programmer, once we get to a certain speed, our time will be bound by something else, such as the answers to those questions I mentioned earlier. Nearly forty years ago, Fred Brooks pointed out that software development consists of a large number of very different activities (planning, design, coding, test, ops) and that even if we made one of those things free, the overall savings would be modest, perhaps in the 20% range, because we still have to do all the other stuff; if coding is only a fifth of the total effort, making coding free saves at most a fifth. Now, if you are your own product owner, working on a video game, or have deep subject matter expertise, perhaps you can compress some of that down. Realistically, though, in that case you’d have to have deep expertise and essentially have done the task before.

Likewise, there are some impressive gains from programmers using Large Language Models to do Test Driven Development (TDD). That is, the programmers write the test, and the tool writes the code to implement it. Sadly, though it is technically superior, TDD never really took off for programmers, likely because it requires a kind of intense personal discipline. If a movement comes up that tries to layer AI on top of TDD, it could have some success (the method could work), but I doubt the movement will have broad commercial success.
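For readers who have not seen that workflow, here is TDD-with-an-LLM in miniature. The human writes the test first; the tool is asked to produce code that makes it pass. The slugify example below is my own toy illustration, not taken from any study or product:

```python
# TDD with an LLM, in miniature. The human writes this test first:

def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  Already--slugged  ") == "already-slugged"

# ...and the tool is asked to write code that makes it pass.
# A plausible generated implementation:

import re

def slugify(text: str) -> str:
    text = text.strip().lower()
    text = re.sub(r"[^a-z0-9]+", "-", text)  # collapse non-alphanumerics
    return text.strip("-")

test_slugify()
print("tests pass")
```

The discipline is all on the human side: the tests have to be written first, and they have to be good, and that is the part that never took off.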

Bottom line: once you understand how LLMs actually work, and understand the human process to create and test software, the idea that we will have magical software created by typing a few sentences on a screen is kind of silly. There may be some proofs of concept; you might be able to generate a web page that is a starting point to edit. Some programmers of very structured applications will be able to create code. But until the technology radically improves (a big step up, on the order of the jump from Google search to ChatGPT), these ideas simply will not work.

It is possible you do lose that promotion to the new political kid who talks a great AI game. The problem is, he will be blowing smoke. Somewhere between six and thirty-six months from now, the project will end, the AI team will wind down, and people will try to pretend the malinvestment didn’t happen.

Or maybe I’m wrong.

This is how I see the land of the AI tester-verse. I have tried to lay it bare as I understood it.

If y’all want to go back to actually talking about testing now, I’d like that.

 
