The place for #ChatGPT

Two weeks ago I took my first, presumptive look at #ChatGPT for testing. The app has real potential, and seems to have some ability to learn. It’s tempting to run ChatGPT through a few exercises and come to a conclusion. Really, the software needs a bit of a deep dive before anyone can come to any significant conclusion.

If you’re not going to do that, but you want to outsource it, well, read on.
But first, I digress. (It’s worth it.)

The Bluffer’s Guide

When I was in high school my mother purchased a series of books called something like “The bluffer’s guide.” Each would have a title worded “The bluffer’s guide to X”, where X might be golf, wine, the stock market, computer programming, or skiing. Each book was very thin, perhaps 30 pages, and very small – perhaps three inches by five inches in size. A quick reader could digest one of these books in about an hour. Each book would teach you how to convey the idea that you had much more expertise in the subject than you actually did. The computer programming book, for example, suggested that you find out what programming language the other person knew. If they knew C++, you could say you were a Fortran programmer, and if they said Fortran, you could say C++. The finance book said that if you were ever pressed on a recommendation, you could say “I’m holding off a bit to see what the Fed does next”, as the Federal Open Market Committee (FOMC) meets about every six weeks, and higher interest rates will be bad for the stock market, while lower interest rates will hurt the economy.

As I read, I remember coming to the slow realization that the books were actually teaching you about the subject. Not a great deal; I didn’t know why rising interest rates make the market go down. But a little. I particularly remember the Bluffer’s guide to golf suggesting that you wake up early in the morning and walk a famous course, enough that you could make up an inventive story about a few holes, along with the hole number. Anyone in your listening group hearing the story who played that course would be inclined to believe you. It struck me at the time that that was a lot of work to go to for a lie. A reader might be more likely to actually try golf instead. Perhaps that was the point.

The point of bluffing was to create the illusion of mastery. With an hour, a good bluffer could appear to mere civilians as an expert. JB Rainsberger told me that among practitioners, all that it takes to have expertise in a specific subject is to read three books. I’m not sure I agree, but perhaps JB was speaking of real tutorials, while doing exercises and following along.

All this reminds me of bullshit. And by bullshit, believe it or not, I mean something specific. Harry Frankfurt’s wonderful little book defines bullshit as something between the truth and a lie. Or, perhaps, in a different direction. Where truth and lies are left to right, bullshit might be up and down. I will explain.

On Bullshit

Frankfurt says that bullshit is speech designed to convince without regard for truth. The bullshitter has an agenda – to get you to do something. Words come out of their mouth. The words might be true, they might be a lie – the bullshitter does not care. Asked why a practicing political philosopher would focus on such a lowly task, Frankfurt replied, “Respect for the truth and a concern for the truth are among the foundations for civilization. I was for a long time disturbed by the lack of respect for the truth that I observed… bullshit is one of the deformities of these values.”

As a software tester and journalist, I understand that statement. Often I have explained that the value of testing is that it has the least incentive to lie. Project managers can ignore problems; analysts can say that the analysis is done because a deadline has arrived. Architects can wave their arms and draw boxes. Programmers can shrug and say “it compiles.” It is the tester who will say “I can’t log in”, “search is failing”, or “I cannot check out with a credit card.” In the other roles, you can get in trouble for speaking truth to power. In testing, when I have asked respectfully “You want I should lie to you? I can do that if you’d like”, the answer has always been no.

Combine bluffing and bullshit and you get ChatGPT.

That sounds overly critical, so let me explain. Bluffing bullshit is different from being full of sh*t. It implies the writer did at least read one book on the topic. Think of it as a college freshman or sophomore, in a class outside their major, trying to get at least a C minus on an essay. There won’t be any real insight into the topic, but our college student will put in the effort to go to Wikipedia, and do some keyword searches, to come up with the right buzzwords to use. They will also have conversational English. Words will have proper tense.

I asked #ChatGPT to create a sales script for outsourced and offshore testing services. This is what it came up with:

Certainly! Here is a template you can use as a starting point:


Subject: Offshore Software Testing Services

Hello,

Are you looking to improve the efficiency and cost-effectiveness of your software testing process? Our company specializes in providing offshore software testing services.

Our team of experienced QA engineers can help you ensure the quality and reliability of your software products, while providing significant cost savings compared to in-house testing.

Some of the benefits of working with us include:

  • Flexible engagement models to fit your specific needs
  • A wide range of testing services, including functional, performance, and security testing
  • State-of-the-art testing infrastructure and tools
  • Timely and accurate reporting

We would love the opportunity to discuss how our services can support your business goals. Please let me know if you have any questions or would like to schedule a call to learn more.

Best regards, [Your Name]

This is not wrong, per se. Not exactly. It just demonstrates no real insight. A human copywriter with no testing experience at all, tasked to create a sales page or email for testing services, would likely come up with something like this. Efficient is good, right? Effective? Low-cost? State of the art? Good reports? You could make something very similar for almost any outsourced knowledge-work job. In fact, humans without expertise often do come up with these sorts of nothing statements. A few years ago, when the folks at TechBeacon asked me to write an article on container technology tools, I found similar language. After reading the sales pages I still did not know what half of the tools did, just that they were all “scalable”, “enterprise grade”, and “secure.” This was especially true of the tools earlier in development. In some cases, perhaps the new CEOs had not even decided what the product was yet, so they could pivot. It seems more likely the companies had not yet made the investment to hire professional marketing-slash-product people, who could do both jobs.
So, how does this happen?

How ChatGPT Works

Imagine software that can scan the open web and understands concepts at least as well as a college freshman outside of their own major. The software also has an understanding of programming languages, perhaps to a deeper level, because the programming languages have a strict structure. The software could be trained on all the source code on GitHub and Stack Overflow, and all the descriptions on Wikipedia. The software can parse out complex words from the common, recognizing, for example, that “service layer” is a term that is important and different on the Test Automation page of Wikipedia. It can do the equivalent of a search on those keywords, even if they are not links, on whatever it is trained on, which is likely a subset of the open web, plus insight gleaned from users. Like the JB Rainsberger idea mentioned earlier, it can look up a dozen different sites that discuss page objects, and find the commonalities that the majority agree on. The tool can take two dozen programs in a half-dozen languages, compile them down to intermediate code, then spit out the result in whatever language you ask it for. And, like any good bluffer, the tool is changing the words just enough that no one can accuse it of plagiarism.
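To make the “commonalities that the majority agree on” idea concrete, here is a toy sketch. It is only an illustration of the analogy, not how ChatGPT is built; the function name `majority_terms` and the keyword sets are made up for the example.

```python
from collections import Counter


def majority_terms(pages: list[set[str]]) -> set[str]:
    """Return the terms that appear on more than half of the pages."""
    counts = Counter(term for page in pages for term in page)
    return {term for term, n in counts.items() if n > len(pages) / 2}


# Hypothetical keyword sets pulled from three pages that discuss page objects.
pages = [
    {"page object", "locator", "selenium"},
    {"page object", "locator", "fixture"},
    {"page object", "driver", "selenium"},
]

print(sorted(majority_terms(pages)))  # ['locator', 'page object', 'selenium']
```

The one-off terms (“fixture”, “driver”) drop out, and what remains is the safe, agreed-upon vocabulary, which is exactly what a bluffer wants.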
Remember, we are talking college sophomore level, but with coding expertise. If you ask for a program to print out all numbers from one to a hundred, it can generate it. If you ask gotcha coding challenges, and the answer exists on the web, it can figure it out and spit out an answer. If you ask mathematical questions in conversational language, the tool can read at a college level, convert your words to symbols, enter that into an interpreter and spit out an answer. All that is what it does out of the box, without expertise. Interestingly, #ChatGPT is currently performing in the 73rd percentile of language-model tests for general knowledge and the 65th percentile for the US SAT exam for analogies. Knowing it had this sort of general skill, I did try to take ChatGPT through a “programming quiz” website, and found the answers were not quite good enough. To get to success, the programmer would need to be more like a college senior in Computer Science, both asking more refined questions and refactoring the provided code. That would require about as much expertise as actually solving the problem.
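For a sense of scale, the “one to a hundred” request is about this much code. This is my own minimal Python version, not ChatGPT’s output, and the function name `one_to_hundred` is just for the example.

```python
def one_to_hundred() -> list[int]:
    """Return the whole numbers from 1 to 100, inclusive."""
    return list(range(1, 101))


# Print each number on its own line.
for number in one_to_hundred():
    print(number)
```

Anything at this level is all over the web, which is why the tool handles it easily. The quiz-site problems were a step or two beyond this.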

Bullshitting a Bullshitter

Alex Hormozi has an interesting video where he talks about the potential for tools like ChatGPT. It’s worth a look, but here are a few things he mentions.
  • OpenAI already has a tool, DALL-E, that can generate images based on a few words. A movie is just the same image changing slightly. We could make explainer videos, possibly using text from #ChatGPT. Text-to-speech can add audio.
  • We could analyze millions of online dating conversations to figure out “what works” and generate unique approaches to get to that first date.
  • We can analyze millions of people and create an avatar for the most beautiful person, then combine items one and two to make it respond to words.
  • We can use AI to analyze and summarize large bodies of information, then use the result to create emails or other communications.
Hormozi’s final idea was to use #ChatGPT as a research assistant, as that college sophomore, with the human doing the training. That last bit is my words, not his, but it boils down to his suggestion. Watching the video, I couldn’t help but notice that the ideas could happen … someday. Maybe. They are mostly logical extensions of ideas we already have.
However, I don’t expect any of this to happen tomorrow. The most current work I see in the field with ChatGPT is not using the chatbot nature of the software at all, but instead throwing large datasets at the tool and asking which of these is not like the others.
If you are worried about #ChatGPT taking over your job, please realize that I just did a search on one site for remote COBOL jobs and found two hundred and eighty. Newspapers have been around since 1605 and beleaguered for two decades. Yet many small papers have found ways to innovate in their business models and do just fine. The internet threatened to make magazines irrelevant; again, many of them are finding ways to do just fine. The first talk on the death of testing I know of was in 2011. The people who gave those talks have retired, or at least moved out of the field directly. Yet testing remains.

Moving Forward

Ironically, the way to lock down an AI bullshitter is the same way to figure out if AI is bullshit – ask for details.
  • What is possible with AI?
  • How would you do it?
  • Can you show it working?
  • How do I do it?
  • Okay, I did it. So … what?
The answers you get might not be nearly as gee wow nifty powerful as the video I linked to, but they’ll be real.
As for the rest, we’ll continue to see new ideas creep in. Having an assistant that can summarize Wikipedia and the open internet has some value. Remember, books are behind a paywall.
I’m not the first person to recognize that when ChatGPT is out of ideas, it tends to guesstimate conversationally, nor the first to label it BS. What I hoped to do in this essay was put a little more precision on what that BS looks like. This might help the reader figure out how it could be useful, and what the limits are.
All that might change, of course. For today, the work the tool can do seems superficial. Part of that is the complexity of the examples and limits of training. Hopefully, the reader gets some ideas of what we can experiment with next.
Tell me how I’m wrong.
And tell me what we should do next.
