The problem with describing the mistakes that large language models (LLMs) make as hallucinations is that it implies a distinction between comprehensible, meaningful, factual output and nonsensical, meaningless, false output. There is such a distinction, of course, but only as far as you are concerned. It is not that the system behaves as intended in one case and malfunctions in the other; the process of creating the text is identical regardless of the output. Does that sound intelligent to you?
Extending metaphors to inanimate objects comes naturally. People think nothing of saying that ‘the car needs filling up’ or that ‘the vacuum cleaner is being a nuisance’. Language games aside, however, nobody thinks that these statements mean that the car has wants and desires or that the vacuum cleaner chose to behave badly. Yet, when it comes to interactive computer systems, the line begins to blur. This leaves people more open to manipulation in the digital domain than in other areas of life.
There is nothing intrinsically different about an etched silicon die capable of executing computer programs and, say, a series of cogs and pulleys, in the sense that both are physical mechanisms that extend human capabilities through engineering. Ah, but computers are not purely physical mechanisms, you may be thinking; computers run software. That’s the magic, that’s the secret sauce, right?
Is software really so different from the hardware that it directs? A complex arrangement of 1s and 0s that trigger switches inside the processor and computer memory with the aim of achieving an effect in the world. Whether that effect is changing the colour of a pixel on a screen, moving a robot arm in a factory or steering a car under Tesla’s ‘autopilot’ system, the mechanism is the same. The size (smaller than anything perceivable with the naked human eye) and speed (modern CPUs execute billions of instructions per second) obscure the physical actions that produce the effect, and the technical sophistication of the overall process, which in toto is comprehensible to no one, adds to the mystery, but there is nothing occult about a machine that achieves an effect in the world.
Within that framework of understanding, there probably isn’t any real harm in people anthropomorphising an interactive computer system like ChatGPT, in the same way that they anthropomorphise televisions, telephones or tumble driers, but that is not the frame in which anyone operates. Indeed, even without the natural human tendency to personify inanimate objects or the extreme technical arcana that makes a computer’s operation so opaque, considerable amounts of money and expertise are deployed in making you and me think about these systems as unlike any prior technological innovation.
Large language models are not just another way of interacting with a machine; they are capable of human-like or even beyond-human understanding, we are told. Depending on how deranged you want to get—and people who are plenty deranged have been elevated into mainstream consciousness during the course of 2023—AI systems are so dangerous that centralised control should be established and rogue actors operating unlicensed models should have their data centres bombed. Follow the link if you think I’m joking.
Who can say whether the purveyors of such lurid imaginings are fooling themselves or setting out to fool you, but it is hardly radical to note that the likes of OpenAI, Microsoft, Google and Tesla are not impartial actors when it comes to describing the capabilities of these systems. Neither, unfortunately, are most AI researchers, who have responded like excited children to the extra attention being afforded their field. Hardly unique—epidemiologists and virologists didn’t cover themselves in glory in the midst of the Covid panic. For what it’s worth, one of the saner voices in the room is Meta CEO Mark Zuckerberg, but that is a conversation for another day.
If that is what AI/LLMs are not, what then are we looking at here? One year after ChatGPT was released, what we have so far is an (actually quite old) UI/UX paradigm—the chatbot—finally having its day. Asking questions is an effective way of thinking about a text and large language models provide a means for the text to answer back.
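For the sake of concreteness, here is a minimal sketch of that chatbot paradigm: a loop that keeps a running transcript and asks a model to continue it each time the user types a question. Everything here is illustrative rather than any particular vendor’s API; in particular, `complete` is a hypothetical stand-in for whatever text-generation call you actually have available.

```python
# Minimal sketch of the chatbot loop: keep a transcript, let 'the text answer back'.
# `complete` is a hypothetical placeholder, not a real library function.

def complete(transcript: list[dict]) -> str:
    # A real implementation would send the transcript to a language model
    # and return its continuation; this stub just echoes the last question.
    return f"(model reply to: {transcript[-1]['content']!r})"

def chat() -> None:
    transcript = [{"role": "system", "content": "Answer questions about the text."}]
    while True:
        question = input("you> ").strip()
        if not question:
            break
        transcript.append({"role": "user", "content": question})
        answer = complete(transcript)   # the 'text' answering back
        transcript.append({"role": "assistant", "content": answer})
        print(f"bot> {answer}")

if __name__ == "__main__":
    chat()
```

The point of the sketch is only that the interface is a question-and-answer loop over a growing transcript, which is exactly the decades-old chatbot pattern now having its day.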
There is that use of metaphor again. The text (or just text) does not do anything, as such, but used correctly, metaphors and similes are an aid to understanding, a convenience that expresses an idea succinctly with the help of a shared point of reference, and provided no one is so foolish as to take the language too literally, that is all to the good.
If metaphors are inevitable, then at least let them be creative and not ill-conceived. The reason ‘hallucination’ does not work as a description of illegitimate output is that it implies that these systems are ‘thinking clearly’ the rest of the time. If we apply a consistent literary sensibility, then everything produced by a large language model is a hallucination: useful, useless, factual or false, their telos is dreaming in text.