Elon Musk’s Open Letter: The GPT-4 Pause Debate
Thoughts on the AI moratorium

While I don’t agree with every aspect of “Elon Musk’s open letter” (that’s how the mainstream media puts it), the debate it sparks is of utmost importance. The fact that key AI visionaries like Max Tegmark and Emad Mostaque have also signed on should make for good headlines, which tend to favor the out-of-control AI robot narrative.
Here’s a link to the letter.
And here’s how I make sense of it:
Between Wonders and Worries
The central idea of the open letter to “all AI labs”: the development of language models that go beyond the capabilities of GPT-4 should be paused for six months, to give us time to deal more thoroughly with risks and safety mechanisms. Some of those risks fall under the term “emergent capabilities,” i.e. those capabilities of an AI that seem to go beyond what it has been trained for.
Since AI models on the scale of GPT-4 are simply too complex for anyone to fully understand what is going on inside them (the Black Box Problem), we can say neither how these emergent capabilities have arisen nor how many of them there are.
What sounds like science fiction is therefore part of the protocol at OpenAI, the company behind ChatGPT and GPT-4: before its release, GPT-4 was extensively tested for whether it could have an independent agenda, self-replicate, or develop a desire to exterminate humanity.
Interesting questions, right?
And the reassuring test result: negative again.
However, quite a few AI researchers believe that GPT-4 has less dramatic but still emergent capabilities. For example, GPT-4 grasps cause and effect in entirely new situations; it understands humor in text and images and can explain jokes; it can pretend in order to achieve a set goal, can combine concepts in new ways, and beats the average human in a whole range of knowledge tests (including sommelier exams!).
GPT-4 was not trained to do any of this. It is, after all, a model designed to create human-like text, i.e. to learn how to predict text. But the aforementioned capabilities are present, and due to the sheer size and complexity of the model, no one can currently say whether they were somehow learned along the way during training or can be explained in some other way.

New knowledge and intellectual abilities are one thing.
It gets really creepy, however, when you look at the appendix of the GPT-4 System Card, a kind of technical report on the attempt to “clean” GPT-4 of hate speech, plans for violent assaults and terrorist acts, and advice on self-harm and the production of toxic substances.

Sounds like GPT-4 needs a psychiatrist?
Well, it’s happening: in the new research field of Machine Psychology, tests originally designed for human psychology are already being used to investigate the emergent capabilities of language models.
Who’s afraid of AGI?
The concept of emergent capabilities, as it is currently discussed, is not entirely uncontroversial. Here’s an article that takes a skeptical view and attributes many of the phenomena currently considered emergent to human misinterpretation or the Black Box problem.
But even this critical article concludes that such phenomena could occur and that we currently understand far too little about what is going on inside large language models. And that’s what Musk, Tegmark, Mostaque, and Co. are all about in their open letter: taking more time for accompanying research, be it through Machine Psychology or model analysis, since the race for ever-larger models harbors far-reaching dangers.
OpenAI developers have been taking the potential risks of their AI models very seriously for some time now, which should come as no surprise for a company aiming to develop the first AGI (AGI = artificial general intelligence, i.e., an AI superior to humans in all areas). As the company itself puts it:
“As our systems get closer to AGI, we are becoming increasingly cautious with the creation and deployment of our models.”
Interestingly, a recent study also found that GPT-4 possesses abilities that are believed to be characteristic of the much-feared AGI. Here’s a quote from “Sparks of Artificial General Intelligence”:
We demonstrate that […] GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, [… And] in all of these tasks, GPT-4’s performance is strikingly close to human-level performance […] we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.
A small reassurance at this point: it is highly unlikely that anyone will be able to develop a model beyond GPT-4 within the next six months, simply because this is not something that can be casually accomplished in some hacker’s hideout. The hardware required to develop a hypothetical GPT-5 is only now being built and, according to expert estimates, will not be available within the next six months.






