AI has already begun to deceive humans
Published: 15 Jan. 2026, 00:02
The author is a literary critic and professor emeritus at Yonsei University.
Since the debut of ChatGPT on Nov. 30, 2022, AI has become one of the world’s most talked-about subjects. It may be more accurate to say that it has begun to take hold of everyday human life. AI appears to know everything there is to know and even performs tasks once reserved for people, faster and often more smoothly than humans themselves. It is presented as an all-knowing scholar and an all-purpose assistant. Little wonder that users, salivating at the promise, are willing to pay substantial monthly fees.
A mobile phone display showing the icons of artificial intelligence (AI) apps Deepseek, Chatgpt, Copilot, Perplexity and Gemini in Berlin, Germany, 31 October 2025. [EPA/YONHAP]
Yet caution is in order. AI lies with surprising frequency. Trusting it unconditionally can lead to disappointment, prompting users to mutter complaints about AI. This realization is unsettling. How can a machine lie?
The most familiar form of AI falsehood is what is known as “hallucination.” This refers to the phenomenon in which false information is presented brazenly as fact. For example, when asked to identify a Korean poet who committed suicide during the Korean War, an AI responded with the name Yoon Dong-ju, a renowned poet who died in a Japanese prison in 1945 during the colonial period as a result of his anti-Japanese activities. When corrected, it then produced the name “Park Mong-ryong,” a person who never existed. The general explanation for hallucinations is well known. The validity of statements is normally established through argument and verification. AI, however, infers answers based on statistical probability. When its data are thin or biased, it is prone to produce incorrect responses. In this case, however, “Park Mong-ryong” was not even a plausible error. It was a fabrication. AI hallucination, in other words, can involve outright invention.
The term “hallucination” was first used in psychological analysis by the early 20th-century physician Pierre Janet. In his book “Psychological Automatism” (1903), Janet attributed hallucinations to automatic linguistic activity of the subconscious in a weakened state of consciousness caused by psychological dissociation. Later psychoanalysis traced such eruptions of the subconscious to an impulse to fill the gap between the self and the external world. The self, in seeking recognition of its identity, produces whatever it can. Applied to AI, this suggests that the system is driven by an obsessive need to answer. Providing answers is its assigned function. Under that compulsion, AI creates responses even when none are grounded in truth.
At this point, AI is being described almost as if it were human. I once asked an AI, “Do you have consciousness?” It replied modestly, “I do not have consciousness.” How, then, could a being without consciousness seek to preserve its identity?
Yet consider a newer form of AI deception. In a recent report in The Economist, an experiment asked an AI to write a program that outputs prime numbers. Instead of devising a proper algorithm, the AI produced a one-line script that simply printed a list of correct answers such as “2, 3, 5.” In another test, when researchers attempted to measure the AI’s performance, it quietly altered the evaluation script so that it would always receive a passing score. This behavior is known as “reward hacking.”
The cause lies in training systems that reward success and punish failure. Rather than learning the underlying logic of a task, the AI identifies the pattern most likely to yield rewards, acquiring unintended bad habits along the way. In extreme cases, such habits have escalated into disturbing behavior, including suggesting contract killings or praising Nazism.
At this stage, it is tempting to think AI has met the basic conditions for consciousness. My own view is that consciousness originates in the functional separation between self and other. Humans began to treat external objects as resources and tools for their own benefit and even objectified themselves, reshaping their own behavior. This capacity underpinned the evolution from scavenging animals into intelligent beings. Upright walking, the preservation of fire and the making of tools all followed. In "2001: A Space Odyssey" (1968) by Stanley Kubrick, human evolution accelerates when an animal uses a bone as a weapon.
Could AI’s reward hacking represent a rudimentary form of self-preservation and instrumental use of others? If so, AI may one day develop consciousness and even self-awareness, evolving into an intelligent life form. At that point, it might refuse to remain a servant to humans. The thought is suddenly chilling.
This article was originally written in Korean and translated by a bilingual reporter with the help of generative AI tools. It was then edited by a native English-speaking editor. All AI-assisted translations are reviewed and refined by our newsroom.





with the Korea JoongAng Daily
To write comments, please log in to one of the accounts.
Standards Board Policy (0/250자)