'Biased' AI algorithms trigger debate around political implications


Kakao Pangyo Agit of internet company Kakao Corporation [YONHAP]


 
The ability of AI algorithms to screen hateful comments while remaining politically impartial has come into question after Daum, Korea's second-largest internet portal, was criticized for how it screens comments in its news section.
 
The criticism follows speculation that Daum leans left politically, with concerns rising over the implications for next year's general elections.
 
Rep. Park Sung-joong of the ruling People Power Party and member of the Science, ICT, Broadcasting, and Communications Committee raised questions over “internet portals being politically biased not only in news distribution but also in filtering comments.”
 
“Comments such as ‘daeggae’ and ‘daeggaeMoon’ are immediately deleted or hidden, whereas comments criticizing the conservative [party] such as ‘jwi (rat)-Bak-y,’ ‘dak (chicken)-Geun-hye’ or ‘gyong’ are left alone,” said Rep. Park. The word “gyong” is President Yoon Suk-yeol’s surname written in Hangeul flipped upside down, an expression used to belittle and criticize President Yoon.
 
The expression “daeggaeMoon,” short for “support Moon Jae-in even if their skulls crack,” is used to belittle former President Moon Jae-in’s supporters. It combines “daegari,” a term that is offensive when used to refer to a human head, and “ggaejyeodo,” an explicitly violent term implying physical harm.
 
Daum has operated SafeBot, a software application that utilizes AI to detect and block comments that contain swear words or vulgar slang, for its news outlets since December 2020. 
 
Rep. Park argues that SafeBot hides comments with the word “daeggaeMoon,” but displays hate speech directed toward conservative political figures such as President Yoon Suk-yeol and former presidents Lee Myung-bak and Park Geun-hye.  
 
Rep. Park singled out the data labeling used to train the AI model behind SafeBot as a problem. Data labeling is the process of annotating and sorting data so that an AI model can learn from it.
 
“Since data labeling is done by a Kakao employee, a person, it is not plausible the expression ‘daeggaeMoon’ was deleted or hidden by coincidence,” said Rep. Park.
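As a rough illustration of what such labeling produces, a supervised comment filter is trained on comment texts paired with human-assigned labels. The comments and label names below are invented for illustration; Kakao's actual training data and label scheme are not public.

```python
# Toy labeled dataset for a comment-screening model. All examples and
# label names are hypothetical, not Kakao's actual data.
labeled_comments = [
    {"text": "Great article, thanks for sharing.", "label": "clean"},
    {"text": "die, <slur targeting a group>", "label": "hate_speech"},
]

def split_features_labels(dataset):
    """Separate comment texts from their human-assigned labels."""
    texts = [row["text"] for row in dataset]
    labels = [row["label"] for row in dataset]
    return texts, labels

texts, labels = split_features_labels(labeled_comments)
```

Because a person decides which label each comment receives, the labeler's judgment is baked into whatever the model later learns, which is the crux of Rep. Park's objection.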
 
The key issue is the extent to which screening technology on internet portals should interfere with the open gathering of unfiltered public opinion on the news. The questions of how to define hate speech, and whether platforms have the responsibility or the authority to police it, are once again under scrutiny.
 
As internet portals opened up news comment sections in the early and mid-2000s, malicious comments targeting victims of major disasters, celebrities and athletes increased rapidly.
 
Technological solutions were rolled out in response. Naver implemented technology to automatically change swear words to “***” symbols in 2012, and recently advanced its software to utilize AI in managing the comments section. 
 
Kakao's SafeBot, an AI software application that screens comments on portal site Daum's news section [SCREEN CAPTURE]


 
Kakao applied an “automatic swear word replacement function” to the Daum news section in 2017, before launching SafeBot. When a comment contains slang on the forbidden-word list, musical note symbols such as “♩ ♪ ♬” automatically replace the offending words.
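The replacement mechanism described above can be sketched in a few lines. The forbidden-word list here is a hypothetical stand-in, since Kakao's actual list is not public.

```python
import re

# Hypothetical forbidden-word list; Kakao's actual list is not public.
FORBIDDEN_WORDS = {"badword", "slurword"}
NOTES = "♩♪♬"

def mask_swear_words(comment: str) -> str:
    """Replace each forbidden word with musical-note symbols of equal length."""
    pattern = re.compile(
        "|".join(re.escape(word) for word in FORBIDDEN_WORDS),
        re.IGNORECASE,
    )

    def to_notes(match):
        # One note symbol per character of the matched word.
        return "".join(NOTES[i % len(NOTES)] for i in range(len(match.group(0))))

    return pattern.sub(to_notes, comment)
```

For example, `mask_swear_words("that badword take")` yields `"that ♩♪♬♩♪♬♩ take"`: the word is masked, but the rest of the comment is left intact.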
 
As of last year, Kakao had trained SafeBot on a database of around 600,000 swear words. According to Kakao, introducing SafeBot cut the number of comments containing swear words or vulgar slang by 63.8 percent over two years. Some have credited the portal’s comment-screening technology with helping, to some extent, to block the spread of hate speech.
 
Naver's Cleanbot, an AI software application that screens comments for its news section [SCREEN CAPTURE]


 
The problem is the ambiguity surrounding the judgment criteria on how AI screens political hate speech such as “daeggaeMoon” or “gyong.” Kakao denied Rep. Park’s allegations, saying “SafeBot does not judge the political context of the mentioned word.”  
 
The violent nature of the term “daeggaeMoon" was the reason that the expression was banned, not because it refers to a particular political group, according to Kakao.  
 
Expressions such as “jwi-Baky” or “dak-Geun-hye” can be displayed because they are comprised of neutral terms (jwi (rat) + Bak, dak (chicken) + Geun-hye) and not hate speech.   
 
“It is for the same reasons that expressions such as ‘jwaein (sinner) Moon,’ ‘jaeang (disaster) Moon,’ ‘jjit (tear) Jae-myung,’ ‘gaeddal,’ and ‘Lee-jwae (sin)-myung’ were not hidden in the comments,” said Kakao.   
 
The aforementioned words refer to former President Moon Jae-in and Democratic Party leader Lee Jae-myung, and are used to insult them or their supporters.
 
“The SafeBot algorithm was trained in accordance with SafeNet, the Korea Communications Standards Commission’s internet content rating criteria,” Kakao said. Under SafeNet, linguistic hate speech is categorized from level 0 to level 4; the stronger the profanity, the higher the level.
 
Expressions referring to the physical body are categorized as level 3, “strong profanity.” In line with these criteria, the words “agari” and “judungi,” vulgar slang for the human mouth, are also categorized as such and banned.
 
Internet portals have so far argued that AI, not humans, settles disputes fairly because it relies on algorithms. But the AI’s judgment criteria and training data are all selected by the portals, so the algorithms’ objectivity is a constant subject of controversy. The same reasoning underlies the controversy over the objectivity of Naver’s algorithm for news search and arrangement.
 
Some have argued for the algorithms to be made public so that doubts can be settled. Internet portal companies have objected, saying “the algorithm composition is a trade secret” or “the disclosure of algorithms could lead to misuse.”  
 
But even setting aside comment-screening AI, if generative AI such as ChatGPT is applied throughout everyday IT services, the controversies surrounding the objectivity of AI are likely to continue. AI companies currently do not disclose the data used to train their models, citing copyright concerns.
 
“It is possible for companies to voluntarily disclose [the training data] and collect opinions from citizens and academia through public hearings, but the process is long and expensive,” said Choi Byung-ho, a professor at Korea University’s Human-inspired Artificial Intelligence Research Institute.
 
“If we reach a social consensus that [the bias of algorithms] is clearly a serious problem, there is also the possibility of regulation through legislation,” he added.
 

 
Naver, Korea’s largest internet portal and largest news distribution platform, has operated the software application Cleanbot to detect and automatically hide comments that contain swear words since April 2019.
 
This is similar to how Daum operates SafeBot, in that both utilize AI to filter hate speech. But the two differ in which expressions they classify as hateful.
 
Unlike Daum’s SafeBot, which flags hate speech based on the inclusion of certain words, the AI model in Naver’s Cleanbot makes its decision by judging the entire context of the sentence. For instance, the expression “daeggaeMoon” alone could be displayed, but the sentence “die, daeggaeMoon,” which adds a violent call to action, would be banned.
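The difference between the two approaches can be sketched as a pair of toy filters. The word lists and co-occurrence rule below are illustrative stand-ins; neither Kakao's word list nor Naver's actual context model is public, and a real context-aware filter would use a trained language model rather than a cue list.

```python
import re

# Stand-in word lists for illustration only.
BANNED_WORDS = {"skullcracker"}   # stand-in for a banned compound slur
VIOLENT_CUES = {"die", "kill"}    # stand-in for learned violent context

def tokenize(comment: str):
    """Lowercase and split a comment into word tokens, dropping punctuation."""
    return re.findall(r"\w+", comment.lower())

def word_based_filter(comment: str) -> bool:
    """SafeBot-style: hide the comment if any banned word appears at all."""
    return any(tok in BANNED_WORDS for tok in tokenize(comment))

def context_based_filter(comment: str) -> bool:
    """Cleanbot-style: hide only when a target word co-occurs with violent context."""
    tokens = tokenize(comment)
    return any(t in BANNED_WORDS for t in tokens) and any(
        t in VIOLENT_CUES for t in tokens
    )
```

Under these toy rules, "skullcracker supporters" is hidden by the word-based filter but passes the context-based one, while "die, skullcracker" is hidden by both, mirroring the "daeggaeMoon" versus "die, daeggaeMoon" distinction described above.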

BY KWEN YU-JIN [kim.juyeon2@joongang.co.kr]