[Column] Produce more data — and in English


Lee Kyung-bae

The author is CEO of Secta9ine.

Korea is famous for the world’s oldest block book, a book printed from wooden blocks on which the text and illustrations for each page had to be painstakingly cut by hand, and for the “Jikji,” considered the oldest book in history printed with movable metal type, which gathers the insights of Zen Buddhism from before and during the Goryeo Dynasty (918-1392).

The Annals of the Joseon Dynasty (1392-1910) and the “Hwaseong Seongyeok Uigwe,” a detailed record of the construction of Hwaseong Fortress in Suwon, Gyeonggi, during the reign (1776-1800) of King Jeongjo, testify to the country’s marvelous history of record keeping. Apart from the state, a great number of intellectuals also wrote books sharing the deep thoughts and experiences they had built up over their lifetimes. Even in exile, great thinkers like Chusa Kim Jeong-hee (1786-1856) and Dasan Jeong Yak-yong (1762-1836) never stopped writing books on practical ways to reform the country. In the aftermath of the Japanese occupation (1910-1945) and the Korean War (1950-53) that followed, however, this unrivaled recording culture unfortunately came to a stop. It has become common practice to discard data after it is used.

Analogue data from books can be used on computers only once it is converted into digital form. After books, documents, photos and voice recordings are stored and managed digitally, they can feed new information systems, including artificial intelligence (AI), and create new value. To establish a top-caliber AI ecosystem, one must first accumulate a huge amount of quality data. ChatGPT was able to learn from and interlink a huge volume of data by using as many as 175 billion parameters. In the rush to the AI era, where do Korea’s ability to collect ever-expanding data and its AI competitiveness stand now?

A 2021 report from the Ministry of Science and ICT shows that Korea’s data industry is worth approximately 20 trillion won ($15.4 billion), about 7.0 percent of the size of the U.S. industry and 16.4 percent of the EU’s. The share of Korean companies applying big data to their business operations is a mere 15.9 percent. That’s not all. More than half of them complained about a “lack of quality data” they can use.

A big problem is that companies tend to recognize only “numerical data” as data. Because most numerical data pertains to transactions, its informational value is low, and its use is further restricted by privacy requirements. Last year, the government pledged to raise both the economic value of data and the rate of its application by revising the Data Industry Act. But the move still faces many hurdles.

Language is an even bigger barrier to Korea joining the global tide. Of the world’s 8 billion people, about 82 million, or roughly one percent, use the Korean language.

Compared with the 1.5 billion users of English, that small number poses a serious challenge for the country. The problem grows more serious when coupled with a relative lack of data from Korea and the practical difficulties of translating Korean into English. Even ChatGPT admits its limitation, saying, “I try to give good answers to Korean users, but there is a limit to my language ability.”

That underscores the need for us to foster the AI industry and translate Korean-language data into English. Adhering to Korean can help keep things Korean intact, but it also isolates the country from the rest of the world. I wonder if the absolute lack of our historical data in English could be related to our inability to effectively refute China’s Northeast Project (a state-funded historical research project aimed at expanding China’s territorial claims in Manchuria, a territory of Korea’s ancient kingdom Gojoseon, 2333 BC to 108 BC) or to methodically counter Japan’s unceasing claims over the Dokdo islets in the East Sea. We need to systematically create our data in English.

AI technology has a long way to go. With open-source resources, anyone can easily create AI software, but such software is still at a rudimentary level. If generative AI can be compared to color TVs, general AI is still at the level of black-and-white TVs. The low levels of technology and data in Korea pose a serious dilemma in the age of AI.

The time has come for the country to nurture the data industry in substance, moving beyond uniform regulations. It must first produce usable data, translate it into English, and share it with the rest of the world.

Schools must shift from education based on rote learning to advanced education based on debate. At the same time, authorities must create professional education programs in which young researchers can study and develop the next generation of AI technology, rather than offering them only short training courses or seminars. Otherwise, Korea will never be able to narrow the gap with advanced countries.

Translation by the Korea JoongAng Daily staff.