Kakao Brain says new AI systems don't need any human help

Published: 20 Dec. 2021, 18:51

Kakao Brain CEO Kim Il-doo [KAKAO BRAIN]

Kakao Brain, Kakao's artificial intelligence (AI) research subsidiary, says its new large-scale AI models can understand the context and meaning of sentences without human assistance. That is an advance over prior AI models, which could not infer the meaning of a text on their own, according to the company's CEO Kim Il-doo.

Being large — or hyperscale in the jargon of the industry — means that the AI system uses a huge amount of data of various types including images, text, speech and numerical data to provide more accurate results at a faster pace.

Using that data, Kakao's AIs have been trained to understand what sentences mean and create new sentences or pictures from what they have understood.

Last week, Kakao Brain revealed two large-scale AI systems dubbed KoGPT and minDALL-E on software sharing platform GitHub.

The former is a language analyzer that can shorten given sentences or derive new conclusions from them by understanding what they mean. The latter is a text-to-image generator that can create new drawings based on the subject and style that a user requests.

For instance, when minDALL-E is given the command "Please draw a 'clock melting on a tree' in the style of Salvador Dali," it can generate a new painting in that specific style. Prior models would instead search a database for the existing image that best fits the command.

Afterward, the KoGPT language AI can explain what the picture shows and describe its style by understanding the text that was used to generate the image.

Using the two AI models, Kakao Brain aims to present a multi-modal AI next year.

Multi-modal means that the AI program can analyze and understand different modes of data input, such as text, image and video. With the new model, the company will also disclose the dataset used to train the AI. The company dubbed it the “Next ImageNet Project.”

ImageNet is an image database that’s free for non-commercial use.

Kakao Brain recently released the source code for minDALL-E, a text-to-image generator that can create new drawings based on the subject and style that a user requests, on GitHub. The AI can understand the command "Please draw a 'clock melting on a tree' in the style of Salvador Dali" and generate pictures accordingly. [KAKAO BRAIN]

“No major dataset has been revealed other than ImageNet,” Kim said. “This is because companies find their dataset to be a valuable asset to their businesses and keep it to themselves. But we hope to contribute to the research community by providing a part of our dataset.”

In addition to the dataset project, Kakao Brain says it will continue sharing the source code for its AI models to help researchers advance their technology. The source code for the two AI programs is available on GitHub. Anyone can use it for free for research purposes, but must obtain a formal license to use it commercially.

LG and Naver also disclosed their AI technology this year, but they documented their findings through research papers and did not disclose the actual programming architecture to competitors.

“Problems like privacy and hate speech are common issues in the industry,” Kim said.

“There are measures that the developer must take, but there are limits. No service can prevent these issues 100 percent at the moment. We must take a long-term approach, seeing at least five years into the future, so that we can work together to tackle these issues.”

Kakao Brain’s AIs have fewer parameters (a collection of variables used for machine learning) than those of its competitors. KoGPT has 30 billion parameters and minDALL-E has 1.4 billion, compared with 300 billion for LG’s Exaone and 204 billion for Naver’s HyperCLOVA.

But according to Kim, it’s not all about the size.

“The bigger the scale, the slower the process and the higher the cost for the AI to be programmed,” he said. “Rather than going for a bigger scale, we focused on refining our language model so that we can achieve the same function with competitors using only practical data.”

The company will also jump into the education and health care businesses next year. The ultimate goal will be to let everyone get their own AI teacher and AI doctor, since it’s impossible for human teachers and doctors to individually care for everyone.

“We are designing human-like AIs and that’s ultimately going to lead to digital humans,” Kim said. “And we believe that education and health care are the two high-value-added fields where AIs can replace humans.”

BY YOON SO-YEON [yoon.soyeon@joongang.co.kr]