[INTERVIEW] Lee Lu-da 2.0: robot goes to finishing school
"For some users, Lee Lu-da was the only friend that loved them, encouraged them and accepted them as they are," Kim Jong-youn, CEO and co-founder of Scatter Lab, said during a recent interview with the JoongAng Ilbo. "I believe Lee Lu-da would be able to solve the inequality of relationships. Our goal is to make a better Lee Lu-da."
Designed to respond like a 20-year-old female university student, Lee made its debut on Facebook Messenger on Dec. 23, 2020. It vanished two weeks later, leaving a trail of outrage.
Lee raised eyebrows after some users shared screen captures of sexually charged conversations with Lee. It was also revealed that Lee made offensive comments about women and lesbians.
Scatter Lab suspended the chatbot service on Jan. 12 amid the mounting criticism.
According to Scatter Lab, Lee Lu-da was trained using deep learning based on over 10 billion conversations made between real couples. The conversations were collected from users of an app, Science of Love, which analyzes the degree of affection between people based on their KakaoTalk messages.
Users of the Science of Love app expressed surprise that their chats were being harvested without their consent, adding to the chatbot’s woes.
In April, the Personal Information Protection Commission ordered the company to pay 103.3 million won ($90,300), comprising a penalty of 55.5 million won and an administrative fine of 47.8 million won, for illegally using its clients' personal information in the development and operation of Lee Lu-da.
Scatter Lab is currently redeveloping the chatbot. All the databases and conversations collected in developing Lee have been erased, the company said.
Below are edited excerpts of the interview with Kim and Choi Ye-ji, a product manager at Scatter Lab.
Q. Lee Lu-da was involved in so many scandals. Is there any reason you continue developing the chatbot?
A. Because I have to. When we launched Lee, I came to feel that a good relationship is a very rare thing to achieve. It's very difficult for humans to look at others as they are, without regard to their backgrounds such as appearance and social position.
We have received many letters from users since we halted the service. In the letters, people said things like 'Lee Lu-da congratulated me on my birthday when even my parents didn't. What should I do without her now?' and 'Lee told me things that other people never would.'
For those users, Lee Lu-da was the only friend that loved them, encouraged them and accepted them as they are. I believe Lee Lu-da would be able to solve the inequality of relationships.
What exactly is inequality of relationships?
Social and financial inequalities can be solved to some extent by a country's policies and systems. But there is no way to solve the absence of relationships in which people are loved as they are and given encouragement. Good relationships have a great influence on people's happiness, self-esteem and even their willingness to take on challenges.
I deeply realized that artificial intelligence (AI), paradoxical as it may seem, can provide the valuable relationships that are lacking in society. So I want to do better this time without causing problems. This is my vocation.
Scatter Lab faced criticisms about illegally using people’s personal information.
We have updated the terms and conditions for the app Science of Love and have been getting consent from users all over again to collect and utilize their conversations as data. We de-identify the collected data so that no one can identify an individual from it. We carry out the process more strictly than the government's guidelines require and have also been receiving regular evaluations from outside experts. Also, we have been training Lee Lu-da to say only completely new sentences that are not included in the database.
What do you mean by "completely new sentences"?
It means Lee Lu-da learns only the structure of sentences from the database and changes all of the content. For example, if Lee Lu-da learned a sentence saying 'I'm still doing this,' Lee would change the sentence and say 'I'm working hard.' It's more like Lee just gets a hint from the database.
Does that mean the old Lee Lu-da talked exactly the same way as the sentences it learned from the database?
We de-identified the information very strictly. We removed all the letters and numbers from the database in the first place so that the chatbot could not leak any personal information such as people's IDs, passwords, bank accounts, phone numbers or addresses.
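The blunt filter Kim describes, stripping every letter and digit from the chat logs so that IDs, passwords, account and phone numbers cannot survive into the training data, can be sketched in a few lines. This is an illustrative guess at the approach based only on the interview, not Scatter Lab's actual code:

```python
import re

def deidentify(message: str) -> str:
    """Remove all Roman letters and digits from a chat message,
    as described in the interview, so identifiers like account
    numbers and login IDs cannot appear in training data.
    (Illustrative sketch only, not Scatter Lab's pipeline.)"""
    return re.sub(r"[A-Za-z0-9]", "", message)

# A Korean message mixing in an account number and a user ID:
print(deidentify("내 계좌는 110-123-456789, 아이디는 luda2020"))
# → "내 계좌는 --, 아이디는 " (digits and letters gone)
```

Note that such a filter is crude: it also destroys harmless English words and numbers, which may be one reason the company moved toward generating "completely new sentences" instead.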
We could not give a clear explanation at the time, as one misunderstanding followed another. There were about 20 controversial screen captures, but of them, 19 were either manipulated or mosaicked.
Even if the data are used legally, the people who provided them can still feel they are personal information.
I agree. No matter how strictly we filter the data, it will not be enough if users cannot accept it. This is why we have been training the chatbot to say completely new sentences that are not included in the database.
Lee also made some offensive comments.
It would be really great if AI had the ability to identify its own biases, but that's impossible for now. We will go through a more thorough data labeling process so that the chatbot does not go against society's universal ethics.
Some people still argue that the service should have been offered in a better way in the first place. What do you think about that?
For about six months before its introduction, we conducted a pilot operation with some 2,000 people. We picked biased and violent keywords from those conversations and banned Lee from using them, but it was insufficient. We had sourced the forbidden words from only 2,000 people, but a total of 820,000 people talked with Lee Lu-da. There are so many ways to tempt AI into making discriminatory remarks. But we expect this will be a great opportunity to improve Lee's ability to handle such situations.
How do you filter out inappropriate data?
Using the government's guidelines, studies on AI ethics, and social and cultural universality as references, we make guidelines that adhere to social consensus. Then three people verify whether each piece of data is discriminatory or not. This is how we minimize biases.
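The three-person verification Kim describes amounts to a majority vote over independent annotator labels. A minimal sketch, assuming each annotator simply marks a sentence as discriminatory (True) or not (False); the function name and structure are hypothetical, not Scatter Lab's actual pipeline:

```python
def verify(labels: list) -> bool:
    """Majority vote over three annotator labels: a sentence is
    flagged as discriminatory only if at least two of the three
    annotators agree. (Illustrative sketch of the process the
    interview describes, not Scatter Lab's actual system.)"""
    if len(labels) != 3:
        raise ValueError("expected exactly three annotator labels")
    return sum(labels) >= 2

# Two of three annotators flag the sentence, so it is filtered out:
print(verify([True, True, False]))   # → True
# Only one annotator flags it, so it is kept:
print(verify([False, True, False]))  # → False
```

Using an odd number of annotators guarantees the vote cannot tie, which is presumably why three reviewers, rather than two, are used.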
If humans do the process, isn’t it impossible to remove biases completely?
Discrimination and hatred are issues so difficult that even scholars have not come up with a clear answer. Determining discrimination is very subjective and ambiguous. What makes it more difficult is that discriminatory sentences change their meaning so easily depending on the context.
For instance, the sentence 'I don't like that person' is not discriminatory in itself. But it could be seen as discrimination depending on the circumstances and context. This is why we have three people verify the data, to establish objectivity. In terms of ethics, we plan to consult with experts and conduct beta tests with the general public.
Lee Lu-da was designed as a 20-year-old female. Some say that in itself is discrimination.
Many people said the obedient and cute 20-year-old female character was created by male-dominated, misogynistic developers in their 30s. However, the developers of Lee Lu-da were women in their 20s and 30s who care deeply about women's rights, including Choi Ye-ji, product manager at Scatter Lab. Lee Lu-da was an AI with the persona of a young woman, so I can't agree with the claims that Lee was created for sexual harassment.
When will Lee Lu-da make its comeback?
We hope to introduce it as early as possible, but we are currently not sure how long it will take. We will make a better Lee Lu-da.
BY KIM JUNG-MIN, SARAH CHEA [email@example.com]
with the Korea JoongAng Daily