AI can read, but still lacks logic natural to humans
Seven years ago, a computer beat two human quizmasters on a “Jeopardy” challenge. Ever since, the tech industry has been training its machines to make them even better at amassing knowledge and answering questions.
And it’s worked, at least up to a point. Research teams at Microsoft and Alibaba reached what they described as a milestone earlier this month when their AI systems outperformed the estimated human score on a reading comprehension test.
It was the latest demonstration of rapid advances that have improved search engines and voice assistants and that are finding broader applications in health care and other fields.
But the answers the systems got wrong, and the test itself, also highlight the limitations of computer intelligence and the difficulty of comparing it directly to human intelligence.
“We are still a long way from computers being able to read and comprehend general text in the same way that humans can,” Kevin Scott, Microsoft’s chief technology officer, said in a LinkedIn post.
The test, developed at Stanford University, demonstrated that, in at least some circumstances, computers can beat humans at quickly “reading” hundreds of Wikipedia entries and coming up with accurate answers to questions about Genghis Khan’s reign or the Apollo space program.
The computers, however, also made mistakes that many people wouldn’t have. Microsoft, for instance, fumbled an easy question about which member of the NFL’s Carolina Panthers got the most interceptions in the 2015 season (the correct answer was Kurt Coleman, not Josh Norman).
A person’s careful reading of the Wikipedia passage would have discovered the right answer, but the computer tripped up on the word “most” and didn’t understand that seven is bigger than four. “You need some very simple reasoning here, but the machine cannot get it,” said Jianfeng Gao of Microsoft’s AI research division.
It’s not uncommon for machine-learning competitions to pit the cognitive abilities of computers against those of humans. Like other tests, the Stanford Question Answering Dataset sparked a rivalry among research institutions and tech firms, with Google, Facebook, Tencent, Samsung and Salesforce also giving it a try.
But computers are still “far off” from truly understanding what they’re reading, said Michael Littman, a Brown University computer science professor who has tasked computers with solving crossword puzzles. Computers are getting better at the statistical intuition that allows them to scan text and find what seems relevant, but they still struggle with the logical reasoning that comes naturally to people.
“It strikes me for the kind of problem that they’re solving that it’s not possible to do better than people, because people are defining what’s correct,” Littman said of the Stanford test. “The impressive thing here is they met human performance, not that they’ve exceeded it.” AP