(These are excerpts from my book "Intelligence is not Artificial")
An Easy Science
When a physicist makes a claim, an entire community of physicists is out there to check that claim. The paper gets published only if it survives peer review, and usually many months after it was written. A discovery is usually accepted only if the experiment can be repeated elsewhere. For example, when OPERA announced particles traveling faster than light, the whole world conspired to disprove them, and eventually it succeeded. It took months of results before CERN accepted that probably (not certainly) the Higgs boson exists.
Artificial Intelligence practitioners, instead, have a much easier life. Whenever they announce a new achievement, it is largely taken at face value by the media and by the A.I. community at large.
If a computer scientist announces that her or his program has learned what a cat looks like by watching videos, the whole world posts enthusiastic headlines even if nobody has actually seen this system in action, and nobody has been able to measure and double-check its performance: What else did it recognize? Did it recognize the human beings in those videos? Did it recognize furniture? What else was in those videos? For the record, when that happened in 2012, the media consistently reported "videos" when in fact the neural network had been trained with "images" taken from videos, i.e. still images, and we were not told who picked the images and according to which criteria out of the tens of thousands of frames that constitute the average YouTube video. It does make a difference which images are fed to the neural network out of the billions of images available on YouTube. A one-minute video contains about 2,000 frames. This neural network was fed 10 million images, which is the equivalent of about 80 hours of video, a pittance compared with the millions of hours of videos available on YouTube.
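The "about 80 hours" figure can be checked with a few lines of arithmetic. This is only a sketch using the two numbers quoted above; the function name is mine, and it assumes every frame of the video would be used as an image.

```python
# Figures quoted in the text above
FRAMES_PER_MINUTE = 2_000    # "a one-minute video contains about 2,000 frames"
IMAGES_USED = 10_000_000     # "this neural network was fed 10 million images"


def equivalent_video_hours(images: int, frames_per_minute: int) -> float:
    """Hours of video that would yield `images` frames if every frame were used."""
    minutes = images / frames_per_minute
    return minutes / 60


hours = equivalent_video_hours(IMAGES_USED, FRAMES_PER_MINUTE)
print(f"{hours:.1f} hours of video")  # prints "83.3 hours of video"
```

The result, roughly 83 hours, is consistent with the rounded figure of 80 hours.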
When in 2012 Google announced that "Our vehicles have now completed more than 300,000 miles of testing" (a mile being 1.6 kilometers for the rest of the world), the media simply propagated the headline without asking simple questions such as "in how many months?", "under which conditions?", "on which roads?" or "at what time of the day?" Most people now believe that self-driving cars are feasible even though they have never been in one. Many of the same people probably don't believe all the weird consequences of Relativity and Quantum Mechanics, despite the many experiments that confirmed them.
The 2004 DARPA challenge for driverless cars was staged in the desert between Los Angeles and Las Vegas (i.e. with no traffic). The 2007 DARPA urban challenge took place at the George Air Force Base. Interestingly, a few months later two highly educated friends told me that a DARPA challenge took place in downtown Los Angeles in heavy traffic. That never took place. Too often the belief in the feats of A.I. systems feels like the stories of devout people who saw an apparition of a saint and all the evidence you can get is a blurred photo.
In 2005 the media reported that Hod Lipson at Cornell University had unveiled the first "self-assembling machine" (the same scientist in 2007 also unveiled the first "self-aware" robot), and in 2013 the media reported that the "M-blocks" developed at MIT by Daniela Rus' team were self-constructing machines. Unfortunately, these reports were wild exaggerations.
In May 1997 the IBM supercomputer "Deep Blue", programmed by Feng-hsiung Hsu (who had started building chess-playing programs in 1985 while at Carnegie Mellon University), beat then chess world champion Garry Kasparov in a widely publicized match. What was less publicized is that the match was hardly fair: Deep Blue had been equipped with an enormous amount of information about Kasparov's chess playing, whereas Kasparov knew absolutely nothing of Deep Blue; and during the match IBM engineers kept tweaking Deep Blue with heuristics about Kasparov's moves. Even less publicized were the rematches, in which the IBM programmers were explicitly forbidden to modify the machine in between games. The new, more powerful versions of Deep Blue (renamed Fritz) could beat neither Vladimir Kramnik, the new world chess champion, in 2002 nor Kasparov himself in 2003. Both matches ended in a draw.
What is incredible to me is that a machine equipped with virtually infinite knowledge of the game and of its opponent, and with lightning-speed circuits that can process a virtually infinite number of moves in a split second, cannot beat a much more rudimentary object such as the human brain, equipped with a very limited and unreliable memory: what does it take for a machine to outperform humans despite all the technological advantages it has? Divine intervention? Nonetheless, virtually nobody in the scientific community (let alone in the mainstream media) questioned the claim that a machine had beaten the greatest chess player in the world.
If IBM is correct and, as it claimed at the time, Deep Blue could calculate 200 million positions per second whereas Kasparov's brain could only calculate three per second, who is smarter, the one who can become the world champion with just three calculations per second or the one who needs 200 million calculations per second? If Deep Blue were conscious, it would be wondering "Wow, how can this human being be so intelligent?"
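To put the two figures side by side, here is a short sketch of the ratio. The numbers are the ones quoted above (IBM's claim for Deep Blue and the figure given for Kasparov); the function name is only illustrative.

```python
DEEP_BLUE_POSITIONS_PER_SECOND = 200_000_000  # IBM's claim, as quoted in the text
KASPAROV_POSITIONS_PER_SECOND = 3             # figure quoted in the text


def search_ratio(machine_rate: int, human_rate: int) -> float:
    """How many positions the machine examines for each one the human examines."""
    return machine_rate / human_rate


ratio = search_ratio(DEEP_BLUE_POSITIONS_PER_SECOND, KASPAROV_POSITIONS_PER_SECOND)
print(f"About {ratio:,.0f} machine positions per human position")
# prints "About 66,666,667 machine positions per human position"
```

In other words, on these figures the machine needed tens of millions of calculations for every one made by the human it was playing.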
What Deep Blue certainly achieved was to get better at chess than its creators. But that is true of the medieval clock too, capable of keeping the time in a way that no human brain could, and of many other tools and machines.
Finding the most promising move in a game of chess is a lot easier than predicting the score of a Real Madrid vs Barcelona game, something that neither machines nor humans are even remotely close to achieving. The brute force of the fastest computers is enough to win a chess game, but the brute force of the fastest computers is not enough to get a better soccer prediction than, say, the prediction made by a drunk soccer fan in a pub. Ultimately what we are contemplating when a computer beats a chess master is still what amazed the public of the 1950s: the computer's ability to run many calculations at lightning speed, something that no human being can do.
IBM's Watson of 2013 consumes 85,000 Watts compared with the human brain's 20 Watts. (Again: let both the human and the machine run on 20 Watts and see who wins). For the televised match of 2011 with the human experts, Watson was equipped with 200 million pages of information including the whole of Wikipedia; and, in order to be fast, all that knowledge had to be stored in RAM, not on disk. The human experts who competed against Watson did not have access to all that information. Watson was allowed to use 15 petabytes of storage, whereas the humans were not allowed to browse the web or keep a database handy. De facto the human experts were not playing against one machine but against a whole army of machines, enough machines working to master and process all those data. A fairer match would be to pit Watson against thousands of human experts, chosen so as to have the same amount of data. And, again, the questions were conveniently provided to the machine as text files instead of spoken language.
If you use the verb "to understand" the way we normally use it, Watson never understood a single question. And those were the easiest possible questions, designed specifically to be brief and unambiguous (unlike the many ambiguities hidden in ordinary human language). Watson didn't even hear the questions (they were written to it), let alone understand what the questioner was asking. Watson was allowed to ring the bell using a lightning-speed electrical signal, whereas the humans had to lift a finger and press the button, an action that is orders of magnitude slower.
Over the decades I have personally witnessed several demos of A.I. systems that required the audience to simply watch and listen: only the creator was allowed to operate the system.
Furthermore, some of the most headline-capturing Artificial Intelligence research is supported by philanthropists at private institutions with little or no oversight by academia.
Many of the A.I. systems of the past have never been used outside the lab that created them. Their use by the industry, in particular, has been virtually nil.
For example, on 1 October 1999 Science Daily announced: "Machine demonstrates superhuman speech recognition abilities. University of Southern California biomedical engineers have created the world's first machine system that can recognize spoken words better than humans can." It was referring to a neural network trained by Theodore Berger's team. As far as I can tell, that project has been abandoned and it was never used in any practical application.
In October 2011 a Washington Post headline asked "Apple Siri: the next big revolution in how we interact with gadgets?" Meanwhile, this exchange was going viral on social media.
User: Siri, call me an ambulance
Siri: Okay, from now on I'll call you "an ambulance"
(Note: in 2017 the app-measurement firm Verto Analytics estimated that, between May 2016 and May 2017, Siri lost 7.3 million monthly users, or about 15% of its total user base in the USA).
In 2014 the media announced that Vladimir Veselov's and Eugene Demchenko's program Eugene Goostman, which simulated a 13-year-old Ukrainian boy, passed the Turing Test at the Royal Society in London (Washington Post: "A computer just passed the Turing Test in landmark trial"). It makes you wonder what was the I.Q. of the members of the Royal Society, or, at least, of the event organizer, the self-appointed "world's first cyborg" Kevin Warwick, and what was the I.Q. of the journalists who reported his claims. It takes very little ingenuity to fool a "chatbot" impersonating a human being: "How many letters are in the word of the number that follows 4?" Any human being can work out that 5 follows 4 and that the word "five" contains four letters, but a bot won't know what you are talking about. I can see the bot programmer, who has just read this sentence, frantically coding this question and its answer into the bot, but there are thousands, if not millions, of questions like this one that bots will fail for as long as they don't understand the context. How many words are in this sentence? You just counted them, right? But a bot won't understand the question. Of course, if your Turing Test consists in asking the machine questions whose answers can easily be found on Wikipedia by any idiot, then the machine will easily pass the test.
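For what it is worth, here is roughly what "frantically coding this question" might look like. It is a hypothetical sketch, not the code of any real chatbot: both helper functions and the word list are my own illustration, and they only work after a human has already understood the question and translated it into a function call.

```python
# Hypothetical lookup table covering only the numbers zero to ten
NUMBER_WORDS = [
    "zero", "one", "two", "three", "four", "five",
    "six", "seven", "eight", "nine", "ten",
]


def letters_in_successor(n: int) -> int:
    """Count the letters in the English word for n + 1 (valid only for 0 <= n <= 9)."""
    if not 0 <= n <= 9:
        raise ValueError("this sketch only knows the numbers zero to ten")
    return len(NUMBER_WORDS[n + 1])


def count_words(sentence: str) -> int:
    """Count whitespace-separated words in a sentence."""
    return len(sentence.split())


print(letters_in_successor(4))                              # prints 4 ("five")
print(count_words("How many words are in this sentence?"))  # prints 7
```

The arithmetic is trivial once the question has been reduced to a function call. The hard part, which this sketch does not touch, is getting from the sentence typed by the judge to that call, and doing so for the next unforeseen question as well.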
A video that went viral on social media showed a robot folding towels. The video was played at 24 times real speed, and nobody seemed to pay tribute to dry cleaners: it is certainly impressive that a robot can fold towels, but dry cleaners already use machines that fold shirts, a more complicated task.
In 2015 both Microsoft and Baidu announced that their image-recognition software was outperforming humans, i.e. that the error rate of the machine was lower than the error rate of the average human being in recognizing objects. The average human error rate is considered to be 5.1%. However, the Microsoft technology that surfaced (late that year) is CaptionBot, which has become famous not for its usefulness in recognizing scenes but for its silly mistakes that no human being would make. As for Baidu, its Deep Image system, which ran on the custom-built supercomputer Minwa (432 core processors and 144 GPUs), has not been made available to the public as an app. However, Baidu was disqualified from the most prestigious image-recognition competition in the world (the ImageNet Competition) for cheating. Recognizing images was supposed to be Google's specialty but Google Goggles, introduced in 2010, has flopped. I just tried Goggles again (May 2016). It didn't recognize: towel, toilet paper, faucet, blue jeans... It recognized only one object: the clock. Officially, Google's image recognition software has an error rate of 5%. My test shows more like a 90% error rate. In 2015 the Google Photos app tagged two African-Americans as gorillas, causing accusations of racism when in fact it was just poor technology. The media widely reported that Facebook's DeepFace (launched in 2015) correctly identified photos in 97.25% of cases (so claimed Facebook), even causing the European Union to warn Facebook that people's privacy must be protected, but in 2016 it identifies few of my 5,000 friends: it works only if you have a small number of friends.
I am also confused by all these announcements that seem to contradict each other. In March 2015 Google announced that FaceNet, a 22-layer deep convolutional network, recognized the faces of celebrities with a negligible error rate. Google only released an open-source version of FaceNet that doesn't even come close to what they claimed. In June 2015 Facebook announced that a new algorithm was capable of recognizing partially covered faces with 83% accuracy, but it never clarified what "partially covered" means, and I have seen no improvements to the error rate in recognizing my friends even when the face is clearly visible.
In September 2017 Apple announced its Face ID technology for facial recognition in the iPhone X smartphone. It took exactly two months for someone (Vietnamese firm Bkav) to find out how to fool the system, and it was simply a matter of creating a mask with some cheap 3D-printed material and a little paint.
Sometimes the claims border on the ridiculous. Let's say that I build an app that asks you to submit the photo of an object, then the picture gets emailed to me, and I email back to you the name of the object: are you impressed by such an app? And, still, countless reviewers marveled at CamFind, the app introduced in 2013 by Los Angeles-based Image Searcher, an app that "recognizes" objects. In most cases it is actually not the app that recognizes objects, but their huge team in the Philippines that is frantically busy tagging the images submitted by users. Remember the automata of centuries ago, that in reality were people camouflaged as machines?
In 1769, a chess-playing machine called the Turk, created by Wolfgang von Kempelen, toured the world, winning games wherever it went: it concealed a man inside so well that it wasn't exposed as a hoax for many years.
(To be fair, Microsoft's CaptionBot is not bad at all: it was criticized by people who expected human-level abilities in the machine, but, realistically, it exceeds my expectations).
Very few people bother to double-check the claims of the A.I. community. The media have a vested interest that the story be told (it sells) and the community as a whole has a vested interest that government and donors believe in the discipline's progress so that more funds will be poured into it.