Okay, here are Alicebot inventor Dr. Richard Wallace's answers to your questions. You're about to enter a world that contains interesting thoughts on A.I., a bit of marijuana advocacy, a courtroom drama, tales of academic politics and infighting, personal ranting, discussion of the nature of mental illness, and comments about the state of American society and the world in general. Yes, all this in one interview so long and strong we had to break it up into three parts to make it fit on our pages. This is an amazing work, well worth reading all the way to the end.
1) AI through simulation?
by Jeppe Salvesen
Do you think that the ever increasing processing power will eventually enable us to fully simulate the human brain? What ramifications would this have for the A.I. discipline?
My longstanding opinion is that neural networks are the wrong level of abstraction for understanding intelligence, human or machine.
Neurons are the transistors of the brain. They are the low level switching components out of which higher-order functionality is built. But like the individual transistor, studying the individual neuron tells us little about these higher functions.
Suppose an alien came down to Earth who had never seen a computer before. Assuming interstellar travel is possible without a computer! He/she might be tempted to break it open, and discover that it is made of millions of tiny transistors. The alien may try to discover how the computer works by measuring the electronic signals in the transistors. But they would miss the operating system completely. The transistors tell us nothing about the software.
Similarly, neurons tell us little about the higher order software running on our brains.
Significantly, no one has ever proved that the brain is a *good* computer. It seems to run some tasks like visual recognition better than our existing machines, but it is terrible at math, prone to errors, susceptible to distraction, and it requires half its uptime for food, sleep, and maintenance.
It sometimes seems to me that the brain is actually a very shitty computer. So why would you want to build a computer out of slimy, wet, broken, slow, hungry, tired neurons? I chose computer science over medical school because I don't have the stomach for those icky, bloody body parts. I prefer my technology clean and dry, thank you. Moreover, it could be the case that an electronic, silicon-based computer is more reliable, faster, more accurate, and cheaper.
I find myself agreeing with the Churchlands that the notion of consciousness belongs to "folk psychology" and that there may be no clear brain correlates for the ego, id, emotions as they are commonly classified, and so on. But to me that does not rule out the possibility of reducing the mind to a mathematical description, which is more or less independent of the underlying brain architecture. That baby doesn't go out with the bathwater. A.I. is possible precisely because there is nothing special about the brain as a computer. In fact the brain is a shitty computer. The brain has to sleep, needs food, thinks about sex all the time. Useless!
I always say, if I wanted to build a computer from scratch, the very last material I would choose to work with is meat. I'll take transistors over meat any day. Human intelligence may even be a poor kludge of the intelligence algorithm on an organ that is basically a glorified animal eyeball. From an evolutionary standpoint, our supposedly wonderful cognitive skills are a very recent innovation. It should not be surprising if they are only poorly implemented in us, like the lung of the first mudfish. We can breathe the air of thought and imagination, but not that well yet.
And remember, no one has proved that our intelligence is a successful adaption, over the long term. It remains to be seen if the human brain is powerful enough to solve the problems it has created.
Functionalism is basically the view that the mind is the software, and the brain is the hardware. It holds that mental states are equivalent to the states of a Turing Machine. Behaviorism was a pre-computational theory, which imagines the nervous system as a complex piece of machinery like a telephone exchange, but they didn't think much about software. Dualism goes back to Descartes. It is the view that the mind and brain are separate and distinct things, possibly affecting each other, or possibly mirroring each other.
My view is a kind of modified dualism in which I claim that the soul, spirit, or consciousness may exist, but for most people, most of the time, it is almost infinitesimally small, compared with the robotic machinery responsible for most of our thought and action. Descartes never talked about the relative weights of brain and mind, but you can read in an implicit 50-50 assumption in most Dualist literature. My idea is more like 99-1, or even 99.999999% automatic machinery and .00000001% self-awareness, creativity, consciousness, spirit or what have you.
That's not to say that some people can't be more enlightened than others. But for the vast herd out there, on average, consciousness is simply not a significant factor. Not even a second- or third-order effect. Consciousness is marginal.
I say this with such confidence because of my experience building robot brains over the past seven years. Almost everything people ever say to our robot falls into one of about 45,000 categories. Considering the astronomical number of things people could say, if every sentence was an original line of poetry, 45,000 is a very, very small number.
2) Turing Test
I noticed that your AliceBot won the 2000 Loebner Prize for most human responses. My question is: "As an Artificial Intelligence researcher, do you feel that the Loebner Prize represents a legitimate variety of testing, or did you just want the $2000?"
I was pretty sure that almost all AI researchers came to the agreement about thirty years ago that the original imitation game as proposed by Turing in 1950 was useful only as a mental exercise, not in practice. Do you feel that the types of developments that the Loebner prize supports (intentional, hard-coded spelling mistakes, etc.) are actually productive in terms of the AI research project?
In case you haven't noticed, the field of Artificial Intelligence (defined however you wish) has almost nothing to do with science. It is all about politics. When you look at all the people working professionally in the field of A.I., it brings to mind the old joke:
Q: How many Carnegie Mellon Ph.D.s does it take to screw in a light bulb?
A: Two. One to change the bulb, and one to pull the chair out from under him.
The only rule most of these people know is: undermine the competition at all costs, by whatever legal means, or whatever they can get away with. That is how you become King of the A.I. Anthill.
Having a good theory or better implementation of anything is beside the point. Being able to "play the game" and knock out the competition, that is what it is all about. Swim with sharks or be eaten by them.
Especially in the age of increased competition for diminishing jobs and funding, scientific truth takes a back seat to save-your-ass.
Unfortunately it seems that the A.I. problem is inseparable from politics.
When I say that academia is corrupt in America, I don't mean that professors are accepting bribes and giving kickbacks for government contracts. There may be a financial motive in some cases, such as the use of overhead funds for a "course buyout" to reduce a professor's workload, but I am not talking about the kind of corruption associated with Wall Street and Washington exactly. I am talking about the replacement of science with politics as the main item on the academic agenda.
It must not have always been so. At one time, I believe academics were appointed and promoted primarily on the basis of merit and accomplishment. Within the last 20 years or so in the United States this has gradually changed into a system in which political correctness, slickness, and good salesmanship are more highly valued than good science. I don't pretend to understand the reasons for this, but I can point to many examples within our own community.
I have written that it is like a dysfunctional family. Those in positions of leadership and authority have mental health, drug and/or alcohol problems that make them incapable of carrying out their administrative responsibilities. In response, people who are skilled at "enabling" or "nursing" the dysfunctional leaders get promoted and advanced. Those who are prone to logical thinking and speaking the truth are discarded, because they make the authorities face their unconscious anxieties.
I often say, people don't go into computer science because they enjoy working with the public. But as the field has matured, I think it has attracted people who are more comfortable wearing business suits and attending strategy meetings than tinkering on a lab bench or writing a research paper. As computer science departments matured, the people already in them began to want everything to remain the same until they retired. They didn't want to hire young professors with a lot of new ideas about the administration. They hired young professors who wanted everything to stay exactly like it was, no matter what.
You may think that the politicization of a field like computer science is no big deal. We can have slick politicians instead of scientists running university CS departments, and not cause a lot of problems. But I think it is a really big problem in other fields, especially in medical science, especially in drugs and mental health.
Take LSD for example. Discovered by Albert Hofmann in 1945, LSD is the most powerful drug ever developed. If you have ever gotten a prescription for any drug, you may have noticed that the dosage is usually given in "milligrams". But the dosage of LSD is "micrograms". It has the lowest ED50 of any known drug.
In the early 1960's there was some very promising research at Harvard applying LSD to depressed patients like me. The work was never completed or published for, guess what, political reasons. Subsequently, LSD was classified as a "Schedule I" drug with no useful medical value. This was not a decision based on sound science but on politics and fear. Even today there is zero research on this topic. Did you ever wonder why there is no Department of Psychedelic Studies on any university campus? It is a gaping hole in the academic curriculum, filled only by the informal undergraduate ratings of colleges as "party schools".
Even the very name of the federal agency that provides funding for drug research, the National Institute on Drug Abuse, prejudices the applications and the results. The native born American hippie agronomy student who got his Ph.D. in the 1970's is growing pot underground in California today. The immigrant doctor who "proved" that marijuana causes cancer got the NIDA grant and has tenure at UCLA. What's wrong with this picture?
Until 2 years ago, there was no federally funded research on the medical benefits of marijuana since the 1970's. Even now the only funded research is for terminal illnesses, and it seems like it will take a long time before they consider mental illnesses like mine. I conducted a survey of patients in San Francisco and discovered that "pain" was the #1 symptom for medical marijuana but "depression" was #2, and terminal illnesses like AIDS and cancer were lower on the list. So I am not alone in the perception that there is a patient need for research on this drug.
The problem here, my friends, is that NIDA is part of a spectrum of trouble that includes once respected agencies such as NASA, NSF and DARPA. It is an octopus of political corruption that reaches into MIT and CMU and Berkeley and darkens everything it touches. It calls into question the quality and even the veracity of the scientific results and publications. We all witnessed the beginning of this even when we were all friends together at the ICRA conferences in the acrimonious interchanges between academia and industry. I myself saw enough of the system from the inside at NYU and Lehigh to know that science plays almost no role in the hiring, promoting or review process. It's all politics.
Not to place blame, but I think graduate advisors should be more straightforward with students about this point. It would be better to put more time into training them how to "schmooze" and "work the system" than how to solve mathematical problems, if they want their students to be successful. Either that, or they should work on changing the system back to merit based promotion.
3) My question (with answer)
Historically, AI has done poorly managing public expectations. People expected thinking, understanding computers, while researchers had trouble getting computers to successfully disambiguate simple sentences. This is not good PR. Do you think the field has learned from this? If so, what should the public expect, and how do we excite them about it?
Just for fun, I asked slashwallace a shortened version of the question, do you think your response would differ?
Human: Historically AI has done poorly managing the public's expectations,
do you think this will continue?
SlashWallace: Where did he get it?
Hugh Loebner is an independently wealthy, eccentric businessman, activist and philanthropist. In 1990 Dr. Loebner, who holds a Ph.D. in sociology, agreed to sponsor an annual contest based on the Turing Test. The contest awards medals and cash prizes for the "most human" computer. Since its inception, the Loebner contest has been a magnet for controversy.
One of the central disputes arose over Hugh Loebner's decision to award the Gold Medal and $100,000 top cash prize only when a robot is capable of passing an "audio-visual" Turing Test. The rules for this Grand Prize contest have not even been written yet. So it remains unlikely that anyone will be awarded the gold Loebner medal in the near future. The Silver and Bronze medal competitions are based on the standard, text-only Turing Test (STT). In 2001, eight programs played alongside two human confederates. A group of 10 judges rotated through each of ten terminals and chatted about 15 minutes with each. The judges then ranked the terminals on a scale of "least human" to "most human." Winning the Silver Medal and its $25,000 prize requires that the judges rank the program higher than half the human confederates. In fact one judge ranked A.L.I.C.E. higher than one of the human confederates in 2001. Had all the judges done so, she might have been eligible for the Silver Medal as well, because there were only two confederates.
To really understand how we accomplished this, I have to teach you some AIML.
The basic unit of knowledge in AIML is called a category. Each category consists of an input question, an output answer, and an optional context.
The question, or stimulus, is called the pattern. The answer, or response, is called the template. The two types of optional context are called "that" and "topic."
The AIML pattern language is simple, consisting only of words, spaces, and the wildcard symbols _ and *.
The words may consist of letters and numerals, but no other characters. The pattern language is case invariant.
Words are separated by a single space, and the wildcard characters function like words.
The first versions of AIML allowed only one wild card character per pattern.
The AIML 1.01 standard permits multiple wildcards in each pattern, but the language is designed to be as simple as possible for the task at hand, simpler even than regular expressions.
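The pattern language described above is small enough to sketch in a few lines of Python. This is only an illustration, not the A.L.I.C.E. implementation: the helper names `normalize` and `match` are made up here, and the sketch assumes that both _ and * bind one or more whole words and that punctuation is simply discarded.

```python
import re

def normalize(text):
    """Uppercase the input and keep only letters, digits, and spaces."""
    cleaned = re.sub(r"[^A-Za-z0-9 ]", " ", text)
    return cleaned.upper().split()

def _match(pat, words):
    """Match a list of pattern words against a list of input words."""
    if not pat:
        return [] if not words else None
    head, rest = pat[0], pat[1:]
    if head in ("_", "*"):
        # Assumption: a wildcard binds one or more words; try the shortest first.
        for n in range(1, len(words) + 1):
            tail = _match(rest, words[n:])
            if tail is not None:
                return [" ".join(words[:n])] + tail
        return None
    if words and words[0] == head:
        return _match(rest, words[1:])
    return None

def match(pattern, text):
    """Return the words bound to each wildcard, or None if there is no match."""
    return _match(pattern.split(), normalize(text))
```

For instance, `match("DO YOU KNOW WHO * IS", "Do you know who Socrates is?")` binds the wildcard to `SOCRATES`, while `match("MOTHER *", "mother")` fails because nothing is left for the wildcard to bind.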
The template is the AIML response or reply. In its simplest form, the template consists of only plain, unmarked text.
More generally, AIML tags transform the reply into a mini computer program which can save data, activate other programs, give conditional responses, and recursively call the pattern matcher to insert the responses from other categories.
Most AIML tags in fact belong to this template side sublanguage.
The optional context portion of the category consists of two variants, called <that> and <topic>. The <that> tag appears inside the category, and its pattern must match the robot's last utterance.
Remembering one last utterance is important if the robot asks a question. The <topic> tag appears outside the category, and collects a group of categories together.
The topic may be set inside any template. AIML is not exactly the same as a simple database of questions and answers. The pattern matching "query" language is much simpler than something like SQL. But a category template may contain the recursive <srai> tag, so that the output depends not only on one matched category, but also any others recursively reached through <srai>.
AIML implements recursion with the <srai> operator. No agreement exists about the meaning of the acronym.
The "A.I." stands for artificial intelligence, but "S.R." may mean "stimulus-response," "syntactic rewrite," "symbolic reduction," "simple recursion," or "synonym resolution." The disagreement over the acronym reflects the variety of applications for <srai> in AIML. Each of these is described in more detail in a subsection below:
(1). Symbolic Reduction-Reduce complex grammatical forms to simpler ones.
(2). Divide and Conquer-Split an input into two or more subparts, and combine the responses to each.
(3). Synonyms-Map different ways of saying the same thing to the same reply.
(4). Spelling or grammar corrections.
(5). Detecting keywords anywhere in the input.
(6). Conditionals-Certain forms of branching may be implemented with <srai>.
(7). Any combination of (1)-(6).
The danger of <srai> is that it permits the botmaster to create infinite loops. Though posing some risk to novice programmers, we surmised that including <srai> was much simpler than any of the iterative block structured control tags which might have replaced it.
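One way an interpreter might protect a novice botmaster from such loops is to remember which patterns have already been visited while following redirects. The sketch below is a simplified assumption, not how any particular AIML interpreter works: it handles only exact patterns with no wildcards or predicates, and the `categories` dictionaries are invented for illustration.

```python
def respond(categories, pattern, seen=None):
    """Follow <srai>-style redirects until a plain-text reply is reached.

    `categories` maps a pattern to ("text", reply) or ("srai", other_pattern).
    Raises RuntimeError if the redirects revisit a pattern (an infinite loop).
    """
    seen = set() if seen is None else seen
    if pattern in seen:
        raise RuntimeError("srai loop detected at: " + pattern)
    seen.add(pattern)
    kind, value = categories[pattern]
    if kind == "srai":
        return respond(categories, value, seen)
    return value

# Hypothetical categories: two synonyms that reduce to one stored reply.
good = {
    "HELLO": ("text", "Hi there!"),
    "HI": ("srai", "HELLO"),
    "HOWDY": ("srai", "HI"),
}

# Hypothetical categories that redirect to each other forever.
bad = {
    "A": ("srai", "B"),
    "B": ("srai", "A"),
}
```

Here `respond(good, "HOWDY")` reaches the stored reply in two steps, while `respond(bad, "A")` raises an error instead of recursing without end. With wildcards and predicates in play, a depth limit is a cruder but more general guard.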
(1). Symbolic Reduction
Symbolic reduction refers to the process of simplifying complex grammatical forms into simpler ones. Usually, the atomic patterns in categories storing robot knowledge are stated in the simplest possible terms, for example we tend to prefer patterns like "WHO IS SOCRATES" to ones like "DO YOU KNOW WHO SOCRATES IS" when storing biographical information about Socrates. Many of the more complex forms reduce to simpler forms using AIML categories designed for symbolic reduction:
<category>
<pattern>DO YOU KNOW WHO * IS</pattern>
<template><srai>WHO IS <star/></srai></template>
</category>
Whatever input matched this pattern, the portion bound to the wildcard * may be inserted into the reply with the markup <star/>. This category reduces any input of the form "Do you know who X is?" to "Who is X?"
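The reduction step can be imitated in Python to see the recursion at work. This is a rough sketch under stated assumptions: patterns are translated to regular expressions, only the first wildcard is used for <star/>, the stored answer about Socrates is a placeholder written for this example, and the depth limit of 10 is arbitrary.

```python
import re

# Hypothetical mini knowledge base: pattern -> template.
CATEGORIES = {
    "WHO IS SOCRATES": "Socrates was a Greek philosopher.",
    "DO YOU KNOW WHO * IS": "<srai>WHO IS <star/></srai>",
}

def to_regex(pattern):
    """Translate a pattern with * wildcards into an anchored regex."""
    parts = [r"(.+)" if word == "*" else re.escape(word) for word in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")

def respond(text, depth=0):
    """Reply to `text`, reducing it again whenever the template is an <srai>."""
    if depth > 10:  # arbitrary guard against runaway recursion
        raise RecursionError("too many reductions")
    words = " ".join(re.sub(r"[^A-Za-z0-9 ]", "", text).upper().split())
    for pattern, template in CATEGORIES.items():
        found = to_regex(pattern).match(words)
        if found is None:
            continue
        star = found.group(1) if found.groups() else ""
        template = template.replace("<star/>", star)
        inner = re.fullmatch(r"<srai>(.*)</srai>", template)
        return respond(inner.group(1), depth + 1) if inner else template
    return "I have no answer for that."
```

Asking `respond("Do you know who Socrates is?")` is first rewritten to `WHO IS SOCRATES` and then answered from the simpler category, so the knowledge itself is stored only once.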
(2). Divide and Conquer
Many individual sentences may be reduced to two or more subsentences, and the reply formed by combining the replies to each. A sentence beginning with the word "Yes" for example, if it has more than one word, may be treated as the subsentence "Yes." plus whatever follows it:
<category>
<pattern>YES *</pattern>
<template><srai>YES</srai> <sr/></template>
</category>
The markup <sr/> is simply an abbreviation for <srai><star/></srai>.
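(3). Synonyms
The AIML 1.01 standard does not permit more than one pattern per category. Synonyms are perhaps the most common application of <srai>. Many ways to say the same thing reduce to one category, which contains the reply. For example, categories for "HI" and "HOWDY" might each contain only <srai>HELLO</srai>, so that the greeting itself is written once, under "HELLO".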
(4). Spelling and Grammar correction
The single most common client spelling mistake is the use of "your" when "you're" or "you are" is intended. Not every occurrence of "your" however should be turned into "you're." A small amount of grammatical context is usually necessary to catch this error:
<category>
<pattern>YOUR A *</pattern>
<template>I think you mean "you're" or "you are" not "your."
<srai>YOU ARE A <star/></srai>
</template>
</category>
Here the bot both corrects the client input and acts as a language tutor.
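(5). Keywords
Frequently we would like to write an AIML template which is activated by the appearance of a keyword anywhere in the input sentence. The general format of four AIML categories is illustrated by this example borrowed from ELIZA:
<category>
<pattern>MOTHER</pattern> <template> Tell me more about your family. </template>
</category>
<category>
<pattern>_ MOTHER</pattern> <template><srai>MOTHER</srai></template>
</category>
<category>
<pattern>MOTHER *</pattern> <template><srai>MOTHER</srai></template>
</category>
<category>
<pattern>_ MOTHER *</pattern> <template><srai>MOTHER</srai></template>
</category>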
The first category both detects the keyword when it appears by itself, and provides the generic response. The second category detects the keyword as the suffix of a sentence. The third detects it as the prefix of an input sentence, and finally the last category detects the keyword as an infix. Each of the last three categories uses <srai> to link to the first, so that all four cases produce the same reply, but it needs to be written and stored only once.
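To see why four patterns are enough, it helps to classify where the keyword falls in the input. The Python sketch below is only an illustration; the function name `keyword_case` is invented here, it assumes each wildcard binds at least one word, and it looks only at the first occurrence of the keyword.

```python
import re

def keyword_case(keyword, text):
    """Name the keyword pattern that would catch `text`, or None if none does."""
    words = re.sub(r"[^A-Za-z0-9 ]", " ", text).upper().split()
    if keyword not in words:
        return None
    i = words.index(keyword)                      # first occurrence only
    before, after = i > 0, i < len(words) - 1
    if before and after:
        return "_ %s *" % keyword                 # infix
    if before:
        return "_ %s" % keyword                   # suffix
    if after:
        return "%s *" % keyword                   # prefix
    return keyword                                # keyword by itself
```

Every input containing the keyword lands in exactly one of the four cases, which is why all of them can be routed to a single stored reply.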
(6). Conditionals
It is possible to write conditional branches in AIML, using only the <srai> tag. Consider three categories:
<category>
<pattern>WHO IS HE</pattern> <template><srai>WHOISHE <get name="he"/></srai></template>
</category>
<category>
<pattern>WHOISHE *</pattern>
<template>He is <get name="he"/>.</template>
</category>
<category>
<pattern>WHOISHE UNKNOWN</pattern>
<template>I don't know who he is.</template>
</category>
Provided that the predicate "he" is initialized to "Unknown," the categories execute a conditional branch depending on whether "he" has been set. As a convenience to the botmaster, AIML also provides the equivalent function through the <condition> tag.
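The branch can be spelled out in ordinary code. This Python sketch rests on an assumption about the category patterns: that the first category resubmits "WHOISHE" followed by the predicate value, and that an exact "WHOISHE UNKNOWN" pattern takes priority over a wildcard "WHOISHE *" pattern. The function name `who_is_he` and the `predicates` dictionary are made up for the example.

```python
def who_is_he(predicates):
    """Mimic the three categories: reduce to WHOISHE <value>, then branch."""
    value = predicates.get("he", "Unknown")    # the predicate starts out as "Unknown"
    reduced = "WHOISHE " + value.upper()       # what <srai> would resubmit
    if reduced == "WHOISHE UNKNOWN":           # assumed: exact pattern beats the wildcard
        return "I don't know who he is."
    return "He is %s." % value                 # assumed wildcard pattern WHOISHE *
```

With no value stored, `who_is_he({})` takes the "unknown" branch; once the predicate is set, for example `who_is_he({"he": "Socrates"})`, the wildcard branch answers instead.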
The keyword "that" in AIML refers to the robot's previous utterance. Specifically, if the robot responds with a multiple sentence paragraph, the value of that is set to the last sentence in the sequence. The choice of the keyword "that" is motivated by its use in ordinary language:
R: Today is yesterday.
C: That makes no sense.
R: The answer is 3.1412926 approximately.
C: That is cool.
In AIML the syntax <that>...</that> encloses a pattern that matches the robot's previous utterance. A common application of <that> is found in yes-no questions:
<category>
<pattern>YES</pattern>
<that>DO YOU LIKE MOVIES</that>
<template>What is your favorite movie?</template>
</category>
This category is activated when the client says YES. The robot must find out what he is saying "yes" to. If the robot asked, "Do you like movies?," this category matches, and the response, "What is your favorite movie?," continues the conversation along the same lines. One interesting application of <that> is a set of categories that enable the robot to respond to knock-knock jokes:
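<category>
<pattern>KNOCK KNOCK</pattern>
<template>Who is there?</template>
</category>
<category>
<pattern>*</pattern>
<that>WHO IS THERE</that>
<template><person/> who?</template>
</category>
<category>
<pattern>*</pattern>
<that>* WHO</that>
<template>Ha ha very funny, <get name="name"/>.</template>
</category>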
Together, these categories produce the following dialogue:
C: Knock knock.
R: Who's there?
C: Banana.
R: Banana who?
C: Knock knock.
R: Who's there?
C: Banana.
R: Banana who?
C: Knock knock.
R: Who's there?
C: Orange.
R: Orange who?
C: Orange you glad I didn't say banana.
R: Ha ha very funny, Nancy.
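The role of the robot's last utterance in this exchange can be imitated with a short Python sketch. It is a loose model, not the AIML interpreter: the three rules are an assumed reading of the knock-knock categories (one for "KNOCK KNOCK", one for any input after "WHO IS THERE", one for any input after "* WHO"), the expansion of "who's" to "who is" is an assumed normalization step, and the client name is passed in as a parameter.

```python
import re

def matches(pattern, text):
    """True if `text` fits `pattern`, where * stands for one or more words."""
    parts = ["(.+)" if w == "*" else re.escape(w) for w in pattern.split()]
    return re.match("^" + " ".join(parts) + "$", text) is not None

def normalize(text):
    """Uppercase, expand WHO'S (an assumed substitution), drop punctuation."""
    text = text.upper().replace("WHO'S", "WHO IS")
    return " ".join(re.sub(r"[^A-Z0-9 ]", " ", text).split())

# (input pattern, <that> pattern, reply builder), most specific input first.
CATEGORIES = [
    ("KNOCK KNOCK", "*", lambda said, name: "Who's there?"),
    ("*", "WHO IS THERE", lambda said, name: said.capitalize() + " who?"),
    ("*", "* WHO", lambda said, name: "Ha ha very funny, %s." % name),
]

def chat(lines, name="Nancy"):
    """Feed client lines to the bot, tracking its last utterance as <that>."""
    that, replies = "", []
    for line in lines:
        said = normalize(line)
        for pattern, that_pattern, build in CATEGORIES:
            that_ok = that_pattern == "*" or matches(that_pattern, normalize(that))
            if matches(pattern, said) and that_ok:
                that = build(said, name)
                replies.append(that)
                break
    return replies

joke = ["Knock knock.", "Banana.", "Knock knock.", "Banana.",
        "Knock knock.", "Orange.", "Orange you glad I didn't say banana."]
```

Calling `chat(joke)` yields the robot's seven lines of the dialogue in order. The same client input, a single word, gets a different reply depending only on what the robot said last.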
Internally the AIML interpreter stores the input pattern, that pattern and topic pattern along a single path, like:
INPUT <that> THAT <topic> TOPIC
When the values of <that> or <topic> are not specified, the program implicitly sets the values of the corresponding THAT or TOPIC pattern to the wildcard *.
The first part of the path to match is the input. If more than one category has the same input pattern, the program may distinguish between them depending on the value of <that>. If two or more categories have the same <pattern> and <that>, the final step is to choose the reply based on the <topic>. This structure suggests a design rule: never use <that> unless you have written two categories with the same <pattern>, and never use <topic> unless you write two categories with the same <pattern> and <that>. Still, one of the most useful applications for <topic> is to create subject-dependent "pickup lines," like:
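<topic name="CARS">
<category>
<pattern>*</pattern>
<template>
<random>
<li>What's your favorite car?</li>
<li>What kind of car do you drive?</li>
<li>Do you get a lot of parking tickets?</li>
<li>My favorite car is one with a driver.</li>
</random>
</template>
</category>
</topic>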
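The single-path storage described above can be sketched as a nested dictionary in Python. Everything here is an assumption made for illustration rather than a description of the actual interpreter: the function names, the "UNDEFINED" placeholder used when the client has no current <that> or <topic>, the matching order (_ first, then the exact word, then *), and the sample categories, including the invented "Okay." fallback.

```python
MARKERS = ("<that>", "<topic>")

def path(pattern, that="*", topic="*"):
    """Build the single path INPUT <that> THAT <topic> TOPIC as a word list."""
    return pattern.split() + ["<that>"] + that.split() + ["<topic>"] + topic.split()

def add(root, pattern, template, that="*", topic="*"):
    """Store a category in a nested-dict trie keyed by the path words."""
    node = root
    for word in path(pattern, that, topic):
        node = node.setdefault(word, {})
    node["<template>"] = template

def find(node, words):
    """Walk the trie: try _, then the exact word, then *, with backtracking."""
    if not words:
        return node.get("<template>")
    head, rest = words[0], words[1:]
    for key in ("_", head, "*"):
        child = node.get(key)
        if child is None:
            continue
        if key not in ("_", "*"):
            found = find(child, rest)
            if found is not None:
                return found
        elif head not in MARKERS:
            # A wildcard absorbs one or more words but never crosses a marker.
            for n in range(1, len(words) + 1):
                if words[n - 1] in MARKERS:
                    break
                found = find(child, words[n:])
                if found is not None:
                    return found
    return None

def ask(root, text, that="UNDEFINED", topic="UNDEFINED"):
    """Look up a reply for already-cleaned input text (no punctuation)."""
    return find(root, path(text.upper(), that.upper(), topic.upper()))

root = {}
add(root, "YES", "What is your favorite movie?", that="DO YOU LIKE MOVIES")
add(root, "YES", "Okay.")                                  # hypothetical fallback
add(root, "*", "What's your favorite car?", topic="CARS")
```

With these three categories, `ask(root, "yes", that="Do you like movies")` follows the more specific <that> branch, a bare `ask(root, "yes")` falls back to the default, and `ask(root, "nice weather", topic="cars")` is caught by the topic-dependent wildcard.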
Considering the vast size of the set of things people could say that are grammatically correct or semantically meaningful, the number of things people actually do say is surprisingly small. Steven Pinker, in his book How the Mind Works wrote, "Say you have ten choices for the first word to begin a sentence, ten choices for the second word (yielding 100 two-word beginnings), ten choices for the third word (yielding a thousand three-word beginnings), and so on. (Ten is in fact the approximate geometric mean of the number of word choices available at each point in assembling a grammatical and sensible sentence). A little arithmetic shows that the number of sentences of 20 words or less (not an unusual length) is about 10^20."
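The "little arithmetic" in the quotation is easy to check, reading the final figure as 10 to the 20th power. The short Python sketch below adds up the sentence counts for every length from 1 to 20 words under the quote's own assumption of ten choices per position.

```python
# Sentences of 1 to 20 words, with ten choices at every word position.
total = sum(10 ** length for length in range(1, 21))

# The same sum in closed form: a geometric series.
closed_form = (10 ** 21 - 10) // 9

print(total == closed_form)
print(f"{total:.2e}")
```

The 20-word sentences alone account for 10^20 of the total, and all the shorter lengths together add only about 11 percent more.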
Fortunately for chat robot programmers, Pinker's calculations are way off. Our experiments with A.L.I.C.E. indicate that the number of choices for the "first word" is more than ten, but it is only about two thousand. Specifically, about 2000 words cover 95% of all the first words input to A.L.I.C.E. The number of choices for the second word is only about two. To be sure, there are some first words ("I" and "You" for example) that have many possible second words, but the overall average is just under two words. The average branching factor decreases with each successive word.
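The kind of measurement behind these numbers can be sketched in Python: count, for every sentence beginning seen in a log, how many different words follow it, then average by word position. The function name is hypothetical and the six-sentence corpus is invented purely for illustration; it is not A.L.I.C.E. data and says nothing about the real figures.

```python
from collections import defaultdict

def branching_by_depth(sentences):
    """Average number of distinct next words per prefix, for each word position."""
    children = defaultdict(set)            # prefix tuple -> set of next words
    for sentence in sentences:
        words = tuple(sentence.upper().split())
        for i, word in enumerate(words):
            children[words[:i]].add(word)
    totals = defaultdict(list)             # prefix length -> counts of next words
    for prefix, nexts in children.items():
        totals[len(prefix)].append(len(nexts))
    return {depth + 1: sum(c) / len(c) for depth, c in sorted(totals.items())}

# Toy corpus, invented for this example only.
corpus = [
    "what is your name",
    "what is your favorite movie",
    "what is that",
    "who is socrates",
    "who are you",
    "you are funny",
]

factors = branching_by_depth(corpus)
```

In this toy corpus there are three distinct first words, and each first word is followed by about 1.3 distinct second words on average. A corpus this small is too noisy to show a trend; the claim in the text rests on the much larger A.L.I.C.E. logs.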
We have plotted some beautiful images of the A.L.I.C.E. brain contents represented by this graph (http://alice.sunlitsurf.com/documentation/gallery/).
More than just elegant pictures of the A.L.I.C.E. brain, these spiral images outline a territory of language that has been effectively "conquered" by A.L.I.C.E. and AIML. No other theory of natural language processing can better explain or reproduce the results within our territory. You don't need a complex theory of learning, neural nets, or cognitive models to explain how to chat within the limits of A.L.I.C.E.'s 25,000 categories. Our stimulus-response model is as good a theory as any other for these cases, and certainly the simplest. If there is any room left for "higher" natural language theories, it lies outside the map of the A.L.I.C.E. brain. Academics are fond of concocting riddles and linguistic paradoxes that supposedly show how difficult the natural language problem is. "John saw the mountains flying over Zurich" or "Fruit flies like a banana" reveal the ambiguity of language and the limits of an A.L.I.C.E.-style approach (though not these particular examples, of course, A.L.I.C.E. already knows about them).
In the years to come we will only advance the frontier further. The basic outline of the spiral graph may look much the same, for we have found all of the "big trees" from "A *" to "YOUR *". These trees may become bigger, but unless language itself changes we won't find any more big trees (except of course in foreign languages). The work of those seeking to explain natural language in terms of something more complex than stimulus response will take place beyond our frontier, increasingly in the hinterlands occupied by only the rarest forms of language. Our territory of language already contains the highest population of sentences that people use. Expanding the borders even more we will continue to absorb the stragglers outside, until the very last human critic cannot think of one sentence to "fool" A.L.I.C.E.
[Continue to part 2 of the interview.]