## Notes to Artificial Intelligence

1. The pair of parentheticals here are indispensable, and worth noting, since some AI researchers and/or engineers will surely not see themselves as striving to build animals and/or persons. Nonetheless, if they are operating under any of the orthodox accounts (some of which are explored below) of what artifacts AI research and engineering is to produce, the bottom line is that the artifacts that are intended to be built are accurately said to be artificial correlates of the only non-artificial intelligent beings the human race has been able to locate so far: viz., animals of the non-human variety, and us. It’s true, however, that some aspire to build artificial creatures that greatly exceed the cognitive powers of what nature has supplied; we discuss this issue separately, below.

2. Alas, none of the original attendees are still with us.

3. LT was specifically designed so as to be able to prove theorems from Russell and Whitehead’s Principia Mathematica. Upon learning of LT’s accomplishments, Russell was apparently delighted. Viewed from today, the theorems in question seem stunningly simple. For example, LT proved the law of contraposition in the propositional calculus (from $$p \rightarrow q$$ one can infer $$\lnot q \rightarrow \lnot p$$). Contemporary counterparts to LT include powerful theorem provers like Vampire (Voronkov 1995). The unprecedented LT, combined with the logicist leanings of attendee McCarthy (which would never wane), serves to mark the dawn of modern AI as one dominated by logic, which makes the current state of AI, dominated as it is by non-logicist formalisms and techniques (as we discuss below), quite interesting historically.

4. Many, many automated theorem provers (ATPs) are available to study, and in many cases obtain for experimentation. When it comes to sheer performance, Vampire is revered, but those with a background and/or interest in philosophy, and some background in logic, are perhaps best served by study of and experimentation with philosopher John Pollock’s Oscar system, discussed below. In addition, the wonderfully readable proofs generated by the ATP Prover9 make it in our estimation worthy of study and experimentation; it’s available online, along with the model finder Mace4. But however powerful these ATPs may be when compared to LT, and to each other in the CADE ATP System Competition, one of the remarkable things about the state of automated theorem proving at present is that many logic problems that are routinely solved by the best undergraduates in logic courses [e.g., in elementary axiomatic set theory; witness the machine-resistant problems in the venerable (Suppes 1972)] can’t be solved by these ATPs. This can be verified by simply inspecting some of the problems on which ATPs, today, falter. It’s perhaps not uninteresting that today, Nobelist-in-economics Herb Simon is known in economics for establishing the foundation for behavioral economics (roughly, a form of economics sensitive to the limitations of human reasoning and decision-making), but the stunning work of his that inaugurated automated theorem-proving is generally completely unknown to economists.

5. Sci-fi cinema, in point of fact, is filled with variations and shades of TT. E.g., the very recent Ex Machina is a narrative that rotates around a version of TT. And if we turn back the clock to before A.I., we can take note of Blade Runner, in which the “Voight-Kampff” test (taken from Dick’s (1968) seminal Do Androids Dream of Electric Sheep?) is none other than a version of TT. Of course, given that (as we soon point out) Turing echoed Descartes, perhaps it’s more accurate to say that such films carry the shadow of the latter, not the former.

6. It’s interesting to note that in playing chess by operating as a computer himself, Turing behaved in a fashion that accords perfectly with how he conceived, and mathematized, a computer: see (Turing 1936), wherein Turing machines are introduced as a formalization of the concept of a computist, a human carrying out simple calculation. By the way, despite speaking, as we have noted just above in the main text, of so-called child machines, Turing’s own work appears to have involved nothing of the sort. In some remarkable work that perhaps all students of logic and AI/computer science should study, Turing spent time, very early on (von Neumann alone seems to have preceded Turing in this regard, in stunningly original work of his own: Goldstine & von Neumann 1947), systematically considering how one could go about formally verifying a computer program – but the program (which computes the factorial function) is entirely classical, and Turing apparently never spent any time investigating the nature and verification of so-called learning machines. See (Morris & Jones 1984). (We are indebted to Jack Copeland for information conveyed by personal communication.) This may be as good a place as any to report that it would not be at all unreasonable (or perhaps, put with maximum circumspection, not all that unreasonable) to maintain that Leibniz unto himself at the very least brought AI extremely close to reality, from even the engineering point of view. There is in fact precious little that Leibniz was not in command of in this regard, since e.g. he invented the binary number system, came quite close to building a computing device with universal (= any Turing-computable function within its reach) capability, and had by any metric a clear conception of at least modern-day logicist AI. There is also the now-confirmed (see Lenzen 2004) fact that Leibniz had Boolean logic 1.5 centuries before Boole, and in addition had a large part of modern modal logic.

7. One of the interesting things about Descartes’ position is that he seems to anticipate a distinct mode of reasoning identified by contemporary psychologists and cognitive scientists: so-called System 2 reasoning. The hallmark of System 2 reasoning is that it is efficaciously applicable to diverse domains, presumably by understanding underlying structure at a very deep level. System 1 cognition, on the other hand, is chained inflexibly to concrete situations. Stanovich and West (2000) provide an excellent treatment of System 1 and 2 cognition, and related matters (such as that the symbol-driven marketplace of the modern civilized world appears to place a premium on System 2 cognition). It’s by the way accurate to say that today’s behavioral economics is based on the attempt to systematize System 2 cognition in humans, and then employ that systematization in economic modeling, explanation, forecasting, and so on. For a readable, engaging introduction to behavioral economics, see Nobelist Kahneman’s (2013).

8. Actually, Descartes proposed a test that is much more demanding than TT, but we don’t explain and defend this herein. In a nutshell, if you read the passage very carefully, you’ll see that Descartes’ test is passed only if the computer has the capacity to answer arbitrary questions. A machine which has a set of stored chunks of text that happen to perfectly fit the queries given it during a Turing test would not pass Descartes’ test – even though it would pass Turing’s.

9. Those who either watched the landmark competition, or read about what happened, may be a bit surprised to see that we use the adjective ‘nail-biting,’ since in the end Watson won handily. The key phrase is ‘in the end.’ For during the competition, when it was very close between man and machine, Watson managed, quite by luck, to draw what is called in Jeopardy! a ‘Daily Double.’ This allowed Watson to secure, on this one question, a decisive amount of money. Unlike chess, Jeopardy! has an element of chance built into the game.

10. Which isn’t to say that there are no proposals for what natural languages fundamentally are, formally speaking: Montague, e.g., famously proposed that natural language is at its heart formal language of a sort with which logicians are comfortable; see e.g. (Montague 1974), in which he declares that it is “possible to comprehend the syntax and semantics of [natural and artificial languages] within a single natural and mathematically precise theory” (p. 222).

11. Actually, the computational complexity of both Chess and Go is EXPTIME within the Polynomial Hierarchy. This implies two things, immediately: that in a clear (and formal) sense Go is no harder than Chess, and that both are Turing-solvable games. It’s obvious that humans routinely tackle and succeed on problems that are in the space of Turing-unsolvable problems (a first course in axiomatic set theory provides examples), and on such problems no AI system excels. Jeopardy! presents the logician with a different “assessment” challenge, because e.g. the level of bets made in this game by Player A may well be dictated in part by A’s beliefs about the states of mind of Players B and C (e.g. how nervous they are). On the other hand, the straight QA side of Jeopardy! is rather easily proved to be NP-complete; Bringsjord can be emailed for the proof.

12. For reactions from members of the AI community working in techniques close to those used by AlphaGo, see Yann LeCun’s reaction and Langford’s reaction. For a more detailed commentary on what AlphaGo’s victory means for AI, see Brundage’s commentary.

13. It would probably be more accurate to say here ‘inductive logic’ rather than ‘probability theory’. Those working in AI will almost invariably be familiar with the latter, and yet (even if they are intimately familiar with logicist AI) not be familiar with the former. The received view of inductive logic is that it subsumes probability theory (Fitelson 2005). As of yet, we have no explanation for why even in quarters of AI dominated by logic, the logic is almost never inductive logic.

14. The cover of the most recent edition references Bayes and Aristotle.

15. This theory is by the way yet another indicator of the ancient roots of AI, as suggested pictorially by AIMA covers, discussed above: AIMA1e’s cover, as we noted in passing earlier, offers a glimpse of Lewis Carroll’s notation for Aristotle’s theory.

16. What do we mean by the “expected” utility of an agent? For readers unfamiliar with probability theory, a quick intuitive explanation follows. Given an agent represented by $$f$$ working in an environment $$E$$, the performance measure $$U$$ will assign a concrete value $$u_k$$ over one specific lifetime $$l_k$$ of the agent. This value might not be representative of how good the agent is: The environment or the agent or both could be nondeterministic, and different runs or lifetimes could produce different values. One can think of the expected utility as nothing but the average of the utility over a large number of lifetimes of the agent $$\left\{l_1, l_2, \ldots, l_m\right\}$$.

$V(f,E,U) = \frac{\left(u_1 + u_2 + \ldots + u_m \right)}{m}$
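The averaging described in this note can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: the coin-guessing environment, the `always_heads` agent, and the `fraction_correct` performance measure are hypothetical stand-ins for $$E$$, $$f$$, and $$U$$, chosen solely because they make the nondeterminism easy to see.

```python
import random

def expected_utility(agent, environment, utility, m, seed=0):
    """Estimate V(f, E, U) as the mean of u_1..u_m over m simulated lifetimes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        lifetime = environment(agent, rng)   # one lifetime l_k
        total += utility(lifetime)           # its utility u_k
    return total / m

# Hypothetical agent f: ignores its percept and always guesses heads.
def always_heads(percept):
    return "H"

# Hypothetical nondeterministic environment E: ten fair coin flips per lifetime.
def coin_environment(agent, rng, steps=10):
    lifetime = []
    for _ in range(steps):
        outcome = rng.choice("HT")
        guess = agent(None)
        lifetime.append((guess, outcome))
    return lifetime

# Hypothetical performance measure U: fraction of correct guesses in a lifetime.
def fraction_correct(lifetime):
    return sum(guess == outcome for guess, outcome in lifetime) / len(lifetime)

single = expected_utility(always_heads, coin_environment, fraction_correct, m=1)
estimate = expected_utility(always_heads, coin_environment, fraction_correct, m=10_000)
print(single, estimate)
```

A single lifetime can score anywhere from 0 to 1, which is why one value $$u_k$$ need not be representative; the average over many lifetimes settles near one half for this toy agent.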

17. There are some obvious objections that come to mind once Russell’s position is understood. For example, bounded optimality seems to be at odds with carrying out research now that lays a foundation for future work – work that will inevitably be based on machines that are much, much more powerful than the ones we have today. This is an objection Russell anticipates; it leads him to present an account of asymptotic bounded optimality. Informally put, this account says that a program is “along the right lines” iff with speedup (or more space) its worst-case performance is as good as any other program in all environments. Details are available in (Russell 1997).

18. In late 2015 and early 2016, the Allen Institute for Artificial Intelligence ran a competition titled “Is your model smarter than an 8th grader” soliciting submissions that could beat the state-of-the-art system in answering a standardized grade 8 science exam.

19. Of interest to some may be the fact that uncertainty cashed out non-probabilistically is nowhere to be found in Part IV. Rigorous approaches to reasoning with, about, and over uncertainty in the absence of probability theory include that of (Chisholm 1966, 1977).

20. Some alert readers will note that there is an apparent clash of conceptualizations between Russell & Norvig on the one hand, and Charniak & McDermott on the other. In terms of the four-bin classification of approaches to AI discussed above (and provided, as noted, by R&N themselves), C&M seem to be endorsing the think/human and act/human views of AI, while R&N, whose progression of increasingly smart agents we’ve just presented, strongly endorse the ideal/act view. Actually, the animal- and person-centric language of C&M can be viewed as a very convenient shortcut, because the creatures they have in mind as targets for AI have powers that vault them high up the intelligent-agent progression laid down by R&N. This gives rise to the question: What would the difference be, if any, between a person, and a human-like rational agent? Certainly a cardinal difference, at least for philosophers, would be that persons by definition are presumably conscious (and indeed self-conscious), yet we don’t find such properties to be entailed by rationality, nor by any of the behaviors central to the R&N intelligent-agent continuum.

21. We assume that any perceived narrowness in the phrase ‘learning by reading’ evaporates once one considers that e.g. learning by being told something via an utterance is reducible to the former.

22. Of all the agencies within the United States, the Information Processing Technology Office (IPTO) at the Defense Advanced Research Projects Agency (DARPA) occupies a unique position. IPTO has supported AI since its inception, and at present, it continues to guide AI forward through visionary programs. It is therefore interesting to note that, at a 2003 celebration of IPTO and its (at that time) 40 years of steadfast sponsorship of research and development in the area of intelligent systems, a number of scientists and engineers whose careers go back to the dawn of AI in the 1950s complained that contemporary machine learning has come to be identified with function-based learning. They pointed out that most clever adults predominantly learn by reading, and called for an attack on this problem in the future. In a sign that the concerns voiced here have gained some traction, there was a 2007 American Association for Artificial Intelligence Spring Symposium on Machine Reading.

23. A practical manifestation of the above discussed deficiency in machines can be seen in how difficult it is for non-practitioners of AI to teach machines to do a certain task or improve already built systems. For example, consider GloVe (Pennington et al. 2014), a function-learning framework that maps words in a natural language such as English to vectors in $$\mathcal{R}^n$$. Mappings like this have a variety of uses in building natural language processing systems. Let $$W$$ be the set of all English (or any other natural language) words we are interested in. Then, given just a large volume of natural text $$t\in W^k$$ (e.g. the English Wikipedia), the system derives a function $$\mathbf{g}: W \rightarrow \mathcal{R}^n$$. Among other uses, one quite astonishing task that can be performed using such vector representations is analogical reasoning. An analogical pair such as $$man$$:$$woman$$::$$king$$:$$queen$$ is said to hold when $$\big(\mathbf{g}(man)-\mathbf{g}(woman)\big) \approx\big(\mathbf{g}(king)-\mathbf{g}(queen)\big)$$. Just by looking at the text from Wikipedia (or other similar sources), the system can derive a function $$\mathbf{g}$$ such that the following relations hold:

$$man$$:$$woman$$ :: $$king$$:$$queen$$
$$man$$:$$woman$$ :: $$uncle$$:$$aunt$$
$$Anaheim$$:$$92804$$ :: $$Honolulu$$:$$96817$$ (the relationship here is city:zip code)
$$strong$$:$$stronger$$ :: $$dark$$:$$darker$$

While GloVe can derive such insightful representations, it is nowhere near complete or accurate; we do not expect it to be. For instance, the reader can easily find a pair of words that ought to be similar but the system thinks otherwise. Using the largest out-of-the-box models shipped with the GloVe system, the reader can find one such pair to be: $$\langle paris, Paris\rangle$$, to which the system assigns quite dissimilar vectors (the exact pair is not important here). If a human commits such a mistake, it is easy to fix the mistake: we just inform the person of the error. Right now, fixing the above system for such mis-learnt pairs of words involves a lengthy retraining period over a much larger volume of text that hopefully captures the relationship we need. While a learning system can incorporate a single example and improve upon it, the example presented has to be in a rigid format. This is usually done by the system’s builders and not by end lay users. The relatively new subdiscipline of machine teaching aims to rectify this by building machines that can be taught in a more human-like fashion by lay people. See also this blog post from Microsoft Research. (For an overview of how systems like GloVe work, see this course on neural networks in natural language processing.)
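The vector-difference test for analogies described in this note can be sketched in a few lines. The vectors below are hypothetical, hand-built three-dimensional toys in which one coordinate encodes gender and another royalty; they are not GloVe output, and real learned vectors satisfy the relation only approximately.

```python
import math

# Hypothetical toy vectors standing in for g: W -> R^n (here n = 3).
g = {
    "man":   [1.0, 0.0, 0.1],
    "woman": [-1.0, 0.0, 0.1],
    "king":  [1.0, 1.0, 0.2],
    "queen": [-1.0, 1.0, 0.2],
    "uncle": [1.0, 0.0, 0.9],
    "aunt":  [-1.0, 0.0, 0.9],
}

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy_holds(a, b, c, d, vectors, tol=1e-6):
    """a:b :: c:d holds when g(a) - g(b) is (approximately) g(c) - g(d)."""
    left = sub(vectors[a], vectors[b])
    right = sub(vectors[c], vectors[d])
    return math.dist(left, right) <= tol

def complete_analogy(a, b, c, vectors):
    """Return the word d whose vector is most similar to g(c) - g(a) + g(b)."""
    target = add(sub(vectors[c], vectors[a]), vectors[b])
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy_holds("man", "woman", "king", "queen", g))
print(complete_analogy("man", "woman", "uncle", g))
```

With learned vectors the tolerance would have to be far looser than the `tol` used here, and a mis-learnt pair would show up as a large distance that no single corrective example could repair without retraining.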

24. For confirmation in the case of cognitive psychology, see (Ashcraft 1994). The field of computational cognitive modeling seeks to uncover the nature of human cognition by capturing that cognition in standard computation, and is therefore obviously intimately related to AI. For an excellent overview of computational cognitive modeling that nonetheless reveals the field’s failure to confront subjective consciousness, see (Anderson & Lebiere 2003). Ron Sun (1994, 2002) is perhaps unique among computational cognitive modelers in that he considers topics of traditional interest to philosophers.

25. Of course, once something at least fairly narrow that was formerly the province of humans is sufficiently “AI-ified,” the machine may be better at it than any human. Exhibit A: chess.

26. Sometimes AI that puts an emphasis on declarative knowledge and reasoning over that knowledge is referred to not as logic-based or logicist AI, but instead as knowledge-based AI. For example, see (Brachman and Levesque 2004). However, any and all formalisms and techniques constitutive of knowledge-based AI are fundamentally logic-based, but their underlying formal structure may be concealed in the interests of making it easier for practitioners without extensive training in mathematical and philosophical logic to grasp these formalisms and deploy these techniques.

27. I point out for cognoscenti that I here expand the traditional concept of a logical system as deployed, e.g., in Lindström’s Theorems, which are elegantly presented in (Ebbinghaus et al. 1984).

28. There is a confession to be made about work that has been carried out so far in AI. Though there has been work in multi-modal logics (for example, logics with necessity, possibility, knowledge, belief, etc.), there has not been much bona fide work in formally addressing some of the thorniest of philosophical issues. For instance, how does one distinguish between abstract and fictional entities such as the square circle, a golden mountain, colorless green ideas, etc.? In standard extensional logics, all these objects would be the same, as they have the same empty extension. Intensional logics are needed to model such concepts with fidelity. For a formal and seminal discussion of such issues in intensional logic and intentionality, see (Zalta 1988). Here, Zalta has a theory of abstract objects that he uses to interpret, for example, fictional characters, stories, etc. There does exist one AI system which tries to deal with such issues. The SNePS system described in (Rapaport & Shapiro 1999) can read stories in natural language and distinguish between fictional objects and real objects, and between facts and fictions about real entities.

29. Please understand that AI has always been very much at the mercy of the vicissitudes of funding provided to researchers in the field by the United States Department of Defense (DoD). (The inaugural 1956 workshop was funded by DARPA, and many representatives from this organization attended AI@50.) It’s this fundamental fact that causally contributed to the temporary hibernation of AI carried out on the basis of artificial neural networks: When Minsky and Papert (1969) bemoaned the limitations of neural networks, it was the funding agencies that held back money for research based upon them. Since the late 1950s it’s safe to say that the DoD has sponsored the development of many logics intended to advance AI and lead to helpful applications. It has occurred to many in the DoD that this sponsorship has led to a plethora of logics between which no translation can occur. In short, the situation is a mess, and now real money is being spent to try to fix it, through standardization and machine translation (between logical, not natural, languages). It may be worth noting here, as well, that the 1956 conference was sponsored by DARPA in response to a proposal claiming that “a large part of human thought” is based on declarative knowledge, and logic-based reasoning over that knowledge. See Ron Brachman’s “A Large Part of Human Thought,” July 13, 2006, Dartmouth College, at AI@50.

30. The broader category, of which neural nets may soon enough be just a small part, is that of statistical learning algorithms. Chapter 20 of AIMA2/3e provides a very nice discussion of this category. It’s important to realize that artificial neural networks are just that: artificial. They don’t correspond to what happens in real human brains (Reeke & Edelman 1988).

31. In point of fact, the global economy, and in particular and especially that of the U.S., has exploded in the area of AI. There can e.g. be little question that Google is fundamentally an AI company. And Apple Inc., measured by market capitalization the largest company on Earth, has not only Siri, but an extensive patent portfolio that includes many AI agents and systems.

32. Were you to have begun formal coursework in AI in 1985, your textbook would likely have been Eugene Charniak’s comprehensive-at-the-time Introduction to Artificial Intelligence (Charniak & McDermott 1985). This book gives a strikingly unified presentation of AI – as of the early 1980s. This unification is achieved via first-order logic (FOL), which runs throughout the book and binds things together. For example: In the chapter on computer vision (3), everyday objects like bowling balls are represented in FOL. In the chapter on parsing language (4), the meanings of words, phrases, and sentences are identified with corresponding formulae in FOL (e.g., they reduce “the red block” to FOL on page 229). In Chapter 6, “Logic and Deduction”, everything revolves around FOL and proofs therein (with an advanced section on nonmonotonic reasoning couched in FOL as well). And Chapter 8 is devoted to abduction and uncertainty, where once again FOL, not probability theory, is the foundation. It’s clear that FOL renders (Charniak & McDermott 1985) esemplastic. Today, due to the explosion of content in AI, this kind of unification is no longer possible.

Though there is no need to get carried away in trying to quantify the explosion of AI content, it isn’t hard to begin to do so for any skeptics. (Charniak & McDermott 1985) has 710 pages. The first edition of AIMA, published ten years later in 1995, has 932 pages, each with about 20% more words per page than C&M’s book. The second edition of AIMA weighs in at a backpack-straining 1023 pages, with new chapters on probabilistic language processing, and uncertain temporal reasoning. The third edition has 1109 pages with the authors estimating that 20% of the content is new.

The explosion of AI content can also be seen topically. C&M cover nine highest-level topics, each in some way tied firmly to FOL implemented in (a dialect of) the programming language Lisp, and each (with the exception of Deduction, whose additional space testifies further to the centrality of FOL) covered in one chapter:

1. FOL for Internal Representation
2. Vision
3. Language Parsing
4. Language Understanding
5. Search Techniques
6. Deduction (two chapters)
7. Abduction and Expert Systems
8. Planning
9. Learning

In AIMA the expansion is obvious. For example, Search is given three full chapters, and Learning is given four chapters. AIMA also includes coverage of topics not present in C&M’s book; one example is robotics, which is given its own chapter in AIMA. In the second and third editions, as mentioned, there are two new chapters: one on constraint satisfaction that constitutes a lead-in to logic, and one on uncertain temporal reasoning that covers hidden Markov models, Kalman filters, and dynamic Bayesian networks. A lot of other additional material appears in new sections introduced into chapters seen in the first edition. For example, the second edition includes coverage of propositional logic as a bona fide framework for building significant intelligent agents. In the first edition, such logic is introduced mainly to facilitate the reader’s understanding of full FOL.

33. In no way do we mean to suggest that AI research is now exclusively hybrid. A recent treatment of the symbolic approach makes this clear: (Brachman & Levesque 2004). B&L explain that their book is based on what they say is a “daring” hypothesis, viz., that a top-down approach which ignores neurological details in favor of abstract models of cognition pays great dividends. In addition, a recent argument against a connectionist approach to simulating human literary creativity can be found in (Bringsjord & Ferrucci 2000).

34. Sometimes casual students of AI, logic, and philosophy come to believe that uncertainty has been the phenomenon causing a departure from logicist/symbolic approaches. It’s important, especially given the nature of the present venue, to realize that the topic of uncertainty has long been a staple in logic, logicist AI, and epistemology. In fact, alert readers will have noted that (Charniak and McDermott 1985) contains a chapter devoted to the topic.

The uncertainty challenge can be expressed by considering difficulties that arise when an attempt is made to capture what philosophers often call practical reasoning: Suppose that we would like to take a bit of a break from working on this entry for SEP, and would specifically like to fetch today’s mail from my mailbox. What does it take for us to accomplish this goal? Well, in order to get the mail, and here we zero in on Bringsjord’s point of view to ease exposition, I will need to exit the house in which I live, walk approximately half way down my driveway, cut across grass under my three Chinese elms, reach my mailbox, open it, reach in, and so on; you get the idea; I omit the remainder. Suppose that this plan consists in the successive execution of actions, starting from some initial state $$s_1$$ (my being in my study, before deciding to take the postal break), a constant in FOL. Suppose that a function $$does$$ can be applied to a constant $$a_i$$ (which denotes some action) in a situation $$s_j$$ to produce a new situation $$does(a_i, s_j)$$. Given this scheme, we can think of what I plan to do as performing a sequence of actions that will result in a situation in which I have today’s mail:

$HaveMail(does(a_n, does(a_{n-1}, \ldots does(a_1, s_1 \ldots))))$

Given this, if I were a robot idling in my study, how would I retrieve a plan that would allow me to reach the goal of having mail? To ease exposition, let’s say that I would simply attempt to prove the following formula. If provable, the witnesses are the actions I need to successively perform.

$\exists x_1, \ldots, x_n (HaveMail(does(x_n, does(x_{n-1}, \ldots does(x_1, s_1 \ldots)))))$

Unfortunately, this approach will not work in many, if not most, cases: Yesterday I went to retrieve my mail, and found to my surprise that my usual route to my mailbox, which runs beneath my beloved elms, was cordoned off, because one of these massive trees was being cut down by a crew – without prior authorization from me. My plan was shot; I needed a new one – one with a rather elaborate detour, given the topography of my land. (I of course also needed, on the spot, a plan to deal with the fact that this tree, my tree, was unaccountably targeted for death.) I had made it to the point just before passing beneath the elms, so I now needed a sequence of actions that, if performed from this situation, would eventuate in my having my mail. But this complication is one from among an infinite class: lots of other things could have derailed my original plan. The bottom line is that the world is uncertain, and using “straight” logic to deduce plans in advance will therefore work at best in only a few cases.

Notice that we say straight logic. The postal example we have given is a counter-example to only one approach (situation calculus and Green’s Method) within one logical system (FOL). It hardly follows that, in general, a logicist approach can’t deal with the uncertainty challenge.
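The postal example can be made concrete with a small sketch. It is not Green’s Method: a breadth-first search over a hand-written transition table stands in for the proof search, and the returned action list plays the role of the witnesses $$x_1, \ldots, x_n$$. The situation names, action names, and the `WORLD` table are all hypothetical, introduced only for illustration.

```python
from collections import deque

# Hypothetical map of the postal example: situation -> {action: next situation}.
WORLD = {
    "study":     {"exit_house": "driveway"},
    "driveway":  {"cut_under_elms": "mailbox", "take_detour": "back_lawn"},
    "back_lawn": {"walk_around": "mailbox"},
    "mailbox":   {"open_and_reach_in": "have_mail"},
    "have_mail": {},
}

def does(action, situation, world):
    """Situation that results from performing action in situation, or None if impossible."""
    return world.get(situation, {}).get(action)

def have_mail(situation):
    return situation == "have_mail"

def find_plan(start, world):
    """Search for actions x_1..x_n such that HaveMail holds after performing them from start."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        situation, plan = frontier.popleft()
        if have_mail(situation):
            return plan
        for action, nxt in world[situation].items():
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None

def execute(plan, start, world):
    """Carry out the plan step by step; report where execution ends and whether it succeeded."""
    situation = start
    for action in plan:
        nxt = does(action, situation, world)
        if nxt is None:
            return situation, False
        situation = nxt
    return situation, have_mail(situation)

plan = find_plan("study", WORLD)

# The route under the elms is cordoned off: the plan deduced in advance now fails.
blocked = {s: dict(actions) for s, actions in WORLD.items()}
del blocked["driveway"]["cut_under_elms"]
stuck_at, succeeded = execute(plan, "study", blocked)

# A new plan has to be found from the situation actually reached.
new_plan = find_plan(stuck_at, blocked)
print(plan, stuck_at, succeeded, new_plan)
```

The advance plan is correct relative to the table it was derived from and useless once the world departs from that table, which is the point of the example; replanning from the situation actually reached is one simple response, and nothing in the sketch rules out richer logicist treatments.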

We would be remiss if we did not point out that the uncertainty challenge is a core problem in epistemology. It has long been realized that an adequate theory of knowledge must take account of the fact that, while some of what we know may be self-evident, and while some of what we know may be derived deductively from the self-evident, most of what we know is far from certain. Moreover, there are well-known arguments in the philosophical literature purporting to show that much of what we know cannot be based on the result of inductive reasoning over that which is certain. (A classic treatment of these issues can be found in Roderick Chisholm’s (1977) Theory of Knowledge.) The connection between this literature, and the uncertainty challenge in AI, is easy to see: The AI researcher is concerned with modeling and computationally simulating (if not outright replicating) intelligent behavior in the face of uncertainty, and the epistemologist seeks a theory of how our intelligent behavior in the face of uncertainty can be analyzed.

35. Malle et al. (2015) have found that humans judge robot and human actors differently when they participate in hypothetical moral dilemmas.

36. We express deep gratitude to the Office of Naval Research for grant support that enabled and enables us to analyze the landscape of work in robot/machine ethics, from both a philosophical and a technical perspective.

37. A deeper understanding of the distinction can be obtained by elaborating upon the route we suggest, in connection with a specific paradox. Take The Liar Paradox, for instance. An economical presentation of this paradox, offered in each cycle of introductory philosophy and logic classes at colleges and universities around the globe, is often given by writing down or displaying some such sentence as ‘This sentence is false.’ (L), whereupon the instructor deduces a contradiction (e.g., L is true iff L is not true). One then passes immediately to Philosophical AI if one configures the following challenge: Write a computer program $$P_1$$ that maps one or two English sentences to their underlying meaning, expressed in a rigorous representation scheme that allows deduction. In addition, write a computer program $$P_2$$ that, given formulae (composing set $$\Phi$$) that represent information conveyed in English, searches for a proof of contradiction (= searches for a proof confirming $$\Phi \vdash \phi \wedge \neg \phi$$). Show by a working demonstration on simple examples that both $$P_1$$ and $$P_2$$ work, and then show that running first $$P_1$$ and then $$P_2$$ on the pair ‘The sentence S2 is false’ (S1) and ‘The sentence S1 is true’ (S2) does not result in a proof of a contradiction.
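To see why the challenge is not trivial, consider a deliberately naive stand-in for $$P_2$$. The sketch below assumes a propositional encoding of our own devising (atom `s1` for “S1 is true”, atom `s2` for “S2 is true”), which is not a representation scheme proposed in this note, and it replaces proof search with a brute-force truth-table check, relying on the fact that in classical propositional logic $$\Phi \vdash \phi \wedge \neg \phi$$ exactly when $$\Phi$$ has no model.

```python
from itertools import product

def iff(a, b):
    return a == b

# Hypothetical naive encoding of the pair (an assumption for illustration only).
PHI = [
    lambda s1, s2: iff(s1, not s2),  # S1 says: the sentence S2 is false
    lambda s1, s2: iff(s2, s1),      # S2 says: the sentence S1 is true
]

def models(formulas, n_atoms):
    """Every truth assignment that satisfies all of the formulas."""
    return [values for values in product([True, False], repeat=n_atoms)
            if all(f(*values) for f in formulas)]

def proves_contradiction(formulas, n_atoms):
    """Classically, a finite propositional set proves a contradiction iff it has no model."""
    return not models(formulas, n_atoms)

print(proves_contradiction(PHI, 2))      # the naive encoding of the pair
print(proves_contradiction(PHI[:1], 2))  # S1's content taken alone
```

Under this naive encoding the pair is contradictory while either sentence alone is not, so a $$P_1$$ that meets the challenge has to map the sentences to something more discriminating than these two biconditionals.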

38. The literature on hypercomputation has exploded recently. As I have mentioned, one of the earliest sorts of hypercomputational device is a so-called trial-and-error machine (Putnam 1965; Gold 1965), but much has happened since then. Volume 317 (2004) of Theoretical Computer Science is devoted entirely to hypercomputation. Before this special issue, TCS also featured an interesting kind of hypercomputational machine, so-called analog chaotic neural networks (Siegelmann and Sontag 1994). These machines, and others, are discussed in (Siegelmann 1999). For more on hypercomputation, with hooks to philosophy, see: (Copeland 1998; Bringsjord 1998, 2002).

39. Searle has given various more general forms of the argument. For example, he summarizes the argument on page 39 of (Searle 1984) as one in which from

2. Syntax is not sufficient for semantics.

3. Computer programs are entirely defined by their formal, or syntactical, structure.

4. Minds have mental contents; specifically, they have semantic contents.

he infers: No computer program by itself is sufficient to give a system a mind. Programs, in short, are not minds, and they are not by themselves sufficient for having minds.

40. The dialectic appeared in 1996, volume 2 of Psyche, available online.

41. An analogous version of this argument for the Semantic Web, employing Metcalfe’s law, has been given by Hendler and Golbeck (2008).

42. It seems reasonable to say, about at least most of these predictions, that they presuppose a direct connection between the storage capacity and processing speed of computers, and human-level intelligence. Specifically, the assumption seems to be that if computers process information at a certain speed, and can store it in sufficiently large quantities, human-level mentation will be enabled. This is actually a remarkable assumption, when you think about it. Standard Turing machines as defined in the textbooks (e.g., as they are defined in Lewis and Papadimitriou 1981) have arbitrarily large storage capacity, and perform at arbitrarily fast speeds (each step can be assumed to take any finite amount of time). And yet programming these Turing machines to accomplish particular tasks can be fiendishly difficult. The truly challenging part of building a computer to perform at the level of a human is devising the representations and algorithms to enable it to do so.
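
To give a small taste of the point about programming Turing machines, here is a minimal sketch of a single-tape machine simulator together with a transition table for a very modest task: adding one to a binary numeral. The simulator, the state names `scan`, `carry` and `halt`, the blank symbol `_`, and the step budget are all assumptions made for illustration and are not taken from Lewis and Papadimitriou. Tape length is unbounded in the sketch (cells are created on demand), yet even this task needs six hand-written transitions.

```python
def run_turing_machine(rules, tape, state, blank="_", max_steps=10_000):
    """Run a single-tape machine until it reaches the state 'halt'.

    rules maps (state, symbol) to (symbol_to_write, move, next_state),
    where move is 'L', 'R' or 'N'. The head starts on cell 0.
    """
    cells = dict(enumerate(tape))  # sparse tape: grows in either direction
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted")
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)

# Binary increment, most significant bit first.
INCREMENT = {
    ("scan", "0"): ("0", "R", "scan"),    # walk right to the end of the numeral
    ("scan", "1"): ("1", "R", "scan"),
    ("scan", "_"): ("_", "L", "carry"),   # step back onto the last bit
    ("carry", "1"): ("0", "L", "carry"),  # 1 + 1 = 0, carry continues leftward
    ("carry", "0"): ("1", "N", "halt"),   # absorb the carry and stop
    ("carry", "_"): ("1", "N", "halt"),   # numeral was all 1s: add a new leading bit
}

print(run_turing_machine(INCREMENT, "1011", "scan"))  # 1100
print(run_turing_machine(INCREMENT, "111", "scan"))   # 1000
```

Nothing in the table is limited by storage or speed; the effort lies entirely in working out the representation (bit order, blank marker) and the procedure (scan right, then propagate the carry).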

43. Joy’s paper is available online. Also, rest assured that you can type “Why The Future Doesn’t Need Us Bill Joy” into any passable search engine.

44. The pattern runs as follows: If science policy allows science and engineering in area $$X$$ to continue, then it’s possible that state of affairs $$P$$ will result; if $$P$$ results, then disastrous state of affairs $$Q$$ will possibly ensue; therefore we ought not to allow $$X$$. Of course, this is a deductively invalid inference schema. If the schema were accepted, with a modicum of imagination you could prohibit any science and engineering effort whatsoever. You would simply begin by enlisting the help of a creative writer to dream up an imaginative but dangerous state of affairs $$P$$ that is possible given $$X$$. You would then have the writer continue the story so that disastrous consequences of $$P$$ arrive in the narrative, and lo and behold you have “established” that $$X$$ must be banned.
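
The pattern can be set out schematically. The notation is an assumption introduced here for illustration and is not drawn from Joy’s paper: $$A_X$$ abbreviates ‘science policy allows work in area $$X$$ to continue’, $$\Diamond$$ reads ‘it is possible that’, and $$\mathbf{O}$$ reads ‘it ought to be that’.

$$
\frac{A_X \rightarrow \Diamond P \qquad P \rightarrow \Diamond Q}{\mathbf{O}\,\lnot A_X}
$$

Both premises speak only of what is possible, so they can be true in a situation where allowing $$X$$ leads to nothing bad at all; read this way, nothing in them licenses the normative conclusion, which is one way of seeing why the schema is deductively invalid.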

45. Here’s the relevant quote from Joy’s paper: “I had missed Ray’s talk and the subsequent panel that Ray and John had been on, and they now picked right up where they’d left off, with Ray saying that the rate of improvement of technology was going to accelerate and that we were going to become robots or fuse with robots or something like that, and John countering that this couldn’t happen, because the robots couldn’t be conscious.”

46. A sustained discussion of the nature of AI in connection specifically with the distinction between mere animals and persons can be found in (Bringsjord 2000).

## Notes to Supplements

A1. The study of this process goes by various names such as computational learning theory, language learning or formal models of science (Osherson et al. 1986). One crucial component missing in such models is justification. Scientists are much more than hypothesis-generating machines. Each hypothesis needs to have some justification or argument that builds upon accepted knowledge or experimental evidence and proceeds via correct rules of inference. This requires a much more expressive formalism than agents divining the contents of a black box computer; some ingredients of such a formalism are discussed in our (Bringsjord et al. 2010).

A2. Though there has been some work in what the authors of this entry term a Serious Computational Science of Intelligence, very few approaches fall under this umbrella. The Universal Artificial Intelligence (UAI) model from Hutter (2005), of which AIXI is a part, comes closest. See the workshop paper for a scorecard of different related formalisms.

B1. For readers familiar with Excel, an Excel and Apple Numbers file for computing the graph can be downloaded from GitHub. Also see the Google Docs sheet.

B2. Obviously, there are other excellent textbooks that serve to introduce and, at least to some degree, canvass, AI. For example, there is the commendable trio: (Ginsberg 1993), (Nilsson 1998), and (Winston 1992). (Winston’s book is the third edition. In Nilsson’s case, this is his second intro book; the first was (Nilsson 1987).) The reader should rest assured that in each case, whether from this trio or from AIMA, the coverage is basically the same; the core topics don’t vary. In fact, Nilsson’s (1998) book, as he states in his preface, is explicitly an “agent-based” approach, and the book, like AIMA, is written as a progression from the simplest agent through the most capable. (We say a bit about this in the main text, later.) Clearly, then, our reliance on AIMA in no way makes the present SEP entry idiosyncratic. Finally, arguably the attribute most important to an entry such as the present one is “encyclopedic” coverage of AI – and AIMA delivers in this regard like no other extant text. This situation may change in the future, and if it does, the present entry would of course be updated.

O1. It must be said here that, for a while at least, OSCAR was also distinguished by being quite a fast automated theorem prover (ATP). I say ‘was’ because there has of late been a new wave of faster and faster first-order provers. Speed in machine reasoning is an exceedingly relative concept. What’s fast today is inevitably slow tomorrow; it all hinges on what the competition is doing. In OSCAR’s case, the competition now includes not just the likes of Otter (see e.g. Wos et al. 1992; and go to the Automated Deduction at Argonne page for a wonderful set of resources related to Otter), but also Vampire (Voronkov 1995). (Vampire’s core algorithm coincides with Otter’s, but increased speed can come from many sources, including how propositions are indexed and organized.) It seems to me that some of OSCAR’s speed derives from the fact that in searching for proofs OSCAR approximates some form of goal analysis as a technique for finding proofs in a natural deduction format. Goal analysis will be familiar to those philosophers who have taught natural deduction. The performance of OSCAR and other systems can be found at the TPTP site.

O2. Why is this a problem? First-order logic has issues with representing statements of the form “Person X knows that ‘Roses are red’.” or “Person Y does not believe that ‘Roses are red’.” Such statements are known as intensional statements. Statements such as “Roses are red.” or “$$3^2 + 4^2 = 5^2$$” are known as extensional statements. When we try to model intensional statements in first-order logic, we quickly run into problems (Anderson 1983; Bringsjord & Govindarajulu 2012).
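
One way to see the difficulty, offered as a simplified sketch rather than as a reconstruction of the arguments in the works cited: in an extensional setting the value of a compound depends only on the values of its parts. If ‘Person X believes that …’ were treated that way at the sentence level, it would have to be one of the four unary truth functions, and no such function can treat two true sentences differently. The function names below are hypothetical, and the propositional simplification is an assumption; the first-order version of the problem involves substitution of co-referring terms as well.

```python
from itertools import product

# Truth values of the two extensional example sentences (both are true).
roses_are_red = True
pythagorean_fact = (3**2 + 4**2 == 5**2)

def extensional_operators():
    """Yield every unary truth function as a dict from truth value to truth value."""
    for on_false, on_true in product([False, True], repeat=2):
        yield {False: on_false, True: on_true}

def can_separate(p, q):
    """Return True if some truth-functional operator B gives B(p) != B(q)."""
    return any(op[p] != op[q] for op in extensional_operators())

# Could X believe that roses are red while not believing the arithmetic fact?
print(can_separate(roses_are_red, pythagorean_fact))  # False
# Sentences with different truth values can be told apart, e.g. by the identity function.
print(can_separate(True, False))                      # True
```

Since a person can plainly believe one truth without believing another, a purely truth-functional treatment of belief cannot be right, which is the sense in which such statements are intensional.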