Notes to Associationist Theories of Thought

1. Empiricists who have wanted more than one type of learning mechanism have tended to be constructivists. The basic constructivist position is to posit a single mental process, the ability to associate ideas, and to construct new processes out of the single innate process (see Fodor 1983 for discussion).

2. Though many later associationists, such as Pavlov and the behaviorists, had only one mental process, Hume in fact also had the imagination. For discussion on how the imagination meshes with Hume’s empiricism and associationism see Fodor (2003).

3. That said, one can detect aspects of associationism in earlier writers, such as Descartes when discussing memory and Spinoza when discussing the emotions (see the entries on Descartes and on 17th and 18th Century Theories of Emotions).

4. Although Hume is generally acknowledged as laying the theoretical foundation of associationism, there is some evidence that Francis Hutcheson’s use of associations greatly influenced him. See the entry on Scottish Philosophy in the 18th Century.

5. “All our simple ideas in their first appearance are deriv’d from simple impressions, which are correspondent to them, and which they exactly represent” (T

6. This is a bit of a loose formulation. Strictly speaking, Impressions themselves don’t instantiate any associative relation; rather, the contents of the Impressions do. For example, it isn’t that one’s Impression (understood as a vehicle of thought) of chickens resembles one’s Impression of roosters; rather, it’s that the contents of those Impressions resemble one another. Presumably, all Impressions qua vehicles of thought resemble one another merely by being Impressions. What differs between Impressions is, e.g., whether the content they represent resembles other represented content. This distinction between vehicle and content is important for Hume’s overall architecture: it’s not the vehicle of the Impression that gets copied into an Idea, but rather the content of that vehicle. That said, to ease exposition the distinction between vehicles and contents is elided in the main text except where it is important to distinguish.

7. Although some contemporary associationist views still retain all three original Humean associative relations, the resemblance relation has come under the most scrutiny and is the least popular of the three. For discussions of the problem of the resemblance criterion see Field and Davey (1999) and De Houwer (2009). In the canonical Rescorla-Wagner model (Rescorla and Wagner 1972), both contiguity and resemblance are superseded by the contingency requirement.

8. A variation on classical conditioning is evaluative conditioning, where one tries to transfer the valence of the US onto the CS (see, e.g., De Houwer et al. 2001 for an overview). For instance, one might pair a favorable flavor (e.g., sugar) with a novel neutral face stimulus, in order to transfer the positive valence to the previously neutral face.

9. There are many different ways of construing the details of Pavlovian conditioning. For example, some would restrict the usage further by arguing that the US must be biologically significant, or widen the usage, as Rescorla does (see section 7). Some anti-associationists even believe that Pavlovian conditioning is real, but not predicated on associations (Mitchell et al. 2009).

10. Classical conditioning also had some consequences that were a bit unpalatable for empiricists: if all learning was to be given as forming associative bonds between USs, CSs, and responses, then all of our learning had to bottom out in some behaviors that were preprogrammed to correspond to certain stimuli: in other words, certain instinctual patterns of behavior were innately set to be elicited by certain stimuli. Even more problematically, such instinctual patterns were apt to be species specific, so not generalizable to humans.

11. Note how Thorndike does not hesitate to speak of mental states like satisfaction and dissatisfaction, unlike the most famous practitioner of operant conditioning, the radical behaviorist B.F. Skinner (see the behaviorism entry).

12. From this level of abstraction, Pavlov and Skinner were united. Here’s Garcia on Skinnerian learning:

Any stimulus applied immediately after the response which, by empirical test, would increase response production was deemed a reinforcer…The general procedures were said to be applicable to any and all reflexes, in any and all organisms. There was no need to concern ourselves with species differences, with brain differences, or with reinforcer differences. The payoff schedule’s the thing wherein we’d capture control of the organism. (Garcia 1981: 155)

13. Radical behaviorists such as Skinner (e.g., 1953) would deny this claim, but only because of their ontological objections to reifying mental states. But Eliminativism of the mental is a different thesis than associationism, although both fit together well (see section 6).

14. Hereafter I will use the forward slash to denote an associative bond between the entities on either side of the slash. Additionally, expressions written in small caps will be used to denote concepts, and I will assume that the concepts’ structural descriptions are given by the expressions. Thus RED BIRD is taken to be a complex concept consisting of two meaningful parts, the concept RED and the concept BIRD. However, BIRD will be assumed to be a simple concept with no semantically decomposable parts. The structural descriptions are stipulated for exegetical reasons and without commitment to the actual structure of the corresponding concepts.

15. The mediation parenthetical can get a bit complicated to state, for one might want to claim that, e.g., WRENCH and HAMMER are associated, even if the association is mediated through a link with SCREWDRIVER. In that case, it’s best to say that two concepts form a basic associative structure if the activation of one concept brings on the activation of the other without there being any other mediating psychological variable.

16. This claim should be qualified in a few ways. First, the mapping might cover not a single thinker as a whole but only a subsystem of that thinker (such as their intramodular representation of their lexicon; see Fodor 1983). Second, the mapping needn’t be between concepts per se, and can instead be between mental representations on which, for some reason or another, one needn’t bestow the honorific “concepts” (because, for example, the mental representations are intramodular and thus not properly “general”; see Evans 1982).

17. “Experiencing Xs and Ys” generally means something such as “having formed representations of Xs and Ys based on their appearance in the ambient environment,” but needn’t necessarily mean that. If one just happened to keep thinking X followed by Y for any reason, even though Xs and Ys weren’t given in experience, that too could change the associative strength of the X/Y bond. Additionally, some theories allow “piggybacking” associations, that is, associations formed from activated propositional structures. For example, constantly having the propositional thought MOLLY OWNS A DOG could affect the associative bond between MOLLY and DOG (see Mandelbaum forthcoming for discussion).

18. Although bare-bones associationism provides a good approximation of Hume and Pavlov, it doesn’t quite capture the full theory of those working in operant conditioning paradigms, for it doesn’t involve any notion of reinforcement, or of updating one’s associative structure based on consequences. This isn’t accidental: how to square cognitive updating (i.e., association-based or belief-based updating) based on consequences with the Spartan tenets of associationism has often been a point of difficulty (see, e.g., Festinger and Carlsmith 1959).

19. Curiously, it appears that extinction isn’t very effective in evaluative conditioning paradigms, though counterconditioning is (see De Houwer 2011 for many citations, such as Diaz et al. 2005 and Vansteenwegen et al. 2006).

20. Technically, reinstatement is the reappearance of the CR upon reexposure to the US after successful extinction, whereas spontaneous recovery is the name for the return of the associative pairing just due to the passage of time. Reinstatement and spontaneous recovery are related, and both provide difficulties for the traditional view of extinction.

21. Interestingly, Locke also seemed to understand the nature of taste aversions (see section 9.4):

A grown person surfeiting with honey no sooner hears the name of it, but his fancy immediately carries sickness and qualms to his stomach, and he cannot bear the very idea of it; other ideas of dislike, and sickness, and vomiting, presently accompany it, and he is disturbed; but he knows from whence to date this weakness, and can tell how he got this indisposition. Had this happened to him by an over-dose of honey when a child, all the same effects would have followed; but the cause would have been mistaken, and the antipathy counted natural. (Locke 1690: 2.23.7).

22. In the example of associative transitions offered above, we used associations between propositions. But of course a pure associationist view would not allow propositional structures. It is thus a bit more difficult for a pure associationist to distinguish associative transitions from associative structures. For the pure associationist, all transitions are associative transitions among associative structures, for association is the only available mental process and associative structures the only available mental structure. Thus, for the pure associationist, the only possible difference between an associative structure and an associative transition is a contingent temporal one (where an associative structure is ideally contemporaneous whereas an associative transition unfolds over time).

23. The question of how many levels of explanation one allows in their cognitive architecture is wholly separate from the question of whether any of those architectures are associationistic. Generalizations here vary wildly from theorist to theorist. For example, many theorists, roughly following Marr (1982), assume there is just one algorithmic (psychological/representational) level which is then instantiated in a physical (neurological) level (see, e.g., Mitchell et al. 2009). Others generally assume that there are multiple psychological levels. For instance, Fodor writes, “psychological faculties at the nth level are typically implemented by psychological faculties at the n−1th level” (2003: 132).

24. In this context, “sub-symbolic” just means that the node on its own has no semantic value. In other words, a single node wouldn’t represent any content.

25. There are no domain-specific associationists because associative learning is incompatible with domain specificity. Domain specificity assumes different mental processes for different domains, and associative learning presupposes the same learning mechanism regardless of domain.

26. For example, in a “default-interventionist” model System 2 processes are not always engaged, though they are in “parallel competitive” models (both models include the constant automatic engagement of System 1). See Evans and Stanovich (2013) for discussion.

27. Gallistel and King (2009: 239) argue that there is no such window. Instead they argue that what matters for learning in place of contiguity is a ratio of the time between the presentation of the CS and the appearance of the US as compared to the time between different US presentations (in a given context). For example, speeding up the CS/US connection by a factor of two reduces the number of US presentations one needs by half.
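The arithmetic in that example can be sketched as follows. This is only an illustration: it assumes that the number of US presentations needed is directly proportional to the ratio of the CS-US interval to the US-US interval, and the function name, the constant `k`, and the interval values are hypothetical rather than taken from Gallistel and King.

```python
def us_presentations_needed(cs_us_interval, us_us_interval, k=100.0):
    """Hypothetical estimate of how many US presentations learning requires.

    Assumption (not from the source): the count scales linearly with the
    ratio (CS-US interval) / (US-US interval); k is an arbitrary constant.
    """
    if cs_us_interval <= 0 or us_us_interval <= 0:
        raise ValueError("intervals must be positive")
    return k * (cs_us_interval / us_us_interval)


# Same spacing between USs, but the CS-US interval is halved.
baseline = us_presentations_needed(cs_us_interval=10.0, us_us_interval=60.0)
faster = us_presentations_needed(cs_us_interval=5.0, us_us_interval=60.0)

# Under the proportionality assumption, halving the CS-US interval
# halves the number of US presentations required.
print(faster / baseline)
```

On this sketch nothing depends on an absolute temporal window: scaling both intervals by the same factor leaves the estimate unchanged.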

28. It appears that content specificity of associations needn’t just be based on innate dispositions. For example, in an evaluative conditioning paradigm using odors as USs and faces as CSs, the evaluative conditioning only commenced when the odors were interpreted as plausibly human (Todrank et al. 1995). But “plausibly human” included learned information (such as the odors associated with soap). When the odors were typically associated with objects and not humans, no learning transpired. Additionally, there appear to be content-specific differences in associative learning at a greater level of abstraction: there is evidence that negative US/CS pairings are learned more quickly, and form stronger bonds, than positive US/CS pairings (Rozin 1986, Baeyens et al. 1990).

29. Blocking has been observed in humans (see Dickinson et al. 1984) but one needn’t delve into the empirical literature to feel the pull of the phenomenon. Imagine you’ve eaten an orange and immediately have an allergic reaction. If in your next meal you eat an orange and an apple and have the allergic reaction, you will be less likely to think the apple caused the reaction than you would were you to have never experienced the allergic reaction after eating the orange.
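The orange and apple case can be simulated with the Rescorla-Wagner model cited in note 7. What follows is a minimal sketch rather than the authors' own formulation: it assumes the standard update rule, in which each cue present on a trial changes by a learning rate times (λ − ΣV), collapses the model's two rate parameters into a single hypothetical `alpha`, and uses made-up trial counts.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Return the associative strength of each cue after the given trials.

    trials: list of (cues, us_present) pairs, where cues is a tuple of names.
    alpha:  single learning rate (a simplification; an assumption here).
    lam:    maximum strength the US can support.
    """
    strengths = {}
    for cues, us_present in trials:
        # Prediction is the summed strength of every cue present.
        prediction = sum(strengths.get(cue, 0.0) for cue in cues)
        error = (lam if us_present else 0.0) - prediction
        # Every cue present on the trial shares the same prediction error.
        for cue in cues:
            strengths[cue] = strengths.get(cue, 0.0) + alpha * error
    return strengths


def blocking_demo(n=20):
    """Compare pretraining on orange alone with no pretraining."""
    orange_only = [(("orange",), True)] * n
    compound = [(("orange", "apple"), True)] * n
    blocked = rescorla_wagner(orange_only + compound)
    control = rescorla_wagner(compound)
    return blocked, control


blocked, control = blocking_demo()
# Apple ends up with almost no strength after orange pretraining,
# but with roughly half of lam when there was no pretraining.
print(blocked["apple"], control["apple"])
```

Because the orange already predicts the reaction almost perfectly, little prediction error is left over for the apple to absorb, which is the blocking effect.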

30. More problematically for associationists, blocking doesn’t always work, and associative theory doesn’t predict when it will fail. For example, if a weak odor is paired with a strong taste and the pairing is followed by gastrointestinal distress, the taste magnifies the sensitivity of the odor as a signal (Rusiniak et al. 1979). Relatedly, if a hawk eats a black mouse and gets sick, the hawk won’t just avoid black mice but will avoid all mice. However, if the black mouse tastes different than a white mouse, then the hawk will continue to eat white mice even after black mice make it sick (Brett et al. 1976).

31. Oddly enough, evaluative conditioning does not seem as sensitive to base rates or as susceptible to “occasion setting” as classical conditioning is. See De Houwer et al. (2001).

32. The more one looks into how locational properties become associated, the more problems seem to mount. For example, if a rat has a strong preference for a particular drink but gets shocked while ingesting that drink, the rat will not change its preference for the flavor. Instead, the rat will just learn to avoid the drink when it encounters it in the experimental location. But when the rat is given a chance to ingest the drink anywhere else (e.g., back in its home cage) it will still continue to ingest the drink. Furthermore, in the case where the rat gets shocked while drinking the highly desirable flavor in the Skinner box on trial N, the rat will increase how much of the drink it will intake on trial N+1. This is a reasonable strategy: assuming that one knows they are going to get shocked, they might as well intake as much as possible while getting shocked. For more on these points, see Garcia et al. (1970).

33. In other versions of the problem it is understood as the problem the organism faces in trying to figure out which of its behaviors produced the environmental change that interests the organism. It also appears in problems in Artificial Intelligence (see Minsky 1963).

34. For a pure associationist, one would phrase this as the organism learning to associate the lack of CS with the US. How the pure associationist analyzes the absence of a CS while using only associative structures can also be a difficult issue.

Copyright © 2015 by
Eric Mandelbaum

This is a file in the archives of the Stanford Encyclopedia of Philosophy.
Please note that some links may no longer be functional.