Stanford Encyclopedia of Philosophy
This is a file in the archives of the Stanford Encyclopedia of Philosophy.

Explanation in Mathematics

First published Sun Apr 6, 2008

The philosophical analysis of mathematical explanations concerns itself with two different, although connected, areas of investigation. The first area addresses the problem of whether mathematics can play an explanatory role in the natural and social sciences. The second deals with the problem of whether mathematical explanations occur within mathematics itself. Accordingly, this entry surveys the contributions to both areas, shows their relevance to the history of philosophy and science, articulates their connection, and points to the philosophical pay-offs to be expected by deepening our understanding of the topic.

1. Mathematical explanations in the natural sciences

Mathematics plays a central role in our scientific picture of the world. How the connection between mathematics and the world is to be accounted for remains one of the most challenging problems in philosophy of science, philosophy of mathematics, and general philosophy. A very important aspect of this problem is that of accounting for the explanatory role mathematics seems to play in the account of physical phenomena. Consider the following example from evolutionary biology mentioned in Lyon & Colyvan 2007. Why do hive-bee honeycombs have a hexagonal structure? The nature of the question is contrastive: why hexagonal as opposed to, say, any other polygonal figure or combination thereof? Part of the explanation depends on evolutionary facts. Bees that use less wax and thus spend less energy have a better chance of being selected. The explanation is completed by pointing out that “any partition of the plane into regions of equal area has perimeter at least that of the regular hexagonal honeycomb tiling”. Thus, the hexagonal tiling is optimal with respect to dividing the plane into equal areas and minimizing the perimeter. This fact, known as the “honeycomb conjecture”, was recently proved in Hales 2001. The explanation of the biological fact seems to depend essentially on a mathematical fact.
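The optimality claim can be illustrated numerically in a very restricted setting. The following sketch compares only the three regular polygons that tile the plane (triangle, square, hexagon), each scaled to unit area; it does not reproduce the theorem proved by Hales, which covers arbitrary partitions into equal areas. The function names are illustrative assumptions, and the formula used is the standard one for the perimeter of a regular n-gon of given area.

```python
import math

# The three regular polygons that tile the plane by themselves.
REGULAR_TILINGS = {"triangle": 3, "square": 4, "hexagon": 6}


def cell_perimeter(n_sides: int, area: float = 1.0) -> float:
    """Perimeter of a regular polygon with n_sides sides and the given area.

    From area = (1/4) * n * s**2 / tan(pi/n) one gets
    perimeter = n * s = 2 * sqrt(n * tan(pi/n) * area).
    """
    return 2.0 * math.sqrt(n_sides * math.tan(math.pi / n_sides) * area)


def perimeters(area: float = 1.0) -> dict:
    """Perimeter of one cell for each regular tiling, at equal cell area."""
    return {name: cell_perimeter(n, area) for name, n in REGULAR_TILINGS.items()}


if __name__ == "__main__":
    # Smaller perimeter at equal area means less wall (wax) per cell.
    for name, p in sorted(perimeters().items(), key=lambda item: item[1]):
        print(f"{name:9s} perimeter at unit area: {p:.4f}")
```

In a tiling each wall is shared by two cells, so the wall length per unit area is half the cell perimeter; this halves all three values and leaves their ordering unchanged.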

Another example from evolutionary biology has been discussed in Baker 2005. It has to do with the life-cycle of the so-called ‘periodical’ cicada. It turns out that three species of such cicadas “share the same unusual life-cycle. In each species the nymphal stage remains in the soil for a lengthy period, then the adult cicada emerges after 13 years or 17 years depending on the geographical area. Even more strikingly, this emergence is synchronized among the members of a cicada species in any given area. The adults all emerge within the same few days, they mate, die a few weeks later and then the cycle repeats itself.” (2005, 229). Several questions have been raised about this specific type of life cycle but one of them is why such periods are prime. One explanation appeals to the biological claim that cicadas that minimize intersection with other cicadas' and predators' life cycles have an evolutionary advantage over those that do not. The mathematical component of the explanation complements the biological claim by pointing out that prime periods minimize intersection.
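The arithmetical point can be made concrete with a small computation. The sketch below assumes, purely for illustration, that the competing cycles (predators or other cicadas) have periods of 2 to 9 years and that every cycle starts in the same year, so that two cycles with periods p and q coincide once every lcm(p, q) years. Neither the range of periods nor the function names come from Baker's paper.

```python
from fractions import Fraction
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def coincidence_rate(period: int, other_periods) -> Fraction:
    """Long-run number of coincidences per year with the other cycles.

    Two cycles with periods p and q (same starting year) coincide once
    every lcm(p, q) years, so each contributes 1 / lcm(p, q) per year.
    """
    return sum(Fraction(1, lcm(period, q)) for q in other_periods)


def rank_periods(candidates, other_periods) -> list:
    """Candidate periods ordered from fewest to most coincidences."""
    return sorted(candidates, key=lambda p: coincidence_rate(p, other_periods))


if __name__ == "__main__":
    others = range(2, 10)          # hypothetical competing cycle lengths
    for p in rank_periods(range(12, 19), others):
        print(p, float(coincidence_rate(p, others)))
```

Under these assumptions the two prime candidates between 12 and 18 come out with the lowest coincidence rates, because a prime period shares no factor with any shorter period and so lcm(p, q) is as large as it can be, namely p times q.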

One interesting difference between the two examples is that the first appeals to a geometrical theorem whereas the second appeals to an arithmetical theorem. This shows that different areas of mathematics can contribute to scientific explanations, potentially in different ways.

When we move to physics, it becomes even more difficult, given the highly mathematized nature of the subject, to distinguish between the mathematical and the physical components of an explanation. Consider the following example. Mark the faces of a tennis racket with R (for rough) and S (for smooth). Hold the tennis racket horizontally by its handle with face S facing up. Let y be the intermediate principal axis. This is the axis that lies in the plane of the face, perpendicular to the handle, and passes through the center of mass of the racket. Toss the racket attempting to make it rotate about the y axis. Catch the racket by its handle after one full rotation. The surprising observation is that the R face will almost always be up (one would expect S to be up). In other words, the racket makes a half twist about its handle. An explanation of this phenomenon was given in Ashbaugh, Chicone, & Cushman 1991. They say: “In this paper we explain the twist by analyzing the [differential] equations of motion of the tennis racket in space…Our treatment of the twist is divided into two parts. In the first part we prove two theorems which show that the handle moves nearly in a plane and rotates nearly uniformly…In the second part, we discuss how the twist and rotation of the handle are related” (Ashbaugh et al. 1991, 68). There is no question that we are explaining a physical regularity, but mathematics enters here both in the modeling of the phenomenon and in the explanatory account by means of the classical dynamics of a rotating tennis racket.
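A much simpler piece of mathematics than the one in the paper already shows why rotation about the intermediate principal axis is special. The sketch below uses the standard textbook linearization of Euler's equations for a torque-free rigid body; it is not the argument of Ashbaugh, Chicone, and Cushman, and it shows only that the rotation is unstable, not that the result is a half twist. The moments of inertia are hypothetical values chosen for illustration.

```python
def growth_coefficient(moments, axis: int) -> float:
    """Coefficient c in the linearized equation w'' = c * Omega**2 * w.

    The body spins at rate Omega about principal axis `axis` (0, 1 or 2)
    and w is a small angular velocity about either of the other two axes.
    From Euler's equations, c = (I_i - I_j) * (I_k - I_i) / (I_j * I_k),
    where I_i belongs to the spin axis. c > 0 means exponential growth,
    c < 0 means bounded oscillation.
    """
    j, k = [a for a in range(3) if a != axis]
    i_spin, i_j, i_k = moments[axis], moments[j], moments[k]
    return (i_spin - i_j) * (i_k - i_spin) / (i_j * i_k)


def is_unstable(moments, axis: int) -> bool:
    """True when small perturbations of the spin about `axis` grow."""
    return growth_coefficient(moments, axis) > 0


if __name__ == "__main__":
    # Hypothetical moments, ordered as: handle axis, in-plane axis
    # perpendicular to the handle, axis perpendicular to the face.
    racket = (1.0, 2.0, 3.0)
    for axis, label in enumerate(("handle", "intermediate", "face normal")):
        print(label, "unstable" if is_unstable(racket, axis) else "stable")
```

The coefficient is positive exactly when the moment of inertia of the spin axis lies strictly between the other two, which is why only the intermediate axis is flagged as unstable.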

Another simple example, in which a geometrical fact seems to do much of the explaining, has been offered by Peter Lipton:

There also appear to be physical explanations that are non-causal. Suppose that a bunch of sticks are thrown into the air with a lot of spin so that they twirl and tumble as they fall. We freeze the scene as the sticks are in free fall and find that appreciably more of them are near the horizontal than near the vertical orientation. Why is this? The reason is that there are more ways for a stick to be near the horizontal than near the vertical. To see this, consider a single stick with a fixed midpoint position. There are many ways this stick could be horizontal (spin it around in the horizontal plane), but only two ways it could be vertical (up or down). This asymmetry remains for positions near horizontal and vertical, as you can see if you think about the full shell traced out by the stick as it takes all possible orientations. This is a beautiful explanation for the physical distribution of the sticks, but what is doing the explaining are broadly geometrical facts that cannot be causes. (Lipton 2004, 9-10)
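The geometrical asymmetry in this passage can be quantified under one added assumption, namely that every orientation of the stick is equally likely (a uniform distribution over the sphere of directions). The passage suggests this idealization but does not state it. With the angle measured from the vertical, the orientations within a tolerance delta of horizontal form a band around the equator of the sphere, and those within delta of vertical form two polar caps. The function names are illustrative.

```python
import math


def near_horizontal_fraction(delta: float) -> float:
    """Fraction of orientations within `delta` radians of horizontal.

    The band |theta - pi/2| < delta on the unit sphere has area
    4 * pi * sin(delta), out of a total area of 4 * pi.
    """
    return math.sin(delta)


def near_vertical_fraction(delta: float) -> float:
    """Fraction of orientations within `delta` radians of vertical.

    The two polar caps (pointing up or down) have total area
    4 * pi * (1 - cos(delta)), out of a total area of 4 * pi.
    """
    return 1.0 - math.cos(delta)


if __name__ == "__main__":
    for degrees in (1, 5, 10, 20):
        d = math.radians(degrees)
        ratio = near_horizontal_fraction(d) / near_vertical_fraction(d)
        print(f"within {degrees:2d} degrees: horizontal/vertical ratio {ratio:.1f}")
```

For small tolerances the ratio behaves like 2/delta, so the tighter the tolerance, the more the near-horizontal orientations outnumber the near-vertical ones.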

Lipton's description of the example points to one of the reasons why philosophers are especially interested in such explanations, for they seem to be counterexamples to the claim that all explanations in the natural sciences must be causal.

Having established that mathematics seems to play an important role in giving explanations in the natural sciences, we now move to a few historical remarks on how this problem has emerged in the history of philosophy and science.

2. The explanatory role of mathematics in science: some historical remarks

Does mathematics help explain the physical world or does it actually hinder a grasp of the physical mechanisms that explain the how and why of natural phenomena? It is not possible here to treat this topic in its full complexity but a few remarks will help the reader appreciate the historical importance of the question.

Aristotle describes his ideal of scientific knowledge in “Posterior Analytics” in terms of, among other things, knowledge of the cause:

We suppose ourselves to possess unqualified scientific knowledge of a thing, as opposed to knowing it in the accidental way in which the sophist knows, when we think that we know the cause on which the fact depends as the cause of the fact and of no other, and further, that the fact could not be other than it is. (BWA, 111, Post. An. I.1, 71b 5-10)

The causes [aitia] in question are the four Aristotelian causes: formal, material, efficient, and final. Nowadays, translators and commentators of Aristotle prefer to translate aitia as ‘explanation’, so that the theory of the four causes becomes an account of four types of explanations. For instance here is Barnes' translation of the passage quoted earlier: “We think we understand a thing simpliciter (and not in the sophistic fashion accidentally) whenever we think we are aware both that the explanation because of which the object is is its explanation, and that it is not possible for this to be otherwise.” (Aristotle CWA, 115, Post. An. I.1, 71b 5-10)

But how do we obtain knowledge? Knowledge is obtained through demonstration. However, not all logically cogent proofs provide us with the kind of demonstration that yields scientific knowledge. In a scientific demonstration “the premisses must be true, primary, immediate, better known and prior to the conclusion, which is further related to them as effect to causes.” (BWA, 112, Post. An. I.1, 71b 20-25) In Barnes' translation: “If, then, understanding is as we posited, it is necessary for demonstrative understanding in particular to depend on things which are true and primitive and immediate and more familiar than and prior to and explanatory of the conclusion” (Aristotle CWA, 115, Post. An. I.1, 71b 20-25).

Accordingly, in “Posterior Analytics” I.13, Aristotle distinguished between demonstrations “of the fact” and demonstrations “of the reasoned fact”. Although both are logically cogent, only the latter mirror the causal structure of the phenomena under investigation, and thus provide us with knowledge. We can call them, respectively, “non-explanatory” and “explanatory” demonstrations.

In Aristotle's system, physics was not mathematized although causal reasonings were proper to it. However, Aristotle also discussed extensively the so-called mixed sciences, such as optics, harmonics, and mechanics, characterizing them as “the more physical of the mathematical sciences”. There is a relation of subordination between these mixed sciences and areas of pure mathematics. For instance, harmonics is subordinated to arithmetic and optics to geometry. Aristotle is in no doubt that there are mathematical explanations of physical phenomena:

For here it is for the empirical scientist to know the fact and for the mathematical to know the reason why; for the latter have the demonstrations of the explanations, and often they do not know the fact, just as those who consider the universal often do not know some of the particulars through lack of observation. (Aristotle CWA, vol. I, 128, Post. An. I.13, 79a1-79a7)

However, whether mathematics could give explanations of natural phenomena was a matter of disagreement. As the domains to which mathematics could be applied grew, so also did the resistance to it. One source of tension consisted in trying to reconcile the Aristotelian conception of pure mathematics, as abstracting from matter and motion, with the fact that physics (natural philosophy) and the mixed sciences are both conversant with natural phenomena and thus dependent on matter and motion. For instance, an important debate in the Renaissance, known as the Quaestio de Certitudine Mathematicarum, focused in large part on whether mathematics could play the explanatory role assigned to it by Aristotle. Some argued that, lacking causality, mathematics could not be the ‘explanatory’ link in the explanation of natural phenomena (see also section 5).

By the time we reach the seventeenth century and the Newtonian revolution in physics, the problem reappears in the context of a change of criteria of explanation and intelligibility. This has been beautifully described in an article by Y. Gingras (2001). Gingras argues that “the use of mathematics in dynamics (as distinct from its use in kinematics) had the effect of transforming the very meaning of the term ‘explanation’ as it was used by philosophers in the seventeenth century” (385). What Gingras describes, among other things, is how the mathematical treatment of force espoused by Newton and his followers, a treatment that ignored the mechanisms that could explain why and how this force operated, became an accepted standard for explanation during the eighteenth century. After referring to the seventeenth- and eighteenth-century discussions on the mechanical explanation of gravity, he remarks:

This episode shows that the evaluation criteria for what was to count as an acceptable ‘explanation’ (of gravitation in this case) were shifting towards mathematics and away from mechanical explanations. Confronted with a mathematical formulation of a phenomenon for which there was no mechanical explanation, more and more actors chose the former even at the price of not finding the latter. This was something new. For the whole of the seventeenth century and most of the eighteenth, to ‘explain’ a physical phenomenon meant to give a physical mechanism involved in its production….The publication of Newton's Principia marks the beginning of this shift where mathematical explanations came to be preferred to mechanical explanations when the latter did not conform to calculations. (Gingras 2001, 398)

Among those who resisted this confusion between “physical explanations” and “mathematical explanations” was the Jesuit Louis Castel. In “Vrai système de physique générale de M. Isaac Newton” (Paris, 1734), he discussed Principia's proposition XIII of Book III (on Kepler's law of areas). He granted that the proposition connected mathematically the inverse square law to the ellipticity of the course of the planets. However, he objected that “the one is not the cause, the reason of the other” (Castel 1734, 97) and that Newton had not provided any physical explanation, only a mathematical one. Indeed “physical reasons are necessary reasons of entailment, of linkages, of mechanism. In Newton, there is none of this kind.” (Castel 1734, 121)

It would be interesting to pursue these questions into the nineteenth and the twentieth centuries but that is obviously not something that can be done here. Rather, the aim of the above was to prepare the ground for showing how in contemporary discussions in philosophy of science, to which we now turn, we are still confronted with such issues.

3. Philosophical relevance of mathematical explanations in science

There are two major areas in which the discussion of whether mathematics can play an explanatory role in science makes itself felt. The first concerns issues of modeling and idealization in science. The second concerns the nominalism-platonism debate.

3.1 Mathematics, modeling and idealization

A good starting point here is Morrison's book “Unifying Scientific Theories” (2000). One of the major theses of the book is that unification and explanation often pull in different directions and come apart (contrary to what is claimed by unification theories of explanation). One of the examples discussed in her introduction reminds us of Castel's objections:

Another example is the unification of terrestrial and celestial phenomena in Newton's Principia. Although influenced by Cartesian mechanics, one of the most striking features of the Principia is its move away from explanations of planetary motions in terms of mechanical causes. Instead, the mathematical form of force is highlighted; the planetary ellipses discovered by Kepler are “explained” in terms of a mathematical description of the force that produces those motions. Of course, the inverse-square law of gravitational attraction explains why the planets move in the way they do, but there is no explanation of how this gravitational force acts on bodies (how it is transported), nor is there any account of its causal properties. (Morrison 2000, 4)

Using several case studies (Maxwell's electromagnetism, the electroweak unification, etc.), Morrison argues that the mathematical structures involved in the unification “often supply little or no theoretical explanation of the physical dynamics of the unified theory” (Morrison 2000, 4). In short, the mathematical formalism facilitates unification but does not help us explain the how and why of physical phenomena.

By contrast, Batterman, in “The Devil in the Details” (2002), analyzes a wide class of explanations (asymptotic explanations) which rely heavily on mathematics. “Asymptotic reasoning — the taking of limits as a means to simplify, and the study of the nature of these limits — constitutes the main method of idealization in the mathematician's tool box” (Batterman 2002, 132). These methods proceed by ignoring many details, even of a causal nature, about the phenomenon being analyzed. But despite this fact, or rather in virtue of it, one arrives at correct explanations of the phenomena. In fact, the reason why “asymptotic analyses so often provide physical insight is that they illuminate structurally stable aspects of the phenomenon and its governing equations.” (Batterman 2002, 59)

We thus see that the problem of the explanatory role of mathematics in science is intimately related to problems of modeling and idealization in science. In turn, understanding how modeling and idealization work is an integral part of addressing the question of how mathematics hooks on to reality, i.e. an account of the applicability of mathematics to reality (see Shapiro 2000, 35 and 217).

3.2 Indispensability arguments

Whereas the issues treated in section 3.1 affect the methodology of science, a different set of issues has emerged in connection with the nominalism-platonism debate in philosophy of mathematics. Much of the discussion in this area has focused on so-called indispensability arguments. There is actually a variety of indispensability arguments on offer (see Colyvan 2001) but the general structure of the argument runs as follows. One begins with the premise that mathematics is indispensable for our best science. The second premise is that we ought to believe our best theories. Thus, we ought to be committed to the kind of entities that our best theories quantify over. In general this is an argument in favor of platonism, as our best science quantifies over mathematical entities.

There are many ways in which one can attempt to block the argument. However, the key feature related to our discussion is the following. Several versions of the indispensability argument rely on a holistic conception of scientific theories according to which the ontological commitment of the theory is determined by looking at all the existential claims implied by the theory. However, no attention is paid to how the different parts of the theory might be responsible for different posits and to the different roles that the latter might play. Baker 2005 offers a version of the indispensability argument that does not depend on holism. Baker starts from a debate between Colyvan (2001, 2002) and Melia (2000, 2002) that saw both authors agreeing that the prospects for a successful platonist use of the indispensability argument rest on examples from scientific practice in which the postulation of mathematical objects results in an increase of those theoretical virtues which are provided by the postulation of theoretical entities. Both authors agree that among such theoretical virtues is explanatory power. Baker believes that such explanations exist but also argues that the cases presented in Colyvan 2001 fail to be genuine cases of mathematical explanations of physical phenomena. Most of his article is devoted to the specific case study from evolutionary biology concerning the life-cycle of the so-called ‘periodical’ cicada, which was described in section 1. Recall that the question of interest was why the life cycle periods of such cicadas are prime numbers and that the answer appealed to evolutionary facts and mathematical properties of prime numbers. After the reconstruction of the explanation, Baker concludes that:

The explanation makes use of specific ecological facts, general biological laws, and number theoretic result. My claim is that the purely mathematical component [prime periods minimize intersection (compared to non-prime periods)] is both essential to the overall explanation and genuinely explanatory on its own right. In particular it explains why prime periods are evolutionary advantageous in this case. (2005, 233)

Such explanations give a new twist to the indispensability argument. The argument now runs as follows.

  1. There are genuinely mathematical explanations of empirical phenomena
  2. We ought to be committed to the theoretical posits postulated by such explanations; thus,
  3. We ought to be committed to the entities postulated by the mathematics in question.

The argument has not gone unchallenged. Indeed, Leng 2005 tries to resist the conclusion by blocking premise 2. She accepts premise 1 but questions the claim that the role of mathematics in such explanations commits us to the real existence (as opposed to a fictional one) of the posits. This, she argues, will be granted when one realizes that both Colyvan and Baker infer illegitimately from the existence of the mathematical explanation that the statements grounding the explanation are true. She claims that mathematical explanations need not have a true explanans and consequently the objects posited by such explanations need not exist. Another challenge has been raised by Bangu 2008, who claims that mathematical language is essential to the formulation of the question to be answered (“why is the life cycle period prime?”) and thus that the argument begs the question against the nominalist. The existence of numbers and properties of numbers is already assumed in the acceptance of the statement “the life cycle period is prime”. A similar objection to any attempt to use mathematical explanations in physics for inferring the existence of the mathematical entities involved in the explanation had already been raised in Steiner 1978b. Steiner had dismissed such arguments with the observation that what needed explanation could not even be described without the use of mathematical language. Thus, the existence of mathematical explanations of empirical phenomena could not be used to infer the existence of mathematical entities, for this very existence was presupposed in the description of the fact to be explained. Indeed, he endorsed a line of argument originating from Quine and Goodman according to which “we cannot say what the world would be like without numbers, because describing any thinkable experience (except for utter emptiness) presupposes their existence.” (1978b, 20)

Modified versions of the indispensability argument stressing the indispensability of mathematics for explanations in science were considered, before Baker, by the nominalist Field as a challenge to the platonist use of such arguments. Field (1989, 14-20) accepts the cogency of this type of inference to the best explanation, but he had argued (Field 1980) that platonist mathematics could be replaced by a nominalistically acceptable theory that was sufficient for the development of classical mechanics. In addition, the nominalistic replacement would also have the virtue of providing ‘intrinsic’ explanations of the physical phenomena. That led to much discussion as to how far Field's program could be pushed. Malament 1982 had objected that the obstacles to the nominalization of phase-space theories in Hamiltonian mechanics seemed insuperable. Lyon and Colyvan (2007) go beyond Malament's claim by arguing that even if a nominalistic reconstruction of phase-space theories were available, the nominalist would still have to show that such a reconstruction can yield the explanations yielded by the non-nominalistic version(s). They believe that the nominalist will fail in this task and make a plausible case for their thesis by providing a case study of a physical system known as the Hénon-Heiles system. The system describes the motion of a star around a galactic center. Their claim is that the phase-space analysis of the system provides explanations that cannot be provided by any nominalist reconstruction. At the end of their article, Lyon and Colyvan also review a few possible moves the nominalist can make in response. One such move would deny that mathematical explanations by themselves have any bearing on physical explanations and insist that some bridge principles, linking the mathematics to the physical system, are required. They reply:

Our response to this is to agree that in order for the mathematical explanation to be an explanation of empirical facts, some appropriate bridge principles are required. But this does not mean that the mathematical explanation is restricted to pure mathematics. Yes, there is a great deal of work being done by the bridge principles in order for the mathematical explanations to be explanations of physical facts, and there is a great deal to be said about the nature and adequacy of these bridge principles, but this does not reduce the importance of the mathematical explanation in question. Indeed, the bridge principles in question are mappings between physical systems and mathematical structures, and so are themselves mathematical entities (i.e., mappings). If the nominalist hopes to defuse the situation by having the bridge principles shoulder some of the explanatory load, this seems a poor way to proceed. (p.15)

Lyon and Colyvan thus grant that such bridge principles are required in mathematical explanations of empirical facts, but they hold that these principles “do not seem to do anything more than allow the transmission of the mathematical explanations to the empirical domain” (p.15).

It thus appears that a proper account of explanations in science requires an analysis of mathematical explanations in pure mathematics. Indeed, this was also the major intuition behind the account of mathematical explanation in science offered in Steiner 1978b, whose central idea was that a mathematical explanation of a physical fact is one in which, when we remove the physics, what we are left with is a mathematical explanation of a mathematical fact. Steiner himself had provided an account of mathematical explanations of mathematical facts in Steiner 1978a. We will discuss it, in section 6, in the context of the treatment of mathematical explanations within mathematics to which we now turn.

4. Mathematical explanations within mathematics

Much mathematical activity is driven by factors other than justificatory aims such as establishing the truth of a mathematical fact. In many cases knowledge that something is the case will be considered unsatisfactory and this will lead mathematicians to probe the situation further to look for better explanations of the facts. This might take the form of, just to give a few examples, providing alternative proofs for known results, giving an account of surprising analogies, or recasting an entire area of mathematics on a new basis in the pursuit of a more satisfactory ‘explanatory’ account of the area. The phenomenology of the variety of such explanatory activities has been partially investigated in Sandborg (1997, ch. 1) and Hafner & Mancosu 2005 (see also Robinson 2000 for a cognitive analysis of proof emphasizing explanatory factors).

Consider for instance the case of Brumfiel, a real algebraic geometer. In his book “Partially ordered rings and semi-algebraic geometry” (1979), Brumfiel contrasts different methods for proving theorems about real closed fields. One of them relies on a decision procedure for a particular axiomatization of the theory of real closed fields. By this method one can find elementary proofs of sentences formulated in the language of that theory, at least in principle, since, as Brumfiel remarks, “it certainly might be very tedious, if not physically impossible, to work out this elementary proof” (166).

Another method of proof consists in using a so-called transfer principle which allows one to infer the truth of a sentence for all real closed fields from its being true in one real closed field, say the real numbers. Despite the fact that the transfer principle is a very efficient tool, Brumfiel does not make any use of it, and he is very clear about this.

In this book we absolutely and unequivocally refuse to give proofs of this second type. Every result is proved uniformly for all real closed ground fields. Our philosophical objection to transcendental proofs is that they may logically prove a result but they do not explain it, except for the special case of real numbers. (Brumfiel 1979, 166)

Brumfiel prefers a third proof method which aims at giving non-transcendental proofs of purely algebraic results. This does not mean that he restricts himself to just elementary methods; he does use stronger tools but it is crucial that they apply uniformly to all real closed fields.

But explanations in mathematics do not only come in the form of proofs. In some cases explanations are sought in a major conceptual recasting of an entire discipline. In such situations the major conceptual recasting will also produce new proofs but the explanatoriness of the new proofs is derivative on the conceptual recasting. This leads to a more global (or holistic) picture of explanation than the one based on the focus on individual proofs. Mancosu 2001 describes in detail such a global case of explanatory activity from complex analysis; see also Kitcher 1984 and Tappenden 2005 for additional case studies.

5. Mathematical explanations: some historical remarks

Since contributions in analytic philosophy to the study of mathematical explanations date back only to Steiner 1978a, one might suspect that the topic was a byproduct of the Quinean conception of scientific theories (see Resnik & Kushner, 1987, 154). Once mathematics and natural science were placed on the same footing, it became possible to apply a unified methodology to both areas. Thus, it made sense to look for explanations in mathematics just as in natural science. However, this historical reconstruction would be mistaken. Mathematical explanations of mathematical facts have been part of philosophical reflection since Aristotle. We have already seen the distinction Aristotle drew between demonstrations “of the fact” and demonstrations “of the reasoned fact”. Both are logically rigorous but only the latter provide explanations for their results. Aristotle had also claimed that demonstrations “of the reasoned fact” occur in mathematics. On account of what we said in section 2, these demonstrations can be called “explanatory” demonstrations.

Aristotle's position on explanatory proofs in mathematics was already challenged in ancient times. Proclus, in his “Commentary on the first book of Euclid's Elements”, informs us on this point. He reports: “Many persons have thought that geometry does not investigate the cause, that is, does not ask the question ‘Why?’” (Proclus 1970, 158-159; for more on Proclus on mathematical explanation see Harari 2008). Proclus himself singles out certain propositions in Euclid's “Elements”, such as I.32, as not being demonstrations “of the reasoned fact”. Euclid I.32 states that the sum of the internal angles of a triangle is equal to two right angles. If the demonstration were given by a scientific syllogism in the Aristotelian sense, the middle term of the syllogism would have to provide the ‘cause’ of the fact. But Proclus argues that Euclid's proof does not satisfy these Aristotelian constraints, for the appeal to the auxiliary lines and exterior angles is not ‘causal’:

What is called “proof” we shall find sometimes has the properties of a demonstration in being able to establish what is sought by means of definitions as middle terms, and this is the perfect form of demonstration; but sometimes it attempts to prove by means of signs. This point should not be overlooked. Although geometrical propositions always derive their necessity from the matter under investigation, they do not always reach their results through demonstrative methods. For example, when [from] the fact that the exterior angle of a triangle is equal to the two opposite interior angles it is shown that the sum of the interior angles of a triangle is equal to two right angles, how can this be called a demonstration based on the cause? Is not the middle term used here only as a sign? For even though there be no exterior angle, the interior angles are equal to two right angles; for it is a triangle even if its side is not extended. (Proclus 1970, 161-2)

In addition, Proclus held that proofs by contradiction were not demonstrations “of the reasoned fact”. The rediscovery of Proclus in the Renaissance was to spark a far-reaching debate on the causality of mathematical demonstrations referred to above as the Quaestio de Certitudine Mathematicarum. The first shot was fired by Alessandro Piccolomini in 1547. Piccolomini's aim was to disarm a traditional claim to the effect that mathematics derives its certainty from its use of “scientific demonstrations” in the Aristotelian sense (such proofs were known as “potissimae” in the Renaissance). Since “potissimae” demonstrations had to be causal, Piccolomini attacked the claim by arguing that mathematical demonstrations are not causal. This led to one of the most interesting epistemological debates of the Renaissance and the seventeenth century. Those denying the “causality” of mathematical demonstrations (Piccolomini, Pereyra, Gassendi etc.) argued by providing specific examples of demonstrations from mathematical practice (usually from Euclid's Elements) which, they claimed, could not be reconstructed as causal reasonings in the Aristotelian sense. By contrast, those hoping to restore “causality” to mathematics aimed at showing that the alleged counterexamples could easily be accommodated within the realm of “causal” demonstrations (Clavius, Barrow, etc.). The historical developments have been presented in detail in Mancosu 1996 and Mancosu 2000. What is more important here is to appreciate that the basic intuition, namely the contrast between explanatory and non-explanatory demonstrations, had a long and successful history and influenced both mathematical and philosophical developments well beyond the seventeenth century. For instance, Mancosu 1999 shows that Bolzano and Cournot, two major philosophers of mathematics in the nineteenth century, construe the central problem of philosophy of mathematics as that of accounting for the distinction between explanatory and non-explanatory demonstrations. In the case of Bolzano this takes the form of a theory of Grund (ground) and Folge (consequence). Kitcher 1975 was the first to read Bolzano as propounding a theory of mathematical explanations. In the case of Cournot this is spelled out in terms of the opposition between “ordre logique” and “ordre rationnel” (see Cournot 1851). In Bolzano's case, the aim of providing a reconstruction of parts of analysis and geometry, so that the exposition would use only “explanatory” proofs, also led to major mathematical results, such as his purely analytic proof of the intermediate value theorem.

To conclude this section, we should also point out that there is another tradition of thinking about explanation in mathematics that includes Mill, Lakatos, Russell and Gödel. These authors are motivated by a conception of mathematics (and/or its foundations) as hypothetico-deductive in nature, and this leads them to construe mathematical activity in analogy with how explanatory hypotheses occur in science (see Mancosu 2001 for more details).

6. Two models for mathematical explanation: Steiner and Kitcher

In section 4 it was pointed out that the search for explanations in mathematical practice takes two major forms: the comparison of different proofs of the same result and the conceptual recasting of major areas. These two types of explanatory activity lead to two different conceptions of explanation, which can be characterized as local and global. In the former case explanatoriness is primarily a (local) property of proofs, whereas in the latter it is a (global) property of the whole theory or framework, and proofs are judged explanatory on account of their being part of the framework. While these two types of explanatory activity do not exhaust the varieties of mathematical explanation that occur in practice, the contrast between local and global captures well the main difference between the two leading accounts of mathematical explanation currently available, those of Steiner and Kitcher.

Before discussing them, it should also be pointed out that other models of scientific explanation can be thought to extend to mathematical explanation. For instance, Sandborg (1997, 1998) tests van Fraassen's account of explanation as answers to why-questions by using cases of mathematical explanation.

6.1 A local model of explanation: Steiner

Steiner proposed his model of mathematical explanation in Steiner 1978a. In developing his own account of explanatory proof in mathematics he discusses, and rejects, a number of initially plausible criteria for explanation, e.g. the (greater degree of) abstractness or generality of a proof, its visualizability, and its genetic aspect that would give rise to the discovery of the result. In contrast, Steiner takes up the idea “that to explain the behavior of an entity, one deduces the behavior from the essence or nature of the entity” (Steiner 1978a, 143). In order to avoid the notorious difficulties in defining the concepts of essence and essential (or necessary) property, which, moreover, do not seem to be useful in mathematical contexts since all mathematical truths are regarded as necessary, Steiner introduces the concept of characterizing property. (As an aside, Kit Fine distinguishes between essential and necessary properties, and perhaps the distinction could be exploited in this context.) By characterizing property Steiner means “a property unique to a given entity or structure within a family or domain of such entities or structures”, where the notion of family is taken as undefined. Hence what distinguishes an explanatory proof from a non-explanatory one is that only the former involves such a characterizing property. In Steiner's words: “an explanatory proof makes reference to a characterizing property of an entity or structure mentioned in the theorem, such that from the proof it is evident that the result depends on the property”. Furthermore, an explanatory proof is generalizable in the following sense. Varying the relevant feature (and hence a certain characterizing property) in such a proof gives rise to an array of corresponding theorems, which are proved, and explained, by an array of “deformations” of the original proof. Thus Steiner arrives at two criteria for explanatory proofs, i.e. dependence on a characterizing property and generalizability through varying of that property (Steiner 1978a, 144, 147).
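
A minimal illustration of how the two criteria might apply is the familiar pairing proof for the sum of the first n positive integers. It is offered here as an assumption about how the model works on a simple case, not as part of the passages quoted above. The characterizing property assumed is the symmetry of the sequence 1, 2, ..., n under reversal; the arithmetic progression a_k with common difference d is a hypothetical "deformation" chosen for the example.

```latex
\begin{align*}
S &= 1 + 2 + \cdots + n\\
S &= n + (n-1) + \cdots + 1\\
2S &= \underbrace{(n+1) + (n+1) + \cdots + (n+1)}_{n \text{ terms}} = n(n+1)
\quad\Longrightarrow\quad S = \frac{n(n+1)}{2}
\end{align*}
% Deformation: replace the terms by an arithmetic progression a_k = a_1 + (k-1)d
\begin{align*}
a_k + a_{n+1-k} &= 2a_1 + (n-1)d = a_1 + a_n \quad (1 \le k \le n)
\quad\Longrightarrow\quad \sum_{k=1}^{n} a_k = \frac{n\,(a_1 + a_n)}{2}
\end{align*}
```

On this reading, the pairing proof would count as explanatory because the result visibly depends on the symmetry, and varying the sequence yields a corresponding theorem by a deformation of the same proof. A proof by induction on n would establish the same formula without appeal to that property.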

Steiner's model was criticized by Resnik & Kushner 1987, who questioned the absolute distinction between explanatory and non-explanatory proofs and argued that such a distinction can only be context-dependent. They also provided counterexamples to the criteria defended by Steiner. Hafner & Mancosu 2005 argue that Resnik and Kushner's criticisms are insufficient as a challenge to Steiner, for they ascribe explanatoriness to specific proofs on the basis not of evaluations given by practicing mathematicians but of the authors' own intuitions. By contrast, Hafner and Mancosu build their case against Steiner using a case of explanation from real analysis, recognized as such in mathematical practice, which concerns the proof of Kummer's convergence criterion. They argue that the explanatoriness of the proof in question cannot be accounted for in Steiner's model and, more importantly, they use the case study to scrutinize in detail the various conceptual components of the model. Further discussion of Steiner's account, aimed at its improvement, is provided in Weber & Verhoeven 2002.

6.2 A holistic model of explanation: Kitcher

Kitcher is a well-known defender of an account of scientific explanation as theoretical unification. He sees one of the virtues of his viewpoint to be that it can also be applied to explanation in mathematics, unlike other theories of scientific explanation whose central concepts, say causality or laws of nature, do not seem relevant to mathematics. Kitcher has not devoted any single article to mathematical explanation, and thus his position can only be gathered from what he says about mathematics in his major articles on scientific explanation. In his later work, such as Kitcher 1989, he uses unification as the overarching model for explanation both in science and mathematics:

The fact that the unification approach provides an account of explanation, and explanatory asymmetries, in mathematics stands to its credit. (Kitcher 1989, 437)

Kitcher claims that behind the account of explanation given by Hempel's covering law model, the official model of explanation for logical positivism, there was an unofficial model which saw explanation as unification. What should one expect from an account of explanation? Kitcher 1981 points out two things. First, a theory of explanation should account for how science advances our understanding of the world. Second, it should help us in evaluating or arbitrating disputes in science. He claims that the covering law model fails on both counts and he proposes that his unification account fares much better.

Kitcher found inspiration in Friedman 1974, where Friedman put forward the idea that science achieves understanding of the world by reducing the number of facts we take as brute:

this is the essence of scientific explanation — science increases our understanding of the world by reducing the total number of independent phenomena that we have to accept as ultimate or given. A world with fewer independent phenomena is, other things equal, more comprehensible than one with more. (Friedman 1974, 15)

Friedman had already tried to make this intuition more precise by substituting linguistic descriptions for the notions of phenomena and laws. Kitcher disagrees with the specific details of Friedman's proposal but thinks that the general intuition is correct. He modifies Friedman's proposal by emphasizing that what lies behind unification is the reduction of the number of argument patterns used in providing explanations while being as comprehensive as possible in the number of phenomena explained:

Understanding the phenomena is not simply a matter of reducing the “fundamental incomprehensibilities” but of seeing connections, common patterns, in what initially appeared to be different situations. Here the switch in conception from premise-conclusion pairs to derivations proves vital. Science advances our understanding of nature by showing us how to derive descriptions of many phenomena, using the same patterns of derivation again and again, and, in demonstrating this, it teaches us how to reduce the number of types of facts that we have to accept as ultimate (or brute). So the criterion of unification I shall try to articulate will be based on the idea that E(K) is a set of derivations that makes the best tradeoff between minimizing the number of patterns of derivation employed and maximizing the number of conclusions generated. (Kitcher 1989, 432)

Let us make this a little more formal. Let us start with a set K of beliefs assumed to be consistent and deductively closed (informally one can think of this as a set of statements endorsed by an ideal scientific community at a specific moment in time; Kitcher 1981, 75). A systematization of K is any set of arguments that derive some sentences in K from other sentences in K. The explanatory store over K, E(K), is the best systematization of K (Kitcher here makes an idealization by claiming that E(K) is unique). Corresponding to different systematizations we have different degrees of unification. The highest degree of unification is that given by E(K). But according to what criteria can a systematization be judged to be the best? There are three factors: the number of patterns, the stringency of the patterns and the set of consequences derivable from the systematization.
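
The definitions just given can be set out schematically. The notation is introduced here for illustration only and is not Kitcher's: Cn for deductive closure, Σ for a systematization, d_i for its derivations, and C(Σ) for the set of conclusions it generates. The text supplies no numerical weighting of the three factors, so the last line records the comparison only informally.

```latex
\begin{align*}
&K \nvdash \bot, \qquad \mathrm{Cn}(K) = K
  && \text{($K$ is consistent and deductively closed)}\\
&\Sigma = \{d_i\}_{i \in I}, \quad
  d_i = \langle \psi_{i,1}, \ldots, \psi_{i,m_i} \rangle, \quad \psi_{i,j} \in K
  && \text{($\Sigma$ is a systematization of $K$)}\\
&C(\Sigma) = \{\psi_{i,m_i} : i \in I\} \subseteq K
  && \text{(conclusions generated by $\Sigma$)}\\
&E(K) = \text{the best systematization } \Sigma \text{ of } K
  && \text{(the explanatory store over $K$)}
\end{align*}
% "Best" is judged by three factors: fewer patterns of derivation,
% more stringent patterns, and a larger set of conclusions C(\Sigma).
```

Each d_i is a derivation, a finite sequence of sentences whose last member is its conclusion, rather than a bare premise-conclusion pair. This matches the switch in conception that Kitcher stresses in the passage quoted above.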

We cannot enter here into the technicalities of Kitcher's model. Unlike Steiner's model of mathematical explanation, Kitcher's account of mathematical explanation has not been extensively discussed (in contrast to the extensive discussion of his model in the context of general philosophy of science). Tappenden 2005 offers a general discussion but not a detailed analysis. The only exception is Hafner & Mancosu 2008, where Kitcher's model is tested in light of Brumfiel's case from real algebraic geometry, described in section 4. The authors argue that Kitcher's model makes predictions about explanatoriness that go against specific cases in mathematical practice.

7. Conclusion

Despite the impressive historical pedigree of the problem of mathematical explanations, both in science and mathematics, work in analytic philosophy in this area is just beginning.

Mathematical explanations of empirical facts have not been sufficiently studied. We badly need more detailed case studies in order to understand better the variety of explanatory roles that mathematics can play in empirical contexts. The philosophical pay-offs might come from at least three different directions. First, one can expect results in the direction of a better understanding of the applicability of mathematics to the world. Indeed, the puzzle of the ‘unreasonable effectiveness’ of mathematics in discovering and accounting for the laws of the physical world (Wigner 1967, Steiner 1998 and 2005) can only be resolved if we understand how mathematics helps in scientific explanation. Second, the study of mathematical explanations of scientific facts will serve as a test for theories of scientific explanation, in particular those that assume that explanation in natural science is causal explanation. Third, philosophical benefits might also emerge in the metaphysical arena by improved exploitation of various forms of the indispensability argument. Whether any such argument is going to be successful remains to be seen, but the discussion will yield philosophical benefits in forcing, for instance, the nominalist to take a stand on how to account for the explanatoriness of mathematics in the empirical sciences.

In the case of mathematical explanations of mathematical facts, too, we need to analyze more case studies carefully in order to get a better grasp of the varieties of mathematical explanation. Previous theories of mathematical explanation proceeded top-down, that is, by first providing a general model without much concern for describing the phenomenology from mathematical practice that the theory should account for. Recent work has shown that it might be more fruitful to proceed bottom-up, that is, by first providing a good sample of case studies before proposing a single encompassing model of mathematical explanation. Indeed, this kind of work might also lead to the conclusion that mathematical explanations are heterogeneous and that no single theory will encompass them all. It is hoped that the philosophical pay-offs of the work on mathematical explanation will come from the following areas. First, models of mathematical explanation can be used to test models of scientific explanation. Theories of scientific explanation aim at capturing ‘scientific’ explanations in any area of knowledge, not just explanations in the natural sciences. If they cannot accommodate mathematical explanations, this will show important limitations of the theories in question. On the other hand, if no such theory is able to encompass mathematical explanations and explanations in natural science under a single model, this will point to important differences between science and mathematics. Second, as has become apparent from our exposition, accounting for mathematical explanations of scientific facts will very likely require an account of mathematical explanations of mathematical facts. Third, ‘explanatoriness’ is only one virtue among those that an epistemology of mathematics that does not limit itself to traditional debates about justifying axioms can fruitfully investigate. Indeed, it is clear that ‘explanation’ is closely connected to other notions such as ‘generality’, ‘visualizability’, ‘mathematical understanding’, ‘purity of methods’, ‘conceptual fruitfulness’ etc. The epistemological analysis of these important notions informing mathematical practice, and of the connections among them, has only recently been taken up in earnest (for recent work in this direction see the volumes Mancosu & others 2005 and Mancosu 2008a).


Related Entries

Aristotle, Special Topics: causality | mathematics, philosophy of: indispensability arguments in the | models in science | Newton, Isaac: Philosophiae Naturalis Principia Mathematica | scientific explanation | scientific unity


While this paper was originally written for the Stanford Encyclopedia of Philosophy, in some cases I have used verbatim passages from my previous publications (especially 2008b). Many thanks to Justin Bledin and Chris Pincock for useful comments on a previous draft.