# Combinatory Logic

*First published Fri Nov 14, 2008*

Combinatory logic (henceforth: CL) is an *elegant* and *powerful*
logical theory that is connected to many areas of logic, and has found
applications in other disciplines, especially, in computer science and
mathematics.

CL was originally invented as a continuation of the reduction of the set of
logical constants to a singleton in classical first-order logic (FOL). CL
untangles *the problem of substitution*, because formulas can be
prepared for the *elimination of bound variables* by inserting
combinators. Sometimes, bound variables are thought to signify
“ontological commitments.”
A philosophical rôle of CL is to show the variability of the ontological
involvements a theory has.

Substitution is a crucial operation not only in first-order logics, but also in higher-order logics, as well as in other formal systems that contain a variable binding operator, such as the λ-calculi and the ε-calculus. Indeed, carrying out substitution correctly is particularly pressing in λ-calculi and in the closely related functional programming languages. CL can emulate λ-abstraction despite the fact that CL has no variable binding operators, which makes CL a particularly suitable language for compilers of functional programming languages.

The connection to λ-calculi might suggest—correctly—that
CL is sufficiently expressive to formalize
recursive functions (i.e., *computable
functions*) and arithmetic. Consequently, CL is susceptible to
Gödel-type incompleteness theorems.

CL is an archetypical *term rewriting system* (TRS). These systems
comprise a wide range of formal calculi from syntactic specifications of
programming languages and context-free grammars to Markov algorithms; even
some number theoretic problems may be viewed as special instances of
questions about TRSs. Several notions and proof techniques that were
originally invented for CL, later turned out to be useful in applications to
less well-understood TRSs.

CL is connected to *nonclassical logics*
via typing. First, a correspondence between formulas that are
provable in the implicational fragment of intuitionistic logic and the typable
combinatory terms was discovered. Then the isomorphism was generalized to
other combinatory bases and implicational logics (such as the logic of
relevant implication, exponential-free linear logic, affine logic, etc.).

*Self-reference* factors into some paradoxes, most interestingly into
“
The liar” and
Russell's paradox. The set theoretical
understanding of functions also discourages the idea of self-application.
Thus it is remarkable that pure untyped CL does not exclude self-application
of functions. Moreover, its mathematical models showed that a theory in which
functions can become their own arguments is completely sensible, in addition
to being consistent (what was established earlier using proof theoretic
methods).

- 1. Schönfinkel's elimination of bound variables
- 2. Combinatory terms and their main properties
- 3. Nonclassical logics and typed CL
- 4. Models
- 5. Computational functions and arithmetic
- Bibliography
- Other Internet Resources
- Related Entries

## 1. Schönfinkel's elimination of bound variables

### 1.1 The problem of substitution

Classical first-order logic includes *quantifiers* that are denoted by
∀ (“for all”) and ∃ (“there is a”). A
simple sentence such as “All birds are animals” may be formalized
as ∀*x*(*Bx* ⊃ *Ax*), where *x* is a
variable, *B* and *A* are one-place predicates, and ⊃ is a
(material) implication. The occurrences of the variables in
∀*x*(*Bx* ⊃ *Ax*) are bound, whereas
those in the open formula *Bx* ⊃ *Ax* are free. If
we assume that *t* (“Tweety”) is a name constant, then an
instance of the above sentence is *Bt* ⊃ *At*, that
may be read as “Tweety is an animal, provided Tweety is a bird.”
This illustrates that the instantiation of a (universal) quantifier involves
substitution.

Due to the simplicity of the example, the *substitution* of *t* for
*x* in *B* and in *A* seems to be easy to understand and to
perform. However, a definition of substitution for FOL (and in general, for
an abstract syntax, that is, for a language with a variable binding operator)
has to guarantee that no *free* occurrence of a variable in the
substituted expression becomes *bound* in the resulting expression.

To see what can go wrong, let us consider the (open) formula
∀*x*(*Rxy* ∧ *Rxr*), where *R* is a
two-place predicate, *r* is a name constant abbreviating
“Russell” and ∧ is conjunction.
∀*x*(*Rxy* ∧ *Rxr*) contains a free
occurrence of *y* (that is, *y* is a free variable of the formula),
however, *y* is not free for a substitution of a term that contains a
free occurrence of *x*, for instance, *x* itself. More formally,
the occurrence of *y* in the second argument place of *R* in
∀*x*(*Rxy* ∧ *Rxr*) is not bound by a
quantifier (the only quantifier of the formula), whereas
∀*x*(*Rxx* ∧ *Rxr*) is a closed formula,
that is, it contains no free occurrences of variables. Informally, the
following natural language sentences could be thought of as interpretations
of the previous formulas. “Everybody reads him and Russell,”
(where ‘him’ is deictic, or perhaps, anaphoric) and
“Everybody reads himself and Russell.” Obviously, the meanings
of the two sentences are vastly different, even if we assume that everybody
pens something. As a contrast,
∀*x*(*Rxw* ∧ *Rxr*) exhibits an
unproblematic substitution of the name constant *w* for the free
occurrence of *y*. (The expression, perhaps, formalizes the sentence
“Everybody reads Wittgenstein and Russell.”) These examples are
meant to demonstrate the more complex part of the problem Schönfinkel
set out to solve, and for what he invented
CL.^{[1]}

### 1.2 The operators “nextand” and
“*U*”

A well-known result about *classical sentential logic* (SL) is that all
truth-functions can be expressed in terms of ¬ and ∧ (or of
¬ and ∨, etc.). A minimal sufficient set of connectives, however,
can contain just one connective such as
∣
(“nand,” what is often called, Sheffer's stroke), or ↓
(“nor”). “Nand” is “not-and,” in other
words,
*A* ∣ *B*
is defined as ¬(*A* ∧ *B*), where *A*, *B* range over
formulas and ¬ is *negation*. Going into the other direction, if
∣
is a primitive, then ¬ *A* is definable as
*A* ∣ *A*.
Although formulas with numerous
vertical lines may quickly become visually confusing and hard to parse, it is
straightforward to prove that
∣
alone is sufficiently expressive to define all the truth-functions.

Schönfinkel's aim was to minimize the number of logical constants that
are required for a formalization of FOL, just as Sheffer (indeed, already
Ch. S. Peirce) did for classical propositional logic. One of the
two quantifiers mentioned above may be assumed to be defined without loss of
generality. Let us say, ∃*x A* is an abbreviation for
¬∀*x* ¬*A*. Even if ¬ and the rest of the
connectives are traded in for
∣ ,
two logical constants remain: ∀ and
∣ .
A further pressing issue is that quantifiers may be nested
(i.e., the scope of a quantifier may fully contain the scope of another
quantifier), and the variable bindings (that could be visualized by drawing
lines between quantifiers and the variables they bind) may get quite
intertwined. Keeping for a moment the familiar logical constants, we may
consider the following formula that hints at the emerging
difficulties—when the question is tackled in its full
generality.^{[2]}

∀x(∃y(Py∧Bxy) ⊃ ∃y(Py∧Bxy∧ ∀z((Rz∧Ozy) ⊃ ¬Cz)))

∀*x* binds all occurrences of *x*; the variables in the
second argument place of the two *B*s are bound by one of the two
∃*y*'s, the latter of which interacts with ∀*z* via
*Ozy*.

Predicates have a fixed finite arity in FOL, and nothing precludes binding
at once a variable in the first argument of one predicate and in the second
argument of another predicate. (Indeed, FOL would loose some of its
expressiveness, if bindings of this sort would be excluded without some means
to compensate for them.) These difficulties persist when a formula is
transformed into a(n equivalent) formula in *prenex normal form*. As
long as the variable bindings can interweave and braid into arbitrarily
complex patterns, there seems to be no clear strategy for the elimination of
bound variables. (Note that free variables in open formulas—in a
sense—behave like name constants, and their elimination is neither
intended, nor achieved in the procedures described here.)

Schönfinkel's ingenuity was that he introduced combinators to untangle
variable bindings. The *combinators* S,
K, I,
B and C (in contemporary
notation) are his, and he established that S and
K suffice to define all the other combinators.
In effect, he also defined an algorithm to carry out the elimination of bound
variables, which is essentially one of the algorithms used nowadays to define
*bracket abstraction* in
CL.^{[3]}

Schönfinkel introduced a new logical constant *U*, that
expresses the *disjointness* of two classes. For
instance, *UPQ* may be written in usual FOL notation as
¬∃*x*(*Px* ∧ *Qx*),
when *P* and *Q* are one-place predicates. (The formula may
be thought to formalize the natural language sentence “No
parrots are quiet.”) In the process of the elimination of the
bound variables, *UXY* is obtained from an expression that
contains ‘*Xx*’ and ‘*Yx*’,
where *x* does not occur in *X* or *Y*, and if
*X* and *Y* happen to be *n*-ary predicates with
*n* ≥ 2, then *x* occurs (only) in their last
argument place. For example, “Nobody reads Aristotle and
Plato” can be formalized as
¬∃*x*(*Rxa* ∧ *Rxp*),
where *a* and *p* are name constants that stand for
“Aristotle” and “Plato,” respectively. This
formula
*cannot* be written as *U*(*Ra*)(*Rp*). On the other
hand, “There is nobody whom both Russell and Wittgenstein read,”
that is, ¬∃*x*(*Rrx* ∧*Rwx*) turns into
*U*(*Rr*)(*Rw*), where the parentheses delineate the arguments
of *U*. Often, the expressions *X* and *Y* (in *UXY*)
consist of predicates (and name constants) together with combinators and
other *U*'s.

It is useful to have a notation for “nextand” (i.e.,
“not-exists-and”) without assuming either that *x* has a free
occurrence in the expressions joined, or that if it has one, then it is the
last component of the expressions. Following Schönfinkel, we use
∣^{x}
for the “nextand”
*operator* that binds *x*. The notation
∣^{−} (where ^{−} is the place
for a variable) closely resembles the Sheffer stroke. Indeed,
Schönfinkel achieved the reduction of the set of logical constants for
FOL to a singleton
{ ∣^{−} },
because every formula of FOL is equivalent to a formula that contains only
“nextand.”

A formula ∀*x A* is usually defined to be well-formed in FOL even
if *A* has no free occurrences of *x*. Then, of course,
∀*x A* is equivalent to *A* as well as to ∃*x A*,
and such quantifiers are called *vacuous*. In order to show that any
formula can be rewritten into an equivalent formula that contains only
“nextand,” it is sufficient to inspect the following definitions
for ¬, ∨ and ∀—that are due to Schönfinkel.

¬ A⇔ _{df}A∣^{x}AA∨B⇔ _{df}( A∣^{y}A) ∣^{x}(B∣^{y}B)∀ xAx⇔ _{df}( Ax∣^{y}Ax) ∣^{x}(Ax∣^{y}Ax)

The definition for ¬, for instance, may be justified by the following
equivalences. *A* ⇔ *A* ∧ *A*,
*A* ∧ *A* ⇔ ∃*x*(*A* ∧ *A*),
hence by replacement
¬*A* ⇔ ¬∃*x*(*A* ∧ *A*).

Now we give a concrete example to illustrate how to turn a FOL formula into
one that contains only ∣^{−} , and then how to eliminate the
bound variables using *U* and combinators. To put excitement into the
process, we start with the sentence in (#1).

(#1) For every natural number there is a greater prime.

A straightforward formalization of this sentence—on the foreground of
the domain of numbers—is the formula in (#2), (where
‘*Nx*’ stands for “*x* is a natural number,”
‘*Px*’ stands for “*x* is a prime” and
‘*Gxy*’ is to be read as “*x* is greater
that *y*”).

(#2) ∀y∃x(Ny⊃ (Px∧Gxy))

This formula is equivalent to
∀*y*(*Ny* ⊃ ∃*x*(*Px* ∧ *Gxy*))
and further to
¬∃*y* ¬(*Ny* ⊃ ∃*x*(*Px* ∧ *Gxy*)).
In one or two more steps, we get
*Ny* ∣^{y} (*Px* ∣^{x} *Gxy*).
(Expressions are considered to be *grouped* to the left unless
parentheses indicate otherwise. E.g., *Gxy* is ((*Gx*)*y*)
not *G*(*xy*) as could have been, perhaps, expected based on the
most common way of arranging parentheses in FOL formulas.) Unfortunately,
neither ∣^{x} nor ∣^{y} can be replaced
by *U* in the last expression. However, if the arguments of *G*
were permuted then the former reduction could be carried out. One of the
combinators, C does exactly what is
needed: *Gxy* can be changed to C*Gyx*
(see the definition of combinators in section 2.1). That is, we
have *Ny* ∣^{y} (*Px* ∣^{x} C*Gyx*), and then *Ny* ∣^{y} *UP*(C*Gy*).^{[4]}
The expression may give the impression that
*y* is the last component of
*UP*(C*Gy*),
which is the second argument of
∣^{y} ,
but it is not so. The grouping within
expressions cannot be disregarded, and another combinator,
B is needed to turn *UP*(C*Gy*)
into the desired form B(*UP*)(C*G*)*y*.
From *Ny* ∣^{y} B (*UP*)(C*G*)*y*,
we get
*UN*(B(*UP*)(C*G*))
in one more step. This expression is
completely free of variables, and it also makes the *renaming of bound
variables* in FOL easily comprehensible: given two sequences of (distinct)
variables that are different in their first two elements, the reversal of the
above process yields formulas that are (logically equivalent) alphabetic
variants of the formula in (#2).

The expression
*UN*(B(*UP*)(C*G*))
may look “unfamiliar” when
compared to formulas of FOL, but notation—to a large extent—is
a matter of convention. It may be interesting to note that the first
*U* is followed by its two arguments, however, the second *U*
is not. B(*UP*) is a subexpression, but
*UP*(C*G*) is not a subexpressions of
*UN*(B(*UP*)(C*G*)).
Furthermore, the whole expression can
be transformed into *XNPG* using combinators, where *X* is
composed of *U*s and combinators only. Such an *X* encodes the
logical content of the formula with the predicates being
arguments.^{[5]}

The expressions obtained via the transformations outlined above quickly
become lengthy—as trying to rewrite a simple FOL sentence such as
∃*x*(*Px* ∧ *Qx*) can
show.^{[6]} However,
this does not diminish the importance of Schönfinkel's theoretical
results. A slight increase (if any) in the length of the expressions is not
even an inconvenience, let alone an impediment in the era of computers with
petabytes of memory.

It seems that Schönfinkel's reduction procedure for FOL is not widely known. As a measure of how widely Sheffer's and Schönfinkel's reductions are known, we appeal to the fact that the first is part of standard intro courses in logic, whereas the second is not. Undoubtedly, one of the reasons for this is that Schönfinkel's process to eliminate bound variables is conceptually more opulent than defining a few truth functions from ∣ (or ↓). Another reason may be that Schönfinkel, perhaps, did not place a sufficiently strong emphasis on the intermediate step that allows the elimination of all other logical connectives and quantifiers via “nextand.” We may also note that although “nextand” is an operator in the standard logical sense, it is binary—unlike ∀ and ∃, which are singulary.

If
*A* ∣ *B* ⇔_{df}
¬(*A*∧*B*) is added as a definition to SL, then the result
is a *conservative extension*, and it becomes provable that for any
formula *A*(*p*_{0}, …, *p*_{n}) (i.e.,
for a formula containing the displayed propositional variables and some
connectives) there is a formula *B*(*p*_{0},
…,
*p*_{n}) containing only the connective
∣ .
Furthermore, *B*(*p*_{0}, …, *p*_{n})
⇔ *A*(*p*_{0}, …,
*p*_{n}) itself is provable.
∣
is, of course, interpreted as the “nand” truth function.
“Nand” as a binary connective or as a binary truth function is of
the same sort of object as conjunction, disjunction, etc.

The first stage in Schönfinkel's extension of FOL is analogous. The
addition of
∣^{−}
is a conservative extension
of FOL, and every occurrence of
∣^{−}
can be eliminated. (We noted that
∣^{−}
is a binary operator, and so it may be thought to
combine a quantifier (∃) with connectives (¬, ∧), but
∣^{−}
does not introduce any objects that are not definable in FOL.)

The second stage in Schönfinkel's extension of FOL is slightly
different. *UXY* is definable in FOL only for one-place predicates
*P* and *Q* (or for predicates of higher arity when the variable in
their last argument is bound). Thus neither *U* is definable in
general, nor the combinators are definable.

The elimination of bound variables goes beyond the resources of FOL. The
combinators are not only undefinable, but they are new kinds of
objects—which are absent in FOL itself. Also, the intermediate steps
of the bound variable elimination procedure presuppose that functions of
several arguments can be viewed as functions in one variable, and the other
way
around.^{[7]}
Enriching a presentation of FOL with predicate letters that have sufficiently many
arguments in the right order would be more or less unproblematic, and it
would add objects to the language that would have the same sort of
interpretation as other predicates. A potential problem though is that for
each predicate, infinitely many (ℵ_{0} many) new predicates
would be needed—together with axioms stipulating the intended
equivalences between the meaning of the predicates. Notationally, these
steps amount to padding predicate symbols with extra arguments, omitting some
arguments, as well as permuting and regrouping the arguments. Although such
additions may look superficial, it is crucial for the understanding of
Schönfinkel's procedure to eliminate bound variables to view formulas as
structured strings of
symbols.^{[8]}

In conclusion to this section, it is important to emphasize that there are
*no questions of consistency* with respect to the above reduction
process, because it can be viewed—or described in contemporary
terms—as a *well-defined algorithm*. It is a completely different
issue that if we consider the language of FOL expanded with combinators, then
the resulting system is inconsistent, because CL is powerful enough to define
the fixed point of any function. The effect of having fixed points for all
functions—including truth functions—may be thought to amount to
adding certain biconditionals (which may or may not be valid) as axioms.
Notably, both FOL and (pure) CL are consistent.

## 2. Combinatory terms and their main properties

### 2.1 Reduction, equality and their formalizations

Cantor's and Russell's paradoxes that were discovered in the late 19th–early 20th century both involve self-membership of a set. The ramified theory of types due to Whitehead and Russell, and ZF (the formalization of set theory named after Zermelo and Fraenkel) exclude self-membership. However, there seems to have been a desire all the time to have a theory that allows self-membership or self-application. Indeed, one of the motivations for Curry to further develop CL was the goal to construct a formal language that allows a wide range of expressions to be well-formed, perhaps, including some expression, which under some interpretations turn out to be meaningless. (This idea may be compared to the Bernays–von Neumann–Gödel formalization of set theory, that does not contain the axiom of foundation. The Russell class simply cannot be proven to be a set, that is, it is a proper class.)

A few natural language examples provide a convenient illustration to clarify the difference between (1), that is a well-formed (but meaningless) expression and (2), which is a meaningful (but ill-formed) sentence. (The meaningfulness of (2), of course, should be taken with a grain of salt: in reality, Gödel proved the system of PM to be incomplete in 1930.)

- The derivative of
λ
*x*(*x*^{2}+ 4*x*− 6) wishes to declare that functions are smart. - Peano arithmetics prove incomplete by Gödel at 1930.

After these informal motivations, we turn to CL proper and introduce some of its notions (more) formally.

The *objects* in CL are called
*terms*.^{[9]}
Terms may be thought to be interpreted as
functions (as further explained in section 4.1). *Primitive terms*
comprise *variables* and *constants*, whereas *compound terms*
are formed by combining terms. Usually, a denumerable set (i.e., a set with
cardinality ℵ_{0}) of variables is included, and the constants
include some *combinators*. (We use *x*, *y*, *z*,
*v*, *w*, *u*, *x*_{0}, … as variables in
the object language, and *M*, *N*, *P*, *Q*, … as
metavariables that range over terms.)

*Terms* are inductively defined as follows.

(t1) Ifxis a variable, thenxis a term,

(t2) ifcis a constant, thencis a term,

(t3) ifMandNare terms, then (MN) is a term.

In the above definition, (t3) conceals the binary operation that conjoins the
two terms *M* and *N*. This operation is called
*application*, and it is often denoted by juxtaposition, that is, by
placing its two arguments next to each other as in (*MN*).

Application is not assumed to possess additional properties (such as
commutativity), because its intended interpretation is *function*
application. For instance, ((*vw*)*u*) and (*v*(*wu*))
are distinct terms—just as the derivative of
λ*x*.*x*^{2} + 4*x* − 6
applied to 8 (that is,
(λ*x*.2*x* + 4)8 = 20) is different from
derivative of 90 (that is,
(8^{2} + 32 − 6)′ = 0).
Using λ notation, the two terms in the example may be expressed as
((λ*y*.*y*′)(λ*x*.*x*^{2} + 4*x* − 6))8
vs
(λ*y*.*y*′)((λ*x*.*x*^{2} + 4*x* − 6)8).

If terms are viewed as structured strings (where parentheses show grouping),
then the number of distinct terms associated to a string of length *n*
is the *Catalan number* *C*_{n−1}. For a
non-negative integer *n* (i.e., for
*n* ∈ ℕ),

C_{n}=

1 n+ 1( 2 nn) .

The first seven Catalan numbers are C_{0} = 1,
C_{1} = 1, C_{2} = 2,
C_{3} = 5, C_{4} = 14,
C_{5} = 42 and C_{6} = 132. As an
illustration we may take—for simplicity—strings consisting of
*x*'s, because the terms are to differ only in their grouping. Clearly,
if the term is *x* or *xx*, that is of length 1 or 2, then there is
only one way to form a term, that is, there exists just one possible term in
each case. If we start with three *x*'s, then we may form
(*xx*)*x* or *x*(*xx*). If the length of the term is 4,
then the five terms are: *xxxx*, *x*(*xx*)*x*,
*xx*(*xx*), *x*(*xxx*) and *x*(*x*(*xx*)).
(It is a useful exercise to try to list the 14 distinct terms that can be
formed from 5 *x*'s.)

The usual notational convention in CL is to *drop parentheses* from left
associated terms together with the outmost pair. For instance, *xyz*
would be fully written as ((*xy*)*z*), whereas *xy*(*xz*)
and (*xy*)(*xz*) are both “shorthand versions” of the
term ((*xy*)(*xz*)) (but not of *xyxz*). Grouping in terms
delineates subterms. For instance, *xy* is a subterm of each of these
terms, whereas *yz* and *yx* are subterms of neither term.

*Subterms* of a term are recursively defined as follows.

(s1)Mis a subterm ofM,

(s2) ifMis a subterm ofNor ofP, thenMis a subterm ofNP.

Incidentally, the notion of free variables is straightforwardly definable
now: *x* is a free variable of *M* if *x* is a subterm
of *M*. The set of free variables of *M* is denoted by
fv(*M*).

All terms are interpreted as functions, and combinators are functions too.
Similarly, to some numerical and geometrical functions, that can be described
and grasped easily, the combinators that are frequently encountered can be
characterized as perspicuous transformations on terms.
(Sans serif letters denote combinators and > denotes *one-step
reduction*.)

Definition. (Axioms of some well-known combinators)

S xyz>xz(yz)K xy>xI x>xB xyz>x(yz)T xy>yxC xyz>xzyW xy>xyyM x>xxY x>x(Yx)J xyzv>xy(xvz)B′ xyz>y(xz)V xyz>zxy

These axioms tacitly specify the *arity* of a combinator as well as their
*reduction* (or *contraction*) pattern. Perhaps, the simplest
combinator is the *identity* combinator
I,
that applied to an argument *x* returns the same *x*.
K
applied to *x* is a constant function, because
when it is further applied to *y*, it yields *x* as a result, that
is,
K
is a *cancellator* with respect to its
second argument.
W and M
are *duplicators*, because in the result of their application one of the
arguments (always) appears
twice.^{[10]}
C, T
and V are *permutators*,
because they change the order of some of their arguments.
B
is an *associator*, because
B*xyz*
turns into a term in which *y* is applied
to *z* before *x* is applied to the result.
Y
is the *fixed point* combinator, because for any
function *x*,
Y*x* is the fixed point
of that function (see section 2.3). The combinator
B′
is an associator and a permutator, whereas
S and
J are also
duplicators.
S
is very special and it is called
the *strong composition* combinator, because when applied to two
functions, let us say, *f* and *g*, as well as *x*, then
the resulting term *gx*(*fx*) expresses the composition of *g*
and *f* both applied to the same argument *x*.

These informal explications did not mention any restrictions on the sort of
functions *x*, *y*, *z*, *f*, *g*, … may be.
However, the axioms above (formally) limit the applicability of the
combinators to variables. Intuitively, we would like to say that given any
terms, that is, any functions *M* and *N*,
W*MN* one-step reduces to *MNN*. For example,
*M* may be K and *N* may be *yy*,
and then
WK(*yy*) > K(*yy*)(*yy*).
The latter term suggests a further one-step reduction, and indeed we
might be interested in successive one-step reductions—such as
those leading from
WK(*yy*)
to *yy*. A way to achieve these goals is to formalize (a theory
of) CL as an extension of the standard
*inequational logic* by adding certain axioms and rules.

Inequational Calculus for CL(CL≥).

M≥MS MNP≥MP(NP)K MN≥M

M≥NN≥PM≥P

M≥NMP≥NP

M≥NPM≥PN

The use of metavariables encompasses *substitution* (that we illustrated
above on the term
W*MN*).
The identity axiom
and the rule of transitivity imply that ≥ is a transitive and reflexive
relation. The last two rules characterize application as an operation that
is *monotone* in both of its argument places. CL≥ includes only
S and
K,
because the
other combinators are definable from them—as we already mentioned in
section 1.2, and as we explain more precisely toward the end of this
section.

The set of combinators
{ S,
K }
is called a *combinatory base*, that is,
these two combinators are the undefined constants of CL≥. To give an idea
of a *proof* in this calculus, the following steps may be pieced
together to prove
SKK(KKK) ≥ K.
KKK ≥ K
is an instance of an axiom. Then
SKK(KKK) ≥ SKKK
is obtained by right monotonicity, and further,
SKK(KKK) ≥ K
results by instances of the S
and
K
axioms together with applications of the transitivity rule.

The relation ≥ is called *weak reduction*, and it may be defined
alternatively as follows. (‘Weak reduction’ is a technical term
used in CL to distinguish this relation on the set of terms from some other
relations, one of which is called ‘strong reduction’.) A term
that is either of the form S*MNP* or of the
form K*MN* is a *redex*, and the leading
combinators (S and K,
respectively) are *the heads* of the redexes. If a term *Q*
contains a subterm of the form S*MNP*, then
*Q*′ obtained by replacing that subterm by *MP*(*NP*)
is a *one-step reduct* of *Q*. (Similarly, for the
redex
K*MN* and *M*.)
That is, *Q* > *Q*′ in both cases. *Reduction* then
may be defined as the reflexive transitive closure of one-step reduction.
This notion is *completely* captured by CL≥. The calculus CL≥
is *complete* in the sense that if *M* ≥ *N* in
the sense we have just described, then CL≥ proves
*M* ≥ *N*. (It is easy to see that the converse
implication is true too.)

The notion of reduction is a weaker relation than one-step reduction, and so
it is useful to distinguish a subclass of terms using the stronger relation.
A term is in *normal form* (nf) when it contains no redexes. Note that
one-step reduction need not decrease the total number of redexes that a term
contains, hence, is does not follow that every term can be turned into one in
nf after finitely many one-step reductions. Indeed, some terms do not reduce
to a term in nf.

Reduction is arguably an important relation between terms that denote
functions. The typical steps in a program execution and in other concrete
calculations are function applications rather than moves in the other
direction, what is called *expansion*. However, the notion of the
equality of functions is familiar to everybody from mathematics, and the
analogous notion has been introduced in CL too. The symmetric closure of
the reduction relation is called (weak) *equality*. A formalization of
equational CL may be obtained by extending the standard equational logic with
combinatory axioms and rules characterizing application.

Equational Calculus for CL(CL=).

M=MK MN=MS MNP=MP(NP)

M=NN=PM=P

M=NN=M

M=NMP=NP

M=NPM=PN

The first axiom and the first two rules constitute equational logic. The constants are again the combinators S and K. Note that CL= could have been defined as an extension of CL≥ by adding the rule of symmetry, that would have paralleled the description of the definition of equality from reduction. We chose instead to repeat the inequational axioms and rules with the new notation (and add the rule of symmetry) to make the two definitions self-contained and easy to grasp. (The two characterizations of = coincide—as those of ≥ did.)

CL≥ and CL= share a feature that may or may not be
desirable—depending on what sort of understanding of functions is to be
captured. To illustrate the issue, let us consider the one-place combinators
SKK and
SK(KK).
It is easy to verify that
SKK*M* ≥ *M*
and
SK(KK)*M* ≥ *M*.
However, neither
SKK ≥ SK(KK)
nor
SK(KK) ≥ SKK
is provable in CL≥; a fortiori, the equality of
the two terms in not provable in CL=. This means that CL≥ and CL=
formalize *intensional* notions of functions, where
“intensionality” implies that functions that give the same output
on the same input may remain distinguishable.

The archetypical intensional functions that one is likely to encounter are
*algorithms*. As examples, we might think of various specifications to
calculate the decimal expansion of π, or various computer programs that
behave in the same way. For instance, compilers (for one and the same
language) may differ from each other by using or not using some
optimizations, and thereby, producing programs from a given piece of code
that have identical input–output behavior but different run times.

If functions that are indistinguishable from the point of view of their
input–output behavior are to be identified, that is,
an *extensional* understanding of functions is sought, then CL≥ and
CL= have to be extended by the following rule, (where the symbol ‡
is to be replaced by ≥ or =, respectively).

Mx‡NxM‡Nwhere xis not free inMN.

### 2.2 Church–Rosser theorems and consistency theorems

The calculi CL≥ and CL= of the previous section formalize reduction and equality. However, ≥ and = have some further properties that are important when the terms are thought to stand for functions. The next theorem is one of the earliest and best-known results about CL.

Church–Rosser theorem (I).IfMreduces toNandP, then there is a termQto which bothNandPreduce.

Figure 1: Illustration for the Church–Rosser theorem (I)

If we think that reduction is like computing the value of a function, then the Church–Rosser theorem—in a first approximation—can be thought to state that the final result of a series of calculations with a term is unique—independently of the order of the steps. This is a slight overstatement though, because uniqueness implies that each series of calculations ends (or “loops” on a term). That is, if there is a unique final term, then only finitely many distinct consecutive calculation steps are possible.

A coarse analogy with elementary arithmetic operations, perhaps, can shed
some light on the situation. The addition and multiplication of natural
numbers always yield a natural number. However, if division is included then
it is no longer true that all numerical expressions evaluate to a natural
number, since 7/5 is a rational number that is not a natural one,
and *n*/0 is undefined (even if *n* were real). That is, some
numerical expressions do not evaluate to a (natural) number. Although the
analogy with combinatory terms is not very tight, it is useful. For
instance, *n*/0 (even if the domain of the function λ*n.n*/0
would be extended to permit *n* to be rational) could be implemented in
a machine by a loop (that would never terminate when executed) which would go
through an enumeration of the rational numbers trying to find an *r*
such that *r* · 0 = *n*.

The combinatory terms
WWW and
WI(WI)
are, perhaps, the
simplest examples of terms that do not have a normal form. Both terms induce
an infinite *reduction sequence*, that is, an infinite chain of
successive one-step reductions. To make the example more transparent, let us
assume for a moment that
W,
I,
C,
etc. are not defined from
K and
S, but are
primitive constants. The contraction of the only redex in
WWW
returns the same term, which shows that uniqueness
does not imply that the term is in nf. The contraction of the only redex in
WI(WI)
gives
I(WI)(WI)
that further reduces to the term we started with. A
slightly more complicated example of a term that has only infinite reduction
sequences is
Y(CKI).
This term has a reduction sequence (in which each contracted redex is headed
by
Y)
that contains infinitely many distinct terms.
To sum up, the Church–Rosser theorem, in general, does not guarantee
the uniqueness of the term *Q*. However, if *M* has *a normal
form* then that is *unique*.

The Church–Rosser theorem is often stated as follows.

Church–Rosser theorem (II).IfNandPare equal, then there is a termQto which bothNandPreduces.

Figure 2: Illustration for the Church–Rosser theorem (II)

The second form of the Church–Rosser theorem differs from the first in
its *assumption*. From the definition of equality as a superset of
reduction, it is obvious that the first form of the theorem is implied by the
second. However, despite the weaker assumption in the second formulation of
the Church–Rosser theorem, the two theorems are *equivalent*.
Equality is the symmetric closure of reduction, which means that if two terms
are equal then there is a finite path comprising reduction and expansion
steps (which decompose into one-step reductions and one-step expansions,
respectively). Then by finitely many applications of the first
Church–Rosser theorem (i.e., by induction on the length of the path
connecting *N* and *P*), the first Church–Rosser theorem
implies the second formulation.

*Modern proofs* of the Church–Rosser theorem for CL proceed
indirectly because one-step reduction fails to have the diamond property. A
binary relation *R* (e.g., reduction) is said to have the *diamond
property* when *xRy* and *xRz* imply that *yRv*
and *zRv* for some *v*. If a binary relation *R* has the
diamond property, so does its transitive closure. To exploit this insight in
the proof of the Church–Rosser theorem, a suitable *subrelation of
reduction* has to be found. The sought after subrelation should possess
the diamond property, and its reflexive transitive closure should coincide
with reduction.

The following counterexample illustrates that one-step reductions of a term may yield terms that further do not reduce to a common term in one step. SKK(KKK) > SKKK and SKK(KKK) > K(KKK)(K(KKK)), and then the potential reduction sequences are as follows.

- SKKK > KK(KK) > K
- K(KKK)(K(KKK)) > KKK > K
- K(KKK)(K(KKK)) > KK(K(KKK)) > KK(KK) > K
- K(KKK)(K(KKK)) > K(KKK)(KK) > KKK > K
- K(KKK)(K(KKK)) > K(KKK)(KK) > KK(KK) > K

The failure of the diamond property is obvious once we note that SKKK > KK(KK) (only), but K(KKK)(K(KKK)) does not reduce in one step to KK(KK).

An appropriate subrelation of reduction is the *simultaneous contraction
of a set of nonoverlapping redexes*, which is denoted by >_{sc}.
‘Nonoverlapping’ means that there are no shared subterm
occurrences between two redexes. >_{sc} includes >
because a one-step reduction of a redex may be viewed instead as
>_{sc} of a singleton set of redexes. >_{sc} is,
obviously, included in ≥ (i.e., in reduction). These two facts imply that
the reflexive transitive closure of >_{sc} is reduction—when
the tonicity of the reflexive transitive closure operation (denoted
by ^{*}) is taken into account.

(1)–(3) summarize the key inclusions between the relations mentioned.

- > ⊆ >
_{sc}⇒ >^{*}⊆ >_{sc}^{*}, - >
_{sc}⊆ ≥ ⇒ >_{sc}^{*}⊆ ≥^{*}, - >
^{*}⊆ ≥^{*}and ≥^{*}= ≥.

The central fact about ≥_{sc} is the content of the following
theorem.

Theorem. (Diamond property for>_{sc})IfM>_{sc}NandM>_{sc}Pthen there is a termQsuch that bothN>_{sc}QandP>_{sc}Q.

The proof of this theorem is an easy induction on the term *M*. The
properties of >_{sc} guarantee that one or more one-step
reductions can be performed at once, but the reductions cannot interfere (or
overlap) with each other.

The *consistency* of CL follows from the Church–Rosser theorem
together with the existence of (at least two) distinct normal forms.

Theorem. (Consistency)CL is consistent, that is, there are terms that do not reduce to each other, hence they are not equal.

Not all terms have an nf, however, many do. Examples, first of all, include
S
and
K.
(The variables, if included, of which there are ℵ_{0}
many, are all in nf.) None of these terms contains a redex, hence
they reduce only to themselves. By the Church–Rosser theorem,
it is excluded that some term *M* could reduce to both *x*
and
S
(making S equal to *x*).

The interaction between infinite reduction sequences and nfs deserves a more careful inspection though. The terms WWW, Y(CKI) and WI(WI) have only infinite reduction sequences. However, the existence of an infinite reduction sequence for a term does not imply that the term has no normal form (when the combinatory base is complete or contains a cancellator). Y(KI) reduces to KI(Y(KI)), KI(KI(Y(KI))), KI(KI(KI(Y(KI)))), … as well as to I.

A term *weakly normalizes* when it has an nf, whereas a term
*strongly normalizes* when all its reduction sequences lead to an nf
(hence, to *the nf*) of the term. A computational analogue of a strongly
normalizing term is a (nondeterministic) program that terminates on every
branch of computation, whereas termination on at least branch is akin to weak
normalization.

### 2.3 The existence of fixed points and combinatorial completeness

Schönfinkel proved that S and K suffice to define the other combinators he introduced, and we mentioned in the definition of CL≥ that the set of constants is limited to S and K, because other combinators could be defined from them.

To demonstrate the sense in which definability is understood here we consider
the example of
B. The axiom for
B is
B*xyz* > *x*(*yz*),
and if we take
S(KS)K
instead of
B,
then the following reduction sequence results.

S(KS)Kxyz> KSx(Kx)yz> S(Kx)yz> Kxz(yz) >x(yz)

The term S(KS)K is in nf, however, to be in nf is not a requirement for definability. It is more convenient to work with defining terms that are in nf, because an application of a combinator that is not in nf could be started with reducing the combinator to its normal form. (Also, there are always infinitely many combinators that reduce to a combinator.) However, note that the preference for choosing combinators in nf is not meant to imply that a combinator cannot be defined by two or more terms in nf. E.g., we give here two definitions (involving only S and K) for I.

If the constants are
S and
K, then the *combinators* are all those terms that
are formed from
S and
K
(without variables). Once we have defined
B as
S(KS)K,
we may use
B in further
definitions as an abbreviation, and we do that primarily to reduce the size
of the resulting terms as well as to preserve the transparency of the
definitions.

The following list gives definitions for the rest of the well-known combinators. (Here ‘=’ is placed between a definiendum and a definiens.)

I = SK(KK) T = B(SI)K C = B(T(BBT))(BBT) W = CSI M = SII Y = B(WI)(BWB) J = W(BC(B(BC)(B(BB)(BB)))) B′ = CB V = BCT

The definitions are easily seen to imply that all these combinators depend on both S and K, but it is not obvious from the definitions that the defined combinators are mutually independent, that is, that none of the listed combinators is definable from another one. (Clearly, some subsets suffice to define some of the combinators.) We do not intend to give an exhaustive list of interdefinability between various subsets of these combinators, but to hint at the multiplicity and intricacy of such definitions, we list a handful of them. We also introduce two further combinators S′ and R.

I = SKK I = WK I = CK(KK) B = CB′ S′ = CS S = CS′ W = S′I W = B(T(BM(BBT)))(BBT) W = C(S(CC)(CC)) R = BBT Y = BM(CBM) Y = B′(B′M)M

If the fixed point combinator Y is not taken to be a primitive, then there are various ways to define it.

Fixed point theorem.For any functionM, there is a termNsuch thatMN=N.

The proof of this theorem is easy using a fixed point combinator, because a
term that can play the rôle of *N* is Y*M*.

Some of the definitions of Y have slightly
different properties with respect to reduction. But the importance of the
fixed point combinator is that it ensures that all functions have a fixed
point and all *recursive equations* can be solved.

Curry and Turing both defined fixed point combinators (in the λ-calculus). If we consider the definitions

Y_{1}= BM(BWB) Y_{2}= W(B(BW(BT)))(W(B(BW(BT))))

(where the subscripts are to distinguish the two definitions), then we can
see that Y_{1}*M* =
*M*(Y_{1}*M*), but for
Y_{2},
Y_{2}*M*
≥
*M*(Y_{2}*M*)
holds too. In this respect,
Y_{1}
is similar to Curry's fixed point
combinator (and really, to any fixed point combinator), whereas
Y_{2}
is like Turing's fixed point combinator.

The fixed point theorem demonstrates the expressive power of CL to some
extent. However, fixed point combinators may be defined from bases without a
cancellator (as
Y_{1} and
Y_{2} show).
The full power of CL (with the base
{ S,
K })
is enunciated by the following theorem.

Theorem. (Combinatorial completeness)Iff(x_{1}, …,x) =_{n}M(whereMis a term containing no other variables than those explicitly listed), then there is a combinator X such that Xx_{1}…x_{n}reduces toM.

The theorem's assumption may be strengthened to exclude the possibility that
some *x*'s do not occur in *M*. Then the consequent may be
strengthened by adding the qualification that
X is
a relevant combinator, that is,
X is a combinator
over a base that does not contain a combinator with cancellative effect, or
equivalently,
X is a combinator over
{ B,C,
W,
I } or
{ I,
J }.
(These bases correspond to Church's preferred
λI-calculus.)

Combinatorial completeness is usually proven via defining a
“*pseudo*” λ-*abstraction* (or *bracket
abstraction*) in CL. There are various algorithms to define a bracket
abstraction operator in CL, that behaves as the λ operator does in a
λ-calculus. This operator is usually denoted by [ ] or by
λ^{*}. The algorithms differ from each other in various
aspects: (i) the set of combinators they presuppose, (ii) the length of the
resulting terms, (iii) whether they compose into (syntactic) identity with
the algorithm that translates a combinatory term into a λ-term, and
(iv) if they commute with either of the reductions or equalities.

The first algorithm, the elements of which may be found already in Schönfinkel (1924 (1967)), consists of the following clauses that are applied in the order of their listing.

(k) [ x].M= KM, wherex∉ fv(M),(i) [ x].x = I,(η) [ x].Mx=M, wherex∉ fv(M),(b) [ x].MN= BM([x].N), wherex∉ fv(M),(c) [ x].MN= C([x].M)N, wherex∉ fv(N),(s) [ x].MN= S([x].M)([x].N).

For example, if this algorithm is applied to the term
λ*xyz.x*(*yz*) (that is, to the λ-translation of
B), then the resulting term is
B.
However, if η is omitted then a much longer term
results, namely,
C(BB(BBI))(C(BBI)I).
Another algorithm consists of clauses (i), (k) and (s).

## 3. Nonclassical logics and typed CL

### 3.1 Simple types

Combinatory terms are thought of as functions, and functions are thought to
have a *domain* (a set of possible inputs) and a *codomain* (a set
of possible outputs). For example, if a unary function is considered as a set
of ordered pairs, then the domain and codomain are given by the first and
second projections, respectively.
Category theory,
where functions are components of categories (without a set
theoretic reduction assumed), retains the notions of a domain and a codomain;
moreover, every function has a unique domain and codomain.

Functions that have the same domain and codomain may be quite different,
however, by abstraction, they are of the same sort or type. As a simple
illustration, let *f*_{1} and *f*_{2} be two
functions defined as *f*_{1} = λ*x*.8·*x*
and *f*_{2} = λ*x.x*/3. If *x* is a variable
ranging over reals, then *f*_{1} and *f*_{2}
have the same domain and codomain (i.e., they have the same type
ℜ → ℜ), although
*f*_{1} ≠ *f*_{2} whenever
*x* ≠ 0. The usual notation to indicate that a
function *f* has *A* as its domain and *B* as its codomain
is *f* : *A* → *B*. It is a happy
coincidence that nowadays ‘→’ is often used in logics as a
symbol for entailment or nonclassical implication.

Given a set of basic types (that we denote by *P*), *types* are
defined as follows.

- If
*p*∈*P*then*p*is a type, - if
*A*,*B*are types then (*A*→*B*) is a type.

To distinguish these types from other types—some of which are
introduced in the next section—they are called *simple types*.

The connection between combinators and types may be explained on the example
of the identity combinator. Compound combinatory terms are formed by the
application operation. Premises of modus ponens can be joined by fusion
(denoted by
) that in the
strongest relevance logic is like the application operation.
I*x* ⊳ *x*
and so if *x*'s type is *A*, then
I*x*'s
type should imply *A*. Furthermore,
I*x*'s
type should be of the form
*XA*,
for some type *X*; then
I
can be of type *A* → *A*. In the example, we fixed *x*'s
type, however,
I
can be applied to any term, hence,
it is more accurate to say that *A* → *A* is the
type *schema* of
I, or that
I's
type can be any formula of the form of self-implication.

The type-assignment system TA_{CL} is formally defined as the
following deduction system. (When implicational formulas are considered as
types, the usual convention is to omit parentheses by association to the
right.)

Δ ⊢ S : ( A→B→C) → (A→B) → (A→C)Δ ⊢ K : A→B→A

Δ ⊢ M:A→BΘ ⊢N:AΔ, Θ ⊢ MN:B

Expressions of the form *M* : *A* above are
called *type assignments*. A characteristic feature of type-assignment
systems is that if *M* : *A* is provable then *A* is
considered to be one of the types that can be assigned to *M*. However,
a provable assignment does not preclude other types from becoming associated
to the same term *M*, that is a type assignment does not fix the type of
a term rigidly. Δ and Θ on the left-hand side of
⊢
are sets of type assignments
to variables, and they are assumed to be consistent—meaning that no
variable may be assigned two or more types.

Type assignment systems are often called *Curry-style typing systems*.
Another way to type terms is by fixing a type for each term, in which case
each term has exactly one type. Such calculi are called *Church-style
typing systems*. Then, for example, the identity combinator
I
of type (*A*
→ *A* → *A*) →
*A* → *A* → *A* is not the same as the
identity combinator
I
of type
((*B* → *B*) → *B*) →
(*B* → *B*) → *B*. The two styles of
typing have quite a lot in common, but there are certain differences between
them. In particular, no self-application is typable in a Church-style typing
system, whereas some of those terms can be assigned a type in a Curry-style
typing system. Curry-style typing systems proved very useful in establishing
various properties of CL and λ-calculi. The Church-style typing,
however, emulates the translation of a concrete functional program (without
objects) more closely.

There is no 1–1 correspondence between types and combinators in either
style of typing: not all combinators can be assigned a type, and some
implicational formulas cannot be assigned to any combinatory term. A
combinator that can be assigned a type is said to be *typable*, and a
type that can be assigned to a combinator is said to be *inhabited*.
For instance,
M
has no type, because an
implicational formula is never identical to its own antecedent. On the other
hand, ((*A* → *B*) → *A*) → *A*
is not the type of any combinator in the type assignment system
TA_{CL}. Despite (or, indeed, due to) the discrepancy between
implicational formulas and combinatory terms, classes of implicational
formulas that can be assigned to certain sets of combinatory terms coincide
with some important logics.

Theorem.A→Bis a theorem ofJ_{→}, the intuitionistic implicational logic iffM:A→Bis a provable type assignment in TA_{CL}, where the termMis built from S and K, that is,Mis a combinator.

A combinator that inhabits an implicational theorem encodes a *proof* of
that theorem in the deduction system TA_{CL}. There is an
algorithm to recover the formulas that constitute a proof of the type of the
combinator, moreover, the algorithm produces a proof that is minimal and
well-structured. The correspondence between implicational theorems of
intuitionistic logic (and their proofs) and typable closed λ-terms
(or combinators) is called the *Curry–Howard isomorphism*. The
usual notion of a proof in a Hilbert-style axiomatic system is quite lax, but
it can be tidied up to obtain the notion of *traversing proofs*. In a
traversing proof there is a 1–1 correspondence between subterms of a
combinator and the formulas in the traversing proof as well as between
applications and detachments therein (cf. Bimbó 2007).

The above correspondence can be modified for other implicational logics and
combinatory bases. The next theorem lists correspondences that obtain
between the implicational fragments of the relevance logics *R*
and *T* and some combinatory bases that are of interest in themselves.

Theorem.A→Bis a theorem ofR_{→}(orT_{→}) iffM:A→Bis a provable type assignment whereMis a combinator over { B, I, W, C } (or over { B, B′, I, S, S′ }).

The calculus TA_{CL} may be amended by adding axiom schemas for the
combinators in the two bases. (The axiom schemas of the combinators that are
not in these bases may be omitted from the calculus or simply may be
neglected in proofs.) The *new axioms* are as follows.

B : ( A→B) → (C→A) →C→BB′ : ( A→B) → (B→C) →A→CC : ( A→B→C) →B→A→CW : ( A→A→B) →A→BS′ : ( A→B) → (A→B→C) →A→CI : A→A

The combinatory base { B, C, W, I } is especially interesting, because these combinators suffice for a definition of a bracket abstraction that is equivalent to the abstraction of the λI-calculus. To put it differently, all functions that depend on all of their arguments can be defined by this base. The other base allows the definition of functions that can be described by terms in the class of the so-called hereditary right maximal terms (cf. Bimbó 2005a). Informally, the idea behind these terms is that functions can be enumerated, and then their successive applications should form a sequence in which the indexes are “globally increasing.”

A type assignment has two parts: a *term* and a *formula*. The
questions whether some term can be assigned a type and whether some type can
be assigned to a term are the problems of *typability* and of
*inhabitation*, respectively. Although these questions may be posed
about one and the same set of type assignments, the computational properties
of these problems may differ widely.

Theorem.It is decidable if a termMcan be assigned a type, that is, ifMis typable.

The theorem is stated in a rather general way without specifying exactly
which combinatory base or which modification of TA_{CL} is assumed,
because the theorem holds for any combinatory base. Indeed, there is an
algorithm that given a combinator decides if the combinator is typable, and
for a typable combinator produces a type too. Of course, in the
combinatorially complete base
{ S,
K }
all the combinators are expressible as terms
consisting of these two combinators only. However, this assumption is not
needed for a solution of typability, though it might provide an explanation
for the existence of a general algorithm.

The *problem of inhabitation* does not have a similar general solution,
because the problem of the equality of combinatory terms is undecidable.
Given a set of axiom schemas that are types of combinators with detachment as
the rule of inference, the problem of the *decidability of a logic* can
be viewed as the problem of inhabitation. Indeed, if *A* is an
implicational formula, then to decide whether *A* is a theorem amounts
to determining if there is a term (over the base that corresponds to the
axiom schemas) that can be assigned *A* as its type. (Of course, a more
sophisticated algorithm may actually produce such a term, in which case it is
easy to verify the correctness of the claim by reconstructing the proof of
the theorem.)

To see from where complications can emerge in the case of decidability, we
compare the rule of the *formation of terms* and the rule of
*detachment*. Given a combinatory base and a denumerable set of
variables, it is decidable by inspection whether a term *is* or *is
not* in the set of the generated terms. That is, all the inputs of the rule
are retained in the output as subterms of the resulting term. In contrast,
an application of detachment results in a formula that is a proper subformula
of the major premise (and in the exceptional case when the major premise is
an instance of self-identity it is identical to the minor premise). The lack
of the retention of all subformulas of premises through applications of modus
ponens is the culprit behind the difficulty of some of the decision problems
of implicational logics. A solution to the problem of inhabitation may run
into similar difficulties.

For example, the combinator K can be assigned the following type.

p→ ((q→ ((q→ (q→q)) → ((q→q) → (q→q)))) →p)

SKK
can be assigned the
type *p* → *p*. There is a proof in TA_{CL}
ending in
SKK :
*p* → *p* that does not contain the long formula
above. However, there is a proof of
SKK :
*p* → *p* that contains the above formula the second
antecedent of which is not a subformula of *p* → *p*,
indeed, the sets of the subformulas of the two formulas are disjoint. Some important
cases of the problem of inhabitation, however, are decidable.

Theorem.It is decidable if a type has an inhabitant over the base { S, K }.

This theorem amounts to the typed version of the decidability of the
implicational fragment of
*intuitionistic logic*
that is part of
Gentzen's decidability results
(dating from 1935).

Theorem.It is decidable if a type has an inhabitant over the base { I, C, B′, W }.

The theorem is the typed equivalent of the decidability of the implicational
fragment of the logic of *relevant implication*. The decidability of
*R*_{→} was proven by Kripke in 1959 together with the
decidability of the closely related *E*_{→} (the
implicational fragment of the *logic of entailment*).

The rule of substitution is built-in into the formulation of TA_{CL}
via the rule *schema* called detachment and the axiom *schemas* for
the basic combinators. It is obvious that there are formulas of *least
complexity* that are types of
S and
K, such that all the other types of
S and
K are their substitution
instances. A formula that has this property is called a *principal
type* of a combinator. Obviously, a combinator that has a principal type,
has denumerably many principal types, which are all substitution instances of
each other; hence it is justified to talk about the *principal type
schema* of a combinator. The existence of principal types for complex
combinators is not obvious, nevertheless, obtains.

Theorem.If the termMis typable, thenMhas a principal type and a principal type schema.

Principal types and principal type schemas may seem now to be interchangeable
everywhere. Thus we could take a slightly different approach and define
TA_{CL} to include axioms and the rule schema of detachment together
with the *rule of substitution*. This version of TA_{CL} would
assume the following form.

Δ ⊢ S : ( p→q→s) → (p→q) →p→sΔ ⊢ K : q→s→q

Δ ⊢ M:A→BΘ ⊢N:AΔ, Θ ⊢ MN:B

Δ ⊢ M:AΔ[ P/B] ⊢M:A[P/B]

where *P* ranges over propositional variables. (The substitution
notation is extended—in the obvious way—to sets of type
assignments.) Clearly, the two deduction systems are equivalent.

If substitution were dropped altogether, then the applicability of detachment
would become extremely limited, for instance,
SK
no longer would be typable. A compromise between having substitution
everywhere and having no substitution at all is to modify the detachment rule
so that that includes as much substitution as necessary to ensure the
applicability of the detachment rule. Such a rule (without combinatory terms
or type assignments) was invented in the 1950s by C. A. Meredith,
and it is usually called *condensed detachment*. The key to the
applicability of detachment is to find a common substitution instance of the
minor premise and of the antecedent of the major premise. This step is
called *unification*.

Notice that it is always possible to choose substitution instances of a pair
of formulas so that the sets of their propositional variables are disjoint,
because formulas are finite objects. The *most general common instance*
of two formulas *A* and *B* (that do not share a propositional
variable) is *C*, where *C* is a substitution instance of
both *A* and *B*, and propositional variables are identified by the
substitutions only if the identification is necessary to obtain a formula
that is a substitution instance of both *A* and *B*.
The *unification theorem* (specialized to simple types) implies that if
two formulas *A* and *B* have a common instance then there is a
formula *C* such that all their common instances are substitution
instances of *C*. Obviously, a pair of formulas either has no common
instance at all, or it has ℵ_{0} many most general common
instances.

A famous example of a pair of formulas that have *no common instance* is
*A* → *A* and
*A* → *A* → *B*. The instances
*p* → *p* and
*q* → *q* → *r* share no
propositional variables, however, neither *q* → *q*
nor (*q* → *r*) → *q*
→ *r* matches the shape of the second formula. To put
the problem differently, *q* and *q* → *r*
would have to be unified, but they cannot be unified no matter what formula
is substituted for *q*. An immediate consequence of this is that
WI is not typable.

On the other hand, (*r* → *r*) → *r* →
*r* and ((*s* → *s*) → *s* → s) →
(*s* → *s*) → *s* → *s* are substitution
instances of *p* → *p* and of *q* → *q*.
Furthermore, all simple types are substitution instances of a propositional
variable, hence
II can be assigned both the type
*r* → *r* and the type (*s* → *s*) →
*s* → *s*—and, of course, the latter happens to be an
instance of the former because *A* → *A* is the principal type
schema of
II. If we apply condensed detachment to
*p* → *p* and *q* → *q*, then we get *q*
→ *q* (via the substitutions [*p*/*q* → *q*]
and [*q*/*q*]), and so condensed detachment yields the principal type
of
II. Incidentally,
II
and
I
provide an excellent example to illustrate
that *distinct* terms may have the *same* principal type schema.

Condensed detachment has been used extensively to refine axiomatizations of
various implicational logics, especially, in search for shorter and fewer
axioms. Some logics may be formulated using axioms (rather than axiom
schemas) together with the rule of condensed detachment without loss of
theorems. All the logics that we mentioned so far
(*J*_{→}, *R*_{→},
*T*_{→} and *E*_{→}) are
* D-complete*, that is, they all may be axiomatized by axioms and
the rule of condensed detachment. In other words, the implicational
fragments of classical and intuitionistic logics, and the implicational
fragments of the relevance logics

*R*,

*E*and

*T*are all

**D**-complete. (See Bimbó (2007) for some further technical details.)

Simply typed systems has been extended in various directions. Logics often
contain connectives beyond implication. It is a natural modification of a
type assignment system to expand the set of types via including further
*type constructors*. Conjunction and fusion are the easiest to explain or
motivate as type constructors, however, disjunction and backward implication
have been introduced into types too. Types are useful, because they allow us
to get a grip on classes of terms from the point of view of their behavior
with respect to reduction.

Tait's theorem.If a combinatory termMis typable (with simple types) thenMstrongly normalizes, that is, all reduction sequences ofMare finite (i.e., terminate).

The converse of this claim is, obviously, not true. For example, WI strongly normalizes but untypable, because the antecedent of contraction cannot be unified with any instance of identity. The aim to extend the set of typable terms led to the introduction of ∧ into types.

### 3.2 Intersection types

A different way to look at the problem of typing
WI
is to say that
W
should have a type similar to the
formula *A* → (*A* → *B*).
→ *A* → *B*, but in which the
formulas in place of the two formulas *A*
and *A* → *B* in the antecedent can be unified.
This is the motivation for the inclusion of conjunction (∧) and top
(⊤)
as new type constructors.

An extended type system, that is often called the *intersection type
discipline*, is due to Coppo and Dezani-Ciancaglini. The set of
*intersection types* (denoted by wff) is defined as follows.

*p*∈ wff if*p*is a propositional variable,- ⊤ ∈ wff, where ⊤ is a constant proposition,
*A*,*B*∈ wff implies (*A*→*B*), (*A*∧*B*) ∈ wff.

Of course, if TA_{CL} is augmented with an expanded set of types,
then new instances of the previously assigned types become available.
However, the gist of having types with type constructors ∧ and
⊤
is that the set of types has a
richer structure than the relationships between types determined by the
rules of substitution and modus ponens.

The structure of intersection types is described by the
conjunction–implication fragment of *B*, the *basic
relevance logic*. In the following presentation of this logic,
≤ is the main connective (an implication) of a formula and ⇒
separates the premises and the conclusion of an inference rule.

A≤AA≤ ⊤⊤ ≤ ⊤ → ⊤ A≤A∧AA∧B≤AA ∧ B≤B

A≤B,B≤C⇒A≤CA≤B,C≤D⇒A∧C≤B∧D( A→B) ∧ (A→C) ⇒ (A→ (B∧C))A≤B,C≤D⇒B→C≤A→D

The axiom schemas for the combinators S, K and I are as follows. Note that the axiom for S is not simply a substitution instance (with new connectives included) of the previous axiom for S.

Δ ⊢ S : ( A→B→C) → (D→B) → (A∧D) →C

Δ ⊢ K : A→B→AΔ ⊢ I : A→A

There are four new rules added, and there is an axiom for ⊤.

Δ ⊢ M:AΔ ⊢M:BΔ ⊢ M:A∧B

Δ ⊢ M:A∧BΔ ⊢ M:A

Δ ⊢ M:A∧BΔ ⊢ M:B

Δ ⊢ M:AA≤BΔ ⊢ M:BΔ ⊢ M: ⊤

This type assignment system is equivalent to the intersection type assignment system for the λ-calculus, and it allows a more precise characterization of classes of terms with respect to the termination of reduction sequences.

Theorem.

(1) A term Mis normalizable wheneverMis typable (in the last type assignment system).(2) A term Mis strongly normalizable wheneverMis typable and the proof does not contain ⊤.

## 4. Models

CL has various kinds of models, three of which are outlined in this section.
*Algebraic models* (often called “term models”) may be
constructed without difficulty for both the inequational and the equational
systems of CL that were introduced in section 2.1. The set of terms forms an
algebra, and given a suitable equivalence relation (that is also a
congruence), the application operation can be lifted to the equivalence
classes of terms in the standard way. The quasi-inequational characterization
of the so obtained algebra provides the basis for an algebraic semantics for
these logics. Isolating the Lindenbaum algebra and verifying that it is not
a trivial algebra constitutes a consistency proof for CL≥ and CL=.

### 4.1 Scott's model

The earliest model for a *typefree applicative system* as a *function
space* was given by Dana Scott in the late 60s. The following is an
outline of some of the key elements of his construction.

In pure typefree CL, an expression of the form *MM* is a well-formed
term. Moreover, terms of this form can enter into provable equations and
inequations in multiple ways. For example, *xx* = *xx* is an
axiom, and by one of the rules, *y*(*xx*) = *y*(*xx*)
is provable too. A more interesting occurrence of a term of the
form *MM* can be seen in the provable inequation
S(SKK)(SKK)*x* ≥
*xx*.

The set-theoretic reduction of a function yields a set of pairs (in general,
a set of tuples). In set theory (assuming well-foundedness, of course) a
pair (e.g., { { *a* }, { *a*,
*b* } }) is never identical to either of its two elements.
Therefore, the main question concerning a mathematical model of CL
is how to deal with *self-application*.

Scott's original model is built starting with a *complete lattice*
(*D*, ≤). That is, (*D*, ≤) is a partially ordered
set in which greatest lower bounds (infima) and least upper bounds (suprema)
exist for arbitrary sets of elements. A function *f* from
(*D*, ≤) into a complete lattice (*E*, ≤) is said
to be *continuous* when it preserves the supremum of each ideal on
*D*, where an ideal is an upward directed downward closed subset.

A *topology* may be defined on *D* by selecting certain increasing
sets as the opens. More precisely, if *I* is an ideal and *C* is
a cone, then *C* is open iff *C* ∩ *I*
≠ ∅ provided that
∨*I*
∈ *C*, that is, provided that the supremum of *I* is an
element of *C*.
(For example, complements of principal ideals are open.) *f* turns out
to be continuous in the usual topological sense, that is, the inverse image
of an open set being open, when *D* and *E* are taken together with
their topologies. This motivates the earlier labeling of these functions as
continuous. Notably, all continuous functions are *monotone*.

For the purposes of modeling functions in CL, the interesting functions are
those that are continuous *on* *D*. However, these functions by
themselves are not sufficient to obtain a modeling of self-application,
because none of them has as its domain a set of functions—as *D*
is not assumed to be a function space. The solution starts with defining a
*hierarchy of function spaces* *D*_{0}, *D*_{1},
*D*_{2}, … so that each function space
*D _{n}*, is a complete lattice on which continuous functions may
be defined (creating the function space

*D*

_{n+1}). The importance of selecting continuous functions is that the emerging function space has the

*same cardinality*as the underlying set, which allows defining embeddings between function spaces adjacent within the hierarchy.

The hierarchy of all the function spaces *D _{n}* may be
accumulated together. A standard construction in model theory is to form the
disjoint union of structures. (Disjointness can always be guaranteed by
indexing the carrier sets of the structures.) Scott defined

*D*

_{∞}to be the

*disjoint union*of the function spaces

*D*, for all

_{n}*n*, except that the extremal elements are “glued together.” (More formally, the top elements and the bottom elements of the function spaces, respectively, are identified with each other.)

*D*

_{∞}is a complete lattice, and by Tarski's fixed point theorem, a continuous function that maps

*D*

_{∞}into

*D*

_{∞}has a fixed point, which implies that

*D*

_{∞}is

*isomorphic*to

*D*

_{∞}→

*D*

_{∞}.

The above construction may also be conceptualized in terms of *strings*
and *Cartesian products*. The back-and-forth moves between functions of
one and more than one variable—the “uncurrying” and
“currying” of functions—algebraically corresponds to the
two directions of *residuation*. For example, a function *f* :
*A* × *B* → *C* may be
represented by a function *f* ′ : *A* →
*B* → *C*, and vice versa. Thus, without loss of
generality, it is sufficient to consider unary functions. If *a* is a
fixed element of the function space *D*_{∞}, then *x*
= (*a*, *x*) holds when *x* is the fixed point of *a*.
In terms of tuples, the solution may be viewed as the infinite tuple
(*a*, (*a*, (*a*, ….

### 4.2 Relational semantics

A further model that we briefly outline falls into the set theoretical
semantics paradigm of *nonclassical logic* and it is due to Dunn and
Meyer. The connection between CL and
nonclassical logics
is way stonger
than the link between CL and classical logic. In particular, the CL≥ and
CL= calculi are nonclassical logics. Set theoretical semantics in which the
intensional connectives are modeled from relations on a collection of
situations have been the preferred interpretation of nonclassical logics since
the early 1960s. These sorts of semantics are sometimes called “Kripke
semantics” (after Kripke who defined such semantics for some normal
modal logics) or “gaggle semantics” (after the pronunciation of
the abbreviation ‘ggl’ that stands for “generalized Galois
logics”).

A model for CL≥ may be defined as follows. Let (*W*, ≤, *R*,
*S*, *K*, *v*) comprise a (nonempty) partially ordered set
(*W*, ≤) with a three-place relation *R* on *W*, and let
*S*, *K* ∈ *W*. Furthermore, for any α, β,
γ, δ ∈ *W*, the conditions (s) and (k) are true.

(s) ∃ζ _{1}, ζ_{2}, ζ_{3}∈W.RSαζ_{1}∧Rζ_{1}βζ_{2}∧Rζ_{2}γδimplies ∃ζ _{1}, ζ_{2}, ζ_{3}∈W.Rαγζ_{1}∧Rβγζ_{2}∧Rζ_{1}ζ_{2}δ,

(k) ∃ζ _{1}∈W.RKαζ_{1}∧Rζ_{1}βγimplies α ≤ γ.

The ternary relation is stipulated to be antitone in its first two argument
places and monotone in the third. These components define a frame for
CL≥. The valuation function *v* maps variables *x*, *y*,
*z*, … into (nonempty) cones on *W*, and it maps the two
primitive combinators
S and
K
into the cones generated by *S* and *K*,
respectively. Recall, that the standard notation in CL hides application, a
binary operation that allows forming compound terms. The next clause extends
*v* to compound terms and makes this operation explicit again.

v(MN) = { β : ∃α, γ.Rαγβ ∧ α ∈v(M) ∧ γ ∈v(N) }

An inequation *M* ≥ *N* is valid if *v*(*M*)
⊆ *v*(*N*) under all valuations on frames for
CL≥. (That is, the relationship between the interpretations of the two
terms is invariant whenever *v* is varied on the set of variables.)

Informally, the underlying set *W* is a set of *situations*, and
*R* is an *accessibility relation* connecting situations. All the
terms are interpreted as sets of situations, and function application is the
*existential image operation* derived from *R*. A difference from
the previous model is that the result of an application of a term to a term
is not determined by the objects themselves that interpret the two
terms—rather the application operation is defined from *R*.

This semantics parallels the possible worlds semantics for normal modal logics. Therefore, it is important to note that the situations are not maximally consistent theories, but theories possessing the property that for any pair of formulas they contain a formula that implies both of them. Equivalently, the situations may be taken to be dual ideals on the Lindenbaum algebra of CL≥. These situations are typically consistent in the sense that they do not contain all the terms in all but one case. (The notion of negation consistency, of course, cannot be defined for CL≥ (or for CL=).)

Relational semantics can be defined for CL= along similar lines. Then soundness and completeness—that is, the following theorem—obtains.

Theorem.

(1) An inequation M≥Nis provable in CL≥ if and only ifv(M) ⊆v(N) in any model for CL≥.(2) An equation M=Nis provable in CL= if and only ifv(M) =v(N) in any model for CL=.

Relational and operational semantics for systems of CL that include dual and symmetric combinators can be found in Bimbó (2004).

## 5. Computational functions and arithmetic

A remarkable feature of CL is that despite its seeming simplicity it is a
*powerful formalism*. Of course, the strength of CL cannot be
appreciated without discovering certain relationships between combinatory
terms or without an illustration that computable functions are definable.

An important beginning step in the formalization of mathematics is the
*formalization of arithmetic*,
that was first achieved by the
Dedekind–Peano
axiomatization. There are
various ways to formalize arithmetic in CL; two representations of numbers are
described in this section together with some functions on them.

*Numbers* may be thought to be objects (or abstract objects) of some
sort. (Here by numbers we mean natural numbers, that is, 0 and the positive
integers.) Numbers are characterized, first of all, by the *structure*
they possess as a set. This structure possesses properties such as
0 ≠1, and that the sum of *n* and *m* is the same number as
the sum of *m* and *n*. A less elementary property is, for
example, the existence of infinitely many prime numbers.

Numbers can be represented in CL by terms, and one way is to choose the terms
KI,
I,
SBI,
SB(SBI),
… for 0, 1, 2, 3, etc. Depending on which
terms stand for the numbers, the terms that represent the arithmetic
operations vary. Note that unlike the Dedekind–Peano formalization of
arithmetic, CL makes no syntactic distinction that would parallel that
between individual constants and function symbols, because in CL the only
objects are terms. The above list of terms already shows the *successor
function*, that is
SB.
(SB(KI)
strongly equals to
I, that is, 1 is the successor of 0.)

*Addition* is the term
BS(BB),
and *multiplication* is the term
B.
The usual recursive definition of multiplication
relying on addition may induce the idea that addition should be a simpler
operation than multiplication. However, in CL the numbers themselves are
functions, and so they have properties that allows
B—a
simpler looking term—to be chosen for the
function that is often perceived to be more complex than addition. As a
classical example, we may consider the term
BII,
that is strongly equal to
I,
that is,
1 × 1 = 1—as expected.

Another way to represent numbers in CL is to start with a different choice of
terms for the numbers. Previously,
I
stood for 1, now we take
I
to be 0. The successor of a number *n* is
V(KI)*n*,
where the second occurrence of *n* indicates a numeral, that is, the
combinator that represents *n*. In other words, the *successor*
function is
V(KI).
(The numeral for *n* is often denoted—more precisely—by an
overlined or otherwise decorated *n*. However, the double usage in a
limited context should not cause any confusion.) Notice that the numbers in
the present representation are terms over a more restricted combinatory base
than in the former case. For example, no combinator with duplicative effect
is definable from { I,
K,
V }.

Some simple *recursive functions* may be defined as follows. The
*predecessor* function *P* on numbers is “−1”
(i.e., subtracting one) for all numbers greater than or equal to 1, and the
predecessor of 0 is defined to be 0. The next term defines the predecessor
function which is abbreviated by *P*.

P= C(W(BB(C(TK)I)))(KI)

If *n* is a numeral, then *Pn* reduces to
*n*KI(*n*(KI)),
which suggests that for positive numbers *P* could have been defined to be the
term
T(KI)
because
T(KI)*n*
reduces to *n*−1 whenever *n* is of the form
V(KI)(*n*−1).

Some models of computation (such as register machines) and certain
programming languages (explicitly) include a *test for zero*. It is
useful to define a function *Z* such that *Znxy* reduces to
*x* if *n* is zero, whereas *Znxy* reduces to *y* when
*n* is positive. *Znxy* may be thought of as the conditional
instruction “If *n* = 0 then *x* else *y*,” where
*x* and *y* are themselves functions. (Of course, in the
pseudo-code one should have assumed that *n* is of integer type and
cannot take a negative value, that can be guaranteed by a declaration of
variables and an additional conditional statement.) The following definition
works for branching on zero.

Z= TK

TK*nxy* =
*n*K*xy*,
and if *n* is zero, that is, *n* =
I,
then by another step
K*xy*
and then *x* results; whereas if *n*
is positive, then after a few more reductions, one gets
KI*xy*,
that is, *y*. The two terms,
K*xy*
and
KI*xy*,
hint toward an interpretation of
K and
KI
as *truth* and *falsity*, or they can be viewed as terms
that can select, respectively, the first or the second argument.
These ideas may be further developed into definitions of truth
functions and a representation of tuples.

*Addition* may be defined by the recursive equation
+ *mn* =
*Zmn*(V(KI)( + (*Pm*)*n*)),
where *m*
and *n* are numerals, and *P* and *Z* are the already defined
functions. (The abbreviations are used to enhance the readability of
the terms—they can be replaced everywhere by the defining combinators.)
To put into words, if *m* is 0 then the sum is *n*,
otherwise *m* + *n* is the successor of
(*m*−1) + *n*. Of course, this definition closely
simulates the definition of addition from recursion theory, where addition is
often defined by the two equations + (0, *n*) = *n* and
+ (*s*(*m*), *n*) = *s*( + (*m*,
*n*)) (with *s* denoting the successor function). The fact that CL
can express addition in this form shows—once again—the
versatility of CL.

Combinatorial completeness guarantees that the term on the right hand side of
the equation (i.e., the term
*Zmn*(V(KI)( + (*Pm*)n)))
can be transformed
into a term in which + is the first, *m* and *n* are the second and
third arguments, respectively. Then + can be defined explicitly as the fixed
point of the combinator

B(S(BS(TK)))(B(B(B(V(KI))))(CB(C(W(BB(C(TK)I)))(KI)))).

Of course, we can abbreviate the so obtained term as + for the sake of
transparency, just as we could use *P* and *Z* as shorthands for
longer combinatory terms.

*Multiplication* is often denoted by · . The
recursive equation · *mn* =
*Zm*I( + *m*( · (*Pm*)*n*))
defines multiplication and it can be deciphered as “if *m* is 0
then the result is 0, else *m* is added to the result of
(*m*−1) · *n*.” The next step in the
definition brings the right-hand side term to the form
X · *mn*, where
X does not contain occurrences of · ,
*m* or *n*. Then taking the fixed point of
X, and setting · to be
YX
concludes the definition of the multiplication
function. For instance, the abstraction can yield the combinator

BW(B(CB(BB(S(BB(CZI))+)))(B(CB)(CBP))).

The *factorial* function is definable from the predecessor function plus
from multiplication, and it is useful e.g., in combinatorics. The factorial
function (denoted by ! ) is recursively definable by the equation
! *m* =
*Zm*(V(KI)I)( · *m*( ! (*Pm*))),
that may be read as “if *m* is 0, then
! *m* = 1, otherwise ! *m* equals to *m*
multiplied by the factorial of *m*−1.”

Of course, it is not essential to define various numerical functions by
simulating their recursive definitions. As we saw above in the case of the
first representation of numbers, we might just happen to have the right terms
such as
BS(BB) and
B,
that behave as the target functions do on
numbers. An equally good way to define arithmetic functions is to simply
list some terms and then show that they behave as expected. However, once it
has been shown that the *basic primitive recursive functions* together
with *recursion* and *minimization* can be emulated in CL, we have
got not only a nice collection of arithmetic functions in the form of
combinators to work with, but also a proof that combinatory logic is
sufficiently expressive to formalize all *partial recursive functions*.
Indeed, the remaining steps of such a proof can be carried out in CL, though
the details are beyond the scope of this brief entry.

### 5.1 Gödel sentence

The abbreviations and the interspersed explanations in the sketch above may obscure that arithmetic has been formalized in a language that consists of five symbols (when juxtaposition is not counted): S, K, = plus two delimiters, ( and ) . The finite (and perhaps, surprisingly small) number of symbols and the availability of recursive functions conjure the thought that an arithmetization of the syntax of CL could be attempted.

Gödel
achieved an encoding of a formal language
by assigning numbers to symbols, formulas and sequences of formulas, which
later became known as “Gödel numbers.” Gödel relied on
the *prime factorization* theorem from algebra, however, it is possible
to arithmetize the language of CL without any similar strong assumptions.
(See for example, Smullyan (1985) and Smullyan (1994).) The five symbols get
as their Gödel numbers the first five positive integers. A string is
assigned the number in base 10 that results from the concatenation of the
corresponding numbers for the symbols.

The following outline gives the flavor of an analogue of Gödel's
incompleteness proof for CL. It is possible to define a combinator such that
if this combinator is applied to a numeral *n*, then the whole term
reduces to the numeral *m* that is the numeral denoting the *Gödel
number* of the numeral *n*. Slightly more formally, there is a
combinator δ such that δ*n* = *G*(*n*) (where
*G*(*n*) denotes the Gödel number of the expression *n*).
Furthermore, there is a combinatory term, which returns the numeral itself
followed by *G*(*n*), when applied to a numeral *n*. For any
term *A* there is a term *B* such that the equation
*A*(δ*B*) = *B* is true. This statement (or its
concrete variant for a particular formal system) is usually called
the *second fixed point theorem*. Computable characteristic functions
of sets of numbers can be represented by combinators with the choice of
K for truth and
KI for
falsity. The complements of such functions are computable too. Finally, it
can be proven that there is no combinator that represents the set of *all
true equations*. To put it differently, any combinator either represents
a set of equations that fails to include some true equations, or represents a
set of equations that includes all true but also some false equations.

Church proved the *undecidability of classical
first-order logic* relying on Gödel's incompleteness theorem. Scott
proved that if *A* is a nonempty proper subset of λ-terms that is
closed under equality then *A* is not recursive. The analogous claim for
CL, that follows from the existence of a Gödelian sentence for CL, is
that it is *undecidable* if two terms of CL are equal.

### Acknowledgements

I am grateful to the referees—both to the “internal referees” of this Encyclopedia and to J. P. Seldin—for their helpful comments and suggestions, as well as certain corrections. Heeding the advice of the referees, I repeatedly tried to make the entry less technical and more readable.

## Bibliography

Due to obvious limitations of size, only some representative publications are listed here. More comprehensive bibliographies may be found in Curry, Hindley and Seldin 1972, Hindley 1997, Anderson, Belnap and Dunn 1992, Terese 2003 as well as Hindley and Seldin 2008.

- Anderson, A., N. Belnap, and J.M. Dunn, 1992.
*Entailment: The Logic of Relevance and Necessity*, (Volume II), Princeton: Princeton University Press. - Barbanera, F., and S. Berardi, 1996. “A Symmetric
λ-Calculus for Classical Program
Extraction,”
*Information and Computation*, 125: 103–117. - Bimbó, K., 2003. “The Church-Rosser Property in Dual
Combinatory Logic,”
*The Journal of Symbolic Logic*, 68: 132–152. - Bimbó, K., 2004. “Semantics for dual and symmetric
combinatory calculi,”
*Journal of Philosophical Logic*, 33: 125–153. - Bimbó, K., 2005. “Types of
I-free
hereditary right maximal terms,”
*Journal of Philosophical Logic*, 34: 607–620. - Bimbó, K., 2007. “Relevance logics,” in
*Philosophy of Logic*(D. Jacquette, ed.), Handbook of the Philosophy of Science (D. Gabbay, P. Thagard and J. Woods, eds.), vol. 5, Elsevier (North-Holland), Amsterdam, 2007, pp. 723–789. - Bimbó, K., and J. M. Dunn, 2008.
*Generalized Galois Logics. Relational Semantics of Nonclassical Logical Calculi*, CSLI Lecture Notes, vol. 188, CSLI Publications, Stanford, CA. - Bunder, M. W., 1992. “Combinatory logic and lambda calculus
with classical types,”
*Logique et Analyse*, 137–128: 69–79. - Bunder, M. W., 2000. “Expedited Broda–Damas bracket
abstraction,”
*Journal of Symbolic Logic*, 65: 1850–1857. - Cardone, F., and J. R. Hindley, 2006. “History of lambda-calculus
and combinatory logic,” in
*Logic from Russell to Church*(D. M. Gabbay and J. Woods, eds.), Handbook of the History of Logic, vol. 5, Elsevier, Amsterdam, 2006. - Church, A., 1941.
*The Calculi of Lambda-conversion*, 1st ed., Princeton University Press, Princeton, NJ. - Coppo, M., and M. Dezani-Ciancaglini, 1980. “An extension of the
basic functionality theory for the λ-calculus,”
*Notre Dame Journal of Formal Logic,*21: 685–693. - Curry, H. B., and R. Feys, 1958.
*Combinatory Logic*, 1st ed., Studies in Logic and the Foundations of Mathematics, vol. I, North-Holland, Amsterdam. - Curry, H. B., J. R. Hindley, and J. P. Seldin, 1972.
*Combinatory Logic*, Studies in Logic and the Foundations of Mathematics, vol. II, North-Holland, Amsterdam. - Dunn, J. M., 1991. “Gaggle theory: An abstraction of Galois
connections and residuation with applications to negation, implication, and
various logical operators,” in
*Logics in AI: European workshop JELIA '90*, (J. van Eijck, editor), Lecture Notes in Computer Science, vol. 478, Springer, Berlin, 1991, pp. 31–51. - Dunn, J. M., and R. K. Meyer, 1997. “Combinators and structurally
free logic,”
*Logic Journal of IGPL*, 5: 505–537. - Gierz, G., K. H. Hofmann, K. Keimel, J. D. Lawson, M. W. Mislove, and
D. S. Scott, 2003.
*Continuous Lattices and Domains*, Encyclopedia of Mathematics and its Applications, vol.93, Cambridge University Press, Cambridge, UK. - Gödel, K., 1931. “Über formal unentscheidbare Sätze
der
*Principia mathematica*und verwandter Systeme I,” in*Collected Works*(S. Feferman, ed.), vol. I, Oxford University Press and Clarendon Press, New York, NY and Oxford, UK, 1986, pp. 144–195. - Hindley, J. R., 1997.
*Basic Simple Type Theory*, Cambridge Tracts in Theoretical Computer Science, vol. 42, Cambridge University Press, Cambridge, UK. - Hindley, J. R., and J. P. Seldin, 2008.
*λ-calculus and Combinators, an Introduction*, Cambridge University Press, Cambridge, UK. - Kleene, S. C., 1967.
*Mathematical Logic*, John Wiley & Sons, Inc., New York, (Dover, Mineola, NY, 2002). - Revesz, G. E., 1988.
*Lambda-calculus, Combinators and Functional Programming*, Cambridge University Press, Cambridge, UK. - Rosser, J. B., 1936. “A mathematical logic without
variables,”
*Annals of Mathematics*, 2: 127–150. - Schönfinkel, M., 1924. “On the building blocks of mathematical
logic,” in
*From Frege to Gödel. A Source Book in Mathematical Logic*(J. van Heijenoort, ed.), Harvard University Press, Cambridge, MA, 1967, pp. 355–366. - Scott, D., 1970.
*Outline of a Mathematical Theory of Computation*, (Technical report), Oxford University Computing Laboratory Programming Research Group. - Seldin J. P., 200+. “The logic of Curry and Church,”
in
*Handbook of the History of Logic*, (D. Gabbay and J. Woods, eds.), vol. 5, Elsevier, Amsterdam. - Shankar, N., 1997.
*Metamathematics, Machines, and Gödel's Proof*, Cambridge Tracts in Theoretical Computer Science, vol. 38, Cambridge University Press, Cambridge, UK. - Smullyan, R. M., 1985.
*To Mock a Mockingbird. And other Logic Puzzles Including an Amazing Adventure in Combinatory Logic,*Alfred A. Knopf, New York, NY. - Smullyan, R. M., 1994.
*Diagonalization and Self-reference*, Clarendon, Oxford, UK. - Tait, W., 1967. “Intensional interpretations of functionals of
finite type I,”
*Journal of Symbolic Logic*, 32: 198–212. - Terese, 2003.
*Term Rewriting Systems*, Cambridge Tracts in Theoretical Computer Science, vol. 55, Cambridge University Press, Cambridge, UK.

## Other Internet Resources

- λ-calculus problem list
- R. B. Jones's λ-calculus website

## Related Entries

algebra | category theory | Church, Alonzo | computability and complexity | function: recursive | Gödel, Kurt | Gödel, Kurt: incompleteness theorem | logic: intuitionistic | logic: linear | logic: non-classical | logic: relevance | logic: substructural | ontological commitment | Peano, Giuseppe | reasoning: automated | Turing, Alan | type theory