#### Supplement to Inductive Logic

## Proof of the Non-Falsifying Refutation Theorem

Here again we explicitly treat the case where only
*condition-independence* is assumed. If
*result-independence* holds as well, all occurrences of
‘(c^{k−1}·e^{k−1})’ may
be dropped, which gives the results stated in the text. If neither
*independence condition* holds, all occurrences of
‘c_{k}·(c^{k−1}·e^{k−1})’
are replaced by
‘c^{n}·e^{k−1}’ and
occurrences of
‘b·c^{k−1}’ are replaced
by ‘b·c^{n}’.

The proof of Convergence Theorem 2 requires the introduction of one
more concept, that of the *variance in the quality of
information* for a sequence of experiments or observations,
VQI[c^{n} | h_{i}/h_{j} | b]. The
quality of the information QI from a specific outcome sequence
e^{n} will vary somewhat from the *expected*
*quality of information* for conditions c^{n}. A common
statistical measure of how widely individual values tend to vary from
an expected value is given by the *expected squared distance from
the expected value*, which is a quantity called the
*variance*.

Definition: VQI — the Variance in the Quality of Information.

For h_{j} outcome-compatible with h_{i} on c_{k}, define

$$
\mathrm{VQI}[c_k \mid h_i/h_j \mid b\cdot(c^{k-1}\cdot e^{k-1})] = \sum_{u} \bigl(\mathrm{QI}[o_{ku} \mid h_i/h_j \mid b\cdot c_k\cdot(c^{k-1}\cdot e^{k-1})] - \mathrm{EQI}[c_k \mid h_i/h_j \mid b\cdot(c^{k-1}\cdot e^{k-1})]\bigr)^{2} \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k\cdot(c^{k-1}\cdot e^{k-1})].
$$

Next define

$$
\mathrm{VQI}[c_k \mid h_i/h_j \mid b\cdot c^{k-1}] = \sum_{\{e^{k-1}\}} \mathrm{VQI}[c_k \mid h_i/h_j \mid b\cdot(c^{k-1}\cdot e^{k-1})] \cdot P[e^{k-1} \mid h_i\cdot b\cdot c^{k-1}].
$$

For a sequence c^{n} of observations on which h_{j} is outcome-compatible with h_{i}, define

$$
\mathrm{VQI}[c^n \mid h_i/h_j \mid b] = \sum_{\{e^{n}\}} \bigl(\mathrm{QI}[e^n \mid h_i/h_j \mid b\cdot c^n] - \mathrm{EQI}[c^n \mid h_i/h_j \mid b]\bigr)^{2} \cdot P[e^n \mid h_i\cdot b\cdot c^n].
$$

Clearly VQI will be positive unless h_{i} and h_{j}
agree on the likelihoods of all possible outcome sequences in the
evidence stream, in which case both EQI[c^{n} |
h_{i}/h_{j} | b] and
VQI[c^{n} | h_{i}/h_{j} | b] equal 0.
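As an illustration of the definition, the following Python sketch computes QI, EQI and VQI for a single experiment from the outcome likelihoods under h_{i} and h_{j}. The function names and the likelihood values are hypothetical, chosen only for the example, and natural logarithms are assumed (any fixed base would serve, as long as it is used consistently).

```python
import math

def qi(p_i, p_j):
    """Quality of information of one outcome: log of the likelihood ratio P_i / P_j."""
    return math.log(p_i / p_j)

def eqi(lik_i, lik_j):
    """Expected QI, weighted by h_i's likelihoods (outcomes with P_i = 0 add nothing)."""
    return sum(pi * qi(pi, pj) for pi, pj in zip(lik_i, lik_j) if pi > 0)

def vqi(lik_i, lik_j):
    """Expected squared distance of QI from EQI, weighted by h_i's likelihoods."""
    mean = eqi(lik_i, lik_j)
    return sum(pi * (qi(pi, pj) - mean) ** 2
               for pi, pj in zip(lik_i, lik_j) if pi > 0)

# Hypothetical likelihoods for a two-outcome experiment (assumed values).
lik_i = [0.7, 0.3]   # P[o_ku | h_i·b·c_k]
lik_j = [0.4, 0.6]   # P[o_ku | h_j·b·c_k]

print("EQI:", eqi(lik_i, lik_j))
print("VQI:", vqi(lik_i, lik_j))
# When the two hypotheses agree on every likelihood, both quantities vanish.
print("agreeing hypotheses:", eqi(lik_i, lik_i), vqi(lik_i, lik_i))
```

With these assumed values both EQI and VQI come out positive, while the last line shows both equal to 0 when the hypotheses agree on all likelihoods.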

VQI[c^{n} | h_{i}/h_{j} | b] does not
generally decompose into the sum of the VQI for individual experiments
or observations c_{k}. However, when both *independence
conditions* hold, the decomposition into the sum does
follow.

Theorem: The VQI Decomposition Theorem for Independent Evidence:

Suppose both *condition-independence* and *result-independence* hold. Then

$$
\mathrm{VQI}[c^n \mid h_i/h_j \mid b] = \sum_{k=1}^{n} \mathrm{VQI}[c_k \mid h_i/h_j \mid b].
$$

For the Proof, we employ the following abbreviations:

$$
\begin{aligned}
Q[e_k] &= \mathrm{QI}[e_k \mid h_i/h_j \mid b\cdot c_k], & Q[e^k] &= \mathrm{QI}[e^k \mid h_i/h_j \mid b\cdot c^k],\\
E[c_k] &= \mathrm{EQI}[c_k \mid h_i/h_j \mid b], & E[c^k] &= \mathrm{EQI}[c^k \mid h_i/h_j \mid b],\\
V[c_k] &= \mathrm{VQI}[c_k \mid h_i/h_j \mid b], & V[c^k] &= \mathrm{VQI}[c^k \mid h_i/h_j \mid b].
\end{aligned}
$$

The equation stated by the theorem may be derived as follows:

$$
\begin{aligned}
V[c^n] &= \sum_{\{e^{n}\}} (Q[e^n] - E[c^n])^{2} \cdot P[e^n \mid h_i\cdot b\cdot c^n]\\
&= \sum_{\{e^{n}\}} \bigl((Q[e_n] + Q[e^{n-1}]) - (E[c_n] + E[c^{n-1}])\bigr)^{2} \cdot P[e_n \mid h_i\cdot b\cdot c_n]\cdot P[e^{n-1} \mid h_i\cdot b\cdot c^{n-1}]\\
&= \sum_{\{e^{n-1}\}}\sum_{\{e_{n}\}} \bigl((Q[e_n] - E[c_n]) + (Q[e^{n-1}] - E[c^{n-1}])\bigr)^{2} \cdot P[e_n \mid h_i\cdot b\cdot c_n]\cdot P[e^{n-1} \mid h_i\cdot b\cdot c^{n-1}]\\
&= \sum_{\{e^{n-1}\}}\sum_{\{e_{n}\}} \bigl((Q[e_n] - E[c_n])^{2} + (Q[e^{n-1}] - E[c^{n-1}])^{2} + 2\cdot(Q[e_n] - E[c_n])\cdot(Q[e^{n-1}] - E[c^{n-1}])\bigr) \cdot P[e_n \mid h_i\cdot b\cdot c_n]\cdot P[e^{n-1} \mid h_i\cdot b\cdot c^{n-1}]\\
&= V[c_n] + V[c^{n-1}] + 2\cdot\sum_{\{e^{n-1}\}}\sum_{\{e_{n}\}} \bigl(Q[e_n]\cdot Q[e^{n-1}] - Q[e_n]\cdot E[c^{n-1}] - E[c_n]\cdot Q[e^{n-1}] + E[c_n]\cdot E[c^{n-1}]\bigr) \cdot P[e_n \mid h_i\cdot b\cdot c_n]\cdot P[e^{n-1} \mid h_i\cdot b\cdot c^{n-1}]\\
&= V[c_n] + V[c^{n-1}] + 2\cdot\bigl(E[c_n]\cdot E[c^{n-1}] - E[c_n]\cdot E[c^{n-1}] - E[c_n]\cdot E[c^{n-1}] + E[c_n]\cdot E[c^{n-1}]\bigr)\\
&= V[c_n] + V[c^{n-1}]\\
&= \dots = \sum_{k=1}^{n} \mathrm{VQI}[c_k \mid h_i/h_j \mid b].
\end{aligned}
$$
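The decomposition can be checked numerically on a small case. The sketch below is only an illustration under stated assumptions: the three experiments and their likelihoods are hypothetical, both independence conditions are built in by multiplying per-experiment likelihoods, natural logarithms are used, and the function names are invented for the example. It computes VQI[c^{n}] by brute-force enumeration of all outcome sequences and compares it with the sum of the per-experiment VQI values.

```python
import itertools
import math

def single_vqi(lik_i, lik_j):
    """VQI[c_k] for one experiment, from outcome likelihoods under h_i and h_j."""
    qs = [math.log(pi / pj) for pi, pj in zip(lik_i, lik_j)]
    mean = sum(pi * q for pi, q in zip(lik_i, qs))
    return sum(pi * (q - mean) ** 2 for pi, q in zip(lik_i, qs))

def sequence_vqi(experiments):
    """VQI[c^n], enumerating every outcome sequence e^n of independent experiments."""
    weighted_qs = []
    index_ranges = [range(len(lik_i)) for lik_i, _ in experiments]
    for outcome in itertools.product(*index_ranges):
        # Independence: the likelihood of a sequence is the product of its parts.
        p_i = math.prod(lik_i[u] for (lik_i, _), u in zip(experiments, outcome))
        p_j = math.prod(lik_j[u] for (_, lik_j), u in zip(experiments, outcome))
        weighted_qs.append((p_i, math.log(p_i / p_j)))
    mean = sum(p * q for p, q in weighted_qs)
    return sum(p * (q - mean) ** 2 for p, q in weighted_qs)

# Hypothetical experiments: (likelihoods under h_i, likelihoods under h_j).
experiments = [
    ([0.7, 0.3], [0.4, 0.6]),
    ([0.2, 0.5, 0.3], [0.3, 0.3, 0.4]),
    ([0.9, 0.1], [0.6, 0.4]),
]

total = sequence_vqi(experiments)
summed = sum(single_vqi(li, lj) for li, lj in experiments)
print(total, summed)  # the two values agree up to floating-point rounding
```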

By averaging the values of VQI[c^{n} | h_{i}/h_{j} | b] over the number of observations n we
obtain a measure of the *average variance in the quality of the
information* due to c^{n}. We represent this average by
underlining ‘VQI’.

Definition: The Average Variance in the Quality of Information

$$
\underline{\mathrm{VQI}}[c^n \mid h_i/h_j \mid b] = \mathrm{VQI}[c^n \mid h_i/h_j \mid b] \div n.
$$

<u>VQI</u> is only a true average, a sum of n terms divided by n,
when the *independent evidence* conditions hold. But our
definition here does not presuppose *independence*, and the
notion of “averaging” VQI to obtain
<u>VQI</u>, by dividing by
the number of experiments and observations, turns out to be useful even
when the evidence is not *independent*.

We are now in a position to state a very general version of the second
part of the *Likelihood Ratio Convergence Theorem*. It applies
to all evidence streams not containing *possibly*
*falsifying outcomes* for h_{j}. That is, it applies to
all evidence streams for which h_{j} is
*outcome-compatible* with h_{i} on each c_{k}
in the stream. This theorem is essentially a specialized version of
Chebyshev's Theorem, which is a so-called Weak Law of Large
Numbers. This version of the theorem presupposes neither of the
*independence conditions*.

Theorem 2*: Non-falsifying Likelihood Ratio Convergence Theorem

Choose positive ε < 1, as small as you like, but large enough that (for the number of observations n being contemplated) the value of <u>EQI</u>[c^{n} | h_{i}/h_{j} | b] > −(log ε)/n. Then

$$
P\bigl[\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] < \varepsilon\} \mid h_i\cdot b\cdot c^n\bigr] \;\ge\; 1 - \frac{1}{n}\cdot\frac{\underline{\mathrm{VQI}}[c^n \mid h_i/h_j \mid b]}{\bigl(\underline{\mathrm{EQI}}[c^n \mid h_i/h_j \mid b] + (\log\varepsilon)/n\bigr)^{2}}.
$$

Thus, provided that the average expected quality of the information,
<u>EQI</u>[c^{n} | h_{i}/h_{j} | b], for the
stream of experiments and observations c^{n} doesn't get
too small (as n increases), and provided that the average variance,
<u>VQI</u>[c^{n} | h_{i}/h_{j} | b], doesn't
blow up (e.g. it is bounded above), hypothesis h_{i} says it is
highly likely that outcomes of c^{n} will be such as to make
the likelihood ratio against h_{j} as compared to
h_{i} as small as you like, as n increases.
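To make the behaviour of the bound concrete, here is a minimal Python sketch that evaluates the right-hand side of Theorem 2* for given values of the average EQI and average VQI. The function name and the numerical inputs are hypothetical, and natural logarithms are assumed; the sketch only evaluates the stated formula and is not part of the proof.

```python
import math

def convergence_lower_bound(avg_eqi, avg_vqi, n, eps):
    """Lower bound 1 - (1/n) * avgVQI / (avgEQI + log(eps)/n)^2 from Theorem 2*."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    margin = avg_eqi + math.log(eps) / n
    if margin <= 0:
        # The theorem's precondition, average EQI > -(log eps)/n, is not met.
        raise ValueError("average EQI is too small for this n and eps")
    return 1 - (avg_vqi / n) / margin ** 2

# Hypothetical averages (assumed values) and a threshold eps = 0.01.
avg_eqi, avg_vqi, eps = 0.18, 0.33, 0.01
for n in (100, 1000, 10000):
    print(n, convergence_lower_bound(avg_eqi, avg_vqi, n, eps))
```

With the averages held fixed, the printed bound rises toward 1 as n grows, which is the sense in which a bounded average variance and a non-vanishing average EQI make a small likelihood ratio highly likely.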

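**Proof:** Let

$$
\begin{aligned}
V &= \mathrm{VQI}[c^n \mid h_i/h_j \mid b],\\
E &= \mathrm{EQI}[c^n \mid h_i/h_j \mid b],\\
Q[e^n] &= \mathrm{QI}[e^n \mid h_i/h_j \mid b\cdot c^n] = \log\bigl(P[e^n \mid h_i\cdot b\cdot c^n]/P[e^n \mid h_j\cdot b\cdot c^n]\bigr),
\end{aligned}
$$

and write <u>V</u> = V/n and <u>E</u> = E/n for the corresponding averages.

Choose any small ε > 0. And suppose (for n large enough) that <u>E</u> > −(log ε)/n, i.e. that E > −(log ε). Then we have

$$
\begin{aligned}
V &= \sum_{\{e^n :\, P[e^n \mid h_j\cdot b\cdot c^n] > 0\}} (E - Q[e^n])^{2} \cdot P[e^n \mid h_i\cdot b\cdot c^n]\\
&\ge \sum_{\{e^n :\, P[e^n \mid h_j\cdot b\cdot c^n] > 0 \;\&\; Q[e^n] \le -(\log\varepsilon)\}} (E - Q[e^n])^{2} \cdot P[e^n \mid h_i\cdot b\cdot c^n]\\
&\ge (E + (\log\varepsilon))^{2} \cdot \sum_{\{e^n :\, P[e^n \mid h_j\cdot b\cdot c^n] > 0 \;\&\; Q[e^n] \le -(\log\varepsilon)\}} P[e^n \mid h_i\cdot b\cdot c^n]\\
&= (E + (\log\varepsilon))^{2} \cdot P\bigl[\{e^n : P[e^n \mid h_j\cdot b\cdot c^n] > 0 \;\&\; Q[e^n] \le \log(1/\varepsilon)\} \mid h_i\cdot b\cdot c^n\bigr]\\
&= (E + (\log\varepsilon))^{2} \cdot P\bigl[\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] \ge \varepsilon\} \mid h_i\cdot b\cdot c^n\bigr].
\end{aligned}
$$

(The second inequality holds because E − Q[e^{n}] ≥ E + (log ε) > 0 whenever Q[e^{n}] ≤ −(log ε).)

So,

$$
\begin{aligned}
\frac{1}{n}\cdot\frac{\underline{V}}{(\underline{E} + (\log\varepsilon)/n)^{2}} &= \frac{V}{(E + (\log\varepsilon))^{2}}\\
&\ge P\bigl[\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] \ge \varepsilon\} \mid h_i\cdot b\cdot c^n\bigr]\\
&= 1 - P\bigl[\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] < \varepsilon\} \mid h_i\cdot b\cdot c^n\bigr].
\end{aligned}
$$

Thus, for any small ε > 0,

$$
P\bigl[\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] < \varepsilon\} \mid h_i\cdot b\cdot c^n\bigr] \;\ge\; 1 - \frac{1}{n}\cdot\frac{\underline{V}}{(\underline{E} + (\log\varepsilon)/n)^{2}}.
$$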

(End of Proof)
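The inequality just proved can be compared with an exact calculation in a simple case. The sketch below assumes n independent repetitions of one hypothetical two-outcome experiment (the likelihood values are invented for the example, and natural logarithms are used). Under those assumptions the average EQI and average VQI equal the single-experiment values, and the exact probability that the likelihood ratio falls below ε is a binomial sum.

```python
import math

# Hypothetical single-experiment likelihoods (assumed values).
LIK_I = (0.7, 0.3)   # under h_i
LIK_J = (0.4, 0.6)   # under h_j

def exact_and_bound(n, eps):
    """Return (exact probability, Theorem 2* lower bound) for n i.i.d. repetitions."""
    q = [math.log(pi / pj) for pi, pj in zip(LIK_I, LIK_J)]
    avg_eqi = sum(pi * qu for pi, qu in zip(LIK_I, q))
    avg_vqi = sum(pi * (qu - avg_eqi) ** 2 for pi, qu in zip(LIK_I, q))
    margin = avg_eqi + math.log(eps) / n
    if margin <= 0:
        raise ValueError("precondition average EQI > -(log eps)/n fails")
    bound = 1 - (avg_vqi / n) / margin ** 2

    exact = 0.0
    for k in range(n + 1):
        # k counts occurrences of the first outcome; log of P_j/P_i for such sequences.
        log_ratio = -(k * q[0] + (n - k) * q[1])
        if log_ratio < math.log(eps):
            exact += math.comb(n, k) * LIK_I[0] ** k * LIK_I[1] ** (n - k)
    return exact, bound

exact, bound = exact_and_bound(200, 0.01)
print(exact, bound)
```

In this example the exact probability lies well above the bound, as one would expect from a Chebyshev-style inequality, which is typically far from tight.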

The previous theorem shows that when <u>VQI</u> is bounded above, a
sufficiently long stream of evidence will very likely result in the
refutation of false competitors of a true hypothesis. This claim holds
regardless of whether the evidence can be chunked into
*independent* pieces. However, we can use the *independence
conditions* to describe a very simple provision under which
<u>VQI</u> is indeed bounded above. This gives us the theorem stated
in the main text.

Likelihood Ratio Convergence Theorem 2 — The Non-falsifying Refutation Theorem.

Suppose that the *independent evidence conditions* hold. And suppose γ > 0 is a number smaller than 1/*e*^{2} (≈ .135). And suppose that for each possible outcome o_{ku} of each observation condition c_{k} in c^{n}, either P[o_{ku} | h_{i}·b·c_{k}·(c^{k−1}·e^{k−1})] = 0 or P[o_{ku} | h_{j}·b·c_{k}·(c^{k−1}·e^{k−1})] / P[o_{ku} | h_{i}·b·c_{k}·(c^{k−1}·e^{k−1})] ≥ γ. Choose positive ε < 1, as small as you like, but large enough (for the number of observations n being contemplated) that the value of <u>EQI</u>[c^{n} | h_{i}/h_{j} | b] > −(log ε)/n. Then

$$
P\bigl[\{e^n : P[e^n \mid h_j\cdot b\cdot c^n]/P[e^n \mid h_i\cdot b\cdot c^n] < \varepsilon\} \mid h_i\cdot b\cdot c^n\bigr] \;>\; 1 - \frac{1}{n}\cdot\frac{(\log\gamma)^{2}}{\bigl(\underline{\mathrm{EQI}}[c^n \mid h_i/h_j \mid b] + (\log\varepsilon)/n\bigr)^{2}}.
$$
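One way to read this bound is to ask how many observations it requires before it guarantees a given probability. The following Python sketch does this by direct search. The function names, the target probability and the input values are hypothetical, natural logarithms are assumed, and the average EQI is treated as constant in n purely for illustration.

```python
import math

def refutation_bound(avg_eqi, gamma, n, eps):
    """Right-hand side 1 - (1/n) * (log gamma)^2 / (avgEQI + log(eps)/n)^2.

    Returns None when the precondition avgEQI > -(log eps)/n is not met.
    """
    if not 0 < gamma < 1 / math.e ** 2:
        raise ValueError("gamma must lie strictly between 0 and 1/e^2")
    margin = avg_eqi + math.log(eps) / n
    if margin <= 0:
        return None
    return 1 - (math.log(gamma) ** 2 / n) / margin ** 2

def smallest_n(avg_eqi, gamma, eps, target, max_n=10**6):
    """Smallest n (up to max_n) at which the bound reaches the target probability."""
    for n in range(1, max_n + 1):
        bound = refutation_bound(avg_eqi, gamma, n, eps)
        if bound is not None and bound >= target:
            return n
    return None

# Hypothetical inputs: average EQI 0.18, gamma 0.1, eps 0.01, target probability 0.95.
n_needed = smallest_n(0.18, 0.1, 0.01, 0.95)
print(n_needed, refutation_bound(0.18, 0.1, n_needed, 0.01))
```

Because the bound only uses γ in place of the actual variance, the number of observations it calls for is usually much larger than what is needed in practice.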

Proof: This follows from Theorem 2* together with the following
observation, which holds given the *independence
conditions*:

If for each c_{k} in c^{n}, for each of its possible outcomes o_{ku}, either P[o_{ku} | h_{j}·b·c_{k}] = 0 or P[o_{ku} | h_{j}·b·c_{k}]/P[o_{ku} | h_{i}·b·c_{k}] ≥ γ > 0, where γ < 1/*e*^{2}, then <u>V</u> = <u>VQI</u>[c^{n} | h_{i}/h_{j} | b] ≤ (log γ)^{2}.

To see that this observation holds, assume its antecedent.

- First notice that when 0 < P[e_{k} | h_{j}·b·c_{k}] < P[e_{k} | h_{i}·b·c_{k}] we have

  $$
  \bigl(\log[P[e_k \mid h_i\cdot b\cdot c_k]/P[e_k \mid h_j\cdot b\cdot c_k]]\bigr)^{2} \cdot P[e_k \mid h_i\cdot b\cdot c_k] \le (\log\gamma)^{2} \cdot P[e_k \mid h_i\cdot b\cdot c_k].
  $$

  So we only need establish that when P[e_{k} | h_{j}·b·c_{k}] > P[e_{k} | h_{i}·b·c_{k}] > 0, we will also have this relationship, i.e., we will also have

  $$
  \bigl(\log[P[e_k \mid h_i\cdot b\cdot c_k]/P[e_k \mid h_j\cdot b\cdot c_k]]\bigr)^{2} \cdot P[e_k \mid h_i\cdot b\cdot c_k] \le (\log\gamma)^{2} \cdot P[e_k \mid h_i\cdot b\cdot c_k].
  $$

  (Then it will follow easily that <u>VQI</u>[c^{n} | h_{i}/h_{j} | b] ≤ (log γ)^{2}, and we'll be done.)

- To establish the needed relationship, suppose that P[e_{k} | h_{j}·b·c_{k}] > P[e_{k} | h_{i}·b·c_{k}] > 0. Notice that for all p ≤ q, p and q between 0 and 1, the function g(p) = (log(p/q))^{2}·p has a minimum at p = q, where g(p) = 0, and (for p < q) has a maximum value at p = q/*e*^{2}, i.e. at p/q = 1/*e*^{2}. (To get this, take the derivative of g(p) with respect to p and set it equal to 0; this gives a maximum for g(p) at p = q/*e*^{2}.)

  So, for 0 < P[e_{k} | h_{i}·b·c_{k}] < P[e_{k} | h_{j}·b·c_{k}] we have

  $$
  \bigl(\log(P[e_k \mid h_i\cdot b\cdot c_k]/P[e_k \mid h_j\cdot b\cdot c_k])\bigr)^{2} \cdot P[e_k \mid h_i\cdot b\cdot c_k] \le \bigl(\log(1/e^{2})\bigr)^{2} \cdot P[e_k \mid h_j\cdot b\cdot c_k] \le (\log\gamma)^{2} \cdot P[e_k \mid h_j\cdot b\cdot c_k]
  $$

  (since, for γ ≤ 1/*e*^{2} we have log γ ≤ log(1/*e*^{2}) < 0; so (log γ)^{2} ≥ (log(1/*e*^{2}))^{2} > 0).

- Now (assuming the antecedent of the theorem), for each c_{k}, writing QI[o_{ku}] for QI[o_{ku} | h_{i}/h_{j} | b·c_{k}] and EQI[c_{k}] for EQI[c_{k} | h_{i}/h_{j} | b],

  $$
  \begin{aligned}
  \mathrm{VQI}[c_k \mid h_i/h_j \mid b] &= \sum_{\{o_{ku} :\, P[o_{ku} \mid h_j\cdot b\cdot c_k] > 0\}} (\mathrm{EQI}[c_k] - \mathrm{QI}[o_{ku}])^{2} \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k]\\
  &= \sum_{\{o_{ku} :\, P[o_{ku} \mid h_j\cdot b\cdot c_k] > 0\}} \bigl(\mathrm{EQI}[c_k]^{2} - 2\cdot\mathrm{QI}[o_{ku}]\cdot\mathrm{EQI}[c_k] + \mathrm{QI}[o_{ku}]^{2}\bigr) \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k]\\
  &= \sum_{\{o_{ku} :\, P[o_{ku} \mid h_j\cdot b\cdot c_k] > 0\}} \mathrm{QI}[o_{ku}]^{2} \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k] \;-\; \mathrm{EQI}[c_k]^{2}\\
  &\le \sum_{\{o_{ku} :\, P[o_{ku} \mid h_j\cdot b\cdot c_k] > 0\}} \mathrm{QI}[o_{ku}]^{2} \cdot P[o_{ku} \mid h_i\cdot b\cdot c_k]\\
  &\le (\log\gamma)^{2}.
  \end{aligned}
  $$

  So, given *independence*,

  $$
  \underline{\mathrm{VQI}}[c^n \mid h_i/h_j \mid b] = \frac{1}{n}\cdot\sum_{k=1}^{n} \mathrm{VQI}[c_k \mid h_i/h_j \mid b] \le (\log\gamma)^{2}.
  $$
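The calculus claim about g(p) used in the second step can be checked numerically. The sketch below is an illustration only: the value of q and the grid resolution are arbitrary choices, and natural logarithms are assumed (the location of the maximum does not depend on the base). It scans g(p) = (log(p/q))^{2}·p over 0 < p ≤ q and compares the grid maximum with the predicted maximizer p = q/*e*^{2}, where g takes the value 4q/*e*^{2}.

```python
import math

def g(p, q):
    """g(p) = (log(p/q))^2 * p, for 0 < p <= q."""
    return math.log(p / q) ** 2 * p

q = 0.6                                   # arbitrary illustrative value
steps = 10000
grid = [q * t / steps for t in range(1, steps + 1)]

p_best = max(grid, key=lambda p: g(p, q))  # grid point with the largest g
p_star = q / math.e ** 2                   # maximizer predicted by g'(p) = 0

print("grid maximizer:", p_best)
print("predicted maximizer q/e^2:", p_star)
print("g at predicted maximizer:", g(p_star, q), "vs 4q/e^2 =", 4 * q / math.e ** 2)
print("g at p = q:", g(q, q))              # the minimum, where g vanishes
```

The grid maximizer lands within one grid step of q/*e*^{2}, and g vanishes at p = q, in line with the claim in the text.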