REHMT
UNPROVABLE THEOREMS AND FAST-GROWING FUNCTIONS Stephen G. Simpson #0. Introduction. ____________ This paper is a write-up of an expository talk which I have given tomathematical audiences at a number of universities in the United Statesand Europe over a period of several years. Preparation of this paper waspartially supported by NSF grant DMS-8317874. The purpose of the talk is to exposit some recent results (1977 andlater) in which mathematical logic has impinged upon finite combinatorics.Like most good research in mathematical logic, the results which I amgoing to discuss had their origin in philosophical problems concerning thefoundations of mathematics. Specifically, the results discussed here wereinspired by the following philosophical question. Could there be such athing as a comprehensive, self-contained discipline of finitecombinatorial mathematics? It is well known that a great deal of reasoning about finitecombinatorial structures can be carried out in a self-contained finitaryway, i.e. with no reference whatsoever to infinite sets or structures. Ihave in mind whole branches of mathematics such as finite graph theory,finite lattice theory, finite geometries, block designs, large parts offinite group theory (excluding character theory, in which use is made ofthe field of complex numbers), and large parts of number theory (includingthe elementary parts but excluding analytical techniques such as contourintegrals). One could easily imagine comprehensive textbooks of thesesubjects in which infinite sets are never mentioned, even tangentially.All of the reasoning in such textbooks would be concerned exclusively withfinite sets and structures. Consequently, there is a strong naive impression that the answer toour above-mentioned philosophical question is "yes." However, naive impressions can be misleading. I am going to discussthree recent results from mathematical logic which point to an answer of"no." Namely, I shall present three examples of combinatorial theoremswhich are finitistic in their statements but not in their proofs. Each ofthe three theorems is simple and elegant and refers only to finitestructures. Each of the three theorems has a simple and elegant proof.The only trouble is that each of the proofs uses an infinite set at somecrucial point. Moreover, deep logical investigations have shown that theinfinite sets are in fact indispensable. Any proof of one of these finitecombinatorial theorems must involve a detour through the infinite. Thus,in a strong relative sense, the three theorems are "unprovable" -- theycannot be proved by means of the finite combinatorial considerations interms of which they are stated. There is one somewhat technical point from mathematical logic which Imust now discuss. This involves the answer to a question which may havealready occurred to the reader. Namely, how might one establish that agiven finite combinatorial theorem cannot be given a finite combinatorialproof? We know how to recognize a finite combinatorial proof when we seeone. But how could we hope to establish that there is no finitecombinatorial proof of a given theorem? In order to do this, it would befirst of all necessary to give a precise definition of "finitecombinatorial proof" or at least "finite combinatorial provability." Such a definition has in fact been formulated and validated bymathematical logicians. The definition is unquestionably adequate in thatit is precise, rigorous, and captures the intuitive idea of "provabilityby finite combinatorial means." I do not want to give the detaileddefinition here. I shall however say something about the definition.(The details can be found in any good first-year graduate course ortextbook on the foundations of mathematics. For readers who are trainedin mathematical logic, let me remark that the notion I have in mind isthat of provability in the formal system PA of first-order Peanoarithmetic, also known as Z . Equivalently, one could consider 1provability in the formal system consisting of Zermelo-Fraenkel set theorywith the axiom of infinity replaced by "All sets are finite.") More specifically, there is a formal system consisting of appropriatesymbols, formulas, axioms, and rules of inference. For my purposes here Ishall refer to this formal system as T , the theory of finite sets and &finstructures. The language of T contains variables which can denote finarbitrary finite sets or finite structures of various sorts. Forinstance, there could be variables ranging over natural numbers, finitesequences of natural numbers, finite unordered sets, finite ordered sets,finite sequences of finite unordered sets, etc. Quantification over suchfinite structures is permitted. For instance, we can write down a formulasaying that for any finite structure of a certain sort there existsanother finite structure bearing a certain relationship to the first one,etc. There are also logical connectives which have their usual meaning.All valid methods of finite combinatorial inference are either formulatedexplicitly as rules of the formal system T or can be derived from rules *finwhich are are so formulated. For instance, T includes the powerful -finprinciple of proof by induction on the cardinality of a finitecombinatorial structure of a given sort. This principle applies to anyproperty of finite structures which can be expressed by a formula of thelanguage of T .
fin The point of T is that our intuitive but imprecise concept of fin"finite combinatorial provability" can be replaced by the preciselydefined notion of "provability in the formal system T ." This makes it 5finpossible for logicians to state and prove results to the effect thatparticular finite combinatorial theorems cannot be given finitecombinatorial proofs. The body of this paper is devoted to a fairly detailed discussion ofthe above-mentioned three examples of "unprovable" theorems. The threetheorems involve respectively colorings of finite sets, embeddings offinite trees, and exponential notation. I shall discuss the threeexamples one at a time together with appropriate background material. Ineach case I shall present the finite combinatorial theorem and at least asketch of its proof. I shall then give indications of why the theoremcannot be proved in T , i.e. why any proof must involve a detour through finthe infinite. This will require some discussion of ordinal numbers and ofa hierarchy of fast-growing functions. At the end of the paper there is an appendix describing subsequent,related research. (The material in the appendix was not included in thetalk on which this paper is based.) For further details I refer to theliterature, including several of the other papers in this volume. #1. A variant of Ramsey's Theorem. _ _______ __ ________ _______ Our first example of an "unprovable" theorem of finite combinatoricsconcerns colorings of the k-element subsets of a given finite set. It isclosely related to a famous theorem of Ramsey [31]. I begin by recallingthe Finite Ramsey Theorem. If X is any finite set, we write |X| = cardinality of X. We use i,j, k, l, m and n to denote positive integers. If X is any set, we kdenote by [X] the set of all k-element subsets of X. 1.1. FINITE RAMSEY THEOREM (Ramsey [31]). For all k, l and m, 5kthere exists n so large that: if |X| = n and [X] = C g ... g C , <1
l 6kthen there exists Y { X such that |Y| , m and [Y] { C for some i ;i, l. The reader who has no previous acquaintance with Ramsey's Theorem mayfind the following sociological illustration useful. Suppose that k = l =2. In this case the Finite Ramsey Theorem says the following. For any m,there exists n so large that, in any gathering of n people, there willalways be a set of at least m of them who all know each other, or none ofwhom know each other. Here C (respectively C ) is the set of unordered 1 2pairs of people who know each other (respectively do not know each other).For example, if m = 3 then we may take n = 6. This means that in anygathering of 6 people, there will always be at least three who either allknow each other or all don't know each other. In the statement of the Finite Ramsey Theorem, there is no harm inassuming that the sets C , ..., C are pairwise disjoint. In this case, 1 l kthe hypothesis [X] = C g ... g C is sometimes expressed by saying 1
lthat the k-element subsets of X have been colored with l colors C , ..., B1C . l The Finite Ramsey Theorem is definitely not our promised example ofan "unprovable" theorem. Like the vast majority of finite combinatorialtheorems, the Finite Ramsey Theorem has a finitistic proof which isstraightforwardly formalizable in T . I shall not give the proof here, #finbut in order to give the reader a feeling for this, let me mention thatthe proof yields an easily described upper bound on the size of n(expressed as a function of the other parameters k, l, and m). Let usdenote by f (k,l,m) the smallest n such that the conclusion of the RFinite Ramsey Theorem holds. The function f is known in the literature -Ras "the Ramsey function." One of the standard proofs of the Finite RamseyTheorem (namely the proof by "ramification") yields an upper bound ofapproximately !m+2k l & . | . * k+1 .
| l | l 7for f (k,l,m). It is also known that f (k,l,m) is bounded below by a R "Rsimilar expression. Thus f (k,l,m) is of superexponential growth rate Rin the parameter k. Such growth rates fall well within the strictures offinite combinatorial provability. We shall now state a finite combinatorial theorem which looks verysimilar to the Finite Ramsey Theorem and may appear to be only slightlystronger. However, we shall see that the logical properties and growthrates which are associated to this Modified Finite Ramsey Theorem arestrikingly different from those of the original Finite Ramsey Theorem. 1.2. MODIFIED FINITE RAMSEY THEOREM. For all k, l and m there :kexists n so large that: if X = {1,2,...,n} and if [X] = C g ... A1g C , then there exists Y { X such that |Y| , m, |Y| , min(Y), and l k[Y] { C for some i , l. i The Modified Finite Ramsey Theorem can be restated as follows. Afinite nonempty set Y of positive integers is said to be relativelylarge if |Y| , min(Y), i.e. the cardinality of Y is at least as greatas the minimum element of Y. For example {3,6,7} is relatively large and{6,7,9,11,15} is not relatively large. The Modified Finite Ramsey Theoremis just the Finite Ramsey Theorem with the conclusion strengthened to saythat in addition Y is relatively large. This is a transparentgeneralization of the Finite Ramsey Theorem. Consider for instance the sociological illustration which wasdiscussed earlier. In those terms, the special case k = l = 2 of theModified Finite Ramsey Theorem says the following. Given m, there existsn so large that the following holds. Consider an arbitrary set of n 6thpeople with the property that, for each j , n, the j person in the setis of the opinion that the number j is a very big number. Then therewill have to be a subset consisting of at least m people all of whomknow each other or all of whom do not know each other, such that inaddition the subset is very big according to the opinion of at least oneof the people in the subset. We now come to the main point of this section. The Modified FiniteRamsey Theorem was first formulated by Paris and Harrington in order toenable them to state the following unprovability result. 1.3. THEOREM (Paris-Harrington [30]). The Modified Finite RamseyTheorem is not provable in T . In other words, the Modified Finite finRamsey Theorem cannot be proved by finite combinatorial means alone. Thus the Modified Finite Ramsey Theorem is our first example of an"unprovable" combinatorial theorem. As explained above, the unprovabilityis only relative. The Modified Finite Ramsey Theorem is a validmathematical theorem which is proved by standard combinatorial methods.The only unusual feature is that, although the statement of the theorem isstrictly finitistic, any proof of the theorem must necessarily involve theinfinite. For interest's sake I shall now give the proof of the Modified FiniteRamsey Theorem. The proof is simple but uses infinite sets. It is basedon the following well-known infinitistic version of Ramsey's Theorem. 1.4. INFINITE RAMSEY THEOREM (Ramsey [31]). Let X be an infinite
kset. If [X] = C g ... g C , then there exists an infinite set Y { 1
l kX such that [Y] { C for some i , l. i I shall deduce the Modified Finite Ramsey Theorem from the InfiniteRamsey Theorem. Suppose that the Modified Finite Ramsey Theorem is false.Then for some fixed k, l, and m, there is no n satisfying theconclusion of the theorem. In other words, for each n there is a k n
ncoloring [{1,...,n}] = C g ... g C such that for no Y { {1,...,n} 1
l 4k nand i , l do we have |Y| , m, |Y| , min(Y), [Y] { C . Let X be the 9iset of all positive integers. For each i , l let C be the set of all 6i ?n(k+1)-element subsets of X of the form F g {n+1} with F m C . Thus ?i k+1[X] = C g ... g C and there is no finite Y { X such that |Y| , m 1
l k+1+ 1, |Y| , min(Y) + 1, [Y] { C for some i , l. Hence there is no #i k+1infinite Y { X such that [Y] { C for some i , l. This is in &icontradiction to the Infinite Ramsey Theorem. As indicated above, Paris and Harrington have shown that the ModifiedFinite Ramsey Theorem cannot be given a finite combinatorial proof. Inorder to give some indication of why this is so, we shall now discuss theassociated fast-growing function. Given k, l, and m, let f (k,l,m) be =PHthe smallest n which satisfies the conclusion of the Modified FiniteRamsey Theorem. Recall that f (k,l,m) is the analogous function for the Roriginal Finite Ramsey Theorem. One way to highlight the differencebetween the Finite Ramsey Theorem and the Modified Finite Ramsey Theoremis to compare the growth rates of f and f . We shall see that f grows #R PH PHmuch faster than f . R We have already described the growth rate of f . In order to 3Rdescribe the growth rate of f , it will be necessary to discuss a PHset-theoretic construction due to Cantor known as the ordinal numbers. Wedo not assume that the reader has any previous acquaintance with ordinalnumbers. The ordinal numbers are best understood as a transfinite extension ofthe natural number sequence. The natural number sequence is generated bystarting with 0 and then repeatedly adding 1. If we imagine the naturalnumber sequence 0, 1, 2, ... as a completed totality, it is reasonable toposit a new symbol w which is to be regarded as the limit of thissequence. We can then then continue the sequence by resuming the repeatedaddition of 1. Thus we obtain an extended sequence 0, 1, 2, ..., w, w+1, w+2, .... The same idea can be used to generate further extensions. If weimagine that the sequence w, w+1, w+2, ... has been similarly completed,it is reasonable to denote the limit by some such expression as w+w orw.2. We can then resume the process of adding 1's. This gives 0, 1, 2, ..., w, w+1, w+2, ..., w+w = w.2, w.2+1, w.2+2, ....Denoting the limit of w.2+1, w.2+2, ..., w.2+n,... by w.3 andcontinuing as before, we obtain 0, 1, 2, ..., w, w+1, w+2, ..., w.2, ..., w.3, ..., w.4, .... 2If we now let w.w = w denote the limit of the subsequence w, w.2,w.3, ..., w.n, ..., we can continue as before. Thus we have a process ofgenerating longer and longer extensions of the natural number sequence.This can be pictured as follows. 0, 1, 2, ..., w, w+1, w+2, ..., w+w = w.2, +2 2 2 2 ..., w.3, ..., w.4, ..., w.w = w , ..., w +w = w .2, 3 4 w ..., w , ..., w , ..., w , w w w w w ..., w , ..., w , ..., e = lim w , ... &0 n nwhere by definition e is the limit of the sequence w , w , ..., 0 1 2w , ... with n #w & ". !. | w = . * n. n w | w 7Two operations used repeatedly in this process are, first, the addition of1, and second, the introduction of an expression intended to denote thelimit of a cofinal increasing sequence of previously introducedexpressions. Each of the expressions so generated is called an ordinalnumber. If a and b are ordinal numbers, we write a < b to indicate thata comes before b in the process of generating the ordinal numbers.Thus < is an extension of the usual ordering of the natural numbers.The ordinal numbers which we have introduced fall into three mutuallyexclusive classes. Each ordinal number is either 0, a successor ordinala+1, or a limit ordinal d = lim d[n] where n d[1] < d[2] < ... < d[n] < ....Operations of ordinal addition and exponentiation are defined in such away that a+0 = a, a+(b+1) = (a+b)+1, a+d = lim(a+d[n]), a.1 = a, /n %0 a+1 a d d[n]a.(n+1) = (a.n)+a, a.w = lim a.n, w = 1, w = w .w, w = lim w . n %n The account of the ordinal numbers which we have given above israther sketchy and imprecise. A rigorous set-theoretic development wouldproceed as follows. Let A be a linearly ordered set. We say that A iswell-ordered if every nonempty subset of A has a first element. Twowell-ordered sets which are isomorphic are said to have the same ordertype. An ordinal number is then defined to be the order type of anywell-ordered set. This definition permits a convenient set-theoreticaldiscussion of arithmetical operations on the ordinal numbers. Forinstance, if a and b are the order types of the well-ordered sets Aand B respectively, then a+b is defined to be the order type of thewell-ordered set which consists of A followed by B. Similarly a.b isdefined to be the order type of the well-ordered set which results upon Fbreplacing each element of B by an isomorphic copy of A. Likewise a isthe order type of the set of functions from B into A with finitesupport, ordered by last difference. Each ordinal number less than e can be uniquely expressed as an %0exponential polynomial in w. A fairly typical example of an ordinalnumber less than e is 0 w+1 w w+1 w + w + w.This representation in terms of exponential polynomials is known as theCantor Normal Form. We shall now indicate how ordinal numbers can be used to describe thegrowth rate of the function f which was defined above in terms of the PHModified Finite Ramsey Theorem. To each ordinal a , e we shall 70associate a function f from positive integers to positive integers. As aexplained above, the ordinal numbers , e are generated from 0 by (0repeatedly adding 1 and taking limits of increasing sequences d = lim Dnd[n] where d[1] < d[2] < ... < d[n] < .... We define f (n) = n + 1, 0 $n f (n) = f f ...f (n) = f (n), a+1 a a a a ^---(---& n f (n) = f (n) for limit ordinals d , e . d d[n] 0Of course this definition presupposes that for each limit ordinal d , e H0we have specified an increasing sequence d[1], d[2], ..., d[n], ... suchthat d = lim d[n]. We omit the details of exactly how this is done. See nfor instance the paper by Buchholz and Wainer [7] in this volume. In order to illustrate the properties of the functions f , a , e , >a 0let us consider the first few of these functions. We have f (n) = n + 1,
0 f (n) = n + 1 + 1 ... +1 = n + n = 2.n,
1 ^-----(------& n 'n
n f (n) = 2.2.....2.n = 2 .n _ 2 ,
2 ^---(---& n n 2 & 2 & . |
. | . | . | f (n) _ . * n _ . * n,
3 2 |
2 | 2 7 2 7and the growth rate of f (n) is already too fast to be described in 4exponential notation. Note that f has approximately the same growth #3rate as the Ramsey function f . R In general, it can be shown that, for all a < b , e , f grows 90 bmuch faster than f . In particular f is eventually dominated by f a a bin the sense that f (n) < f (n) for all sufficiently large n. Thus we a bhave a hierarchy of fast-growing functions which are indexed by theordinal numbers , e and are ordered by eventual domination. This 0hierarchy is due to Wainer and is known as the Wainer hierarchy. We are now in a position to describe the growth rate of theParis-Harrington function f which is associated with the Modified PHFinite Ramsey Theorem. Namely, it has been shown that the growth rate off is approximately the same as that of f . (This result is due to PH )e -0Paris and Harrington [30] using results of Wainer. See also the papers byKetonen-Solovay [21] and Buchholz-Wainer [7] and #6.3 of the monograph byGraham, Rothschild and Spencer [19].) In particular f eventually dominates f for all a < e . By PH a 0contrast f is eventually dominated by f . Thus f grows much, much R 4
PHfaster than f . R The remarkable aspect of all this is that there is really no way todescribe the growth rate of f except by recourse to ordinal numbers. PHLogicians have shown that the ordinal numbers up to and including e are D0in a sense implicit or unavoidable in the Modified Finite Ramsey Theorem.This is closely related to the fact that the Modified Finite RamseyTheorem cannot be proved in T . (An important classical theorem of finmathematical logic is that e is the limit of the ordinal numbers which 0are in a certain sense accessible within T .) *fin #2. Embeddability of finite trees. _____________ __ ______ _____ Our second example of an "unprovable" finite combinatorial theoremwill be concerned with embeddability of finite trees. We begin bypresenting the relevant definitions. A finite tree is a finite, partially ordered set T such that: (i) Thas a minimum element (called the root of T); (ii) for each x m T, thepredecessors of x in T are linearly ordered. Thus a finite tree can be pictured as a diagram which looks likethis: ^
. . . | . . . . . | !. . . T 8 ". . | | . 6 . #- root of T.Note that any two elements of a finite tree have a greatest lower bound.The greatest lower bound of x and y is denoted x c y. Let T ant T' be finite trees. An embedding of T into T' is aone-to-one mapping o: T 3 T' such that for all x, y m T, o(x c y) =o(x) c o(y). (It follows easily that x , y if and only if o(x) ,o(y).) We say that T is embeddable into T' if there exists an embeddingof T into T'. We write T " T' to mean that T is embeddable into T'. An important result about embeddability of finite trees, due to J. B.Kruskal [24], is the following. There is no infinite set of pairwise non-embeddable finite trees. We shall use the following equivalentformulation of this result. 2.1. KRUSKAL'S THEOREM. For any infinite sequence of finite treesT , T , ..., T , ..., there exist indices i and j such that i < j and T 1 2 n 9iis embeddable into T . j Kruskal's Theorem itself is not our promised example of an"unprovable" finite combinatorial theorem about finite trees. Indeed, thestatement of Kruskal's Theorem involves an infinite sequence and istherefore definitely not finitary. (Kruskal's Theorem cannot even bestated in the language of T .) However H. Friedman has shown that finKruskal's Theorem has a certain consequence which is finitary in itsstatement, but not in its proof. This consequence, known as Friedman'sFinite Form, is our promised example. 2.2. FRIEDMAN'S FINITE FORM OF KRUSKAL'S THEOREM. For any positiveinteger c, there exists a positive integer n = n which is so large 2cthat the following holds. If T , T , ..., T is any finite sequence of 1 2 nfinite trees with |T | , c.i for all i , n, then there exist indices i iand j such that i < j , n and T " T . !i j Clearly Friedman's Finite Form is a finitary statement, referringonly to finite sets and structures. Hence Friedman's Finite Form can bestated within T . The following unprovability theorem is due to finFriedman (198l, unpublished). For a full exposition of the theorem andits proof, see Simpson [37]. 2.3. THEOREM. Friedman's Finite Form of Kruskal's Theorem is notprovable in T . In other words, Friedman's Finite Form of Kruskal's
finTheorem cannot be proved by finite combinatorial methods alone. In this result, the bound of c.i can be strengthened considerably.See Loebl-Matousek [26] (this volume) and Smith [39]. Let me now indicate how Friedman's Finite Form can be derived fromKruskal's Theorem. Suppose for a contradiction that Friedman's FiniteForm is false. Let t be the set of all finite sequences ;T ,T ,...,T : c '1 2 nsuch that |T | , c.i for all i , n, and T " T for no i and j with i <
i i jj , n. Then for some c, t is infinite. If we partially order this t c ,cby extension, we see that it is an infinite, finitely branching tree.
~Hence by Konig's Infinity Lemma [23], it has an infinite path. In otherwords, there is an infinite sequence of finite trees ;T ,T ,...,T ,...: 81 2 nsuch that |T | , c.i for all i, and T " T for no i and j with i < j.
i i jThis contradicts Kruskal's Theorem. =~ In the above derivation, it is interesting to note how Konig's Lemma 7~was applied in a non-traditional way. Traditionally, Konig's Lemma isused to derive infinitistic theorems from finitistic ones. For example ~Konig's Lemma can be used to deduce the infinite form of Dilworth's 5~Theorem from the finite form. Indeed, the title of Konig's paper [23]translates to: "On a method of drawing conclusions from the finite to the ~infinite." This means that Konig must have viewed his lemma as a one-waypipeline from the finite to the infinite. However, there is no reason notto reverse the flow and derive finitistic statements from infinitisticones. Such was the procedure of the previous paragraph. By Theorem 2.3, Friedman's Finite Form cannot be proved without someuse of infinite sets. Actually Friedman obtained a much stronger result.Namely, neither Kruskal's Theorem nor Friedman's Finite Form can be provedwithout at least some weak use of uncountable sets. (The precise resultis that Friedman's Finite Form cannot be proved in the formal system ATR . H0For a proof and references to the literature, see Simpson [37].) ThusFriedman's Finite Form of Kruskal's Theorem is, so to speak, even moreunprovable than the Modified Finite Ramsey Theorem! This higher degree of unprovability is reflected in the associatedgrowth rate. Using the notation n as in the statement of Friedman's #cFinite Form, it is natural to ask the following question. How fast doesn grow as a function of c? The answer is that n grows faster than c 2cthe Wainer function f where G is a certain ordinal number which is G
0 0much, much larger than e . In order to define G , let OR be the class 0 0of all ordinal numbers and consider the following hierarchy of continuousincreasing functions f : OR 3 OR where a ranges over OR. Given b m OR a b %thwe define f (b) = w and, for all a > 0, f (b) = the b simultaneous 0 !afixed point of the functions f , a' < a. Then G is defined to be the a' 0least ordinal g > 0 such that f (b) < g for all a, b < g. This is a "avery large ordinal number. By comparison e = f (0) is miniscule. The ,0 1Wainer hierarchy can be extended through the ordinals , G . It can then :0be shown that f is eventually dominated by the Friedman function n (as G 4c 0a function of c). Indeed, it can be shown that the Friedman functioneventually dominates f for certain ordinals a which are somewhat alarger than G .
0 What about n for small values of c? This question points to canother remarkable aspect of Friedman's Finite Form. Consider forinstance c = 12. Please bear in mind that n is a single, specific ,12positive integer. (Namely, it is the smallest positive integer n withthe property that, in any sequence of finite trees T , T , ..., T with 51 2 n|T | , 12.i for all i , n, there is at least one embedding T " T , i < i ;i jj , n.) Hence the existence of n can be proved in T . However, it !12 finturns out that n is extremely large. It can be shown that n is much 12 ,12larger than !2 & . | . | . * 1000 2 | 2 7 .In fact, Friedman has shown that any proof of the existence of n within @12T would require more than fin !2 & . | . | . * 1000 2 | 2 7sheets of paper! Hence there is no hope for a manageable proof of theexistence of n within T . (For details see Smith [39].) This version 12 finof Theorem 2.3 is interesting from the viewpoint of ultrafinitism. #3. Exponential notation. ___________ ________ Our third example of an "unprovable" finitistic theorem is number-theoretical rather than combinatorial. The theorem is concerned with oneof the most elementary number-theoretic concepts, namely base bnotation. (The special case b = 10 is the usual decimal notation.) Let b be a positive integer , 2. Any nonnegative integer n canbe written uniquely in the form n n 1 k n = b .c + ... + b .c "1 kwhere k , 0, n > ... > n , 0, and 0 < c < b for i = 1, ..., k. l
k iThis is known as the base b representation of n. For example the base 2representation of 266 is #8 3 266 = 2 + 2 + 2. We may extend this by writing each of the exponents n , ..., n in ;1 kbase b notation, then doing the same for each of the exponents in theresulting representations, ... until the process stops. This yields theso-called complete base b representation of n. For example the completebase 2 representation of 266 is obtained as: !8 3 266 = 2 + 2 + 2 "3 !2 2+1 = 2 + 2 + 2 "2+1 !2 2+1 = 2 + 2 + 2. We now present the definitions which are needed to state the third"unprovable" theorem. Let R (n) be the nonnegative integer which bresults if we take the complete base b representation of n and thenreplace each b by b+1. Thus R is a "base change" function. For "bexample "3+1 !3 3+1 R (266) = 3 + 3 + 3. 2For each nonnegative integer n we recursively define a sequence ofnonnegative integers (n) , (n) , ..., (n) , ... by 0 1
k (n) = n, 0 ^ R ((n) ) - 1 if (n) > 0, | k+2 k k | (n) = 8 k+1 | | 6 0 if (n) = 0, 7kfor all k , 0. This is known as the Goodstein sequence beginning withn. For example the Goodstein sequence beginning with 266 is !2+1 2 2+1 (266) = 266 = 2 + 2 + 2, 0 3+1 3 3+1 (266) = 3 + 3 + 3 - 1 1 3+1 3 3+1 = 3 + 3 + 2, 4+1 4 4+1 (266) = 4 + 4 + 1, 2 5+1 5 5+1 (266) = 5 + 5 , 3 6+1 6 6+1 (266) = 6 + 6 - 1 4 6+1 6 6 5 = 6 + 6 .5 + 6 .5 + ... + 6.5 + 5, 7+1 7 7 5 (266) = 7 + 7 .5 + 7 .5 + ... + 7.5 + 4, 5 . . . . We can see that the Goodstein sequence beginning with 266 appearsinitially to increase very rapidly. Nevertheless, despite appearances, itcan be proved that this sequence converges to 0. In other words, (266) = Gk0 for all sufficiently large k. This surprising conclusion is due toGoodstein [18] who actually proved the same result for all Goodsteinsequences: 3.1. GOODSTEIN'S THEOREM. For all n there exists k such that (n) = Gk0. In other words, every Goodstein sequence converges to 0. Obviously the statement of Goodstein's Theorem is totally finitistic,referring only to natural numbers and elementary operations on them. Onthe other hand, we shall see that Goodstein's proof of his theorem isdefinitely not finitistic. Moreover Kirby and Paris [22] have shown thatno finitistic proof of the theorem is possible. Thus Goodstein's Theoremwill turn out to be our promised example of an "unprovable" number-theoretic theorem. I shall now sketch the proof of Goodstein's Theorem. The proof isinfinitistic in that it makes direct use of the ordinal numbers up to e . H0 9wGiven b , 2 and any natural number n, we denote by R (n) the result 9bof taking the complete base b representation of n and replacing each !woccurrence of b by w. Thus R (n) is the Cantor Normal Form of some !bordinal number less than e . For example 0 #w+1 w w w+1 R (266) = w + w + w. 2 w w $wNote that R (m) < R (n) if and only if m < n. Also R (0) = 0, and b b $b w wR (n) = R R (n). Now given n, consider the Goodstein sequence b b+1 b (n) , (n) , ..., (n) , (n) , ... 0 1
k k+1and the corresponding sequence of ordinal numbers
w
w w w R ((n) ), R ((n) ), ..., R ((n) ), R ((n) ), ....
2 0 3 1 k+2 k k+3 k+1For all k such that (n) > 0, we have k w w R ((n) ) = R (R ((n) )-1) k+3 k+1 k+3 k+2 k %w !< R (R ((n) ) %k+3 k+2 k %w != R ((n) ). %k+2 kSince there is no infinite descending sequence of ordinal numbers, itfollows that (n) = 0 for some k. This completes the proof. k We now state the following unprovability theorem due to Kirby andParis. 3.2. THEOREM (Kirby-Paris [22]). Goodstein's Theorem 3.1 is notprovable in T . In other words, Goodstein's Theorem cannot be proved
finby finite combinatorial methods alone. This gives our third example of an "unprovable" finite combinatorialtheorem. As in the case of our other two examples, the "unprovability" isreflected in the very rapid growth rate of a certain function. For all nlet f (n) be the smallest k such that (n) = 0. Kirby and Paris have G 'kshown that the Goodstein function f grows approximately as fast as $Gf . In particular f eventually dominates f for all a < e . (An e G a 0 0extremely elegant, computational proof of these results has been given byCichon [10]. See also Buchholz and Wainer [7].) #4. Appendix $________ In this section I shall discuss various developments which arerelated to the results of ##1-3. "* * * Ramsey's Theorem is only the beginning of a highly developed field ofresearch. For an excellent but by now somewhat dated survey of thisrapidly changing field, see the monograph Ramsey Theory by Graham,Rothschild, and Spencer [19]. I want to discuss some results in RamseyTheory which have some bearing on "unprovable" finite combinatorialtheorems. One of the best-known generalizations of Ramsey's Theorem is the #~ 'Canonical Ramsey Theorem due to Erdos and Rado (see # 5.5 of [19]).Recently Kanamori and McAloon [20] have used the Canonical Ramsey Theoremto give an elegant alternative to the results of Paris and Harrington. 6kFor any set X of positive integers, a coloring f: [X] 3 X is said to be .kregressive if f(F) < min(F) for all F m [X] such that min(F) > 1.The Kanamori-McAloon version of the Modified Finite Ramsey Theorem readsas follows. 4.1. THEOREM. For all k and m there exists n so large that, for all kregressive f: [X] 3 X = {1,2,...,n}, there exists Y { X such that |Y| , m kand, for F m [Y] , f(F) depends only on min(F). Kanamori and McAloon have shown that this statement has the samelogical and growth-rate properties as the Paris-Harrington statement 1.2.In particular, the Kanamori-McAloon statement is not provable in T . Bfin The advantage of this approach is that the proof of the Kanamori-McAloon unprovability result is much simpler than the proof of thecorresponding Theorem 1.3 of Paris and Harrington. To those who wish topresent a version of the Paris-Harrington result in a first-year graduatelogic course but do not want to spend too much time on the combinatorialdetails, I recommend the first part of Kanamori-McAloon [20]. "* * * In recent years there have appeared a number of infinitistic Ramsey-type theorems which are topological in nature. The prototypical result ofthis kind is the following theorem due to Galvin and Prikry. For any %infinite set X, let [X] be the set of all infinite subsets of X. We
%regard [X] as a topological space with basic open sets of the form X % o = {Y m [X] : F { Y and G f Y = O} FGwhere F and G are disjoint finite subsets of X. 4.2. GALVIN-PRIKRY THEOREM ([17]). Let X be an infinite set. If % @%[X] = C g ... g C with each C closed, then there exists Y m [X] such 1
l
i %that [Y] { C for some i.
i ;% (The hypothesis that each C be a closed subset of [X] can be iweakened considerably. For instance, Galvin and Prikry proved theirtheorem with "closed" replaced by "Borel." The sharpest result in thisdirection is due to Ellentuck. For details see [9].) Friedman, McAloon and Simpson [15] have shown that, despite itsinfinitary nature, the Galvin-Prikry Theorem 4.2 is relevant to finitecombinatorics. Indeed, Theorem 4.2 can be converted into a finitarystatement, just as Paris and Harrington converted the Infinite RamseyTheorem 1.4 into the Modified Finite Ramsey Theorem 1.2. (More precisely,the miniaturization procedure of [15] is modeled on the original idea ofParis [29] rather than on Paris-Harrington [30].) This finitary statementdue to Friedman, McAloon and Simpson turns out to be unprovable not onlyin T but also in the much stronger theory ATR which was mentioned near fin )0the end of #2. In addition, the associated fast-growing function grows atabout the same rate as f . (These results are sharp in the sense that G 0ATR is the weakest natural subsystem of second order arithmetic in which 04.2 is provable. For details see [15].) As mentioned above, the Galvin-Prikry Theorem has served as theprototype for a large number of extensions and generalizations of theInfinite Ramsey Theorem. A comprehensive survey of this area can be foundin [9]. Let me describe only one of these results, due to Carlson andSimpson [8]. For any set X, a partition of X is a pairwise disjointcollection of nonempty subsets of X whose union is all of X. If X is %infinite, let (X) be the set of all infinite partitions of X. For any Y %m (X) , put %
% (Y) = {Z m (X) : Z is coarser than Y}.
%We regard (X) as a topological space with basic open sets of the form X
% o = {Y m (X) : Y$F = P} Pwhere: P is a partition of a finite subset F of X; Y$F is the restrictionof Y to F, i.e. Y$F = {y f F: y m Y and y f F = O}. 4.3. DUAL GALVIN-PRIKRY THEOREM (Carlson and Simpson [8]). Let X %be an infinite set. If (X) = C g ... g C with each C closed, then 1
l
i % %there exists Y m (X) such that (Y) { C for some i. (i (As in the case of Theorem 4.2, the hypothesis "C closed" in Theorem 6i4.3 can be weakened considerably. For details see [8].) In view of the results of Friedman, McAloon and Simpson which werediscussed above, it is natural to ask whether Theorem 4.3 has any finitecombinatorial consequences which are unprovable in relatively strongsubsystems of second order arithmetic. (This was my original motivationfor formulating the Dual Galvin-Prikry Theorem. See #7 of [8].) Almostnothing is known about this question. It is known that Theorem 4.2 can bededuced from Theorem 4.3 as an easy corollary. Hence by [15] the growthrate associated to Theorem 4.3 is at least as great as f . However, it 8G 90is not known to be any greater. Similarly, almost nothing is known aboutthe following, closely related, axiomatic question. What set existenceaxioms are needed to prove Theorem 4.3? It follows from [15] that atleast the axioms of ATR are needed, and it is reasonable to conjecture 0that much more is needed, but this conjecture remains open. Analogousremarks could be made about the other generalizations and extensions ofthe Galvin-Prikry Theorem which are discussed in [9]. (Note added January 31, 1986. Recent work of Blass, Hirst and ____ _____ _______ ___ ____Simpson [4] yields some non-trivial upper bounds on the set existenceaxioms and growth rates associated with Hindman's Theorem and the DualGalvin-Prikry Theorem. However, more work remains to be done.) "* * * A large part of modern Ramsey Theory is concerned with the followingtheorem of van der Waerden. Let N be the set of nonnegative integers. Afinite arithmetical progression is a subset of N of the form {a, a + d, a + 2.d, ..., a + k.d}. 4.4. VAN DER WAERDEN'S THEOREM. If N = C g ... g C then, for /1
lsome i, C contains arbitrarily long finite arithmetical progressions. i A equivalent statement of Van der Waerden's Theorem reads as follows.For any k and l, there exists n so large that, if {1, 2, ..., n} = C g E1... g C , then for some i, C contains an arithmetical progression of l ilength k. Define W (k) to be the smallest n such that this conclusion lholds. In view of #1, it is natural to ask: What is the growth rate ofW (k) as a function of l and k? l The currently known upper bound for W (k) is something like the *lAckermann function, i.e. the function f of #1. The problem of improving 'wthis bound is open, famous, and apparently difficult. (See for instancethe paper by Brackin [5] in this volume.) It is unknown whether W (k) is eventually dominated by f (k) for any 2 nfixed n. Should this turn out not to be the case, an interestingunprovability result would follow. Namely, it would follow that Van der D+Waerden's Theorem is not provable in the formal systems WKL and WKL of ;0 0 5+[38]. This would be interesting because WKL and WKL are known to be ,0 0adequate for the development of a large part of ordinary mathematics.Thus we would be able to conclude that any proof of Van der Waerden'sTheorem must use strong methods which go beyond those that are usuallyneeded in mathematics. For instance, the standard combinatorial proof of Van der Waerden'sTheorem uses a strong method known as double induction. For each fixed k,the existence of W (k) is proved by induction on l assuming the existence lof W (k') for all k' < k and all l'. This type of induction is l' +unavailable in WKL and WKL and is rarely needed in mathematical 0 0practice. The interesting problem is whether all proofs of Van derWaerden's Theorem must necessarily involve such unusual methods. Similar remarks apply to the theorems of Hales-Jewett, Graham-Leeb- 'Rothschild, and Szemeredi. (For an exposition of these theorems, seeChapter 2 of [19].) Each of these theorems is a generalization of Van derWaerden's Theorem. In each case, the corresponding growth rate is notknown to be eventually dominated by f for any finite n, and it would be %nvery interesting if it were not so dominated. "* * * Furstenberg and his coworkers (see [16]) have developed a beautifulapproach to Van der Waerden's Theorem and related results. Namely, theyhave shown that such results can often be deduced from general theorems oftopological dynamics and ergodic theory. (In some cases, the process canbe reversed so that the combinatorial theorems are in fact interchangablewith their dynamical counterparts.) Thus Furstenberg and his coworkershave provided a sort of analytical or perhaps geometrical context for thisbranch of combinatorics. See for instance the paper by Bergelson [3] inthis volume. From our viewpoint here, Furstenberg's approach is intriguing becauseit introduces, into finite combinatorics, methods which appear to behighly abstract and set-theoretical. For instance, Furstenberg's proof of 'Szemeredi's Theorem (Chapter 7 of [16]) uses a complicated structuralanalysis of measure-preserving transformations. This seems to require aheavy dose of transfinite induction. Similarly, the Furstenberg-Weissproof of Hindman's Theorem (Chapter 8 of [16]) involves Zorn's Lemmaapplied to a non-metrizable Tychonoff product space. (See the proof ofLemma 8.4 in [16].) Developments such as these raise the question ofwhether the highly abstract methods are ever really needed in order toprove particular combinatorial theorems. If they are in fact needed, thenthis would probably lead to some results concerning unprovability andfast-growing functions. Unfortunately, at the present time, there isnothing but speculation here. (Note added January 31, 1986. Recent research of Blass, Hirst and ____ _____ _______ ___ ____Simpson [4] seems to indicate that highly abstract, set-theoreticalmethods are less essential for the study of dynamical systems than theymight at first appear to be. For instance, the main results of Chapter 8of [16], including the Auslander-Ellis Theorem 8.7, can be proved in a A+relatively weak subsystem of second order arithmetic known as ACA . A0However, the question of which set existence axioms are needed to prove 'Szemeredi's Theorem is still open.) "* * *