
PRINCIPIA MATHEMATICA
BY A. N. WHITEHEAD AND BERTRAND RUSSELL

Principia Mathematica was first published in 1910-13; this is the fifth impression of the second edition of 1925-7. The Principia has long been recognized as one of the intellectual landmarks of the century. It was the first book to show clearly the close relationship between mathematics and formal logic. Starting from a minimal number of axioms, Whitehead and Russell display the structure of both kinds of thought. No other book has had such an influence on the subsequent history of mathematical philosophy.

PRINCIPIA MATHEMATICA
BY ALFRED NORTH WHITEHEAD AND BERTRAND RUSSELL, F.R.S.
VOLUME I
SECOND EDITION
CAMBRIDGE AT THE UNIVERSITY PRESS 1963

PUBLISHED BY THE SYNDICS OF THE CAMBRIDGE UNIVERSITY PRESS
Bentley House, 200 Euston Road, London, N.W. 1
American Branch: 32 East 57th Street, New York 22, N.Y.
West African Office: P.O. Box 83, Ibadan, Nigeria

First Edition 1910
Second Edition 1927
Reprinted 1950, 1957, 1960, 1963

First printed in Great Britain at the University Press, Cambridge
Reprinted by offset-litho by Messrs Lowe & Brydone (Printers) Ltd., London, N.W. 10

PREFACE

THE mathematical treatment of the principles of mathematics, which is the subject of the present work, has arisen from the conjunction of two different studies, both in the main very modern. On the one hand we have the work of analysts and geometers, in the way of formulating and systematising their axioms, and the work of Cantor and others on such matters as the theory of aggregates. On the other hand we have symbolic logic, which, after a necessary period of growth, has now, thanks to Peano and his followers, acquired the technical adaptability and the logical comprehensiveness that are essential to a mathematical instrument for dealing with what have hitherto been the beginnings of mathematics. From the combination of these two studies two results emerge, namely (1) that what were formerly taken, tacitly or explicitly, as axioms, are either unnecessary or demonstrable; (2) that the same methods by which supposed axioms are demonstrated will give valuable results in regions, such as infinite number, which had formerly been regarded as inaccessible to human knowledge. Hence the scope of mathematics is enlarged both by the addition of new subjects and by a backward extension into provinces hitherto abandoned to philosophy.

The present work was originally intended by us to be comprised in a second volume of The Principles of Mathematics. With that object in view, the writing of it was begun in 1900. But as we advanced, it became increasingly evident that the subject is a very much larger one than we had supposed; moreover on many fundamental questions which had been left obscure and doubtful in the former work, we have now arrived at what we believe to be satisfactory solutions. It therefore became necessary to make our book independent of The Principles of Mathematics. We have, however, avoided both controversy and general philosophy, and made our statements dogmatic in form. The justification for this is that the chief reason in favour of any theory on the principles of mathematics must always be inductive, i.e. it must lie in the fact that the theory in question enables us to deduce ordinary mathematics.
In mathematics, the greatest degree of self-evidence is usually not to be found quite at the beginning, but at some later point; hence the early deductions, until they reach this point, give reasons rather for believing the premisses because true consequences follow from them, than for believing the consequences because they follow from the premisses.

In constructing a deductive system such as that contained in the present work, there are two opposite tasks which have to be concurrently performed. On the one hand, we have to analyse existing mathematics, with a view to discovering what premisses are employed, whether these premisses are mutually consistent, and whether they are capable of reduction to more fundamental premisses. On the other hand, when we have decided upon our premisses, we have to build up again as much as may seem necessary of the data previously analysed, and as many other consequences of our premisses as are of sufficient general interest to deserve statement. The preliminary labour of analysis does not appear in the final presentation, which merely sets forth the outcome of the analysis in certain undefined ideas and undemonstrated propositions. It is not claimed that the analysis could not have been carried farther: we have no reason to suppose that it is impossible to find simpler ideas and axioms by means of which those with which we start could be defined and demonstrated. All that is affirmed is that the ideas and axioms with which we start are sufficient, not that they are necessary.

In making deductions from our premisses, we have considered it essential to carry them up to the point where we have proved as much as is true in whatever would ordinarily be taken for granted. But we have not thought it desirable to limit ourselves too strictly to this task. It is customary to consider only particular cases, even when, with our apparatus, it is just as easy to deal with the general case. For example, cardinal arithmetic is usually conceived in connection with finite numbers, but its general laws hold equally for infinite numbers, and are most easily proved without any mention of the distinction between finite and infinite. Again, many of the properties commonly associated with series hold of arrangements which are not strictly serial, but have only some of the distinguishing properties of serial arrangements. In such cases, it is a defect in logical style to prove for a particular class of arrangements what might just as well have been proved more generally. An analogous process of generalization is involved, to a greater or less degree, in all our work. We have sought always the most general reasonably simple hypothesis from which any given conclusion could be reached. For this reason, especially in the later parts of the book, the importance of a proposition usually lies in its hypothesis. The conclusion will often be something which, in a certain class of cases, is familiar, but the hypothesis will, whenever possible, be wide enough to admit many cases besides those in which the conclusion is familiar.

We have found it necessary to give very full proofs, because otherwise it is scarcely possible to see what hypotheses are really required, or whether our results follow from our explicit premisses. (It must be remembered that we are not affirming merely that such and such propositions are true, but also that the axioms stated by us are sufficient to prove them.)
At the same time, though full proofs are necessary for the avoidance of errors, and for convincing those who may feel doubtful as to our correctness, yet the proofs of propositions may usually be omitted by a reader who is not specially interested in that part of the subject concerned, and who feels no doubt of our substantial accuracy on the matter in hand. The reader who is specially interested in some particular portion of the book will probably find it sufficient, as regards earlier portions, to read the summaries of previous parts, sections, and numbers, since these give explanations of the ideas involved and statements of the principal propositions proved. The proofs in Part I, Section A, however, are necessary, since in the course of them the manner of stating proofs is explained. The proofs of the earliest propositions are given without the omission of any step, but as the work proceeds the proofs are gradually compressed, retaining however sufficient detail to enable the reader by the help of the references to reconstruct proofs in which no step is omitted.

The order adopted is to some extent optional. For example, we have treated cardinal arithmetic and relation-arithmetic before series, but we might have treated series first. To a great extent, however, the order is determined by logical necessities.

A very large part of the labour involved in writing the present work has been expended on the contradictions and paradoxes which have infected logic and the theory of aggregates. We have examined a great number of hypotheses for dealing with these contradictions; many such hypotheses have been advanced by others, and about as many have been invented by ourselves. Sometimes it has cost us several months' work to convince ourselves that a hypothesis was untenable. In the course of such a prolonged study, we have been led, as was to be expected, to modify our views from time to time; but it gradually became evident to us that some form of the doctrine of types must be adopted if the contradictions were to be avoided. The particular form of the doctrine of types advocated in the present work is not logically indispensable, and there are various other forms equally compatible with the truth of our deductions. We have particularized, both because the form of the doctrine which we advocate appears to us the most probable, and because it was necessary to give at least one perfectly definite theory which avoids the contradictions. But hardly anything in our book would be changed by the adoption of a different form of the doctrine of types. In fact, we may go farther, and say that, supposing some other way of avoiding the contradictions to exist, not very much of our book, except what explicitly deals with types, is dependent upon the adoption of the doctrine of types in any form, so soon as it has been shown (as we claim that we have shown) that it is possible to construct a mathematical logic which does not lead to contradictions. It should be observed that the whole effect of the doctrine of types is negative: it forbids certain inferences which would otherwise be valid, but does not permit any which would otherwise be invalid. Hence we may reasonably expect that the inferences which the doctrine of types permits would remain valid even if the doctrine should be found to be invalid.

Our logical system is wholly contained in the numbered propositions, which are independent of the Introduction and the Summaries.
The Introduction and the Summaries are wholly explanatory, and form no part of the chain of deductions. The explanation of the hierarchy of types in the Introduction differs slightly from that given in *12 of the body of the work. The latter explanation is stricter and is that which is assumed throughout the rest of the book.

The symbolic form of the work has been forced upon us by necessity: without its help we should have been unable to perform the requisite reasoning. It has been developed as the result of actual practice, and is not an excrescence introduced for the mere purpose of exposition. The general method which guides our handling of logical symbols is due to Peano. His great merit consists not so much in his definite logical discoveries nor in the details of his notations (excellent as both are), as in the fact that he first showed how symbolic logic was to be freed from its undue obsession with the forms of ordinary algebra, and thereby made it a suitable instrument for research. Guided by our study of his methods, we have used great freedom in constructing, or reconstructing, a symbolism which shall be adequate to deal with all parts of the subject. No symbol has been introduced except on the ground of its practical utility for the immediate purposes of our reasoning.

A certain number of forward references will be found in the notes and explanations. Although we have taken every reasonable precaution to secure the accuracy of these forward references, we cannot of course guarantee their accuracy with the same confidence as is possible in the case of backward references.

Detailed acknowledgments of obligations to previous writers have not very often been possible, as we have had to transform whatever we have borrowed, in order to adapt it to our system and our notation. Our chief obligations will be obvious to every reader who is familiar with the literature of the subject. In the matter of notation, we have as far as possible followed Peano, supplementing his notation, when necessary, by that of Frege or by that of Schröder. A great deal of the symbolism, however, has had to be new, not so much through dissatisfaction with the symbolism of others, as through the fact that we deal with ideas not previously symbolised. In all questions of logical analysis, our chief debt is to Frege. Where we differ from him, it is largely because the contradictions showed that he, in common with all other logicians ancient and modern, had allowed some error to creep into his premisses; but apart from the contradictions, it would have been almost impossible to detect this error. In Arithmetic and the theory of series, our whole work is based on that of Georg Cantor. In Geometry we have had continually before us the writings of v. Staudt, Pasch, Peano, Pieri, and Veblen.

We have derived assistance at various stages from the criticisms of friends, notably Mr G. G. Berry of the Bodleian Library and Mr R. G. Hawtrey.

We have to thank the Council of the Royal Society for a grant towards the expenses of printing of £200 from the Government Publication Fund, and also the Syndics of the University Press who have liberally undertaken the greater portion of the expense incurred in the production of the work. The technical excellence, in all departments, of the University Press, and the zeal and courtesy of its officials, have materially lightened the task of proof-correction.
The second volume is already in the press, and both it and the third will appear as soon as the printing can be completed.

A. N. W.
B. R.
Cambridge, November, 1910.

CONTENTS OF VOLUME I

PREFACE  v
ALPHABETICAL LIST OF PROPOSITIONS REFERRED TO BY NAMES  xii
INTRODUCTION TO THE SECOND EDITION  xiii
INTRODUCTION  1
  Chapter I. Preliminary Explanations of Ideas and Notations  4
  Chapter II. The Theory of Logical Types  37
  Chapter III. Incomplete Symbols  66

PART I. MATHEMATICAL LOGIC
Summary of Part I  87
Section A. The Theory of Deduction  90
  *1. Primitive Ideas and Propositions  91
  *2. Immediate Consequences of the Primitive Propositions  98
  *3. The Logical Product of Two Propositions  109
  *4. Equivalence and Formal Rules  115
  *5. Miscellaneous Propositions  123
Section B. Theory of Apparent Variables  127
  *9. Extension of the Theory of Deduction from Lower to Higher Types of Propositions  127
  *10. Theory of Propositions containing One Apparent Variable  138
  *11. Theory of Two Apparent Variables  151
  *12. The Hierarchy of Types and the Axiom of Reducibility  161
  *13. Identity  168
  *14. Descriptions  173
Section C. Classes and Relations  187
  *20. General Theory of Classes  187
  *21. General Theory of Relations  200
  *22. Calculus of Classes  205
  *23. Calculus of Relations  213
  *24. The Universal Class, the Null Class, and the Existence of Classes  216
  *25. The Universal Relation, the Null Relation, and the Existence of Relations  228
Section D. Logic of Relations  231
  *30. Descriptive Functions  232
  *31. Converses of Relations  238
  *32. Referents and Relata of a given Term with respect to a given Relation  242
  *33. Domains, Converse Domains, and Fields of Relations  247
  *34. The Relative Product of Two Relations  256
  *35. Relations with Limited Domains and Converse Domains  265
  *36. Relations with Limited Fields  277
  *37. Plural Descriptive Functions  279
  *38. Relations and Classes derived from a Double Descriptive Function  296
  Note to Section D  299
Section E. Products and Sums of Classes  302
  *40. Products and Sums of Classes of Classes  304
  *41. The Product and Sum of a Class of Relations  315
  *42. Miscellaneous Propositions  320
  *43. The Relations of a Relative Product to its Factors  324

PART II. PROLEGOMENA TO CARDINAL ARITHMETIC
Summary of Part II  329
Section A. Unit Classes and Couples  331
  *50. Identity and Diversity as Relations  333
  *51. Unit Classes  340
  *52. The Cardinal Number 1  347
  *53. Miscellaneous Propositions involving Unit Classes  352
  *54. Cardinal Couples  359
  *55. Ordinal Couples  366
  *56. The Ordinal Number 2_r  377
Section B. Sub-Classes, Sub-Relations, and Relative Types  386
  *60. The Sub-Classes of a given Class  388
  *61. The Sub-Relations of a given Relation  393
  *62. The Relation of Membership of a Class  395
  *63. Relative Types of Classes  400
  *64. Relative Types of Relations  410
  *65. On the Typical Definition of Ambiguous Symbols  415
Section C. One-Many, Many-One, and One-One Relations  418
  *70. Relations whose Classes of Referents and of Relata belong to given Classes  420
  *71. One-Many, Many-One, and One-One Relations  426
  *72. Miscellaneous Propositions concerning One-Many, Many-One, and One-One Relations  441
  *73. Similarity of Classes  455
  *74. On One-Many and Many-One Relations with Limited Fields  468
Section D. Selections  478
  *80. Elementary Properties of Selections  483
  *81. Selections from Many-One Relations  496
  *82. Selections from Relative Products  501
  *83. Selections from Classes of Classes  508
  *84. Classes of Mutually Exclusive Classes  517
  *85. Miscellaneous Propositions  525
  *88. Conditions for the Existence of Selections  536
Section E. Inductive Relations  543
  *90. On the Ancestral Relation  549
  *91. On Powers of a Relation  558
  *92. Powers of One-Many and Many-One Relations  573
  *93. Inductive Analysis of the Field of a Relation  579
  *94. On Powers of Relative Products  588
  *95. On the Equi-factor Relation  596
  *96. On the Posterity of a Term  607
  *97. Analysis of the Field of a Relation into Families  623

APPENDIX A
  *8. The Theory of Deduction for Propositions containing Apparent Variables  635

APPENDIX B
  *89. Mathematical Induction  650

APPENDIX C
  Truth-Functions and others  659

LIST OF DEFINITIONS  667

ALPHABETICAL LIST OF PROPOSITIONS REFERRED TO BY NAMES

Name     Number   Proposition
Abs      *2.01    ⊢ : p ⊃ ~p . ⊃ . ~p
Add      *1.3     ⊢ : q . ⊃ . p ∨ q
Ass      *3.35    ⊢ : p . p ⊃ q . ⊃ . q
Assoc    *1.5     ⊢ : p ∨ (q ∨ r) . ⊃ . q ∨ (p ∨ r)
Comm     *2.04    ⊢ :. p . ⊃ . q ⊃ r : ⊃ : q . ⊃ . p ⊃ r
Comp     *3.43    ⊢ :. p ⊃ q . p ⊃ r . ⊃ : p . ⊃ . q . r
Exp      *3.3     ⊢ :. p . q . ⊃ . r : ⊃ : p . ⊃ . q ⊃ r
Fact     *3.45    ⊢ :. p ⊃ q . ⊃ : p . r . ⊃ . q . r
Id       *2.08    ⊢ . p ⊃ p
Imp      *3.31    ⊢ :. p . ⊃ . q ⊃ r : ⊃ : p . q . ⊃ . r
Perm     *1.4     ⊢ : p ∨ q . ⊃ . q ∨ p
Simp     *2.02    ⊢ : q . ⊃ . p ⊃ q
Simp     *3.26    ⊢ : p . q . ⊃ . p
Simp     *3.27    ⊢ : p . q . ⊃ . q
Sum      *1.6     ⊢ :. q ⊃ r . ⊃ : p ∨ q . ⊃ . p ∨ r
Syll     *2.05    ⊢ :. q ⊃ r . ⊃ : p ⊃ q . ⊃ . p ⊃ r
Syll     *2.06    ⊢ :. p ⊃ q . ⊃ : q ⊃ r . ⊃ . p ⊃ r
Syll     *3.33    ⊢ : p ⊃ q . q ⊃ r . ⊃ . p ⊃ r
Syll     *3.34    ⊢ : q ⊃ r . p ⊃ q . ⊃ . p ⊃ r
Taut     *1.2     ⊢ : p ∨ p . ⊃ . p
Transp   *2.03    ⊢ : p ⊃ ~q . ⊃ . q ⊃ ~p
Transp   *2.15    ⊢ : ~p ⊃ q . ⊃ . ~q ⊃ p
Transp   *2.16    ⊢ : p ⊃ q . ⊃ . ~q ⊃ ~p
Transp   *2.17    ⊢ : ~q ⊃ ~p . ⊃ . p ⊃ q
Transp   *3.37    ⊢ :. p . q . ⊃ . r : ⊃ : p . ~r . ⊃ . ~q
Transp   *4.1     ⊢ : p ⊃ q . ≡ . ~q ⊃ ~p
Transp   *4.11    ⊢ : p ≡ q . ≡ . ~p ≡ ~q

INTRODUCTION TO THE SECOND EDITION*

In preparing this new edition of Principia Mathematica, the authors have thought it best to leave the text unchanged, except as regards misprints and minor errors†, even where they were aware of possible improvements. The chief reason for this decision is that any alteration of the propositions would have entailed alteration of the references, which would have meant a very great labour. It seemed preferable, therefore, to state in an introduction the main improvements which appear desirable. Some of these are scarcely open to question; others are, as yet, a matter of opinion.

The most definite improvement resulting from work in mathematical logic during the past fourteen years is the substitution, in Part I, Section A, of the one indefinable "p and q are incompatible" (or, alternatively, "p and q are both false") for the two indefinables "not-p" and "p or q." This is due to Dr H. M. Sheffer‡. Consequently, M. Jean Nicod§ showed that one primitive proposition could replace the five primitive propositions *1.2, *1.3, *1.4, *1.5, *1.6. From this there follows a great simplification in the building up of molecular propositions and matrices; *9 is replaced by a new chapter, *8, given in Appendix A to this Volume.

Another point about which there can be no doubt is that there is no need of the distinction between real and apparent variables, nor of the primitive idea "assertion of a propositional function." On all occasions where, in Principia Mathematica, we have an asserted proposition of the form "⊢ . fx" or "⊢ . fp," this is to be taken as meaning "⊢ . (x) . fx" or "⊢ . (p) . fp."
Con- sequently the primitive proposition *1*11 is no longer required. All that is necessary, in order to adapt the propositions as printed to this change, is the convention that, when the scope of an apparent variable is the whole of the asserted proposition in which it occurs, this fact will not be explicitly indicated unless " some " is involved instead of " all." That is to say, "h . <f>x " is to mean " h . (x) . <fix " ; but in " I- . ( gar) . <f>x " it is still necessary to indicate explicitly the fact that " some " x (not " all " x's) is involved. It is possible to indicate more clearly than was done formerly what are the novelties introduced in Part I, Section B as compared with Section A. * In this introduction, as -well as in the Appendices, the authors are under great obligations to Mr F. P. Ramsey of King's College, Cambridge, who has read the whole in MS. and contributed valuable criticisms and suggestions. t In regard to these we are indebted to many readers, but especially to Drs Behmann and Boscovitch, of Gdttingen. X Tram. Amer. Math. Soc. Vol. xnr. pp. 481 — 488. § "A reduction in the number of the primitive propositions of logic," Proc. Camb. Phil. Soc. Vol. MX. XIV INTRODUCTION They are three in number, two being essential logical novelties, and the third merely notational. (1) For the "p" of Section A, we substitute " <}>x," so that in place of " *- • (P) -fP " we have " h . (<}>, x) ./(<f>x)." Also, if we have " h ./{p, q, r, . . .)," we may substitute <f>x, <f>y, 4>z, ... for^, q, r, ... or <f>x, <f>y for p, q, and yfrz, ... for r, ..., and so on. We thus obtain a number of new general propositions different from those of Section A. (2) We introduce in Section B the new primitive idea " (g#) . <f>x," i.e. existence-propositions, which do not occur in Section A, In virtue of the abolition of the real variable, general propositions of the form " (p) . fp " do occur in Section A, but " (<&p) .fp " does not occur. (3) By means of definitions, we introduce in Section B general propositions which are molecular constituents of other propositions ; thus " (x) . <]>x . v . p " is to mean " (x) . <f>xvp." It is these three novelties which distinguish Section B from Section A. One point in regard to which improvement is obviously desirable is the axiom of reducibility (*12M1). This axiom has a purely pragmatic justifica- tion : it leads to the desired results, and to no. others. But clearly it is not the sort of axiom with which we can rest content. On this subject, however, it cannot be said that a satisfactory solution is as yet obtainable. Dr Leon Chwistek* took the heroic course of dispensing with the axiom without adopting any substitute ; from his work, it is clear that this course compels us to sacrifice a great deal of ordinary mathematics. There is another course, recommended by Wittgenstein f for philosophical reasons. This is to assume 4hat functions of propositions are always truth-functions, and that a function can only occur in a proposition through its values. There are difficulties in the way of this view, but perhaps they are not insurmountable J. It involves the consequence that all functions of functions are extensional. It requires us to maintain that " A believes p " is not a function of p. How this is possible, is shown in Tractatus Logico-Philosophicus (loc. cit. and pp. 19—21). We are not prepared to assert that this theory is certainly right, but it has seemed worth while to work out its consequences in the following pages. 
It appears that everything in Vol. I remains true (though often new proofs are required) ; the theory of inductive cardinals and ordinals survives ; but it seems that the theory of infinite Dedekindian and well-ordered series largely collapses, so that irrationals, and real numbers generally, can no longer be adequately dealt with. Also Cantor's proof that 2 n > n breaks down unless n is finite. Perhaps some further axiom, less objectionable than the axiom of reducibility, might give these results, but we have not succeeded in finding such an axiom. * In his " Theory of Constructive Types." See references »t the end of this Introduction, j- Tractatus Logico-Philosophicus, *5*54 ff . X See Appendix C. INTRODUCTION XV It should be stated that a new and very powerful method in mathematical logic has been invented by Dr H. M. Sheflfer. This method, however, would demand a complete re-writing of Principia Mathematica. We recommend this task to Dr Sheffer, since what has so far been published by him is scarcely sufficient to enable others to undertake the necessary reconstruction. We now proceed to the detailed development of the above general sketch. I. ATOMIC AND MOLECULAR PROPOSITIONS Our system begins with "atomic propositions." We accept these as a datum, because the problems which arise concerning them belong to the philosophical part of logic, and are not amenable (at any rate at present) to mathematical treatment. Atomic propositions may be defined negatively as propositions containing no parts that are propositions, and not containing the notions "all" or "some." Thus " this is red," "this is earlier than that," are atomic propositions. Atomic propositions may also be defined positively — and this is the better course — as propositions of the following sorts r R 1 (x), meaning "x has the predicate R^'; R*( x >y) [° r xRzy]' meaning "x has the relation R 2 (in intension) to y"; R 3 (x,y, z), meaning "x,y,z have the triadic relation R 3 (in intension)"; R 4 (x, y, z, w), meaning "x,y,z,w have the tetradic relation R 4 (in intension)"; and so on ad infinitum, or at any rate as 'long as possible. Logic does not know whether there are in fact n-adic relations (in intension); this is an empirical question. We know as an empirical fact that there are at least dyadic relations (in intension), because without them series would be impossible. But logic is not interested in this fact; it is concerned solely with the hypothesis of there being propositions of such-and-such a form. In certain cases, this hypothesis is itself of the form in question, or contains a part which is of the form in question ; in these cases, the fact that the hypothesis can be framed proves that it is true. But even when a hypothesis occurs in logic, the fact that it can be framed does not itself belong to logic. Given all true atomic propositions, together with the fact that they are all, every other true proposition can theoretically be deduced by logical methods. That is to- say, the apparatus of crude fact required in proofs can all be con- densed into the true atomic propositions together with the fact that every true atomic proposition is one of the following: (here the list should follow). If used, this method would presumably involve an infinite enumeration, since it seems natural to suppose that the number of true atomic propositions is infinite, though this should not be regarded as certain. 
In practice, generality is not obtained by the method of complete enumeration, because this method requires more knowledge than we possess.

We must now advance to molecular propositions. Let p, q, r, s, t denote, to begin with, atomic propositions. We introduce the primitive idea

p | q,

which may be read "p is incompatible with q"* and is to be true whenever either or both are false. Thus it may also be read "p is false or q is false"; or again, "p implies not-q." But as we are going to define disjunction, implication, and negation in terms of p | q, these ways of reading p | q are better avoided to begin with. The symbol "p | q" is pronounced: "p stroke q." We now put

~p . = . p | p   Df,
p ⊃ q . = . p | ~q   Df,
p ∨ q . = . ~p | ~q   Df,
p . q . = . ~(p | q)   Df.

Thus all the usual truth-functions can be constructed by means of the stroke. Note that by the above,

p ⊃ q . = . p | (q | q)   Df.

We find that p . ⊃ . q . r . = . p | (q | r). Thus p ⊃ q is a degenerate case of a function of three propositions.

We can construct new propositions indefinitely by means of the stroke; for example, (p | q) | r, p | (q | r), (p | q) | (r | s), and so on. Note that the stroke obeys the permutative law (p | q) = (q | p) but not the associative law (p | q) | r = p | (q | r). (These of course are results to be proved later.) Note also that, when we construct a new proposition by means of the stroke, we cannot know its truth or falsehood unless either (a) we know the truth or falsehood of some of its constituents, or (b) at least one of its constituents occurs several times in a suitable manner. The case (a) interests logic as giving rise to the rule of inference, viz.

Given p and p | (q | r), we can infer r.

This or some variant must be taken as a primitive proposition. For the moment, we are applying it only when p, q, r are atomic propositions, but we shall extend it later. We shall consider (b) in a moment.

In constructing new propositions by means of the stroke, we assume that the stroke can have on either side of it any proposition so constructed, and need not have an atomic proposition on either side. Thus given three atomic propositions p, q, r, we can form, first, p | q and q | r, and thence (p | q) | r and p | (q | r). Given four, p, q, r, s, we can form

{(p | q) | r} | s,   (p | q) | (r | s),   p | {q | (r | s)}

and of course others by permuting p, q, r, s. The above three are substantially different propositions. We have in fact

{(p | q) | r} | s . = :. ~p ∨ ~q . r : ∨ : ~s,
(p | q) | (r | s) . = : p . q . ∨ . r . s,
p | {q | (r | s)} . = :. ~p : ∨ : q . ~r ∨ ~s.

All the propositions obtained by this method follow from one rule: in "p | q" substitute, for p or q or both, propositions already constructed by means of the stroke. This rule generates a definite assemblage of new propositions out of the original assemblage of atomic propositions. All the propositions so generated (excluding the original atomic propositions) will be called "molecular propositions." Thus molecular propositions are all of the form p | q, but the p and q may now themselves be molecular propositions. If p is p1 | p2, p1 and p2 may be molecular; suppose p1 = p11 | p12. p11 may be of the form p111 | p112, and so on; but after a finite number of steps of this kind, we are to arrive at atomic constituents.

* For what follows, see Nicod, "A reduction in the number of the primitive propositions of logic," Proc. Camb. Phil. Soc. Vol. XIX. pp. 32-41.
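As a modern editorial illustration (not part of the original text), the definitions above can be checked mechanically by enumerating truth-values. The Python sketch below encodes the stroke, verifies that the defined negation, implication, disjunction and conjunction agree with the ordinary connectives, that the stroke is permutative but not associative, and that p . ⊃ . q . r reduces to the single stroke p | (q | r); all function names are ours.

```python
from itertools import product

def stroke(p, q):
    # Sheffer stroke: "p is incompatible with q" -- true unless p and q are both true.
    return not (p and q)

# The four definitions given above, written in terms of the stroke alone.
def not_(p):        return stroke(p, p)                # ~p    = p | p
def implies(p, q):  return stroke(p, stroke(q, q))     # p > q = p | (q | q)
def or_(p, q):      return stroke(not_(p), not_(q))    # p v q = ~p | ~q
def and_(p, q):     return not_(stroke(p, q))          # p . q = ~(p | q)

bools = (True, False)

# The defined connectives coincide with the usual truth-functions.
for p, q in product(bools, repeat=2):
    assert not_(p) == (not p)
    assert implies(p, q) == ((not p) or q)
    assert or_(p, q) == (p or q)
    assert and_(p, q) == (p and q)

# The stroke is permutative but not associative.
assert all(stroke(p, q) == stroke(q, p) for p, q in product(bools, repeat=2))
assert any(stroke(stroke(p, q), r) != stroke(p, stroke(q, r))
           for p, q, r in product(bools, repeat=3))

# "p . > . q . r" is the single stroke p | (q | r).
assert all(implies(p, and_(q, r)) == stroke(p, stroke(q, r))
           for p, q, r in product(bools, repeat=3))
```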
In a proposition/) | q, the stroke between p and q is called the "principal" stroke; if p=p x \p 2 , the stroke between p x and p 2 is a secondary stroke; so is the stroke between q x and q 2 if q = q x \ q 2 . If pi =p u | p u , the stroke between p n and p 12 is a tertiary stroke, and so on. Atomic and molecular propositions together are " elementary propositions." Thus elementary propositions are atomic propositions together with all that can be generated from them by means of the stroke applied any finite number of times. This is a definite assemblage of propositions. We shall now, until further notice, use the letters p, q, r, s, t to denote elementary propositions, not necessarily atomic propositions. The rule of inference stated above is to hold still; i.e. If p, q, r are elementary propositions, given p and p | (q | r), we can infer r. This is a primitive proposition. We can now take up the point (6) mentioned above. When a molecular proposition contains repetitions of a constituent proposition in a suitable manner, it can be known to be true without our having to know the truth or falsehood of any constituent. The simplest instance is P\(P\P)> which is always true. It means "p is incompatible with the incompatibility of p with itself," which is obvious. Again, take "p . q . D . p." This is {(p \q)\(p\ q)\ I (P I P)- Again, take "~jp.D.~pv~ q." This is (p\p)\ {(p \q)\(p\ q)}> Again, "p . D .p v q " is p\i{(p\p)\(q\q)}\i(p\p)\(q\q)}l All these are true however p and q may be chosen. It is the fact that we can build up invariable truths of this sort that makes molecular propositions important to logic. Logic is helpless with atomic propositions, because their 62 XV111 INTRODUCTION truth or falsehood can only be known empirically. But the truth of molecular propositions of suitable form can be known universally without empirical evidence. The laws of logic, so far as elementary propositions are concerned, are all assertions to the effect that, whatever elementary propositions p, q, r, ... may be, a certain function F(p,q,r,...), whose values are molecular propositions, built up by means of the stroke, is always true. The proposition " F(p) is true, whatever elementary proposition p may be " is denoted by (p).F(p). Similarly the proposition "F(p,q,r,...) is true, whatever elementary pro- positions p, q, r, ... may be " is denoted by (p,q,r, ...).F(p,q,r, ...). When such a proposition is asserted, we shall omit the "(p,q,r, ...)" at the beginning. Thus "\-.F{p,q,r,...V denotes the assertion (as opposed to the hypothesis) that F(p,q,r, ...) is true whatever elementary propositions p, q, r, ... may be. (The distinction between real and apparent variables, which occurs in Frege and in Principia Mathematica, is unnecessary. "Whatever appears as a real variable in Principia Mathematica is to be taken as an apparent variable whose scope is the whole of the asserted proposition in which it occurs.) The rule of inference, in the form given above, is never required within logic, but only when logic is applied. Within logic, the rule required is different. In the logic of propositions, which is what concerns us at present, the rule used is : Given, whatever elementary propositions p, q, r may be, both "K F(p, q, r,. ..)" and "h .F(p,q,r, ...)\{G(p,q,r, ...)\H(p, q,r, . ..)}," we can infer " h . H{p, q, r, ...)." Other forms of the rule of inference will meet us later. For the present, the above is the form we shall use. 
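As a further editorial illustration (a sketch under the same conventions as before, not part of the original text), the stroke formulae just cited, namely p | (p | p) and the stroke forms of "p . q . ⊃ . p", "~p . ⊃ . ~p ∨ ~q" and "p . ⊃ . p ∨ q", can be confirmed to be true for every choice of p and q by a short enumeration:

```python
from itertools import product

def s(p, q):            # the stroke, abbreviated s(p, q)
    return not (p and q)

bools = (True, False)

def tautology(f, n):
    """True if the n-argument stroke formula f holds for every assignment."""
    return all(f(*vals) for vals in product(bools, repeat=n))

# p | (p | p): "p is incompatible with the incompatibility of p with itself."
assert tautology(lambda p: s(p, s(p, p)), 1)

# p . q . > . p,  i.e.  {(p|q)|(p|q)} | (p|p)
assert tautology(lambda p, q: s(s(s(p, q), s(p, q)), s(p, p)), 2)

# ~p . > . ~p v ~q,  i.e.  (p|p) | {(p|q)|(p|q)}
assert tautology(lambda p, q: s(s(p, p), s(s(p, q), s(p, q))), 2)

# p . > . p v q,  i.e.  p | [{(p|p)|(q|q)} | {(p|p)|(q|q)}]
assert tautology(lambda p, q: s(p, s(s(s(p, p), s(q, q)),
                                     s(s(p, p), s(q, q)))), 2)
```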
Nicod has shown that the logic of propositions (*1 — *5) can be deduced, by the help of the rule of inference, from two primitive propositions and \- m .pDq.D.s\qDp\s. The first of these may be interpreted as "p is incompatible with not-p," or as "p or not-p," or as "not (p and not-p)," or as "p implies p." The second may be interpreted as pDq.DzqDf^s.D.pD^s, INTRODUCTION XIX which is a form of the principle of the syllogism. Written wholly in terms of the stroke, the principle becomes {p\(9\q)}\[{(s\q)\((p\s)\(p\s))}\{(s\q)\((p\s)\(p\s))}]. Nicod has shown further that these two principles may be replaced by one. Written wholly in terms of the stroke, this one principle is bl(g|r)}|[{*|(*|*)}|K«l«)l((pl*)l(pl«))}]- It will be seen that, written in this form, the principle is less complex than the second of the above principles written wholly in terms of the stroke. When interpreted into the language of implication, Nicod's one principle becomes p.0.q.r:^.tDt.s\qDp\s. In this form, it looks more complex than pDj.D .s\qDp\s, but in itself it is less complex. From the above primitive proposition, together with the rule of inference, everything that logic can ascertain about elementary propositions can be proved, provided we add one other primitive proposition, viz. that, given a proposition (p, q, r, ...) . F (p, q> r, ...), we may substitute for p, q, r, ... functions of the form /,0>, ?, r, ...), f \ (p, q,r ,...), f s (p, q, r, ...) and assert (p,q,r,...).FUi(p,q,r, ...), f 3 (p,q,r, ...),f 3 (p,q,r, ...), ...}, where f 1} / 2 , f 3 , ... are functions constructed by means of the stroke. Since the former assertion applied to all elementary propositions, while the latter applies only to some, it is obvious that the former implies the latter. A more general form of this principle will concern us later. II. ELEMENTARY FUNCTIONS OF INDIVIDUALS 1. Definition of '" individual" We saw that atomic propositions are of one of the series of forms: R x {x), R^{x,y), R 3 (x,y,z), R^{x,y,z y w\ .... Here R lt R 2 , R s , R 4 , ... are each characteristic of the special form in which they are found: that is to say, R n cannot occur in an atomic proposition R m (x 1} # 2 > ••• #m) unless n = m, and then can only occur as R m occurs, not as x lt x 2 , ... x m occur. On the other hand, any term which can occur as the a-'s occur in R n (x 1} x 2 , ... x n ) can also occur as one of the x's in R m (x x , x 2 , . . . x m ) even if m is not equal to n. Terms which can occur in any form of atomic proposition are called " individuals" or " particulars"; terms which occur as the R's occur are called " universals." We might state our definition compendiously as follows: An " individual" is anything that can be the subject of an atomic proposition. XX INTRODUCTION Given an atomic proposition R n (x 1} x 2 , ... x n ), we shall call any of the x's a "constituent" of the proposition, and R n a " component " of the proposition*. We shall say the same as regards any molecular proposition in which R n (x 1} x 2 , ... x n ) occurs. Given an elementary proposition p j q, where p and q may be atomic or molecular, we shall call p and q " parts " of p | q; and any parts of p or q will in turn be called parts of p | q, and so on until we reach the atomic parts of p \ q. Thus to say that a proposition r " occurs in" p \ q and to say that r is a "part " of p | q will be synonymous. 2. 
Definition of an elementary function of an individual Given any elementary proposition which contains a part of which an individual a is a constituent, other propositions can be obtained by replacing a by other individuals in succession. We thus obtain a certain assemblage of elementary propositions. We may call the original proposition 0a, and then the propositional function obtained by putting a variable x in the place of a will be called <f>x. Thus <f>x is a function of which the argument is x and the values are elementary propositions. The essential use of "<f>x" is that it collects together a certain set of propositions, namely all those that are its values with different arguments. We have already had various special functions of propositions. If p is a part of some molecular proposition, we may consider the set of propositions resulting from the substitution of other propositions for p. If we call the original molecular proposition fp, the result of substituting q is called /#. When an individual or a proposition occurs twice in a proposition, three functions can be obtained, by varying only one, or only another, or both, of the occurrences. For example, p \p is a value of any one of the three functions P I <?> 9 1 P> 9 I 9> where q is the argument. Similar considerations apply when an argument occurs more than twice. Thus p\(p\p) is a value of q\(r\s), or 9 ! ( r I <l)> or 9 i (<? 1 r), or q\(r\ r), or q\(q\ q). When we assert a proposition " ^ • (P) » Fp," the p is to be varied whenever it occurs. We may similarly assert a proposition of the form " (x) . <f>x," meaning " all propositions of the assemblage indicated by <f>x are true"; here also, every occurrence of x is to be varied. • 3. "Always true" and "sometimes true" Given any function, it may happen that all its values are true; again, it may happen that at least one of its values is true. The proposition that all the values of a function (x,y, z, ...) are true is expressed by the symbol "(x,y,z, ...).$ (x,y,z,...)" unless we wish to assert it, in which case the assertion is written "h.(f)(x,y,z, ...)." * This terminology is taken from Wittgenstein. INTRODUCTION XXI We have already had assertions of this kind where the variables were ele- mentary propositions. We want now to consider the case where the variables are individuals and the function is elementary, i.e. all its values are elementary propositions. We no longer wish to confine ourselves to the case in which it is asserted that all the values of <f>(x,y,z, ...) are true; we desire to be able to make the proposition (x } y,z,...).<\>{x,y,z, ...) a part of a stroke function. For the present, however, we will ignore this desideratum, which will occupy us in Section III of this Introduction. In addition to the proposition that a function $x is "always true" (i.e. (x) . <j>x), we need also the proposition that <f>x is " sometimes true," i.e. is true for at least one value of x. This we denote by "(a*) • <K' Similarly the proposition that <f> (x, y, z, . . .) is "sometimes true" is denoted by il (^x,y,z, ...).4>(x,y,z,...)." We need, in addition to (x, y, z, . . . ) . <f> (x, y,z,...) and (3a;, y, z, ...).<£(#, y, z, ... ), various other propositions of an analogous kind. Consider first a function of two variables. We can form (a*) : (y) • <i> fo y)> O) : (as/) ■ <t> ( x > y)> (as/) = (#)■<£ («, y)> (y) • (a*) ■ <t> (*» y)- These are substantially different propositions, of which no two are always equivalent. 
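That no two of these four forms are always equivalent can be exhibited on a finite domain. The sketch below is an editorial illustration, not drawn from the text: the two-element domain and all names are our own choices. It enumerates every function φ(x, y) on such a domain and checks that each pair of prefix forms disagrees on at least one of them.

```python
from itertools import product

# A two-element domain suffices to separate the four prefix forms
# built from a two-variable function phi(x, y).
domain = (0, 1)
pairs = [(x, y) for x in domain for y in domain]

# Each function phi of two variables is modelled by the set of pairs on
# which it is true; there are 2**4 = 16 such functions on this domain.
relations = [frozenset(p for p, keep in zip(pairs, bits) if keep)
             for bits in product((False, True), repeat=len(pairs))]

forms = {
    "(Ex):(y).phi": lambda r: any(all((x, y) in r for y in domain) for x in domain),
    "(x):(Ey).phi": lambda r: all(any((x, y) in r for y in domain) for x in domain),
    "(Ey):(x).phi": lambda r: any(all((x, y) in r for x in domain) for y in domain),
    "(y):(Ex).phi": lambda r: all(any((x, y) in r for x in domain) for y in domain),
}

names = list(forms)
for i, n1 in enumerate(names):
    for n2 in names[i + 1:]:
        # Some phi gives the two forms different truth-values.
        assert any(forms[n1](r) != forms[n2](r) for r in relations), (n1, n2)
```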
It would seem natural, in forming these propositions, to regard the function £ (x, y) as formed in two stages. Given <f> (a, b), where a and b are constants, we can first form a function <f> (a, y), containing the one variable y; we can then form (y ) . <f> ( a, y) and (33/) . </> (a , y). We can now vary a, obtaining again a function of one variable, and leading to the four propositions (x) :(y).<f> (x, y), ( a a>) : (y) . <f> {x, y), (x) : (ay) . <f> (x, y), (gar) : (33/) • <\> 0*. 2/)- On the other hand, we might have gone from </> (a, b) to <f> (x, b), thence to (x) . <j> (x, b) and (3a;) . <f> (x, 6), and thence to (y) : {as) . <j> (x, y), ( 3 y) :(*).£ (x, y), (y) : (a*) . <f> (x, y), (ay) : (3*0 • 0*. 2/)- All of these will be called "general propositions"; thus eight general propositions can be derived from the function <f> (x, y). We have (x) : (y) . <f> (x, y) : = : (y) : («) . 4> (a?, y), (a«) : (ay) • <f> te 2/) : = : (ay) = (a*0 ■ 4> te y)- But there are no other equivalences that always hold. For example, the dis- tinction between " (x) : (gy) . <j> (x, y) " and " (gy) : (x) . <f> (x, y) " is the same as the distinction in analysis between " For every e, however small, there is a 8 such that..." and " There is a 8 such that, for every e, however small " XX11 INTRODUCTION Although it might seem easier, in view of the above considerations, to regard every function of several variables as obtained by successive steps, each involving only a function of one variable, yet there are powerful considerations on the other side. There are two grounds in favour of the step-by-step method ; first, that only functions of one variable need be taken as a primitive idea; secondly, that such definitions as the above seem to require either that we should first vary x, keeping y coostant, or that we should first vary y, keeping x constant. The former seems to be involved when " (y) " or " fay) " appears to the left of " (x) " or " fax)," the latter in the converse case. The grounds against the step-by-step method are that it interferes with the method of matrices, which brings order into the successive generation of types of pro- positions and functions demanded by the theory of types, and that it requires us, from the start, to deal with such propositions as (y) . <f> (a, y), which are not elementary. Take, for example, the proposition " h : q . D . p v q." This will be \-:.(p):.(q):q.D .pvq, or h:.(q):.(p):q.D.pvq, and will thus involve all values of either (q) : q . D . p v q considered as a function of p, or (p) :q.D .pvq considered as a function of q. This makes it impossible to start our logic with elementary propositions, as we wish to do. It is useless to enlarge the definition of elementary propositions, since that only increases the values of q or p in the above functions. Hence it seems necessary to start with an elementary function (pi&l, x%, x 3 , ... x n ), before which we write, for each x r , either "(x r )" or " fax r )," the variables in this process being taken in any order we like. Here <f> {x ly x 2i x 3 , ... x n ) is called the " matrix," and what comes before it is called the " prefix." Thus in (a^) : (y) • 4> 0> y) " <f> (x, y) " is the matrix and " fax) : (y) " is the prefix. It thus appears that a matrix containing n variables gives rise to n 1 2 n propositions by taking its variables in all possible orders and distinguishing " (x r ) " and " fax r ) " in each case. (Some of these, however, are equivalent.) 
The process of obtaining such propositions from a matrix will be called " generalization," whether we take " all values " or " some value," and the propositions which result will be called " general propositions." We shall later have occasion to consider matrices containing variables that are not individuals ; we may therefore say : A " matrix " is a function of any number of variables (which may or may not be individuals), which has elementary propositions as its values, and is used for the purpose of generalization. INTRODUCTION XX111 A " general proposition " is one derived from a matrix by generalization. We shall add one further definition at this stage : A " first-order proposition " is one derived by generalization from a matrix in which all the variables are individuals. 4. Methods of proving general propositions There are two fundamental methods of proving general propositions, one for universal propositions, the other for such as assert existence. The method of proving universal propositions is as follows. Given a proposition «\-.F(p,q,r,...y where F is built up by the stroke, and p, q,r, ... are elementary, we may re- place them by elementary functions of individuals in any way we like, putting P == Ji\p l h) &?> '■'• ®n)> q z =j2\ x ii x 2> ••• x n)> and so on, and then assert the result for all values of x lt x 2 , ... x n . What we thus assert is less than the original assertion, since p, q, r, ... could originally take all values that are elementary propositions, whereas now they can only take such as are values of /i,/ 2 ,/ 3 , — (Any two or more of /i,/ 2 ,/ 3 , ... may be identical.) For proving existence- theorems we have two primitive propositions, namely #81. I- . (g#, y) . <f>a | (<f>a; | <f>y) and #811. I- . fax) . <f>x | (<pa | <f>b) Applying the definitions to be given shortly, these assert respectively <pa . D . fax) . <px and (x) . <j>x . D . <j>a . <f>b. These two primitive propositions are to be assumed, not only for one variable, but for any number. Thus we assume <f> (<*!, a 2 , ... a n ) . D . (g#i, x 2 , ... x n ) . <f> {x 1> x 2 , ... x n ), (x 1} x 2 , ... x n ). <£(#i, # 2 , ... x n ). D. ^>(ai, Oa, ... a»). <£(&i, & 2 > ••■ &»)• The proposition (x) . <f>x . D . <f>a . <f>b, in this form, does not look suitable for proving existence-theorems. But it may be written (g#) . ~ <f>x . v . <f>a . <f>b or ~ <j>a v ~ <f)b . D . fax) . ~ <f>x, in which form it is identical with #911, writing <f> for ~^>. Thus our two primitive propositions are the same as #91 and #911. For purposes of inference, we still assume that from (x) . <j>x and (x) . <f>x D yfrx we can infer (x) . yfrx, and from p and p D q we can infer q, even when the functions or propositions involved are not elementary. XXIV INTRODUCTION Existence-theorems are very often obtained from the above primitive propositions in the following manner. Suppose we know a proposition \-.f(x,x). Since <f>x . D . fay) . <f>y, we can infer May) -/toy). i.e. H:(#):(a2/)./(a;,y). Similarly r : (y) : fax) .f(x, y). Again, since <j> (x, y) . D • faz, w) . <£> (z, w\ we can infer •■ ■ (a^ y) ■"/(*» y) and ' H. (ay, «)•/(*, y). We may illustrate the proofs both of universal and of existence propo- sitions by a simple example. We have Hence, substituting <f>x for p, h . {x) . <f>x D <f)X. Hence, as in the case of /(#, x) above, t- rfa) : (ay) ■ fa D <f>y, b : (y) : fax) . 
<f>x D cf>y, I" ■ fax, y)-fa^ 4>y- Apart from special axioms asserting existence-theorems (such as the axiom of reducibility, the multiplicative axiom, and the axiom of infinity), the above two primitive propositions give the sole method of proving existence-theorems in logic. They are, in fact, always derived from general propositions of the form (x).f(x,x) or (x) ,f(x,x,x) or etc., by substituting other variables for some of the occurrences of x. III. GENERAL PROPOSITIONS OF LIMITED SCOPE In virtue of a primitive proposition, given (x) . <f>x and (x) . $x D -tyx, we can infer (x) . yjrx. So far, however, we have introduced no notation which would enable us to state the corresponding implication (as opposed to inference). Again, fax) . §x and (x, y) . $x O yfry enable us to infer (y) . tyy; here again, we wish to be able to state the corresponding implication. So far, we have only defined occurrences of general propositions as complete asserted propositions. Theoretically, this is their only use, and there is no need to define any other. But practically, it is highly convenient to be able to treat them as parts of stroke-functions. This is entirely a matter of definition. By introducing suitable definitions, first-order propositions can be shown to satisfy all the propositions of #1 — *5. Hence in using the propositions of #1— #5, it will no longer be necessary to assume that p, q, r, ... are elementary. The fundamental definitions are given below. INTRODUCTION XXV When a general proposition occurs as part of another, it is said to have limited scope. If it contains an apparent variable x, the scope of x is said to be limited to the general proposition in question. Thus in p \ {(x) . <f>x\, the scope of x is limited to ix) . <ftx, whereas in (x) . p | fa the scope of x extends to the whole proposition. Scope is indicated by dots. The new chapter *8 (given in Appendix A) should replace *9 in Principia Mathematica. Its general procedure will, however, be explained now. The occurrence of a general proposition as part of a stroke-function is denned by means of the following definitions: {(x).<j>x}\q. = .fax).<f>x\q Df, 1(3*0 • fa) 1 9. ■ = ■ (*) ■ fa 1 ? Df > p I {(ay) • tyy) • = ■ (y) • v \ fy Df - These define, in the first place, only what is meant by the stroke when it occurs between two propositions of which one is elementary while the other is of the first order. When the stroke occurs between two propositions which are both of the first order, we shall adopt the convention that the one on the left is to be eliminated first, treating the one on the right as if it were ele- mentary; then the one on the right is to be eliminated, in each case, in accordance with the above definitions. Thus {{x) . <f>x} | [(y) . yjry] . = : fax) : <f>x\ {(y) . ^y} : = = (3*0 = (32/) ■ fa I t2A (0) • fa] I Kay) •■fy}- = - (3*0 = fa I {(ay) ■ -M = = = (3*0 : (y) • fa I tyy> {fax) . <f>x} | \{y) .^ry}.= : (x) : fay) . <f>x j fy. The rule about the order of elimination is only required for the sake of definiteness, since the two orders give equivalent results. For example, in the last of the above instances, if we had eliminated y first we should have obtained (ay) : (*0 ■ fa I ^y> which requires either (x) ,<^>$x or fay) .<^-tyy, and is then true. And (x) : fay) . <f>x | yfry is true in the same circumstances. This possibility of changing the order of the variables in the prefix is only due to the way in which they occur, i.e. to the fact that x only occurs on one side of the stroke and y only on the other. 
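These definitions, and the equivalence of the two orders of elimination, can likewise be illustrated by enumeration over a finite domain. The following Python sketch is an editorial addition; the three-element domain and all names are ours, and the checks rely on the domain containing at least one individual.

```python
from itertools import product

domain = (0, 1, 2)
stroke = lambda p, q: not (p and q)

def all_(pred):  return all(pred(x) for x in domain)   # (x)  . pred x
def some(pred):  return any(pred(x) for x in domain)   # (Ex) . pred x

# Every one-place function on the domain, modelled by the set where it is true.
subsets = [frozenset(x for x, keep in zip(domain, bits) if keep)
           for bits in product((False, True), repeat=len(domain))]
preds = [lambda x, s=s: x in s for s in subsets]

for phi, psi in product(preds, repeat=2):
    for q in (True, False):
        # {(x).phi x} | q   .=.  (Ex) . phi x | q
        assert stroke(all_(phi), q) == some(lambda x: stroke(phi(x), q))
        # {(Ex).phi x} | q  .=.  (x) . phi x | q
        assert stroke(some(phi), q) == all_(lambda x: stroke(phi(x), q))

    # {(x).phi x} | {(y).psi y}   .=.  (Ex):(Ey). phi x | psi y
    assert stroke(all_(phi), all_(psi)) == \
        some(lambda x: some(lambda y: stroke(phi(x), psi(y))))

    # {(Ex).phi x} | {(y).psi y}  .=.  (x):(Ey). phi x | psi y
    #                             .=.  (Ey):(x). phi x | psi y
    lhs = stroke(some(phi), all_(psi))
    assert lhs == all_(lambda x: some(lambda y: stroke(phi(x), psi(y))))
    assert lhs == some(lambda y: all_(lambda x: stroke(phi(x), psi(y))))
```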
The order of the variables in the prefix is indifferent whenever the occurrences of one variable are all on one side of a certain stroke, while those of the other are all on the other side of it. We do not have in general (a*0 : (y) • x ( x > y)- = -iy)' (3*0 ■ x 0»> y); XXVI INTRODUCTION here the right-hand side is more often true than the left-hand side. But we do have (ft®) '• iy) -<l>a!\yjry: = : (y) : (a*) . $x\^y. The possibility of altering the order of the variables in the prefix when they are separated by a stroke is a primitive proposition. In general it is convenient to put on the left the variables of which "all" are involved, and on the right those of which " some " are involved, after the elimination has been finished, always assuming that the variables occur in a way to which our primitive proposition is applicable. It is not necessary for the above primitive proposition that the stroke separating x and y should be the principal stroke, e.g. p I [{(a*) • <H I {(y) • "f y}] ■ = • p I [0*0 : (ay) • 4> x I ^y] ■ (a*) : (y) • p I (<£* I iry) - (y) • (a«) • p I {<t> x 1 1ry)> All that is necessary is that there should be some stroke which separates x from y. When this is not the case, the order cannot in general be changed. Take e.g. the matrix <f>x V yjry . ~ <f>x V <^» ifry. This may be written (<j>x D yjry) j {$-y D <f>x) or {fx | (fy | tyy)} \ [tyy \ (Qx | <f>x)}. Here there is no stroke which separates all the occurrences of x from all those of y, and in fact the two propositions (y) ' (a 57 ) • § x v "tyy ■ ~ ^ v ^ ^y and (a«) : (y) . 4>x v tyy .~^pv<v yfry are not equivalent except for special values of <f> and i|r. By means of the above definitions, we are able to derive all propositions, of whatever order, from a matrix of elementary propositions combined by means of the stroke. Given any such matrix, containing a part p, we may replace p by <f>x or <f> (x, y) or etc., and proceed to add the prefix (x) or (g#) or (x, y) or (x) : (gy) or (y) : (gp) or etc. If p and q both occur, we may replace p by <f)X and q by tyy, or we may replace both by <j>%, or one by <f>x and another by some stroke-function of <f>x. In the case of a proposition such as p I (O) = (ay) ■ * (®> y) ] > we must treat it as a case of p \ {(x) . </>#}, and first eliminate x. Thus p I {(«) : (ay) ■ f («» y)} • = : (a*) -(y)-p\ ^0*»y)« That is to say, the definitions of {(x) . <f>x)} \ q etc. are to be applicable un- changed when <f>x is not an elementary function. INTRODUCTION xxvu The definitions of ~ Thus •~ {(x) . <f>x\ . = p, pv q, p .q, pOq are to be taken over unchanged. p . D . (x) .</>#: = (x) . <f>x . D . p : = (x) . <f>x . v . p : = p . v . (x) . <f>x : = {(x) . fa] I \(x) . <f>x} : (rx) : <f>x | {(x): <f>x\ : (a«0 • (33/) • (<£# 1 4>y\ (x) : (y) . fox | <£y), p'l [{<*)■**} I {(*)■**}]: P I {(3*0 = (33/) • (4& I #)} = (x) : (y) .p | (<f>x | <£y), {(#) . <f>x] \(p\p): (rx) . <f>x | (p | p) : = : (gar) .<f>xDp, [~{(»).^}]| ~p: Ka«) : (ay) • (^ I <&/)} i (/> I P) : (^)-{(ay)-(^l^)}|(i>lp): (*):(y).(<M&/)KH.P)> (x):(y).(p\p)\(<f>x\<f>y). It will be seen that in the above two variables appear where only one might have been expected. We shall find, before long, that the two variables can be' reduced to one ; i.e. we shall have (3*0 : (32/) - <£# I <f>V '• = • (3«) • 4> x I £*» (a;) : (y) . <f>x \ <f>y : = . (a;) . <£# | <j>x. These lead to ~ {(x) . <j)x} . = . (a«) . ~ fa, ~ {(a^) • 4*®} ■ = ■(#)■ ~ <j>x. 
But we cannot prove these propositions at our present stage ; nor, if we could, would they be of much use to us, since we do not yet know that, when two general propositions are equivalent, either may be substituted for the other as part of a stroke-proposition without changing the truth-value. For the present, therefore, suppose we have a stroke-function in which p occurs several times, say p | (p | p), and we wish to replace p by (x) . <f>x, we shall have to write the second occurrence of p " (y) . <f>y," and the third " (z) . <f>z." Thus the resulting proposition will contain as many separate variables as there are occurrences of p. The primitive propositions required, which have been already mentioned, are four in number. They are as follows: (1) I- . (a», y) . $a | (<f>x | <j>y), i.e. \-:<f>a.D . (ga?) . <f>x. (2) I- . (g#) . <f>x | (<fxi j <f>b), i.e. H : (x) . <f>x . D . <f>a . <f>b. (3) The extended rule of inference, i.e. from (x) . <f>x and (x) . <j>x D i/r# we can infer (x) . tyx, even when <£ and yfr are not elementary. (4) If all the occurrences of x are separated from all the occurrences of y by a certain stroke, the order of x and y can be changed in the prefix; i.e. XXViii INTRODUCTION For (g#) : (y) . <f>x \ -fy we can substitute (y) : (ga;) . <f>% | yjry, and vice versa, even when this is only a part of the whole asserted proposition. The above primitive propositions are to be assumed, not only for one variable, but for any number. By means of the above primitive propositions it can be proved that all the propositions of #1 — *5 apply equally when one or more of the propositions p,q,r t ... involved are not elementary. For this purpose, we make use of the work of Nicod, who proved that the primitive propositions of *I can all be deduced from h .p Op and b .pDq.D .s\qDp\s together with the rule of inference: " Given p and p\(q\ r), we can infer r." Thus all we have to do is to show that the above propositions remain true when p, q, s, or some of them, are not elementary. This is done in #8 in Appendix A r IV. FUNCTIONS AS VARIABLES The essential use of a variable is to pick out a certain assemblage of elementary propositions, and enable us to assert that all members of this assemblage are true, or that at least one member is true. We have already used functions of individuals, by substituting <j>x for p in the propositions of #1 #5 7 and by the primitive propositions of #8. But hitherto we have always supposed that the function is kept constant while the individual is varied, and we have not considered cases where we have "g</>," or where the scope of "<]>" is less than the whole asserted proposition. It is necessary now to consider such cases. Suppose a is a constant. Then "<j>a" will denote, for the various values of <f>, all the various elementary propositions of which a is a constituent. This is a different assemblage of elementary propositions from any that can be obtained by variation of individuals; consequently it gives rise to new general propositions. The values of the function are still elementary propositions, just as when the argument is an individual; but they are a new assemblage of elementary propositions, different from previous assemblages. As we shall have occasion later to consider functions whose values are not elementary propositions, we will distinguish those that have elementary propositions for their values by a note of exclamation between the letter denoting the function and the letter denoting the argument. Thus "<£ ! 
x" is a function of two variables, x and </> ! £. It is a matrix, since it contains no apparent variable and has elementary propositions for its values. We shall henceforth write "<£ ! x" where we have hitherto written <j>x. If we replace a? by a constant a, we can form such propositions as (<f>).cf>l a, (a<£) . <f> ! a. INTRODUCTION XXIX These are not elementary propositions, and are therefore not of the form </> ! a. The assertion of such propositions is derived from matrices by the method of #8. The primitive propositions of #8 are to apply when the variables, or some of them, are elementary functions as well as when they are all individuals. A function can only appear in a matrix through its values*. To obtain a matrix, proceed, as before, by writing <f> ! x, i/r ! y, % I z, . .. in place of p, q, r, ... in some molecular proposition built up by means of the stroke. We can then apply the rules of *8 to <f>, ty, %, . . . as well as to x, y, z, The difference between a function of an individual and a function of an elementary function of individuals is that, in the former, the passage from one value to another is effected by making the same statement about a different individual, while in the latter it is effected by making a different statement about the same individual. Thus the passage from "Socrates is mortal" to "Plato is mortal" is a passage from/! x to fly, but the passage from "Socrates is mortal" to "Socrates is wise" is a passage from <j> I a to yjr ! a. Functional variation is involved in such a proposition as: "Napoleon had all the characteristics of a great general." Taking the collection of elementary propositions, every matrix has values all of which belong to this collection. Every general proposition results from some matrix by generalization f. Every matrix intrinsically determines a certain classification of elementary propositions, which in turn determines the scope of the generalization of that matrix. Thus " x loves Socrates " picks out a certain collection of propositions, generalized in " (x) . x loves Socrates " and "(qx) . x loves Socrates." But " <f> ! Socrates" picks out those, among elementary propositions, which mention Socrates. The generalizations "(<£) . <f> ! Socrates" and " (a0) . </> ! Socrates " involve a class of elementary propositions which cannot be obtained from an individual- variable. But any value of "<j> ! Socrates " is an ordinary elementary proposition ; the novelty introduced by the variable ^> is a novelty of classification, not of material classified. On the other hand, (x) . x loves Socrates, (<£) . (f> ! Socrates, etc. are new propositions, not contained among elementary propositions. It is the business of #8 to show that these propositions obey the same rules as elementary propositions. The method of proof makes it irrelevant what the variables are, so long as all the functions concerned have values which are elementary propositions. The variables may themselves be elementary propositions, as they are in #1 — #5. A variable function which has values that are not elementary propositions starts a new set. But variables of this sort seem unnecessary. Every elementary proposition is a value of </> ! & ; therefore (p) .fp. = . (<£, *)./(* ! x) : (gp) . fp . = . (a</>, x) ./(<£ ! x). * This assumption is fundamental in the following theory. It has its difficulties, but for the moment we ignore them. It takes the place (not quite adequately) of the axiom of reducibility. It is discussed in Appendix C. 
f In a proposition of logic, all the variables in the matrix must be generalized. In other general propositions, such as "all men are mortal," some of the variables in the matrix are re- placed by constants. XXX INTRODUCTION Hence all second-order propositions in which the variable is an elementary proposition can be derived from elementary matrices. The question of other second-order propositions will be dealt with in the next section. A function of two variables, say <f> (x, y), picks out a certain class of classes of propositions. We shall have the class <f> (a, y), for given a and variable y ; then the class of all classes <£ (a, y) as a varies. Whether we are to regard our function as giving classes <f> (a, y) or <f> (x, b) depends upon the order of generalization adopted. Thus "(g#):(3/)" involves <f>(a,y), but "(y):(^as)" involves Consider now the matrix <f> I x, as a function of two variables. If we first vary x, keeping <£ fixed (which seems the more natural order), we form a class of propositions <f> I x, <f> I y, <f> ! z, . . . which differ solely by the substitution of one individual for another. Having made one such class, we make another, and so on, until we have done so in all possible ways. But now suppose we vary <f> first, keeping x fixed and equal to a. We then first form the class of all propositions of the form <f> ! a, i.e. all elementary propositions of which a is a constituent ; we next form the class <f> I b ; and so on. The set of propositions which are values of <£ ! a is a set not obtainable by variation of individuals, i.e. not of the form fx [for constant / and variable x\ This is what makes <f> a new sort of variable, different from x. This also is why generalization of the form (<f>) . F I (<f> 1 2, x) gives a function not of the form /! x [for constant /]. Observe also that whereas a is a constituent of/! a, /is not ; thus the matrix <f> ! x has the peculiarity that, when a value is assigned to x, this value is a constituent of the result, but when a value is assigned to <f>, this value is absorbed in the resulting proposition, and completely disappears. We may define a function <£!& as that kind of similarity between propositions which exists when one results from the other by the substitution of one individual for another. We have seen that there are matrices containing, as variables, functions of individuals. We may denote any such matrix by fl(<f>lz, ^r \z,xlz, ... x,y,z, ...). Since a function can only occur through its values, <f> ! 2 (e.g.) can only occur in the above matrix through the occurrence of <f> ! x, <j> ! y, <f> ! z, . .. or of <f> I a, <f>lb,(f>lc, ..., where a, b, c are constants. Constants do not occur in logic, that is to say, the a, b, c which we have been supposing constant are to be regarded as obtained by an extra-logical assignment of values to variables. They may therefore be absorbed into the x, y, z, Now x, y, z themselves will only occur in logic as arguments to variable functions. Hence any matrix which contains the variables <f> ! z, yjr 1 2 , x • %> ®> V> z and no others, if it is of the sort that can occur explicitly in logic, will result from substituting <f>\x,<f>\y,$\z, yfrlx, yfrly, yfrlz, %lx, % 1 y, % I z, or some of them, for elementary propositions in some stroke-function. INTRODUCTION XXXI It is necessary here to explain what is meant when we speak of a " matrix that can occur explicitly in logic," or, as we may call it, a " logical matrix." A logical matrix is one that contains no constants. 
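The remark that any matrix in the variables φ, ψ, ..., x, y, ... results from substituting φ!x, ψ!y, ... for the propositional letters of a stroke-function, and that assigning empirical constants to φ and x destroys its purely logical character, can be mimicked directly. A minimal sketch, assuming for illustration the stroke-function p | (q | q) and the empirical constants "mortal" and "Greek" which this Introduction uses as examples:

    # A stroke-function F(p, q) with two propositional letters, and the matrix
    # obtained from it by substituting phi!x for p and psi!y for q.  Replacing
    # the constants of an elementary proposition by variables in this way is
    # what yields a logical matrix; assigning constants reverses the step.
    def stroke(p, q):
        return not (p and q)                     # p | q

    def F(p, q):
        return stroke(p, stroke(q, q))           # p | (q | q), i.e.  p > q

    def matrix(phi, psi, x, y):
        return F(phi(x), psi(y))                 # the matrix  phi!x > psi!y

    # assigning values to phi and psi (illustrative, extra-logical constants)
    mortal = lambda x: x in {"Socrates", "Plato"}
    greek  = lambda x: x in {"Socrates"}
    assert matrix(greek, mortal, "Socrates", "Socrates")    # Greek(S) > Mortal(S)
    assert not matrix(mortal, greek, "Plato", "Plato")      # Mortal(P) > Greek(P) fails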
Thus p | q is a logical matrix ; so is <f> ! x, where <f> and x are both variable. Taking any elementary proposition, we shall obtain a logical matrix if we replace all its components and constituents by variables. Other matrices result from logical matrices by assigning values to some of their variables. There are, however, various ways of analysing a proposition, and therefore various logical matrices can be derived from a given proposition. Thus a proposition which is a value of p | q will also be a value of (<j>lx)\ (^rly) and of %!(#, y). Different forms are required for different purposes ; but all the forms of matrices required explicitly in logic are logical matrices as above denned. This is merely an illustration of the fact that logic aims always at complete generality. The test of a logical matrix is that it can be expressed without introducing any symbols other than those of logic, e.g. we must not require the symbol " Socrates." Consider the expression /! (<f> ! z, yfr I z, x ! z, ••• #, y, z). When a value is assigned to /, this represents a matrix containing the variables $' ty> X> • • • x > y> z > But wn il e / remains unassigned, it is a matrix of a new sort, containing the new variable /. We call / a " second-order function," because it takes functions among its arguments. When a value is assigned, not only to /, but also to <f>, yfr, %, . . . x t y, z, . . . , we obtain an elementary proposition ; but when a value is assigned to f alone, we obtain a matrix containing as variables only first-order functions and individuals. This is analogous to what happens when we consider the matrix <£ ! x. If we give values to both <f> and #, we obtain an elementary proposition ; but if we give a value to <£ alone, we obtain a matrix containing only an individual as variable. There is no logical matrix of the form f ! (<f> ! 2). The only matrices in which <f> ! 1z is the only argument are those containing <j> I a, <f> ! b, <f> ! c, . . . , where a, b, c, ... are constants; but these are not logical matrices, being derived from the logical matrix <f> \x. Since <f> can only appear through its values, it must appear, in a logical matrix, with one or more variable arguments. The simplest logical functions of <f> alone are (#) . <f> ! x and (a«) . <f> ! x, but these are not matrices. A logical matrix fl(<f)lz, a?i,# 2 , ... x n ) is always derived from a stroke-function F(pi,Pz,Ps> >..p n ) by substituting <p I x lt (f> ! x 2 , . . . <f> ! x n for p\, p 2> . . . p n . This is the sole method of constructing such matrices. (We may however have x r = x s for some values of r and s.) Second-order functions have two connected properties which first-order functions do not have. The first of these is that, when a value is assigned to R&W I c XXXU INTRODUCTION /, the result may be a logical matrix; the second is that certain constant values of/ can be assigned without going outside logic. To take the first point first:/! (<j> ! z, x), for example, is a matrix containing three variables,/, <£, and x. The following logical matrices (among an infinite number) result from the above by assigning a value to/: <f> ! x, (<j> ! x) \ (<f> ! x), <j>lxD<f>lx, etc. Similarly <f>lx2<f>ly, which is a logical matrix, results from assigning a vulue to /in/! (<£ ! 2, x, y). In all these cases, the constant value assigned to / is one which can be expressed in logical symbols alone (which was the second property of/). This is not the case with <f> ! 
x: in order to assign a value to <f>, we must introduce what we may call "empirical constants," such as "Socrates" and "mortality" and "being Greek." The functions of x that can be formed without going outside logic must involve a function as a generalized variable; they are (in the simplest case) such as (<f>).<f>lx and (a<£) .<plx. To some extent, however, the above peculiarity of functions of the second and higher orders is arbitrary. We might have adopted in logic the symbols Ri (x), R* {so, y), R 3 (#, y>z), where R± represents a variable predicate, R % a variable dyadic relation (in intension), and so on. Each of the symbols R x {x), R 2 (x,y), R 3 (x,y,z), ... is a logical matrix, so that, if we used them, we should have logical matrices not containing variable functions. It is perhaps worth while to remind ourselves of the meaning of "<f> ! a," where a is a constant. Th<^ meaning is as follows. Take any finite number of propositions of the various forms jRj (x), R 2 (x, y), ... and combine them by means of the stroke in any way desired, allowing any one of them to be repeated any finite number, of times. If at least one of them has a as a constituent, ie. is of the form R n (a,b 1 , b 2 , ... 6 n _j), • then the molecular proposition we have constructed is of the form <j> ! a, i.e. is a value of " <f> ! a" with a suitable <f>. This of course also holds of the proposition R n (a, b 1} b 2 , . . . 6 M _i) itself. It is clear that the logic of propositions, and still more of general propositions concerning a given argument, would be intolerably complicated if we abstained from the use of variable functions; but it can hardly be said that it would be impossible. As for the question of matrices, we could form a matrix/! (i2j, x), of which R t (x) would be a value. That is to say, the properties of second-order matrices which we have been discussing would also belong to matrices containing variable universals. They cannot belong to matrices containing only variable individuals. By assigning <£ ! £ and x in/! (<£ ! £, x), while leaving /variable, we obtain an assemblage of elementary propositions not to be obtained by means of variables representing individuals and first-order functions. This is why the new variable /is useful. INTRODUCTION XXX111 We can proceed in like manner to matrices Fl{fl($l%$),gl($l%x), ...^\% X \$,...x,y, ...} and so on indefinitely. These merely represent new ways of grouping ele- mentary propositions, leading to new kinds of generality. V. FUNCTIONS OTHER THAN MATRICES When a matrix contains several variables, functions of some of them can be obtained by turning the others into apparent variables. Functions obtained in this way are not matrices, and their values are not elementary propositions. The simplest examples are (y) • £ '• (», V) and (ay) .<f>l(x, y). When we have a general proposition (<£) . F {<£ I z, x, y, ...}, the only values <f> can take are matrices, so that functions containing apparent variables are not included. We can, if we like, introduce a new variable^ to denote not only functions such as <f> I ot,- but also such as (y).<j>l($,y), (y,z).<f>l(x,y,z), ... (ay) •<£!(£, y), ...; in a word, all such functions of one variable as can be derived by generalization from matrices containing only individual-variables. Let us denote any such function by fax, or -ty^sc, or Xl x, or etc. Here the suffix 1 is intended to indi- cate that the values of the functions may be first-order propositions, resulting from generalization in respect of individuals. 
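The functions φ₁x̂ just introduced, those derivable by generalization from matrices containing only individual-variables, can be enumerated outright on a finite domain. The sketch below takes only the two simplest derived forms, (y).R(x̂, y) and (∃y).R(x̂, y), over an assumed two-element domain; it also shows why a finite extensional model cannot exhibit the distinction of orders, which is a distinction of form rather than of extension.

    # Enumerating the derived functions of one variable on a toy domain.
    from itertools import product

    domain = [0, 1]
    pairs = list(product(domain, domain))

    # all 16 possible extensions of a two-place matrix R on this domain
    relations = [{p for i, p in enumerate(pairs) if bits >> i & 1}
                 for bits in range(2 ** len(pairs))]

    def for_all_y(R):            # the derived function  (y) . R(x^, y)
        return frozenset(x for x in domain if all((x, y) in R for y in domain))

    def exists_y(R):             # the derived function  (Ey) . R(x^, y)
        return frozenset(x for x in domain if any((x, y) in R for y in domain))

    derived = {for_all_y(R) for R in relations} | {exists_y(R) for R in relations}

    # On this toy extensional model the derived functions reach every possible
    # extension, so extensionally they add nothing new; the point of the text
    # is that they are nevertheless new forms, obtained by generalization.
    assert derived == {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}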
In virtue of #8, no harm can come from including such functions along with matrices as values of single variables. Theoretically, it is unnecessary to introduce such variables as fa, because they can be replaced by an infinite conjunction or disjunction. Thus e.g. ((f),) .fax. = : (<f>). <f>lx: (fa y) .xf) ! (x, y) : (0) : (ay) .<f>l(x,y): etc., (a<k) . fax . = : (a<£) .<f>l x:v: (g<£) : (y) . <j> ! (x,y):v :{>&<}>, y).<f> ! (x,y) :v: etc., and generally, given any matrix fl(<f>lz, x), we shall have the following pro- cess for interpreting (c^) ./! (faz, x) and (a<£i) ./! (faz, #)• Put (fa) ./! (fa%x) . =. : (<f>) ./ ! {(y) .<£!(£, y), x] : (<f>) ./! {(ay) . </> ! (z, y), x], where/! {(y) . <f> ! (z, y), x) is constructed as follows: wherever, in/! {<£ ! z, x}, a value of <j>, say <f> I a, occurs, substitute (y) . <£ ! (a, y), and develop by the definitions at the 'beginning of #8. / ! {(ay) . <f> I (z, y), x] is similarly con- structed. Similarly put (fa) ./! (fa lz,x). = : (</>) ./! {(y, w) . <f> ! (% y, w), x) : (</>) -/ ! {(y) '■ (a w ) • <f> *(% y, w), x] : etc., where "etc." covers the prefixes (a.y) : ( w ) •> (33/> w) •> (w) : (32/)- We define (f> 3 , fa, ... similarly. Then (fa) .fl(fa% x) . = : (fa) ./! (^ 2, x) : (<£ 2 ) ./! (fa 3, x) : etc. This process depends upon the fact that/! (<£ ! z, x), for each value of <}> and x, is a proposition constructed out of elementary propositions by the stroke, and c2 XXxiv INTRODUCTION that #8 enables us to replace any of these by a proposition which is not elementary. (a<£i) .flifa'z, x) is defined by an exactly analogous disjunction. It is obvious that, in practice, an infinite conjunction or disjunction such as the above cannot be manipulated without assumptions ad hoc. We can work out results for any segment of the infinite conjunction or disjunction, and we can " see " that these results hold throughout. But we cannot prove this, because mathematical induction is not applicable. We therefore adopt certain primitive propositions, which assert only that what we can prove in each case holds generally. By means of these it becomes possible to manipulate such variables as fa. In like manner we can introduce /, (faz, £), where any number of in- dividuals and functions yjr 1} ft, ... may appear as apparent variables. No essential difficulty arises in this process so long as the apparent variables involved in a function are not of higher order than the argument to the function. For example, x e D'JR, which is (ay) . xRy, may be treated without danger as if it were of the form <f> ! x. In virtue of #8, fax may be substituted for <£ ! x without interfering with the truth of any logical pro- position which <f> ! x is a part. Similarly whatever logical proposition holds concerning/! (faz, x) will hold concerning f x (faz, x). But when the apparent variable is of higher order than the argument, a new situation arises. The simplest cases are (*)./! ($!*,*), (3*) ■/! (*!*,*). These are functions of x, but are obviously not included among the values for" <f> ! x (where <f> is the argument). If we adopt a new variable fa which is to include functions in which (f> ! z can be an apparent variable, we shall obtain other new functions ifa).f\{fa%x), (afc) ./!(#*,*)> which are again not among values for fax (where fa is the argument), because the totality of values of faz, which is now involved, is different from the totality of values of <f> ! £, which was formerly involved. 
However much we may en- large the meaning of <f>, a function of x in which <f> occurs as apparent variable has a correspondingly enlarged meaning, so that, however <f> may be defined, (fa).f\(4>%x) and (a*) ./!(#,*) can never be values for <f>x. To attempt to make them so is like attempting to catch one's own shadow. It is impossible to obtain one variable which embraces among its values all possible functions of individuals. We denote by fax a function of x in which fa is an apparent variable, but there is no variable of higher order. Similarly fax will contain fa as apparent variable, and so on. INTRODUCTION XXXV The essence of the matter is that -a variable may travel through any well- defined totality of values, provided these values are all such that any one can replace any other significantly in any context. In constructing fax, the only totality involved is that of individuals, which is already presupposed. But when we allow <j> to be an apparent variable in a function of x, we enlarge the totality of functions of a;, however <f> may have been defined. It is therefore always necessary to specify what sort of <j> is involved, whenever <f> appears as an apparent variable. The other condition, that of significance, is fully provided for by the definitions of *8, together with the principle that a function can only occur through its values. In virtue of the principle, a function of a function is a stroke-function of values of the function. And in virtue of the definitions in *8, a value of any function can significantly replace any proposition in a stroke-function, because propositions containing any number of apparent variables can always be substituted for elementary propositions and for each other in any stroke-function. What is necessary for significance is that every complete asserted proposition should be derived from a matrix by generaliza- tion, and that, in the matrix, the substitution of constant values for the variables should always result, ultimately, in a stroke-function of atomic propositions. We say " ultimately," because, when such variables as fa% are admitted, the substitution of a value for fa may yield a proposition still containing apparent variables, and in this proposition the apparent variables must be replaced by constants before we arrive at a stroke-function of atomic propositions. We may introduce variables requiring several such stages, but the end must always be the same : a stroke-function of atomic propositions. It seems, however, though it might be difficult to prove formally, that the functions fa, fi introduce no propositions that cannot be expressed without them. Let us take first a very simple illustration. Consider the proposition (H^i) ■ fa x m fa a > which we w *^ call /(a?, a). Since fa includes all possible values of <f> ! and also a great many-other values in its range, /(«, a) might seem to make a smaller assertion than would be made by (g<£) . <f> I x . <j> ! a, which we will call/, (x, a). But in fact f{x, a) . D ./„ (x, a). This may be seen as follows : fax has one of the various sets of forms : (y) . 4> ! (x, y), (y, z).<}>l 0, y, z), ..., (ay) ■ $ ■ 0*> y). to *) • tf> ! fo y.*).—> (y) : (a*) • <M 0»» y. *)> (ay) : (*) • ! fo-y» z ^ Suppose first that fax . = . (y) . <f> ! (x, y). Then fax . faa . = D (y) . <j> I (x, y) : (y) . tf> ! (a, y) <f, I (x, b).<f>l (a, b) : (a</>) . <£ I x . <£ ! a. XXXVI INTRODUCTION Next suppose fax . = . fay) .<f>\{x, y). Then fax .faa. = : (gy) . <f> ! 
O, ?/) : faz) .<f>l(a, z) : 3 '■ (ay, z):<f>l(oe ) y)v<f>l (x, z).fa\ (a, y)v<f>l (a, z) : D : (g;0) . <j> I x . <j> ! a, because <j> I (%, y) v (f> 1 (x, z) is of the form <f> I x, when y and z are fixed. It is obvious that this method of proof applies to the other cases mentioned above. Hence fafa) . fax . faa . = . (>&<j>) . <f> 1 x . <f> I a. We can satisfy ourselves that the same result holds in the general form (a&)./! (<M>*) ■ = ■ (a*) -/! (*!*,-*) by a similar argument. We know that / ! (0 ! £, a?) is derived from some stroke-function F(p,q,r,...) by substituting <f> I x, <f> I a, </> ! b, . . . (where a, b, ... are constants) for some of the propositions p,q,r,... and g x l x, g 2 lx, g 3 lx, ... (where ^, # 2 , g s , ... are constants) for others of p, q, r, ..., while replacing any remaining propositions p, q, r, ... by constant propositions. Take a typical case ; suppose fl(<l>lz,x). = .(<f>la)\{(<f>lx)\(cl>lb)}. We then have to prove faa\(fax\fab).D.fa<f>).<f>la\(falx\<f>lb), where fax may have any of the forms enumerated above. Suppose first that fax . — . (y) . <$> ! (x, y). Then faa | (fax | fab) . = : (ay) :(z,w).<f>\ (a, y)\{<f>l (x, z)\<t>\ (b, w)} : D : (32/) . fal (a, y) \ {<f> ! (x, y)\<f>l (b, y)} : D:( a <£).<£!a|(<£!tf|0!&) because, for a given y, <f> ! (x, y) is of the form <f> I x. Suppose next that fax . = . (33/) . <j> ! (x, y). Then faa I (&« J fab) . = : (y) : faz, w).<f>l (a, y) | {<f> ! (a;, *) | <f> ! (6, w)} : D : (a>|r) . yjr ! a j (^ ! x | i/r ! b), putting \jrlx .= . (ftl(x,z)v<j>l(x, w). Similarly the other cases can be dealt with. Hence the result follows. Consider next the correlative proposition (fa) ./! (fa% x) . = . (<£) ./! (<£ ! X x). Here it is the converse implication that needs proving, i.e. (fa).f\(<t>l%x).1.(fa).f\(fa%x). This follows from the previous case by transposition. It can also be seen in- dependently as follows. Suppose, as before, that fl(fa$,x). = .(faa)\(fax\fab), and put first fax . = . (y) .<f>\(x, y). Then (faa) \ (fax \ fab) . = : ( H y) : (z, w).<f>\ (a, y)\{<f>l (x, z)\<f>l (6, «/)}. INTRODUCTION XXXV11 Thus we require that, given (ylr).(ylrla)\(yfrlx\^lb), we should have (g#) : (z, w) . <f> I (a, y)\{<f>l (x, z) \ <j> ! (6, w)}. Now (yft) . yfr ! a \ (yfr ! x \ yfr ! b) . D : . <f> ! (a, z) . D . <f> ! (#, z) . <f> I (b, z) : <f> ! (a, w) . D . <£ ! (x, w) . <f> ! (6, «/) :. D :. <f> ! (a, *) . <£ ! (a, w) . D . <£ ! (x, z).<f>l (b, w) :. D:.<f>l(a,w).D:<f>l(a,z).D.(j>l(x > z).(l>l(b > w) (1) Also ~^>!(a,?«).D:<^!(a,w;).D.</>!(«,5).^!(6 ) w) (2) (l).(2).D:.(^).^!o|(^!ar|^!6):D:.(ay):^!(a,y).D.^!(a?,«).^!(6,w) which was to be proved. Put next fax . = . (33/) . <£ ! (x, y). Then (fca) | (fax \ fab). = :(y): faz, w).<f>l (a, y) | {</> ! (x, z) \ <f> ! (6, w)}. In this case we merely put z = w = y and the result follows. The method will be the same in any other case. Hence generally : (fa) ./! (fa% x). = , (<j>) .fl(<j> \X x). Although the above arguments do not amount to formal proofs, they suffice to make it clear that, in fact, any general propositions about <j> ! z are also true about faz. This gives us, so far as such functions are concerned, all that could have been got from the axiom of reducibility. Since the proof can only be conducted in each separate case, it is necessary to introduce a primitive proposition stating that the result holds always. This primitive proposition is h :(*)./! (01 % x).D.fl(fa%x) Pp. 
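The first case of the argument above, that (∃φ₁) . φ₁x . φ₁a implies (∃φ) . φ!x . φ!a when φ₁x . = . (y) . φ!(x, y), because fixing the second argument of the matrix yields a suitable elementary function, can be checked on a small example. The sketch assumes a particular finite domain, relation and constants; it illustrates the case-by-case argument which the primitive proposition just stated (in cleaned notation, ⊦ : (φ) . f ! (φ ! ẑ, x) . ⊃ . f ! (φ₁ẑ, x)  Pp) takes as holding generally.

    # One case of the argument: with phi_1 x = (y).phi!(x, y), the hypothesis
    # phi_1 x . phi_1 a yields an elementary function with the same property,
    # namely psi!z = phi!(z, b) for a fixed b.  Domain, relation and constants
    # are illustrative assumptions only.
    domain = [0, 1, 2]
    R = {(0, 0), (0, 1), (0, 2), (2, 0), (2, 1), (2, 2), (1, 0)}   # phi!(z, y)
    x, a = 0, 2

    def phi1(z):                                  # (y) . phi!(z, y)
        return all((z, y) in R for y in domain)

    assert phi1(x) and phi1(a)                    # the hypothesis  phi_1 x . phi_1 a

    b = domain[0]                                 # any fixed individual will do
    def psi(z):                                   # the elementary function  phi!(z^, b)
        return (z, b) in R

    assert psi(x) and psi(a)                      # hence  (E phi) . phi!x . phi!a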
As an illustration : suppose we have proved some property of all classes denned by functions of the form <f> ! z, the above primitive proposition enables us to substitute the class T)'R, where R is the relation denned by <f> ! (x, p), or by (gs) . <f> ! (x, $, z), or etc. Wherever a class or relation is denned by a function containing no apparent variables except individuals, the above primitive pro- position enables us to treat it as if it were denned by a matrix. We have nOw to consider functions of the form fax, where fax . = . (<£) ./! (<f> I % x) or fax . = . (gtf) ./! (<f> I % x). We want to discover whether, or under what circumstances, we have (fa) .g\(4>\^x) .1 . g\(faz,x). (A) Let us begin with an important particular case. Put gl(<f>lz,x). = .<f>laD<f>lx. Then (fa . g I (<f> I z, x) . = . x = a, according to #131. XXXV111 INTRODUCTION We want to prove (<j>) . <j> I a D <f> I x . D . <f> 2 a D <f> 2 x, i.e. (<f>).<t>laD<f>lx.3: (£) ./! (tf> ! z, a) . D . (<f>) ./!(</>! % x) : (a<*>) •/! (<*> l%a).D. ( a <£) ./!(<£! *, *). Now/! (0"! 2, a?) must be derived from some stroke-function F(p,q,r,...) by substituting for some of p, q,r, ... the values <j> I x, tf> I b, ! c, . . . where b, c, ... are constants. As soon as <f> is assigned, this is of the form yfr ! #. Hence (<f>).(j>laD<f>lx.D :(<!>) :/! (<£ ! % a) . D ./! (<f> ! % x) : D:(*)./!(*!*,a). 3. (*)./!(*!*,*): (a<*>) ■/'■ (<*> I % a) • ^ • (3*) ■/!■(* ! *, *)• Thus generally (<£) . </> ! a D <£ ! x . D . (<£ 2 ) . <f> 2 a D </> 2 a? without the need of any axiom of.reducibility. It must not, however, be assumed that (A) is always true. The procedure is as follows :/!(</>! 2, x) results from some stroke-function F(p,q,r,...) by substituting for some of p, q,r, ... the values <£ ! x, <j> ! a, <f> I b, ... (a, b, ... being constants). We assume that, e.g. 4> 2 x. = .{<j>).f\{4>\z,x). Thus <f, 2 x. = .(<}>). F((j> I x, <j>la, <f>lb, ...). (B) What we want to discover is whether {<\>).g\{^\%x).^.g\{^%x). Now g ! (<f> I z, x) will be derived from a stroke-function G(p,q,r,...) by substituting <f> I x, <j>la', <f>lb', ... for some of p, q, r, To obtain g\($ 2 z,%), we have to put <f> 2 x, <f> 2 a, <f> 2 b', ... in G(p, q, r, ...), instead of <f> ! x, <f>la', <f>lb', We shall thus obtain a new matrix. If ((f>) . g I ((f) ! z, x) is known to be true because G(p, q, r, ...) is always true, then g ! (<f> 2 z, x) is true in virtue of #8, because it is obtained from G (p, q, r, ...) by substituting for some of p, q, r, ... the propositions <f> 2 x, <f> 2 a', <f> 2 b', ... which contain apparent variables. Thus in this case an inference is warranted. We have thus the following important proposition : Whenever (</>) . gl(<j>lz,x) is known to be true because g ! (<£ ! z,x) is always a value of a stroke-function G(p, q, r, ...), which is true for all values of p, q, r, ..., then g ! (<f> 2 lz, x) is also true, and so (of course) is (<£ 2 ) . g ! (<f> 2 z. x). INTRODUCTION XXXIX This, however, does not cover the case where (<j>) . g ! (<f> ! 2, x) is not a truth of logic, but a hypothesis, which may be true for some values of x and false for others. When this is the case, the infereDce to g ! (<£ 2 2, x) is some- times legitimate and sometimes not ; the various cases must be investigated separately. We shall have an important illustration of the failure of the inference in connection with mathematical induction. VI. 
CLASSES The theory of classes is at once simplified in one direction and complicated in another by the assumption that functions only occur through their values and by the abandonment of the axiom of reducibility. According to our present theory, all functions of functions are extensional, i.e. <t>x= x +x.l.f(p)=f(1rt). This is obvious, since <f> can only occur in f(4>z) by the substitution of values of <£ for p, q, r, ... in a stroke-function, and, if <f>x = yfrx, the substitution of §x for p in a stroke-function gives the same truth-value to the truth-function as the substitution of yfrx. Consequently there is no longer any reason to distinguish between functions and classes, for we have, in virtue of the above, <j)x = x tyx . D . <f>% = yjrx. We shall continue to use the notation & (<$>x), which is often more convenient than <j)tc ; but there will no longer be any difference between the meanings of the two symbols. Thus classes, as distinct from functions, lose even that shadowy being which they retain in #20. The same, of course, applies to relations in extension. This, so far, is a simplification. On the other hand, we now have to distinguish classes of different orders composed of members of the same order. Taking classes of individuals as the simplest case, & (<£> ! x) must be distinguished from & (<f> 2 x) and so on. In virtue of the proposition at the end of the last section, the general logical properties of classes will be the same for classes of all orders. Thus e.g. aC/3./3C7.D.aC 7 will hold whatever may be the orders of a, #, y respectively. In other kinds of cases, however, trouble arises. Take, as a first instance, p l K and s'k. We have x ep f K . = : a e k . D„ . x e a. Thus p'tc is a class of higher order than any of the members of k. Hence the hypothesis (a) .fa may not imply f{p'ic), if a is of the order of the members of k. There is a kind of proof invented by Zermelo, of which the simplest example is his second proof of the Schroder-Bernstein theorem (given in #73). This kind of proof consists in defining a certain class of classes tc, and then showing that p'tceic. On the face of it, "p'/ce/c" is impossible, since p'/e is Xl INTRODUCTION not of the same order as members of k. This, however, is not all that is to be said. A class of classes k is always denned by some function of the form Ox, x 2 , ...): (gy x , y 2 , . . .) . Ffa e a, x 2 e a, . . . y x e a, y 2 e a, . . .), where F is a stroke-function, and "oe«" means that the above function is true. It may well happen that the above function is true when p'ic is sub- stituted for a, and the result is interpreted by #8. Does this justify us in asserting p*K etc? Let us take an illustration which is important in connection with mathematical induction. Put K = a (R"a Ca.aea). Then R"p'/cCp'K . aep'/c (see *40'81) so that, in a sense, p i K e k. That is to say, if we substitute p l K for a in the defining function of k, and apply #8, we obtain a true proposition. By the definition of #90, 4— R%. t a=p t K. <— Thus R%a is a second-order class. Consequently, if we have a hypothesis (a) .fa, where a is a first-order class, we cannot assume (a)./a.D./CR*'a). (A) By the proposition at the end of the previous section, if (a) ./a is deduced by logic from a universally-true stroke-function of elementary propositions, f(R%a) will also be true. Thus we may substitute R#a for a in any asserted proposition " h .fa" which occurs in Principia Mathematica. 
But when (a) ./a is a hypothesis, not a universal truth, the implication (A) is not, prima facie, necessarily true. For example, if k = a (R"a C a . a e a), we have ae*.D:a«/3e/c. = . R"(a n@)C@ .ae/3. Hence a e k . R"{ar\ 0) C /3 . a e . D .p'ic C /3 (1) In many of the propositions of #90, as hitherto proved, we substitute p'ic for a, whence we obtain R"(/3np<,e)C/3.ae/3.D.p t feC/3 (2) i.e. z e . aR%z . D Z) w . w e y8 : a e . aR%oc : D . x e /8 or aR^x . D :. z e /3 . aR%z . D z w .M/e/3:ae/8:D.#e/3. This is a more powerful form of induction than that used in the definition of aR%x. But the proof is not valid, because we have no right to substitute p'tc for a in passing from (1) to (2). Therefore the proofs which use this form of induction have to be reconstructed. INTRODUCTION xli It will be found that the form to which we can reduce most of the fallacious inferences that seem plausible is the following: Given " h . (x) . f(x, %)" we can infer " h : (x) : (gy) . f(x, y)." Thus given " I- . (a) ./(a, a)" we can infer " V : (a) : (g£) ./(a, #)." But this depends upon the possibility of a = 0. If, now, a is of one order and /8 of another, we do not know that a = /? is possible. Thus suppose we have a e k . D a . got. and we wish to infer g$, where # is a class of higher order satisfying /3 e k. The proposition (/3) :. a e /e . D a . #a : D : /3 e /c . D . gryS becomes, when developed by #8, (£) :: (ga) :.ae re .3 .ga-.D : fie k .D .g/3. This is only valid if o = $ is possible. Hence the inference is fallacious if /3 is of higher order than a. Let us apply these considerations to Zermelo's proof of the Schroder- Bernstein theorem, given in *73"8 ff. We have a class of classes * = 3(a C D'R . £- <l l R C a . R"a Ca) and we prove p'tc e x (#73"81), which is admissible in the limited sense ex- plained above. We then add the hypothesis x~e(0-Q.'R)vIt"p t ic and proceed to prove p l K — i'x e k (in the fourth line of the proof of *7382). This also is admissible in the limited sense. But in the next line of the same proof we make a use of it which is not admissible, arguing from p*K — i'xe k to p'x Cp'tc — i l x, because ae k . D a . p'k C a. The inference from a e k . D a .p ( K Ca to p'/c— t'xe k .0 . p'/cCp'ic — i'x is only valid if p e /c — i'x is a class of the same order as the members of k. For, when a e k . D a . p l K C a is written out it becomes (a) ::: (g/S) ::. (x) :: a e k . D :. ft e k . D .xe /S : D . x e a. This is deduced from a e k . O :. a e k . D . x e a : D . x e a by the principle that /(a, a) implies (g/3) ./(a, /3). But here the fi must be of the same order as the a, while in our case a and /8 are not of the same order, if a = p l K — i l x and /3 is an ordinary member of k. At this point, there- fore, where we infer p l K Qp'ic — fc'#,*the proof breaks down. It is easy, however, to remedy this defect in the proof. All we need is x~e(@-<l'R)u R"p'/c. D.x~ep'/c or, conversely, Xlii INTRODUCTION Now x ep'/c . D :. a e k . D a : a — i'x^e k : 3 a . ^(p _ a f jR C a - i'a>) . v . ~ [R"(ol - i'x) C a - t'«?} : X :xe0- Q.'R . v . xe R"{<t- l'x) D :. x e - CF# : v : a e k . D a . x e i£"a. Hence, by *72-341, a; e|><* . D . x e (J3 - d'R) u R"p<" which gives the required result. We assume that a — l'x is of no higher order than a; this can be secured by taking a to be of at least the second order, since i'x, and therefore — t'oc, is of the second order. We may always assume our classes raised to a given order, but not raised indefinitely. 
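The limited sense in which "p'κ ∈ κ" is admissible, namely that substituting p'κ for α in the defining function of κ and interpreting by *8 yields a true proposition, can be computed outright for the inductive illustration κ = α̂(R"α ⊆ α . a ∈ α) given earlier. A finite sketch; the domain, the relation R and the constant a are illustrative assumptions:

    # kappa = the subclasses alpha of the domain with  a e alpha  and
    # R"alpha c alpha;  p'kappa = their common part.
    from itertools import combinations

    domain = {0, 1, 2, 3}
    R = {(0, 1), (1, 2), (3, 3)}
    a = 0

    def R_image(alpha):                                   # R"alpha
        return {w for (z, w) in R if z in alpha}

    subsets = [set(c) for r in range(len(domain) + 1)
               for c in combinations(domain, r)]
    kappa = [alpha for alpha in subsets
             if a in alpha and R_image(alpha) <= alpha]

    p_kappa = set(domain)
    for alpha in kappa:
        p_kappa &= alpha                                  # p'kappa

    # substituting p'kappa for alpha in the defining function of kappa gives a
    # true proposition -- the limited sense in which p'kappa belongs to kappa:
    assert a in p_kappa and R_image(p_kappa) <= p_kappa
    # and p'kappa is exactly the posterity of a, i.e. R*'a:
    assert p_kappa == {0, 1, 2}

The substitution yields a truth, but, as the discussion above insists, it does not by itself entitle us to treat p'κ as a value of a variable confined to classes of the order of the members of κ.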
Thus the Schröder-Bernstein theorem, so amended, survives.

Another difficulty arises in regard to sub-classes. We put

    Cl'α = β̂(β ⊆ α)  Df.

Now "β ⊆ α" is significant when β is of higher order than α, provided its members are of the same type as those of α. But when we have

    β ⊆ α . ⊃β . fβ,

the β must be of some definite type. As a rule, we shall be able to show that a proposition of this sort holds whatever the type of β, if we can show that it holds when β is of the same type as α. Consequently no difficulty arises until we come to Cantor's proposition 2ⁿ > n, which results from the proposition

    ∼{(Cl'α) sm α},

which is proved in *102. The proof is as follows:

    R ∈ 1 → 1 . D'R = α . Ɑ'R ⊆ Cl'α . ξ = x̂(x ∈ α − R'x) . ⊃ :
        ξ ⊆ α : y ∈ α . y ∈ R'y . ⊃y . y ∼ε ξ : y ∈ α . y ∼ε R'y . ⊃y . y ∈ ξ :
        ⊃ : y ∈ α . ⊃y . ξ ≠ R'y : ⊃ : ξ ∼ε Ɑ'R.

As this proposition is crucial, we shall enter into it somewhat minutely. Let α = x̂(A ! x), and let

    xR(φ ! ẑ) . = . f ! (φ ! ẑ, x).

Then by our data,

    A ! x . ⊃ . (∃φ) . f ! (φ ! ẑ, x),
    f ! (φ ! ẑ, x) . ⊃ . A ! x . φ ! y ⊃y A ! y,
    f ! (φ ! ẑ, x) . f ! (φ ! ẑ, y) . ⊃ . x = y,
    f ! (φ ! ẑ, x) . f ! (ψ ! ẑ, x) . ⊃ . φ ! y ≡y ψ ! y.

With these data,

    x ∈ α − R'x . = : A ! x : f ! (φ ! ẑ, x) . ⊃φ . ∼φ ! x.

Thus

    ξ = x̂{(φ) : A ! x : f ! (φ ! ẑ, x) . ⊃ . ∼φ ! x}.

Thus ξ is defined by a function in which φ appears as apparent variable. If we enlarge the initial range of φ, we shall enlarge the range of values involved in the definition of ξ. There is therefore no way of escaping from the result that ξ is of higher order than the sub-classes of α contemplated in the definition of Cl'α. Consequently the proof of 2ⁿ > n collapses when the axiom of reducibility is not assumed. We shall find, however, that the proposition remains true when n is finite.

With regard to relations, exactly similar questions arise as with regard to classes. A relation is no longer to be distinguished from a function of two variables, and we have

    φ(x̂, ŷ) = ψ(x̂, ŷ) . = : φ(x, y) . ≡x,y . ψ(x, y).

The difficulties as regards p'λ and Rl'P are less important than those concerning p'κ and Cl'α, because p'λ and Rl'P are less used. But a very serious difficulty occurs as regards similarity. We have

    α sm β . = . (∃R) . R ∈ 1 → 1 . α = D'R . β = Ɑ'R.

Here R must be confined within some type; but whatever type we choose, there may be a correlator of higher type by which α and β can be correlated. Thus we can never prove ∼(α sm β), except in such special cases as when either α or β is finite. This difficulty was illustrated by Cantor's theorem 2ⁿ > n, which we have just examined. Almost all our propositions are concerned in proving that two classes are similar, and these can all be interpreted so as to remain valid. But the few propositions which are concerned with proving that two classes are not similar collapse, except where one at least of the two is finite.

VII. MATHEMATICAL INDUCTION

All the propositions on mathematical induction in Part II, Section E and Part III, Section C remain valid, when suitably interpreted. But the proofs of many of them become fallacious when the axiom of reducibility is not assumed, and in some cases new proofs can only be obtained with considerable labour. The difficulty becomes at once apparent on observing the definition of "xR*y" in *90. Omitting the factor "x ∈ C'R," which is irrelevant for our purposes, the definition of "xR*y" may be written

    zRw . ⊃z,w . φ ! z ⊃ φ ! w : ⊃φ . φ ! x ⊃ φ ! y,        (A)

i.e. "y has every elementary hereditary property possessed by x."
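Definition (A) can be read extensionally on a finite domain, where every property is represented by a subset and the hereditary-property definition of xR*y coincides with reachability in finitely many R-steps. A sketch under those assumptions (the relation and the domain are illustrative); the difficulty developed below is precisely that, without the axiom of reducibility, a property of higher order need not be coextensive with any of the elementary properties quantified over in (A).

    # Clause (A), read extensionally, compared with the ancestral obtained by
    # simply iterating R.  R and the domain are illustrative assumptions.
    from itertools import combinations

    domain = {0, 1, 2, 3}
    R = {(0, 1), (1, 2), (2, 2)}

    subsets = [set(c) for r in range(len(domain) + 1)
               for c in combinations(domain, r)]
    hereditary = [phi for phi in subsets
                  if all(w in phi for (z, w) in R if z in phi)]

    def ancestral_by_properties(x, y):    # (A): y has every hereditary property of x
        return all(y in phi for phi in hereditary if x in phi)

    def ancestral_by_iteration(x, y):     # y reachable from x in 0 or more R-steps
        reached = {x}
        while True:
            step = {w for (z, w) in R if z in reached} - reached
            if not step:
                return y in reached
            reached |= step

    assert all(ancestral_by_properties(x, y) == ancestral_by_iteration(x, y)
               for x in domain for y in domain)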
We may, instead of elementary properties, take any other order of properties ; as we shall see later, it is advantageous to take third-order properties when R is one-many or many- one, and fifth-order properties in other cases. But for preliminary purposes it makes no difference what order of properties we take, and therefore for the sake of definiteness we take elementary properties to begin with. The difficulty is that, if <f> 2 is any second-order property, we cannot deduce from (A) zRw . D z>w . <j> 2 z D <p 2 w : D . <f> 2 x D <f> 2 y. (B) xliv INTRODUCTION Suppose, for example, that <j> 2 z.= .(<f>).fl(<frlz,z); then from (A) we can deduce zRw . D Zj w ./! (<£ ! % z) D 4 /! (£ ! % iv) : D :/! (<£ ! % x) . D* ./! (<£ ! £ , y) : D : fax . D . <j> 2 y. (C) But in general our hypothesis here is not implied by the hypothesis of.(B). If we put <f> 2 z . = . (g<£) .f\ {<f> I z, z), we get exactly analogous results. Hence in order to apply mathematical induction to a second-order property, it is not sufficient that it should be itself hereditary, but it must be composed of hereditary elementary properties. That is to say, if the property in question is <f> 2 z, where (f> 2 z is either (<f>) ./! (* ! % z) or (a*) ./! <* ! *,*), it is not enough to have zRw .D 2>w .<l> 2 z~)<f> 2 w, but we must have, for each elementary 0, zRw.D ZtW ./l t4> ! % z) D/I (0 ! f , «;). ■ One inconvenient consequence is that, primd facie, an inductive property must not be of the form xR%. z . <f>lz or SeFotid'R.tfilS or a e NC induct . <}> I a. This is inconvenient, because often such properties are hereditary when <f> alone is not, i.e. we may have xR^z .<f>lz .zRw . D 2(OT . xR% w . <ft ! w when we do not have <f> ! z . zRw . D z>w .<j>lw, and similarly in the other cases. These considerations make it necessary to re-examine all inductive proofs. In some cases they are still valid, in others they are easily rectified; in still others, the rectification is laborious, but it is always possible. The method of rectification is explained in Appendix B to this volume. There is, however, so far as we can discover, no way by which our present primitive propositions can be made adequate to Dedekindian and well-ordered relations. The practical uses of Dedekindian relations depend upon #211 "63 — •692, which lead to #214'3 — '34, showing that the series of segments of a series is Dedekindian. It is upon this that the theory of real numbe