Some Notes on Big Numbers

Big numbers are, well, big. Although that seems rather tautological, there is an important point about the size of big numbers that does have to be emphasized.

Let us imagine a combination lock which has a combination consisting of three decimal digits. Actually, such locks do exist in reality; they are common on attaché cases, for example.

There are a thousand different combinations to such a lock.

What about a lock with a six digit combination?

This time, the number of combinations is a million.

Although there are twice as many digits in the combination, the number of combinations is not twice as large. There aren't only two thousand combinations, but a million combinations. For each possible arrangement of the first three digits, one has to try all possible combinations of the last three digits.

Thus, a lock with a six-digit combination is five hundred times as secure as two locks with three-digit combinations that can be unlocked separately.
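The comparison can be checked with a few lines of Python. This is only an illustrative sketch; the function name is made up for the purpose.

```python
def combinations(digits: int, base: int = 10) -> int:
    """Number of possible settings for a lock with the given number of digits."""
    return base ** digits

# One lock with six digits: every setting may have to be tried.
one_six_digit_lock = combinations(6)

# Two three-digit locks opened separately: at most 1,000 tries each.
two_three_digit_locks = 2 * combinations(3)

ratio = one_six_digit_lock // two_three_digit_locks
print(one_six_digit_lock, two_three_digit_locks, ratio)  # 1000000 2000 500
```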

This shows why, if a cipher system is well designed, so that only brute-force search is possible as an attack on it, as long as the key is long enough, very high security can be achieved. Ciphers do have their own inherent limits on security, however. A simple change that makes a cipher's key much longer would normally mean that there is a way to break the cipher that is faster than doing a brute-force search on the longer key. But while making ciphers with longer keys is not completely trivial, it basically consists of making the cipher bigger: more rounds and a larger block size.

For conventional symmetric-key ciphers, which are the ones that can be made as hard to crack as brute-force on the key, how big a key is enough?

Here is one number to start with: in 1998, the Electronic Frontier Foundation built a machine, for about $250,000, that could break DES, with a 56-bit key, by brute force in three and a half days.

Based on that number, I estimated that an 80-bit key, such as that used in SKIPJACK, would be subject to brute-force search by the NSA.

Let's assume that they would have twenty-five billion dollars available to spend on a cipher-cracking machine; that is a hundred thousand times as much money as the EFF had to spend.

Let's assume they could use this machine for a year to crack the key to one particularly important cipher system. That is a hundred times longer than it took to crack DES.

Let's also assume that no cost penalty was involved in designing the machine so that it could be configured to attack different ciphers, unlike the EFF machine, which is specifically designed to attack DES and DES only. Of course, I am still assuming Kerckhoffs' dictum that the cipher system is known; I am making pessimistic assumptions here to be on the safe side, but I am not intending to venture into utter fantasy. Of course, techniques like genetic algorithms could perhaps be used on unknown systems, but here there would be a very large increase in the complexity of what is being done.

Finally, I threw in another factor of a hundred, to account for the NSA having access to advanced electronics beyond the current public state of the art, and for the chips the EFF used not being the fastest or highest-performing chips in existence even then. Since 1998 is now five years ago, of course, Moore's Law has provided us with half that factor in any case.

One hundred thousand (times as much money), times one hundred (times as much time), times one hundred (times as fast equipment). That comes out to a nice round factor of an American billion, which is a thousand million, or a thousand thousand thousand. As it happens, two raised to the tenth power is 1,024, very close to a thousand. Each factor of a thousand, therefore, adds ten bits to the size of the keys that can be attacked.

Fifty-six plus thirty is eighty-six.

While tying up $25 billion for a year might seem a bit much, those extra six bits let us be more reasonable; let us say $2.5 billion to crack an 80-bit key in a few months.
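The estimate above can be reproduced as a short calculation. The following is a sketch only: the three factors are the assumptions stated in the text, and the variable names are invented for illustration.

```python
import math

DES_KEY_BITS = 56  # key length broken by the EFF machine

# Scaling factors assumed in the text.
money_factor = 100_000  # $25 billion versus $250,000
time_factor = 100       # a year versus three and a half days
speed_factor = 100      # allowance for faster hardware

total_factor = money_factor * time_factor * speed_factor
extra_bits = math.log2(total_factor)

print(total_factor)                       # 1000000000
print(round(extra_bits, 1))               # 29.9
print(DES_KEY_BITS + round(extra_bits))   # 86

# An 80-bit key needs 2^6 = 64 times less work than an 86-bit key.
slack = 2 ** (86 - 80)
# Spend ten times less money, and take the rest of the saving as time.
months = 12 / (slack / 10)
print(slack, round(months, 1))            # 64 1.9
```

The exact figure is a little under thirty bits, since a thousand is slightly less than 1,024, but the rounding does not change the conclusion.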

Bruce Schneier's book Applied Cryptography notes that fundamental physical limitations appear to prevent brute-force attacks against keys 256 bits in length.

Quantum computing allows a single computer to work on an immense number of possibilities at once, but there is a fundamental physical limitation on the operation of quantum computers that means that, if they did exist, they would simply double the length of keys that could be cracked.

If it becomes possible to travel and communicate faster than light, then one could fill the Universe, with some 10^20 stars in it, with computers that just barely avoid turning into black holes; and, if one is going to throw in ideas that are from soft science fiction, what about parallel universes (of course, it could be said that parallel universes are exactly what are employed by quantum computers)? This may not matter in the real world, but for purposes of self-consistency, science-fiction writers should pay attention to the lengths of cryptographic keys their characters will need to employ. Of course, technologies to scan computing devices - and read minds - will make it harder to keep secrets as well. The proverbial tinfoil hat, rather than a fancier cryptographic algorithm, might well be what concerns adventurers wandering through the galaxy.

A Useful Table

Since scientists are using bigger and smaller quantities these days, they've been adding new prefixes to the metric system with which not everyone is familiar. As a public service, a table thereof is included here. But first, here are the prefixes everyone is familiar with:

 1         deka    ten                          deci    tenth
 2         hecto   hundred                      centi   hundredth
 3  (10)   kilo    thousand                     milli   thousandth
 6  (20)   mega    million                      micro   millionth
 9  (30)   giga    billion (thousand million)   nano
12  (40)   tera    trillion (billion)           pico

and now those which are newer:

15  (50)   peta    quadrillion (thousand billion)   femto
18  (60)   exa     quintillion (trillion)           atto
21  (70)   zetta   sextillion (thousand trillion)   zepto
24  (80)   yotta   septillion (quadrillion)         yocto

The first column indicates the powers of 10 involved, both positive and negative, for the two prefixes given, and the second column, in parentheses, the powers of 2 sometimes used with the positive-power prefixes in the computer field, on the basis that two to the tenth power is 1,024, which is closely approximated by ten to the third power, 1,000.
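How close the approximation remains as the prefixes grow can be seen from a short calculation. This is a minimal sketch; the helper name is an assumption made for illustration.

```python
# Decimal prefixes that are also used for powers of two in the computer field.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]

def binary_excess(step: int) -> float:
    """Fraction by which 2^(10*step) exceeds 10^(3*step)."""
    return 2 ** (10 * step) / 10 ** (3 * step) - 1

for step, name in enumerate(PREFIXES, start=1):
    percent = binary_excess(step) * 100
    print(f"{name:>5}: 2^{10 * step} exceeds 10^{3 * step} by {percent:.1f}%")
```

The difference is 2.4% for kilo, but it compounds with each step, reaching about 20.9% for yotta.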

Note also that the British (and German, and Russian) number names follow, in parentheses, the American (and French) number names for the large numbers.

It would seem that the current range of prefixes, from yocto to yotta, is sufficient to cover all that might be encountered in the physical world, but the advance of science and technology has been full of surprises up to this point, and that trend may well continue.

Beyond the Vigintillion

Since I'm talking about large numbers here, it might be noted in passing that we don't have names for very many large numbers.

As may be familiar to many people, while a million is a thousand thousand, in [the United States of] America a billion is a thousand million, while in Britain a billion is a million million. In Canada, we also follow the American usage in this. Much of the world, however, follows the British practice; this is true, for example, for the names of numbers in Russian and German: but in French, the same principle as used by the Americans is followed.

The following table shows the powers of ten which correspond to the various -illions under both standards:

                    American   British
million                 6          6
billion                 9         12
trillion               12         18
quadrillion            15         24
quintillion            18         30
sextillion             21         36
septillion             24         42
octillion              27         48
nonillion              30         54
decillion              33         60
undecillion            36         66
duodecillion           39         72
tredecillion           42         78
quattuordecillion      45         84
quindecillion          48         90
sexdecillion           51         96
septendecillion        54        102
octodecillion          57        108
novemdecillion         60        114
vigintillion           63        120

Since there is no good Latin prefix for the number twenty-one, this is where the sequence ends in standard dictionaries. Of course, the same pattern that followed the decillion could be repeated, starting with the unvigintillion, but just as in English twenty-seven and seventeen are quite unlike one another in form, the same may be true in Latin, making such a form quite incorrect.
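The two columns of the table above follow simple rules: if n is the number denoted by the Latin prefix (1 for million, 2 for billion, and so on), the American power of ten is 3n + 3 and the British one is 6n. A small sketch, with function names chosen only for illustration:

```python
def american_power(n: int) -> int:
    """Power of ten of the n-th -illion in the American system."""
    return 3 * n + 3

def british_power(n: int) -> int:
    """Power of ten of the n-th -illion in the British system."""
    return 6 * n

for n, name in [(1, "million"), (2, "billion"),
                (10, "decillion"), (20, "vigintillion")]:
    print(f"{name:<13}{american_power(n):>4}{british_power(n):>5}")
```

For n = 20 this gives 63 and 120, matching the last row of the table.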

But Latin prefixes are available to make the occasional -illion in the territory beyond (as noted previously by Dmitri A. Borgmann in the justly famous Language on Vacation):

                    American   British
trigintillion          93        180
quadragintillion      123        240
quinquagintillion     153        300
sexagintillion        183        360
septuagintillion      213        420
octogintillion        243        480
nonagintillion        273        540
centillion            303        600

and just as one can refer to ten million, or a hundred million, and, in the British system, to a thousand million, it would seem that if one can refer to a thousand vigintillion and a million vigintillion and so on, there is no problem with the fact that the words are now more widely spaced.

Power   American                        British
  3     thousand                        thousand
  6     million                         million
  9     billion                         thousand million
 12     trillion                        billion
 15     quadrillion                     thousand billion
...
 60     novemdecillion                  decillion
 63     vigintillion                    thousand decillion
 66     thousand vigintillion           undecillion
 69     million vigintillion            thousand undecillion
 72     billion vigintillion            duodecillion
 75     trillion vigintillion           thousand duodecillion
 78     quadrillion vigintillion        tredecillion
 81     quintillion vigintillion        thousand tredecillion
 84     sextillion vigintillion         quattuordecillion
 87     septillion vigintillion         thousand quattuordecillion
 90     octillion vigintillion          quindecillion
 93     trigintillion                   thousand quindecillion
 96     thousand trigintillion          sexdecillion
...
120     octillion trigintillion         vigintillion
123     quadragintillion                thousand vigintillion
126     thousand quadragintillion       million vigintillion
129     million quadragintillion        thousand million vigintillion
132     billion quadragintillion        billion vigintillion
135     trillion quadragintillion       thousand billion vigintillion
138     quadrillion quadragintillion    trillion vigintillion
141     quintillion quadragintillion    thousand trillion vigintillion
144     sextillion quadragintillion     quadrillion vigintillion
147     septillion quadragintillion     thousand quadrillion vigintillion
150     octillion quadragintillion      quintillion vigintillion
153     quinquagintillion               thousand quintillion vigintillion
156     thousand quinquagintillion      sextillion vigintillion
159     million quinquagintillion       thousand sextillion vigintillion
162     billion quinquagintillion       septillion vigintillion
165     trillion quinquagintillion      thousand septillion vigintillion
168     quadrillion quinquagintillion   octillion vigintillion
171     quintillion quinquagintillion   thousand octillion vigintillion
174     sextillion quinquagintillion    nonillion vigintillion
177     septillion quinquagintillion    thousand nonillion vigintillion
180     octillion quinquagintillion     trigintillion
183     sexagintillion                  thousand trigintillion

This quite nicely extends the American system far enough to embrace the googol. However, it but scratches the surface of the integers; it does not begin, for example, to reach towards numbers like the googolplex (which is 10^(10^100), just as the googol is 10^100), and very large numbers of that kind can indeed be meaningful to mathematicians; they are sometimes required in mathematical proofs, for example.

Although the endless line of finite integers defeats any attempt to name them all, this scheme can be extended a minute amount further. Good Latin prefixes for two hundred, three hundred, and so on, would allow this scheme to be extended upwards for another level, up to what would be, I suppose, the millillion, 10^3003 in the American system and 10^6000 in the British system.

The British system does seem to be superior and more logical, but it would also seem that it is too late to do more than deplore the ambiguity. If there were a way to start over on a blank sheet of paper, as it were, though... and perhaps there is.

At one point, I thought, given the -ard suffix of milliard, a British term for a thousand million, that perhaps the British and American systems could be distinguished by new suffixes. Now, I have a better idea: start over, with the myriad, instead of the thousand, as the basis, and consistently use the British system:

1,0000                    myriad
1,0000,0000               milliad
1,0000,0000,0000          myriad milliad
1,0000,0000,0000,0000     billiad

and so on, up to the centilliad, which would be ten to the eight hundredth power.

However, Donald Knuth had already proposed such a scheme, with the additional improvement that new number names would come along not every time eight digits are added to the end of the number, but when the length of the number doubles, thus enabling the system to be extended to much larger numbers.

Given that printouts of pi to many places of decimals usually include a space after every five digits, using 100,000 rather than 10,000 as the basic unit might also be considered. This quantity is known as the lakh in India, which also has the word crore for ten million. The chiliad is simply a synonym for a thousand; apparently there is no simple Greek word for 100,000.

Historical accounts of the two different systems note that the original system proposed by Nicolas Chuquet (in 1484) was the one used everywhere else but America, France, and Canada, and that Jacques Peletier (in 1529) introduced the milliard, billiard, and subsequent names for intermediate numbers in that same system.

An ancient Indian religious book, the Lalitavistara, notes ten million as the koti, and then continues with reference to the ayuta, niyuta, kankara, vivara, asobhya, vivaha, utsanga, bahula, and nagabala, each one a hundred times larger than the one before, but the ayuta was also used to mean 10,000 instead of 1,000,000,000 in India.

The common Indian system of numeration proceeds as follows for the powers of ten:

 1   dasa        das
 2   sata        san
 3   sahasra     hazar
 4   ayuta
 5   laksa       lakh
 6   prayuda
 7   koti        crore
 8   vyarbuda
 9   padma       arahb (arab)
10   kharva
11   nikharva    carahb (kharab)
12   mahapadma
13   sankha      nie (neel)
14   samdra
15   madhya      padham (padma)
16   antya
17   pararddha   sankh (shankh)

the first column giving one transliteration of the system used in the Sanskrit language, and the second column giving the names of large numbers in use in India today (with an alternate transliteration, found in the Wikipedia article on the Indian numbering system, in parentheses).

Agreement at Last?

Rather than attempting to use a binary notation for powers of ten, it would seem appropriate to use a decimal system for large decimal numbers.

Perhaps the following system might be usable as one to replace the current system, which causes confusion by not being the same in all countries:

       1   ten
       2   hundred
       3   thousand
       4   ten thousand
       5   hundred thousand
       6   million
       7   ten million
       8   hundred million
       9   thousand million
      10   kharva
      20   bikharva
      30   trikharva
      40   quadrakharva
      50   quintikharva
      60   sexakharva
      70   septakharva
      80   octokharva
      90   nonakharva
     100   decikharva
     200   vigintikharva
     300   trigintikharva
     400   quadragintikharva
     500   quinquagintikharva
     600   sexagintikharva
     700   septuagintikharva
     800   octogintikharva
     900   nonagintikharva
    1000   centikharva
    2000   duocentikharva
    3000   tricentikharva
    4000   quadracentikharva
    5000   quinquacentikharva
    6000   sexacentikharva
    7000   septuacentikharva
    8000   octocentikharva
    9000   nonacentikharva
   10000   millikharva
   20000   duomillikharva
   30000   trimillikharva
   40000   quadramillikharva
   50000   quinquamillikharva
   60000   sexamillikharva
   70000   septuamillikharva
   80000   octomillikharva
   90000   nonamillikharva
  100000   myriakharva
 1000000   megakharva
10000000   kotikharva

Thus, the googol is now the decikharva. One could continue with the vyarbudakharva, followed by the gigakharva, but perhaps something better could be worked out.

Also note that the duomyriakharva, trimyriakharva, and so on are still required, but are simply omitted to save space, as the pattern is well established with the centikharva and millikharva that the myriakharva, megakharva, and kotikharva would continue.

In this system,

10^2,048

would be one hundred million quadrakharva duocentikharva, for example, or 52!, the number of ways in which a deck of cards without jokers can be shuffled, which is 80,658,175,170,943,878,571,660,636,856,403,766,975,289,505,440,883,277,824,000,000,000,000, would be:

Eighty million, six hundred and fifty-eight thousand, one hundred and seventy-five sexakharva, one thousand, seven hundred and nine million, four hundred and thirty-eight thousand, seven hundred and eighty-five quintikharva, seven thousand, one hundred and sixty-six million, sixty-three thousand, six hundred and eighty-five quadrakharva, six thousand, four hundred and three million, seven hundred and sixty-six thousand, nine hundred and seventy-five trikharva, two thousand, eight hundred and ninety-five million, fifty-four thousand, four hundred and eight bikharva, eight thousand, three hundred and twenty-seven million, seven hundred and eighty-two thousand, four hundred kharva.
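The grouping behind that name is mechanical: the number is cut into blocks of ten digits, and each block takes the name for its power of ten from the table above. A minimal sketch, limited to numbers below 10^100 and using a made-up function name:

```python
import math

# Names for 10^10, 10^20, ... 10^90, as listed in the table above.
KHARVA_NAMES = ["kharva", "bikharva", "trikharva", "quadrakharva",
                "quintikharva", "sexakharva", "septakharva",
                "octokharva", "nonakharva"]

def kharva_groups(n: int) -> list[tuple[int, str]]:
    """Split n into ten-digit groups with their names, highest group first."""
    n, tail = divmod(n, 10 ** 10)          # the part below one kharva
    groups = [(tail, "")] if tail else []
    for name in KHARVA_NAMES:
        n, group = divmod(n, 10 ** 10)
        if group:                           # groups that are zero go unnamed
            groups.append((group, name))
    if n:
        raise ValueError("10^100 and above need the decikharva and beyond")
    return groups[::-1]

for value, name in kharva_groups(math.factorial(52)):
    print(f"{value:,} {name}")
```

For 52! this prints six groups, from 80,658,175 sexakharva down to 8,327,782,400 kharva. Each group would then be read out in the ordinary way, up to a thousand million.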

Perhaps a simpler way to achieve unity would be to note that the milliard is not used in America, since the billion is used there for the same number, and so one could simply eliminate the ambiguous names, and instead follow the series beginning with the milliard. That is, a British billion would now become a thousand milliard, and a British decillion would become a thousand nonilliard, as follows:

  9   milliard
 12   thousand milliard
 15   billiard
 18   thousand billiard
 21   trilliard
 24   thousand trilliard
 27   quadrilliard
 33   quintilliard
 39   sextilliard
 45   septilliard
 51   octilliard
 57   nonilliard
 63   decilliard
 69   undecilliard
 75   duodecilliard
 81   tredecilliard
 87   quattuordecilliard
 93   quindecilliard
 99   sexdecilliard
105   septendecilliard
111   octodecilliard
117   novemdecilliard
123   vigintilliard
183   trigintilliard
243   quadragintilliard
303   quinquagintilliard
363   sexagintilliard
423   septuagintilliard
483   octogintilliard
543   nonagintilliard
546   thousand nonagintilliard
549   million nonagintilliard
552   milliard nonagintilliard
555   thousand milliard nonagintilliard
558   billiard nonagintilliard
594   octilliard nonagintilliard
597   thousand octilliard nonagintilliard
600   nonilliard nonagintilliard
603   centilliard

In this tradition-based system, 52! becomes:

Eighty thousand, six hundred and fifty-eight decilliard, one hundred and seventy-five thousand, one hundred and seventy nonilliard, nine hundred and forty-three thousand, eight hundred and seventy-eight octilliard, five hundred and seventy-one thousand, six hundred and sixty septilliard, six hundred and thirty-six thousand, eight hundred and fifty-six sextilliard, four hundred and three thousand, seven hundred and sixty-six quintilliard, nine hundred and seventy-five thousand, two hundred and eighty-nine quadrilliard, five hundred and five thousand, four hundred and forty trilliard, eight hundred and eighty-three thousand, two hundred and seventy-seven billiard, eight hundred and twenty-four thousand milliard.
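Here the grouping is by six digits, after setting aside the nine digits that come below the milliard. A sketch under the same caveats as any illustration: the function name is invented, and only the names up to the decilliard are included.

```python
import math

# Names for 10^9, 10^15, ... 10^63, as listed in the table above.
ILLIARD_NAMES = ["milliard", "billiard", "trilliard", "quadrilliard",
                 "quintilliard", "sextilliard", "septilliard",
                 "octilliard", "nonilliard", "decilliard"]

def illiard_groups(n: int) -> list[tuple[int, str]]:
    """Split n into a nine-digit tail and named six-digit groups, highest first."""
    n, tail = divmod(n, 10 ** 9)           # the part below one milliard
    groups = [(tail, "")] if tail else []
    for name in ILLIARD_NAMES:
        n, group = divmod(n, 10 ** 6)
        if group:                           # groups that are zero go unnamed
            groups.append((group, name))
    if n:
        raise ValueError("number too large for the names listed here")
    return groups[::-1]

for value, name in illiard_groups(math.factorial(52)):
    print(f"{value:,} {name}")
```

For 52! this prints ten groups, from 80,658 decilliard down to 824,000 milliard, which are the figures spelled out in words above.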

I had also considered avoiding the exotic by the simple expedient of assigning definite explicit numerical values to the colloquial "zillion", "bazillion", and "gazillion", but for the moment, I would refer you to this page, which discusses this type of number in great detail.

Since there happens to be a game known as billiards, the alternative of keeping the milliard as 10^9, but then making the billiard 10^18, the trilliard 10^27, and the quadrilliard 10^36, and so on for a unified system does not seem to be available either. However, thinking of other likely suffixes, like the -ant of savant, or the -et of gourmet versus the -and of gourmand, provides another idea, one where a step backwards is taken to take a step forwards:

 3   thousand
 6   million (or bisand)
 9   trisand
12   quadrasand
15   quintisand
18   sexasand
21   septasand
24   octosand
27   nonasand
30   decisand

and another way to proceed would be to use prefixes whose purity was not affected by the British/French split:

 6   million
12   ethillion
18   butyllion
24   tetrillion
30   pentillion
36   hexillion
42   heptillion

Since "million" starts with an M, and so does "methane", it seems as if the natural thing to do is to begin with something different from the standard set of Greek prefixes. The trouble is that, after seven, the Greek and Latin prefixes no longer remain sufficiently distinct to sustain this alternative. Also, someone else already has made a suggestion to switch to the Greek prefixes to permit larger numbers to be reached. Having two distinct sets of prefixes would allow, say, a dillion to be a thousand (or a million) centillion, thus permitting a jump to a new level when a system using one set of prefixes becomes exhausted.

Given the coincidence of prefixes between the Greek and Latin systems, perhaps we now have a use for the zillion after all, with ba-, ka-, and ga- replacing tri-, oct-, and non-, as follows:

                        American   British
centillion                 303        600
septillion centillion      327        642
octillion centillion       330        648
nonillion centillion        **        654
zillion                    333        660

with the zillion coming into play just after either system ends. However, this has the consequence of it becoming difficult to convert from a power of ten to a numeric name, particularly if the practice continues, so that one is dealing with number names that refer to 10^(K*(101^n)) where K is either 333 or 660, and thus it would be more appropriate to make the zillion just a bit smaller, and avoid using the centizillion before creating the dizillion, so as to obtain the following system:

zillion                                600
bizillion                             1200
trizillion                            2400
nonagintizillion                     54000
dizillion                            60000
bidizillion                         120000
nonagintidizillion                 5400000
bazillion                          6000000
tetrazillion                     600000000
pentazillion                   60000000000
hexazillion                  6000000000000
heptazillion               600000000000000
kazillion                60000000000000000
gazillion              6000000000000000000
dekazillion          600000000000000000000

And, if we continue with smaller numbers expressed by the unified system based on the milliard, billiard, and trilliard and so on as above, numbers up to quite a large size now have unambiguous names.

(It might be noted, though, that the Greek prefix for nine is, or at least can be, the unambiguous ennea-, and thus it isn't really necessary to retain the gazillion... but I do recall that chemists refer to nonane and not enneanane after octane.)

In this system,

10^31,415,926,535

becomes one hundred quadrilliard quinquagintilliard septizillion septigizillion octadizillion nonagidizillion quinquabazillion trigintibazillion bitetrazillion quinquatetrazillion.

Yet another possibility has occurred to me:

 1   ten
 2   hundred
 3   thousand
 4   myriad
 5   ten myriad
 6   hundred myriad
 7   thousand myriad
 8   byriad
10   hundred byriad
12   tryriad
14   hundred tryriad
16   quadriad
18   hundred quadriad
20   quintriad
22   hundred quintriad
24   sextriad
26   hundred sextriad
28   septriad
30   hundred septriad
32   octriad
34   hundred octriad
36   nonyriad
38   hundred nonyriad
40   decyriad

and so on, a scaled-down long form system based on the myriad instead of the million. Since it is scaled down, there would be less of an impulse to go to the corresponding short form.