
Some Notes on Big Numbers

Big numbers are, well, big. Although that seems rather tautological, there is an important point about the size of big numbers that does have to be emphasized.

Let us imagine a combination lock which has a combination consisting of three decimal digits. Actually, such locks do exist in reality; they are common on attaché cases, for example.

There are a thousand different combinations to such a lock.

What about a lock with a six digit combination?

This time, the number of combinations is a million.

Although there are twice as many digits in the combination, the number of combinations is not twice as large. There aren't only two thousand combinations, but a million combinations. For each possible arrangement of the first three digits, one has to try all possible combinations of the last three digits.

Thus, a lock with a six-digit combination is five hundred times as secure as two locks with three-digit combinations that can be unlocked separately.
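The comparison can be checked with a few lines of Python. This is only an illustrative sketch; the function name combination_count is made up for this example.

```python
def combination_count(digits, base=10):
    """Number of settings of a lock with the given number of digits."""
    return base ** digits

one_six_digit_lock = combination_count(6)         # 1,000,000 tries at worst
two_three_digit_locks = 2 * combination_count(3)  # 1,000 + 1,000 tries at worst

# Ratio of the worst-case effort for the two arrangements.
print(one_six_digit_lock // two_three_digit_locks)  # 500
```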

This shows why, if a cipher system is well designed, so that only brute-force search is possible as an attack on it, as long as the key is long enough, very high security can be achieved. Ciphers do have their own inherent limits on security, however. A simple change that makes a cipher's key much longer would normally mean that there is a way to break the cipher that is faster than doing a brute-force search on the longer key. But while making ciphers with longer keys is not completely trivial, it basically consists of making the cipher bigger: more rounds and a larger block size.

For conventional symmetric-key ciphers, which are the ones that can be made as hard to crack as brute-force on the key, how big a key is enough?

Here is one number to start with: in 1998, the Electronic Frontier Foundation built a machine, for about $250,000, that could break DES, with a 56-bit key, by brute force in three and a half days.

Based on that number, I estimated that an 80-bit key, such as that used in SKIPJACK, would be subject to brute-force search by the NSA.

Let's assume that they would have twenty-five billion dollars available to spend on a cipher-cracking machine; that is a hundred thousand times as much money as the EFF had to spend.

Let's assume they could use this machine for a year to crack the key to one particularly important cipher system. That is roughly a hundred times longer than it took to crack DES.

Let's also assume that no cost penalty was involved in designing the machine so that it could be configured to attack different ciphers, unlike the EFF machine, which is specifically designed to attack DES and DES only. Of course, I am still assuming Kerckhoffs's dictum that the cipher system is known; I am making pessimistic assumptions here to be on the safe side, but I am not intending to venture into utter fantasy. Of course, techniques like genetic algorithms could perhaps be used on unknown systems, but here there would be a very large increase in the complexity of what is being done.

Finally, I threw in another factor of a hundred, to account for the NSA having access to advanced electronics beyond the current public state of the art, and with the chips the EFF used not being the fastest or highest-performing chips in existence even then. Since 1998 is now five years ago, of course, Moore's Law has provided us with half that factor in any case.

One hundred thousand (times as much money), times one hundred (times as much time), times one hundred (times as fast equipment). That comes out to a nice round factor of an American billion, which is a thousand million, or a thousand thousand thousand. As it happens, two raised to the tenth power is 1,024, very close to a thousand. Each factor of a thousand, therefore, adds ten bits to the size of the keys that can be attacked.

Fifty-six plus thirty is eighty-six.

While tying up $25 billion for a year might seem a bit much, those extra six bits let us be more reasonable; let us say $2.5 billion to crack an 80-bit key in a few months.
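The estimate above can be reproduced as a rough sketch. The three scaling factors are the assumptions stated in the text, not measured values, and the function name extra_key_bits is invented for this illustration.

```python
import math

def extra_key_bits(money_factor, time_factor, speed_factor):
    """Bits of key length added by scaling up a brute-force search."""
    return math.log2(money_factor * time_factor * speed_factor)

DES_KEY_BITS = 56  # key size broken by the 1998 EFF machine

# Factors assumed in the text: 100,000x the money, 100x the time, 100x the speed.
bits = extra_key_bits(100_000, 100, 100)
print(round(bits, 1))              # just under 30 bits
print(DES_KEY_BITS + round(bits))  # 86
```

Because two to the tenth power is slightly more than a thousand, a factor of a billion is worth slightly less than thirty bits, which is why the result is rounded.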

Bruce Schneier's book Applied Cryptography notes that fundamental physical limitations appear to prevent brute-force attacks against keys 256 bits in length.

Quantum computing allows a single computer to work on an immense number of possibilities at once, but there is a fundamental physical limitation on their operation that means that if they did exist, they would simply double the length of keys that could be cracked.

If it becomes possible to travel and communicate faster than light, then one could fill the Universe, with some 10^20 stars in it, with computers that just barely avoid turning into black holes; and, if one is going to throw in ideas that are from soft science fiction, what about parallel universes (of course, it could be said that parallel universes are exactly what are employed by quantum computers)? This may not matter in the real world, but for purposes of self-consistency, science-fiction writers should pay attention to the lengths of cryptographic keys their characters will need to employ. Of course, technologies to scan computing devices - and read minds - will make it harder to keep secrets as well. The proverbial tinfoil hat, rather than a fancier cryptographic algorithm, might well be what concerns adventurers wandering through the galaxy.

A Useful Table

Since scientists are using bigger and smaller quantities these days, they've been adding new prefixes to the metric system with which not everyone is familiar. As a public service, a table thereof is included here. But first, here are the prefixes everyone is familiar with:

 1           deka    ten                               deci    tenth
 2           hecto   hundred                           centi   hundredth

 3  (10)     kilo    thousand                          milli   thousandth
 6  (20)     mega    million                           micro   millionth
 9  (30)     giga    billion (thousand million)        nano
12  (40)     tera    trillion (billion)                pico

and now those which are newer:

15  (50)     peta    quadrillion (thousand billion)    femto
18  (60)     exa     quintillion (trillion)            atto
21  (70)     zetta   sextillion (thousand trillion)    zepto
24  (80)     yotta   septillion (quadrillion)          yocto
27  (90)     ronna   octillion (thousand quadrillion)  ronto
30 (100)     quetta  nonillion (quintillion)           quecto

The ronna and quetta prefixes were added in 2022; zetta and yotta were added in 1991; while femto and atto were adopted in 1964, their counterparts peta and exa were only adopted in 1975.

The prefixes giga and nano, as well as tera and pico, are also of relatively recent origin, having been adopted in 1960, even though they are now generally familiar due to their usage in connection with computers.

The first column indicates the powers of 10 involved, both positive and negative, for the two prefixes given, and the second column, in parentheses, the powers of 2 sometimes used with the positive-power prefixes in the computer field, on the basis that two to the tenth power is 1,024, which is closely approximated by ten to the third power, 1,000.

Note also that the British (and German, and Russian) number names follow, in parentheses, the American (and French) number names for the large numbers.

It would seem that the current range of prefixes, from quecto to quetta, is sufficient to cover all that might be encountered in the physical world, but the advance of science and technology has been full of surprises up to this point, and that trend may well continue.

Speaking of computers: since 10^3 is 1,000 and 2^10 is 1,024, often 1,024 bytes are referred to as a kilobyte, and 1,048,576 bytes are referred to as a megabyte, and so on.

However, alternatives have been proposed to eliminate this source of confusion and inaccuracy:

K kilo                               1,000    Ki kibi                             1,024
M mega                           1,000,000    Mi mebi                         1,048,576
G giga                       1,000,000,000    Gi gibi                     1,073,741,824
T tera                   1,000,000,000,000    Ti tebi                 1,099,511,627,776
P peta               1,000,000,000,000,000    Pi pebi             1,125,899,906,842,624
E exa            1,000,000,000,000,000,000    Ei exbi         1,152,921,504,606,846,976
Z zetta      1,000,000,000,000,000,000,000    Zi zebi     1,180,591,620,717,411,303,424
Y yotta  1,000,000,000,000,000,000,000,000    Yi yobi 1,208,925,819,614,629,174,706,176

Actually, not only have these alternatives been proposed, they have been officially adopted by the IEC (International Electrotechnical Commission), even if they are not yet in general use.
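The values in the table can be regenerated, and the growing gap between the two series made visible, with a short sketch. The function name prefix_values and the layout of the output are choices made for this example only.

```python
# Prefix pairs from the table above, in order of increasing size.
PREFIX_PAIRS = [("kilo", "kibi"), ("mega", "mebi"), ("giga", "gibi"),
                ("tera", "tebi"), ("peta", "pebi"), ("exa", "exbi"),
                ("zetta", "zebi"), ("yotta", "yobi")]

def prefix_values(n):
    """Return (decimal, binary) values of the n-th pair, with n = 1 for kilo/kibi."""
    return 10 ** (3 * n), 2 ** (10 * n)

for n, (decimal_name, binary_name) in enumerate(PREFIX_PAIRS, start=1):
    decimal, binary = prefix_values(n)
    excess = binary / decimal - 1  # how much larger the binary unit is
    print(f"{decimal_name:6} {binary_name:5} {binary:>33,} {excess:6.1%}")
```

The last column starts at 2.4% for the kibi and grows with every step, which is one reason the ambiguity matters more for large units than for small ones.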

Beyond the Vigintillion

Since I'm talking about large numbers here, it might be noted in passing that we don't have names for very many large numbers.

As may be familiar to many people, while a million is a thousand thousand, in [the United States of] America a billion is a thousand million, while in Britain a billion is a million million. In Canada, we also follow the American usage in this. Much of the world, however, follows the British practice; this is true, for example, for the names of numbers in Russian and German: but in French, the same principle as used by the Americans is followed.

The following table shows the powers of ten which correspond to the various -illions under both standards:

                       American   British
million                  6          6
billion                  9         12
trillion                12         18
quadrillion             15         24
quintillion             18         30 
sextillion              21         36
septillion              24         42
octillion               27         48
nonillion               30         54
decillion               33         60
undecillion             36         66
duodecillion            39         72
tredecillion            42         78
quattuordecillion       45         84
quindecillion           48         90
sexdecillion            51         96
septendecillion         54        102
octodecillion           57        108
novemdecillion          60        114
vigintillion            63        120
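Both columns follow a simple rule: if n is the number denoted by the Latin prefix (1 for million, 2 for billion, and so on), the American value is ten to the power 3n + 3 and the British value is ten to the power 6n. A minimal sketch, with the function name illion_powers chosen just for this illustration:

```python
def illion_powers(n):
    """Return (American, British) powers of ten for the n-th -illion."""
    return 3 * n + 3, 6 * n

# A few rows of the table: million, billion, decillion, vigintillion.
for name, n in [("million", 1), ("billion", 2), ("decillion", 10), ("vigintillion", 20)]:
    american, british = illion_powers(n)
    print(f"{name:14} {american:4} {british:4}")
```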

Since there is no good Latin prefix for the number twenty-one, this is where the sequence ends in standard dictionaries. Of course, the same pattern that followed the decillion could be repeated, starting with the unvigintillion, but just as in English twenty-seven and seventeen are quite unlike one another in form, the same may be true in Latin, making such a form quite incorrect.

But Latin prefixes are available to make the occasional -illion in the territory beyond (as noted previously by Dmitri A. Borgmann in the justly famous Language on Vacation):

                       American   British
trigintillion           93        180
quadragintillion       123        240
quinquagintillion      153        300
sexagintillion         183        360
septuagintillion       213        420
octogintillion         243        480
nonagintillion         273        540
centillion             303        600

and just as one can refer to ten million, or a hundred million, and, in the British system, to a thousand million, it would seem that if one can refer to a thousand vigintillion and a million vigintillion and so on, there is no problem with the fact that the words are now more widely spaced.

Power   American                            British
  3     thousand                            thousand
  6     million                             million
  9     billion                             thousand million
 12     trillion                            billion
 15     quadrillion                         thousand billion
...
 60     novemdecillion                      decillion
 63     vigintillion                        thousand decillion
 66     thousand vigintillion               undecillion
 69     million vigintillion                thousand undecillion
 72     billion vigintillion                duodecillion
 75     trillion vigintillion               thousand duodecillion
 78     quadrillion vigintillion            tredecillion
 81     quintillion vigintillion            thousand tredecillion
 84     sextillion vigintillion             quattuordecillion
 87     septillion vigintillion             thousand quattuordecillion
 90     octillion vigintillion              quindecillion
 93     trigintillion                       thousand quindecillion
 96     thousand trigintillion              sexdecillion
...
120     octillion trigintillion             vigintillion
123     quadragintillion                    thousand vigintillion
126     thousand quadragintillion           million vigintillion
129     million quadragintillion            thousand million vigintillion
132     billion quadragintillion            billion vigintillion
135     trillion quadragintillion           thousand billion vigintillion
138     quadrillion quadragintillion        trillion vigintillion
141     quintillion quadragintillion        thousand trillion vigintillion
144     sextillion quadragintillion         quadrillion vigintillion
147     septillion quadragintillion         thousand quadrillion vigintillion
150     octillion quadragintillion          quintillion vigintillion
153     quinquagintillion                   thousand quintillion vigintillion
156     thousand quinquagintillion          sextillion vigintillion
159     million quinquagintillion           thousand sextillion vigintillion
162     billion quinquagintillion           septillion vigintillion
165     trillion quinquagintillion          thousand septillion vigintillion
168     quadrillion quinquagintillion       octillion vigintillion
171     quintillion quinquagintillion       thousand octillion vigintillion
174     sextillion quinquagintillion        nonillion vigintillion
177     septillion quinquagintillion        thousand nonillion vigintillion
180     octillion quinquagintillion         trigintillion
183     sexagintillion                      thousand trigintillion

This quite nicely extends the American system far enough to embrace the googol. However, it but scratches the surface of the integers; it does not begin, for example, to reach towards numbers like the googolplex (which is 10^(10^100), just as the googol is 10^100). Very large numbers of that kind can indeed be meaningful to mathematicians; they are sometimes required in mathematical proofs, for example.

Although the endless line of finite integers defeats any attempt to name them all, this scheme can be extended a minute amount further. Good Latin prefixes for two hundred, three hundred, and so on, would allow this scheme to be extended upwards for another level, up to what would be, I suppose, the millillion, 10^3003 in the American system and 10^6000 in the British system.

The British system does seem to be superior and more logical, but it would also seem that it is too late to do more than deplore the ambiguity. If there were a way to start over on a blank sheet of paper, as it were, though... and perhaps there is.

At one point, I thought, given the -ard suffix of milliard, a British term for a thousand million, that perhaps the British and American systems could be distinguished by new suffixes. Now, I have a better idea: start over, with the myriad, instead of the thousand, as the basis, and consistently use the British system:

               1,0000    myriad
          1,0000,0000    milliad
     1,0000,0000,0000    myriad milliad
1,0000,0000,0000,0000    billiad

and so on, up to the centilliad, which would be ten to the eight hundredth power.

However, Donald Knuth had already proposed such a scheme, with the additional improvement that new number names would come along not every time eight digits are added to the end of the number, but when the length of the number doubles, thus enabling the system to be extended to much larger numbers.

Given that printouts of pi to many places of decimals usually include a space after every five digits, using 100,000 rather than 10,000 as the basic unit might also be considered. This quantity is known as the lakh in India, which also has the word crore for ten million. The chiliad is simply a synonym for a thousand; apparently there is no simple Greek word for 100,000.

One might also look here for some historical information on the two different systems; it is noted that the original system proposed by Nicolas Chuquet (in 1484) was the one used everywhere else but America, France, and Canada, and that Jacques Peletier (in 1549) introduced the milliard, billiard, and subsequent names for intermediate numbers in that same system.

An ancient Indian religious book, the Lalitavistara, notes ten million as the koti, and then continues with reference to the ayuta, niyuta, kankara, vivara, asobhya, vivaha, utsanga, bahula, and nagabala, each one a hundred times larger than the one before, but the ayuta was also used to mean 10,000 instead of 1,000,000,000 in India.

The common Indian system of numeration proceeds as follows for the powers of ten:

 1  dasa         das
 2  sata         san
 3  sahasra      hazar
 4  ayuta
 5  laksa        lakh
 6  prayuda
 7  koti         crore
 8  vyarbuda
 9  padma        arahb   (arab)
10  kharva
11  nikharva     carahb  (kharab)
12  mahapadma
13  sankha       nie     (neel)
14  samudra
15  madhya       padham  (padma)
16  antya
17  pararddha    sankh   (shankh)

the first column giving one transliteration of the system used in the Sanskrit language, and the second column giving the names of large numbers in use in India today (with an alternate transliteration, found in the Wikipedia article on the Indian numbering system, in parentheses).

Agreement at Last?

Rather than attempting to use a binary notation for powers of ten, it would seem appropriate to use a decimal system for large decimal numbers.

Perhaps the following system might be usable as one to replace the current system, which causes confusion by not being the same in all countries:

       1 ten
       2 hundred
       3 thousand
       4 ten thousand
       5 hundred thousand
       6 million
       7 ten million
       8 hundred million
       9 thousand million
      10 kharva
      20 bikharva
      30 trikharva
      40 quadrakharva
      50 quintikharva
      60 sexakharva
      70 septakharva
      80 octokharva
      90 nonakharva
     100 decikharva
     200 vigintikharva
     300 trigintikharva
     400 quadragintikharva
     500 quinquagintikharva
     600 sexagintikharva
     700 septuagintikharva
     800 octogintikharva
     900 nonagintikharva
    1000 centikharva
    2000 duocentikharva
    3000 tricentikharva
    4000 quadracentikharva
    5000 quinquacentikharva
    6000 sexacentikharva
    7000 septuacentikharva
    8000 octocentikharva
    9000 nonacentikharva
   10000 millikharva
   20000 duomillikharva
   30000 trimillikharva
   40000 quadramillikharva
   50000 quinquamillikharva
   60000 sexamillikharva
   70000 septuamillikharva
   80000 octomillikharva
   90000 nonamillikharva
  100000 myriakharva
 1000000 megakharva
10000000 kotikharva

Thus, the googol is now the decikharva. One could continue with the vyarbudakharva, followed by the gigakharva, but perhaps something better could be worked out.

Also note that the duomyriakharva, trimyriakharva, and so on are still required, but are simply omitted to save space, as the pattern is well established with the centikharva and millikharva that the myriakharva, megakharva, and kotikharva would continue.

In this system, 10^2048 would be one hundred million quadrakharva duocentikharva, for example, while 52!, the number of ways in which a deck of cards without jokers can be shuffled, which is 80,658,175,170,943,878,571,660,636,856,403,766,975,289,505,440,883,277,824,000,000,000,000, would be:

Eighty million, six hundred and fifty-eight thousand, one hundred and seventy-five sexakharva, one thousand, seven hundred and nine million, four hundred and thirty-eight thousand, seven hundred and eighty-five quintikharva, seven thousand, one hundred and sixty-six million, sixty-three thousand, six hundred and eighty-five quadrakharva, six thousand, four hundred and three million, seven hundred and sixty-six thousand, nine hundred and seventy-five trikharva, two thousand, eight hundred and ninety-five million, fifty-four thousand, four hundred and eight bikharva, eight thousand, three hundred and twenty-seven million, seven hundred and eighty-two thousand, four hundred kharva.
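The grouping behind that name is mechanical: the number is cut into ten-digit groups from the right, and each group is tagged with the name for its power of ten. The sketch below shows only that grouping, leaving each group as digits rather than spelling it out in words; the names KHARVA_UNITS and kharva_groups are invented for this example, and it covers only numbers below the decikharva.

```python
import math

# Unit names for 10^10 through 10^90, taken from the table above.
# The empty first entry is for the group below one kharva.
KHARVA_UNITS = ["", "kharva", "bikharva", "trikharva", "quadrakharva",
                "quintikharva", "sexakharva", "septakharva", "octokharva",
                "nonakharva"]

def kharva_groups(n):
    """Split n (0 <= n < 10**100) into ten-digit groups with their unit names."""
    if not 0 <= n < 10 ** 100:
        raise ValueError("this sketch only covers numbers below the decikharva")
    groups = []
    index = 0
    while n > 0:
        n, group = divmod(n, 10 ** 10)
        if group:  # groups that are all zeros are not spoken
            groups.append((group, KHARVA_UNITS[index]))
        index += 1
    return groups[::-1]  # largest unit first

for value, unit in kharva_groups(math.factorial(52)):
    print(f"{value:,} {unit}".strip())
```

The first line printed is 80,658,175 sexakharva and the last is 8,327,782,400 kharva; the final ten digits of 52! are all zeros, so nothing is printed for them.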

Perhaps a simpler way to achieve unity would be to note that the milliard is not used in America, since the billion is used there for the same number, and so one could simply eliminate the ambiguous names, and instead follow the series beginning with the milliard. That is, a British billion would now become a thousand milliard, and a British decillion would become a thousand nonilliard, as follows:

  9      milliard
 12      thousand milliard
 15      billiard
 18      thousand billiard
 21      trilliard
 24      thousand trilliard
 27      quadrilliard
 33      quintilliard
 39      sextilliard
 45      septilliard
 51      octilliard
 57      nonilliard
 63      decilliard
 69      undecilliard
 75      duodecilliard
 81      tredecilliard
 87      quattuordecilliard
 93      quindecilliard
 99      sexdecilliard
105      septendecilliard
111      octodecilliard
117      novemdecilliard
123      vigintilliard
183      trigintilliard
243      quadragintilliard
303      quinquagintilliard
363      sexagintilliard
423      septuagintilliard
483      octogintilliard
543      nonagintilliard
546      thousand nonagintilliard
549      million nonagintilliard
552      milliard nonagintilliard
555      thousand milliard nonagintilliard
558      billiard nonagintilliard
594      octilliard nonagintilliard
597      thousand octilliard nonagintilliard
600      nonilliard nonagintilliard
603      centilliard

In this tradition-based system, 52! becomes:

Eighty thousand, six hundred and fifty-eight decilliard, one hundred and seventy-five thousand, one hundred and seventy nonilliard, nine hundred and forty-three thousand, eight hundred and seventy-eight octilliard, five hundred and seventy-one thousand, six hundred and sixty septilliard, six hundred and thirty-six thousand, eight hundred and fifty-six sextilliard, four hundred and three thousand, seven hundred and sixty-six quintilliard, nine hundred and seventy-five thousand, two hundred and eighty-nine quadrilliard, five hundred and five thousand, four hundred and forty trilliard, eight hundred and eighty-three thousand, two hundred and seventy-seven billiard, eight hundred and twenty-four thousand milliard.

80,658,175,170,943,878,571,660,636,856,403,766,975,289,505,440,883,277,824,000,000,000,000
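In the milliard-based system the grouping works slightly differently: the lowest nine digits stand alone, and every six digits above them get the next -illiard name, the nth of which is ten to the power 6n + 3. A rough sketch of that grouping follows; the names ILLIARDS and illiard_groups are made up for this example, and it stops at the decilliard.

```python
import math

# Names for 10^9, 10^15, 10^21, ... 10^63, from the table above.
ILLIARDS = ["milliard", "billiard", "trilliard", "quadrilliard",
            "quintilliard", "sextilliard", "septilliard", "octilliard",
            "nonilliard", "decilliard"]

def illiard_groups(n):
    """Split n (0 <= n < 10**69) into the part below a milliard and six-digit groups above it."""
    if not 0 <= n < 10 ** 69:
        raise ValueError("this sketch stops at the decilliard")
    high, low = divmod(n, 10 ** 9)
    groups = []
    for name in ILLIARDS:
        high, group = divmod(high, 10 ** 6)
        if group:  # groups that are all zeros are not spoken
            groups.append((group, name))
    groups.reverse()  # largest unit first
    if low:
        groups.append((low, ""))
    return groups

for value, unit in illiard_groups(math.factorial(52)):
    print(f"{value:,} {unit}".strip())
```

For 52! this prints ten lines, from 80,658 decilliard down to 824,000 milliard, matching the groups spelled out above.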

I had also considered avoiding the exotic by the simple expedient of assigning definite explicit numerical values to the colloquial "zillion", "bazillion", and "gazillion", but for the moment, I would refer you to this page, which discusses this type of number in great detail.

Since there happens to be a game known as billiards, the alternative of keeping the milliard as 10^9, but then making the billiard 10^18, the trilliard 10^27, and the quadrilliard as 10^36, and so on for a unified system does not seem to be available either. However, thinking of other likely suffixes, like the -ant of savant, or the -et of gourmet versus the -and of gourmand, provides another idea, one where a step backwards is taken to take a step forwards:

   3  thousand
   6  million (or bisand)
   9  trisand
  12  quadrasand
  15  quintisand
  18  sexasand
  21  septasand
  24  octosand
  27  nonasand
  30  decisand

and another way to proceed would be to use prefixes whose purity was not affected by the British/French split:

   6 million
  12 ethillion
  18 butyllion
  24 tetrillion
  30 pentillion
  36 hexillion
  42 heptillion

Since "million" starts with an M, and so does "methane", it seems as if the natural thing to do is to begin with something different from the standard set of Greek prefixes. The trouble is that, after seven, the Greek and Latin prefixes no longer remain sufficiently distinct to sustain this alternative. Also, someone else already has made a suggestion to switch to the Greek prefixes to permit larger numbers to be reached. Having two distinct sets of prefixes would allow, say, a dillion to be a thousand (or a million) centillion, thus permitting a jump to a new level when a system using one set of prefixes becomes exhausted.

Given the coincidence of prefixes between the Greek and Latin systems, perhaps we now have a use for the zillion after all, with ba-, ka-, and ga- replacing tri-, oct-, and non-, as follows:

                          American                British
centillion                     303                     600
septillion centillion          327                     642
octillion centillion           330                     648
nonillion centillion            **                     654
zillion                        333                     660

with the zillion coming into play just after either system ends. However, this has the consequence of it becoming difficult to convert from a power of ten to a numeric name, particularly if the practice continues, so that one is dealing with number names that refer to 10^(K*(101^n)) where K is either 333 or 660, and thus it would be more appropriate to make the zillion just a bit smaller, and avoid using the centizillion before creating the dizillion, so as to obtain the following system:

zillion                                 600
  bizillion                            1200
  trizillion                            1800
  nonagintizillion                    54000
dizillion                             60000
  bidizillion                        120000
  nonagintidizillion                5400000
bazillion                           6000000
tetrazillion                      600000000
pentazillion                    60000000000
hexazillion                   6000000000000
heptazillion                600000000000000
kazillion                 60000000000000000
gazillion               6000000000000000000
dekazillion           600000000000000000000   

And, if we continue with smaller numbers expressed by the unified system based on the milliard, billiard, and trilliard and so on as above, numbers up to quite a large size now have unambiguous names.

(It might be noted, though, that the Greek prefix for nine is, or at least can be, the unambiguous ennea-, and thus it isn't really necessary to retain the gazillion... but I do recall that chemists refer to nonane and not enneanane after octane.)

In this system, 10^(31,415,926,535) becomes one hundred thousand quadrilliard quinquagintilliard septizillion septigizillion octadizillion nonagidizillion quinquabazillion trigintibazillion bitetrazillion quinquatetrazillion.
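Finding such a name amounts to breaking the exponent down by the unit sizes in the table, largest first. The following sketch does only that step, using the four units that matter for this exponent; the names ZILLION_UNITS and decompose_exponent are invented for the illustration, and turning the counts into prefixes is left aside.

```python
# Powers of ten for the units in the table above, largest first.
ZILLION_UNITS = [("tetrazillion", 600_000_000),
                 ("bazillion", 6_000_000),
                 ("dizillion", 60_000),
                 ("zillion", 600)]

def decompose_exponent(exponent):
    """Return ([(count, unit), ...], remainder) for a power of ten."""
    parts = []
    for name, size in ZILLION_UNITS:
        count, exponent = divmod(exponent, size)
        parts.append((count, name))
    return parts, exponent

parts, remainder = decompose_exponent(31_415_926_535)
for count, name in parts:
    print(count, name)
print("remainder:", remainder)
```

This gives 52 tetrazillion, 35 bazillion, 98 dizillion, and 77 zillion, with a remaining exponent of 335 to be named by the milliard-based system.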

Yet another possibility has occurred to me:

  1     ten
  2     hundred
  3     thousand
  4     myriad
  5     ten myriad
  6     hundred myriad
  7     thousand myriad
  8     byriad
 10     hundred byriad
 12     tryriad
 14     hundred tryriad
 16     quadriad
 18     hundred quadriad
 20     quintriad
 22     hundred quintriad
 24     sextriad
 26     hundred sextriad
 28     septriad
 30     hundred septriad
 32     octriad
 34     hundred octriad
 36     nonyriad
 38     hundred nonyriad
 40     decyriad     

and so on, a scaled-down long form system based on the myriad instead of the million. Since it is scaled down, there would be less of an impulse to go to the corresponding short form.

Copyright (c) 2004, 2005 John J. G. Savard

