Main : Index : Pencil and Paper Systems : Improving Substitution : Polyalphabetic Substitution

Polyalphabetic Substitution

The idea of using substitution ciphers that change during the course of a message was a very important step forwards in cryptography. David Kahn's book, The Codebreakers, gives a full account of the origins of this idea during the Italian Renaissance.

The earliest form of polyalphabetic cipher was developed by Leon Battista Alberti by 1467. His system involved writing the ciphertext in small letters, and using capital letters as symbols, called indicators, to indicate when the substitution changes, now and then through a message. The plaintext alphabet on his cipher disk was in order, and included the digits 1 through 4 for forming codewords from a small vocabulary.

Subsequently, more modern forms were devised, which change the substitution for each letter:

A progressive-key system, where keys are used one after the other in normal order. This was first published posthumously, in a book by Johannes Trithemius that appeared in 1518. The key ABCD...Z was used with regular alphabets in the form depicted therein.
A keyword indicating the alphabets to use in turn. Although this system is what is called the Vigenère, it originated with Giovan Batista Belaso in 1553. In 1563, Giovanni Battista Porta added the use of mixed alphabets to this system.
The autokey system, where a key starts the choice of alphabet, but the message itself determines the alphabets to use for later parts of the message. Although an unusable form of this was first proposed by Girolamo Cardano, it was Blaise de Vigenère who proposed the modern form of the autokey cipher in 1585.

The following compact table provides 26 alphabets, each labelled with a letter of the alphabet:

B C D E F  ZABCDEFGHIJKLMNOPQRSTUVWXY
G H I J K  UVWXYZABCDEFGHIJKLMNOPQRST
L M N O P  PQRSTUVWXYZABCDEFGHIJKLMNO
Q R S T U  KLMNOPQRSTUVWXYZABCDEFGHIJ
V W X Y Z  FGHIJKLMNOPQRSTUVWXYZABCDE

B G L Q V  ABCDEFGHIJKLMNOPQRSTUVWXYZ
C H M R W  BCDEFGHIJKLMNOPQRSTUVWXYZA
D I N S X  CDEFGHIJKLMNOPQRSTUVWXYZAB
E J O T Y  DEFGHIJKLMNOPQRSTUVWXYZABC
F K P U Z  EFGHIJKLMNOPQRSTUVWXYZABCD

The A alphabet isn't shown, since in that alphabet, every letter stands for itself, and so, if nothing is done, nothing need be looked up in the table. For any other alphabet, use the letter indicating the alphabet to find a row among the top five, and a row among the bottom five; using those two rows, the upper row stands for plaintext, the lower for cipher.

Thus, for alphabet Q, the top row begins KLMNO... and the bottom row begins ABCDE..., and so K becomes A, Q becomes G, and A becomes Q in that alphabet.

If you think of A as standing for zero, B for 1, up to Z for 25, this particular set of alphabets is nothing more than the modulo 26 addition of the plaintext and the key to obtain the ciphertext. Circular disks or sliding scales can be used to carry out the addition. This, perhaps, can be more easily seen if we exhibit the Vigenère tableau in full, accompanied by the table for modulo-26 addition:

 |ABCDEFGHIJKLMNOPQRSTUVWXYZ    | 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
-+--------------------------  --+-----------------------------------------------------------------------------
A|ABCDEFGHIJKLMNOPQRSTUVWXYZ   0| 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
B|BCDEFGHIJKLMNOPQRSTUVWXYZA   1| 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0
C|CDEFGHIJKLMNOPQRSTUVWXYZAB   2| 2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1
D|DEFGHIJKLMNOPQRSTUVWXYZABC   3| 3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2
E|EFGHIJKLMNOPQRSTUVWXYZABCD   4| 4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3
F|FGHIJKLMNOPQRSTUVWXYZABCDE   5| 5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4
G|GHIJKLMNOPQRSTUVWXYZABCDEF   6| 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5
H|HIJKLMNOPQRSTUVWXYZABCDEFG   7| 7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6
I|IJKLMNOPQRSTUVWXYZABCDEFGH   8| 8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7
J|JKLMNOPQRSTUVWXYZABCDEFGHI   9| 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8
K|KLMNOPQRSTUVWXYZABCDEFGHIJ  10|10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9
L|LMNOPQRSTUVWXYZABCDEFGHIJK  11|11 12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10
M|MNOPQRSTUVWXYZABCDEFGHIJKL  12|12 13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11
N|NOPQRSTUVWXYZABCDEFGHIJKLM  13|13 14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12
O|OPQRSTUVWXYZABCDEFGHIJKLMN  14|14 15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13
P|PQRSTUVWXYZABCDEFGHIJKLMNO  15|15 16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
Q|QRSTUVWXYZABCDEFGHIJKLMNOP  16|16 17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
R|RSTUVWXYZABCDEFGHIJKLMNOPQ  17|17 18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
S|STUVWXYZABCDEFGHIJKLMNOPQR  18|18 19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
T|TUVWXYZABCDEFGHIJKLMNOPQRS  19|19 20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
U|UVWXYZABCDEFGHIJKLMNOPQRST  20|20 21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
V|VWXYZABCDEFGHIJKLMNOPQRSTU  21|21 22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
W|WXYZABCDEFGHIJKLMNOPQRSTUV  22|22 23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
X|XYZABCDEFGHIJKLMNOPQRSTUVW  23|23 24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
Y|YZABCDEFGHIJKLMNOPQRSTUVWX  24|24 25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Z|ZABCDEFGHIJKLMNOPQRSTUVWXY  25|25  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

And, of course, instead of the modulo-26 addition table for our 26-letter alphabet, a Vigenere table can be constructed for any alphabet. Thus, modulo-24 addition would be used for the Greek alphabet, modulo-32 addition for the Russian alphabet, or, as shown in the picture at left, modulo-22 addition for the Hebrew alphabet. Here, the table is written from right to left, in the same direction as normally used for Hebrew writing.

The message "Wish you were here" can be encrypted by the three possible methods, using SIAMESE as the keyword:

Straight keyword:

Message: WISHYOUWEREHERE
Key:     SIAMESESIAMESES
Cipher:  OQSTCGYOMRQLWVW

Progressive key:

Message: WISHYOUWEREHERE
Key:     SIAMESETJBNFTFU
Cipher:  OQSTCGYPNSRMXWY

Autokey:

Message: WISHYOUWEREHERE
Key:     SIAMESEWISHYOUW
Cipher:  OQSTCGYSMJLFSLA

For the progressive key, the keyword, followed by the keyword advanced one position at a time through the alphabet, is used. Just using ABCDEF... as the key would not have been unique enough to serve as a real cipher.

The table shown here can be thought of as a table for the addition of letters, which is equivalent to addition modulo 26, where A stands for 0, B stands for 1, continuing on to Z, which would stand for 25.

The plain keyword system can be solved by the Kasiski method; look for repeated sequences of letters in a message, and count the number of letters between them. From this, it is easy to spot common factors, and determine the length of the keyword used. This lets one sort the letters into the ones enciphered with the same alphabet. If normal alphabets are used, looking at the profile of the frequency count makes solution trivial.

For the other two methods, elementary cryptanalysis only allows solution for normal (or at least known) alphabets. The progressive key case can be made to yield its period if one looks not for repeated letters, but for repeated distances in the alphabet between adjacent letters; this subtracts out the slow movement of the keyword through the alphabet. The autokey can basically be solved by brute-force trial on the length of its starting keyword. Of course, these systems can still be solved with mixed alphabets, but more advanced methods are needed, involving statistics or multiple messages with the same key.

In addition to using mixed alphabets for greater security, there are other systems of historical importance.

The Gronsfeld, which added a numeric key to the plaintext, meant that there were only ten possible equivalents for each letter, but was easier to do by hand without a table or slide or disk. The Porta system used a smaller table; the first half of the alphabet was stationary while the second half moved, and equivalents for letters in each half of the alphabet were found in the other half.

The table for the Porta system (converted to the modern 26-letter alphabet) is as follows:

      ABCDEFGHIJKLM
      -------------
  AB |NOPQRSTUVWXYZ
  CD |OPQRSTUVWXYZN
  EF |PQRSTUVWXYZNO
  GH |QRSTUVWXYZNOP
  IJ |RSTUVWXYZNOPQ
K KL |STUVWXYZNOPQR
e MN |TUVWXYZNOPQRS
y OP |UVWXYZNOPQRST
  QR |VWXYZNOPQRSTU
  ST |WXYZNOPQRSTUV
  UV |XYZNOPQRSTUVW
  WX |YZNOPQRSTUVWX
  YZ |ZNOPQRSTUVWXY

The Gronsfeld, and, even more easily, the Porta, because they only allow some letters, but not others, as equivalents for any given plaintext letter, can be attacked through this weakness.

In attempting to devise a cipher that, like the Gronsfeld, lends itself to mental arithmetic, I used (for the English alphabet) the method of representing numbers as letters that was used by the ancient Hebrews and the ancient Greeks:

A 1  J 10  S 100
B 2  K 20  T 200
C 3  L 30  U 300
D 4  M 40  V 400
E 5  N 50  W 500
F 6  O 60  X 600
G 7  P 70  Y 700
H 8  Q 80  Z 800
I 9  R 90

Then, the rule for encipherment is this:

a) If the plaintext and key letters are in the same column, they are added:

B (2) + F (6) = H (8)
L (30) + J (10) = M (40)

b) If the plaintext and key letters are in two different columns, their nonzero digits are added, and the letter in the third column which contains neither key nor plaintext containing the sum is taken:

D (4) + L (30) = Y (700)
W (500) + K (20) = G (7)

If we had a 27-letter alphabet, we would only have to add that when the sum is greater than 9, subtract 9 (in the appropriate digit place):

M (40) + Q (80) = L (30)

For the 26-letter alphabet, it's easy to modify rule (a): if the two letters are in the third column, subtract 800 instead of 900.

U (300) + Y (700) = T (200)

Rule (b) is modified in this way: always subtract 9; if the cipher letter and the key letter produce 900 as the result, use instead the letter that would be produced by enciphering a letter with the value 900 with the key letter. Since there is no letter with that value, when one is produced by deciphering, decipher 900 with the key to get the true plaintext letter.

This produces the table seen below:

    Plaintext
    ABCDEFGHI JKLMNOPQR STUVWXYZ
    ----------------------------
  A|BCDEFGHIA TUVWXYZJS KLMNOPQR
  B|CDEFGHIAB UVWXYZKST LMNOPQRJ
  C|DEFGHIABC VWXYZLSTU MNOPQRJK
  D|EFGHIABCD WXYZMSTUV NOPQRJKL
  E|FGHIABCDE XYZNSTUVW OPQRJKLM
  F|GHIABCDEF YZOSTUVWX PQRJKLMN
  G|HIABCDEFG ZPSTUVWXY QRJKLMNO
  H|IABCDEFGH QSTUVWXYZ RJKLMNOP
  I|ABCDEFGHI STUVWXYZR JKLMNOPQ

  J|TUVWXYZAS KLMNOPQRJ BCDEFGHI
  K|UVWXYZBST LMNOPQRJK CDEFGHIA
  L|VWXYZCSTU MNOPQRJKL DEFGHIAB
K M|WXYZDSTUV NOPQRJKLM EFGHIABC
e N|XYZESTUVW OPQRJKLMN FGHIABCD
y O|YZFSTUVWX PQRJKLMNO GHIABCDE
  P|ZGSTUVWXY QRJKLMNOP HIABCDEF
  Q|HSTUVWXYZ RJKLMNOPQ IABCDEFG
  R|STUVWXYZI JKLMNOPQR ABCDEFGH

  S|KLMNOPQRJ BCDEFGHIA TUVWXYZS
  T|LMNOPQRJK CDEFGHIAB UVWXYZST
  U|MNOPQRJKL DEFGHIABC VWXYZSTU
  V|NOPQRJKLM EFGHIABCD WXYZSTUV
  W|OPQRJKLMN FGHIABCDE XYZSTUVW
  X|PQRJKLMNO GHIABCDEF YZSTUVWX
  Y|QRJKLMNOP HIABCDEFG ZSTUVWXY
  Z|RJKLMNOPQ IABCDEFGH STUVWXYZ

This table is slightly imperfect. For each of the first eighteen letters in the alphabet, when they occur in the plaintext, there is one letter that no key letter will cause to be its ciphertext equivalent, and there is another letter that will be that plaintext letter's ciphertext equivalent for two different key letters. However, although imperfect, it is less so than the Gronsfeld cipher, and so the system might be of some use (although just converting to digits with a straddling checkerboard achieves the same goal, of simplifying applying a key, without any imperfections, and considerably more simply).

It is, however, more important to recognize the names of two other systems. If Vigenère can be thought of as plaintext + key = cipher, Beaufort amounts to key - plaintext = cipher. Since cipher = key + plaintext, Beaufort, like Porta, is reciprocal: the same steps exactly will both encipher and decipher. Variant Beaufort is plaintext - key = cipher, and is the same as deciphering for Vigenère.

Slides and disks are often used for the Vigenère and other polyalphabetic ciphers, particularly mixed-alphabet Vigenère.

Helen Fouché Gaines' Elementary Cryptanalysis gives a classification of mixed alphabet slides into four types:

Type 1: Mixed plaintext alphabet, plain cipher alphabet.
Type 2: Plain plaintext alphabet, mixed cipher alphabet.
Type 3: The same mixed alphabet for plain and cipher.
Type 4: Different mixed plain and cipher alphabets.

I would like to extend this classification slightly to make it comprehensive:

Type 0: Plain plaintext and cipher alphabet.
Type 0a: Plain plaintext alphabet, reversed cipher alphabet.
Type 1: Mixed plaintext alphabet, plain cipher alphabet.
Type 2: Plain plaintext alphabet, mixed cipher alphabet.
Type 3: The same mixed alphabet for plain and cipher.
Type 3a: Mixed plaintext alphabet, the same alphabet in reverse for cipher.
Type 4: Different mixed plain and cipher alphabets.

A slide of type 0a produces a reciprocal cipher, and can be used for Beaufort. The mechanical equivalent of such a slide is an element of a Hagelin machine.

The Type 1 slide is more easily cryptanalyzed than the Type 2 or above slides since once the different alphabets have been determined by discovering the period of the cipher by the Kasiski method (looking for repeated digrams, trigrams and above, noting the distance between them, and looking for a common factor to most of the distances, giving greater weight to longer repetitions) the frequency counts can be lined up, since they are displaced along the cipher slide, which in this case has the known regular alphabet along it.

Even in the mixed-alphabet case, once the period is found, letter frequencies and bigram frequencies can be used to read the message. For a frequent letter, whether only a few letters, or a wide variety of letters, appear before or after that letter helps to identify whether the letter is a vowel or a consonant, or to determine exactly which letter it is.

When some alphabets are partly reconstructed, if you know that the alphabets have been produced by a slide, even one with two mixed alphabets, there are certain logical inferences that you may be able to make that will obtain the values of additional letters. This technique is known as symmetry of position.

Let us suppose that we know that the plaintext letters E and N become, in one alphabet, the ciphertext letters P and Q, and in another alphabet the ciphertext letters K and V.

Since the distance between the letters E and N on the slide with the scrambled plaintext alphabet does not change, even though it is unknown, we know from this that the letters P and Q are the same distance apart on the ciphertext slide as the letters K and V are.

And since one goes from one alphabet generated by a slide to another by moving one of the alphabets a certain distance, one also knows that the two equivalents for E, the letters P and K on the ciphertext slide, are the same distance apart as Q and V.

Of course, these two facts are really consequences of one another: since P-Q = K-V, then it must also be true that P-K = Q-V, where P, Q, K, and V are now standing for unknowns in modulo-26 arithmetic.

In any event, if we then find that in a third alphabet, the plaintext letters R and T become the ciphertext letters P and K, respectively, and if we are fortunate enough to have an alphabet in which R becomes Q, we can then conclude that T will become V in that alphabet.

Later, when at least two alphabets are almost completely reconstructed, it may be possible to work out what alphabets were present on the slide that generated them. This can help in continuing the solution, or in reading future messages.

For example, a slide in two consecutive positions generates these alphabets:

QWERTYUIOPASDFGHJKLZXCVBNM   abcdefghijklmnopqrstuvwxyz
BOXWITHFVEDZNLQURJGSPACKMY   dkanxlqufrjgymvebwzihcopts

QWERTYUIOPASDFGHJKLZXCVBNM   abcdefghijklmnopqrstuvwxyz
OXWITHFVEDZNLQURJGSPACKMYB   zmclwqurvjgsbyedointfkxahp

On the left is the way the slide looks, and on the right are the alphabets as they appear to the cryptanalyst, who does not yet know what alphabets are on the slides. Consecutive alphabets are shown; the method works for any odd shift between alphabets except 13 (not because it's unlucky, but because it is exactly half of 26, the number of letters in the alphabet); for an even shift, partial information about the slide is obtained: two separate 13-letter pieces.

The slides can be reconstructed because of the information the alphabets above give about them: D and Z are the equivalents for A in the two alphabets. So, the distance between D and Z on the bottom slide must be the amount by which the slide must be moved to get from one alphabet to the other. But that is also true of K and M, the two equivalents for B, and A and C, the two equivalents for C, and so on.

So, we start with the letter D. The letter Z is our displacement ahead of it: here, it happens that the displacement is one, so we'll get the exact alphabet on the slide, instead of simply an equivalent alphabet. To move another displacement forward, we find Z as the equivalent for S in the first alphabet, and N as its equivalent in the second.

Thus, we reconstruct the alphabets like this:

a  s  d  f  g  h  j  k  l  z  x  c  v
DZ ZN NL LQ QU UR RJ JG GS SP PA AC CK

b  n  m  q  w  e  r  t  y  u  i  o  p
KM MY YB BO OX XW WI IT TH HF FV VE ED

If the shift were three instead of one, then the alphabets would be subjected to decimation, so that instead of DZNLQ... the first few letters after D would be in positions D..Z..N..L..Q..., but the slides would still generate the same cipher alphabets, only after different shifts.

It is particularly because the Kasiski method of attacking the simple keyword form of the Vigenere is so powerful that other methods, such as key progression and the autokey were originated to avoid it.

The Lanaki Classical Cryptography Course, available at The Crypto Drop Box, notes two other methods sometimes tried. One is to repeat the letters of one's keyword an increasing number of times, with a period different from the length of the keyword: thus, the keyword SIAMESE might be repeated with the pattern 12345, so that one's key would begin

SIIAAAMMMMEEEEESEESSSIIIIAAAAAMEESSSEEE...

and an idea suggested by W. F. Friedman is then noted, where instead of using a simple sequence like 12345, one uses the lengths of the Morse Code symbols for the letters in a keyword to control the pattern of repetitions.

These methods don't make a truly aperiodic cipher the way the autokey does, but like the progressive-key method, they provide a longer period. However, the presence of stretches of plaintext enciphered with the same keyletter at least seems like a weakness.

Of course, one could take Friedman's idea, and make it more complicated (and therefore impractical for actual use!) in a number of ways.

Let us use a Morse Code keyword of BAKER, which is, in dots and dashes,

-... .- --. . .-.

and use two other keywords, ORANGE and CHOCOLATE, for Vigenere use.

A dot selects a letter from the first keyword, a dash selects a letter from the second keyword. And let both keywords also experience key progression.

While this will still lead to a sequence with a finite period, the period will be a long one, and will start like this:

ORAN GE PSB O HFQ  TCPI GR UDQ J HSV  ERKI TW FSL J UXG  TMKV YH UNL W ZIV
CHOC OL ATE D IPD  PMBU FE JQE Q NCV  GFKR FR ODW H GLS  GSPE XI HMT H TQF
-... .- --. . .-.  -... .- --. . .-.  -... .- --. . .-.  -... .- --. . .-.
CRAN GL ATB O HPQ  PCPI GE JQQ J HCV  GRKI TR ODL J ULG  GMKV YI HML W ZQV

alternatively, the ends of each letter in the Morse keyword could control key progression for the first keyword, and the end of each occurence of the whole keyword could control key progression for the second keyword, like this:

ORAN HF QTC Q KIS  WFSL KU YHU O NXA  KXQO ZC MZS R CFO  CVTD HQ EXV G KTG
CHOC OL ATE C HOC  PMBU FD IPD P MBU  GEJQ EQ NCV G EJQ  FROD WH FKR F ROD
-... .- --. . .-.  -... .- --. . .-.  -... .- --. . .-.  -... .- --. . .-.
CRAN HL ATC Q KOS  PFSL KD IPU O NBA  GXQO ZQ NCS R CJO  FVTD HH FKV G KOG

thus making use of more of the information in the Morse keyword, but, since such things are clearly impractical by hand, we shall instead let such devices as rotor machines and the Hagelin lug and pin machine introduce complicated polyalphabetic ciphers, as it was with such devices that such ciphers finally became usable.

Cipher disks are harder to make than slides, but they do look prettier. One type of cipher disk I invented independently looks like this:

This disk was the best of my attempts to devise something which would be easy to use, but involve something more than just sliding one alphabet against another. My intention was to approach the power of a rotor machine.

The wheel is built from four components, as illustrated below:

Essentially, the wheel is constructed from four disks, one of which has half of its outer portion removed.

On the bottom is a disk with the mixed plaintext alphabet, shown as item 1 in the diagram.

Above it is item 2, a disk with the numbers 1 through 13 repeated twice, against a backround of one color (shown here as purple).

Then comes item 3, a disk of the same diameter, but with half of its perimeter removed to expose thirteen of the spaces on the disk below. The half remaining has a background of a color contrasting with that disk (shown here as green), and a mixed sequence of the numbers 1 through 13.

Finally, there is the smallest disk on the top, item 4, with the mixed cipher alphabet.

How the disks are stacked one on top of the other is illustrated by item 5 in the drawing, a view of the four disks one on top of the other, but slightly offset to allow the perspective to be visible.

In use, one looks up a letter in the plaintext alphabet, and proceeds to the number in the next inner band. Then, starting from the same number, against the background of the opposite color, also in that band (but on the other of the two middle disks) one proceeds to the cipher letter on the innermost circle with the cipher alphabet. This is illustrated by item 6 in the picture.

One idea I had for making this type of device useful, by making it easier to move the disks with each letter enciphered, was to add to each disk a rim (presumably with 26 notches or bumps around the outside, not shown for simplicity) so that the four disks could be advanced like thumbwheels.

Versions of the disks using latticework to hold the outside rim in place, while allowing all the letters and numbers in lower disks to be visible, are shown as items 7, 8, 9, and 10 in the diagram. Item 11 shows an oblique view of the disks stacked, item 12 a face-on view.

One way to make use of this construction would be to have layers of cardboard between the disks, so that each rim is individually exposed for a short distance. The 26 possible key letters would be allocated so that seven of them are used to indicate that the lowest disk is to turn counterclockwise from 1 to 7 spaces, another 7 turn the smallest disk clockwise 1 to 7 spaces, and 6 indicate the disk shown with a purple rim is to turn counterclockwise from 1 to 6 spaces, and 6 indicate the disk with half its periphery cut away is to turn clockwise from 1 to 6 spaces.

For use with text as a key, the letters would be allocated so that the frequent letters represented odd displacements, and were distributed evenly among the four disks.

A version of the same device as a slide instead of a disk would look like this:

 -----------------------------------------------------
  F S H G Y M L Q K X O N E R Z A D I V C U T P W J B
        -------------------------
 ------|    1       1       1 1  |--------------------
       |7 4 1 6 3 5 0 1 9 2 3 2 8|        1 1 1 1
  3 4 5|                         |6 7 8 9 0 1 2 3 1 2
 ------                           --------------------
 -----------------------------------------------------
  X V I D U J C G M E Z A F T W O S H K Y P B R N Q L
 -----------------------------------------------------

The Byrne Chaocipher, which was mentioned in David Kahn's The Codebreakers, was subsequently the subject of a Cryptologia article, which indicated that the principle of this wheel may have been used before in that cipher.

Another device I designed myself in trying to produce a convenient cardboard approximation to a rotor machine is the following slide:

The slide has a frame with an approximately rectangular pattern of holes, but with two corners clipped. Behind the frame, there are two slides; one with slots in it that moves vertically, and a solid one that moves horizontally behind it.

The letters on the vertical slide are arranged so that as it moves up or down one position, the letters that become concealed are rearranged and fill the spaces on the slide that now become visible, like this:

   P E H S C A      p e h s c a
 L V F O K R Z    l V F O K R Z
 I B Q Y G W N    I B Q Y G W N
 X T M J D U      X T M J D U E
                  C H L S P A

The same principle applies to the horizontal slide. However, since space is required for the vertical sliding alphabets, and the spaces between the open rectangles, instead of being filled with merely one set of alphabets, it contains three interlaced alphabets, which independently change in this way.

In the previous section, we saw how a straddling checkerboard like this:

    9 8 2 7 0 1 6 4 3 5
    -------------------
    A T   O N E   S I R
 2  B C D F G H J K L M
 6  P Q U V W X Y Z . /

could be used to convert letters to digits. Since it takes account of letter frequencies, the conversion is an efficient one.

This type of conversion makes the various types of polyalphabetic encipherment more convenient by hand, since the Vigenere encryption is now simply addition (on a digit-by-digit basis, ignoring carries).

The table for this operation is, of course:

 |0123456789
-+----------
0|0123456789
1|1234567890
2|2345678901
3|3456789012
4|4567890123
5|5678901234
6|6789012345
7|7890123456
8|8901234567
9|9012345678

and, as this is the table for addition modulo 10, it is not surprising that it resembles the table above for addition modulo 26, or the equivalent Vigenère table.

So, using this conversion from letters to numbers to perform the equivalent of the examples above:

Repeating key:
 W ISH Y OU W EREH ERE
 603421667626015121151
 439251414392514143925
 ---------------------
 032672071918529264076

Progressive key:
 W ISH Y OU W EREH ERE
 603421667626015121151
 439251415403625265147
 ---------------------
 032672072029630386298

Autokey:
 W ISH Y OU W EREH ERE
 603421667626015121151
 439251416034216676260
 ---------------------
 032672073650221797311

so the same methods are as applicable to digits as they are to letters. However, the ease of doing addition on digits as opposed to letters also means that the equivalent of the use of mixed alphabets is not found often with digits.

Besides a straddling checkerboard, another thing that is sometimes done is to use a plaintext alphabet that is different from the cipher alphabet when making a cipher wheel, not in the sense that one or both of them are mixed, but in the sense that they use a different set of symbols. Thus, one can use a ciphertext disk or slide which uses all the letters of the alphabet from A through Z. But on the plaintext disk, X might be replaced by a space, and Q might be replaced by a shift function.

A separate table shows how this sort of thing could work:

 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P     R  S  T  U  V  W  X  Y  Z
-----------------------------------------------------------------------------
 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P     R  S  T  U  V  W sp  Y  Z

QA QB QC QD QE QF QG QH QI QJ QK QL QM QN QO QP QQ QR QS QT QU QV QW QX QY QZ
-----------------------------------------------------------------------------
 1  2  3  4  5  6  7  8  9  .  ,  :  ;  !  ?  /  Q  (  )  $ QU  '  "  X  -  0

Additional rules might be used to shorten messages: instead of Q simply shifting the next character, it might be that while QQ represents Q, QQ might be represented by QQQ instead of QQQQ. As well, 1234 might be represented by QABCDQ instead of QAQBQCQD.

Since the space is more common than the letter E, using X to stand for a space, if messages can be comprehensible without spaces, is not advisable. But there are several low-frequency letters that could be omitted from the main alphabet, to produce something like this:

 A  B  C  D  E  F  G  H  I        L  M  N  O  P     R  S  T  U  V  W     Y   
-----------------------------------------------------------------------------
 A  B  C  D  E  F  G  H  I        L  M  N  O  P     R  S  T  U  V  W     Y   

QA QB QC QD QE QF QG QH QI QJ QK QL QM QN QO QP QQ QR QS QT QU QV QW QX QY QZ
-----------------------------------------------------------------------------
 1  2  3  4  5  6  7  8  9  J  K  .  ,  !  ?  /  Q  (  )  0 QU  :  ;  X  -  Z

so that while Q remains the shift character, J, K, X and Z might be used for indicators; for example, in a simple polyalphabetic cipher where the relative position of the discs is only changed occasionally, they could be used for that purpose. They could also be used to make up codewords, as was done with the cipher disk of Alberti; of course, while having a codeword begin with one of those four letters helps it to be distinguishable, there is no reason that codewords whose length is known need to be composed entirely of letters not used for other purposes.

In addition to moving the disks only when an indicator is found, another alternative is to advance them one space at a time, for a progressive key, except when an indicator is found. Of course, such simple ciphers wouls not be used for serious purposes.

As noted in the previous section, Soviet spies using a straddling checkerboard for converting letters to digits have then often encrypted these digits using a polyalphabetic substitution of the Vigenere type.

However, the attempt was made to use a perfect form of Vigenere encryption. (As the declassified information concerning VENONA reveals, the attempt wasn't made quite well enough: supposed "one-time pads" were used at least twice on many separate occasions.)

If, ahead of sending a message, you and your correspondent have arranged to share a key consisting of:

genuinely random digits,
long enough to be used only once, that is, matching in length all the messages you will be sending

then there is no way to "break" your messages by cryptanalysis. (Of course, the length and timing of messages may still give information away.)

Why is there no way to break messages in such a case? Because, if the key is random, and if it is never used anywhere else but with just one message, so there is no other way to get information about that key, then that key could be anything.

So any possible message having the same length as the intercepted message remains possible, because there is no way to exclude the key that would give rise to any hypothetical message. If the key were not random, but generated by some rule, then perhaps the key that would transform some possible plaintexts into the ciphertext actually intercepted could not have been generated by that rule, thereby ruling out those plaintexts. If the key were used another time, then one possible text for the first message could be ruled out because the other message would not make sense. But without either of these opportunities, one speculative plaintext is only as good a guess as any other, and no better.

This survey of cryptographic algorithms has not gone very far, still being in its first chapter, and already we have found perfection. But not only is quite a bit of key required, but more importantly, when that key is used up, no more secure communications are possible using this method. And since the key is bulky, and impossible to memorize, this is only useful where the messages are easy to intercept but the key is safe, for example, in communications, but not in protecting files on your hard disk. Also, it can't complement public-key cryptography, since it doesn't provide a way by which encrypting a small session key by a slow method with some useful special properties (making it possible to start communicating securely without any advance secret exchange of a key) can protect a large message.

The one-time pad is perfect, and it is relatively easy to understand why it is impregnable. In general, it is not suitable for everyday use, since computers make it very easy to carry out conventional encryption with methods of enormous complexity. Such enormous complexity that there seems to be no rational reason to doubt their security. But just as there is a tendency in some quarters to advocate the use of methods that aren't quite complex enough (for example, DES with its 56-bit key, or other block ciphers that are only a step or two better), there is a tendency in other quarters to say that the "experts" should not be trusted, and only the one-time system, in one of its various forms, is any good.

Incidentally, for the polyalphabetic encryption of a stream of message digits by a stream of key digits, it is not strictly necessary to use only addition without carries. Just as either addition modulo 10 or addition modulo 26 are equally valid, depending on whether the message is made up of digits or letters, addition modulo 100,000 is also equally valid. That is, five message digits could be added to five key digits to produce five cipher digits using normal addition with the propagation of carries within the five-digit group.

The one carry, though, that must be discarded is the one out of the five digits, if it is not propagated to the preceding five-digit group. Sending the sum of 53478 and 88412 as 141890 instead of 41890 is a serious mistake, as it proves that both the plain and key groups at that position must be larger than 41889, as neither of those five-digit groups can be larger than 99999.

One method of polyalphabetic encipherment that suffered from this error is any cipher of the Nihilist type.

A five-by-five square like that illustrated below was used to convert both a message and a key (repeatedly used, as in Porta's cipher commonly known as the Vigenère) into a stream of digits:

1 P Z R M T
2 Y A V F I
3 O H K W N
4 G D X B U 
5 S J C E L
  1 2 3 4 5

However, when the stream of key digits was added to the stream of cipher digits, instead of using a modified modulo-5 tableau such as:

  | 1 2 3 4 5
-------------
1 | 2 3 4 5 1
2 | 3 4 5 1 2
3 | 4 5 1 2 3
4 | 5 1 2 3 4
5 | 1 2 3 4 5

the digits were simply added by normal decimal addition without carries. Hence, while each digit which was added belonged to a family of only five digits, the sum as transmitted could have any of nine values, (a sum of 1 could not occur) and only one of those values, the digit 6, did not place some constraint on both the key digit and the message digit, as can be seen from the table of that operation:

  | 1 2 3 4 5
-------------
1 | 2 3 4 5 6
2 | 3 4 5 6 7
3 | 4 5 6 7 8
4 | 5 6 7 8 9
5 | 6 7 8 9 0

The actual Nihilist cipher, as described in Gaines, was even worse than this: the letters in the checkerboard square were in their normal order, and the addition was performed with carries, to the extent that messages were sent in two-digit groups, and three digit sums (such as 45 + 55, or 52 + 51) were sent in three digit form, preserving the carry out of the group. However, the Russian language, before the Russian Revolution, had a thirty-six letter alphabet, and the use of a full 6 by 6 square would have improved things slightly.

Since the most natural way to apply a stream of key letters to a stream of plaintext letters is the Vigenère, and the most natural way to apply a stream of key digits to a stream of plaintext digits is modulo-10 addition, when applying a stream of key bits to a stream of plaintext bits, the very simple operation of modulo-2 addition, or exclusive-OR (XOR) is often used, with the very short table:

  | 0 1
-------
0 | 0 1
1 | 1 0

The following two tables:

  | 0 1 2 3 4 5 6 7     | 0 1 2 3 4 5 6 7 
-------------------   -------------------
0 | 0 1 2 3 4 5 6 7   0 | 0 1 2 3 4 5 6 7
1 | 1 2 3 4 5 6 7 0   1 | 1 0 3 2 5 4 7 6
2 | 2 3 4 5 6 7 0 1   2 | 2 3 0 1 6 7 4 5
3 | 3 4 5 6 7 0 1 2   3 | 3 2 1 0 7 6 5 4
4 | 4 5 6 7 0 1 2 3   4 | 4 5 6 7 0 1 2 3
5 | 5 6 7 0 1 2 3 4   5 | 5 4 7 6 1 0 3 2
6 | 6 7 0 1 2 3 4 5   6 | 6 7 4 5 2 3 0 1
7 | 7 0 1 2 3 4 5 6   7 | 7 6 5 4 3 2 1 0

the first for modulo-8 addition, and the second for an independent exclusive-OR of each of the three bits of an octal digit, shows how these two operations differ in their algebraic structure; thus, some modern computer ciphers alternate between XOR and addition modulo 256 or modulo 65,536 within a complicated sequence of operations to make the cryptanalyst's path a more tangled one.

The following two tables:

  | 0 1 2 3 4 5 6 7     | 0 1 2 3 4 5 6 7 
-------------------   -------------------
0 | 0 1 2 3 4 5 6 7   0 | 0 1 2 3 4 5 6 7
1 | 1 2 3 0 5 6 7 4   1 | 1 2 0 4 5 6 7 3
2 | 2 3 0 1 6 7 4 5   2 | 2 0 1 5 6 7 3 4
3 | 3 0 1 2 7 4 5 6   3 | 3 4 5 6 7 0 1 2
4 | 4 5 6 7 0 1 2 3   4 | 4 5 6 7 1 3 2 0
5 | 5 6 7 4 1 2 3 0   5 | 5 6 7 2 3 4 0 1
6 | 6 7 4 5 2 3 0 1   6 | 6 7 3 0 2 1 4 5
7 | 7 4 5 6 3 0 1 2   7 | 7 3 4 1 0 2 5 6

illustrate two ways of obtaining tables that are different algebraically from either of the previous two. The first is a table of the operation of adding the last two bits of a three-bit octal digit with modulo-4 addition, and XORing the first bit independently. The second results from deliberately constructing a table which must be algebraically distinct from anything like this, by beginning with a subtable for modulo-3 addition, which could not occur in a table built from exclusive-OR and addition modulo a power of two, and then using displacements of the other five digits in the shadow of that subtable. Finally, the remaining five by five square is filled; at first, some diagonals are filled systematically, and then the rest is filled through trial and error, with some backtracking.

Since it is only required that the operation table for polyalphabetic combining of a keystream with text be a Latin square for it to function in an ideal fashion, it is not necessary that the table be patterned after that for a group; but groups certainly are a common source of tables.

Thus, the addition table shown corresponds to C(8), the cyclic group of order 8, and the XOR table to C(2)*C(2)*C(2).

For the 26-letter alphabet, the Vigenere table corresponds to C(26); C(2)*C(13) and D(13), the dihedral group of order 13, are available. The D(13) table would be the table of the alphabets produced by a two-sided cipher disk where the inner circle and outer ring both have half the alphabet written with equal spacing on each side, so it would look like this:

 A | A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 B | B C D E F G H I J K L M A Z N O P Q R S T U V W X Y
 C | C D E F G H I J K L M A B Y Z N O P Q R S T U V W X
...
 M | M A B C D E F G H I J K L O P Q R S T U V W X Y Z N
 N | N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
 O | O P Q R S T U V W X Y Z N M A B C D E F G H I J K L
 P | P Q R S T U V W X Y Z N O L M A B C D E F G H I J K

For order 27, one could use C(27), or C(3)*C(3)*C(3), as well as several other possibilities, including this rather weird one:

   | & A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 ---------------------------------------------------------
 & | & A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 A | A Z L X M U K I C T E O W Q P F V D S Y G J N B H R &
 B | B F Y W A X D J T Q Z S U C L E K V H P R N I M O & G
 C | C X K N Y M I L V W D T O R & G A S E U B Q Z F P J H
 D | D J N E W L P V M H O X C I S A Q F K B Z G U & R T Y
 E | E R X S T I H C U & W Z A K D V J B L G N F Y P Q O M
 F | F G U O C N S Q W R X E M V K D I A P & J Z L Y T H B
 G | G B M T V Z E R O J N D Y A I S L C & H Q X K U W P F
 H | H C F V S D G B P L Q I K Y Z T & R J E W M A O N U X
 I | I L Q G N & U S F E P M R Z T Y W X A C D H O J B V K
 J | J Y C R I G X H E Z L A & F Q P U W B T V O S N M K D
 K | K I J F X P Y D B S H U Q N O M T Z V A E & W R G C L
 L | L K R B Z H M E G D & Y J X W U O N C V S P T Q F A I
 M | M E P U B W V N Q C F H X T Y Z D K O L & I J A S G R
 N | N P T & J R W O Z F Y B G E C L X U M Q K S H I A D V
 O | O T E K H V R M L Y C J D P F Q G & N Z U A B S I X W
 P | P V G A E S B F & K R L I U X W H J Q D O Y C T Z M N
 Q | Q U V J L B N P D X K C H G R & M O F W A T E Z Y I S
 R | R M A Q K F Z & S N I V P B J H Y T G O C W D X U L E
 S | S Q Z D O K & A Y P T N V L E C R G I F X B M H J W U
 T | T W D I P A J Y K U V Q S & G R B H Z X M C F E L N O
 U | U S H Y G T C X J A B & Z O M N E L W I P K R V D F Q
 V | V N I Z U Y L K A O S W T J H B C E D M F R X G & Q P
 W | W O S L & C Q U I M A R E H B J F P X N Y V G D K Z T
 X | X H O P R Q T W N B M G F S A I Z Y U J L D & K V E C
 Y | Y D & M F O A Z R V G P N W U X S I T K H L Q C E B J
 Z | Z & W H Q J O T X G U F B D V K N M Y R I E P L C S A

I may as well reveal that it is based on combining the group B(2,3) with Morse code, as it is unlikely that anyone would have puzzled that out for themselves! (E, enciphered in the T alphabet, is A, and E is (dot), T is (dash), and A is (dot, dash) in Morse, and this is true of some of the other entries in this table. Others are obtained by removing three dots or three dashes in a row, but there is quite a bit more than that going on, such as seven letters having different codes assigned to them than those in Morse.)

Next
Chapter Start
Table of Contents
Main page
Home page