
Homophones and Nomenclators

One of the earliest ways of making ciphers stronger than simple substitution was to build cipher tables that had more than one substitute for each letter, together with additional substitutes for names that would be commonly used. Because of the significance given to proper names, these systems were called nomenclators.

Some of the early nomenclators were fairly unsophisticated; the substitutes for the letter B might be the letter M or the digit 4, written in several distinctive styles - and then the substitutes for C might be the letter N or the digit 5, again written in distinctive styles. Thus, a cryptanalyst willing to try a simple guess would only need to solve a Caesar cipher - a simple substitution where the alphabet is merely displaced instead of being thoroughly scrambled - rather than face the full problem of finding the substitute for each symbol individually.

One ingenious modern method of producing a homophonic cipher, called the Grandpré cipher, involves choosing ten ten-letter words, which can be ordered so that their first letters form an eleventh ten-letter word, and which collectively include all 26 letters of the alphabet.

For example (this particular square lacks H, P, and W, so a working key would need words that supply them):

  0 1 2 3 4 5 6 7 8 9
0 S T R A T I F I E D
1 U N D E R S T O O D
2 B A R K E N T I N E
3 M A J O R I T I E S
4 A S T R O L O G E R
5 R E E X A M I N E D
6 I N V E S T M E N T
7 N E G A T I V E L Y
8 E F F E R V E S C E
9 S Q U E E Z I N G S

Its advantage over a more routine type of homophonic table, such as:

         0,3,8  4,7  9 1 5 2 6
1,2,7,8  E      T    A O I N S
0,4,5    H      R    D L U B C
3,9      M      F    G J K P Q
6        V      W    X Y Z

is that the multiple substitutes for each letter are not closely related.
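As a rough illustration, the Grandpré square above can be turned into a table of homophones in a few lines of Python. This is only a sketch: the function names are invented here, and writing the row digit before the column digit is an assumption, since the text does not fix the order. Because the example words contain no H, P, or W, the sketch raises a KeyError if asked to encipher one of those letters.

```python
import random
from collections import defaultdict

# The ten words of the example square; rows and columns are labelled 0-9.
WORDS = [
    "STRATIFIED", "UNDERSTOOD", "BARKENTINE", "MAJORITIES", "ASTROLOGER",
    "REEXAMINED", "INVESTMENT", "NEGATIVELY", "EFFERVESCE", "SQUEEZINGS",
]


def build_homophones(words):
    """Map each letter to every digit pair (row, then column) where it occurs."""
    table = defaultdict(list)
    for row, word in enumerate(words):
        for col, letter in enumerate(word):
            table[letter].append(f"{row}{col}")
    return dict(table)


def encipher(plaintext, table, rng=random):
    """Replace each letter by one of its digit pairs, chosen at random."""
    # A letter absent from the square (H, P, W here) raises KeyError.
    return [rng.choice(table[ch]) for ch in plaintext.upper() if ch.isalpha()]


def decipher(groups, words):
    """Look each digit pair up in the square again."""
    return "".join(words[int(g[0])][int(g[1])] for g in groups)


HOMOPHONES = build_homophones(WORDS)
# The first letters of the ten words, read downwards.
KEYWORD = "".join(word[0] for word in WORDS)
# Letters of the alphabet that the example square does not supply.
MISSING = sorted(set("ABCDEFGHIJKLMNOPQRSTUVWXYZ") - set(HOMOPHONES))
```

With this table a rare letter such as Q has the single substitute 91, while a common letter such as E has many substitutes scattered over the square with no visible pattern, which is the property the text describes.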

The book The American Black Chamber, by Herbert Osborn Yardley, illustrated a cipher wheel used by the Mexican Army which could be set up to produce a homophonic cipher with a key that could be easily changed.

Changed from a wheel to a slide, it would look like this:

  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
 15 16 17 18 19 20 21 22 23 24 25 26 01 02 03 04 05 06 07 08 09 10 11 12 13 14
 43 44 45 46 47 48 49 50 51 52 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 53 54 55 56 57 58 59 60
 92 93 94 95 96 97 98 99 00             79 80 81 82 83 84 85 86 87 88 89 90 91

The device had four movable disks: the first contained the two-digit pairs from 01 to 26, the second the pairs from 27 to 52, the third the pairs from 53 to 78, and the fourth the pairs from 79 to 99, followed by 00 and four blank, unused spaces. The key consisted of the four two-digit pairs aligned under the letter A, and the possible substitutes for any letter were the four (or, where a blank fell under it, three) two-digit pairs aligned under it. Obviously, the system would have been more secure had the alphabet and the sequence of digit pairs been mixed.
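The slide can be modelled as four cyclic sequences of 26 positions, each rotated so that its key pair sits under A. The following Python sketch does this under that assumption; the function names and the use of None for a blank space are choices made here, not part of Yardley's description.

```python
import random
import string

ALPHABET = string.ascii_uppercase

# Each disk is a cyclic sequence of 26 positions; None marks a blank space.
DISKS = [
    [f"{n:02d}" for n in range(1, 27)],
    [f"{n:02d}" for n in range(27, 53)],
    [f"{n:02d}" for n in range(53, 79)],
    [f"{n:02d}" for n in range(79, 100)] + ["00"] + [None] * 4,
]


def substitutes(letter, key):
    """Return the pairs aligned under `letter` when the pairs in `key` sit under A."""
    shift = ALPHABET.index(letter)
    result = []
    for disk, under_a in zip(DISKS, key):
        pair = disk[(disk.index(under_a) + shift) % 26]
        if pair is not None:  # skip the blank spaces on the fourth disk
            result.append(pair)
    return result


def encipher(plaintext, key, rng=random):
    """Pick one of the available pairs at random for every letter."""
    return [rng.choice(substitutes(ch, key))
            for ch in plaintext.upper() if ch in ALPHABET]


def decipher(groups, key):
    """Invert the table for this key and look every pair up."""
    lookup = {pair: letter
              for letter in ALPHABET
              for pair in substitutes(letter, key)}
    return "".join(lookup[g] for g in groups)


# The setting shown in the slide above: 15, 43, 61 and 92 under A.
KEY = ("15", "43", "61", "92")
```

With this key, substitutes("J", KEY) gives only 24, 52, and 70, because one of the blank spaces on the fourth disk falls under J.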

The most important weakness of a homophonic system is that the person using it can become lazy, and use the same substitute for a letter over and over, or use the substitutes in rotation, rather than using them randomly.

Also, as many homophonic systems are devised by amateurs, they can have defects of one kind or another. Helen Fouché Gaines in Elementary Cryptanalysis notes that Givierge, author of the Cours de Cryptographie, described a homophonic system of the following kind:

    E J G F D
    O K M H S
    P   R   W
    ---------
IT |a b c d e
AL |f g h i j
BQ |k l m n o
CN |p q r s t

    u v x y z
    ---------
    V X Y Z U

This is a type of straddling checkerboard, and we will meet a more elegant form of it later in the section on fractionation. The word straddling refers to the fact that while most letters have a two-letter group as their substitute, consisting of the letters indicating their row and column (which may, incidentally, be taken in either order, as the alphabet has been split in half for this purpose), five less-frequent letters represent each other. Thus, the presence of occasional one-letter symbols is intended to complicate the problem for the cryptanalyst, making it difficult for him to find out where the letter pairs that make up most of the message begin and end.
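A small Python sketch may make the straddling easier to follow, in particular how a decipherer finds the group boundaries. It assumes the table exactly as printed above; the names are invented for the example, and plaintext W, which the table does not contain, is silently dropped along with spaces.

```python
import random

ROWS = ["IT", "AL", "BQ", "CN"]            # letters that indicate a row
COLS = ["EOP", "JK", "GMR", "FH", "DSW"]   # letters that indicate a column
GRID = ["abcde", "fghij", "klmno", "pqrst"]
SINGLES = dict(zip("uvxyz", "VXYZU"))      # one-letter substitutes

PAIRS = {GRID[r][c]: (ROWS[r], COLS[c]) for r in range(4) for c in range(5)}
ROW_OF = {ch: r for r, label in enumerate(ROWS) for ch in label}
COL_OF = {ch: c for c, label in enumerate(COLS) for ch in label}
SINGLE_PLAIN = {cipher: plain for plain, cipher in SINGLES.items()}


def encipher(plaintext, rng=random):
    out = []
    for ch in plaintext.lower():
        if ch in SINGLES:
            out.append(SINGLES[ch])
        elif ch in PAIRS:
            pair = [rng.choice(label) for label in PAIRS[ch]]
            rng.shuffle(pair)  # row and column letters may come in either order
            out.append("".join(pair))
        # anything else (spaces, punctuation, the letter w) is dropped
    return "".join(out)


def decipher(ciphertext):
    out, i = [], 0
    while i < len(ciphertext):
        ch = ciphertext[i]
        if ch in SINGLE_PLAIN:
            # V, X, Y, Z and U always stand alone.
            out.append(SINGLE_PLAIN[ch])
            i += 1
        else:
            # Any other letter opens a two-letter group.
            a, b = ciphertext[i], ciphertext[i + 1]
            if a in COL_OF:  # column letter came first, so swap
                a, b = b, a
            out.append(GRID[ROW_OF[a]][COL_OF[b]])
            i += 2
    return "".join(out)
```

The decipherer needs no separators: any of V, X, Y, Z, U is a complete group, and every other letter opens a pair. A cryptanalyst who does not know which letters are the single ones lacks that information.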

Although this cipher has many nice features, it does have a number of defects. Since the letters that have only one letter as their substitute are, essentially, in a separate table, why use only a 25-letter alphabet? Of course, in French, the letter W is so little used as almost not to be part of the alphabet. But there are other defects.

Although a group can begin with a letter from either half of the alphabet, the second letter always has to be from the other half.
Also, the second letter of a two-letter group can't be one of the five letters that represent themselves, although since the first letter already indicates that there is a two-letter group, that would not cause confusion.

Hence, this cipher omits a large number of two-letter substitutes which it could be making use of. An improved design could be the following:

    E J G F D            b|M
    O K U H S            f|T
    V   W                g|N
    ---------            p|R
IZ |r q l m x| DFOQTX    v|P
AL |e u w h n| CJLNPWY   y|Q
BY |c i k z a| AEGKUV
CX |o j t s d| BHIMRSZ
    ---------
    E A C F G
    I B H J L
    P D O K M
    Q T R N V
    S U Z X W
    Y

Here, six mid-frequency letters have single-letter substitutes, but these substitutes are drawn from other letters in the alphabet.

The rest of the alphabet is divided into two halves, but once a letter is chosen to indicate either a row or a column, the other co-ordinate of the plaintext letter is chosen from a set made from the entire alphabet. Hence, if a letter on the left begins a two-letter group, it is ended with a letter below; if a letter on the top begins a two-letter group, it is ended with a letter on the right.

Thus, the plaintext letter R can become ED, EO, OO, IQ, ZQ, or II, among other possibilities.
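The rule just described can be checked mechanically. The Python sketch below transcribes the improved table (the variable and function names are made up for the purpose) and enumerates the substitutes for each letter. Under this reading of the table, every two-letter group that does not begin with one of the six single-letter substitutes is put to use: 20 possible first letters times 26 second letters gives 520 groups, plus the 6 singles.

```python
import random
import string

LEFT = ["IZ", "AL", "BY", "CX"]                          # row letters
TOP = ["EOV", "JK", "GUW", "FH", "DS"]                   # column letters
RIGHT = ["DFOQTX", "CJLNPWY", "AEGKUV", "BHIMRSZ"]       # finish a top letter
BELOW = ["EIPQSY", "ABDTU", "CHORZ", "FJKNX", "GLMVW"]   # finish a left letter
GRID = ["rqlmx", "euwhn", "cikza", "ojtsd"]
SINGLES = {"b": "M", "f": "T", "g": "N", "p": "R", "v": "P", "y": "Q"}


def substitutes(letter):
    """List every cipher group that may stand for a plaintext letter."""
    if letter in SINGLES:
        return [SINGLES[letter]]
    for r, row in enumerate(GRID):
        c = row.find(letter)
        if c >= 0:
            # A left letter is ended by a letter from below the column;
            # a top letter is ended by a letter from the right of the row.
            return ([a + b for a in LEFT[r] for b in BELOW[c]] +
                    [a + b for a in TOP[c] for b in RIGHT[r]])
    raise ValueError(f"no substitute for {letter!r}")


# Inverse table: cipher group -> plaintext letter.
DECODE = {group: letter
          for letter in string.ascii_lowercase
          for group in substitutes(letter)}
SINGLE_CIPHER = set(SINGLES.values())


def encipher(plaintext, rng=random):
    return "".join(rng.choice(substitutes(ch))
                   for ch in plaintext.lower() if ch in string.ascii_lowercase)


def decipher(ciphertext):
    out, i = [], 0
    while i < len(ciphertext):
        # M, T, N, R, P and Q stand alone; any other letter opens a pair.
        width = 1 if ciphertext[i] in SINGLE_CIPHER else 2
        out.append(DECODE[ciphertext[i:i + width]])
        i += width
    return "".join(out)
```

Here substitutes("r") returns 30 groups, including the six listed above, and DECODE holds 526 distinct entries, so no group is ambiguous.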

As noted previously, the basic concepts of cryptography were slow to emerge. David Kahn's book The Codebreakers illustrates the earliest known example of a cipher with homophones, from the year 1401. It looked like this:

a b c d e f g h i k l m n o p q r s t u x y z
---------------------------------------------
Z y x D t s r q p o n l m k j h g f e d c b a
2       4                 8           F
3       H                 9           T
+       J                 L           ~

where the capital letters stand for various special symbols (Z indicates a reverse script lowercase z, F indicates an ff ligature, and J indicates the astrological symbol for Jupiter, for example).

To modern eyes, what is particularly striking about this cipher is that, even though the step of improving on simple substitution by using multiple equivalents was taken, the basic cipher alphabet itself is not thoroughly mixed, but instead varies only slightly from a simple reversed alphabet.

Incidentally, the British publisher Hodder and Stoughton has an extensive series of books on various subjects called "Teach Yourself Books". Particularly noteworthy in this series are the instructional books for foreign languages, which in the case of some languages are the only readily available introductory books in print in English. (These are the books that used to have yellow covers, but which changed to light blue covers some years back.) The book Codes and Ciphers by Frank Higenbottam in this series, while a general introduction to the subject, is distinguished by its uniquely extensive coverage of the topic of breaking messages enciphered using nomenclators.
