Figure 3.7: Negative pattern search. The highlighted letters show a coincidence (i.e., a match between the plaintext and the ciphertext in that position). All that remains are the five possibilities marked ‘*’.
(probable word) appears in the plaintext. We came across such probable words in Chapter 2. They play an even greater role in the modern data processing world: just think of headers in word processor files.
Back to the probable word. We write it on a piece of paper and move it along underneath the ciphertext, as shown in Figure 3.7. No character transforms onto itself in the Caesar cipher either. So, when two equal characters happen to be superimposed in any place, this position of the paper strip is out of the question.
Based on the theory, only 46 % of all cases will have no such match, but language is not random. In our case, there are only five possible positions of the paper strip.
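The 46 % figure can be checked with a few lines of Python. This is only a sketch under the assumption that ‘the theory’ means uniformly random, independent letters over a 26-letter alphabet; the function name is made up for this illustration.

```python
def no_coincidence_probability(word_length, alphabet_size=26):
    """Chance that none of `word_length` random letter pairs coincide.

    Assumes independent, uniformly distributed letters (an idealization;
    real language is not random).
    """
    return ((alphabet_size - 1) / alphabet_size) ** word_length

# The probable word of Figure 3.7 has 20 letters.
p = no_coincidence_probability(len("WAHRSCHEINLICHESWORT"))
print(f"{p:.1%}")
```

Under this assumption a little less than half of all strip positions would survive; in our example, far fewer do.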
Next, we pick a letter that occurs at least twice in the probable word; let’s select ‘R’. Notice that the same character always has to correspond to ‘R’ in the ciphertext since the text is Caesar-ciphered. The first possibility is no good
since a ‘W’ is above the left ‘R’, and a ‘Z’ above the right ‘R’. This can’t be the right position. Similarly, we find the letters ‘K’ and ‘H’, ‘D’ and ‘L’ as well as ‘Z’ and ‘V’ above the ‘R’ in the second, third, and fourth positions. They are out of the question, too. The last position remains, where ‘R’ is always converted into ‘U’. We try to determine a shift (here 3) and, running a deciphering attempt, obtain the following:
DIESERTEXTENTHAELTEINWAHRSCHEINLICHESWORT
All right, so this text was ‘originally’ Caesar-encrypted (i.e., using shift 3). Of course, there are other methods we could have used to cryptanalyze this example. For instance, the letter ‘H’ occurs in the ciphertext with a frequency of 19 %. Assuming that ‘H’ corresponds to the most frequent letter, ‘E’, we would also have found the solution. This is actually the usual route for ciphertext attacks. However, using the negative pattern search on a probable word meant that we didn’t have to count up anything. It led to success almost effortlessly.
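The whole attack fits into a short Python sketch. The ciphertext itself is not reprinted here, so the sketch rebuilds it by re-encrypting the solution with shift 3; the function names are invented for this illustration, and offsets are counted from 0.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar(text, shift):
    """Shift every letter of an uppercase A-Z text by `shift` places."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % 26] for ch in text)

def negative_pattern_search(ciphertext, word):
    """Offsets at which no letter of `word` equals the ciphertext letter above it."""
    last = len(ciphertext) - len(word)
    return [pos for pos in range(last + 1)
            if all(c != p for c, p in zip(ciphertext[pos:], word))]

def images_of(ciphertext, word, pos, letter):
    """Ciphertext letters standing above each occurrence of `letter` in `word`."""
    return [ciphertext[pos + i] for i, p in enumerate(word) if p == letter]

plaintext = "DIESERTEXTENTHAELTEINWAHRSCHEINLICHESWORT"
word = "WAHRSCHEINLICHESWORT"
ciphertext = caesar(plaintext, 3)  # assumption: rebuilt from the solution in the text

# Step 1: discard every strip position that shows a coincidence.
candidates = negative_pattern_search(ciphertext, word)

# Step 2: in a Caesar cipher both 'R's must have the same letter above them.
for pos in candidates:
    print(pos, images_of(ciphertext, word, pos, "R"))
survivors = [pos for pos in candidates
             if len(set(images_of(ciphertext, word, pos, "R"))) == 1]

# Step 3: read the shift off the surviving position and decipher.
pos = survivors[0]
shift = (ALPHABET.index(ciphertext[pos]) - ALPHABET.index(word[0])) % 26
print(shift, caesar(ciphertext, -shift))
```

The loop prints the same letter pairs as discussed above (‘W’/‘Z’, ‘K’/‘H’, ‘D’/‘L’, ‘Z’/‘V’, and finally ‘U’/‘U’), and nothing had to be counted.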
Approach for Ciphering Cylinders
The approach for ciphering cylinders is similar: the negative pattern search supplies us with a few possible positions of the plaintext to start with. What makes this approach more difficult, however, is that the probable word could be torn, and a careful code writer may have selected a different cylinder line for each period. But we’re not interested in this right now.
We start a plaintext attack by exploiting each possible position of the probable word as follows:
• For every letter, we know the character it will be transformed to in the ciphertext. This heavily limits the selection of the disks. Even homophony (i.e., the fact that we don’t know from what row the ciphertext was read) cannot change that; it merely increases our effort 26-fold.
• For each assumed disk choice, we look up another period of the ciphertext to see whether or not this choice will produce a fragment of some meaningful plaintext in a row. Only a few possibilities will generally remain.
• We add more and more sections (periods) of the ciphertext as we continue revealing the correct choice for as many disks as there are letters in the probable word.
• In every period, we decrypt the part of the ciphertext determined by the disks we already know. We will hit scraps of words like ORDE, ANNABI, MITTANC, or XPOS, and completing them shouldn’t pose a major problem. Having revealed yet another piece of plaintext, we can start all over again, luckily from a better starting position.
• Step by step and piece by piece, we will end up knowing all disks.
The interesting part of this approach is that even homophony (ambiguity in the cipher) does not represent an insurmountable obstacle. Of course, my description refers to the way we’d have worked before the computer era. Humans are still better than computers when it comes to forming sentences from scraps of words. But when you use a computer you’d proceed differently anyway.
3.4.2 The Viaris Method
The method developed by Viaris represents a refinement of the cryptanalysis discussed above. Again, we use a probable word, except that this time we improve the negative pattern search.
To this end, we fix the row from which the ciphertext was read (the so-called generatrix) on the cylinder for the moment. Under this assumption, we work out, for every disk, which ciphertext letter each letter of the probable word would turn into. We build ourselves a table (matrix) for this purpose. Each row in the table corresponds to a disk, and each column to a letter of the probable word (see Figure 3.8).
As before, we move the probable word along underneath the ciphertext. A position can be correct only if each character of the ciphertext above the word appears at least once in the corresponding matrix column. Positions with coincidences (character matches as above) drop out automatically, and generally other positions do too. If we find no possible position, we have to try all over again using a different generatrix. Notice that the number of possibilities to be analyzed is slightly smaller with this method. Givierge refined the method once more by doing without probable words and using only digram and trigram frequencies. You will find more details and references in [BauerMM, 14.3].
The Attacking Method by Viaris
Let the probable word be CHIFFRE, and the ciphertext VIWSHQTLUFTWDTZ.
The ciphering cylinder should consist of ten disks with the following settings (to be read from top to bottom; connecting the first row with the last row to form a ring):
 1  2  3  4  5  6  7  8  9 10   (disk number)
 N  X  F  V  M  S  X  U  T  P
 B  J  C  X  X  A  I  B  V  M
 A  E  L  I  T  G  L  J  G  G
 C  Q  G  G  Y  J  F  W  F  Z
 R  N  B  D  K  V  C  R  A  X
 V  P  T  E  D  R  V  V  X  I
 U  T  R  C  B  D  W  M  E  S
 H  B  A  T  L  Q  D  L  L  V
 D  Z  D  R  E  O  J  F  U  L
 Q  H  V  F  F  L  P  O  K  J
 W  V  M  Q  R  H  S  P  J  W
 M  I  H  S  H  C  R  K  W  C
 K  G  S  N  Q  I  E  Y  Q  A
 Z  A  O  Z  A  U  G  Q  S  T
 P  S  Z  K  W  E  T  G  Z  U
 L  R  U  W  J  Z  O  A  O  N
 F  Y  I  H  U  Y  A  H  I  D
 T  K  X  U  I  M  Y  S  B  O
 S  F  W  Y  O  B  N  I  R  B
 O  O  J  M  G  X  M  C  H  Y
 J  C  Y  J  N  W  B  Z  C  R
 G  W  P  A  C  N  U  X  N  K
 Y  M  E  P  Z  K  H  T  M  F
 X  D  Q  O  V  F  Z  N  Y  E
 I  U  N  L  S  T  Q  E  D  Q
 E  L  K  B  P  P  K  D  P  H
We look at the first generatrix, i.e., the ciphertext is read from one row underneath the plaintext row. This turns ‘C’ into ‘R’, ‘H’ into ‘D’, ‘I’ into ‘E’, and so on, in the first disk. We write this result in the first row of a matrix. We fill the second row analogously for the second disk. We obtain the following 10×7 matrix:
 1   R D E T T V N
 2   W V G O O Y Q
 3   L S X C C A Q
 4   T U G Q Q F C
 5   Z Q O R R H F
 6   I C U T T D Z
 7   V Z L C C E G
 8   Z S C O O V D
 9   N C B A A H L
10   A P S E E K Q

The position

VIWSHQTLUFTWDTZ
CHIFFRE

of the probable word supplies no coincidence (no two superimposed characters are equal), which means that it is theoretically an option. We place the pertaining ciphertext fragment, VIWSHQT, on top of the 10×7 matrix above. Only the ‘V’ from the first position can be found in the column beneath it (disk 7); none of the other characters occurs in its column for any disk under the first generatrix. This settles that word position for this generatrix: it is ruled out.
The word CHIFFRE is now moved forward and, once all positions have been excluded, we look at the next generatrix, until we find a ciphertext fragment in which each ciphertext character occurs at least once in the matrix column beneath it. Another exclusion condition is that the ciphertext characters must occur in different matrix rows. (If there are multiple occurrences in one column, then we should be able to make the choice so that this condition is met.)
We can now mount a plaintext attack on all positions found.
Figure 3.8: (continued)
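The search shown in Figure 3.8 can be sketched in Python as follows. The disk alphabets are copied from the figure (each string is one disk, read from top to bottom); all function names are invented for this illustration, and the sketch covers only the two exclusion conditions described above, not the subsequent plaintext attack.

```python
DISKS = [  # disks 1 to 10 from Figure 3.8, top to bottom
    "NBACRVUHDQWMKZPLFTSOJGYXIE",
    "XJEQNPTBZHVIGASRYKFOCWMDUL",
    "FCLGBTRADVMHSOZUIXWJYPEQNK",
    "VXIGDECTRFQSNZKWHUYMJAPOLB",
    "MXTYKDBLEFRHQAWJUIOGNCZVSP",
    "SAGJVRDQOLHCIUEZYMBXWNKFTP",
    "XILFCVWDJPSREGTOAYNMBUHZQK",
    "UBJWRVMLFOPKYQGAHSICZXTNED",
    "TVGFAXELUKJWQSZOIBRHCNMYDP",
    "PMGZXISVLJWCATUNDOBYRKFEQH",
]

def generatrix_matrix(disks, word, g):
    """Row d: what disk d turns each letter of `word` into when the ciphertext
    is read g rows below the plaintext row (wrapping around the ring)."""
    return ["".join(disk[(disk.index(ch) + g) % len(disk)] for ch in word)
            for disk in disks]

def column_hits(matrix, fragment):
    """For each fragment letter, the disks (1-based) showing it in that column."""
    return [[d + 1 for d, row in enumerate(matrix) if row[j] == c]
            for j, c in enumerate(fragment)]

def distinct_disks_possible(hits, used=frozenset()):
    """True if one disk per column can be picked without using a disk twice."""
    if not hits:
        return True
    return any(distinct_disks_possible(hits[1:], used | {d})
               for d in hits[0] if d not in used)

def viaris_candidates(disks, word, ciphertext):
    """All (generatrix, offset) pairs that pass both exclusion conditions."""
    found = []
    for g in range(1, len(disks[0])):
        matrix = generatrix_matrix(disks, word, g)
        for pos in range(len(ciphertext) - len(word) + 1):
            hits = column_hits(matrix, ciphertext[pos:pos + len(word)])
            if distinct_disks_possible(hits):
                found.append((g, pos))
    return found

matrix = generatrix_matrix(DISKS, "CHIFFRE", 1)
hits = column_hits(matrix, "VIWSHQT")
print(matrix[0], hits)
print(viaris_candidates(DISKS, "CHIFFRE", "VIWSHQTLUFTWDTZ"))
```

For the first generatrix, `matrix` reproduces the 10×7 table above, and `hits` confirms that only the ‘V’ is found (on disk 7). The pairs returned by `viaris_candidates` are the starting points for the plaintext attack.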
The method fails when the permuted alphabets on the disks form a Latin square, i.e., when the disks have turning positions that cause each letter to occur at least once in every row.
It is very unlikely that people still use ciphering cylinders today, and nobody implements them in software. So why dedicate a full section to the Viaris method, which is especially tailored to these devices? For a couple of reasons. First, because of the comment on Latin squares in the paragraph above: when you run cryptanalysis yourself, you will begin to understand why this disk property is so important for cryptanalysis. This still doesn’t mean that we are able to design secure algorithms: we simply don’t know what methods have been or will be used by all the cryptanalysts in the world.
Second, there is one more risk we should be aware of: when designing an algorithm, the developer may be particularly cautious, never letting any character transform onto itself. In doing this, he actually compromises his own method. Bauer [BauerMM] refers to this approach as an illusory complication. Endeavoring to design things particularly well often leads to the exact opposite. At this point, you might not understand why German cryptologists hadn’t seen the risk caused by the Enigma’s reversing drum: it enabled negative pattern search.
3.4.3 This is Still Interesting Today!
The ciphering cylinder is history, and so is characterwise encryption. ‘So what do we discuss it for?’, you will probably ask. We encrypt bitwise nowadays! Well, negative pattern search is still a potential risk, even with algorithms working bitwise. We certainly won’t compare superimposed bits any longer. But we might be able to prove a statement like the following:
If byte 1 has even and byte 3 odd parity in the plaintext block, then there is a 76 % likelihood that bit 26 in the ciphertext block is equal to 1.
Of course, it would be best to have a 100 % probability, for we could then run a negative pattern search, like before. But every value that deviates from 50 % can be helpful.
These kinds of statements are dangerous for all algorithms that are vulnerable to plaintext attacks. Look at this not totally unrealistic example: assume a WordPerfect file was encrypted bitwise using a Vigenère method (more about this in Sections 3.5 and 3.6). We know for sure that it includes the string
Lexmark 4039 plus PS2
(21 characters), since our security department uses this printer. Moreover, we know the code writer is chronically lazy, i.e., he would never bring himself to use a password with a length of ten characters. We are looking for the position of the probable word; we have a hunch where in the ciphertext it might be found. If the password is four characters long, then ‘L’ and ‘a’ have got to be encrypted in the same way. This is written as follows in cryptology:
p1 ⊕ s = c1
p5 ⊕ s = c5
where p1 represents the plaintext character ‘L’; p5 represents ‘a’; c1 and c5 denote the pertaining ciphertext characters; s represents the key character for this position; and ⊕ denotes the bitwise XOR, i.e., the operation ^ in the C language. We XOR the left and right sides of both equations and obtain
p1 ⊕ p5 = c1 ⊕ c5
This is a good criterion for checking a position in the text, since we already know that p1 ⊕ p5 = ‘L’ ⊕ ‘a’. We will naturally run this test on the other character pairs, too. If no possible position at all results, we have to try it with a different period length.
Once we’ve eventually found the correct position of the word, we use it to reveal the correct key, since the Vigenère method is not resistant to plaintext attacks: the plaintext length is greater than the period length so that we can compute the key (plaintext XOR ciphertext) directly. Again, it is remarkable that this approach gets by entirely without statistical analyses.
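A minimal Python sketch of this test and of the key recovery might look as follows. The sample file contents and the password `Kr4!` are invented for the demonstration, and the function names are hypothetical; only the printer string comes from the example above.

```python
from itertools import cycle

def xor_vigenere(data: bytes, key: bytes) -> bytes:
    """XOR every byte with the repeating key (encryption and decryption alike)."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

def word_positions(ciphertext: bytes, word: bytes, period: int):
    """Offsets where word[j] ^ word[j+period] == c[j] ^ c[j+period] for all j."""
    hits = []
    for pos in range(len(ciphertext) - len(word) + 1):
        window = ciphertext[pos:pos + len(word)]
        if all(word[j] ^ word[j + period] == window[j] ^ window[j + period]
               for j in range(len(word) - period)):
            hits.append(pos)
    return hits

def recover_key(ciphertext: bytes, word: bytes, pos: int, period: int) -> bytes:
    """Key = plaintext XOR ciphertext, aligned to the start of the file.

    Requires len(word) >= period so that every key byte is covered.
    """
    key = bytearray(period)
    for j in range(period):
        key[(pos + j) % period] = word[j] ^ ciphertext[pos + j]
    return bytes(key)

word = b"Lexmark 4039 plus PS2"
secret_key = b"Kr4!"  # hypothetical four-character password
document = b"some header bytes " + word + b" and the rest of the file"  # invented
ciphertext = xor_vigenere(document, secret_key)

for period in range(1, 10):          # the lazy code writer stays below ten
    positions = word_positions(ciphertext, word, period)
    if positions:
        key = recover_key(ciphertext, word, positions[0], period)
        print(period, positions, key, xor_vigenere(ciphertext, key))
        break
```

A period shorter than the true one can in principle pass the test by accident, so the deciphered output should still be inspected before the key is accepted.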
3.5 WordPerfect Encryption as a Modern Example
The WordPerfect word processor lets you encrypt your files just like many other application programs. Though the method hadn’t been disclosed, it appeared on the Internet nevertheless. First I suspected somebody might have reverse-engineered parts of the program, but then I realized that this effort wasn’t necessary. Finding out this method is so unbelievably simple that I want to briefly demonstrate it here without turning you into a hacker (after all, I’m not one either). All you need is the right intuition. I once used WordPerfect Version 5.1 under UNIX (it’s equivalent to the same version for other operating systems).