Most major optimisation techniques have been applied to classical cipher cryptanalysis. Progress remained sluggish until 1993 when numerous papers appeared. Spillman et al. [120] showed how simple substitution ciphers could be attacked using genetic algorithms. Mechanisms for representation, mutation and crossover are discussed as might be expected, but perhaps the most interesting feature is the fitness function used:

\[ \mathrm{Fitness} = \left( 1 - \frac{1}{4} \sum_{i=1}^{26} \left\{ \left| SF[i] - DF[i] \right| + \sum_{j=1}^{26} \left| SF[i,j] - DF[i,j] \right| \right\} \right)^{8} \qquad (2.1) \]

The letters A, ..., Z are referenced by the indices 1, ..., 26. Here SF[i] is the standard frequency of character i in English. DF[i] is the measured frequency of character i in the decoded ciphertext. Similarly SF[i, j] is the standard frequency of the bigram character i followed by character j in English. DF[i, j] is the corresponding frequency of that bigram in the decoded ciphertext. The really unusual element is the exponent of 8, included ‘to amplify small differences.’ Experimentation with such exponentiation parameters is almost universally ignored in much subsequent work. Spillman emphasises the importance of experimentation with genetic algorithm parameters. He also suggests the possibility of more sophisticated cost functions involving trigrams, i.e. three-letter strings such as “THE”.
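A fitness of this kind can be sketched in a few lines. The sketch below is illustrative only: it assumes the form (1 - (sum of absolute unigram and bigram frequency differences) / 4)^8, and the function names and the way the "standard" tables are supplied by the caller are assumptions, not taken from Spillman et al. Dividing by 4 keeps the bracketed value in [0, 1], because each of the two sums of absolute differences between frequency distributions is at most 2.

```python
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def _letters(text):
    """Keep only the letters A-Z, upper-cased."""
    return [ch for ch in text.upper() if ch in ALPHABET]


def unigram_freqs(text):
    """Relative frequency of every letter of the alphabet in `text`."""
    letters = _letters(text)
    counts = Counter(letters)
    total = len(letters) or 1
    return {a: counts[a] / total for a in ALPHABET}


def bigram_freqs(text):
    """Relative frequency of every ordered letter pair in `text`."""
    letters = _letters(text)
    pairs = Counter(zip(letters, letters[1:]))
    total = max(len(letters) - 1, 1)
    return {(a, b): pairs[(a, b)] / total for a in ALPHABET for b in ALPHABET}


def frequency_fitness(decoded, sf1, sf2, exponent=8):
    """Fitness in [0, 1]; 1.0 means the decoded text matches the standard tables.

    sf1 and sf2 are the caller-supplied standard unigram and bigram tables
    (hypothetical inputs; any reference corpus could be used to build them).
    """
    df1 = unigram_freqs(decoded)
    df2 = bigram_freqs(decoded)
    diff = sum(abs(sf1[a] - df1[a]) for a in ALPHABET)
    diff += sum(abs(sf2[p] - df2[p]) for p in sf2)
    return (1 - diff / 4) ** exponent
```

In use, the standard tables would be built once from a large body of English text, and `frequency_fitness` would be evaluated on each candidate decryption produced by the search.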

Independently, Matthews [74] was also investigating the use of genetic algorithms for transposition ciphers. He describes the operation of GENALYST, a flexible scheduling type GA. The particular transposition cipher examined is a familiar one. A key is some permutation of the integers from 1 up to the key length. Plaintext is written in rows of that length under the key and enciphered by reading it off in columns in the order dictated by the integers making up the key. Under one such key, the phrase “NOWISTHETIMEFORALLGOODMEN” is written as shown in Figure 2.6 and hence enciphered as “WTROIIADOEOOTEWLESMLMNHGFGN”. This was one of the earliest papers and exhibits considerable originality and sophistication. Firstly, the standard frequency based cost was replaced with a points scoring system. Six bigrams and four trigrams were considered. With each a number of points was associated reflecting the likelihood of its occurrence in successfully deciphered text. The (N-gram, points) pairs were (“TH”,+2), (“HE”,+1), (“IN”,+1), (“ER”,+1), (“AN”,+1), (“ED”,+1), (“THE”,+5), (“ING”,+5), (“AND”,+5) and (“EEE”,-5). If the text length is L then the fitness function is given by

(2.2)

where the terms are the percentage frequency of the i-th bi- or trigram tested for, its score, and the number of bigrams or trigrams checked for. This approximate means seems to work effectively (at least for the experiments reported). The system can be used to determine the key length. Essentially, attempts at wrong key lengths are limited in the fitnesses that can be achieved. Trying runs at various key lengths readily reveals the most effective length. The text notes that random testing is also quite effective in this respect. The real power of GAs comes when the actual permutation is sought. The work describes various enhancements that have been brought to bear, such as elite survival. Another notion advanced is human interaction to aid what he terms ‘perming’. Essentially, manual analysis of the schedules resulting from various runs reveals little groups of columns that regularly appear somewhere in the key. The actual solution is likely to be a permutation that maintains such groups. This is the first paper to espouse real hybridisation techniques. Examining the results of repeated runs is an excellent idea and will reappear in the work of Knudsen and Meier (see Section 2.7).
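The cipher and the points idea can be illustrated with a short sketch. Several things here are assumptions rather than Matthews' method: the convention that the column labelled 1 is read first, the handling of an incomplete last row, all function names, and the scoring, which simply sums points over raw occurrence counts rather than applying the length normalisation of equation (2.2). Only the (n-gram, points) table is taken from the text.

```python
# (n-gram, points) pairs as listed in the text.
POINTS = {"TH": 2, "HE": 1, "IN": 1, "ER": 1, "AN": 1, "ED": 1,
          "THE": 5, "ING": 5, "AND": 5, "EEE": -5}


def column_order(key):
    """Column indices in the order they are read (label 1 first; assumed convention)."""
    return sorted(range(len(key)), key=lambda c: key[c])


def encipher(plaintext, key):
    """Write the text in rows of len(key) and read the columns in key order."""
    n = len(key)
    return "".join(plaintext[c::n] for c in column_order(key))


def decipher(ciphertext, key):
    """Invert encipher, allowing for an incomplete final row."""
    n = len(key)
    full_rows, extra = divmod(len(ciphertext), n)
    columns, pos = {}, 0
    for c in column_order(key):
        length = full_rows + (1 if c < extra else 0)
        columns[c] = ciphertext[pos:pos + length]
        pos += length
    rows = full_rows + (1 if extra else 0)
    return "".join(columns[c][r]
                   for r in range(rows)
                   for c in range(n)
                   if r < len(columns[c]))


def points_score(text):
    """Sum of points over all (overlapping) occurrences of the listed n-grams."""
    total = 0
    for gram, pts in POINTS.items():
        hits = sum(text.startswith(gram, i)
                   for i in range(len(text) - len(gram) + 1))
        total += pts * hits
    return total
```

A search over keys would call `decipher` with each candidate permutation and rank the candidates by `points_score`; repeating this at several key lengths mirrors the key-length test described above.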

Giddy and Safavi-Naini [60] use simulated annealing to attack simple transposition ciphers where sections of letters are each shuffled according to a key permutation. The work places the problem very clearly within the theoretical domain of applicability of simulated annealing (showing the search space to be connected, arguing that the cost surface is reasonably smooth, giving general theoretical advice on cooling schedule, appealing to theory to justify the number of iterations within a temperature cycle etc). The authors demonstrate a new move function that is intended to increase the smoothness of transitions. The cost function used, based as usual on expected and actual (i.e. under decryption) plaintext bigram frequencies is

(2.3)

(The two bigram frequency terms would in Spillman’s notation be SF[i, j] and DF[i, j].) The authors note that parameter values of this kind are often referred to informally as ‘fiddle factors’ and highlight the need for general experimentation when applying metaheuristic search techniques.
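For readers unfamiliar with the technique, a generic simulated annealing loop over key permutations might look like the sketch below. It is not the authors' algorithm: the swap move, the geometric cooling schedule, and every parameter value (`t0`, `alpha`, `steps_per_temp`, `t_min`) are assumptions chosen for illustration, and the cost function is supplied by the caller. These parameters are exactly the kind of ‘fiddle factors’ that need experimentation.

```python
import math
import random


def anneal(initial, cost, t0=10.0, alpha=0.95, steps_per_temp=100,
           t_min=0.01, seed=0):
    """Minimise cost(permutation) by simulated annealing with swap moves.

    Returns the best permutation seen and its cost.
    """
    rng = random.Random(seed)
    current = list(initial)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            # Candidate move: swap two distinct positions of the key.
            i, j = rng.sample(range(len(current)), 2)
            current[i], current[j] = current[j], current[i]
            new_cost = cost(current)
            delta = new_cost - current_cost
            # Always accept improvements; accept worsening moves with
            # probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current_cost = new_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
            else:
                # Rejected: undo the swap.
                current[i], current[j] = current[j], current[i]
        t *= alpha  # geometric cooling (assumed schedule)
    return best, best_cost
```

In a cryptanalytic setting `cost` would decrypt the ciphertext under the candidate key and compare its bigram statistics with the expected ones.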

Jakobsen [59] attacks simple and polyalphabetic substitution ciphers (assuming that the number of alphabets used is obtained by standard means such as Kasiski or Index of Coincidence, see [97]). The work is essentially a form of hill-climbing. It shows marked efficiency gains on previous work, due in part to clever manipulation of the matrix of bigrams obtained under decryption. In the conclusion he states ‘this approach is not immediately useful for the more modern type of encryption algorithms (IDEA, DES etc.)’, echoing a widely held view.
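One way to see why manipulating the bigram matrix saves work: if two plaintext letters are exchanged in the key, the bigram counts of the new decryption are the old counts with the corresponding two rows and two columns exchanged, so the ciphertext need not be decrypted again. The sketch below illustrates that observation only; the function names and the integer encoding of letters (0 to 25) are assumptions, and it is not claimed to reproduce Jakobsen's implementation.

```python
def bigram_matrix(symbols, size=26):
    """Count matrix m[x][y] of adjacent symbol pairs (x followed by y)."""
    m = [[0] * size for _ in range(size)]
    for x, y in zip(symbols, symbols[1:]):
        m[x][y] += 1
    return m


def swap_symbols(m, a, b):
    """Update m in place as if symbols a and b were exchanged in the text."""
    m[a], m[b] = m[b], m[a]              # exchange the two rows
    for row in m:
        row[a], row[b] = row[b], row[a]  # exchange the two columns


def swap_in_key(key, a, b):
    """Return a copy of key with plaintext letters a and b exchanged."""
    return [b if p == a else a if p == b else p for p in key]


def decrypt(cipher, key):
    """key[c] is the plaintext symbol assigned to ciphertext symbol c."""
    return [key[c] for c in cipher]
```

A hill-climber can therefore evaluate a candidate swap by updating a 26 by 26 matrix, which costs the same regardless of the ciphertext length.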

Clark and Dawson have carried out the most extensive research on classical cipher cryptanalysis. The work is reported in various places and covers applications of genetic algorithms, simulated annealing and the more recently developed tabu search technique. The work has attacked substitution, transposition and polyalphabetic substitution ciphers (the latter using a parallel genetic algorithm). The journal paper [18] provides a comparison of simulated annealing, genetic algorithms and tabu search attacks on simple substitution ciphers. There appears to be little to choose between them according to correctness of final keys produced (TS comes out marginally on top) but there are significant differences with respect to efficiency. TS again comes out on top (roughly twice as fast as SA) with GAs markedly worst (roughly twice as slow as SA). The work of Jakobsen [59] above indicates strongly that local search is effective for the simple substitution cipher (of English) and so it is not so surprising that all techniques work well (with GAs essentially hill-climbing at the end via mutation). Where hill-climbing has merit, SA is an inefficient way of achieving gradient ascent, whereas TS takes the form of steepest gradient ascent. The ability to calculate delta-costs for local search and not for GAs may also explain the relative inefficiency of GAs for this problem. The GA uses a mating operation that is far more intuitive than that used by Spillman. The authors experiment (via enumeration) with cost function weighting for unigram, bigram and trigram costs. Bigrams are chosen for much of the comparative study. Would higher level parametric optimisation be of any use?

As Bagnall et al. note [1], the ciphers attacked are generally simple ones. The application of heuristic search gives no surprises and to some extent the body of research is much of a muchness. There seems to be a stark lack of ambition. It is pleasing to see something a little different and harder attacked. The Enigma variants attacked by Bagnall et al. are arguably the most sophisticated classical ciphers attacked using metaheuristic search (both odometer and other rotor rotation strategies are considered). Their technique involves solving for the last rotor of a three-rotor machine (and then solving for the remaining two rotors using a known technique). The first two rotors give rise to a cipher with period $N^2$ (where $N$ is the cardinality of the alphabet). If the ciphertext is mapped through the correct third rotor, and the resulting intermediate ciphertext is split into $N^2$ ciphertext strings, these strings will exhibit the statistical characteristics of monoalphabetically enciphered text. The degree to which this actually holds can be taken as a measure of the correctness of the current solution for the final rotor (and so used to guide the search appropriately).
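The idea of splitting a text by period and asking how "monoalphabetic" each piece looks can be sketched as follows. The source does not say which statistic Bagnall et al. use; the index of coincidence is assumed here purely as a familiar stand-in, and the function names are hypothetical.

```python
from collections import Counter


def split_by_period(text, period):
    """One string per position within the period."""
    return [text[i::period] for i in range(period)]


def index_of_coincidence(s):
    """Probability that two letters drawn without replacement from s agree."""
    n = len(s)
    if n < 2:
        return 0.0
    counts = Counter(s)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def mean_ic(text, period):
    """Average index of coincidence over the period-split strings."""
    parts = [p for p in split_by_period(text, period) if len(p) >= 2]
    if not parts:
        return 0.0
    return sum(index_of_coincidence(p) for p in parts) / len(parts)
```

Under this assumption, a candidate for the final rotor would be scored by mapping the ciphertext through it and computing `mean_ic` on the result, with higher values suggesting that the pieces behave more like monoalphabetically enciphered text.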

Parallelism is little exploited in heuristic cryptanalysis research. Clark and Dawson [17] use a parallelised GA to attack a polyalphabetic substitution cipher. This combines several individual substitution ciphers. Given a plaintext message, the letters at one set of periodically spaced positions might be encrypted using the first cipher, those at the next set using the second, and so on through the remaining ciphers. The individual ciphers are farmed out to various processes. Calculation of unigram costs can be carried out in isolation for each such cipher, but bigram and trigram statistics cannot. Initially, each local process calculates its estimate of its key based only on local unigram measures. Every so often each process communicates its best local key to neighbouring processes to enable such costs to be calculated (e.g. a process solving cipher 2 would receive the best keys so far for ciphers 1 and 3 and use these to calculate appropriate bigram costs). Eventually, the process converges. The results show that this is a highly effective approach. Lebedko and Topcy [87] comment that the use of parallel GAs is ‘not original’. It is hard to say whether this comment applies to polyalphabetic substitution ciphers (no other reference is given) or to parallel genetic algorithms in general (which seems true, but irrelevant). Either way, this paper seems a useful contribution.
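The dependency that forces the processes to communicate can be made concrete with a small sketch. It assumes a periodic polyalphabetic cipher in which the cipher used at position i is i modulo the period; the dictionary representation of keys and all function names are illustrative assumptions, not Clark and Dawson's code. Unigram statistics for one cipher need only that cipher's key, whereas every bigram straddles two neighbouring ciphers.

```python
import string
from collections import Counter

A = string.ascii_uppercase


def caesar_decoder(shift):
    """Toy decryption key: maps each ciphertext letter back by `shift`."""
    return {A[(i + shift) % 26]: A[i] for i in range(26)}


def local_unigrams(ciphertext, keys, stream):
    """Letter counts for one cipher; uses keys[stream] only."""
    period = len(keys)
    return Counter(keys[stream][ch] for ch in ciphertext[stream::period])


def cross_bigrams(ciphertext, keys, stream):
    """Decoded bigrams starting in `stream`; also needs the next cipher's key."""
    period = len(keys)
    nxt = (stream + 1) % period
    return [keys[stream][ciphertext[i]] + keys[nxt][ciphertext[i + 1]]
            for i in range(stream, len(ciphertext) - 1, period)]
```

With two toy Caesar keys, `cross_bigrams("BVUCDM", [caesar_decoder(1), caesar_decoder(2)], 0)` decodes the bigrams that begin at even positions, and it can only do so because the key of the second cipher is available.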

2.4.3 General Commentary

All the work described above has served a useful purpose. Classical cipher cryptanalysis provides a simple testbed for examining the capabilities of the techniques. In addition, the cryptological knowledge needed is small and so makes these problems attractive to researchers outside the cryptographic community (understanding letter frequency characteristics is probably easier than understanding differential cryptanalysis). As noted earlier [1], most work has concentrated on ciphers readily breakable by other means. Several authors have commented that the techniques are not readily applicable to modern cryptanalysis. It is disappointing that no-one appears to suggest any way forward in this respect.

On a technical level, the classical cryptanalysis work exhibits symptoms common to virtually all applications of the search techniques to cryptological problems. A major feature is that optimisation is a ‘one shot’ technique: the idea is to ‘solve’ the problem, to extract the whole key in one go (Matthews’ exploitation of multiple runs is highly unusual in the area). This is not the way modern cryptanalysts work. Cryptanalysts typically work by exploiting small biases and using perhaps many billions of pieces of data (e.g. plaintext-ciphertext pairs). They generally don’t run a program for a few minutes and expect the result to pop out!

With respect to classical ciphers this is entirely understandable; after all, the direct approach appears to work reasonably. But there seems to be a general agreement that the techniques will not work when applied to modern cryptanalysis problems. Although such cryptanalysis is not addressed here, the results of various pieces of work in this thesis would suggest that moving away from this one-shot view has considerable potential. This author also suggests that any future cryptanalysis attempts should address ‘hard’ problems.