Posted on: 5/1/2011. Public Domain.
Stephen Hurley
5329198
CS 178 Term Paper

The Current Involvement of Genetic Algorithms in Cryptography

Introduction

Cryptography has been in use since the time of the ancient Romans, who used simple monoalphabetic substitution. It has gradually developed into a much more sophisticated field with techniques such as knapsacks, modular arithmetic, and public key cryptography. Modern computers have also allowed the use of more complicated algorithms that are very difficult to decipher, allowing for highly secure transfer of data between parties. As encryption algorithms have become more powerful through advanced computing, however, techniques for breaking them have also become more effective. One such technique currently being studied is the genetic algorithm.

Genetic algorithms imitate natural selection to home in on an optimal solution for a given environment. People have been using genetic algorithms on a practical basis since the late 1970s with the study of cellular automata.[1] With the advent of more powerful computers, people were able to simulate more complex genetic processes and develop more specialized theories. Genetic algorithms today are used in a variety of fields including business analysis, robotics, search engines, and materials analysis.

Description of a Genetic Algorithm

A genetic algorithm begins with a set of data, called the population, that is usually at least pseudo-randomized. Each piece of data, called an individual, lies within the limits of the problem space. Each individual is passed to a function that estimates how close it is to the optimal answer. The function usually returns a number, called the fitness, where a lower or higher value (depending on the specific implementation) indicates closeness to the optimum. After each individual is given a fitness, the algorithm creates a new generation from the original population.
Individuals are randomly selected, weighted by fitness, to combine with others via mutations and crossovers and so produce individuals for the new population. The process then repeats, with the new population producing a third generation based on the fitness of each of its individuals. This process roughly mimics natural selection, producing a highly optimized population after a number of generations have passed.

Because they rely on this gradual, fitness-guided search, genetic algorithms work well only for certain problems. These fall into a category where solutions cannot easily be found in any deterministic way, such as by a formula or a specific algorithm. Problems that can easily be solved deterministically are usually solved faster by the known method than by a genetic algorithm. The solution space also cannot be too random: there must be some easy way to tell whether a given candidate is close to the optimal solution. Otherwise, no fitness function can be computed efficiently.

Analysis of the Genetic Algorithm in Simple Cases of Cryptanalysis

Monoalphabetic substitution ciphers are typically viewed as the simplest of all methods of encryption. Consequently, they are also the simplest to decrypt. Cribs or frequency analysis can quickly decipher a given text. In the encryption, each instance of a letter is replaced by another letter. For example, every 'a' in a given text might be replaced by an 'x'.

In one study in which a genetic algorithm was applied to monoalphabetic substitution,[2] each individual represents a substitution of all 26 letters. For example, (q,w,e,r,t,y,u,i,o,p,a,s,d,f,g,h,j,k,l,z,x,c,v,b,n,m) would be an individual and a key to decrypt some text. The fitness function is a count of adjacent letters compared to the average frequency of adjacent letters in a larger body of known text.
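To make the key representation and the adjacent-letter comparison concrete, here is a minimal Python sketch. It is not the cited study's implementation: the function names, the convention that position i of the key gives the plaintext letter for the i-th alphabet letter of the ciphertext, the use of summed absolute differences between relative frequencies, and the mapping of the result into the range (0, 1] are all assumptions made for illustration.

```python
import string

ALPHABET = string.ascii_lowercase

# The example individual from the text, used here as a decryption key.
QWERTY_KEY = tuple("qwertyuiopasdfghjklzxcvbnm")


def decrypt(ciphertext, key):
    """Apply a candidate key (assumed convention: ALPHABET[i] -> key[i])."""
    table = str.maketrans(ALPHABET, "".join(key))
    return ciphertext.lower().translate(table)


def bigram_frequencies(text):
    """Relative frequency of each pair of adjacent letters inside words."""
    counts = {}
    for word in text.lower().split():
        letters = [c for c in word if c in ALPHABET]
        for first, second in zip(letters, letters[1:]):
            pair = first + second
            counts[pair] = counts.get(pair, 0) + 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def fitness(ciphertext, key, reference):
    """Score a key: higher means the decrypted text's adjacent-letter
    frequencies are closer to the reference frequencies (assumed metric)."""
    candidate = bigram_frequencies(decrypt(ciphertext, key))
    pairs = set(candidate) | set(reference)
    distance = sum(
        abs(candidate.get(p, 0.0) - reference.get(p, 0.0)) for p in pairs
    )
    return 1.0 / (1.0 + distance)
```

In use, `reference` would be built once by calling `bigram_frequencies` on a large body of known text, and `fitness` would then be evaluated for every key in the population.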
The closer the adjacent-letter frequencies of the ciphertext decrypted with the key are to those of the larger body of known text, the higher the fitness rating. The mutation is a swap of two random letters in the individual. Crossover is more difficult, however, because no letter can appear more than once in any given individual. The process makes a copy of one of the parents and then takes a subsection of the other parent. The child then has its letters swapped around in a minimal fashion until the subsection of the second parent appears in the child in the same location.

This method, and similar variants discussed in the paper,[2] do not work very well. Even when they do work, the standard methods of decryption are still much faster and more efficient. The reason is that the problem becomes a highly mathematical one when the frequency of adjacent letters is used. It has also already been solved in a variety of ways, so a genetic algorithm cannot compete in efficiency with the other solutions.

Analysis of the Genetic Algorithm in Complex Cases of Cryptanalysis

The Merkle-Hellman knapsack cipher has also been studied from the point of view of genetic algorithms. The knapsack cipher uses a superincreasing sequence of numbers, b, that is reordered in a secret way and then modified using the equation $a_i = W b_i \bmod M$. W and M are also kept private, so that only the sequence a is public. To encrypt a given message, the message's binary form is divided into blocks the size of the public key sequence. The inner product of each binary block with the sequence a is then sent to the recipient as the ciphertext for that block. This is repeated for each block of the message.[3] Methods already exist that can efficiently decipher the encrypted text using only the public key.
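The key generation and encryption steps described above can be sketched in Python as follows. This is an illustrative sketch only: the specific values of b, W, M, and the secret ordering are made up for the example, the helper names are hypothetical, and zero-padding the final block is an assumption the text does not specify.

```python
def is_superincreasing(b):
    """True if every term is larger than the sum of all terms before it."""
    total = 0
    for term in b:
        if term <= total:
            return False
        total += term
    return True


def make_public_key(b, order, W, M):
    """Reorder b by the secret permutation `order`, then apply
    a_i = W * b_i mod M to each reordered term."""
    return [(W * b[j]) % M for j in order]


def to_bits(message):
    """Binary form of a bytes message, most significant bit first."""
    return [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]


def encrypt(message, a):
    """Split the message bits into blocks of len(a) and return the inner
    product of each block with the public sequence a."""
    n = len(a)
    bits = to_bits(message)
    bits += [0] * (-len(bits) % n)  # zero-pad the last block (assumption)
    return [
        sum(bit * term for bit, term in zip(bits[i:i + n], a))
        for i in range(0, len(bits), n)
    ]


# Illustrative private values (assumptions, not taken from the source).
b = [2, 3, 7, 14, 30, 57, 120, 251]   # superincreasing sequence
order = [3, 0, 7, 2, 5, 1, 6, 4]      # secret reordering
W, M = 41, 491                        # M exceeds sum(b); gcd(W, M) == 1

a = make_public_key(b, order, W, M)   # the only part that is published
ciphertext = encrypt(b"ab", a)        # one integer per 8-bit block
```

Each ciphertext integer is a knapsack sum over the public sequence, which is what the attacker must invert without knowing b, W, M, or the ordering.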
A genetic algorithm has been used in an attempt to create a more efficient method of cryptanalysis. Each individual is a binary sequence, with each element being 0 or 1 to represent whether or not the corresponding term should be included in the knapsack sum. The fitness function is very complicated[3] and essentially measures how close the sum of the terms selected by the individual's sequence is to the actual knapsack sum. The crossover process is just a swap of a block of elements between the sequences of two individuals. Mutation is even simpler: a random bit in the individual's sequence is flipped.

This method was successful at solving simpler knapsack problems. Unfortunately, although there may have been some slight gain from searching a smaller key space, the overhead of the genetic algorithm canceled out any possible efficiencies. Thus, the traditional methods of solving the Merkle-Hellman knapsack problem are still more efficient than the use of genetic algorithms.

Conclusion

A number of other cryptographic systems have been studied by others.[3] A common trend among nearly all of them is that either the genetic algorithm did not solve the problem or, when it did, it was not nearly as efficient as preexisting methods of cryptanalysis. This makes a strong argument that cryptographic systems with known solutions are, in nearly all cases, not well solved using genetic algorithms. Further study on this topic should include systems that do not have known, or efficient, solutions. Genetic algorithms generally do very well with these types of problems and may greatly further the field of cryptanalysis.

Bibliography

1. Biesbrock, R. History of GA's. http://www.estec.esa.nl/outreach/gatutor/history_of_ga.htm.
2. Gester, J. Solving Substitution Ciphers with Genetic Algorithm. http://www.cs.rochester.edu/u/brown/Crypto/studprojs/SubstGen.pdf.
3. Delman, B. Genetic Algorithms in Cryptography. https://ritdml.rit.edu/dspace/bitstream/1850/263/1/thesis.pdf.