(IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 3, March 2011, pp. 154-159
http://sites.google.com/site/ijcsis/ ISSN 1947-5500

Implementation in Java of a Cryptosystem using a Dynamic Huffman Coding and Encryption Methods

Eugène C. Ezin
Institut de Mathématiques et de Sciences Physiques
Unité de Recherche en Informatique et Sciences Appliquées
Université d'Abomey-Calavi, République du Bénin
eugene.ezin@imsp-uac.org

Abstract: Data transmission through a secure channel is important in daily practice: the goal is to ensure that the receiver is the only one authorized and able to discover the message content. In this paper, based upon the model proposed in [1], we implement a cryptosystem for data transmission. The plaintext message to transmit is first compressed using the dynamic Huffman coding algorithm. The security level is reinforced by transmitting the resulting message through an encryption subsystem using the DES and/or RSA algorithms. The full description of the whole system is given and the results are analyzed.

Keywords: Data compression, Huffman coding technique, DES and RSA algorithms.

I. INTRODUCTION

Data transmission occurs when there is a channel between two machines. To exchange a datum, an encoding must be chosen for the transmission signals. This choice basically depends upon the physical medium used to transfer the data, the guaranteed data integrity and the transmission speed. Data transmission can be simple if there are only two machines communicating, or if only a single piece of data is sent. Otherwise, it is necessary to install several transmission lines or to share the line among several different communication actors. Transmitting a message sometimes requires a safe channel to prevent an unauthorized person from discovering its content.

In our previous work [1], we proposed a model that can be summarized as in Fig. 1. This model is based upon a dynamic Huffman coding and encryption methods.

Fig. 1. Block diagram for transmitting the original message. (Sender: message to transmit (plaintext message), dynamic Huffman coding, coded message, DES / RSA encryption methods, delivered message. Receiver: received message.)

In this paper we implement such a cryptosystem considering an alphabet with 256 symbols in the ASCII code, based on dynamic Huffman coding and the existing DES and RSA algorithms. By doing so, security is reinforced at two levels.

The paper is organized as follows: section II presents some basic concepts about secure systems and cryptography; section III reviews the dynamic Huffman coding we introduced for data compression [1]. Section IV presents the DES and RSA algorithms. The model description is given in section V, which consists of the alphabet construction for the plaintext message in subsection V-A, whereas subsections V-B and V-C present respectively the encryption model and the decryption model. In section VI, we present the cryptosystem implementation in detail using the Java programming language, while section VII shows the testing results of the proposed cryptosystem. We finally conclude this work and give some perspectives in section VIII.

II. SECURE SYSTEMS

The term secure systems is somewhat misleading, since it implies that systems are either secure or insecure. In truth, there is no absolute security. Every system can be broken given enough time and money. According to Knudsen [2], there are more secure and less secure systems but no totally secure systems. When people talk about secure systems, they mean systems where security is a concern or was considered as part of the design. The security of any application is determined by the security of the platform it runs on, as well as the security features designed into the application itself. The most important tool used by applications for security is cryptography, a branch of mathematics that deals with secret writing. Cryptography is the biggest tool in the application programmer's arsenal, even though a cryptographically-enabled program is not necessarily a secure one. When correctly used, cryptography provides the standard security features, namely confidentiality, integrity, authentication and non-repudiation [2], [3].

Confidentiality assures that data cannot be viewed by unauthorized people.

Integrity assures that data cannot be changed without certain knowledge. The receiver of a message should be able to check whether the message has been modified during transmission, either accidentally or deliberately. No one should be able to substitute a false message for the original message, or for parts of it.

Authentication assures that people dealing with the system are not imposters. The receiver of a message should be able to verify its origin. No one except Alice should be able to send a message to Bob and pretend to be Alice (data origin authentication). When initiating a communication, Alice and Bob should be able to identify each other (entity authentication).

Non-repudiation assures that one can prove to a third party that he/she received a message from a specific and known sender. The sender should not be able to later deny that he/she sent a message.

III. DYNAMIC HUFFMAN CODING

Applications of Huffman coding are pervasive in computer science. This coding scheme is not limited to encoding messages. Indeed, Huffman coding can be used to compress parts of both digital photographs and other files such as digital sound files (MP3) and ZIP files. In [4], Massey showed that the techniques of source coding readily apply in cryptography. Data reduction is accomplished by Patra and Sankar in [5] using Huffman coding and encryption by the insertion of a shuffled cyclic redundancy code. Klein et al. [6] analyzed the cryptographic aspects of Huffman codes used to encode a large natural language text on CD-ROM and concluded that the underlying cryptanalysis problem is NP-complete for several variants of the encoding process [7]. Rivest et al. [8] cryptanalysed a Huffman encoded text assuming that the cryptanalyst does not know the codebook. According to them, cryptanalysis in this situation is surprisingly difficult and even impossible in some cases due to the ambiguity of the resulting encoded data. Data compression algorithms have been considered by cryptographers as a ciphering scheme [9]. In summary, many researchers still pay attention to Huffman coding. The following paragraphs briefly present the Huffman coding technique.

The Huffman (or variable length coding) method is a lossless data compression algorithm based on the fact that some symbols have a higher frequency of occurrence than others. These symbols are encoded in fewer bits than with fixed length coding, producing on average a higher compression. The idea behind Huffman coding is to use shorter bit patterns for more frequent symbols. To achieve that goal, the Huffman algorithm needs the probability of occurrence of each symbol. These symbols are stored in nodes and then sorted in ascending order of their probability values.

The algorithm selects the two nodes (children) with the smallest probabilities and constructs a new node (parent) with a probability value equal to the sum of the probabilities of its children. This process ends when the node with probability equal to 1 is constructed. The result of this algorithm is a binary tree data structure whose root is the node with probability equal to 1 and whose last nodes (leaves) represent the original symbols. The binary tree is then traversed to reach its leaves when a certain symbol is needed. Moving to a right child inserts a 1 in the bit pattern of the symbol, while moving to a left child inserts a 0. The result is a unique bit pattern for each symbol that can be uniquely decoded. This is due to the fact that no codeword can be a prefix of any longer codeword.

The dynamic aspect of the Huffman coding we introduced is the following: instead of assigning a frequency to each symbol of the alphabet according to predefined values, we assign to each symbol of the alphabet a random value belonging to the interval [0, 1]. We then normalize each frequency. By doing so, frequency analysis can no longer be used to decode the compressed data. A full description of the dynamic Huffman algorithm can be found in [1].

IV. ENCRYPTION METHODS

Encryption is a tool one can use to protect secrets. A cipher encrypts or decrypts data. Ciphers come in three flavors:
• Symmetric, or private-key, ciphers use a single secret key to encrypt and decrypt data. Symmetric keys can be useful in applications like hard-disk file encryption, when the same person encrypts and decrypts data. The Data Encryption Standard (DES) is a symmetric cipher. The term symmetric encryption is used since both communication partners use the same key k for encryption and decryption.
• Asymmetric, or public-key, ciphers use a pair of keys. One key is public and may be freely distributed while the other key is private and should be kept secret. Data encrypted with either key can be decrypted using the other key. In practice, the public key is used for data encryption and the private key is used for data decryption.
• Hybrid systems use a combination of symmetric and asymmetric ciphers. Asymmetric ciphers are much slower than their symmetric counterparts. In a hybrid system, an asymmetric cipher is used to exchange a private key (also called a secret key or a session key). The secret key is used with a symmetric cipher for data encryption and decryption.

Symmetric and asymmetric ciphers are described by algorithms. A hybrid system is, at a higher level, a protocol that uses both public and private key algorithms [2].

In subsections IV-A and IV-B we present the DES and RSA algorithms.

A. Presentation of DES Algorithm

In 1973, the National Bureau of Standards (now the National Institute of Standards and Technology) published a solicitation for cryptosystems in the Federal Register. This led to the adoption of the DES, which stands for Data Encryption Standard. It is a symmetric cipher, first published in 1975 and based largely on research performed at IBM. The National Security Agency (NSA) also had a hand in the algorithm, although its involvement and motives are still a subject of debate. A paper on the history of DES was written by Smid and Branstad [10].

DES processes plaintext blocks of n = 64 bits, producing 64-bit ciphertext blocks with a secret key of 56 bits [11]. Its main weakness is the short key: the key is stored in 8 bytes, but only 56 of these bits are key material (they are packed with 8 bits of parity), which makes DES vulnerable to key search attacks. In short, the DES algorithm takes 56-bit keys and 64-bit plaintext messages as inputs and outputs a 64-bit cryptogram:

$$\mathrm{DES} : \{0,1\}^{56} \times \{0,1\}^{64} \longrightarrow \{0,1\}^{64} \tag{1}$$

If the key $k$ is chosen, we get

$$\mathrm{DES}_k : \{0,1\}^{64} \longrightarrow \{0,1\}^{64}, \qquad x \longmapsto \mathrm{DES}(k, x). \tag{2}$$

An encryption with $\mathrm{DES}_k$ consists of 16 major steps, called rounds. In each of the 16 rounds, a 48-bit round key $k_i$ is used. DES decryption is identical to encryption except that the key schedule is reversed [12].

The well-known drawback of a symmetric-key system is that it requires a prior communication of the key between the involved people before any ciphertext is transmitted. In practice, this may be very difficult to achieve. A deeper explanation of the DES can be found in [13].

B. Presentation of RSA Algorithm

The general idea of public-key cryptography was introduced in the open literature by Diffie and Hellman in 1976 [14]. RSA encryption falls into this category of cryptography. For a general survey article on public-key cryptography the reader could refer to [15].

The RSA cryptosystem, named after its inventors R. Rivest, A. Shamir, and L. Adleman, is the most widely used public-key cryptosystem [16]. It may be used to provide both secrecy and digital signatures, and its security is based on the intractability of the integer factorization problem.

Let us assume Bob and Alice want to communicate using the RSA encryption method. First, both of them create an RSA public key and a corresponding private key. The algorithm has three steps, namely: the key generation for RSA public-key encryption, the RSA public-key encryption and the decryption.

Key generation for RSA public-key encryption. Alice should do the following [17], [18]:
• Generate two distinct large random primes $p$ and $q$, each roughly the same size.
• Compute $n = pq$ and $\varphi = (p - 1)(q - 1)$.
• Select a random integer $e$, $1 < e < \varphi$, such that $\gcd(e, \varphi) = 1$, where gcd stands for greatest common divisor.
• Use the extended Euclidean algorithm to compute the unique integer $d$, $1 < d < \varphi$, such that $ed \equiv 1 \pmod{\varphi}$.
• Alice's public key is $(n, e)$; Alice's private key is $d$.

RSA public-key encryption. Bob encrypts the plaintext message $m$ for Alice by doing the following:
• Bob obtains Alice's authentic public key $(n, e)$.
• Bob represents the message as an integer $m$ in the interval $[0, n - 1]$.
• Bob computes $c = m^e \bmod n$.

Data decryption. Alice can decrypt the message by doing the following:
• Use the private key $d$.
• Compute $m = c^d \bmod n$.

RSA encryption/decryption is substantially slower than the commonly used symmetric-key encryption algorithms such as DES [17]. In practice, RSA encryption is most commonly used for the transport of symmetric-key encryption algorithm keys and for the encryption of small data items. More detailed information about RSA encryption can be found in [19].

V. MODEL DESCRIPTION

The full description of the proposed model is given in [1]. Some extensions were made for its implementation; they are described in the following subsections, which present the alphabet used, the data encryption system model, and the data decryption system model.

A. Presentation of the Alphabet

An alphabet is important for the definition of the plaintext message. Indeed, a readable text is possible only when both the sender and receiver use the same alphabet. Instead of using a restricted alphabet of 36 symbols as we did for simplicity in [1], we now consider the 256 symbols in the ASCII code, which is used by many mailers throughout the world [2]. The extension to other codes such as Unicode [20] can be done easily.

B. Modeling Data Encryption System

The proposed model for data transmission is shown in Fig. 2.

Fig. 2. Block diagram for transmitting the original message. (Sender: message to transmit, identification of different symbols, random probability value assignment, Huffman codewords, compressed message with Huffman algorithm, encryption method, transmitted message. Receiver.)

The sender uses the alphabet described in subsection V-A to compose the plaintext message to transmit.

Once the original message is formed, the set of distinct symbols in the message is extracted and a random probability value is assigned to each symbol [1]. The Huffman algorithm is then applied to compress the data to transmit. Next, an encryption method is used to encrypt the message before delivering it to the receiver.

In this paper, we consider the DES algorithm as a representative of symmetric cipher methods and the RSA algorithm as a representative of asymmetric cipher methods.

C. Modeling Data Decryption System

On the receiver's side, the user needs to know the key of the encryption method and the codebook constructed by the dynamic Huffman coding. A full description of the keys involved in the system is given in subsection VI-A. Fig. 3 shows the process for decrypting the received message.

Fig. 3. Block diagram for decoding the received message. (Receiver: received message, decryption method, Huffman codewords, alphabet symbols and their codewords, Huffman decoding method, original message.)

VI. IMPLEMENTATION OF THE CRYPTOSYSTEM

There are a number of excellent cryptographic libraries in C and C++ [21], but Java serves as the reference since it is very popular for new business and server applications. Indeed, memory management is done automatically, eliminating entire classes of bugs. Java provides a standard cryptographic API which, while not perfect or complete, is about as universal as we can get [22].

A. Keys Description

Two subsystems can be clearly identified in the proposed model. The first one concerns the data compression, which uses the dynamic Huffman coding, and the second one concerns the encryption methods. Depending upon the encryption method used, two or three keys are involved in the whole model.

In the first part, the compressed data are obtained using the dynamic Huffman coding [1]. The decoding process needs either the codewords, registered as a binary tree, or the symbols together with their distribution frequencies. To prevent this information from being available as plaintext, one can encrypt it to obtain a key. We call the latter the decoding key. Therefore, the decoding key is the encrypted form of either the codebook or the symbols with their distribution frequencies. Any encryption algorithm can be used at this stage. In our case, we used the DES encryption method for practical purposes. The use of the RSA algorithm at this step would involve two keys. That is the first type of key involved in the model.

In the second part, depending on the encryption algorithm used, one can have one or two keys. In the case of the DES encryption method (or more generally a symmetric encryption algorithm), only a secret key is involved. In the case of the RSA encryption method (or more generally an asymmetric encryption algorithm), a private key (to be kept secret) and a public key (which can be freely distributed) are involved. Fig. 4 presents how the whole process of data encryption interacts with the management of the keys.

Fig. 4. Model describing the management of the keys.

The major problem is how to transfer the different keys securely so that they are not discovered by imposters or men in the middle.

We suggest the following solution when the DES encryption is chosen:
• First, send the encrypted message.
• Then, transmit the decoding key, followed by the encryption key.
• Finally, the receiver should authenticate himself or herself before decrypting the message using the two keys.

We suggest the following solution when the DES and RSA encryptions are chosen:
• First, send the encrypted message.
• Then, transmit the decoding key.
• Next, get the public key to encrypt the coded message.
• Finally, the receiver should authenticate himself or herself before decrypting the message using the decoding key and the private key.

These solutions are not without weaknesses, as is the case with data transmission problems in general. The question "How do we exchange keys in a safe mode?" is still a challenge in cryptography.

For security reasons the sender can use different channels to transmit each of the keys to the receiver. By doing so, even if a man in the middle intercepts one of the keys, it will be hard for him to recover the original message.

B. Interface of the Cryptosystem

The cryptosystem tool is built for a peer to peer architecture, but it can easily be extended to a distributed environment.

Figure 5 presents the interface for the encryption process. The transaction form has information about the sender and the receiver. Once the message is written, it must first be coded by the dynamic Huffman technique (DH coding). The decoding key, encrypted with the DES algorithm, is generated and stored into a file. The sender can then choose the DES or RSA algorithm to encrypt the previously coded message. The two generated keys are stored into different files in the case of the RSA algorithm and the single key into one file in the case of the DES algorithm. The button labeled Send is used to send all data to the receiver by electronic mail.

Fig. 5. Interface for sending a message.

Once the receiver has downloaded all the files attached to the message in his inbox related to the transaction, he can use the reception form in Fig. 6 for the decryption process. All files are stored in a specific folder. The receiver selects the encrypted file and chooses the appropriate decryption algorithm. Then he clicks on the button Decryption. The receiver obtains the original plaintext message by clicking on the button DH Decoding for the dynamic Huffman decoding procedure.

Fig. 6. Interface for receiving a message.

C. Overview of Classes used in the implemented Cryptosystem

The Java Cryptography Extension (JCE) [2] includes many classes to handle various algorithms in cryptography. For instance, the class javax.crypto.KeyGenerator is suitable for randomly generating a single key. It uses a getInstance() factory method taking as parameter the name of the algorithm of interest. The general syntax is

    KeyGenerator kg = KeyGenerator.getInstance("AlgoName");

The secret key of the DES algorithm is generated by this class and held through the interface javax.crypto.SecretKey.

    SecretKey sk = KeyGenerator.getInstance("DES").generateKey();

The private and public keys of the RSA algorithm are generated thanks to the classes java.security.KeyPairGenerator and java.security.KeyPair.

    KeyPairGenerator kp = KeyPairGenerator.getInstance("RSA");
    kp.initialize(512);
    KeyPair keys = kp.generateKeyPair();

We developed a package which contains four classes.
• The class DES has two methods, cryptage() and decryptage(), for data encryption and data decryption using the DES algorithm.
• The class RSA has many methods, among which getPublicKeyInBytes(), getPrivateKeyInBytes(), setPublicKey(), setPrivateKey(), myKeys(), generateKeyPair(), crypt(), decryptInBytes(), decryptInString(), signatureSeries(), etc.
• The class DESKey has a method which generates a secret key. The latter is registered into a file.
• The class RSAKey has a method which generates the private and public keys, registered into files.

VII. EVALUATION

To test the implemented system, we imagine Bob as a user of the interface (see Fig. 5) sending the plaintext message "cryptography handles a part of security" to Alice.

The three generated files containing the keys should be sent to Alice through an electronic message within five minutes after sending the encrypted message. Alice can then use the interface (see Fig. 6) to decrypt the original message coming from Bob.

VIII. CONCLUSION AND FURTHER WORKS

In this paper, we implemented in Java a cryptosystem based on dynamic Huffman coding and the DES and RSA encryption methods. The advantage of such a tool can be observed at two levels.

The use of the dynamic Huffman coding algorithm introduces a first security level: the plaintext message cannot be decoded without knowledge of the symbol frequencies. The key information needed to decode the coded message is encrypted before it is delivered to the receiver.

The second level of security is introduced by the DES and RSA algorithms. When using the DES algorithm, the secret key should be kept as secret as possible, as for all symmetric encryption methods. When using the RSA algorithm, the generated private and public keys should also be stored safely.

The drawback of the proposed system is the transmission of the different keys in a safe mode. This is a general problem of symmetric and asymmetric encryption. The idea of sending the encrypted message and the different keys within five minutes is formulated in order to limit the risk of a man in the middle trying to intercept the encrypted message to discover its contents.

Future works will include the security aspects of the whole system for data transmission. The channel coding for the data transmission will also be explored.

ACKNOWLEDGMENT

We thank anonymous reviewers for their review efforts.

REFERENCES

[1] E. C. Ezin, "Modeling data transmission through a channel based on Huffman coding and encryption methods," International Journal of Computer Science and Information Security, vol. 8, no. 9, pp. 195-199, 2010.
[2] J. Knudsen, Java Cryptography. O'Reilly and Associates, Inc., 1998.
[3] H. Delfs and H. Knebl, Introduction to Cryptography. Springer, 2007.
[4] J. L. Massey, "Some applications of source coding in cryptography," European Transactions on Telecommunications, vol. 5, pp. 421-429, 1994.
[5] N. Patra and S. Sankar, "Data reduction by Huffman coding and encryption by insertion of shuffled cyclic redundancy code," Master's thesis, National Institute of Technology, Rourkela, 2007.
[6] S. T. Klein et al., "Storing text retrieval systems on CD-ROM: Compression and encryption considerations," ACM Transactions on Information Systems, vol. 7, no. 3, 1989.
[7] S. T. Klein and A. S. Fraenkel, "Algorithmica 12," 1989.
[8] R. L. Rivest et al., "On breaking a Huffman code," IEEE Transactions on Information Theory, vol. 42, no. 3, 1996.
[9] G. Simmons, Contemporary Cryptology: The Science of Information Integrity. IEEE Press, 1991.
[10] M. E. Smid and D. K. Branstad, "The data encryption standard: past and future," in Contemporary Cryptology: The Science of Information Integrity, vol. 46, pp. 43-64.
[11] National Institute of Standards and Technology, "Data encryption standard," Federal Information Processing Standards, vol. 46, no. 3, 1977.
[12] D. R. Stinson, Cryptography: Theory and Practice, Third Edition (Discrete Mathematics and Its Applications). Taylor and Francis Group LLC, 1995.
[13] S. Landau, "Standing the test of time: Data encryption standards," Notices of the American Mathematical Society, vol. 47, 2000, pp. 450-459.
[14] W. Diffie and M. Hellman, "New directions in cryptography," Transactions on Information Theory, vol. 22, 1976, pp. 644-654.
[15] A. J. Menezes and N. Koblitz, "A survey on public-key cryptography," SIAM Review, vol. 46, 2004, pp. 599-634.
[16] R. L. Rivest et al., "A method for obtaining digital signatures and public-key cryptosystems," Communications of the ACM, vol. 21, no. 2, 1978, pp. 120-126.
[17] A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography. CRC Press, 1996.
[18] B. Schneier, Applied Cryptography. John Wiley & Sons Inc., 1996.
[19] D. Boneh, "Twenty years of attacks on the RSA cryptosystem," Notices of the American Mathematical Society, vol. 46, 1999, pp. 203-216.
[20] J. K. Korpela, Unicode Explained. O'Reilly and Associates, Inc., 2006.
[21] M. Welschenbach, Cryptography in C and C++, Second Edition. Apress, 2005.
[22] N. Galbreath, Cryptography for Internet and Database Applications. Wiley Publishing, Inc., 2002.

AUTHOR PROFILE

Eugène C. Ezin received his Ph.D. degree with the highest level of distinction in 2001 after research work carried out on neural and fuzzy systems for speech applications at the International Institute for Advanced Scientific Studies in Italy. Since 2007, he has been a senior lecturer in computer science. He is a reviewer for the Mexican International Conference on Artificial Intelligence. His research interests include high performance computing, neural networks and fuzzy systems, signal processing, cryptography, modeling and simulation. He is currently in charge of the master program in computer science and applied sciences at the Institut de Mathématiques et de Sciences Physiques of the Abomey-Calavi University in the Republic of Benin.
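
As an illustration of the dynamic Huffman coding described in Section III, the following is a minimal sketch and not the author's implementation. It assumes that one random weight is drawn per distinct symbol of the message, that the weights are normalized to sum to 1, and that a seeded java.util.Random makes the codebook reproducible. The class name DynamicHuffmanSketch, its method names, and the representation of the coded message as a String of '0' and '1' characters are all hypothetical choices made for readability.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Random;
import java.util.TreeSet;

/** Illustrative sketch only; names are hypothetical, not taken from the paper. */
final class DynamicHuffmanSketch {

    /** Tree node: a leaf carries a symbol, an internal node carries two children. */
    private static final class Node implements Comparable<Node> {
        final double weight;
        final int order;   // insertion rank, used only to break ties deterministically
        final char symbol; // meaningful for leaves only
        final Node left;
        final Node right;

        Node(double weight, int order, char symbol, Node left, Node right) {
            this.weight = weight;
            this.order = order;
            this.symbol = symbol;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }

        @Override
        public int compareTo(Node other) {
            int byWeight = Double.compare(weight, other.weight);
            return byWeight != 0 ? byWeight : Integer.compare(order, other.order);
        }
    }

    /** Builds a codebook from random (not measured) weights, one per distinct symbol. */
    static Map<Character, String> buildCodebook(String message, long seed) {
        TreeSet<Character> symbols = new TreeSet<>();
        for (char ch : message.toCharArray()) {
            symbols.add(ch);
        }
        Map<Character, String> codebook = new HashMap<>();
        if (symbols.isEmpty()) {
            return codebook;
        }
        // Draw one random value per symbol, then normalize so the weights sum to 1.
        Random rng = new Random(seed);
        double[] weights = new double[symbols.size()];
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            weights[i] = rng.nextDouble();
            sum += weights[i];
        }
        PriorityQueue<Node> queue = new PriorityQueue<>();
        int order = 0;
        for (char ch : symbols) {
            queue.add(new Node(weights[order] / sum, order, ch, null, null));
            order++;
        }
        // Repeatedly merge the two lightest nodes under a new parent node.
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node(a.weight + b.weight, order++, '\0', a, b));
        }
        Node root = queue.poll();
        if (root.isLeaf()) {
            codebook.put(root.symbol, "0"); // a one-symbol alphabet still needs one bit
        } else {
            assign(root, "", codebook);
        }
        return codebook;
    }

    /** Moving to a left child appends 0, moving to a right child appends 1. */
    private static void assign(Node node, String prefix, Map<Character, String> codebook) {
        if (node.isLeaf()) {
            codebook.put(node.symbol, prefix);
            return;
        }
        assign(node.left, prefix + "0", codebook);
        assign(node.right, prefix + "1", codebook);
    }

    static String encode(String message, Map<Character, String> codebook) {
        StringBuilder bits = new StringBuilder();
        for (char ch : message.toCharArray()) {
            bits.append(codebook.get(ch));
        }
        return bits.toString();
    }

    /** Prefix-freeness lets us emit a symbol as soon as the buffered bits match a codeword. */
    static String decode(String bits, Map<Character, String> codebook) {
        Map<String, Character> inverse = new HashMap<>();
        for (Map.Entry<Character, String> entry : codebook.entrySet()) {
            inverse.put(entry.getValue(), entry.getKey());
        }
        StringBuilder out = new StringBuilder();
        StringBuilder current = new StringBuilder();
        for (char bit : bits.toCharArray()) {
            current.append(bit);
            Character symbol = inverse.get(current.toString());
            if (symbol != null) {
                out.append(symbol);
                current.setLength(0);
            }
        }
        return out.toString();
    }
}
```

In this sketch the codebook returned by buildCodebook() plays the role of the information that the paper encrypts to obtain the decoding key: without it, the bit string produced by encode() cannot be split back into codewords by decode().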
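
The RSA steps listed in Section IV-B (key generation, encryption, decryption) can be checked on small numbers. The sketch below is an illustrative toy, assuming the textbook values p = 61, q = 53 and e = 17, which are far too small for any real use; the class and method names are hypothetical. It relies only on java.math.BigInteger, whose modInverse() stands in for the extended Euclidean algorithm mentioned in the text.

```java
import java.math.BigInteger;

/** Toy RSA on tiny primes; illustrative only, names are hypothetical. */
final class RsaToySketch {

    /** Returns {n, e, d} following the key generation steps: n = pq, phi = (p-1)(q-1), ed = 1 mod phi. */
    static BigInteger[] generateKey(BigInteger p, BigInteger q, BigInteger e) {
        BigInteger n = p.multiply(q);
        BigInteger phi = p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE));
        if (!e.gcd(phi).equals(BigInteger.ONE)) {
            throw new IllegalArgumentException("e must satisfy gcd(e, phi) = 1");
        }
        BigInteger d = e.modInverse(phi); // unique d in (1, phi) with e*d = 1 (mod phi)
        return new BigInteger[] {n, e, d};
    }

    /** c = m^e mod n, for a message m in [0, n - 1]. */
    static BigInteger encrypt(BigInteger m, BigInteger e, BigInteger n) {
        return m.modPow(e, n);
    }

    /** m = c^d mod n. */
    static BigInteger decrypt(BigInteger c, BigInteger d, BigInteger n) {
        return c.modPow(d, n);
    }
}
```

With these assumed values, n = 3233, phi = 3120 and d = 2753; the message m = 65 encrypts to c = 2790, and decrypting 2790 with d returns 65.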
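
To complement the JCE calls quoted in Section VI-C, here is a hedged sketch of how the generated keys might be used with javax.crypto.Cipher. It is not the paper's DES or RSA class. The transformation strings "DES/ECB/PKCS5Padding" and "RSA/ECB/PKCS1Padding" are assumptions, since the paper does not state which mode or padding was used, and the class name JceKeysSketch and its helper apply() are hypothetical. Checked security exceptions are wrapped in IllegalStateException only to keep the example short.

```java
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/** Illustrative use of the standard JCE classes; names and transformations are assumptions. */
final class JceKeysSketch {

    // Assumed transformations: the paper does not specify mode or padding.
    static final String DES = "DES/ECB/PKCS5Padding";
    static final String RSA = "RSA/ECB/PKCS1Padding";

    /** Generates a random DES secret key. */
    static SecretKey newDesKey() {
        try {
            return KeyGenerator.getInstance("DES").generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("DES key generation is not available", e);
        }
    }

    /** Generates an RSA key pair with the given modulus size in bits. */
    static KeyPair newRsaKeyPair(int bits) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(bits);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA key generation is not available", e);
        }
    }

    /** Runs one encryption or decryption; mode is Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE. */
    static byte[] apply(String transformation, int mode, Key key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(transformation);
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cipher operation failed", e);
        }
    }
}
```

A possible use is to encrypt the coded message with apply(JceKeysSketch.DES, Cipher.ENCRYPT_MODE, desKey, data) and to protect a small item such as desKey.getEncoded() with the RSA public key, which matches the remark in Section IV-B that RSA is mostly used for key transport and small data items. A larger modulus than the 512 bits quoted in the paper, for example 2048 bits, would be the prudent choice today.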
