(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 8, November 2010
http://sites.google.com/site/ijcsis/ ISSN 1947-5500

Steganography and Error-Correcting Codes

M.B. Ould MEDENI and El Mamoun SOUIDI
Laboratory of Mathematic Informatics and Applications
University Mohammed V-Agdal, Faculty of Sciences
Rabat, BP 1014, Morocco
Email: sbaimedeni@yahoo.fr, souidi@fsr.ac.ma

Abstract: In this work, we study how error-correcting codes are used in steganographic protocols (a technique sometimes called "matrix encoding"), which use a linear code as an ingredient. Among all codes of a fixed block length and fixed dimension (and thus of a fixed information rate), an optimal code is one that achieves the maximum length embeddable (MLE). Steganographic protocols are closely related to error-correcting codes. We clarify this link, which leads to a bound on the maximum capacity of a steganographic protocol and gives a way to build optimal steganographic protocols.

Keywords: Steganography, error-correcting code, average distortion, matrix encoding, embedding efficiency.

I. INTRODUCTION

The goal of digital steganography is to modify a digital object (cover) to encode and conceal a sequence of bits (message) to facilitate covert communication. This process is fundamentally different from cryptography, where the existence of a secret message may be suspected by anyone who can observe the scrambled ciphertext while it is communicated.

A common technique in steganography is to embed the hidden message into a larger cover object (such as a digital image) by slightly distorting the cover object in a way that, on one hand, makes it possible for the intended recipient to extract the hidden message but, on the other hand, makes it very hard for everybody else to detect the distortion of the cover object (i.e., to detect the existence of the hidden message). The amount of noise that is naturally (inherently) present in the cover object determines the amount of distortion that can be introduced into the cover object before the distortion becomes detectable.

An interesting steganographic method is known as matrix encoding, introduced by Crandall [1]. Matrix encoding requires the sender and the recipient to agree in advance on a parity check matrix $\mathbf{H}$, and the secret message is then extracted by the recipient as the syndrome (with respect to $\mathbf{H}$) of the received cover object. This method was made popular by Westfeld [3], who incorporated a specific implementation using Hamming codes in his F5 algorithm, which can embed $t$ bits of message in $2^t - 1$ cover symbols by changing at most one of them. Matrix encoding using linear codes (syndrome coding) is a general approach to improving the embedding efficiency of steganographic schemes. The covering radius of the code corresponds to the maximal number of embedding changes needed to embed any message [8]. Steganographers, however, are more interested in the average number of embedding changes than in the worst case. In fact, the concept of embedding efficiency (the average number of bits embedded per embedding change) has been frequently used in steganography to compare and evaluate the performance of steganographic schemes.

Two parameters help to evaluate the performance of a steganographic protocol $[n, k, \rho]$ over a cover message of $N$ symbols: the average distortion $D = R_a / N$, where $R_a$ is the expected number of changes over uniformly distributed messages; and the embedding rate $E = t / N$, which is the amount of bits that can be hidden in a cover message. In general, for the same embedding rate, a method is better when the average distortion is smaller.

The remainder of the paper is organized as follows. Section 2 gives a brief introduction to error-correcting codes. In Section 3 we introduce the relations between error-correcting codes and steganography. We construct our approach and report on experimental results in Section 4. Section 5 gives a conclusion.

II. BASICS OF CODING THEORY

We now review some elementary concepts from coding theory that are relevant for our study. A good introductory text to this subject is, for example, [7]. Throughout the text, boldface symbols denote vectors or matrices. A binary code $C$ is any subset of the space of all $n$-bit column vectors $\mathbf{x} = (x_1, \ldots, x_n) \in \{0,1\}^n$. The vectors in $C$ are called codewords. The set $\{0,1\}^n$ forms a linear vector space if we define the sum of two vectors and the multiplication of a vector by a scalar using the usual arithmetic in the finite field $GF(2)$; we denote this field by $\mathbb{F}_2$ and the vector space by $\mathbb{F}_2^n$. For any $C, D \subset \mathbb{F}_2^n$ and vector $\mathbf{x}$,

$$C + D = \{\mathbf{y} \in \mathbb{F}_2^n \mid \mathbf{y} = \mathbf{c} + \mathbf{d},\ \mathbf{c} \in C,\ \mathbf{d} \in D\}, \qquad \mathbf{x} + C = \{\mathbf{y} \in \mathbb{F}_2^n \mid \mathbf{y} = \mathbf{x} + \mathbf{c},\ \mathbf{c} \in C\}.$$

The Hamming weight $w$ of a vector $\mathbf{x}$ is defined as the number of ones in $\mathbf{x}$. The distance between two vectors $\mathbf{x}$ and $\mathbf{y}$ is the Hamming weight of their difference, $d(\mathbf{x}, \mathbf{y}) = w(\mathbf{x} - \mathbf{y})$. For any $\mathbf{x} \in \mathbb{F}_2^n$ and a positive real number $r$, we denote by $B(\mathbf{x}, r)$ the ball with center $\mathbf{x}$ and radius $r$, $B(\mathbf{x}, r) = \{\mathbf{y} \in \mathbb{F}_2^n \mid d(\mathbf{x}, \mathbf{y}) \le r\}$. We also define the distance between $\mathbf{x}$ and a set $C \subset \mathbb{F}_2^n$ as $d(\mathbf{x}, C) = \min_{\mathbf{c} \in C} d(\mathbf{x}, \mathbf{c})$. The covering radius $R$ of $C$ is defined as $R = \max_{\mathbf{x} \in \mathbb{F}_2^n} d(\mathbf{x}, C)$. The average distance to the code $C$, defined as

$$R_a = 2^{-n} \sum_{\mathbf{x} \in \mathbb{F}_2^n} d(\mathbf{x}, C),$$

is the average distance between a randomly selected vector from $\mathbb{F}_2^n$ and the code $C$. Clearly $R_a \le R$.

Linear codes are codes for which $C$ is a linear vector subspace of $\mathbb{F}_2^n$. If $C$ has dimension $k$, we call $C$ a linear code of length $n$ and dimension $k$ (and codimension $n - k$), or we say that $C$ is an $[n, k]$ code. Each linear code $C$ of dimension $k$ has a basis consisting of $k$ vectors. Writing the basis vectors as rows of a $k \times n$ matrix $\mathbf{G}$, we obtain a generator matrix of $C$. Each codeword can be written as a linear combination of rows of $\mathbf{G}$. There are $2^k$ codewords in an $[n, k]$ code. Given $\mathbf{x}, \mathbf{y} \in \mathbb{F}_2^n$, we define their dot product $\mathbf{x} \cdot \mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$, with all operations in $GF(2)$. We say that $\mathbf{x}$ and $\mathbf{y}$ are orthogonal if $\mathbf{x} \cdot \mathbf{y} = 0$. Given a code $C$, the dual code of $C$, denoted $C^{\perp}$, is the set of all vectors $\mathbf{x}$ that are orthogonal to all vectors in $C$. The dual code of an $[n, k]$ code is an $[n, n-k]$ code with an $(n-k) \times n$ generator matrix $\mathbf{H}$ with the property that

$$\mathbf{H}\mathbf{x} = \mathbf{0} \iff \mathbf{x} \in C. \tag{1}$$

The matrix $\mathbf{H}$ is called the parity check matrix of $C$. For any $\mathbf{x} \in \mathbb{F}_2^n$, the vector $\mathbf{s} = \mathbf{H}\mathbf{x}$ is called the syndrome of $\mathbf{x}$. For each syndrome $\mathbf{s} \in \mathbb{F}_2^{n-k}$, the set $C(\mathbf{s}) = \{\mathbf{x} \in \mathbb{F}_2^n \mid \mathbf{H}\mathbf{x} = \mathbf{s}\}$ is called a coset. Note that $C(\mathbf{0}) = C$. Obviously, cosets associated with different syndromes are disjoint. Also, from elementary linear algebra we know that every coset can be written as $C(\mathbf{s}) = \mathbf{x} + C$, where $\mathbf{x} \in C(\mathbf{s})$ is arbitrary. Thus, there are $2^{n-k}$ disjoint cosets, each consisting of $2^k$ vectors. Any member of the coset $C(\mathbf{s})$ with the smallest Hamming weight is called a coset leader and will be denoted by $\mathbf{e}_L(\mathbf{s})$.

1) Lemma: Given a coset $C(\mathbf{s})$, for any $\mathbf{x} \in C(\mathbf{s})$, $d(\mathbf{x}, C) = w(\mathbf{e}_L(\mathbf{s}))$. Moreover, if $d(\mathbf{x}, C) = d(\mathbf{x}, \mathbf{c}')$ for some $\mathbf{c}' \in C$, the vector $\mathbf{x} - \mathbf{c}'$ is a coset leader.

Proof: $d(\mathbf{x}, C) = \min_{\mathbf{c} \in C} w(\mathbf{x} - \mathbf{c}) = \min_{\mathbf{y} \in C(\mathbf{s})} w(\mathbf{y}) = w(\mathbf{e}_L(\mathbf{s}))$. The second equality follows from the fact that as $\mathbf{c}$ runs through the code $C$, $\mathbf{x} - \mathbf{c}$ runs through all members of the coset $C(\mathbf{s})$.

2) Lemma: If $C$ is an $[n, k]$ code with an $(n-k) \times n$ parity check matrix $\mathbf{H}$ and covering radius $R$, then any syndrome $\mathbf{s} \in \mathbb{F}_2^{n-k}$ can be written as a sum of at most $R$ columns of $\mathbf{H}$, and $R$ is the smallest such number. Thus, we can also define the covering radius as the maximal weight of all coset leaders.

Proof: Any $\mathbf{x} \in \mathbb{F}_2^n$ belongs to exactly one coset $C(\mathbf{s})$, and from Lemma 1 we know that $d(\mathbf{x}, C) = w(\mathbf{e}_L(\mathbf{s}))$. But the weight $w(\mathbf{e}_L(\mathbf{s}))$ is the smallest number of columns of $\mathbf{H}$ that must be added to obtain $\mathbf{s}$.

III. LINEAR CODES FOR STEGANOGRAPHY

The behavior of a steganographic algorithm can be sketched in the following way:
1) a cover-medium is processed to extract a sequence of symbols $\mathbf{v}$, sometimes called cover-data;
2) $\mathbf{v}$ is modified into $\mathbf{s}$ to embed the message $\mathbf{m}$; $\mathbf{s}$ is sometimes called the stego-data;
3) the modifications made to obtain $\mathbf{s}$ are translated onto the cover-medium to obtain the stego-medium.

Here, we assume that the detectability of the embedding increases with the number of symbols that must be changed to go from $\mathbf{v}$ to $\mathbf{s}$ (see [5] for some examples of this framework). Syndrome coding deals with this number of changes. The key idea is to use a syndrome computation to embed the message into the cover-data. Such a scheme uses a linear code $C$, more precisely its cosets, to hide $\mathbf{m}$. A word $\mathbf{s}$ hides the message $\mathbf{m}$ if $\mathbf{s}$ lies in a particular coset of $C$ related to $\mathbf{m}$. Since cosets are uniquely identified by their syndromes, embedding consists exactly in searching for a word $\mathbf{s}$ with syndrome $\mathbf{m}$ that is close enough to $\mathbf{v}$.

A. Matrix Encoding

We first set up the notation and describe the matrix encoding framework and its inherent problems. Let $\mathbf{v} \in \mathbb{F}_q^n$ denote the cover-data and $\mathbf{m} \in \mathbb{F}_q^r$ the message. We are looking for two mappings, embedding Emb and extraction Ext, such that

$$\forall (\mathbf{v}, \mathbf{m}) \in \mathbb{F}_q^n \times \mathbb{F}_q^r, \quad \mathrm{Ext}(\mathrm{Emb}(\mathbf{v}, \mathbf{m})) = \mathbf{m}, \tag{2}$$

$$\forall (\mathbf{v}, \mathbf{m}) \in \mathbb{F}_q^n \times \mathbb{F}_q^r, \quad d(\mathbf{v}, \mathrm{Emb}(\mathbf{v}, \mathbf{m})) \le T. \tag{3}$$

Equation (2) means that we want to recover the message in all cases; (3) means that we authorize the modification of at most $T$ coordinates of the vector $\mathbf{v}$.

From the theory of error-correcting codes (Section 2), it is quite easy to show that the scheme defined by

$$\mathrm{Emb}(\mathbf{v}, \mathbf{m}) = \mathbf{v} + D(\mathbf{m} - E(\mathbf{v})), \tag{4}$$

$$\mathrm{Ext}(\mathbf{y}) = E(\mathbf{y}) = \mathbf{y} \times \mathbf{H}^t, \tag{5}$$

where $D$ and $E$ denote respectively the decoding function (which maps a syndrome to a coset leader of the corresponding coset) and the syndrome function, makes it possible to embed messages of length $r = n - k$ in cover-data of length $n$ while modifying at most $T = R$ elements of the cover-data. Indeed, since $E$ is linear and $E(D(\mathbf{s})) = \mathbf{s}$, we get $\mathrm{Ext}(\mathrm{Emb}(\mathbf{v}, \mathbf{m})) = E(\mathbf{v}) + \mathbf{m} - E(\mathbf{v}) = \mathbf{m}$.

The parameter $(n-k)/R$ represents the embedding efficiency, that is, the number of embedded symbols per embedding change. Linking symbols with bits is not simple, as naive solutions lead to bad results in terms of efficiency. For example, if elements of $\mathbb{F}_q$ are viewed as blocks of $L$ bits, modifying a symbol leads to roughly $L/2$ bit flips on average and $L$ in the worst case.

A problem raised by matrix encoding, as presented above, is that any position in the cover-data $\mathbf{v}$ can be changed. In some cases, it is more reasonable to keep some coordinates unchanged because changing them would produce artifacts that are too large in the stego-data.

1) Example: We now give an example of a steganographic protocol constructed from a linear single-error-correcting code. This was also discussed, for example, in [3]. Start from the matrix

$$\mathbf{H} = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{pmatrix}$$

whose entries are elements of $\mathbb{F}_2$. The extraction scheme is defined by

$$\mathrm{Ext} : \mathbb{F}_2^7 \to \mathbb{F}_2^3, \qquad \mathrm{Ext}(x_1, x_2, x_3, x_4, x_5, x_6, x_7) = (y_1, y_2, y_3),$$

where $y_1 = x_1 + x_4 + x_5 + x_7$, $y_2 = x_2 + x_4 + x_6 + x_7$ and $y_3 = x_3 + x_5 + x_6 + x_7$. This function can be described in terms of the matrix $\mathbf{H}$: $y_i$ is the dot product of $\mathbf{x}$ and the $i$-th row of $\mathbf{H}$. We claim that Ext is the extraction function of a $(7, 3, 1)$ steganographic protocol. For example, $\mathrm{Ext}(0, 0, 1, 1, 0, 1, 0) = (1, 0, 0)$. Assume $\mathbf{y} = (1, 1, 1)$. We claim that it is possible to replace $\mathbf{x} = (0, 0, 1, 1, 0, 1, 0)$ by $\mathbf{x}'$ such that $\mathrm{Ext}(\mathbf{x}') = (1, 1, 1)$ and $d(\mathbf{x}, \mathbf{x}') = 1$. In fact, we claim more: the coordinate where $\mathbf{x}$ has to be changed is uniquely determined. In our case, this is coordinate number 6, so $\mathbf{x}' = (0, 0, 1, 1, 0, 0, 0)$.

Here is the general embedding rule: form $\mathrm{Ext}(\mathbf{x}) + \mathbf{y}$ (in the example this is $011$). Find the column of $\mathbf{H}$ which has these entries (in our example, this is the sixth column). This marks the coordinate where $\mathbf{x}$ needs to be changed to embed the payload $\mathbf{y}$; if $\mathrm{Ext}(\mathbf{x}) + \mathbf{y}$ is zero, no change is needed. This procedure indicates how $\mathbf{H}$ and Ext were constructed and how this can be generalized: the columns of $\mathbf{H}$ are simply all nonzero 3-tuples in some order. In general, we start from our choice of $n$ and write a matrix $\mathbf{H}$ whose columns consist of all nonzero $n$-tuples. Then $\mathbf{H}$ has $N = 2^n - 1$ columns. The extraction function $\mathrm{Ext} : \mathbb{F}_2^N \to \mathbb{F}_2^n$ is defined by way of the dot products with the rows of $\mathbf{H}$. In the example, 3 bits are embedded with at most one change, so the embedding efficiency is 3.

IV. ASYMPTOTICALLY TIGHT BOUND ON THE PERFORMANCE OF EMBEDDING SCHEMES

Our goal in this section is to obtain asymptotically optimal steganographic protocols. First, we use the relationship established in Section 3 between steganographic protocols and error-correcting codes to translate bounds on error-correcting codes into upper and lower bounds on the maximum number of messages in a scheme. These bounds on error-correcting codes are known to be achieved by linear codes; we use this result to construct our protocols.

2) Proposition (Zhang [2], Theorem 6): The parameters $[n, k, \rho]$ of a steganographic protocol defined over a field $\mathbb{F}_q$ of cardinality $q$, where $k$ here denotes the number of embedded symbols, satisfy $q^k \le V_q(n, \rho)$, where $V_q(n, \rho) = \sum_{i=0}^{\rho} \binom{n}{i} (q-1)^i$ is the cardinality of a ball of radius $\rho$ in $\mathbb{F}_q^n$.

Proof. It suffices to prove the result for proper steganographic protocols. Take $\mathbf{x} \in \mathbb{F}_q^n$. For all $\mathbf{s} \in \mathbb{F}_q^k$ there exists $\mathbf{y} \in B(\mathbf{x}, \rho)$ such that $\mathrm{Ext}(\mathbf{y}) = \mathbf{s}$, hence $\mathrm{card}(B(\mathbf{x}, \rho)) \ge \mathrm{card}(\mathbb{F}_q^k)$.

This bound is analogous to the Hamming bound in coding theory. Thus, we can refer to it as the steganographic Hamming bound. Protocols reaching equality are called maximum length embeddable (MLE) in [2].

3) Proposition: Denote by $r_L(n, \rho)$ the largest possible value of $r = n - k$ for an $[n, k, \rho]$ steganographic protocol obtained from a linear $[n, k]$ code as in Section 3. Similarly, let $r(n, \rho)$ denote the largest possible value of $r = n - \log(M)$, where $M$ is the number of messages that can be hidden using binary coverwords of length $n$. Let us recall the following result:

$$\log\Bigl(\sum_{i=0}^{\rho} C_n^i\Bigr) - \log(n) \;\le\; r_L(n, \rho) \;\le\; r(n, \rho) \;\le\; \log\Bigl(\sum_{i=0}^{\rho} C_n^i\Bigr).$$

4) Corollary: Let $h(n, T)$ be the maximal number of bits embeddable by schemes using binary coverwords of length $n$ with quality threshold $T$. We have

$$r_L(n, T) \le h(n, T) \le r(n, T).$$

Proof. A steganographic protocol (with quality threshold $T$) is a slightly stronger condition than a $T$-covering, because a steganographic protocol generates not a single but $|X|$ disjoint $T$-coverings. In the particular case of linear coverings, however, these two notions coincide.

V. CONCLUSION

Applying error-correcting codes to data embedding improves the embedding efficiency and security of steganographic schemes. In this paper, we show some relations between steganographic algorithms and error-correcting codes. Using these relations, we give bounds on the performance of embedding schemes.

REFERENCES

[1] R. Crandall: Some notes on steganography, posted on Steganography Mailing List, 62 (1998), http://os.inf.tu-dresden.de/~westfeld/crandall.pdf.
[2] W. Zhang, S. Li: A coding problem in steganography, Des. Codes Cryptogr., 46 (2009), pp. 67-81.
[3] A. Westfeld: F5: A steganographic algorithm, High capacity despite better steganalysis. In: Moskowitz, I.S. (ed.) IH 2001, LNCS, vol. 2137, pp. 289-302. Springer, Heidelberg (2001).
[4] G. J. Simmons: The prisoners problem and the subliminal channel, in Advances in Cryptology, pp. 51-67, Plenum Press, New York, NY, USA, 1984.
[5] C. Cachin: An information-theoretic model for steganography, in D. Aucsmith (ed.): Information Hiding, 2nd International Workshop, LNCS vol. 1525, Springer-Verlag Berlin Heidelberg (1998), pp. 306-318.
[6] Y. Kim, Z. Duric, D. Richards: Modified matrix encoding technique for minimal distortion steganography. In: Camenisch, J.L., Collberg, C.S., Johnson, N.F., Sallee, P. (eds.) IH 2006, LNCS, vol. 4437, pp. 314-327 (2007).
[7] F.J. MacWilliams, N. Sloane: The Theory of Error Correcting Codes, North-Holland, Amsterdam, 1977.
[8] M.B. Medeni, E. Souidi: A Steganography Schema and Error-Correcting Codes, Journal of Theoretical and Applied Information Technology, Vol. 18, No. 1, pp. 42-47 (2010).
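As an illustration of the Example in Section III, the embedding rule (form Ext(x) + y, then flip the coordinate whose column of H matches) can be sketched in a few lines of Python. This is only a minimal sketch and not the authors' implementation: the function names `ext` and `emb` and the exhaustive check at the end are assumptions introduced here for illustration; only the matrix H, the cover word x and the payload y come from the text.

```python
from itertools import product

# Parity check matrix H from the Example in Section III (entries in GF(2)).
H = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]

def ext(x):
    """Extraction: the syndrome of x, i.e. dot products with the rows of H mod 2."""
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

def emb(x, y):
    """Embed payload y into cover word x by changing at most one coordinate."""
    s = tuple((a + b) % 2 for a, b in zip(ext(x), y))  # Ext(x) + y
    if not any(s):
        return tuple(x)  # x already carries y, so nothing is changed
    columns = [tuple(row[j] for row in H) for j in range(len(x))]
    j = columns.index(s)  # the unique column of H equal to Ext(x) + y
    stego = list(x)
    stego[j] ^= 1  # flip that single coordinate
    return tuple(stego)

# Values from the text: cover word x and payload y.
x = (0, 0, 1, 1, 0, 1, 0)
y = (1, 1, 1)
x_new = emb(x, y)
print(ext(x))      # (1, 0, 0)
print(x_new)       # (0, 0, 1, 1, 0, 0, 0), coordinate 6 flipped
print(ext(x_new))  # (1, 1, 1)

# Hypothetical exhaustive check over all 2^7 cover words and 2^3 payloads.
changes = []
for cover in product((0, 1), repeat=7):
    for msg in product((0, 1), repeat=3):
        stego = emb(cover, msg)
        assert ext(stego) == msg  # condition (2): the payload is recovered
        changes.append(sum(a != b for a, b in zip(cover, stego)))

max_changes = max(changes)                 # worst case, the covering radius R
avg_changes = sum(changes) / len(changes)  # average number of changes R_a
print(max_changes, avg_changes)            # 1 0.875
```

Under these assumptions the check confirms conditions (2) and (3) with T = 1, and it gives an average of 7/8 changes per 3 embedded bits, so the average distortion of this scheme is D = (7/8)/7 = 1/8.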