					                                                       (IJCSIS) International Journal of Computer Science and Information Security,
                                                       Vol. 8, No. 8, November 2010

            Steganography and Error-Correcting Codes
                                       M.B. Ould MEDENI and El Mamoun SOUIDI
                                     Laboratory of Mathematic Informatics and Applications
                                      University Mohammed V-Agdal, Faculty of Sciences
                                                   Rabat, BP 1014, Morocco

   Abstract: In this work, we study how error-correcting codes are used in steganographic
protocols (an approach sometimes called "matrix encoding"), which take a linear code as an
ingredient. Among all codes of a fixed block length and fixed dimension (and thus of a fixed
information rate), an optimal code is one that is maximum length embeddable (MLE).
Steganographic protocols are closely related to error-correcting codes. We clarify this link,
which leads to a bound on the maximum capacity of a steganographic protocol and gives a way
to build optimal steganographic protocols.

   Keywords: Steganography, error-correcting code, average distortion, matrix encoding,
embedding efficiency.

                      I. INTRODUCTION
   The goal of digital steganography is to modify a digital object (cover) to encode and
conceal a sequence of bits (message) to facilitate covert communication. This process is
fundamentally different from cryptography, where the existence of a secret message may be
suspected by anyone who can observe the scrambled ciphertext while it is communicated.

   A common technique in steganography is to embed the hidden message into a larger cover
object (such as a digital image) by slightly distorting the cover object in a way that, on
one hand, makes it possible for the intended recipient to extract the hidden message but, on
the other hand, makes it very hard for everybody else to detect the distortion of the cover
object (i.e., to detect the existence of the hidden message). The amount of noise that is
naturally (inherently) present in the cover object determines the amount of distortion that
can be introduced into the cover object before the distortion becomes detectable.

   An interesting steganographic method is known as matrix encoding, introduced by
Crandall [1]. Matrix encoding requires the sender and the recipient to agree in advance on a
parity check matrix H; the secret message is then extracted by the recipient as the syndrome
(with respect to H) of the received cover object. This method was made popular by
Westfeld [3], who incorporated a specific implementation using Hamming codes in his F5
algorithm, which can embed t bits of message in $2^t - 1$ cover symbols by changing at most
one of them. Matrix encoding using linear codes (syndrome coding) is a general approach to
improving the embedding efficiency of steganographic schemes. The covering radius of the code
corresponds to the maximal number of embedding changes needed to embed any message [8].
Steganographers, however, are more interested in the average number of embedding changes than
in the worst case. In fact, the concept of embedding efficiency (the average number of bits
embedded per embedding change) has frequently been used in steganography to compare and
evaluate the performance of steganographic schemes.

   Two parameters help to evaluate the performance of a steganographic protocol [n, k, ρ]
over a cover message of N symbols: the average distortion $D = R_a / N$, where $R_a$ is the
expected number of changes over uniformly distributed messages, and the embedding rate
$E = k / N$, which measures the amount of bits that can be hidden in a cover message. In
general, for the same embedding rate, a method is better when the average distortion is
smaller.

   The remainder of the paper is organized as follows. Section 2 gives a brief introduction
to error-correcting codes. In Section 3 we introduce the relations between error-correcting
codes and steganography. We construct our approach and report on experimental results in
Section 4. Section 5 gives a conclusion.

                      II. BASICS OF CODING THEORY
   We now review some elementary concepts from coding theory that are relevant for our study.
A good introductory text on this subject is, for example, [7]. Throughout the text, boldface
symbols denote vectors or matrices. A binary code C is any subset of the space of all n-bit
column vectors $x = (x_1, ..., x_n) \in \{0, 1\}^n$. The vectors in C are called codewords.
The set $\{0, 1\}^n$ forms a linear vector space if we define the sum of two vectors and the
multiplication of a vector by a scalar using the usual arithmetic in the finite field GF(2);
we will denote this field by $F_2$. For any $C, D \subset F_2^n$ and vector x,
$C + D = \{y \in F_2^n \mid y = c + d, c \in C, d \in D\}$ and
$x + C = \{y \in F_2^n \mid y = x + c, c \in C\}$. The Hamming weight w of a vector x is
defined as the number of ones in x. The distance between two vectors x and y is the Hamming
weight of their difference, $d(x, y) = w(x - y)$. For any $x \in F_2^n$ and a positive real
number r, we denote as B(x, r) the ball with center x and radius r,
$B(x, r) = \{y \in F_2^n \mid d(x, y) \le r\}$. We also define the distance between x and a
set $C \subset F_2^n$ as $d(x, C) = \min_{c \in C} d(x, c)$. The covering radius R of C is
defined as $R = \max_{x \in F_2^n} d(x, C)$. The average distance to code C, defined as
$R_a = 2^{-n} \sum_{x \in F_2^n} d(x, C)$, is the average
distance between a randomly selected vector from $F_2^n$ and the code C. Clearly $R_a \le R$.

   Linear codes are codes for which C is a linear vector subspace of $F_2^n$. If C has
dimension k, we call C a linear code of length n and dimension k (and codimension n − k), or
we say that C is an [n, k] code. Each linear code C of dimension k has a basis consisting of
k vectors. Writing the basis vectors as rows of a k × n matrix G, we obtain a generator
matrix of C. Each codeword can be written as a linear combination of rows of G. There are
$2^k$ codewords in an [n, k] code. Given $x, y \in F_2^n$, we define their dot product
$x \cdot y = x_1 y_1 + x_2 y_2 + ... + x_n y_n$, with all operations in GF(2). We say that x
and y are orthogonal if $x \cdot y = 0$. Given a code C, the dual code of C, denoted
$C^\perp$, is the set of all vectors x that are orthogonal to all vectors in C. The dual code
of an [n, k] code is an [n, n − k] code with an (n − k) × n generator matrix H with the
property that

                      $Hx = 0 \Leftrightarrow x \in C.$                        (1)

The matrix H is called the parity check matrix of C. For any $x \in F_2^n$, the vector
$s = Hx$ is called the syndrome of x. For each syndrome $s \in F_2^{n-k}$, the set
$C(s) = \{x \in F_2^n \mid Hx = s\}$ is called a coset. Note that C(0) = C. Obviously, cosets
associated with different syndromes are disjoint. Also, from elementary linear algebra we
know that every coset can be written as C(s) = x + C, where $x \in C(s)$ is arbitrary. Thus,
there are $2^{n-k}$ disjoint cosets, each consisting of $2^k$ vectors. Any member of the
coset C(s) with the smallest Hamming weight is called a coset leader and will be denoted
$e_L(s)$.

   1) Lemma: Given a coset C(s), for any $x \in C(s)$, $d(x, C) = w(e_L(s))$. Moreover, if
$d(x, C) = d(x, c')$ for some $c' \in C$, the vector $x - c'$ is a coset leader.
   Proof: $d(x, C) = \min_{c \in C} w(x - c) = \min_{y \in C(s)} w(y) = w(e_L(s))$. The second
equality follows from the fact that as c ranges over the code C, x − c ranges over all
members of the coset C(s).

   2) Lemma: If C is an [n, k] code with an (n − k) × n parity check matrix H and covering
radius R, then any syndrome $s \in F_2^{n-k}$ can be written as a sum of at most R columns of
H, and R is the smallest such number. Thus, we can also define the covering radius as the
maximal weight of all coset leaders.
   Proof: Any $x \in F_2^n$ belongs to exactly one coset C(s), and from Lemma 1 we know that
$d(x, C) = w(e_L(s))$. But the weight $w(e_L(s))$ is the smallest number of columns of H that
must be added to obtain s.

   The behavior of a steganographic algorithm can be sketched in the following way:
   1) a cover-medium is processed to extract a sequence of symbols v, sometimes called
      cover-data;
   2) v is modified into s to embed the message m; s is sometimes called the stego-data;
   3) modifications on s are translated to the cover-medium to obtain the stego-medium.
   Here, we assume that the detectability of the embedding increases with the number of
symbols that must be changed to go from v to s (see [5] for some examples of this framework).
Syndrome coding deals with this number of changes. The key idea is to use a syndrome
computation to embed the message into the cover-data. Such a scheme uses a linear code C, or
more precisely its cosets, to hide m. A word s hides the message m if s lies in a particular
coset of C related to m. Since cosets are uniquely identified by their syndromes, embedding
consists exactly in searching for a word s with syndrome m that is close enough to v.

A. Matrix Encoding
   We first set up the notation and describe the matrix encoding framework and its inherent
problems. Let $v \in F_q^n$ denote the cover-data and $m \in F_q^r$ the message. We are
looking for two mappings, embedding Emb and extraction Ext, such that

         $\forall (v, m) \in F_q^n \times F_q^r, \quad Ext(Emb(v, m)) = m$                   (2)

and

         $\forall (v, m) \in F_q^n \times F_q^r, \quad d(v, Emb(v, m)) \le T.$               (3)

Equation (2) means that we want to recover the message in all cases; (3) means that we allow
the modification of at most T coordinates of the vector v.
   From the theory of error-correcting codes (Section 2), it is easy to show that the scheme
defined by

         $Emb(v, m) = v + D(m - E(v))$                                                       (4)

         $Ext(y) = E(y) = y \times H^t,$                                                     (5)

where D and E denote respectively the decoding function and the syndrome function, makes it
possible to embed messages of length r = n − k in cover-data of length n while modifying at
most T = R elements of the cover-data.
   The ratio $(n - k)/R$ represents the embedding efficiency, that is, the number of embedded
symbols per embedding change in the worst case. Linking symbols with bits is not simple, as
naive solutions lead to bad results in terms of efficiency. For example, if elements of
$F_q$ are viewed as blocks of L bits, modifying a symbol leads to roughly $L/2$ bit flips on
average and L in the worst case.

   A problem raised by matrix encoding, as presented above, is that any position in the
cover-data v can be changed. In some cases, it is more reasonable to keep some coordinates
unchanged because changing them would produce artifacts that are too large in the stego-data.
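
   To make equations (4) and (5) concrete, the following Python sketch implements syndrome
embedding over $F_2$ by brute force. It is a minimal illustration under stated assumptions,
not the authors' implementation: the matrix H (a parity check matrix of the binary [7, 4]
Hamming code), the function names, and the exhaustive coset-leader table used as the decoding
function D are all chosen here for illustration, and the table is practical only for small n.

```python
from itertools import product

# Assumed example: a parity check matrix of the binary [7, 4] Hamming code.
H = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]


def syndrome(H, y):
    """Syndrome function E(y) = y * H^t, computed over F_2."""
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)


def coset_leaders(H):
    """Decoding function D as a table: syndrome -> one minimum-weight vector.

    Vectors are visited by increasing Hamming weight, so the first vector
    seen for each syndrome is a coset leader. Exhaustive, hence small n only.
    """
    n = len(H[0])
    leaders = {}
    for e in sorted(product((0, 1), repeat=n), key=sum):
        leaders.setdefault(syndrome(H, e), e)
    return leaders


def embed(H, leaders, v, m):
    """Emb(v, m) = v + D(m - E(v)); in F_2 subtraction equals addition."""
    target = tuple((a + b) % 2 for a, b in zip(m, syndrome(H, v)))
    e = leaders[target]
    return tuple((a + b) % 2 for a, b in zip(v, e))


def extract(H, y):
    """Ext(y) = E(y)."""
    return syndrome(H, y)


leaders = coset_leaders(H)
cover = (0, 0, 1, 1, 0, 1, 0)
message = (1, 1, 1)
stego = embed(H, leaders, cover, message)
changes = sum(a != b for a, b in zip(cover, stego))
```

   With this H every nonzero syndrome is a single column, so the heaviest coset leader has
weight 1 and `changes` never exceeds T = R = 1, in line with the bound in (3).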


   1) Example: We now give an example of a steganographic protocol constructed from a linear
single-error-correcting code. This was also discussed, for example, in [3]. Start from the
matrix

         $H = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{pmatrix}$

whose entries are elements of $F_2$. The extraction scheme is defined by

         $Ext : F_2^7 \to F_2^3,$

$Ext(x_1, x_2, x_3, x_4, x_5, x_6, x_7) = (y_1, y_2, y_3)$, where $y_1 = x_1 + x_4 + x_5 + x_7$,
$y_2 = x_2 + x_4 + x_6 + x_7$ and $y_3 = x_3 + x_5 + x_6 + x_7$. This function can be
described in terms of the matrix H: $y_i$ is the dot product of x and the i-th row of H. We
claim that Ext is the extraction function of a (7, 3, 1) steganographic protocol. For example,
Ext(0, 0, 1, 1, 0, 1, 0) = (1, 0, 0). Assume y = (1, 1, 1). We claim that it is possible to
replace x = (0, 0, 1, 1, 0, 1, 0) by x' such that Ext(x') = (1, 1, 1) and d(x, x') = 1. In
fact, we claim more: the coordinate where x has to be changed is uniquely determined. In our
case, this is coordinate number 6, so x' = (0, 0, 1, 1, 0, 0, 0). Here is the general
embedding rule: form Ext(x) + y (in the example this is 011). Find the column of H which has
these entries (in our example, this is the sixth column). This marks the coordinate where x
needs to be changed to embed the payload y. This procedure indicates how H and Ext were
constructed and how this can be generalized: the columns of H are simply all nonzero 3-tuples
in some order. In general, we start from our choice of n and write a matrix H whose columns
consist of all nonzero n-tuples. Then H has $N = 2^n - 1$ columns. The extraction function
$Ext : F_2^N \to F_2^n$ is defined by way of the dot products with the rows of H. Finally, for
the (7, 3, 1) example, the embedding efficiency is 3.

         IV. ASYMPTOTICALLY TIGHT BOUND ON THE PERFORMANCE OF EMBEDDING SCHEMES
   Our goal in this section is to obtain asymptotically optimal steganographic protocols. We
first use the relationship established in Section 3 between steganographic protocols and
error-correcting codes to translate bounds on error-correcting codes into upper and lower
bounds on the maximum number of messages in a scheme. These bounds on error-correcting codes
are known to be achieved by linear codes; we use this result to construct our protocols.

   2) Proposition (Zhang [2], Theorem 6): The parameters [n, k, ρ] of a steganographic
protocol defined over a field $F_q$ of cardinality q satisfy $q^k \le V_q(n, \rho)$, where
$V_q(n, \rho)$ denotes the number of vectors in a ball of radius ρ in $F_q^n$.
   Proof. It suffices to prove the result for proper steganographic protocols. Take
$x \in F_q^n$. For all $s \in F_q^k$ there exists $y \in B(x, \rho)$ such that Ext(y) = s,
hence $card(B(x, \rho)) \ge card(F_q^k)$.

   This bound is analogous to the Hamming bound in coding theory. Thus, we can refer to it as
the steganographic Hamming bound. Protocols reaching equality are called maximum length
embeddable (MLE) in [2].

   3) Proposition: Denote by $r_L(n, \rho)$ the largest possible value of r = n − k for an
[n, k, ρ] steganographic protocol. Similarly, let $r(n, \rho)$ denote the largest possible
value of r = n − log(M), where M is the number of messages that can be hidden using binary
coverwords of length n. Let us recall the following result:

         $\log\left(\sum_{i=0}^{\rho} \binom{n}{i}\right) - \log(n) \le r_L(n, \rho) \le r(n, \rho) \le \log\left(\sum_{i=0}^{\rho} \binom{n}{i}\right)$

   4) Corollary: Let h(n, T) be the maximal number of bits embeddable by schemes using binary
coverwords of length n with quality threshold T. We have

         $r_L(n, T) \le h(n, T) \le r(n, T).$

   Proof. A steganographic protocol (with quality threshold T) is a slightly stronger
condition than a T-covering, because a steganographic protocol generates not a single but
|X| disjoint T-coverings. In the particular case of linear coverings, however, these two
notions coincide.

                      V. CONCLUSION
   The application of error-correcting codes to data embedding improves the embedding
efficiency and security of steganographic schemes. In this paper, we show some relations
between steganographic algorithms and error-correcting codes. Using these relations, we give
bounds on the performance of embedding schemes.

                      REFERENCES
[1] R. Crandall, Some notes on steganography, posted on the Steganography Mailing List,
    62 (1998), westfeld/crandall.pdf.
[2] W. Zhang, S. Li, A coding problem in steganography, Des. Codes Cryptogr. 46 (2009),
    pp. 67-81.
[3] A. Westfeld, F5: A steganographic algorithm, high capacity despite better steganalysis,
    in: I.S. Moskowitz (ed.), IH 2001, LNCS vol. 2137, pp. 289-302, Springer, Heidelberg
    (2001).
[4] G. J. Simmons, The prisoners' problem and the subliminal channel, in Advances in
    Cryptology, pp. 51-67, Plenum Press, New York, NY, USA, 1984.
[5] C. Cachin, An information-theoretic model for steganography, in: D. Aucsmith (ed.),
    Information Hiding, 2nd International Workshop, LNCS vol. 1525, Springer-Verlag, Berlin
    Heidelberg (1998), pp. 306-318.
[6] Y. Kim, Z. Duric, D. Richards, Modified matrix encoding technique for minimal distortion
    steganography, in: J.L. Camenisch, C.S. Collberg, N.F. Johnson, P. Sallee (eds.),
    IH 2006, LNCS vol. 4437, pp. 314-327 (2007).
[7] F.J. MacWilliams, N.J.A. Sloane, The Theory of Error-Correcting Codes, North-Holland,
    Amsterdam, 1977.
[8] M.B. Medeni, E. Souidi, A Steganography Schema and Error-Correcting Codes, Journal of
    Theoretical and Applied Information Technology, Vol. 18, No. 1, pp. 42-47 (2010).
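
   The steganographic Hamming bound of Proposition 2 can be checked numerically. The Python
sketch below assumes the standard ball volume $V_q(n, \rho) = \sum_{i=0}^{\rho} \binom{n}{i} (q - 1)^i$;
the function names are hypothetical and serve only to illustrate the inequality
$q^k \le V_q(n, \rho)$ and the MLE (equality) case.

```python
from math import comb


def ball_volume(q, n, rho):
    """Number of vectors of F_q^n within Hamming distance rho of a fixed vector."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(rho + 1))


def satisfies_bound(q, n, k, rho):
    """True if [n, k, rho] is compatible with q^k <= V_q(n, rho)."""
    return q ** k <= ball_volume(q, n, rho)


def is_mle(q, n, k, rho):
    """True if the parameters reach the bound with equality (MLE case)."""
    return q ** k == ball_volume(q, n, rho)


# Parameters of the binary protocols built from Hamming codes: (2^t - 1, t, 1).
hamming_family = [(2 ** t - 1, t, 1) for t in range(1, 10)]
all_mle = all(is_mle(2, n, k, rho) for n, k, rho in hamming_family)
```

   For instance, the (7, 3, 1) protocol of the Example gives $2^3 = 8 = V_2(7, 1)$, so it
reaches the bound with equality, whereas no binary protocol can embed 4 bits in 7 cover
symbols with at most one change.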

                                                                                                       ISSN 1947-5500