Steganography and Error-Correcting Codes by ijcsis

```
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 8, November 2010

Steganography and Error-Correcting Codes
M.B. Ould MEDENI and El Mamoun SOUIDI
Laboratory of Mathematic Informatics and Applications
University Mohammed V-Agdal, Faculty of Sciences
Rabat, BP 1014, Morocco
Email : sbaimedeni@yahoo.fr, souidi@fsr.ac.ma

Abstract: In this work, we study how error-correcting codes are used in steganographic protocols (an approach sometimes called "matrix encoding"), which use a linear code as an ingredient. Among all codes of a fixed block length and fixed dimension (and thus of a fixed information rate), an optimal code is one that comes closest to being maximum length embeddable (MLE). Steganographic protocols are closely related to error-correcting codes. We clarify this link, which leads to a bound on the maximum capacity that a steganographic protocol can have and gives a way to build optimal steganographic protocols.

Keywords: Steganography, error-correcting code, average distortion, matrix encoding, embedding efficiency.

I. INTRODUCTION

The goal of digital steganography is to modify a digital object (cover) to encode and conceal a sequence of bits (message) to facilitate covert communication. This process is fundamentally different from cryptography, where the existence of a secret message may be suspected by anyone who can observe the scrambled ciphertext while it is communicated.

A common technique in steganography is to embed the hidden message into a larger cover object (such as a digital image) by slightly distorting the cover object in a way that, on one hand, makes it possible for the intended recipient to extract the hidden message but, on the other hand, makes it very hard for everybody else to detect the distortion of the cover object (i.e., to detect the existence of the hidden message). The amount of noise that is naturally (inherently) present in the cover object determines the amount of distortion that can be introduced into the cover object before the distortion becomes detectable.

An interesting steganographic method is known as matrix encoding, introduced by Crandall [1]. Matrix encoding requires the sender and the recipient to agree in advance on a parity check matrix $H$, and the secret message is then extracted by the recipient as the syndrome (with respect to $H$) of the received cover object. This method was made popular by Westfeld [3], who incorporated a specific implementation using Hamming codes in his F5 algorithm, which can embed $t$ bits of message in $2^t - 1$ cover symbols by changing at most one of them. Matrix encoding using linear codes (syndrome coding) is a general approach to improving the embedding efficiency of steganographic schemes. The covering radius of the code corresponds to the maximal number of embedding changes needed to embed any message [8]. Steganographers, however, are more interested in the average number of embedding changes than in the worst case. In fact, the concept of embedding efficiency (the average number of bits embedded per embedding change) has been frequently used in steganography to compare and evaluate the performance of steganographic schemes.

Two parameters help to evaluate the performance of a steganographic protocol $[n, k, \rho]$ over a cover message of $N$ symbols: the average distortion $D = R_a / N$, where $R_a$ is the expected number of changes over uniformly distributed messages; and the embedding rate $E = t / N$, which is the amount of bits that can be hidden in a cover message. In general, for the same embedding rate, a method is better when the average distortion is smaller.

The remainder of the paper is organized as follows. Section II gives a brief introduction to error-correcting codes. Section III introduces the relations between error-correcting codes and steganography. We construct our approach and report on experimental results in Section IV. Section V gives a conclusion.

II. BASICS OF CODING THEORY

We now review some elementary concepts from coding theory that are relevant for our study. A good introductory text to this subject is, for example, [7]. Throughout the text, boldface symbols denote vectors or matrices. A binary code $C$ is any subset of the space of all $n$-bit column vectors $x = (x_1, ..., x_n) \in \{0, 1\}^n$. The vectors in $C$ are called codewords. The set $\{0, 1\}^n$ forms a linear vector space if we define the sum of two vectors and the multiplication of a vector by a scalar using the usual arithmetic in the finite field $GF(2)$; we will denote this field by $\mathbb{F}_2$. For any $C, D \subset \mathbb{F}_2^n$ and vector $x$, $C + D = \{y \in \mathbb{F}_2^n \mid y = c + d, c \in C, d \in D\}$ and $x + C = \{y \in \mathbb{F}_2^n \mid y = x + c, c \in C\}$. The Hamming weight $w$ of a vector $x$ is defined as the number of ones in $x$. The distance between two vectors $x$ and $y$ is the Hamming weight of their difference, $d(x, y) = w(x - y)$. For any $x \in \mathbb{F}_2^n$ and a positive real number $r$, we denote as $B(x, r)$ the ball with center $x$ and radius $r$, $B(x, r) = \{y \in \mathbb{F}_2^n \mid d(x, y) \le r\}$. We also define the distance between $x$ and a set $C \subset \mathbb{F}_2^n$ as $d(x, C) = \min_{c \in C} d(x, c)$. The covering radius $R$ of $C$ is defined as $R = \max_{x \in \mathbb{F}_2^n} d(x, C)$. The average distance to code $C$, defined as $R_a = 2^{-n} \sum_{x \in \mathbb{F}_2^n} d(x, C)$, is the average distance between a randomly selected vector from $\mathbb{F}_2^n$ and the code $C$. Clearly $R_a \le R$.

Linear codes are codes for which $C$ is a linear vector subspace of $\mathbb{F}_2^n$. If $C$ has dimension $k$, we call $C$ a linear code of length $n$ and dimension $k$ (and codimension $n - k$), or we say that $C$ is an $[n, k]$ code. Each linear code $C$ of dimension $k$ has a basis consisting of $k$ vectors. Writing the basis vectors as rows of a $k \times n$ matrix $G$, we obtain a generator matrix of $C$. Each codeword can be written as a linear combination of rows from $G$. There are $2^k$ codewords in an $[n, k]$ code. Given $x, y \in \mathbb{F}_2^n$, we define their dot product $x \cdot y = x_1 y_1 + x_2 y_2 + ... + x_n y_n$, all operations in $GF(2)$. We say that $x$ and $y$ are orthogonal if $x \cdot y = 0$. Given a code $C$, the dual code of $C$, denoted as $C^{\perp}$, is the set of all vectors $x$ that are orthogonal to all vectors in $C$. The dual code of an $[n, k]$ code is an $[n, n - k]$ code with an $(n - k) \times n$ generator matrix $H$ with the property that

$$Hx = 0 \Leftrightarrow x \in C. \qquad (1)$$

The matrix $H$ is called the parity check matrix of $C$. For any $x \in \mathbb{F}_2^n$, the vector $s = Hx$ is called the syndrome of $x$. For each syndrome $s \in \mathbb{F}_2^{n-k}$, the set $C(s) = \{x \in \mathbb{F}_2^n \mid Hx = s\}$ is called a coset. Note that $C(0) = C$. Obviously, cosets associated with different syndromes are disjoint. Also, from elementary linear algebra we know that every coset can be written as $C(s) = x + C$, where $x \in C(s)$ is arbitrary. Thus, there are $2^{n-k}$ disjoint cosets, each consisting of $2^k$ vectors. Any member of the coset $C(s)$ with the smallest Hamming weight is called a coset leader and will be denoted as $e_L(s)$.

1) Lemma: Given a coset $C(s)$, for any $x \in C(s)$, $d(x, C) = w(e_L(s))$. Moreover, if $d(x, C) = d(x, c')$ for some $c' \in C$, the vector $x - c'$ is a coset leader.

Proof: $d(x, C) = \min_{c \in C} w(x - c) = \min_{y \in C(s)} w(y) = w(e_L(s))$. The second equality follows from the fact that if $c$ goes through the code $C$, $x - c$ goes through all members of the coset $C(s)$.

2) Lemma: If $C$ is an $[n, k]$ code with an $(n - k) \times n$ parity check matrix $H$ and covering radius $R$, then any syndrome $s \in \mathbb{F}_2^{n-k}$ can be written as a sum of at most $R$ columns of $H$, and $R$ is the smallest such number. Thus, we can also define the covering radius as the maximal weight of all coset leaders.

Proof: Any $x \in \mathbb{F}_2^n$ belongs to exactly one coset $C(s)$, and from Lemma 1 we know that $d(x, C) = w(e_L(s))$. But the weight $w(e_L(s))$ is the smallest number of columns in $H$ that must be added to obtain $s$.

III. LINEAR CODES FOR STEGANOGRAPHY

The behavior of a steganographic algorithm can be sketched in the following way:
1) a cover-medium is processed to extract a sequence of symbols $v$, sometimes called cover-data;
2) $v$ is modified into $s$ to embed the message $m$; $s$ is sometimes called the stego-data;
3) modifications on $s$ are translated on the cover-medium to obtain the stego-medium.

Here, we assume that the detectability of the embedding increases with the number of symbols that must be changed to go from $v$ to $s$ (see [5] for some examples of this framework). Syndrome coding deals with this number of changes. The key idea is to use some syndrome computation to embed the message into the cover-data. In fact, such a scheme uses a linear code $C$, more precisely its cosets, to hide $m$. A word $s$ hides the message $m$ if $s$ lies in a particular coset of $C$, related to $m$. Since cosets are uniquely identified by the so-called syndromes, embedding (hiding) consists exactly in searching for $s$ with syndrome $m$, close enough to $v$.

A. Matrix Encoding

We first set up the notation and describe properly the matrix encoding framework and its inherent problems. Let $v \in \mathbb{F}_q^n$ denote the cover-data and $m \in \mathbb{F}_q^r$ the message. We are looking for two mappings, embedding Emb and extraction Ext, such that

$$\forall (v, m) \in \mathbb{F}_q^n \times \mathbb{F}_q^r, \quad Ext(Emb(v, m)) = m, \qquad (2)$$

$$\forall (v, m) \in \mathbb{F}_q^n \times \mathbb{F}_q^r, \quad d(v, Emb(v, m)) \le T. \qquad (3)$$

Equation (2) means that we want to recover the message in all cases; (3) means that we authorize the modification of at most $T$ coordinates in the vector $v$.

From error-correcting codes (Section II), it is quite easy to show that the scheme defined by

$$Emb(v, m) = v + D(m - E(v)), \qquad (4)$$

$$Ext(y) = E(y) = y \times H^t, \qquad (5)$$

where $D$ and $E$ denote respectively the decoding function (which returns a minimum-weight word with the given syndrome) and the syndrome function, makes it possible to embed messages of length $r = n - k$ in cover-data of length $n$, while modifying at most $T = R$ elements of the cover-data.

The parameter $\frac{n-k}{R}$ represents the embedding efficiency, that is, the number of embedded symbols per embedding change. Linking symbols with bits is not simple, as naive solutions lead to bad results in terms of efficiency. For example, if elements of $\mathbb{F}_q$ are viewed as blocks of $L$ bits, modifying a symbol roughly leads to $L/2$ bit flips on average and $L$ in the worst case.

A problem raised by matrix encoding, as presented above, is that any position in the cover-data $v$ can be changed. In some cases, it is more reasonable to keep some coordinates unchanged because they would produce too large artifacts in the stego-data.
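
The embedding and extraction maps of equations (4) and (5) can be sketched in Python for the binary case. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the 3 x 7 parity check matrix whose columns are all nonzero 3-tuples over F_2, and it builds the decoding function D by brute force over all 7-bit words. The names `ext`, `emb` and `coset_leaders` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Assumed parity check matrix: its columns are all nonzero 3-tuples over F_2.
H = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]
N = len(H[0])  # length n of the cover-data


def ext(y):
    """Ext(y) = E(y) = y H^t over F_2, i.e. the syndrome of y (equation (5))."""
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)


def coset_leaders():
    """Decoding function D: map each syndrome to a minimum-weight word having it."""
    leaders = {}
    # Words are visited by increasing Hamming weight, so the first word seen
    # for a given syndrome is a coset leader.
    for e in sorted(product((0, 1), repeat=N), key=sum):
        leaders.setdefault(ext(e), e)
    return leaders


leaders = coset_leaders()


def emb(v, m):
    """Emb(v, m) = v + D(m - E(v)) (equation (4)); over F_2, minus is plus."""
    s = tuple((a + b) % 2 for a, b in zip(m, ext(v)))
    return tuple((a + b) % 2 for a, b in zip(v, leaders[s]))


# Covering radius R (maximal leader weight) and average distance R_a
# (mean leader weight, since all cosets have the same size).
R = max(sum(e) for e in leaders.values())
Ra = sum(sum(e) for e in leaders.values()) / len(leaders)

v = (0, 0, 1, 1, 0, 1, 0)
m = (1, 1, 1)
stego = emb(v, m)
print(stego, ext(stego))  # (0, 0, 1, 1, 0, 0, 0) (1, 1, 1)
print(R, Ra)              # 1 0.875
```

With this assumed matrix, every 3-bit message is embedded in 7 cover bits by changing at most T = R = 1 position. The brute-force table is only practical for short codes; a real scheme would use a structured decoder.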

ISSN 1947-5500

1) Example: We now give an example of a steganographic protocol constructed from a linear single-error-correcting code. This was also discussed, for example, in [3]. Start from the matrix

$$H = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{pmatrix}$$

whose entries are elements of $\mathbb{F}_2$. The extracting scheme is defined as

$$Ext : \mathbb{F}_2^7 \to \mathbb{F}_2^3, \quad Ext(x_1, x_2, x_3, x_4, x_5, x_6, x_7) = (y_1, y_2, y_3),$$

where $y_1 = x_1 + x_4 + x_5 + x_7$, $y_2 = x_2 + x_4 + x_6 + x_7$ and $y_3 = x_3 + x_5 + x_6 + x_7$. This function can be described in terms of the matrix $H$: $y_i$ is the dot product of $x$ and the $i$-th row of $H$. We claim that Ext is the extracting function of a $(7, 3, 1)$ steganographic protocol. For example, $Ext(0, 0, 1, 1, 0, 1, 0) = (1, 0, 0)$. Assume $y = (1, 1, 1)$. We claim that it is possible to replace $x = (0, 0, 1, 1, 0, 1, 0)$ by $x'$ such that $Ext(x') = (1, 1, 1)$ and $d(x, x') = 1$. In fact, we claim more: the coordinate where $x$ has to be changed is uniquely determined. In our case, this is coordinate number 6, so $x' = (0, 0, 1, 1, 0, 0, 0)$.

Here is the general embedding rule: form $Ext(x) + y$ (in the example this is $011$). Find the column of $H$ which has these entries (in our example, this is the sixth column). This marks the coordinate where $x$ needs to be changed to embed payload $y$. This procedure indicates how $H$ and Ext were constructed and how this can be generalized: the columns of $H$ are simply all nonzero 3-tuples in some order. In general, we start from our choice of $n$ and write a matrix $H$ whose columns consist of all nonzero $n$-tuples. Then $H$ has $N = 2^n - 1$ columns. The extracting function $Ext : \mathbb{F}_2^N \to \mathbb{F}_2^n$ is defined by way of the dot products with the rows of $H$. Finally, in the example above the embedding efficiency is 3: three bits are embedded with at most one change.

IV. ASYMPTOTICALLY TIGHT BOUND ON THE PERFORMANCE OF EMBEDDING SCHEMES

Our goal in this section is to obtain asymptotically optimal steganographic protocols. First, the relationship established in Section III between steganographic protocols and error-correcting codes is used to translate bounds on error-correcting codes into upper and lower bounds on the maximum number of messages in a scheme. The bounds on error-correcting codes are known to be achieved by linear codes; we use this result to construct our protocols.

2) Proposition (Zhang [2], Theorem 6): The parameters $[n, k, \rho]$ of a steganographic protocol defined over a field $\mathbb{F}_q$ of cardinality $q$ satisfy $q^k \le V_q(n, \rho)$, where $V_q(n, \rho) = card(B(x, \rho))$ is the number of words in a ball of radius $\rho$.

Proof. It suffices to prove the result for proper steganographic protocols. Take $x \in \mathbb{F}_q^n$. For all $s \in \mathbb{F}_q^k$ there exists $y \in B(x, \rho)$ such that $Ext(y) = s$, hence $card(B(x, \rho)) \ge card(\mathbb{F}_q^k)$.

This bound is analogous to the Hamming bound in coding theory. Thus, we can refer to it as the steganographic Hamming bound. Protocols reaching equality are called maximum length embeddable (MLE) in [2].

3) Proposition: Denote by $r_L(n, \rho)$ the largest possible value of $r = n - k$ for an $[n, k, \rho]$ steganographic protocol. Similarly, let $r(n, \rho)$ denote the largest possible value of $r = n - \log(M)$, where $M$ is the number of messages that can be hidden using binary coverwords of length $n$. Let us recall the following result, in which $C_n^i$ denotes the binomial coefficient:

$$\log\left(\sum_{i=0}^{\rho} C_n^i\right) - \log(n) \le r_L(n, \rho) \le r(n, \rho) \le \log\left(\sum_{i=0}^{\rho} C_n^i\right)$$

4) Corollary: Let $h(n, T)$ be the maximal number of bits embeddable by schemes using binary coverwords of length $n$ with quality threshold $T$. We have

$$r_L(n, T) \le h(n, T) \le r(n, T)$$

Proof. A steganographic protocol (with quality threshold $T$) satisfies a slightly stronger condition than a $T$-covering, because a steganographic protocol generates not a single, but $|X|$ disjoint $T$-coverings. In the particular case of linear coverings, however, these two notions coincide.

V. CONCLUSION

Application of error-correcting codes to data embedding improves the embedding efficiency and security of steganographic schemes. In this paper, we show some relations between steganographic algorithms and error-correcting codes. Using these relations, we give some bounds on the performance of embedding schemes.

REFERENCES

[1] R. Crandall: Some notes on steganography, posted on Steganography Mailing List, 62 (1998), http://os.inf.tu-dresden.de/ westfeld/crandall.pdf.
[2] W. Zhang, S. Li: A coding problem in steganography, Des. Codes Cryptogr., 46 (2009), pp. 67-81.
[3] A. Westfeld: F5: A steganographic algorithm, High capacity despite better steganalysis. In: Moskowitz, I.S. (ed.) IH 2001, LNCS, vol. 2137, pp. 289-302. Springer, Heidelberg (2001).
[4] G. J. Simmons: The prisoners problem and the subliminal channel, in Advances in Cryptology, pp. 51-67, Plenum Press, New York, NY, USA, 1984.
[5] C. Cachin: An information-theoretic model for steganography, in D. Aucsmith (ed.): Information Hiding, 2nd International Workshop, LNCS, vol. 1525, pp. 306-318. Springer-Verlag, Berlin Heidelberg (1998).
[6] Y. Kim, Z. Duric, D. Richards: Modified matrix encoding technique for minimal distortion steganography. In: Camenisch, J.L., Collberg, C.S., Johnson, N.F., Sallee, P. (eds.) IH 2006, LNCS, vol. 4437, pp. 314-327 (2007).
[7] F.J. MacWilliams, N. Sloane: The Theory of Error Correcting Codes, North-Holland, Amsterdam, 1977.
[8] M.B. Medeni, E.M. Souidi: A Steganography Schema and Error-Correcting Codes, Journal of Theoretical and Applied Information Technology, Vol. 18, No. 1, pp. 42-47 (2010).
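
As a numerical check of the steganographic Hamming bound q^k <= V_q(n, rho) from Section IV, the short Python sketch below computes the ball volume and tests whether given parameters satisfy the bound or reach equality (MLE). It assumes the standard formula for the size of a Hamming ball, V_q(n, rho) = sum over i from 0 to rho of C(n, i) (q - 1)^i, which the text does not spell out. The function names are hypothetical.

```python
from math import comb


def ball_volume(q, n, rho):
    """V_q(n, rho): number of words of F_q^n within Hamming distance rho of a point."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(rho + 1))


def satisfies_bound(q, n, k, rho):
    """Steganographic Hamming bound: q^k <= V_q(n, rho)."""
    return q ** k <= ball_volume(q, n, rho)


def is_mle(q, n, k, rho):
    """Maximum length embeddable: the bound holds with equality."""
    return q ** k == ball_volume(q, n, rho)


# The (7, 3, 1) binary protocol of the example: 2^3 = 8 = 1 + 7.
print(ball_volume(2, 7, 1))                             # 8
print(satisfies_bound(2, 7, 3, 1), is_mle(2, 7, 3, 1))  # True True
# Embedding 4 bits in 7 cover bits with at most one change is ruled out.
print(satisfies_bound(2, 7, 4, 1))                      # False
```

Under these assumptions the (7, 3, 1) protocol meets the bound with equality, so it is MLE in the sense of [2].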