					1432                                                 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING,                    VOL. 23,   NO. 9,      SEPTEMBER 2011




Concise Papers

A Privacy-Preserving Remote Data Integrity Checking Protocol with Data Dynamics and Public Verifiability

Zhuo Hao, Sheng Zhong, Member, IEEE, and Nenghai Yu, Member, IEEE

• Z. Hao and N. Yu are with the Department of Electronic Engineering and Information Science, University of Science and Technology of China (USTC), Hefei, Anhui 230027, China. E-mail: hzhuo@mail.ustc.edu.cn, ynh@ustc.edu.cn.
• S. Zhong is with the Department of Computer Science and Engineering, SUNY Buffalo, 201 Bell Hall, Amherst NY 14260. E-mail: szhong@buffalo.edu.

Manuscript received 21 May 2010; revised 6 Oct. 2010; accepted 25 Nov. 2010; published online 7 Mar. 2011.
Recommended for acceptance by C. Clifton.
For information on obtaining reprints of this article, please send e-mail to: tkde@computer.org, and reference IEEECS Log Number TKDE-2010-05-0295.
Digital Object Identifier no. 10.1109/TKDE.2011.62.
1041-4347/11/$26.00 © 2011 IEEE. Published by the IEEE Computer Society.

Abstract: Remote data integrity checking is a crucial technology in cloud computing. Recently, many works focus on providing data dynamics and/or public verifiability to this type of protocol. Existing protocols can support both features with the help of a third-party auditor. In a previous work, Sebé et al. [1] propose a remote data integrity checking protocol that supports data dynamics. In this paper, we adapt Sebé et al.'s protocol to support public verifiability. The proposed protocol supports public verifiability without the help of a third-party auditor. In addition, the proposed protocol does not leak any private information to third-party verifiers. Through a formal analysis, we show the correctness and security of the protocol. After that, through theoretical analysis and experimental results, we demonstrate that the proposed protocol has good performance.

Index Terms: Data integrity, data dynamics, public verifiability, privacy.

1 INTRODUCTION

STORING data in the cloud has become a trend [2]. An increasing number of clients store their important data in remote servers in the cloud, without leaving a copy in their local computers. Sometimes the data stored in the cloud is so important that the clients must ensure it is not lost or corrupted. While it is easy to check data integrity after completely downloading the data to be checked, downloading large amounts of data just for checking data integrity is a waste of communication bandwidth. Hence, much work [1], [3], [4], [5], [6], [7], [8], [9] has been done on designing remote data integrity checking protocols, which allow data integrity to be checked without completely downloading the data.

Remote data integrity checking is first introduced in [10], [11], which independently propose RSA-based methods for solving this problem. After that, Shah et al. [12] propose a remote storage auditing method based on precomputed challenge-response pairs. Recently, many works [1], [3], [4], [5], [6], [7], [8], [9], [13], [14], [15] focus on providing three advanced features for remote data integrity checking protocols: data dynamics [5], [6], [8], [14], public verifiability [3], [8], [9], [14], and privacy against verifiers [9], [14]. The protocols in [5], [6], [7], [8], [14] support data dynamics at the block level, including block insertion, block modification, and block deletion. The protocol of [3] supports the data append operation. In addition, [1] can be easily adapted to support data dynamics. Protocols in [9], [13] can be adapted to support data dynamics by using the techniques of [8]. On the other hand, protocols in [3], [8], [9], [13], [14], [15] support public verifiability, by which anyone (not just the client) can perform the integrity checking operation. The protocols in [9], [13], [14], [15] support privacy against third-party verifiers. We compare the proposed protocol with selected previous protocols (see Table 1).

In this paper, we have the following main contributions:

• We propose a remote data integrity checking protocol for cloud storage, which can be viewed as an adaptation of Sebé et al.'s protocol [1]. The proposed protocol inherits the support of data dynamics from [1], and supports public verifiability and privacy against third-party verifiers, while at the same time it does not need to use a third-party auditor.
• We give a security analysis of the proposed protocol, which shows that it is secure against the untrusted server and private against third-party verifiers.
• We have theoretically analyzed and experimentally tested the efficiency of the protocol. Both theoretical and experimental results demonstrate that our protocol is efficient.

The rest of this paper is organized as follows. In Section 2, technical preliminaries are presented. In Section 3, the proposed remote data integrity checking protocol is presented. In Section 4, a formal analysis of the proposed protocol is presented. In Section 5, we describe the support of data dynamics of the proposed protocol. In Section 6, the protocol's complexity is analyzed in the aspects of communication, computation, and storage costs; furthermore, experimental results are presented for the efficiency of the protocol. Finally, conclusions and possible future work are presented in Section 7.

2 TECHNICAL PRELIMINARIES

We consider a cloud storage system in which there are a client and an untrusted server. The client stores her data in the server without keeping a local copy. Hence, it is of critical importance that the client should be able to verify the integrity of the data stored in the remote untrusted server. If the server modifies any part of the client's data, the client should be able to detect it; furthermore, any third-party verifier should also be able to detect it. In case a third-party verifier verifies the integrity of the client's data, the data should be kept private against the third-party verifier. Below we present a formal statement of the problem.

Problem formulation. Denote by $m$ the file that will be stored in the untrusted server, which is divided into $n$ blocks of equal lengths: $m = m_1 m_2 \ldots m_n$, where $n = \lceil |m|/l \rceil$. Here, $l$ is the length of each file block. Denote by $f_K(\cdot)$ a pseudo-random function which is defined as

$$f : \{0,1\}^k \times \{0,1\}^{\log_2(n)} \rightarrow \{0,1\}^d,$$

in which $k$ and $d$ are two security parameters. Furthermore, denote the length of $N$ in bits by $|N|$.

We need to design a remote data integrity checking protocol that includes the following five functions: SetUp, TagGen, Challenge, GenProof, and CheckProof.

SetUp$(1^k) \rightarrow (pk, sk)$. Given the security parameter $k$, this function generates the public key $pk$ and the secret key $sk$. $pk$ is public to everyone, while $sk$ is kept secret by the client.

TagGen$(pk, sk, m) \rightarrow D_m$. Given $pk$, $sk$, and $m$, this function computes a verification tag $D_m$ and makes it publicly known to everyone. This tag will be used for public verification of data integrity.
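The block decomposition in the problem formulation above, $n = \lceil |m|/l \rceil$ blocks of equal length $l$, can be illustrated with a short Python sketch. This is only an illustration under assumptions that the paper does not state: the file is treated as a byte string, the last block is padded with zero bytes, each block is read as a big-endian integer, and the name `split_into_blocks` is hypothetical.

```python
import math


def split_into_blocks(data: bytes, block_len: int) -> list:
    """Split a file into n = ceil(|m| / l) equal-length blocks.

    Assumptions made for this sketch (not taken from the paper):
    the final block is zero-padded, and each block is interpreted
    as a big-endian integer m_i.
    """
    if block_len <= 0:
        raise ValueError("block_len must be positive")
    n = math.ceil(len(data) / block_len)
    # Pad on the right so that every block has exactly block_len bytes.
    padded = data.ljust(n * block_len, b"\x00")
    return [
        int.from_bytes(padded[i * block_len:(i + 1) * block_len], "big")
        for i in range(n)
    ]
```

With a 7-byte file and $l = 3$ bytes, the sketch yields $n = 3$ blocks, the last one padded by two zero bytes.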


                                                                   TABLE 1
                                        Comparisons between the Proposed Protocol and Previous Protocols




Challenge$(pk, D_m) \rightarrow chal$. Using this function, the verifier generates a challenge $chal$ to request the integrity proof of file $m$. The verifier sends $chal$ to the server.

GenProof$(pk, D_m, m, chal) \rightarrow R$. Using this function, the server computes a response $R$ to the challenge $chal$. The server sends $R$ back to the verifier.

CheckProof$(pk, D_m, chal, R) \rightarrow \{$"success", "failure"$\}$. The verifier checks the validity of the response $R$. If it is valid, the function outputs "success"; otherwise the function outputs "failure." The secret key $sk$ is not needed in the CheckProof function.

Security requirements. There are two security requirements for the remote data integrity checking protocol: security against the server with public verifiability, and privacy against third-party verifiers. We first give the definition of security against the server with public verifiability. In this definition, we have two entities: a challenger that stands for either the client or any third-party verifier, and an adversary that stands for the untrusted server.

Definition 1 (Security against the Server with Public Verifiability [3]). We consider a game between a challenger and an adversary that has four phases: Setup, Query, Challenge, and Forge.

• Setup. The challenger runs the SetUp function and gets $(pk, sk)$. The challenger sends $pk$ to the adversary and keeps $sk$ secret.
• Query. The adversary adaptively selects some file blocks $m_i$, $i = 1, 2, \ldots, n$, and queries the verification tags from the challenger. The challenger computes a verification tag $D_i$ for each of these blocks and sends $D_i$, $i = 1, 2, \ldots, n$, to the adversary. According to the protocol formulation, $D_m = \{D_1, D_2, \ldots, D_n\}$.
• Challenge. The challenger generates the $chal$ for the file blocks $\{m_1, m_2, \ldots, m_n\}$ and sends it to the adversary.
• Forge. The adversary computes a response $R$ to prove the integrity of the requested file blocks.

If CheckProof$(pk, D_m, chal, R) =$ "success," then the adversary has won the game. The remote data integrity checking protocol is said to be secure against the server if for any PPT (probabilistic polynomial time) adversary $A$, the probability that $A$ wins the game on a collection of file blocks is negligibly close to the probability that the challenger can extract these file blocks by a knowledge extractor $E$.

When the verifier is not the client herself, the protocol must ensure that no private information about the client's data is leaked to the third-party verifier. We formalize this requirement using the simulation paradigm [16].

Before we proceed to the definition of this requirement, we introduce some related notations. Let $f = (f_1, f_2)$ be a PPT functionality and let $\Pi$ be a two-party protocol for computing $f$. During the execution of $\Pi$, denote the view of the first (resp., second) party by $view_1^{\Pi}(x, y)$ (resp., $view_2^{\Pi}(x, y)$). $view_1^{\Pi}(x, y)$ (resp., $view_2^{\Pi}(x, y)$) includes $(x, r^1, m_1^1, \ldots, m_t^1)$ (resp., $(y, r^2, m_1^2, \ldots, m_t^2)$), where $r^1$ (resp., $r^2$) represents the outcome of the first (resp., second) party's internal coin tosses, and $m_i^1$ (resp., $m_i^2$) represents the $i$th message it has received. Denote the output of the first (resp., second) party during the execution of $\Pi$ on $(x, y)$ by $output_1^{\Pi}(x, y)$ (resp., $output_2^{\Pi}(x, y)$), which is implicit in the party's own view of the execution. We denote the verifier and the server by $V$ and $P$, respectively.

Definition 2 (Privacy against Semihonest Behavior [16]). For a deterministic functionality $f$, $\Pi$ is said to privately compute $f$ if there exist probabilistic polynomial time algorithms, denoted $S_1$ and $S_2$, such that

$$\{S_1(x, f_1(x, y))\}_{x,y \in \{0,1\}^*} \stackrel{c}{\equiv} \{view_1^{\Pi}(x, y)\}_{x,y \in \{0,1\}^*},$$

$$\{S_2(y, f_2(x, y))\}_{x,y \in \{0,1\}^*} \stackrel{c}{\equiv} \{view_2^{\Pi}(x, y)\}_{x,y \in \{0,1\}^*}.$$

Note that $\stackrel{c}{\equiv}$ denotes computational indistinguishability.

From Definition 2, we define the privacy against third-party verifiers, which is given in Definition 3.

Definition 3 (Privacy against Third-Party Verifiers). For the remote data integrity checking protocol $\Pi$, if there exists a PPT simulator $S_V$ such that

$$\{S_V(x, f_V(x, y))\}_{x,y \in \{0,1\}^*} \stackrel{c}{\equiv} \{view_V^{\Pi}(x, y)\}_{x,y \in \{0,1\}^*},$$

then $\Pi$ is a protocol that ensures privacy against third-party verifiers.

Data dynamics at block level. Data dynamics means that after clients store their data at the remote server, they can dynamically update their data at later times. At the block level, the main operations are block insertion, block modification, and block deletion. Moreover, when data is updated, the verification metadata also needs to be updated. The updating overhead should be made as small as possible.

Homomorphic verifiable tags. In our construction, we use RSA-based homomorphic verifiable tags (HVTs) [3], which are defined as follows: let $N = pq$ be one publicly known RSA modulus. We know that $\{e : e \in \mathbb{Z}_N \text{ and } \gcd(e, N) = 1\}$ forms a multiplicative group. Denote this group by $\mathbb{Z}_N^*$. Denote an element in $\mathbb{Z}_N^*$ with a large order by $g$. The RSA-based HVT for message $m_i$ is defined as $Tag(m_i) = g^{m_i} \bmod N$. Its homomorphic property can be deduced from its definition. When $Tag(m_i)$ and $Tag(m_j)$ are tags of $m_i$ and $m_j$, respectively, the tag for $m_i + m_j$ can be generated by computing $Tag(m_i + m_j) = Tag(m_i) \cdot Tag(m_j) = g^{m_i + m_j} \bmod N$.

3 THE PROPOSED REMOTE DATA INTEGRITY CHECKING PROTOCOL

In this section, we describe the proposed remote data integrity checking protocol. Just as formulated in Section 2, the proposed protocol has functions SetUp, TagGen, Challenge, GenProof, and CheckProof, as well as functions for data dynamics. In the following we present the former five functions of the proposed protocol. We leave the functions for data dynamics to Section 5.

SetUp$(1^k) \rightarrow (pk, sk)$. Let $N = pq$ be one publicly known RSA modulus, in which $p = 2p' + 1$, $q = 2q' + 1$ are two large primes. $p'$ and $q'$ are also primes. In addition, all the quadratic residues modulo $N$ form a multiplicative cyclic group, which we denote by $QR_N$. Denote the generator of $QR_N$ by $g$.(1) Since the order of $QR_N$ is $p'q'$, the order of $g$ is also $p'q'$. Let $pk = (N, g)$ and $sk = (p, q)$. $pk$ is then released to be publicly known to everyone, and $sk$ is kept secret by the client.

(1) A simple way to compute $g$ is to let $g = b^2$, in which $b \stackrel{R}{\leftarrow} \mathbb{Z}_N^*$ and $\gcd(b \pm 1, N) = 1$.

TagGen$(pk, sk, m) \rightarrow D_m$. For each file block $m_i$, $i \in [1, n]$, the client computes the block tag as

$$D_i = (g^{m_i}) \bmod N.$$

Without loss of generality, we assume that each block is unique. If in some particular applications there exist blocks with the same value, then we differentiate them by adding a random number $v \in \mathbb{Z}_N$ in each of them. Let $D_m = \{D_1, D_2, \ldots, D_n\}$. After finishing computing all the block tags, the client sends the file $m$ to the remote server, and releases $D_m$ to be publicly known to everyone.

Challenge$(pk, D_m) \rightarrow chal$. In order to verify the integrity of the file $m$, the verifier generates a random key $r \in [1, 2^k - 1]$ and a random group element $s \in \mathbb{Z}_N \setminus \{0\}$. The verifier then computes $g_s = g^s \bmod N$ and sends $chal = \langle r, g_s \rangle$ to the server.

GenProof$(pk, D_m, m, chal) \rightarrow R$. When the server receives $chal = \langle r, g_s \rangle$, it generates a sequence of block indexes $a_1, a_2, \ldots, a_n$ by calling $f_r(i)$ for $i \in [1, n]$ iteratively. Then, the server computes

$$R = (g_s)^{\sum_{i=1}^{n} a_i m_i} \bmod N,$$

and sends $R$ to the verifier.

CheckProof$(pk, D_m, chal, R) \rightarrow \{$"success", "failure"$\}$. When the verifier receives $R$ from the server, she computes $\{a_i\}_{i=1,\ldots,n}$ as the server does in the GenProof step. Then, the verifier computes $P$ and $R'$ as follows:

$$P = \prod_{i=1}^{n} (D_i^{a_i} \bmod N) \bmod N,$$

$$R' = P^s \bmod N.$$

After that the verifier checks whether $R' = R$. If $R' = R$, output "success." Otherwise the verification fails and the verifier outputs "failure."

Note that in the TagGen function, we make all the blocks distinct by adding random numbers in blocks with the same value. If the server still tries to save its storage space, then the only way is by breaking the prime factorization of $N$, or equally, getting a multiple of $\phi(N)$. The hardness of breaking large number factorization makes the proposed protocol secure against the untrusted server. We put the formal analysis of the proposed protocol in Section 4.

4 CORRECTNESS AND SECURITY ANALYSIS

In this section, we first show that the proposed protocol is correct in the sense that the server can pass the verification of data integrity as long as both the client and the server are honest. Then we show that the protocol is secure against the untrusted server. These two theorems together guarantee that, assuming the client is honest, if and only if the server has access to the complete and uncorrupted data, it can pass the verification process successfully. Finally, we show that the proposed protocol is private against third-party verifiers.

Theorem 1. If both the client and the server are honest, then the server can pass the verification successfully.

Proof. We prove this theorem by showing that $R$ and $R'$ should be equal if all the data blocks are kept completely at the server. From the TagGen$(m)$ function, we get that $D_i = (g^{m_i}) \bmod N$, $i \in [1, n]$. Then, we get

$$P = \prod_{i=1}^{n} (D_i^{a_i} \bmod N) \bmod N = g^{\sum_{i=1}^{n} a_i m_i} \bmod N.$$

Then,

$$R' = P^s \bmod N = g^{s \sum_{i=1}^{n} a_i m_i} \bmod N = g_s^{\sum_{i=1}^{n} a_i m_i} \bmod N = R.$$

This completes the proof. □

Before we proceed to Theorem 2, we first review the KEA1-r assumption, which has been investigated in [17], [18], and adapted to the RSA setting in [3].

Definition 4 (KEA1-r, Knowledge of Exponent Assumption [3]). For any adversary $A$ taking input $(N, g, g^s)$ and returning $(C, Y)$ with $Y = C^s$, there exists an "extractor" $\bar{A}$, which given the same input as $A$ returns $c$ such that $C = g^c$.

Our proof of Theorem 2 needs a lemma from [19] on solving prime factorization when a multiple of $\phi'(n)$ is known. Denote the prime factorization of $N$ by $p_1^{v_1} \ldots p_t^{v_t}$. Then, $\phi'(n) = \mathrm{lcm}(p_1 - 1, \ldots, p_t - 1)$, in which lcm denotes least common multiple.

Lemma 1 [19]. Let $g$ be any function that satisfies the following two conditions: 1) $\phi'(n) \mid g(n)$, 2) $|g(n)| = O(|n|^k)$ for some constant $k$. Then "prime factorization" is polynomial time reducible to $g$. Furthermore, the cost of solving the prime factorization of $n$ is $O(|n|^{k+4} M(|n|))$, in which $M(|n|)$ denotes the cost of multiplying two integers of binary length $|n|$.
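To make the five functions of Section 3 and the identity behind Theorem 1 concrete, the following Python sketch runs the protocol end to end. It is a minimal illustration under stated assumptions, not the authors' implementation: the safe primes 1019 and 1187 are toy values chosen here and are far too small for real use, HMAC-SHA256 truncated to $d$ bits stands in for the pseudo-random function $f$ (the paper does not fix a particular PRF), all function and constant names are hypothetical, and $s$ is passed to `check_proof` explicitly because the verifier generated it in Challenge.

```python
import hashlib
import hmac
import math
import secrets

K_BITS = 32  # toy bit length of the challenge key r (assumption)
D_BITS = 16  # toy bit length of each coefficient a_i (assumption)


def setup(p=1019, q=1187):
    """SetUp: pk = (N, g), sk = (p, q); p and q are toy safe primes."""
    n_mod = p * q
    while True:
        b = secrets.randbelow(n_mod - 3) + 2  # b in [2, N - 2]
        # Footnote 1: g = b^2 with b in Z_N^* and gcd(b +/- 1, N) = 1.
        if all(math.gcd(x, n_mod) == 1 for x in (b, b - 1, b + 1)):
            return (n_mod, pow(b, 2, n_mod)), (p, q)


def tag_gen(pk, blocks):
    """TagGen: D_i = g^{m_i} mod N for every block m_i."""
    n_mod, g = pk
    return [pow(g, m, n_mod) for m in blocks]


def coefficients(r, n, d_bits=D_BITS):
    """a_i = f_r(i) for i in [1, n]; HMAC-SHA256 stands in for f."""
    key = r.to_bytes((K_BITS + 7) // 8, "big")
    out = []
    for i in range(1, n + 1):
        digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
        out.append(int.from_bytes(digest, "big") >> (256 - d_bits))
    return out


def challenge(pk):
    """Challenge: chal = (r, g_s); the verifier keeps s to itself."""
    n_mod, g = pk
    r = secrets.randbelow(2**K_BITS - 1) + 1  # r in [1, 2^k - 1]
    s = secrets.randbelow(n_mod - 1) + 1      # s in Z_N \ {0}
    return (r, pow(g, s, n_mod)), s


def gen_proof(pk, blocks, chal):
    """GenProof: R = g_s^{sum_i a_i m_i} mod N, computed by the server."""
    n_mod, _ = pk
    r, g_s = chal
    a = coefficients(r, len(blocks))
    return pow(g_s, sum(a_i * m_i for a_i, m_i in zip(a, blocks)), n_mod)


def check_proof(pk, tags, chal, s, response):
    """CheckProof: P = prod_i D_i^{a_i} mod N, then compare P^s with R."""
    n_mod, _ = pk
    r, _ = chal
    a = coefficients(r, len(tags))
    prod = 1
    for d_i, a_i in zip(tags, a):
        prod = (prod * pow(d_i, a_i, n_mod)) % n_mod
    return "success" if pow(prod, s, n_mod) == response else "failure"
```

For an honest server, `check_proof(pk, tags, chal, s, gen_proof(pk, blocks, chal))` returns "success" for any list of integer blocks, which is the identity $R' = R$ derived in the proof of Theorem 1. Note that the verifier needs only the public tags and its own $s$, never the blocks or $sk$.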


Theorem 2. Under the KEA1-r and the large integer factorization assumptions, the proposed protocol is secure against the untrusted server.

Proof. Just as in the security formulation, we denote the adversary by $A$ and the challenger by $B$. What we want to prove is that for any PPT adversary who wins the data possession game on some file blocks, the challenger can construct a knowledge extractor $E$ that extracts these file blocks. Equivalently, if $E$ cannot extract these file blocks, the challenger can break the integer factorization problem.

For the large integer factorization problem, $B$ is given a large integer $N$, which is the product of two large primes $p$ and $q$. Here, $p = 2p' + 1$ and $q = 2q' + 1$. $B$ tries to solve the prime factorization of $N$.

$B$ simulates the protocol environment for $A$ with the following steps:

• Setup. $B$ generates a random generator of $QR_N$. Denote the generator by $g$. $B$ sends $pk = (N, g)$ to $A$.
• Query. $A$ adaptively selects some file blocks $m_i$, $i = 1, 2, \ldots, n$, and queries the verification tags from $B$. $B$ computes a verification tag $D_i = g^{m_i} \bmod N$ for each of these blocks and sends $\{D_i, i = 1, 2, \ldots, n\}$ to $A$. Let $D_m = \{D_1, D_2, \ldots, D_n\}$. $D_m$ is made publicly known to everyone.
• Challenge. $B$ generates a $chal$ for the file blocks $\{m_1, m_2, \ldots, m_n\}$ and sends it to $A$. The generation method is the same as that in the Challenge function described in Section 3. Let $chal = \langle r, g_s \rangle$.

[...]

for each $m_i^*$, $i = 1, 2, \ldots, n$, $m_i^* = m_i$, then $B$ has successfully extracted all the file blocks $m_i$, $i = 1, 2, \ldots, n$. However, if for some $i$, $m_i^* \neq m_i$, then we show that $B$ can successfully compute the prime factorization of $N$. Without loss of generality, we assume $m_i$ is larger than $m_i^*$. Then, $B$ can get a multiple of $\phi(N)$ from $m_i^* = m_i \bmod p'q'$, which we denote by $k_1 \phi(N)$. From Lemma 1, $B$ can solve the prime factorization of $N$ with the cost of $O((|k_1| + |\phi(N)|)|N|^4 M(|N|))$. Because $A$ is a PPT adversary, the length of $k_1$ is bounded by $O(|N|^{k_2})$ for some constant $k_2$. From the above we can see that if any file block cannot be extracted, then $B$ can construct a knowledge extractor $E$ to extract the prime factorization of $N$ in probabilistic polynomial time.

In conclusion, under the KEA1-r and large integer factorization assumptions, the proposed protocol guarantees the data integrity against an untrusted server. □

Theorem 3 (Privacy against Third-Party Verifiers). Under the semihonest model [16], a third-party verifier cannot get any information about the client's data $m$ from the protocol execution. Hence, the protocol is private against third-party verifiers.

Proof. (Sketch) In this proof, we construct a simulator for the view of the verifier, and show that the output of the simulator is computationally indistinguishable from the view of the verifier. Due to space limitations, we put the detailed proof in the full version of this paper [20]. □

5 DATA DYNAMICS

The proposed protocol supports data dynamics at the block level
      .     Forge. A computes a response R to prove the integrity                   in the same way as [1]. In the following, we show how our
            of the requested file blocks.                                           protocol supports block modification. Due to space limitation, we
       If CheckProofðpk; Dm ; chal; RÞ ¼ ‘‘success; ’’ then the ad-                 describe the support of block insertion and block deletion in the
   versary has won the game. Note that A is given ðN; g; gs Þ as
                                    Pn                                              full version [20].
   input, and outputs R ¼ gs i¼1 ai mi mod N, in which ai ¼ fr ðiÞ
                                                                                        .    Block modification. Assume that the client wants to
   for n i 2 ½1; nŠ. Because A can naturally computes P ¼
    P                                                                                        modify the ith block mi of her file. Denote the modified
   g i¼1 ai mi mod N from Dm , P is also treated as A’s output.
                                                                                             data block by mà . Then the server updates mi to mà . Next,
                                                                                                               i                                    i
   So A is given ðN; g; gs Þ as input, and outputs ðR; P Þ that                              the client computes a new block tag for the updated block,
   satisfies R ¼ P s . From the KEA1-r assumption, B can                                                   Ã
                                                                                             i.e., DÃ ¼ gmi mod N.
                                                                                                    i
                                 "
   construct an extractor A, which given the same input as                              From the above we can see that the correspondence relationship
   A, n outputs c which satisfies P ¼ gc mod N. As P ¼
    P                                                                               between the block and the digest does not change after the data
                                         P
   g i¼1 ai mi mod N, B extracts c ¼ n ai mi mod p0 q0 .
                                           i¼1                                      updating, i.e., Di ¼ gmi mod N; i ¼ 1; 2; . . . ; djmj=le. So, the data
       Now B generates n challenges hr1 ; gs1 i, hr2 ; gs2 i . . . ; hrn ; gsn i    integrity is still protected. If the client wants to make sure that the
   using the method described in Section 3. B computes aj ¼ frj ðiÞ    i            file has really been updated, she can launch a proof request
   for i 2 ½1; nŠ and j 2 ½1; nŠ. Because fr1 ; r2 ; . . . ; rn g are chosen by     immediately by sending a challenge to the server. Any block that is
                                             j   j            j
   B, now B chooses them so that fa1 ; a2 ; . . . ; an g; j ¼ 1; 2; . . . ; n       updated is given a novel random number, so that each block
   satisfy the following equation:                                                  remains unique. Therefore, the server cannot delete any block
                         2 1                   3                                    without being detected.
                           a1 a1 . . . a1
                                    2       n
                         6 a1 a2 . . . an
                             2      2       2 7
                         6                     7
                     det6 .        .   .   . 7 6¼ 0:                        ð1Þ
                         4 ..      .
                                   .   .
                                       .   . 5
                                           .                                        6       COMPLEXITY ANALYSIS AND EXPERIMENTAL
                           an an . . . an
                             1      2       n
                                                                                            RESULTS
      Here det½ÁŠ denotes the determinant of a matrix. B challenges                 In this section, we first present a complexity analysis of the
   A for n times. On the jth time, B challenges A with frj ; gsj g.                 communication, computation, and storage costs of the proposed
   From the response of A, B extracts cj ¼ aj m1 þ aj m2 þ Á Á Á þ                  protocol. After that, we present the experimental results.
                                                 1          2
   aj mn mod p0 q0 .
    n
                                                                                    6.1      Communication, Computation, and Storage Costs
      When (1) holds, the following system of linear equations has
   a unique solution.                                                               The communication, computation, and storage costs of the client,
                                                                                    the server and the verifier are analyzed and shown in Table 2. The
           8 1
           > a1 m1 þ a1 m2 þ Á Á Á þ a1 mn ¼ c1 mod p0 q0 ;
           > 2         2              n                                             detailed analyses are omitted due to space limitation. Interested
           >
           < a m1 þ a2 m2 þ Á Á Á þ a2 mn ¼ c2 mod p0 q0 ;
               1       2              n                                             readers can refer to [20] for them.
                                    .                           ð2Þ                     For the data dynamics, the computation cost for block insertion
           >
           >                        .
                                    .
           >
           : n                                                                      or block modification is just one modular exponentiation, which is
             a1 m1 þ an m2 þ Á Á Á þ an mn ¼ cn mod p0 q0 :
                       2              n
                                                                                    Texp ðjNj; NÞ.
       By solving the above equation set, for each file block                           Storage of the block tags. Because the n block tags
   mi ; i ¼ 1; 2; . . . ; n, B gets mà which satisfies mà ¼ mi mod p0 q0 . If
                                     i                  i                           D1 ; D2 ; . . . ; Dn are made publicly known to everyone, they can


                                                              TABLE 2
               Upper Bounds of the Communication, Computation, and Storage Costs at the Client, the Server, and the Verifier




be stored at the server, the client or the verifier. If the tags are stored at the server, then traditional integrity protection methods such as digital signatures can be used to protect them from being tampered with by the server. The storage cost of the block tags is upper bounded by $\lceil |m|/l \rceil\,|N|$ bits. In this case, when the data integrity checking is performed, the tags are transmitted back to the verifier from the server. As the tags have been signed with the client's private key, the server cannot tamper with them. This will incur communication costs that are linear in the number of blocks. However, because the tags are relatively small compared with the original data, the incurred communication costs are acceptable with respect to all the good features the proposed protocol has. If the tags are stored at the verifier or the client, then these communication costs are mitigated. However, this will cause a storage cost of $O(n)$ at the verifier or the client, which is the same as Sebé et al.'s protocol [1].

6.2 Experimental Results
In the experiment, we measure the computation costs at the verifier and the server when the file length is fixed and the block size is changed. The results are shown in Table 3. After that, we measure the computation costs when the file length changes and the block size is fixed. The results are shown in Table 4. We also measure the client's preprocessing costs, which are shown in Table 5. We choose $k = d = 128$. The proposed protocol is implemented on a laptop with Intel Core2 Duo 2.00 GHz CPU and 1.99 GB memory. All the programs are written in the C++ language with the assistance of the MIRACL library [21].

TABLE 3
Computation Costs at the Verifier and the Server with $|N| = 1024$ and $|m| = 4$ MB

From Table 3 we can see that when the file length is $2^{25}$ bits (4 MB) and the block size is $2^{18}$ bits (32 KB), the computation cost at the verifier is 173.39 ms, and the computation cost at the server is 2304.39 ms.

From Table 4, we can see that the computation cost at the server does not increase much when the file length increases. But the computation cost at the verifier increases nearly proportionally with the increasing file length. We note that this is consistent with the theoretical analysis in Section 6.1. When the file length increases, the number of exponentiations at the verifier increases proportionally. However, the server performs only one exponentiation no matter how large the file is. Therefore, the server's burden is not increased much with larger files. As to the verifier's load, our scheme can be efficient when the file length is not huge. However, when the file length is very large, our protocol can be easily extended into a probabilistic one by using the probabilistic framework proposed in [3]. In that case, the extended protocol provides a probabilistic data possession guarantee while its other good features are still kept.

7 CONCLUSIONS AND FUTURE WORK
In this paper, we propose a new remote data integrity checking protocol for cloud storage. The proposed protocol is suitable for providing integrity protection of customers' important data. The proposed protocol supports data insertion, modification, and deletion at the block level, and also supports public verifiability. The proposed protocol is proved to be secure against an untrusted server. It is also private against third-party verifiers. Both theoretical analysis and experimental results demonstrate that the proposed protocol has very good efficiency in the aspects of communication, computation, and storage costs.

Currently, we are still working on extending the protocol to support data level dynamics. The difficulty is that there is no clear mapping relationship between the data and the tags. In the current construction, data level dynamics can be supported by using block level dynamics. Whenever a piece of data is modified, the corresponding blocks and tags are updated. However, this can bring unnecessary computation and communication costs. We aim to achieve data level dynamics at minimal costs in our future work.
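
The idea of supporting a data-level change through block-level updates can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy safe primes, the base G, the 4-byte block length, and the helper names (split_blocks, tag, affected_blocks, modify) are all hypothetical. Only the tag formula D_i = g^{m_i} mod N is taken from the paper.

```python
# Assumption: toy safe primes p = 2p'+1 and q = 2q'+1, far too small to be secure.
P, Q = 1019, 1187
N = P * Q
G = pow(3, 2, N)   # a square mod N, standing in for a generator of QR_N

BLOCK_BYTES = 4    # hypothetical block length l, expressed in bytes


def split_blocks(data: bytes) -> list[int]:
    """Split data into fixed-size blocks and read each block as an integer m_i."""
    return [int.from_bytes(data[i:i + BLOCK_BYTES], "big")
            for i in range(0, len(data), BLOCK_BYTES)]


def tag(block: int) -> int:
    """Block tag D_i = g^{m_i} mod N."""
    return pow(G, block, N)


def affected_blocks(offset: int, length: int) -> range:
    """Indices of the blocks touched by a write of `length` bytes at `offset`."""
    first = offset // BLOCK_BYTES
    last = (offset + length - 1) // BLOCK_BYTES
    return range(first, last + 1)


def modify(data: bytearray, tags: list[int], offset: int, patch: bytes) -> range:
    """Overwrite bytes in place, then refresh only the tags of the touched blocks."""
    if not patch or offset < 0 or offset + len(patch) > len(data):
        raise ValueError("patch must be non-empty and lie inside the file")
    data[offset:offset + len(patch)] = patch
    touched = affected_blocks(offset, len(patch))
    blocks = split_blocks(bytes(data))
    for i in touched:
        tags[i] = tag(blocks[i])
    return touched


data = bytearray(b"0123456789abcdef")            # four 4-byte blocks
tags = [tag(m) for m in split_blocks(bytes(data))]
touched = modify(data, tags, offset=3, patch=b"XY")
print(list(touched))                             # [0, 1]
```

In this sketch a two-byte write that straddles a block boundary forces two full block tags to be recomputed, which illustrates the kind of extra cost the paragraph above refers to.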


TABLE 4
Computation Costs at the Verifier and the Server with Different File Lengths and Fixed Block Size

TABLE 5
The Client's Preprocessing Time with Different Block Lengths when $|N| = 1024$ and $|m| = 4$ MB
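
As a worked instance of the tag-storage bound from Section 6.1, consider the setting quoted in the discussion of Table 3: a file of 2^25 bits, a block size of 2^18 bits, and |N| = 1024. Pairing this particular block size with the storage bound is our choice for illustration; the arithmetic is:

```latex
\[
n = \left\lceil \frac{|m|}{l} \right\rceil = \frac{2^{25}}{2^{18}} = 2^{7} = 128,
\qquad
n\,|N| = 128 \times 1024 = 2^{17}\ \text{bits} = 16\ \text{KB}.
\]
```

Under these parameters the tags occupy at most 16 KB against a 4 MB file, roughly 0.4 percent of the data, which is in line with the remark that the tags are relatively small compared with the original data.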
ACKNOWLEDGMENTS
This work was supported by NSF CNS-0845149, NSF CCF-0915374,
and the Knowledge Innovation Program of the Chinese Academy of
Sciences (No. YYYJ-1013). A full version of this paper is available
at http://www.cse.buffalo.edu/tech-reports/2010-11.pdf.
S. Zhong is the corresponding author for this paper.


REFERENCES
[1]    F. Sebé, J. Domingo-Ferrer, A. Martínez-Ballesté, Y. Deswarte, and J.-J.
       Quisquater, “Efficient Remote Data Possession Checking in Critical
       Information Infrastructures,” IEEE Trans. Knowledge and Data Eng.,
       vol. 20, no. 8, pp. 1034-1038, Aug. 2008.
[2]    R. Buyya, C.S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, “Cloud
       Computing and Emerging IT Platforms: Vision, Hype, and Reality for
       Delivering Computing as the Fifth Utility,” Future Generation Computer
       Systems, vol. 25, no. 6, pp. 599-616, 2009.
[3]    G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and
       D. Song, “Provable Data Possession at Untrusted Stores,” Proc. 14th ACM
       Conf. Computer and Comm. Security (CCS ’07), pp. 598-609, 2007.
[4]    R. Curtmola, O. Khan, R. Burns, and G. Ateniese, “MR-PDP: Multiple-
       Replica Provable Data Possession,” Proc. 28th Int’l Conf. Distributed
       Computing Systems (ICDCS ’08), 2008.
[5]    G. Ateniese, R. Di Pietro, L.V. Mancini, and G. Tsudik, “Scalable and
       Efficient Provable Data Possession,” Proc. Fourth ACM Int’l Conf. Security
       and Privacy in Comm. Networks (SecureComm ’08), 2008.
[6]    C. Erway, A. Küpçü, C. Papamanthou, and R. Tamassia, “Dynamic
       Provable Data Possession,” Proc. 16th ACM Conf. Computer and Comm.
       Security (CCS ’09), pp. 213-222, 2009.
[7]    C. Wang, Q. Wang, K. Ren, and W. Lou, “Ensuring Data Storage Security in
       Cloud Computing,” Proc. 17th Int’l Workshop Quality of Service (IWQoS ’09),
       pp. 1-9, July 2009.
[8]    Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, “Enabling Public Verifiability
       and Data Dynamics for Storage Security in Cloud Computing,” Proc. 14th
       European Conf. Research in Computer Security (ESORICS), Sept. 2009.
[9]    C. Wang, Q. Wang, K. Ren, and W. Lou, “Privacy-Preserving Public
       Auditing for Data Storage Security in Cloud Computing,” Proc. IEEE
       INFOCOM, Mar. 2010.
[10]   Y. Deswarte and J.-J. Quisquater, “Remote Integrity Checking,” Proc. Sixth
       Conf. Integrity and Internal Control in Information Systems (IICIS ’04), pp. 1-11,
       2004.
[11]   D.L.G. Filho and P.S.L.M. Barreto, “Demonstrating Data Possession and
       Uncheatable Data Transfer,” Cryptology ePrint Archive, Report 2006/150,
       http://eprint.iacr.org/, 2006.
[12]   M.A. Shah, M. Baker, J.C. Mogul, and R. Swaminathan, “Auditing to Keep
       Online Storage Services Honest,” Proc. 11th USENIX Workshop Hot Topics in
       Operating Systems (HOTOS), 2007.
[13]   C. Wang, S.S.-M. Chow, Q. Wang, K. Ren, and W. Lou, “Privacy-Preserving
       Public Auditing for Secure Cloud Storage,” Cryptology ePrint Archive,
       Report 2009/579, http://eprint.iacr.org/, 2009.
[14]   Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S.S. Yau, “Cooperative
       Provable Data Possession,” Cryptology ePrint Archive, Report 2010/234,
       http://eprint.iacr.org/, 2010.
[15]   Z. Hao and N. Yu, “A Multiple-Replica Remote Data Possession
       Checking Protocol with Public Verifiability,” Proc. Second Int’l Data,
       Privacy and E-Commerce Symp. (ISDPE ’10), 2010.
[16]   O. Goldreich, Foundations of Cryptography. Cambridge Univ. Press, 2004.
[17]   I. Damgård, “Towards Practical Public Key Systems Secure against Chosen
       Ciphertext Attacks,” Proc. 11th Ann. Int’l Cryptology Conf. Advances in
       Cryptology (CRYPTO ’91), 1992.
[18]   M. Bellare and A. Palacio, “The Knowledge-of-Exponent Assumptions and
       3-Round Zero-Knowledge Protocols,” Proc. Cryptology Conf. Advances in
       Cryptology (CRYPTO ’04), pp. 273-289, 2004.
[19]   G.L. Miller, “Riemann’s Hypothesis and Tests for Primality,” Proc. Seventh
       Ann. ACM Symp. Theory of Computing (STOC ’75), pp. 234-239, 1975.
[20]   Z. Hao, S. Zhong, and N. Yu, “A Privacy-Preserving Remote Data Integrity
       Checking Protocol with Data Dynamics and Public Verifiability,” Technical
       Report 2010-11, SUNY Buffalo CSE Dept., http://www.cse.buffalo.edu/
       tech-reports/2010-11.pdf, 2010.
[21]   Multiprecision Integer and Rational Arithmetic C/C++ Library, http://
       www.shamus.ie/, 2011.

				