1432 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 9, SEPTEMBER 2011

Concise Papers

A Privacy-Preserving Remote Data Integrity Checking Protocol with Data Dynamics and Public Verifiability

Zhuo Hao, Sheng Zhong, Member, IEEE, and Nenghai Yu, Member, IEEE

Abstract: Remote data integrity checking is a crucial technology in cloud computing. Recently, many works focus on providing data dynamics and/or public verifiability to this type of protocols. Existing protocols can support both features with the help of a third-party auditor. In a previous work, Sebé et al. [1] propose a remote data integrity checking protocol that supports data dynamics. In this paper, we adapt Sebé et al.'s protocol to support public verifiability. The proposed protocol supports public verifiability without help of a third-party auditor. In addition, the proposed protocol does not leak any private information to third-party verifiers. Through a formal analysis, we show the correctness and security of the protocol. After that, through theoretical analysis and experimental results, we demonstrate that the proposed protocol has a good performance.

Index Terms: Data integrity, data dynamics, public verifiability, privacy.

• Z. Hao and N. Yu are with the Department of Electronic Engineering and Information Science, University of Science and Technology of China (USTC), Hefei, Anhui 230027, China. E-mail: hzhuo@mail.ustc.edu.cn, ynh@ustc.edu.cn.
• S. Zhong is with the Department of Computer Science and Engineering, SUNY Buffalo, 201 Bell Hall, Amherst NY 14260. E-mail: szhong@buffalo.edu.

Manuscript received 21 May 2010; revised 6 Oct. 2010; accepted 25 Nov. 2010; published online 7 Mar. 2011.
Recommended for acceptance by C. Clifton.
For information on obtaining reprints of this article, please send e-mail to: tkde@computer.org, and reference IEEECS Log Number TKDE-2010-05-0295.
Digital Object Identifier no. 10.1109/TKDE.2011.62.
1041-4347/11/$26.00 © 2011 IEEE. Published by the IEEE Computer Society.

1 INTRODUCTION

STORING data in the cloud has become a trend [2]. An increasing number of clients store their important data in remote servers in the cloud, without leaving a copy in their local computers. Sometimes the data stored in the cloud is so important that the clients must ensure it is not lost or corrupted. While it is easy to check data integrity after completely downloading the data to be checked, downloading large amounts of data just for checking data integrity is a waste of communication bandwidth. Hence, a lot of works [1], [3], [4], [5], [6], [7], [8], [9] have been done on designing remote data integrity checking protocols, which allow data integrity to be checked without completely downloading the data.

Remote data integrity checking is first introduced in [10], [11], which independently propose RSA-based methods for solving this problem. After that Shah et al. [12] propose a remote storage auditing method based on precomputed challenge-response pairs. Recently, many works [1], [3], [4], [5], [6], [7], [8], [9], [13], [14], [15] focus on providing three advanced features for remote data integrity checking protocols: data dynamics [5], [6], [8], [14], public verifiability [3], [8], [9], [14], and privacy against verifiers [9], [14]. The protocols in [5], [6], [7], [8], [14] support data dynamics at the block level, including block insertion, block modification, and block deletion. The protocol of [3] supports data append operation. In addition, [1] can be easily adapted to support data dynamics. Protocols in [9], [13] can be adapted to support data dynamics by using the techniques of [8]. On the other hand, protocols in [3], [8], [9], [13], [14], [15] support public verifiability, by which anyone (not just the client) can perform the integrity checking operation. The protocols in [9], [13], [14], [15] support privacy against third-party verifiers. We compare the proposed protocol with selected previous protocols (see Table 1).

TABLE 1. Comparisons between the Proposed Protocol and Previous Protocols

In this paper, we have the following main contributions:

• We propose a remote data integrity checking protocol for cloud storage, which can be viewed as an adaptation of Sebé et al.'s protocol [1]. The proposed protocol inherits the support of data dynamics from [1], and supports public verifiability and privacy against third-party verifiers, while at the same time it doesn't need to use a third-party auditor.
• We give a security analysis of the proposed protocol, which shows that it is secure against the untrusted server and private against third-party verifiers.
• We have theoretically analyzed and experimentally tested the efficiency of the protocol. Both theoretical and experimental results demonstrate that our protocol is efficient.

The rest of this paper is organized as follows. In Section 2, technical preliminaries are presented. In Section 3, the proposed remote data integrity checking protocol is presented. In Section 4, a formal analysis of the proposed protocol is presented. In Section 5, we describe the support of data dynamics of the proposed protocol. In Section 6, the protocol's complexity is analyzed in the aspects of communication, computation, and storage costs; furthermore, experimental results are presented for the efficiency of the protocol. And finally, conclusions and possible future work are presented in Section 7.

2 TECHNICAL PRELIMINARIES

We consider a cloud storage system in which there are a client and an untrusted server. The client stores her data in the server without keeping a local copy. Hence, it is of critical importance that the client should be able to verify the integrity of the data stored in the remote untrusted server. If the server modifies any part of the client's data, the client should be able to detect it; furthermore, any third-party verifier should also be able to detect it. In case a third-party verifier verifies the integrity of the client's data, the data should be kept private against the third-party verifier. Below we present a formal statement of the problem.

Problem formulation. Denote by $m$ the file that will be stored in the untrusted server, which is divided into $n$ blocks of equal lengths: $m = m_1 m_2 \ldots m_n$, where $n = \lceil |m|/l \rceil$. Here, $l$ is the length of each file block. Denote by $f_K(\cdot)$ a pseudo-random function which is defined as

\[
f : \{0,1\}^k \times \{0,1\}^{\log_2(n)} \rightarrow \{0,1\}^d,
\]

in which $k$ and $d$ are two security parameters. Furthermore, denote the length of $N$ in bits by $|N|$.

We need to design a remote data integrity checking protocol that includes the following five functions: SetUp, TagGen, Challenge, GenProof, and CheckProof.

SetUp$(1^k) \rightarrow (pk, sk)$. Given the security parameter $k$, this function generates the public key $pk$ and the secret key $sk$. $pk$ is public to everyone, while $sk$ is kept secret by the client.

TagGen$(pk, sk, m) \rightarrow D_m$. Given $pk$, $sk$ and $m$, this function computes a verification tag $D_m$ and makes it publicly known to everyone. This tag will be used for public verification of data integrity.

Challenge$(pk, D_m) \rightarrow chal$. Using this function, the verifier generates a challenge $chal$ to request for the integrity proof of file $m$. The verifier sends $chal$ to the server.

GenProof$(pk, D_m, m, chal) \rightarrow R$. Using this function, the server computes a response $R$ to the challenge $chal$. The server sends $R$ back to the verifier.

CheckProof$(pk, D_m, chal, R) \rightarrow \{$"success", "failure"$\}$. The verifier checks the validity of the response $R$. If it is valid, the function outputs "success," otherwise the function outputs "failure." The secret key $sk$ is not needed in the CheckProof function.

Security requirements. There are two security requirements for the remote data integrity checking protocol: security against the server with public verifiability, and privacy against third-party verifiers. We first give the definition of security against the server with public verifiability. In this definition, we have two entities: a challenger that stands for either the client or any third-party verifier, and an adversary that stands for the untrusted server.

Definition 1 (Security against the Server with Public Verifiability [3]). We consider a game between a challenger and an adversary that has four phases: Setup, Query, Challenge, and Forge.

• Setup. The challenger runs the SetUp function, and gets the $(pk, sk)$. The challenger sends $pk$ to the adversary and keeps $sk$ secret.
• Query. The adversary adaptively selects some file blocks $m_i$, $i = 1, 2, \ldots, n$ and queries the verification tags from the challenger. The challenger computes a verification tag $D_i$ for each of these blocks and sends $D_i$, $i = 1, 2, \ldots, n$ to the adversary. According to the protocol formulation, $D_m = \{D_1, D_2, \ldots, D_n\}$.
• Challenge. The challenger generates the $chal$ for the file blocks $\{m_1, m_2, \ldots, m_n\}$ and sends it to the adversary.
• Forge. The adversary computes a response $R$ to prove the integrity of the requested file blocks.

If CheckProof$(pk, D_m, chal, R) =$ "success," then the adversary has won the game. The remote data integrity checking protocol is said to be secure against the server if for any PPT (probabilistic polynomial time) adversary $A$, the probability that $A$ wins the game on a collection of file blocks is negligibly close to the probability that the challenger can extract these file blocks by a knowledge extractor $E$.

When the verifier is not the client herself, the protocol must ensure that no private information about the client's data is leaked to the third-party verifier. We formalize this requirement using the simulation paradigm [16].

Before we proceed to the definition of this requirement, we introduce some related notations. Let $f = (f_1, f_2)$ be a PPT functionality and let $\Pi$ be a two-party protocol for computing $f$. During the execution of $\Pi$, denote the view of the first (resp., second) party by $view_1^{\Pi}(x, y)$ (resp., $view_2^{\Pi}(x, y)$). $view_1^{\Pi}(x, y)$ (resp., $view_2^{\Pi}(x, y)$) includes $(x, r^1, m_1^1, \ldots, m_t^1)$ (resp., $(y, r^2, m_1^2, \ldots, m_t^2)$), where $r^1$ (resp., $r^2$) represents the outcome of the first (resp., second) party's internal coin tosses, and $m_i^1$ (resp., $m_i^2$) represents the $i$th message it has received. Denote the output of the first (resp., second) party during the execution of $\Pi$ on $(x, y)$ by $output_1^{\Pi}(x, y)$ (resp., $output_2^{\Pi}(x, y)$), which is implicit in the party's own view of the execution. We denote the verifier and the server by $V$ and $P$, respectively.

Definition 2 (Privacy against Semihonest Behavior [16]). For a deterministic functionality $f$, $\Pi$ is said to privately compute $f$, if there exist probabilistic polynomial time algorithms, denoted $S_1$ and $S_2$, such that

\[
\{S_1(x, f_1(x, y))\}_{x, y \in \{0,1\}^*} \stackrel{c}{\equiv} \{view_1^{\Pi}(x, y)\}_{x, y \in \{0,1\}^*},
\]
\[
\{S_2(y, f_2(x, y))\}_{x, y \in \{0,1\}^*} \stackrel{c}{\equiv} \{view_2^{\Pi}(x, y)\}_{x, y \in \{0,1\}^*}.
\]

Note that $\stackrel{c}{\equiv}$ denotes computational indistinguishability.

From Definition 2, we define the privacy against third-party verifiers, which is given in Definition 3.

Definition 3 (Privacy against Third-Party Verifiers). For the remote data integrity checking protocol $\Pi$, if there exists a PPT simulator $S_V$ such that

\[
\{S_V(x, f_V(x, y))\}_{x, y \in \{0,1\}^*} \stackrel{c}{\equiv} \{view_V^{\Pi}(x, y)\}_{x, y \in \{0,1\}^*},
\]

then $\Pi$ is a protocol that ensures privacy against third-party verifiers.

Data dynamics at block level. Data dynamics means that after clients store their data at the remote server, they can dynamically update their data at later times. At the block level, the main operations are block insertion, block modification, and block deletion. Moreover, when data is updated, the verification metadata also needs to be updated. The updating overhead should be made as small as possible.

Homomorphic verifiable tags. In our construction, we use RSA-based homomorphic verifiable tags (HVTs) [3], which are defined as follows: let $N = pq$ be one publicly known RSA modulus. We know that $\{e : e \in \mathbb{Z}_N \text{ and } \gcd(e, N) = 1\}$ forms a multiplicative group. Denote this group by $\mathbb{Z}_N^*$. Denote an element in $\mathbb{Z}_N^*$ with a large order by $g$. The RSA-based HVT for message $m_i$ is defined as $Tag(m_i) = g^{m_i} \bmod N$. Its homomorphic property can be deduced from its definition. When $Tag(m_i)$ and $Tag(m_j)$ are tags of $m_i$ and $m_j$, respectively, the tag for $m_i + m_j$ can be generated by computing $Tag(m_i + m_j) = Tag(m_i) \cdot Tag(m_j) = g^{m_i + m_j} \bmod N$.

3 THE PROPOSED REMOTE DATA INTEGRITY CHECKING PROTOCOL

In this section, we describe the proposed remote data integrity checking protocol. Just as formulated in Section 2, the proposed protocol has functions SetUp, TagGen, Challenge, GenProof, and CheckProof, as well as functions for data dynamics. In the following we present the former five functions of the proposed protocol. We leave the functions for data dynamics to Section 5.

SetUp$(1^k) \rightarrow (pk, sk)$. Let $N = pq$ be one publicly known RSA modulus, in which $p = 2p' + 1$, $q = 2q' + 1$ are two large primes. $p'$ and $q'$ are also primes. In addition, all the quadratic residues modulo $N$ form a multiplicative cyclic group, which we denote by $QR_N$. Denote the generator of $QR_N$ by $g$ (see footnote 1). Since the order of $QR_N$ is $p'q'$, the order of $g$ is also $p'q'$. Let $pk = (N, g)$ and $sk = (p, q)$. $pk$ is then released to be publicly known to everyone, and $sk$ is kept secret by the client.

Footnote 1. A simple way to compute $g$ is to let $g = b^2$, in which $b \stackrel{R}{\leftarrow} \mathbb{Z}_N^*$ and $\gcd(b \pm 1, N) = 1$.

TagGen$(pk, sk, m) \rightarrow D_m$. For each file block $m_i$, $i \in [1, n]$, the client computes the block tag as

\[
D_i = (g^{m_i}) \bmod N.
\]

Without loss of generality, we assume that each block is unique. If in some particular applications, there exist blocks with the same value, then we differentiate them by adding a random number $v \in \mathbb{Z}_N$ in each of them. Let $D_m = \{D_1, D_2, \ldots, D_n\}$. After finishing computing all the block tags, the client sends the file $m$ to the remote server, and releases $D_m$ to be publicly known to everyone.

Challenge$(pk, D_m) \rightarrow chal$. In order to verify the integrity of the file $m$, the verifier generates a random key $r \in [1, 2^k - 1]$ and a random group element $s \in \mathbb{Z}_N \setminus \{0\}$. The verifier then computes $g_s = g^s \bmod N$ and sends $chal = \langle r, g_s \rangle$ to the server.

GenProof$(pk, D_m, m, chal) \rightarrow R$. When the server receives $chal = \langle r, g_s \rangle$, it generates a sequence of block indexes $a_1, a_2, \ldots, a_n$ by calling $f_r(i)$ for $i \in [1, n]$ iteratively. Then, the server computes

\[
R = (g_s)^{\sum_{i=1}^{n} a_i m_i} \bmod N,
\]

and sends $R$ to the verifier.

CheckProof$(pk, D_m, chal, R) \rightarrow \{$"success", "failure"$\}$. When the verifier receives $R$ from the server, she computes $\{a_i\}_{i=1,\ldots,n}$ as the server does in the GenProof step. Then, the verifier computes $P$ and $R'$ as follows:

\[
P = \prod_{i=1}^{n} \left(D_i^{a_i} \bmod N\right) \bmod N,
\]
\[
R' = P^s \bmod N.
\]

After that the verifier checks whether $R' = R$. If $R' = R$, the verifier outputs "success." Otherwise the verification fails and the verifier outputs "failure."

Note that in the TagGen function, we make all the blocks distinct by adding random numbers in blocks with the same value. If the server still tries to save its storage space, then the only way is by breaking the prime factorization of $N$, or equally, getting a multiple of $\phi(N)$. The hardness of breaking large number factorization makes the proposed protocol secure against the untrusted server. We put the formal analysis of the proposed protocol in Section 4.

4 CORRECTNESS AND SECURITY ANALYSIS

In this section, we first show that the proposed protocol is correct in the sense that the server can pass the verification of data integrity as long as both the client and the server are honest. Then we show that the protocol is secure against the untrusted server. These two theorems together guarantee that, assuming the client is honest, if and only if the server has access to the complete and uncorrupted data, it can pass the verification process successfully. Finally, we show that the proposed protocol is private against third-party verifiers.

Theorem 1. If both the client and the server are honest, then the server can pass the verification successfully.

Proof. We prove this theorem by showing that $R$ and $R'$ should be equal if all the data blocks are kept completely at the server. From the TagGen function, we get that $D_i = (g^{m_i}) \bmod N$, $i \in [1, n]$. Then, we get

\[
P = \prod_{i=1}^{n} \left(D_i^{a_i} \bmod N\right) \bmod N = g^{\sum_{i=1}^{n} a_i m_i} \bmod N.
\]

Then,

\[
R' = P^s \bmod N = g^{s \sum_{i=1}^{n} a_i m_i} \bmod N = (g_s)^{\sum_{i=1}^{n} a_i m_i} \bmod N = R.
\]

This completes the proof. □

Before we proceed to Theorem 2, we first review the KEA1-r assumption, which has been investigated in [17], [18], and adapted to the RSA setting in [3].

Definition 4 (KEA1-r, Knowledge of Exponent Assumption [3]). For any adversary $A$ taking input $(N, g, g^s)$ and returning $(C, Y)$ with $Y = C^s$, there exists an "extractor" $\bar{A}$, which given the same input as $A$ returns $c$ such that $C = g^c$.

Our proof of Theorem 2 needs a lemma from [19] on solving prime factorization when a multiple of $\lambda'(n)$ is known. Denote the prime factorization of $n$ by $p_1^{v_1} \cdots p_t^{v_t}$. Then, $\lambda'(n) = \mathrm{lcm}(p_1 - 1, \ldots, p_t - 1)$, in which lcm denotes least common multiple.

Lemma 1 [19]. Let $g$ be any function that satisfies the following two conditions: 1) $\lambda'(n) \mid g(n)$, 2) $|g(n)| = O(|n|^k)$ for some constant $k$. Then "prime factorization" is polynomial time reducible to $g$. Furthermore, the cost of solving the prime factorization of $n$ is $O(|n|^{k+4} M(|n|))$, in which $M(|n|)$ denotes the cost of multiplying two integers of binary length $|n|$.

Theorem 2. Under the KEA1-r and the large integer factorization assumptions, the proposed protocol is secure against the untrusted server.

Proof. Just as in the security formulation, we denote the adversary by $A$ and the challenger by $B$. What we want to prove is that for any PPT adversary who wins the data possession game on some file blocks, the challenger can construct a knowledge extractor $E$ that extracts these file blocks. Equivalently, if $E$ cannot extract these file blocks, the challenger can break the integer factorization problem.

For the large integer factorization problem, $B$ is given a large integer $N$, which is the product of two large primes $p$ and $q$. Here, $p = 2p' + 1$ and $q = 2q' + 1$. $B$ tries to solve the prime factorization of $N$.

$B$ simulates the protocol environment for $A$ with the following steps:

• Setup. $B$ generates a random generator of $QR_N$. Denote the generator by $g$. $B$ sends $pk = (N, g)$ to $A$.
• Query. $A$ adaptively selects some file blocks $m_i$, $i = 1, 2, \ldots, n$ and queries the verification tags from $B$. $B$ computes a verification tag $D_i = g^{m_i} \bmod N$ for each of these blocks and sends $\{D_i, i = 1, 2, \ldots, n\}$ to $A$. Let $D_m = \{D_1, D_2, \ldots, D_n\}$. $D_m$ is made publicly known to everyone.
• Challenge. $B$ generates a $chal$ for the file blocks $\{m_1, m_2, \ldots, m_n\}$ and sends it to $A$. The generation method is the same as that in the Challenge function described in Section 3. Let $chal = \langle r, g_s \rangle$.
• Forge. $A$ computes a response $R$ to prove the integrity of the requested file blocks.

If CheckProof$(pk, D_m, chal, R) =$ "success," then the adversary has won the game. Note that $A$ is given $(N, g, g_s)$ as input, and outputs $R = (g_s)^{\sum_{i=1}^{n} a_i m_i} \bmod N$, in which $a_i = f_r(i)$ for $i \in [1, n]$. Because $A$ can naturally compute $P = g^{\sum_{i=1}^{n} a_i m_i} \bmod N$ from $D_m$, $P$ is also treated as $A$'s output. So $A$ is given $(N, g, g_s)$ as input, and outputs $(R, P)$ that satisfies $R = P^s$. From the KEA1-r assumption, $B$ can construct an extractor $\bar{A}$, which given the same input as $A$, outputs $c$ which satisfies $P = g^c \bmod N$. As $P = g^{\sum_{i=1}^{n} a_i m_i} \bmod N$, $B$ extracts $c = \sum_{i=1}^{n} a_i m_i \bmod p'q'$.

Now $B$ generates $n$ challenges $\langle r_1, g_{s_1} \rangle, \langle r_2, g_{s_2} \rangle, \ldots, \langle r_n, g_{s_n} \rangle$ using the method described in Section 3. $B$ computes $a_i^j = f_{r_j}(i)$ for $i \in [1, n]$ and $j \in [1, n]$. Because $\{r_1, r_2, \ldots, r_n\}$ are chosen by $B$, now $B$ chooses them so that $\{a_1^j, a_2^j, \ldots, a_n^j\}$, $j = 1, 2, \ldots, n$ satisfy the following equation:

\[
\det \begin{bmatrix}
a_1^1 & a_2^1 & \cdots & a_n^1 \\
a_1^2 & a_2^2 & \cdots & a_n^2 \\
\vdots & \vdots & \ddots & \vdots \\
a_1^n & a_2^n & \cdots & a_n^n
\end{bmatrix} \neq 0. \tag{1}
\]

Here $\det[\cdot]$ denotes the determinant of a matrix. $B$ challenges $A$ for $n$ times. On the $j$th time, $B$ challenges $A$ with $\langle r_j, g_{s_j} \rangle$. From the response of $A$, $B$ extracts $c_j = a_1^j m_1 + a_2^j m_2 + \cdots + a_n^j m_n \bmod p'q'$.

When (1) holds, the following system of linear equations has a unique solution.

\[
\begin{cases}
a_1^1 m_1 + a_2^1 m_2 + \cdots + a_n^1 m_n = c_1 \bmod p'q', \\
a_1^2 m_1 + a_2^2 m_2 + \cdots + a_n^2 m_n = c_2 \bmod p'q', \\
\qquad \vdots \\
a_1^n m_1 + a_2^n m_2 + \cdots + a_n^n m_n = c_n \bmod p'q'.
\end{cases} \tag{2}
\]

By solving the above equation set, for each file block $m_i$, $i = 1, 2, \ldots, n$, $B$ gets $m_i^*$ which satisfies $m_i^* = m_i \bmod p'q'$. If for each $m_i^*$, $i = 1, 2, \ldots, n$, $m_i^* = m_i$, then $B$ has successfully extracted all the file blocks $m_i$, $i = 1, 2, \ldots, n$. However, if for some $i$, $m_i^* \neq m_i$, then we show that $B$ can successfully compute the prime factorization of $N$. Without loss of generality, we assume $m_i$ is larger than $m_i^*$. Then, $B$ can get a multiple of $\phi(N)$ from $m_i^* = m_i \bmod p'q'$, which we denote by $k_1 \phi(N)$. From Lemma 1, $B$ can solve the prime factorization of $N$ with the cost of $O((|k_1| + |\phi(N)|)|N|^4 M(|N|))$. Because $A$ is a PPT adversary, the length of $k_1$ is bounded by $O(|N|^{k_2})$ for some constant $k_2$. From the above we can see that if any file block cannot be extracted, then $B$ can construct a knowledge extractor $E$ to extract the prime factorization of $N$ in probabilistic polynomial time.

In conclusion, under the KEA1-r and large integer factorization assumptions, the proposed protocol guarantees the data integrity against an untrusted server. □

Theorem 3 (Privacy against Third-Party Verifiers). Under the semihonest model [16], a third-party verifier cannot get any information about the client's data $m$ from the protocol execution. Hence, the protocol is private against third-party verifiers.

Proof. (Sketch) In this proof, we construct a simulator for the view of the verifier, and show that the output of the simulator is computationally indistinguishable from the view of the verifier. Due to space limitation, we put the detailed proof in the full version of this paper [20]. □

5 DATA DYNAMICS

The proposed protocol supports data dynamics at the block level in the same way as [1]. In the following, we show how our protocol supports block modification. Due to space limitation, we describe the support of block insertion and block deletion in the full version [20].

• Block modification. Assume that the client wants to modify the $i$th block $m_i$ of her file. Denote the modified data block by $m_i^*$. Then the server updates $m_i$ to $m_i^*$. Next, the client computes a new block tag for the updated block, i.e., $D_i^* = g^{m_i^*} \bmod N$.

From the above we can see that the correspondence relationship between the block and the digest does not change after the data updating, i.e., $D_i = g^{m_i} \bmod N$, $i = 1, 2, \ldots, \lceil |m|/l \rceil$. So, the data integrity is still protected. If the client wants to make sure that the file has really been updated, she can launch a proof request immediately by sending a challenge to the server. Any block that is updated is given a novel random number, so that each block remains unique. Therefore, the server cannot delete any block without being detected.

6 COMPLEXITY ANALYSIS AND EXPERIMENTAL RESULTS

In this section, we first present a complexity analysis of the communication, computation, and storage costs of the proposed protocol. After that, we present the experimental results.

6.1 Communication, Computation, and Storage Costs

The communication, computation, and storage costs of the client, the server and the verifier are analyzed and shown in Table 2. The detailed analyses are omitted due to space limitation. Interested readers can refer to [20] for them.

TABLE 2. Upper Bounds of the Communication, Computation, and Storage Costs at the Client, the Server, and the Verifier

For the data dynamics, the computation cost for block insertion or block modification is just one modular exponentiation, which is $T_{exp}(|N|, N)$.

Storage of the block tags. Because the $n$ block tags $D_1, D_2, \ldots, D_n$ are made publicly known to everyone, they can be stored at the server, the client or the verifier. If the tags are stored at the server, then traditional integrity protection methods such as digital signatures can be used to protect them from being tampered with by the server. The storage cost of the block tags is upper bounded by $\lceil |m|/l \rceil |N|$ bits. In this case, when the data integrity checking is performed, the tags are transmitted back to the verifier from the server. As the tags have been signed by the client's private key, the server cannot tamper with them. This will incur communication costs that are linear to the number of blocks. However, because the tags are relatively small compared with the original data, the incurred communication costs are acceptable with respect to all the good features the proposed protocol has. If the tags are stored at the verifier or the client, then these communication costs are mitigated. However, this will cause a storage cost of $O(n)$ at the verifier or the client, which is the same as Sebé et al.'s protocol [1].

6.2 Experimental Results

In the experiment, we measure the computation costs at the verifier and the server when the file length is fixed and the block size is changed. The results are shown in Table 3. After that, we measure the computation costs when the file length changes and the block size is fixed. The results are shown in Table 4. We also measure the client's preprocessing costs, which are shown in Table 5. We choose $k = d = 128$. The proposed protocol is implemented on a laptop with Intel Core2 Duo 2.00 GHz CPU and 1.99 GB memory. All the programs are written in the C++ language with the assistance of MIRACL library [21].

TABLE 3. Computation Costs at the Verifier and the Server with $|N| = 1024$ and $|m| = 4$ MB

TABLE 4. Computation Costs at the Verifier and the Server with Different File Lengths and Fixed Block Size

TABLE 5. The Client's Preprocessing Time with Different Block Lengths when $|N| = 1024$ and $|m| = 4$ MB

From Table 3 we can see that when the file length is $2^{25}$ bits (4 MB) and the block size is $2^{18}$ bits (32 KB), the computation cost at the verifier is 173.39 ms, and the computation cost at the server is 2304.39 ms.

From Table 4, we can see that the computation cost at the server does not increase much when the file length increases. But the computation cost at the verifier increases nearly proportionally with the increasing file length. We note that this is consistent with the theoretical analysis in Section 6.1. When the file length increases, the number of exponential operations at the verifier increases proportionally. However, the server performs only one exponential operation no matter how large the file is. Therefore, the server's burden is not increased much with larger files. As to the verifier's load, our scheme can be efficient when the file length is not huge. However, when the file length is very large, our protocol can be easily extended into a probabilistic one by using the probabilistic framework proposed in [3]. In that case, the extended protocol provides probabilistic data possession guarantee while its other good features are still kept.

7 CONCLUSIONS AND FUTURE WORK

In this paper, we propose a new remote data integrity checking protocol for cloud storage. The proposed protocol is suitable for providing integrity protection of customers' important data. The proposed protocol supports data insertion, modification, and deletion at the block level, and also supports public verifiability. The proposed protocol is proved to be secure against an untrusted server. It is also private against third-party verifiers. Both theoretical analysis and experimental results demonstrate that the proposed protocol has very good efficiency in the aspects of communication, computation, and storage costs.

Currently, we are still working on extending the protocol to support data level dynamics. The difficulty is that there is no clear mapping relationship between the data and the tags. In the current construction, data level dynamics can be supported by using block level dynamics. Whenever a piece of data is modified, the corresponding blocks and tags are updated. However, this can bring unnecessary computation and communication costs. We aim to achieve data level dynamics at minimal costs in our future work.

ACKNOWLEDGMENTS

This work was supported by NSF CNS-0845149, NSF CCF-0915374, and Knowledge Innovation Program of Chinese Academy of Sciences (No. YYYJ-1013). A full version of this paper is available on http://www.cse.buffalo.edu/tech-reports/2010-11.pdf. S. Zhong is the corresponding author for this paper.

REFERENCES

[1] F. Sebé, J. Domingo-Ferrer, A. Martinez-Balleste, Y. Deswarte, and J.-J. Quisquater, "Efficient Remote Data Possession Checking in Critical Information Infrastructures," IEEE Trans. Knowledge and Data Eng., vol. 20, no. 8, pp. 1034-1038, Aug. 2008.
[2] R. Buyya, C.S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, "Cloud Computing and Emerging IT Platforms: Vision, Hype, and Reality for Delivering Computing as the Fifth Utility," Future Generation Computer Systems, vol. 25, no. 6, pp. 599-616, 2009.
[3] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song, "Provable Data Possession at Untrusted Stores," Proc. 14th ACM Conf. Computer and Comm. Security (CCS '07), pp. 598-609, 2007.
[4] R. Curtmola, O. Khan, R. Burns, and G. Ateniese, "MR-PDP: Multiple-Replica Provable Data Possession," Proc. 28th Int'l Conf. Distributed Computing Systems (ICDCS '08), 2008.
[5] G. Ateniese, R. Di Pietro, L.V. Mancini, and G. Tsudik, "Scalable and Efficient Provable Data Possession," Proc. Fourth ACM Int'l Conf. Security and Privacy in Comm. Networks (SecureComm '08), 2008.
[6] C. Erway, A. Küpçü, C. Papamanthou, and R. Tamassia, "Dynamic Provable Data Possession," Proc. 16th ACM Conf. Computer and Comm. Security (CCS '09), pp. 213-222, 2009.
[7] C. Wang, Q. Wang, K. Ren, and W. Lou, "Ensuring Data Storage Security in Cloud Computing," Proc. 17th Int'l Workshop Quality of Service (IWQoS '09), pp. 1-9, July 2009.
[8] Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, "Enabling Public Verifiability and Data Dynamics for Storage Security in Cloud Computing," Proc. 14th European Conf. Research in Computer Security (ESORICS), Sept. 2009.
[9] C. Wang, Q. Wang, K. Ren, and W. Lou, "Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing," Proc. IEEE INFOCOM, Mar. 2010.
[10] Y. Deswarte and J.-J. Quisquater, "Remote Integrity Checking," Proc. Sixth Conf. Integrity and Internal Control in Information Systems (IICIS '04), pp. 1-11, 2004.
[11] D.L.G. Filho and P.S.L.M. Barreto, "Demonstrating Data Possession and Uncheatable Data Transfer," Cryptology ePrint Archive, Report 2006/150, http://eprint.iacr.org/, 2006.
[12] M.A. Shah, M. Baker, J.C. Mogul, and R. Swaminathan, "Auditing to Keep Online Storage Services Honest," Proc. 11th USENIX Workshop Hot Topics in Operating Systems (HOTOS), 2007.
[13] C. Wang, S.S.-M. Chow, Q. Wang, K. Ren, and W. Lou, "Privacy-Preserving Public Auditing for Secure Cloud Storage," Cryptology ePrint Archive, Report 2009/579, http://eprint.iacr.org/, 2009.
[14] Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S.S. Yau, "Cooperative Provable Data Possession," Cryptology ePrint Archive, Report 2010/234, http://eprint.iacr.org/, 2010.
[15] Z. Hao and N. Yu, "A Multiple-Replica Remote Data Possession Checking Protocol with Public Verifiability," Proc. Second Int'l Data, Privacy and E-Commerce Symp. (ISDPE '10), 2010.
[16] O. Goldreich, Foundations of Cryptography. Cambridge Univ. Press, 2004.
[17] I. Damgård, "Towards Practical Public Key Systems Secure against Chosen Ciphertext Attacks," Proc. 11th Ann. Int'l Cryptology Conf. Advances in Cryptology (CRYPTO '91), 1992.
[18] M. Bellare and A. Palacio, "The Knowledge-of-Exponent Assumptions and 3-Round Zero-Knowledge Protocols," Proc. Cryptology Conf. Advances in Cryptology (CRYPTO '04), pp. 273-289, 2004.
[19] G.L. Miller, "Riemann's Hypothesis and Tests for Primality," Proc. Seventh Ann. ACM Symp. Theory of Computing (STOC '75), pp. 234-239, 1975.
[20] Z. Hao, S. Zhong, and N. Yu, "A Privacy-Preserving Remote Data Integrity Checking Protocol with Data Dynamics and Public Verifiability," Technical Report 2010-11, SUNY Buffalo CSE Dept., http://www.cse.buffalo.edu/tech-reports/2010-11.pdf, 2010.
[21] Multiprecision Integer and Rational Arithmetic C/C++ Library, http://www.shamus.ie/, 2011.
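
APPENDIX: ILLUSTRATIVE SKETCH

The five functions of Section 3 (SetUp, TagGen, Challenge, GenProof, CheckProof) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' C++/MIRACL implementation. The toy safe primes 1019 and 1187, the choice b = 2 for g = b^2, the use of HMAC-SHA256 as the pseudo-random function f_r, the representation of blocks as plain integers, and all function names are assumptions made for the example; a modulus this small offers no security.

```python
import hashlib
import hmac
import secrets

# Toy safe primes p = 2p'+1, q = 2q'+1 (p' = 509, q' = 593). Assumed values
# for illustration only; the paper's experiments use |N| = 1024 bits.
P, Q = 1019, 1187


def prf(r: int, i: int, d: int = 128) -> int:
    """f_r(i): HMAC-SHA256 keyed with r, truncated to d bits (an assumed PRF)."""
    mac = hmac.new(r.to_bytes(16, "big"), i.to_bytes(8, "big"), hashlib.sha256)
    return int.from_bytes(mac.digest(), "big") >> (256 - d)


def setup():
    """SetUp: pk = (N, g) with g = b^2 mod N for b = 2, and sk = (p, q)."""
    n_mod = P * Q
    return (n_mod, pow(2, 2, n_mod)), (P, Q)


def tag_gen(pk, blocks):
    """TagGen: D_i = g^{m_i} mod N for every block m_i (blocks are integers)."""
    n_mod, g = pk
    return [pow(g, m, n_mod) for m in blocks]


def challenge(pk, k: int = 128):
    """Challenge: returns (r, s, g_s). The verifier sends (r, g_s) and keeps s."""
    n_mod, g = pk
    r = secrets.randbelow(2**k - 1) + 1   # r in [1, 2^k - 1]
    s = secrets.randbelow(n_mod - 1) + 1  # s in Z_N without 0
    return r, s, pow(g, s, n_mod)


def gen_proof(pk, blocks, chal):
    """GenProof (server): R = g_s^(sum_i a_i * m_i) mod N with a_i = f_r(i)."""
    n_mod, _ = pk
    r, g_s = chal
    exponent = sum(prf(r, i) * m for i, m in enumerate(blocks, start=1))
    return pow(g_s, exponent, n_mod)


def check_proof(pk, tags, r, s, response):
    """CheckProof (verifier): P = prod_i D_i^{a_i} mod N, accept iff P^s == R."""
    n_mod, _ = pk
    product = 1
    for i, tag in enumerate(tags, start=1):
        product = product * pow(tag, prf(r, i), n_mod) % n_mod
    return pow(product, s, n_mod) == response


if __name__ == "__main__":
    pk, _sk = setup()
    blocks = [11, 22, 33, 44]
    tags = tag_gen(pk, blocks)
    r, s, g_s = challenge(pk)
    print(check_proof(pk, tags, r, s, gen_proof(pk, blocks, (r, g_s))))
```

Under these assumptions an honest server always passes, which mirrors Theorem 1: both sides compute g raised to s times the same weighted sum. A block modification as in Section 5 amounts to replacing one entry of `tags` with `pow(g, new_block, N)`. Note that the verifier never touches the blocks themselves, only the public tags and its own secret s.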