Data Deduplication

The Experts below are selected from a list of 2757 Experts worldwide, ranked by the ideXlab platform.

Ali Miri - One of the best experts on this subject based on the ideXlab platform.

  • Secure Textual Data Deduplication Scheme Based on Data Encoding and Compression
    2019 IEEE 10th Annual Information Technology Electronics and Mobile Communication Conference (IEMCON), 2019
    Co-Authors: Ali Miri, Fatema Rashid
    Abstract:

    As the need for storage has grown exponentially in recent years, cloud storage has been providing a solution to this need by offering users expanded capacity and access. Providing adequate security and privacy while lowering storage costs are some of the key challenges facing this solution. A common practice used by cloud service providers (CSPs), Data Deduplication, identifies identical copies of users' Data and removes all but one copy to lower the required storage overhead. However, this can result in serious privacy concerns. In this paper, we formulate a new secure Deduplication scheme for textual Data. Our proposed method uses Data encoding and compression techniques that not only reduce the required storage space, but also save transmission bandwidth. The security of the Data against the semi-honest CSP and malicious users is ensured by using the Burrows-Wheeler Transform encoding scheme. The encoded Data is further compressed to gain effective savings in storage and a reduced Data size. Data encoding and Data compression techniques are combined to realize secure and efficient Data Deduplication. Through our scheme, the CSP will not only achieve large storage space savings through Data compression and Data Deduplication, but can also provide users a satisfactory level of security for their Data in the cloud.
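
    A minimal sketch of the encode-then-compress building blocks named in this abstract: a Burrows-Wheeler Transform followed by run-length encoding. The function names are illustrative, and the paper's key handling, Deduplication indexing, and CSP protocol are not reproduced.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Burrows-Wheeler Transform via sorted rotations (O(n^2 log n) toy)."""
    s = text + sentinel                      # sentinel marks the original end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Compress the BWT output, which tends to contain long character runs."""
    out: list[tuple[str, int]] = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

# Repetitive text produces long runs after the transform, so it compresses well.
print(run_length_encode(bwt("banana bandana banana")))
```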

  • PST - Secure image Data Deduplication through compressive sensing
    2016 14th Annual Conference on Privacy Security and Trust (PST), 2016
    Co-Authors: Fatema Rashid, Ali Miri
    Abstract:

    Data generated and stored worldwide is increasing multifold every year, with images and media content accounting for a large portion of this Data. In addition to volume, ensuring adequate security is an important challenge that needs to be addressed. In this paper, we propose an efficient, secure Data-storage approach based on compressive sensing. Our approach uses Data Deduplication to remove identical copies of Data. Our experimental results show significant storage savings, while providing a strong level of security.
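
    The sketch below shows the compressive-sensing measurement step in isolation, under the usual assumptions (a k-sparse signal, a random Gaussian sensing matrix): the stored object y is much smaller than the original signal. Recovery and the paper's security construction are omitted; all names and parameters are illustrative. Requires numpy.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m, k = 1024, 128, 10          # signal length, measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)  # k-sparse signal

Phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random Gaussian sensing matrix
y = Phi @ x                                   # compressed representation

print(f"stored {y.size} measurements instead of {x.size} samples")
```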

  • FPS - Privacy-Preserving Public Auditing in Cloud Computing with Data Deduplication
    Foundations and Practice of Security, 2015
    Co-Authors: Naelah Alkhojandi, Ali Miri
    Abstract:

    Storage represents one of the most commonly used cloud services. Data integrity and storage efficiency are two key requirements when storing users' Data. Public auditability, where users can employ a Third Party Auditor (TPA) to ensure Data integrity, and efficient Data Deduplication, which can be used to eliminate duplicate Data and their corresponding authentication tags before sending the Data to the cloud, offer possible solutions to address these requirements. In this paper, we propose a privacy-preserving public auditing scheme with Data Deduplication. We also present an extension of our proposed scheme that enables the TPA to perform multiple auditing tasks at the same time. Security and computational analyses for both cases are also presented.
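
    As a hedged illustration of deduplicating blocks and their tags on the client side before upload (not the authors' construction, which uses authentication tags a TPA can audit without the Data), a plain hash can stand in for the tag:

```python
import hashlib

BLOCK_SIZE = 4096
server_store: dict[str, bytes] = {}    # tag -> block (the cloud's view)

def upload(data: bytes) -> list[str]:
    """Return the ordered tag list; ship only previously unseen blocks."""
    tags = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        tag = hashlib.sha256(block).hexdigest()
        if tag not in server_store:    # Deduplication: duplicate blocks
            server_store[tag] = block  # and their tags are never re-sent
        tags.append(tag)
    return tags

manifest = upload(b"A" * 10000 + b"B" * 5000)
print(len(manifest), "blocks referenced,", len(server_store), "stored")
```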

  • CASCON - Proof of retrieval and ownership protocols for enterprise-level Data Deduplication
    2013
    Co-Authors: Fatema Rashid, Ali Miri, Isaac Woungang
    Abstract:

    The cloud computing paradigm is emerging as the next big thing in the world of information technology. Cloud technology offers a completely new set of benefits and savings in terms of computational, storage, bandwidth and transmission costs to its users. Cloud storage represents one of the most popular cloud services. Data Deduplication is a promising practice which saves large volumes of storage by allowing the cloud provider to store only a single copy of duplicated Data. Client-side Data Deduplication offers additional savings in terms of bandwidth and storage. Applying Data Deduplication across enterprises also allows cloud storage providers to deduplicate across users from different domains, providing additional savings. However, some of the advantages of cloud storage may be lost if additional steps are not taken to address some of the security and privacy issues associated with remotely stored Data. Since users outsource their Data to the cloud, they have to ensure the integrity of their Data and its privacy from the cloud storage provider, who now has complete access to it. In this paper, we present a solution for assuring Data integrity in terms of proof of retrievability and ownership in the context of cross-user client-side Data Deduplication for medium- and small-sized enterprises. We propose a secure scheme which enables cloud service users to run their proof of retrievability with minimum storage and computational overheads in the case of honest-but-curious cloud storage providers. At the same time, the cloud storage provider will also be able to save digital storage by practising cross-enterprise Data Deduplication. We extend our scheme to include a proof of ownership scheme to assist the cloud in authenticating the user as the owner of the Data before releasing it. Our scheme does not introduce any additional structural or storage overheads to either of the parties.
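
    A toy rendering of the proof-of-ownership pattern the abstract describes: the cloud, which already holds one deduplicated copy, spot-checks a claimant on randomly chosen blocks. This is the generic challenge-response flavour, not the authors' exact protocol; all names are illustrative.

```python
import hashlib, random

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

# Cloud side: it stores one deduplicated copy of the blocks and keeps
# their hashes from the first upload.
blocks = [bytes([i]) * 64 for i in range(8)]
block_hashes = [h(b) for b in blocks]

# The cloud challenges a claimant on a few randomly chosen block indices.
challenge = random.sample(range(len(blocks)), 3)

# Claimant side: only a party that actually possesses the Data can answer.
response = {i: blocks[i] for i in challenge}

# Cloud side: verify the answers against the stored hashes.
assert all(h(response[i]) == block_hashes[i] for i in challenge)
print("proof of ownership accepted for blocks", challenge)
```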

Huafei Zhu - One of the best experts on this subject based on the ideXlab platform.

  • Private Data Deduplication Protocols in Cloud Storage
    ACM Symposium on Applied Computing, 2012
    Co-Authors: Yonggang Wen, Huafei Zhu
    Abstract:

    In this paper, a new notion which we call a private Data Deduplication protocol, a Deduplication technique for private Data storage, is introduced and formalized. Intuitively, a private Data Deduplication protocol allows a client who holds private Data to prove to a server who holds a summary string of the Data that he/she is the owner of that Data, without revealing further information to the server. Our notion can be viewed as a complement of the state-of-the-art public Data Deduplication protocols of Halevi et al. [7]. The security of private Data Deduplication protocols is formalized in the simulation-based framework in the context of two-party computations. A construction of private Deduplication protocols based on standard cryptographic assumptions is then presented and analyzed. We show that the proposed private Data Deduplication protocol is provably secure in the presence of malicious adversaries, assuming that the underlying hash function is collision-resistant, the discrete logarithm is hard, and the erasure coding algorithm can tolerate erasure of up to an α-fraction of the bits. To the best of our knowledge, this is the first Deduplication protocol for private Data storage.
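
    A toy flavour of the private Deduplication check (not the paper's provably secure construction, which relies on erasure coding and simulation-based security): at first upload the server derives a few challenge/response pairs from the Data, and a later claimant proves possession by answering a fresh challenge, revealing nothing beyond the one response the server already holds.

```python
import hashlib, os

def prf(nonce: bytes, data: bytes) -> bytes:
    return hashlib.sha256(nonce + data).digest()   # toy PRF stand-in

# First upload: the server derives its summary state from the Data and
# afterwards never needs the Data itself to check a claim.
data = b"the private file contents"
challenges = [os.urandom(16) for _ in range(4)]
summary = {nonce: prf(nonce, data) for nonce in challenges}

# Later claim: the server issues one unused nonce; the answer is
# computable only by someone who holds the Data.
nonce = challenges.pop()
assert summary[nonce] == prf(nonce, data)
print("claimant accepted as an owner of the Data")
```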

Yonggang Wen - One of the best experts on this subject based on the ideXlab platform.

  • Private Data Deduplication Protocols in Cloud Storage
    ACM Symposium on Applied Computing, 2012
    Co-Authors: Yonggang Wen, Huafei Zhu
    Abstract:

    In this paper, a new notion which we call a private Data Deduplication protocol, a Deduplication technique for private Data storage, is introduced and formalized. Intuitively, a private Data Deduplication protocol allows a client who holds private Data to prove to a server who holds a summary string of the Data that he/she is the owner of that Data, without revealing further information to the server. Our notion can be viewed as a complement of the state-of-the-art public Data Deduplication protocols of Halevi et al. [7]. The security of private Data Deduplication protocols is formalized in the simulation-based framework in the context of two-party computations. A construction of private Deduplication protocols based on standard cryptographic assumptions is then presented and analyzed. We show that the proposed private Data Deduplication protocol is provably secure in the presence of malicious adversaries, assuming that the underlying hash function is collision-resistant, the discrete logarithm is hard, and the erasure coding algorithm can tolerate erasure of up to an α-fraction of the bits. To the best of our knowledge, this is the first Deduplication protocol for private Data storage.

Sudipta Sengupta - One of the best experts on this subject based on the ideXlab platform.

  • Primary Data Deduplication - Large Scale Study and System Design
    USENIX Annual Technical Conference, 2012
    Co-Authors: Ahmed El-Shimi, Ran Kalach, Ankit Kumar, Adi Oltean, Jin Li, Sudipta Sengupta
    Abstract:

    We present a large-scale study of primary Data Deduplication and use the findings to drive the design of a new primary Data Deduplication system implemented in the Windows Server 2012 operating system. File Data was analyzed from 15 globally distributed file servers hosting Data for over 2000 users in a large multinational corporation. The findings are used to arrive at a chunking and compression approach which maximizes Deduplication savings while minimizing the generated metadata and producing a uniform chunk size distribution. Scaling of Deduplication processing with Data size is achieved using a RAM-frugal chunk hash index and Data partitioning, so that memory, CPU, and disk seek resources remain available to fulfill the primary workload of serving I/O. We present the architecture of the new primary Data Deduplication system and evaluate the Deduplication performance and chunking aspects of the system.
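
    A minimal content-defined chunking sketch in the spirit of the paper's chunking stage, as one might prototype it: a byte-wise running hash declares a chunk boundary whenever its low bits are zero, so boundaries tend to survive inserts and deletes. The paper's actual chunking algorithm, compression heuristics, and RAM-frugal index are not reproduced; all parameters here are illustrative.

```python
import os

MASK = (1 << 13) - 1                 # boundary ~ every 8 KiB on average
MIN_CHUNK, MAX_CHUNK = 2048, 65536   # guard rails on the chunk size

def chunk(data: bytes):
    start, rolling = 0, 0
    for i in range(len(data)):
        rolling = ((rolling << 1) ^ data[i]) & 0xFFFFFFFF   # toy running hash
        if i - start + 1 < MIN_CHUNK:
            continue
        if (rolling & MASK) == 0 or i - start + 1 >= MAX_CHUNK:
            yield data[start:i + 1]
            start, rolling = i + 1, 0
    if start < len(data):
        yield data[start:]

sizes = [len(c) for c in chunk(os.urandom(1_000_000))]
print(len(sizes), "chunks, average size", sum(sizes) // len(sizes))
```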

Tao Jiang - One of the best experts on this subject based on the ideXlab platform.

  • Secure Cloud Data Deduplication with Efficient Re-encryption
    IEEE Transactions on Services Computing, 2019
    Co-Authors: Haoran Yuan, Xiaofeng Chen, Tao Jiang, Jianfeng Wang, Robert H. Deng
    Abstract:

    The Data Deduplication technique has been widely adopted by commercial cloud storage providers, and is important in coping with the explosive growth of Data. To further protect the security of users' sensitive Data in the outsourced storage mode, many secure Data Deduplication schemes have been designed and applied in various scenarios. Among these schemes, secure and efficient re-encryption for encrypted Data Deduplication has attracted the attention of many scholars, and many solutions have been designed to support dynamic ownership management. In this paper, we focus on the re-encryption Deduplication storage system and show that the recently designed lightweight rekeying-aware encrypted Deduplication scheme is vulnerable to the stub-reserved attack. Furthermore, we propose a secure Data Deduplication scheme with efficient re-encryption based on the convergent all-or-nothing transform (CAONT) and randomly sampled bits from a Bloom filter. Due to the properties of the hash function, our scheme can resist the stub-reserved attack and guarantee the privacy of Data owners' sensitive Data. Moreover, instead of re-encrypting the entire package, Data owners are only required to re-encrypt a small part of it through the CAONT, thereby effectively reducing the computation overhead. Finally, security analysis and experimental results show that our scheme is secure and efficient in re-encryption.
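
    A hedged sketch of the convergent all-or-nothing idea underlying CAONT: a package key is derived from the Data itself, every block is masked with it, and the key is then hidden inside a final stub block, so re-encrypting just the stub effectively re-keys the whole file. SHA-256 stands in as the mask generator; the paper's exact CAONT and Bloom-filter bit sampling are not reproduced, and all names are illustrative.

```python
import hashlib

BLK = 32   # block size = SHA-256 output size, for a compact toy

def mask(key: bytes, i: int) -> bytes:
    return hashlib.sha256(key + i.to_bytes(4, "big")).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def caont_encode(data: bytes) -> list[bytes]:
    data = data.ljust(-(-len(data) // BLK) * BLK, b"\0")  # pad to block size
    key = hashlib.sha256(data).digest()                   # convergent key
    blocks = [xor(data[i:i + BLK], mask(key, i // BLK))
              for i in range(0, len(data), BLK)]
    digest = hashlib.sha256(b"".join(blocks)).digest()
    return blocks + [xor(key, digest)]     # key hidden in the final stub

# Deterministic: identical inputs give identical packages (dedupable),
# and no block is recoverable without the stub.
pkg = caont_encode(b"sensitive file contents")
print(len(pkg), "blocks in the package")
```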

  • DedupDUM: Secure and scalable Data Deduplication with dynamic user management
    Information Sciences, 2018
    Co-Authors: Haoran Yuan, Xiaofeng Chen, Tao Jiang, Xiaoyu Zhang, Zheng Yan, Yang Xiang
    Abstract:

    Data Deduplication in the cloud enables cloud servers to store a single copy of Data and eliminate redundant copies, saving storage space and network bandwidth. Recently, many research works concerning the privacy-preserving problem of dynamic ownership management in the secure Data Deduplication setting have been published. However, to our knowledge, the existing schemes are not efficient when cloud users join and are revoked frequently, especially in the absence of a trusted third party in practical cloud storage systems. In this paper, we propose a secure and scalable Data Deduplication scheme with dynamic user management, which updates dynamic group users in a secure way and restricts unauthorized cloud users from accessing the sensitive Data owned by valid users. To further mitigate the communication overhead, a pre-verified access control technique is adopted, which prevents unauthorized cloud users from downloading Data. In other words, our scheme also ensures that only valid cloud users are able to download and decrypt the ciphertext from the cloud server. All this reduces the communication overhead of our scheme's implementation.
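
    A toy rendering of the pre-verified access control idea: the cloud checks group membership before serving any ciphertext, so revoked or unauthorized users cost no download bandwidth. All names are illustrative, and the paper's key-update and re-encryption machinery is not shown.

```python
authorized: dict[str, set[str]] = {      # file id -> current valid users
    "file-42": {"alice", "bob"},
}
ciphertexts = {"file-42": b"<encrypted payload>"}

def download(user: str, file_id: str) -> bytes:
    """Pre-verification: no Data is sent unless the user is still valid."""
    if user not in authorized.get(file_id, set()):
        raise PermissionError("pre-verification failed")
    return ciphertexts[file_id]

def revoke(user: str, file_id: str) -> None:
    authorized[file_id].discard(user)    # dynamic ownership change

print(download("alice", "file-42"))
revoke("bob", "file-42")                 # bob can no longer download
```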

  • Secure and Efficient Cloud Data Deduplication with Ownership Management
    IEEE Transactions on Services Computing, 2017
    Co-Authors: Shunrong Jiang, Tao Jiang, Liangmin Wang
    Abstract:

    Data Deduplication has been widely used in cloud storage to reduce storage space and communication overhead by eliminating redundant Data and storing only one copy of it. In order to achieve secure Data Deduplication, the convergent encryption scheme and many of its variants have been proposed. However, most of these schemes do not consider, or cannot address, efficient dynamic ownership changes and secure Proof-of-Ownership (PoW) simultaneously. In this paper, we propose a secure Data Deduplication scheme with an efficient PoW process for dynamic ownership management. Specifically, our scheme supports both cross-user file-level and inside-user block-level Data Deduplication. During file-level Deduplication, we construct a new PoW scheme to ensure tag consistency and achieve mutual ownership verification. Moreover, we design a lazy update strategy to achieve efficient ownership management. For inside-user block-level Deduplication, a user-aided key is used to realize convergent key management and reduce the key storage space. Finally, the security and performance analysis demonstrates that our scheme ensures Data confidentiality and tag consistency, and that it is efficient in Data ownership management.
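
    The classic convergent encryption pattern this abstract builds on can be sketched as follows: the key is derived from the Data itself, so identical plaintexts from different users yield identical ciphertexts and tags, and can be deduplicated without the cloud seeing the plaintext. The fixed nonce is tolerable here only because each convergent key encrypts exactly one message; the paper's PoW and lazy-update machinery are omitted. Requires the third-party `cryptography` package.

```python
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def convergent_encrypt(data: bytes) -> tuple[bytes, str]:
    key = hashlib.sha256(data).digest()          # convergent key K = H(M)
    ct = AESGCM(key).encrypt(b"\0" * 12, data, None)
    tag = hashlib.sha256(ct).hexdigest()         # Deduplication tag T = H(C)
    return ct, tag

ct1, tag1 = convergent_encrypt(b"same file")     # user A
ct2, tag2 = convergent_encrypt(b"same file")     # user B
assert ct1 == ct2 and tag1 == tag2               # dedupable ciphertexts
print("tag:", tag1)
```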

  • Secure and Efficient Cloud Data Deduplication With Randomized Tag
    IEEE Transactions on Information Forensics and Security, 2017
    Co-Authors: Tao Jiang, Xiaofeng Chen, Willy Susilo, Wenjing Lou
    Abstract:

    Cross-client Data Deduplication has been widely used to eliminate redundant storage overhead in cloud storage systems. Recently, Abadi et al. introduced the primitive MLE2, with nice security properties, for secure and efficient Data Deduplication. However, besides computationally expensive non-interactive zero-knowledge proofs, their fully randomized scheme (R-MLE2) requires an inefficient equality-testing algorithm to identify all duplicate ciphertexts. Thus, an interesting and challenging problem is how to reduce the overhead of R-MLE2 with an efficient construction. In this paper, we introduce a new primitive called μR-MLE2, which gives a partial positive answer to this challenging problem. We propose two schemes: a static scheme and a dynamic scheme, where the latter allows tree adjustment at the cost of some additional computation. Our main trick is to use an interactive protocol based on static or dynamic decision trees. The advantage gained is that, by interacting with clients, the server reduces the time complexity of the Deduplication equality test from linear to logarithmic over the whole set of Data items in the Database. The security analysis and performance evaluation show that our schemes are Path-PRV-CDA2 secure and achieve several orders of magnitude higher performance for the Data equality test than the R-MLE2 scheme when the number of Data items is relatively large.
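
    A hedged sketch of the interactive logarithmic lookup idea: rather than equality-testing a new item against every stored one, the server walks a binary decision tree and, in each round, asks the client for the next bit derived from its Data, so a lookup takes O(log n) rounds over n stored items. The real μR-MLE2 derives these bits cryptographically; plain hash bits and the names below are illustrative.

```python
import hashlib

def data_bits(data: bytes) -> list[int]:
    """256 lookup bits derived from the Data (stand-in for the client PRF)."""
    digest = hashlib.sha256(data).digest()
    return [(byte >> k) & 1 for byte in digest for k in range(8)]

class Node:
    def __init__(self):
        self.children = {}   # bit -> Node
        self.item = None     # Data stored at a leaf

def lookup_or_insert(root: Node, data: bytes) -> bool:
    """Walk the tree one client-revealed bit per round; True = duplicate."""
    bits = data_bits(data)
    node, depth = root, 0
    while node.children:                     # interactive descent
        node = node.children.setdefault(bits[depth], Node())
        depth += 1
    if node.item is None:                    # empty leaf: store here
        node.item = data
        return False
    if node.item == data:                    # one equality test, not n
        return True
    old, node.item = node.item, None         # split: push both items down
    old_bits = data_bits(old)
    while old_bits[depth] == bits[depth]:    # shared prefix extends the path
        node = node.children.setdefault(bits[depth], Node())
        depth += 1
    node.children.setdefault(old_bits[depth], Node()).item = old
    node.children.setdefault(bits[depth], Node()).item = data
    return False

root = Node()
print(lookup_or_insert(root, b"file-1"))   # False: first copy stored
print(lookup_or_insert(root, b"file-2"))   # False: distinct Data
print(lookup_or_insert(root, b"file-1"))   # True: duplicate found
```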