With the increasing popularity of cloud storage, efficiently proving the integrity of data stored at an untrusted server has become an important problem. Authenticated skip lists and rank-based authenticated skip lists (RBASL) have been used to provide support for provable data updates in cloud storage. However, in a dynamic file scenario, an RBASL falls short when updates are not proportional to a fixed block size; such an update, even if small, may result in O(n) block updates for a file with n blocks. To overcome this problem, we introduce FlexList: Flexible Length-Based Authenticated Skip List, and present various optimizations on the four types of skip lists (regular, authenticated, rank-based authenticated, and FlexList). We build such a structure in O(n) time, parallelize this operation, and compute a single proof that answers multiple (non-)membership queries, obtaining efficiency gains of 35%, 35%, and 40% in terms of proof time, energy, and size, respectively. We propose a method of handling multiple updates at once, achieving efficiency gains of up to 60% at the server and 90% at the client side. We also deployed our implementation of FlexDPDP on PlanetLab, demonstrating that FlexDPDP performs comparably to the most efficient static storage scheme while providing dynamic data support.
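To make the block-size problem concrete, here is a minimal Python sketch, not the authors' implementation, of the length-based indexing idea that FlexList builds on: blocks are located by byte offset through cumulative lengths rather than by a fixed block index. The `Block` class, the `locate` helper, and the flat-list scan are illustrative assumptions; a real FlexList performs this search over an authenticated skip list in O(log n).

```python
import hashlib

class Block:
    """Leaf of a toy length-indexed list: one variable-length file block."""
    def __init__(self, data: bytes):
        self.data = data
        self.length = len(data)                      # bytes, not a block count
        self.digest = hashlib.sha256(data).digest()  # per-block authenticator

def locate(blocks, offset):
    """Walk cumulative byte lengths to find the block covering `offset`.
    A FlexList does this search over a skip list, so it takes O(log n)
    hops instead of the O(n) scan shown here."""
    for block in blocks:
        if offset < block.length:
            return block, offset        # block plus the offset inside it
        offset -= block.length
    raise IndexError("offset past end of file")

# With byte-length ranks, inserting a few bytes into one block changes only
# that block (and, in the real skip list, the ranks on its O(log n) search
# path); with fixed-size blocks, the data after the edit would have to be
# re-chunked, touching up to O(n) blocks.
blocks = [Block(b"hello "), Block(b"flexible world"), Block(b"!")]
print(locate(blocks, 8))                # -> second block, internal offset 2
```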
Data-intensive applications require extreme scaling of their underlying storage systems. Such scaling, together with the fact that storage systems must be implemented in actual data centers, increases the risk of data loss from failures of underlying components. Accurate engineering requires quantitatively predicting reliability, but this remains challenging due to the need to account for extreme scale, redundancy scheme type and strength, distribution architecture, and component dependencies. This paper introduces CQSim-R, a tool suite for predicting the reliability of large-scale storage system designs and deployments. CQSim-R includes (a) direct calculations based on an only-drives-fail failure model and (b) an event-based simulator for detailed prediction that handles failures of, and failure dependencies among, arbitrary (drive or non-drive) components. Both are built on a common combinatorial framework for modeling placement strategies. The paper demonstrates CQSim-R using models of common storage systems, including replicated and erasure-coded designs. New results, such as the poor reliability scaling of spread-placed systems and a quantification of the impact of data-center distribution and rack awareness on reliability, demonstrate the usefulness and generality of the tools. Analytical and empirical studies show the tools' soundness, performance, and scalability.
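As an illustration of the kind of direct calculation CQSim-R performs, the following Python sketch, a simplification and not CQSim-R's actual API, compares data-loss probability under spread versus partitioned (copyset) placement in an only-drives-fail model. The function names and parameters are assumptions introduced for this example.

```python
from math import comb

def p_loss_spread(n, r, p):
    """Only-drives-fail model: each of n drives fails independently with
    probability p within one repair window.  Under spread placement (with
    enough chunks, every r-subset of drives hosts some chunk's r replicas),
    any r concurrent failures destroy data, so loss = P(at least r fail)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r, n + 1))

def p_loss_copyset(n, r, p):
    """Partitioned (copyset) placement: drives are split into n // r
    disjoint groups and a chunk's r replicas stay inside one group, so
    loss requires all r drives of some group to fail together."""
    return 1 - (1 - p**r) ** (n // r)

# Illustrates the poor reliability scaling of spread placement: at
# n = 1000 drives, r = 3 replicas, and a 1% per-window failure
# probability, loss is near-certain for spread but small for copysets.
print(p_loss_spread(1000, 3, 0.01))     # ~0.997
print(p_loss_copyset(1000, 3, 0.01))    # ~3.3e-4
```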