Can We Group Storage? Statistical Techniques to Identify Predictive Groupings in Storage System Accesses

Note from Editor-in-Chief

I am delighted to announce that ACM Transactions on Storage is now being indexed by Thomson-Reuters through the Science Citation Index Expanded (SciSearch), Journal Citation Reports/Science Edition, and Current Contents/Engineering Computing and Technology. This is in recognition of the quality of the articles appearing in Transactions on Storage, and I would like to extend my gratitude and appreciation to all of the authors who have made this possible. As the old adage says, "a rising tide raises all boats": the increased visibility of Transactions on Storage increases the impact of your article and will attract even higher quality articles to our journal.

I want to again thank all of the authors who have made this possible, and express my hope that the growing impact of Transactions on Storage will continue. Computer Science is now a mature discipline, and the importance of journal publication is increasing. By sending your best work to Transactions on Storage you contribute to the field and to the standing of this journal.

Forthcoming Articles

H-Scale: A Fast Approach to Scale Disk Arrays via Hybrid Stripe Deployment

SWANS: An Inter-Disk Wear-Leveling Strategy for RAID-0 Structured SSD Arrays

Efficient Deduplication in a Distributed Primary Storage Infrastructure

A large amount of duplicate data typically exists across volumes of virtual machines in cloud computing infrastructures. Deduplication allows reclaiming these duplicates while improving the cost-effectiveness of large-scale multi-tenant infrastructures. However, traditional archival and backup deduplication systems impose prohibitive storage overhead for virtual machines hosting latency-sensitive applications. Primary deduplication systems reduce such penalty but rely on special cluster filesystems, centralized components, or restrictive workload assumptions. Also, some of these systems reduce storage overhead by confining deduplication to off-peak periods that may be scarce in a cloud environment. We present DEDIS, a dependable and fully decentralized system that performs cluster-wide background deduplication of virtual machines' primary volumes. DEDIS works on top of any unsophisticated storage backend, centralized or distributed, as long as it exports a basic shared block device interface. Also, DEDIS does not rely on data locality assumptions and incorporates novel optimizations for reducing deduplication overhead and increasing its reliability. The evaluation of the DEDIS open-source prototype shows that minimal I/O overhead is achievable even when deduplication and intensive storage I/O are executed simultaneously. Also, our design scales out and allows collocating DEDIS components and virtual machines on the same servers, thus sparing the need for additional hardware.

Does RAID Improve Lifetime of SSD Arrays?

LDM: Log Disk Mirroring with Improved Performance and Reliability for SSD-based Disk Arrays

Economic forces, driven by the desire to introduce flash-based Solid State Drives (SSDs) into the high-end storage market, have resulted in hybrid storage systems in the Cloud. However, a single flash-based SSD cannot satisfy the performance, reliability, and capacity requirements of enterprise or HPC storage systems in the Cloud. While an array of SSDs organized in a RAID structure, such as RAID5, provides the potential for high storage capacity and bandwidth, reliability and performance problems will likely result from the parity update operations. In this paper, we propose a Log Disk Mirroring scheme (LDM for short) to improve the performance and reliability of SSD-based disk arrays. LDM is a hybrid disk array architecture that consists of several SSDs and two hard disk drives (HDDs). Our prototype implementation of the LDM array and the performance evaluations show that the LDM array significantly outperforms pure SSD-based disk arrays by a factor of 20.4 on average, and outperforms HPDA by a factor of 5.0 on average. The reliability analysis shows that the MTTDL of the LDM array is 2.7 times and 1.7 times better than that of pure SSD-based disk arrays and the HPDA disk array, respectively.

MultiLanes: Providing Virtualized Storage for OS-level Virtualization on Many Cores

OS-level virtualization is an efficient method for server consolidation. However, the sharing of kernel services among the co-located virtualized environments (VEs) causes performance interference among them. In particular, interference effects within the shared I/O stack lead to severe performance degradation on many-core platforms incorporating fast storage technologies (e.g., non-volatile memories). This article presents MultiLanes, a virtualized storage system for OS-level virtualization on many cores. MultiLanes builds an isolated I/O stack on top of a virtualized storage device for each VE to eliminate contention on kernel data structures and locks between them, thus scaling them to many cores. Moreover, the overhead of storage device virtualization is tuned to be negligible so that MultiLanes can deliver competitive performance against Linux. Apart from scalability, MultiLanes also delivers flexibility and security to all the VEs, as the virtualized storage device allows each VE to run its own guest file system. The evaluation of our prototype system built for Linux container (LXC) on a 32-core machine with both a RAM disk and a flash-based SSD demonstrates that MultiLanes scales much better than Linux in micro- and macro-benchmarks, bringing significant performance improvements.

Classifying Data to Reduce Long Term Data Movement in Shingled Write Disks

Improving Flash-based Disk Cache with Lazy Adaptive Replacement

Efficient Memory-mapped I/O on Fast Storage Device

In operating systems, memory-mapped I/O (mmio) is an important access method that maps data into memory. When mmio is used, hot data reside in memory and cold data reside on the storage device, and data placement depends on the virtual memory subsystem of the operating system. Since the performance of storage has a direct impact on the performance of mmio, it is widely expected that better storage will lead to better performance. However, this expectation is only partly met when fast storage is used, because the virtual memory subsystem does not reflect the features of fast storage. In this article, we examine the mmio path to determine the influence of fast storage. We find that the overhead of the virtual memory subsystem, negligible on the HDD, prevents applications from using the full performance of fast storage. We present several optimization techniques and modify the Linux kernel to implement those techniques. Experimental results show that our optimized mmio has up to 7x better performance than the original. Compared with a system that has enough memory to keep all data, we achieve 92% of the performance of the resource-rich system. This implies that our system can effectively extend the main memory with fast storage.

TrueErase: Leveraging an Auxiliary Data Path for Per-file Secure Deletion

One important aspect of privacy is the ability to securely delete sensitive data from electronic storage in such a way that it cannot be recovered; we call this action secure deletion. Short of physically destroying the entire storage medium, existing software secure-deletion solutions tend to be piecemeal at best: they may only work for one type of storage or file system, may force the user to delete all files instead of selected ones, may require the added complexities of encryption and key storage, may require extensive changes and additions to the computer's operating system or storage firmware, and may not handle system crashes gracefully. We present TrueErase, a holistic secure-deletion framework for individual systems that contain sensitive data. Through design, implementation, verification, and evaluation on both a hard drive and NAND flash, TrueErase shows that it is possible to construct a per-file, secure-deletion framework that can accommodate different storage media and legacy file systems, require limited changes to legacy systems, and handle common crash scenarios. TrueErase can serve as a building block for cryptographic systems that securely delete information by erasing encryption keys. The overhead is dependent on spatial locality, number of sensitive files, and workload (computational- or I/O-bound).

Storage Workload Identification

Workload identification is an important problem for cloud providers to solve because 1) providers can leverage this information to co-locate similar workloads in order to make the system more predictable, and 2) providers can identify workloads and subsequently give guidance to the subscribers as to associated best practices for provisioning those workloads. Historically, people have identified workloads by looking at their read/write ratios, random/sequential ratios, block size, and inter-arrival frequency. Researchers are aware that workload characteristics change over time and that one cannot just take a point-in-time view of a workload because that will incorrectly characterize workload behavior. Increasingly, manual detection of a workload signature is becoming harder because 1) it is difficult for a human to detect a pattern, and 2) representing a workload signature by a tuple consisting of average values for each of the signature components leads to a large error. In this paper, we present a workload signature detection and matching algorithm that is able to correctly identify workload signatures and match them with other similar workload signatures. We have tested our algorithm on nine different workloads generated using publicly available traces and on real customer workloads running in the field to show the robustness of our approach.

TrueErase: Leveraging an Auxiliary Data Path for Per-file Secure Deletion

One important aspect of privacy is the ability to securely delete sensitive data from electronic storage in such a way that it cannot be recovered; we call this action secure deletion. Short of physically destroying the entire storage medium, existing system-based secure-deletion solutions tend to be piecemeal at best: they may only work for one type of storage or file system, may force the user to delete all files instead of selected ones, may require the added complexities of encryption and key storage, may require extensive changes and additions to the computer's operating system or storage firmware, and may not handle system crashes gracefully. NAND flash storage is particularly troublesome, because its storage firmware makes it nearly impossible for the operating system to securely delete files and/or verify that the files have indeed been erased. We present TrueErase, a holistic secure-deletion framework suitable for individual systems that contain sensitive data. Through its design, implementation, verification, and evaluation on both a hard drive and NAND flash, TrueErase shows that it is possible to construct a per-file, encryption-free, secure-deletion framework that can accommodate different storage media and legacy file systems, require limited changes to legacy systems, and handle common crash scenarios.

Internal Parallelism of Flash Memory based Solid State Drives

Flash memory based solid state drives (SSDs) have shown great potential to change today's storage infrastructure fundamentally. A unique merit of an SSD is its internal parallelism. In this paper we present a comprehensive study on understanding and exploiting internal parallelism of SSDs for high-speed data processing. Through extensive experiments and thorough analysis, we find that exploiting internal parallelism can not only substantially improve I/O performance (e.g., 7.2x) but also lead to many surprising side effects and dynamics, which have strong implications for system and application designers. Based on these findings, we also present a set of case studies in database management systems, a typical data-intensive application, and show that exploiting internal parallelism can substantially improve system performance; at the same time, it changes the equation for optimizing application performance and calls for a careful reconsideration of various design choices.

Efficient Dynamic Provable Possession of Remote Data via Update Trees

A User-Friendly Log Viewer for Storage Systems

For customers with remote support, the system collects and transmits logs to a central enterprise repository, where these are monitored for alerts, problem forecasting, and troubleshooting. Very large log files limit the interpretability for the support engineers. For an engineer, a large volume of log messages may not pose any problem. Often it is desired to present the log messages in a comprehensive manner where a person can view the important messages first and then go into details if required. In this paper, we present a user-friendly log viewer where we first hide the unimportant messages from the log file. Messages with low utility are considered inconsequential, as their removal does not impact the end user for purposes such as problem forecasting or troubleshooting. We relate the utility of a message to the probability of its appearance in the due context. We present machine learning based techniques that compute the usefulness of individual messages in a log file. (30% to 55%), with minimal error rates (7% to 20%). When limited user feedback is available, we show modifications to the technique to learn the user intent and accordingly further reduce the error.


