
ACM Transactions on Storage (TOS)

Latest Articles

TDDFS: A Tier-Aware Data Deduplication-Based File System

With the rapid increase in the amount of data produced and the development of new types of storage devices, storage tiering continues to be a popular way to achieve a good tradeoff between performance and cost-effectiveness. In a basic two-tier storage system, a storage tier with higher performance and typically higher cost (the fast tier) is used...

NEWS

  • CALL FOR PAPERS
    Special Issue on Computational Storage

  • TOS EiC Professor Sam H. Noh of UNIST named as ACM Distinguished Member
    A complete list of 2017 ACM Distinguished Members can be found here.

  • TOS Editor-in-Chief featured in "People of ACM"
    Sam H. Noh, Editor-in-Chief of ACM Transactions on Storage (TOS), is featured in the periodic series "People of ACM". The full article is available here.
    November 01, 2016

  • ACM Transactions on Storage (TOS) welcomes Sam H. Noh as its new Editor-in-Chief for a 3-year term, effective August 1, 2016.
    Sam H. Noh is a professor and Head of the School of Electrical and Computer Engineering at UNIST (Ulsan National Institute of Science and Technology) in Ulsan, Korea, and a leader in the use of new memory technologies such as flash memory and non-volatile memory in storage.
    - August 01, 2016

Forthcoming Articles

TOS 15:1 - EiC Message

REGISTOR: A Platform for Unstructured Data Processing Inside SSD Storage

This paper presents REGISTOR, a platform for regular expression grabbing inside storage. The main idea of Registor is to accelerate regular expression (regex) search inside the storage device where large data sets are stored, eliminating the I/O bottleneck. A special hardware engine for regex search is designed and integrated into the SSD, where it processes data on the fly during transmission from NAND flash to the host. To make the speed of regex search match the internal bus speed of a modern SSD, the Registor hardware uses a deep pipeline consisting of a file semantics extractor, a matching candidates finder, regex matching units (REMUs), and a results organizer. Furthermore, each stage of the pipeline exploits the maximum parallelism possible. To make Registor readily usable by high-level applications, we have developed a set of APIs and libraries in Linux that allow Registor to process files in the SSD by efficiently recombining separate data blocks into files. A working prototype of Registor has been built in our newly designed NVMe SSD. Extensive experiments and analyses show that Registor achieves high throughput and reduces the I/O bandwidth requirement by up to 97% and CPU utilization by as much as 82% for regex search in large data sets.

Performance and Resource Utilization of FUSE User-Space File Systems

Traditionally, file systems were implemented as part of OS kernels. As the complexity of file systems grew, many new file systems began to be developed in user space. Low performance is considered the main disadvantage of user-space file systems, but the extent of this problem has never been explored systematically. As a result, the topic of user-space file systems remains rather controversial: while some consider user-space file systems a "toy" not to be used in production, others develop full-fledged production file systems in user space. In this article we analyze the design and implementation of the most widely known user-space file system framework, FUSE, for Linux. We then characterize its performance and resource utilization for a wide range of workloads. Our experiments indicate that, depending on the workload and hardware used, throughput degradation caused by FUSE can be completely imperceptible or as high as 83%, even when optimized; latencies of FUSE file system operations can increase by anywhere from nothing to 4x compared to in-kernel Ext4. On the resource utilization side, FUSE can increase relative CPU utilization by up to 31% and underutilize disk bandwidth by as much as 80% compared to Ext4.

Introduction to the Special Issue on ACM International Systems and Storage Conference (SYSTOR) 2018

Lerna: Parallelizing Dependent Loops Using Speculation

We present Lerna, an end-to-end tool that automatically and transparently detects and extracts parallelism from data-dependent sequential loops using speculation combined with a set of techniques including code profiling, dependency analysis, instrumentation, and adaptive execution. Speculation is needed to avoid conservative actions and detect actual conflicts. Lerna targets applications that are hard to parallelize due to data dependencies. Our experimental study involves the parallelization of 13 applications with data dependencies. Results on a 24-core machine show an average speedup of 2.7x for micro-benchmarks and 2.5x for macro-benchmarks.
