Conference

FAST’17

Authors

Alex Conway, Ainesh Bakshi, Yizheng Jiao, Yang Zhan, Michael A. Bender, William Jannen, Rob Johnson, Bradley C. Kuszmaul, Donald E. Porter, Jun Yuan, Martin Farach-Colton

Summary

Fragmented, or age

Fragmentation occurs when logically continuous file blocks become scattered on disk. Reading these files requires additional seeks. On HDDs, a few seeks can have an outsized effect on performance because of mechanical seeks. Even on SSDs, which do not perform mechanical seeks, the fragmentation can harm performance.

A framework for aging

  • Natural transfer size (NTS): the amount of sequential data that must be transferred per I/O in order to obtain some fixed fraction of maximum throughput. They conclude that a reasonable NTS for both the SSDs and HDDs they measured is 4MiB.
  • B-trees: The aging profile of a B-tree depends on the leaf size. If the leaves are much smaller than the NTS, then the B-tree will age as the leaces are split and merged, and thus moved around on the storage devices. Making leaves as large as the NTS increases write amplification.
  • Write-Once or Update-in-Place File systems: Sequential order is very difficult to maintain. Delayed allocation is an attempt to solve the problem.
  • $B^\epsilon$-trees: $B^\epsilon$-trees batch changes to the file system in a sequence of cascading logs, one per node of the tree.

Fragmentation microbenchmarks

  • Intrafile fragmentation: ext4 both tries to keep files and eventually larger file fragments sequential, whereas Btrfs and F2FS interleace the round robin chunks on the end of the sequential data. ext4 and XFS manage to keep the files in larger extents. ZFS keeps the files in multiple chunks through the test. No fragmentation occurs on BetrFS, because this small amount of data fits entirely in a leaf.
  • Interfile fragmentation: On HDD, all file systems except BetrFS and XFS show a precipitous performance decline even if only a small percentage of the files are copied out of order. On SSD, BetrFS is the only file system with stable fast performance.

How to prevent aging

  • Rewrite to keep related data in large blocks
  • Batch updates to avoid too much writing

Contributions

  • They give a simple, fast and portable method for aging file systems
  • They show that fragmentation over time is a first-order performance concern, and that this is true even on modern hardware, such as SSDs, and on modern file systems.
  • They show that aging is not inevitable