Improve Distributed Storage System TCO with Host-Managed SMR HDDs

Author(s)/Presenter(s):

Albert Chen

Library Content Type

Presentation

Abstract

Host-managed shingled magnetic recording (HM-SMR) devices write data sequentially to the platter and overlap new tracks on parts of previously written tracks. This results in higher track density and enables higher capacity points. The increase in drive capacity lowers total cost of ownership through fewer devices and servers as well as reduced maintenance, power and cooling costs. HM-SMR devices also have performance advantages: because the host takes responsibility for device state and data placement, it can optimize to reduce tail latency, increase throughput, and manage performance at scale.

However, these advantages come at a cost, as HM-SMR devices have a more complex and restrictive usage model than conventional HDDs. They require hosts to write sequentially, align IOs to device zone boundaries, and actively monitor and set zone states. In addition, host system software and hardware must recognize and support the newly defined HM-SMR device type and its zone management commands. A spectrum of tools is available today to enable HM-SMR in the storage stack, such as SG_IO, libzbc, f2fs and dm-zoned, but they require users to modify their applications or to run a specific kernel version with additional modules. They also cannot be easily containerized or virtualized to fit into today’s software-defined environments. These dependencies and restrictions create confusion, friction and disruption that frustrate both storage vendors and users.

In this talk, we will share our experience in creating a device-friendly storage system that enables applications to use HM-SMR devices without modification and without worrying about kernel dependencies. This independence allows for easy containerization to fit seamlessly into existing workflows and orchestration frameworks. We will demonstrate how an HM-SMR solution with Ceph, Hadoop and Minio can be enabled with just two commands in the command-line interface (CLI). The presentation will introduce a novel row/column architecture and log-structured data layout that minimize IO contention and latency while preventing hot write areas. We will look at real-life examples of how host software with HM-SMR can reduce long-tail latency and increase performance consistency by eliminating device background work and expensive, unnecessary seeks, letting devices perform at their best. Finally, we will discuss benchmark results comparing our solution on HM-SMR drives with legacy filesystems (e.g. xfs and ext4) on CMR drives.
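To make the usage model above concrete, here is a minimal sketch of host-side zone handling using the Linux zoned block device ioctl interface (linux/blkzoned.h). This is not the presenters' tooling, only the kernel facility that such host software can build on; the device path /dev/sdX is a placeholder for an HM-SMR drive, and error handling is abbreviated.

```c
/* Minimal sketch: query a zone's write pointer and reset the zone using
 * the kernel's zoned block device ioctls. /dev/sdX is a placeholder. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/blkzoned.h>

int main(void)
{
    int fd = open("/dev/sdX", O_RDWR | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* Ask the kernel to report the first zone on the device. */
    struct blk_zone_report *rep =
        calloc(1, sizeof(*rep) + sizeof(struct blk_zone));
    rep->sector = 0;     /* start reporting at LBA 0 (512 B sectors) */
    rep->nr_zones = 1;   /* buffer has room for a single zone entry */
    if (ioctl(fd, BLKREPORTZONE, rep) < 0) { perror("report"); return 1; }

    struct blk_zone *z = &rep->zones[0];
    printf("zone: start=%llu len=%llu wp=%llu cond=%u\n",
           (unsigned long long)z->start, (unsigned long long)z->len,
           (unsigned long long)z->wp, (unsigned)z->cond);

    /* Writes to a sequential-write-required zone must land exactly at
     * the write pointer (z->wp); the drive rejects anything else. */

    /* Reset the zone's write pointer so it can be rewritten from the top. */
    struct blk_zone_range range = { .sector = z->start, .nr_sectors = z->len };
    if (ioctl(fd, BLKRESETZONE, &range) < 0) { perror("reset"); return 1; }

    free(rep);
    close(fd);
    return 0;
}
```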

Learning Objectives

How to enable Host-Managed SMR HDDs on distributed systems such as Ceph, Hadoop and Minio
How to enable Host-Managed SMR HDDs without application changes or kernel modifications
What is device-friendly IO? How can systems conform to and take advantage of it?
How to design a SW storage system that is scalable, minimizes IO contention, and prevents hot write areas (see the sketch after this list)
How to design a SW storage system that fits seamlessly into modern virtualization & orchestration frameworks
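For the IO-contention and hot-write-area objective above, the following hypothetical sketch shows one common way such a system can be structured: a log-structured allocator that appends every extent at a zone's write pointer and rotates across several open zones, so writes stay sequential within each zone while no single disk region takes all the traffic. The zone count, zone size, and names here are illustrative assumptions, not the presenters' actual row/column design.

```c
/* Hypothetical log-structured zone allocator: extents are appended at a
 * zone's write pointer and spread round-robin across open zones. */
#include <stdint.h>
#include <stdio.h>

#define NR_OPEN_ZONES 4          /* zones kept open for appends (assumed) */
#define ZONE_SECTORS  524288     /* 256 MiB zones in 512 B sectors (assumed) */

struct zone {
    uint64_t start;              /* first sector of the zone */
    uint64_t wp;                 /* current write pointer */
};

static struct zone open_zones[NR_OPEN_ZONES];
static unsigned next_zone;       /* round-robin cursor */

/* Pick a zone and return the sector where the extent must be written.
 * Appending at wp keeps every write sequential within its zone. */
static uint64_t alloc_extent(uint64_t nr_sectors)
{
    for (unsigned i = 0; i < NR_OPEN_ZONES; i++) {
        struct zone *z = &open_zones[(next_zone + i) % NR_OPEN_ZONES];
        if (z->wp + nr_sectors <= z->start + ZONE_SECTORS) {
            uint64_t sector = z->wp;
            z->wp += nr_sectors;                        /* advance the log head */
            next_zone = (next_zone + i + 1) % NR_OPEN_ZONES;
            return sector;
        }
    }
    return UINT64_MAX;           /* all open zones full: caller opens new ones */
}

int main(void)
{
    for (unsigned i = 0; i < NR_OPEN_ZONES; i++)
        open_zones[i] = (struct zone){ .start = (uint64_t)i * ZONE_SECTORS,
                                       .wp    = (uint64_t)i * ZONE_SECTORS };

    /* Three 1 MiB extents land in three different zones. */
    for (int i = 0; i < 3; i++)
        printf("extent %d -> sector %llu\n", i,
               (unsigned long long)alloc_extent(2048));
    return 0;
}
```

Appending at the write pointer is what keeps the workload compliant with the HM-SMR sequential-write rule, and rotating across zones spreads incoming streams so no zone becomes a hot spot.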