A Large Cluster Architecture with Efficient Caching Coherency, Intelligent Management, and High Performance for a Low-Cost Storage Node | SNIA

Abstract

A data cache coherency model built on logical unit ownership within a cluster of storage nodes can be optimized for ultra-low latency, even on low-cost storage hardware that lacks a high-speed interconnect between nodes. It does so by limiting optimal I/O access to any one logical unit to a single storage node with a local high-speed data cache. This in turn implies the use of Asymmetric Logical Unit Access (ALUA), and providing resilient, fault-tolerant access to shared storage at the logical unit level to initiators in a multi-node compute cluster is difficult under ALUA. Balancing the I/O workload is more complex than in a symmetric active/active model, where I/O is routed equally among all storage nodes in the cluster. Handling SAN connectivity faults that require I/O to be rerouted is also complex, and the associated ALUA state changes may cause spikes in I/O response time. ALUA states may thrash on any one logical unit if multiple compute nodes in the cluster disagree about the optimal paths to that logical unit. We have demonstrated an architecture for a low-cost storage node that uses implicit ALUA and intelligent management of logical unit ownership to implement a highly efficient data cache coherency model and a lean I/O path while solving many of the issues that plague ALUA solutions.
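
The ownership idea in the abstract can be pictured with a small sketch. It is not the authors' implementation, which the abstract does not describe. It is a minimal Python model that assumes each logical unit has exactly one owning storage node, that paths through the owner are reported as active/optimized and all other paths as active/non-optimized, and that the storage side alone changes ownership (the "implicit" part of implicit ALUA). The names `StorageCluster`, `assign`, `path_state`, and `fail_node`, as well as the "least-loaded survivor" failover policy, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class AluaState(Enum):
    """Two of the ALUA access states defined by SCSI; others are omitted here."""
    ACTIVE_OPTIMIZED = "active/optimized"
    ACTIVE_NON_OPTIMIZED = "active/non-optimized"


@dataclass
class StorageCluster:
    """Hypothetical model: one owning node per logical unit (LUN)."""
    nodes: list[str]
    owner: dict[str, str] = field(default_factory=dict)  # LUN -> owning node

    def assign(self, lun: str, node: str) -> None:
        """Make `node` the sole owner of `lun`; only its cache serves optimal I/O."""
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.owner[lun] = node

    def path_state(self, lun: str, node: str) -> AluaState:
        """State the cluster would report for paths to `lun` through `node`."""
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        if self.owner[lun] == node:
            return AluaState.ACTIVE_OPTIMIZED
        return AluaState.ACTIVE_NON_OPTIMIZED

    def fail_node(self, failed: str) -> None:
        """Assumed policy: move each orphaned LUN to the least-loaded survivor."""
        self.nodes.remove(failed)
        if not self.nodes:
            raise RuntimeError("no surviving storage nodes")
        for lun in [l for l, n in self.owner.items() if n == failed]:
            load = Counter(self.owner.values())
            self.owner[lun] = min(self.nodes, key=lambda n: load[n])


if __name__ == "__main__":
    cluster = StorageCluster(nodes=["node-a", "node-b"])
    cluster.assign("lun0", "node-a")
    cluster.assign("lun1", "node-b")
    print(cluster.path_state("lun0", "node-a").value)
    print(cluster.path_state("lun0", "node-b").value)
    cluster.fail_node("node-a")
    print(cluster.owner["lun0"])
```

Because ownership changes happen in one place on the storage side in this sketch, initiators never compete to set path states. That is one plausible reading of how implicit ALUA could avoid the state thrash described above, offered here as an assumption and not as a claim about the demonstrated architecture.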