RPMP: A Remote Persistent Memory Pool to Accelerate Data Analytics and AI

webinar

Author(s)/Presenter(s):

Jian Zhang

Library Content Type

Presentation

Library Release Date

Focus Areas

Abstract

Persistent memory is a new class of memory and storage technology that offers high performance, high capacity, and data persistence at lower cost, bridging the performance and cost gap between DRAM and SSDs. Persistent memory has a broad range of usage scenarios in data analytics and AI workloads; however, remote access to persistent memory poses many challenges for persistent memory applications. RDMA is an attractive technology for remote memory access. It uses RDMA network cards to offload data movement from the CPU to each system’s network adapter, which improves application performance and utilization and enables applications to take full advantage of persistent memory devices. In this work, we propose RPMP, an innovative distributed storage system that uses persistent memory as the storage medium with a key-value storage engine, an efficient RDMA-powered network messenger as the network layer, and a consistent hashing algorithm to provide configurable data availability and durability to upper-level applications. RPMP provides low-level key-value APIs that make it suitable for performance-critical applications. It also implements several optimizations, including a circular buffer to improve write performance and a technique that combines persistent memory with RDMA memory regions to improve read performance. We will also present experimental results: micro-benchmark performance of the key-value store, and decision support query performance with RPMP as a fully disaggregated shuffle solution for Spark-based data analytics.
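The abstract mentions consistent hashing as the way keys are mapped to storage nodes with configurable availability and durability. The Python sketch below illustrates that general technique only; it is not RPMP's implementation. The class name `HashRing`, the `vnodes` parameter, and the node names are all hypothetical, and the replica count stands in for whatever durability setting RPMP actually exposes.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Map a string to a 64-bit position on the ring (SHA-256 is an arbitrary choice).
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring with virtual nodes (hypothetical, not RPMP's API)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node is placed on the ring at `vnodes` pseudo-random positions.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, key, count):
        """Return up to `count` distinct nodes for `key`, walking clockwise from its hash."""
        if not self._ring:
            return []
        count = min(count, self._node_count)
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        chosen = []
        while len(chosen) < count:
            node = self._ring[i][1]
            if node not in chosen:
                chosen.append(node)
            i = (i + 1) % len(self._ring)
        return chosen


if __name__ == "__main__":
    ring = HashRing(["pmem-node-0", "pmem-node-1", "pmem-node-2"])
    # The first node returned is the primary; the rest hold replicas.
    print(ring.replicas("shuffle_0_12_3", count=2))
```

In this sketch, raising `count` trades capacity for durability, and removing a node only remaps the keys that node owned, which is the property that makes consistent hashing attractive for a storage pool whose membership can change.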

Learning Objectives

Persistent Memory, Persistent Memory over Fabric, Data Analytics