Abstract
Cloud-native deployment has become one of the major trends in large-scale Big Data analytics. Compared to on-premises data centers, the cloud offers much stronger scalability and higher elasticity for Big Data applications. However, the cloud is also considered less performant than on-premises alternatives because of virtualization and cluster resource disaggregation.
We present a new cloud-native Spark application architecture backed by persistent memory technology. The key ingredient of this architecture is a novel acceleration engine that uses Intel’s 3D XPoint technology as a disaggregated external memory resource. We discuss how this new architecture can improve the performance of multiple aspects of data processing.
As a key takeaway, the audience will gain an understanding of the benefits of the latest persistent memory technology and of how it can be leveraged in a cloud data processing architecture.