SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA
At SDC 2023, we presented Homa, a new datacenter protocol invented by John Ousterhout at Stanford University. While not TCP API compatible, Homa provides significant advantages over TCP in terms of tail latency and infrastructure efficiency in real-life networks.
We were asked to come back with answers regarding Homa: A) Can Homa and TCP co-exist peacefully in the same network? B) How can Homa be accelerated to become more useful and applicable for networked storage?
In our presentation this year we will provide those answers and new collaboration results.
First, we will show network simulation analysis for traffic comprising both Homa and TCP. Thanks to TCP’s congestion management, the two protocols not only can co-exist but also complement each other, so that we get the best of both: TCP for data streams plus Homa for (shorter) messages with improved tail latencies.
We then look into John’s open source implementation for the Linux kernel, HomaModule, and show how it can be combined with another open source project, the Corundum FPGA NIC (corundum.io), for FPGA acceleration of the Homa protocol.
We present experimental results from applying known CPU offloading concepts to Homa, in particular receive side scaling (RSS), segmentation offload, and large receive offload. Applied to a traffic pattern in which a significant portion of the payload chunks is much larger than the MTU, these offloads decrease both slowdown and RTT by 5x. The outcome is a Reliable, Rapid, Request-Response Protocol with benefits in storage networking and potential use for networking GPU clusters.
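To make the offload idea concrete, here is a minimal, simplified sketch of receive side scaling: packets are steered to per-CPU receive queues by hashing flow identifiers through an indirection table, so all packets of one message or RPC are processed on the same core. Real NICs compute a Toeplitz hash over header fields in hardware; the generic hash, field names, and queue count below are illustrative assumptions, not part of the HomaModule or Corundum code.

    # Simplified illustration of receive side scaling (RSS).
    # Real NICs compute a Toeplitz hash over header fields in hardware;
    # here a generic CRC hash and a small indirection table stand in for it.
    import zlib

    NUM_RX_QUEUES = 8
    # Indirection table mapping hash buckets to RX queues (one queue per CPU core).
    INDIRECTION_TABLE = [i % NUM_RX_QUEUES for i in range(128)]

    def rx_queue_for(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
        """Pick the RX queue (and thus CPU) for a packet's flow."""
        key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
        bucket = zlib.crc32(key) % len(INDIRECTION_TABLE)
        return INDIRECTION_TABLE[bucket]

    # All packets of the same flow or RPC hash to the same queue,
    # so one core sees the whole message and its caches stay warm.
    print(rx_queue_for("10.0.0.1", "10.0.0.2", 40000, 4000))
    print(rx_queue_for("10.0.0.1", "10.0.0.2", 40000, 4000))  # same queue again
    print(rx_queue_for("10.0.0.3", "10.0.0.2", 40001, 4000))  # likely a different queue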
Learning Objectives: Tail Latencies in TCP; the Computational Burden of TCP; Homa, a Replacement for TCP from Stanford University; the Reliable, Rapid, Request-Response Protocol (4RP) as an alternative to TCP.
Upon completion, participants will be able to understand tail latencies in TCP and the computational burden of TCP.
Upon completion, participants will be able to understand Homa, Stanford’s Reliable, Rapid, Request-Response Protocol.
Upon completion, participants will be able to compare FPGA-based network protocol acceleration for Homa with state-of-the-art techniques for TCP.
This presentation will outline the architectures of the top three platforms in each of these two categories, Von Neumann and Programmable Logic, showing how vendors like NVIDIA, Pensando, Marvell, Achronix, Intel, and Xilinx have chosen to architect their solutions. We will then weigh the merits and benefits of each approach while also highlighting the performance bottlenecks. By the end of the presentation, it should be fairly clear where the industry is headed, and which solutions may eventually win out.
Marginal links and congestion have plagued storage fabrics for years, and many independent solutions have been tried. The Fibre Channel industry has been keenly aware of this issue and, over the course of the last two years, has created the architectural foundation for a common ecosystem solution. Fabric Notifications employs a simple message system to provide registered participants with information about key events in the fabric, which is used to automatically address link integrity and congestion issues. This new technology has been embraced by the Fibre Channel community and has demonstrated a significant improvement in addressing these nagging issues. In this informative session, storage experts will discuss the evolution of this technology and how it is a step toward a truly autonomous SAN.
Accessing data within an enterprise spread across geo-diverse locations challenges work productivity, time, and IT resources. Existing methods require data to be transferred and replicated, leading to delayed business insights or even insights based on stale data. In addition, having to send a copy of data to every user that requires it leads to copy sprawl, data management challenges, and compromised data security. Even with data transfer, data arrives at the destination in an unpredictable time, and performance varies with data type/size and application. Utilizing network data mover appliances and applications is so cumbersome that even physical transportation of media or data is considered an acceptable solution. In summary, there is a need to access data without replication across geographic distances.

It is commonly accepted that remote application execution is not possible when a high Bandwidth Delay Product (BDP), i.e., the product of a link's capacity (in bits per second) and its round-trip delay time (in seconds), exists between the compute location and the data location. The BDP between the compute and data when they are within a data center, on a LAN, or in the same Cloud provider is acceptable for most common applications. However, as soon as the BDP increases traversing a WAN over even minor latencies, these same applications become functionally unusable due to the time required to get the data to the location of the compute. BDP has a major impact on traditional networks using the TCP/IP protocol; various methods have been used to optimize or tune TCP/IP to make it more efficient, with only moderate results.

The Vcinity Data Access Platform™ (VDAP) enables enterprises to instantly access and operate on data sets over any distance, with local performance, and without copying them. This is accomplished by transforming the enterprise WAN into a Global LAN, enabling local application performance on data over global distances. This capability lets enterprises leverage modern business tools such as machine learning modeling and artificial intelligence innovations, leading to more efficient business processes and a greater competitive advantage. The concept of turning the WAN into a Global LAN and enabling a Global Fabric is achieved through RDMA over WAN. With over 32 patents as the underpinning of the VDAP portfolio, it is the individual advancements that in aggregate achieve application reach-in to data over distance.
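As a back-of-the-envelope illustration of the BDP argument above, the sketch below computes the bandwidth-delay product of a WAN link and the throughput ceiling a sender with a fixed window can reach over it; the link rate, RTT, and window size are made-up example values, not Vcinity measurements.

    # Back-of-the-envelope BDP arithmetic (illustrative numbers, not measurements).

    def bdp_bytes(link_bps: float, rtt_s: float) -> float:
        """Bandwidth-delay product: the data 'in flight' on the path, in bytes."""
        return link_bps * rtt_s / 8

    def window_limited_throughput_bps(window_bytes: float, rtt_s: float) -> float:
        """A window-based sender can deliver at most one window per round trip."""
        return window_bytes * 8 / rtt_s

    link_bps = 10e9      # 10 Gbit/s WAN link
    rtt_s = 0.080        # 80 ms round-trip time across the WAN
    window = 4 * 2**20   # 4 MiB sender window

    print(f"BDP: {bdp_bytes(link_bps, rtt_s) / 2**20:.1f} MiB must be in flight to fill the link")
    print(f"Window-limited throughput: {window_limited_throughput_bps(window, rtt_s) / 1e9:.2f} Gbit/s")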
The solution is made up of an RDMA-based network fabric at Layer 2 or Layer 3, incorporating the following:
The result is the ability to sustain 90%+ of theoretical data throughput over global distances, providing applications the same performance experience as when the data is local to the application. As seen above, the Vcinity Data Access Platform integrates with other HPC technologies and high-speed storage. It attaches to a standard NAS or traditional high-speed storage tier and, when connected over the MAN/WAN, provides a high-performance, geo-diverse data exchange. VDAP enables a global federated data platform for accessing data without replication, using a global namespace and network-mapped drive volumes across geographically distributed enterprise storage.
Introduced in 2021, Fabric Notifications has developed into a key resiliency solution for storage networks. By providing the ability to respond to events impacting the fabric, Fabric Notifications enhances the overall user experience when deploying traditional Fibre Channel SANs. In this session, the improvements in the user experience are profiled for solutions using Linux, AIX, and PowerPath. The latest developments in both standards and market solutions are also provided as an update on this exciting new technology.
NVMe over IP is a technology able to provide complete and scalable SAN solutions. Security is of paramount importance for SANs, and the fundamental methods to secure NVMe over IP fabrics (i.e., DH-HMAC-CHAP authentication and TLS secure channels) have been defined. However, the security provisioning of these methods does not yet scale to large fabrics. This presentation will explain these scalability constraints and show how an additional mechanism, called the Authentication Verification Entity, is able to make security for NVMe over IP fabrics simple, scalable, and effectively deployable.
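The scalability constraint can be illustrated with a simple counting argument: if every host/subsystem pair needs its own pre-shared DH-HMAC-CHAP secret, provisioning grows with the product of hosts and subsystems, whereas a central verification entity needs only one credential per entity. The sketch below illustrates only that count; it is an assumption-labeled simplification, not the actual Authentication Verification Entity mechanism.

    # Counting argument for authentication provisioning (illustrative only).

    def pairwise_secrets(hosts: int, subsystems: int) -> int:
        # One pre-shared secret per host/subsystem pair.
        return hosts * subsystems

    def centralized_credentials(hosts: int, subsystems: int) -> int:
        # One credential per entity when a central verification entity vouches for all.
        return hosts + subsystems

    for hosts, subsystems in [(10, 10), (100, 50), (1000, 200)]:
        print(f"{hosts} hosts x {subsystems} subsystems: "
              f"pairwise={pairwise_secrets(hosts, subsystems)}, "
              f"centralized={centralized_credentials(hosts, subsystems)}")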
Costs and risks of implementing High-Performance Embedded Systems such as Centralized Car Servers for Autonomous Vehicles can be reduced by borrowing from modern datacenter technology. Therefore, PCIe and Multi-Gigabit Ethernet have become a foundation for automotive in-vehicle infrastructure. While the needs for storage in automotive are somewhat relaxed compared to datacenters, automotive has a need for “unconventional” storage connectivity, such as many sensors to few CPUs to a single SSD. And, unlike datacenters, automotive comes with strong constraints on size, weight, and power, as well as real-time guarantees. Time-Sensitive Networking (TSN) is an evolving set of IEEE standards for Ethernet-based networks. It brings time synchronization (IEEE 802.1AS) along with low bounded latency via traffic scheduling (IEEE 802.1Qbv) and/or traffic shaping (IEEE 802.1Qav), and ultra-reliability via frame replication and elimination (IEEE 802.1CB), and is thus one of the best-positioned options for in-vehicle networking. In combination with reliable transports such as TCP/IP, this enables deterministic networking for distributed systems. In our presentation we will describe the needs of modern automotive networking and storage architectures, and will share approaches for converging (real-time) Ethernet and PCIe into a common fabric for reliable and cost-efficient implementations. We will showcase first performance results of a proof-of-concept implementation. We close with an outlook on the potential benefits of TSN and Deterministic Networking for other composable storage architectures.
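As a rough illustration of how scheduled traffic (IEEE 802.1Qbv) yields bounded latency, the sketch below checks whether a frame fits into the gate window reserved for its traffic class within one schedule cycle and computes its worst-case queuing delay if it just misses the window; the cycle time, window size, and link rate are example assumptions, not figures from our proof of concept.

    # Rough 802.1Qbv-style gate window check (example numbers, not a product configuration).

    LINK_BPS = 1e9          # 1 Gbit/s automotive Ethernet link
    CYCLE_S = 500e-6        # 500 us schedule cycle
    WINDOW_S = 100e-6       # gate open for this traffic class for 100 us per cycle

    def tx_time_s(frame_bytes: int) -> float:
        """Serialization time of a frame on the link."""
        return frame_bytes * 8 / LINK_BPS

    def fits_in_window(frame_bytes: int) -> bool:
        """Can the frame be transmitted entirely within one open gate window?"""
        return tx_time_s(frame_bytes) <= WINDOW_S

    def worst_case_wait_s(frame_bytes: int) -> float:
        """A frame arriving just after the gate closes waits for the next window."""
        return (CYCLE_S - WINDOW_S) + tx_time_s(frame_bytes)

    frame = 1500  # bytes
    print(f"tx time: {tx_time_s(frame) * 1e6:.1f} us, fits in window: {fits_in_window(frame)}")
    print(f"worst-case delay bound: {worst_case_wait_s(frame) * 1e6:.1f} us")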