Blog

Unlocking CXL's Potential Q&A

Unlocking CXL's Potential Q&A

Apr 22, 2025

Compute Express Link® (CXL) is a groundbreaking technology that expands server memory beyond established limits and boosts bandwidth. CXL enables seamless memory sharing, reduces costs by optimizing resource utilization, and supports different memory types. In our Unlocking CXL’s Potential webinar, our speakers Arthur Sainio, SNIA Persistent Memory Special Interest Group Co-Chair, Jim Handy of Objective Analysis, Mahesh Natu of the CXL Consortium, and Torry Steed of SMART Modular Technologies discussed how CXL is transforming computing systems with its economic and performance benefits, and explored its future impact across various market segments

You can learn more about CXL development by attending CXL DevCon, April 29-30 2025 in Santa Clara CA. And as our webinar discussed, access and program CXL memory modules located in the SNIA Innovation Center with the training materials provided in the virtual SNIA Programming Workshop and Hackathon at www.snia.org/pmhackathon.

The audience was highly engaged and asked many interesting questions. Our Q&A takes care of the answers! Feel free to reach out to us at askcms@snia.org if you have more questions.

Q: How will the connections of a CXL switch be made physically possible with multiple hosts and multiple endpoints in real time?

A: It’s similar to PCI Express, in that you could have switches that have multiple upstream ports that connect to multiple CPUs, and they can have multiple downstream ports that connect to multiple devices. It's really a well-known thing that PCIe has championed so we don't see any challenges making those connections in the architecture.

Q: Is there standards-based work being done for CXL/PCIe over co-packaged optics for disaggregated computing?

A: That’s a really good question. There is work being done in PCI Express for transporting PCI over Optics (optical interface) so once that happens we will probably just leverage that. CXL leverages all of the things that PCIe does in terms of the physical layer or the form factor when possible so I expect that will happen as that is where we are heading to.

Q: Will Trusted Security Protocol (TSP) be used with Intel Trust Domain Extensions (TDX) and AMD Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) with TSP?

A: Both technologies work with TSP. We have had great participation from Intel and AMD when defining TSP. What TSP does is define the interface between the CPU aspect of the TDX and the device. So that's the piece that is not part of the CPU architecture – the CPU definition - because it's outside of that and that's the piece that the TSP builds. Device vendors can build devices that then will follow the TSP specification and will be compatible with both of these technologies. There are other CPU vendors which have similar confidential compute technologies, and I think they can also be compatible with TSP.

Q: What is TDISP and what is the difference from TSP?

A: TDISP, or TEE Device Interface Security Protocol, was developed by PCI Express. Again, it solves the same problem for PCI Express meaning it will allow technologies like TDX to inspect a PCIe device, verify the device is in healthy good condition, and bring it into the test boundary of the TD. TSP does something similar for CXL devices. Obviously, with CXL being coherent it's a different problem to solve but I think we have solved the problem, and it's ready to be deployed.

Q: Why are we not seeing CXL Type-1 or -2 adoption by the industry?

A: We think that it is coming - it's just not here yet. The big interest initially has just been Type-3 which is pure memory expansion but we are starting to see storage eventually moving to CXL as well. There's definite benefits there, and then I think we are starting to see memory with processing as well so it's coming in a next wave of CXL adoption.

The whole ecosystem has to come together and CXL is actually pretty new. We’re sure that there's an awful lot of work being done that has not been announced from mostly hardware manufacturers but also there needs to be an awful lot of software support to make everything fall into place. All of that comes together slowly and that forecast shown in the webinar actually starts with very modest growth simply because the rest of the support network for CXL needs to be put together before CXL can really take off.

Q: We see in-memory databases (IMDBs) as a big use case for CXL. But we have not seen any announcements from SAP HANA, Oracle or MS-SQL, or anybody adopting CXL with in-memory databases. Why is that?

A: We have not seen anything specific to date. We believe SAP HANA has published a paper and may have some work going on. See https://www.vldb.org/pvldb/vol17/p3827-ahn.pdf. IMDBs would benefit from more capacity as they would like as much memory capacity as possible, so CXL is definitely something they would benefit from.

Q: Do we expect GPUs to support CXL? Without that, the AI use case seems highly limited.

A: We don't really expect GPUs to support CXL and talking with CXL memory directly. Possibly once it's fully disaggregated it's possible you could have a layer where it is sharing information there. It’s more about if memory expansion for the system itself aids in AI use cases. We are seeing evidence that it does but how much that plays out we have to see.

We haven't really spoken with GPU vendors about this but the understanding is that one of the benefits of CXL is that it makes it easy to fill either the Graphics Double Data Rate (GDDR) that's on the GPU board or to bring in data that can go into the High Bandwidth Memory (HBM) so it seems like there'd be an opportunity for that even if it's not something people are speaking about now.

There are two uses we see right now in GPU. The first one is when they want more memory. They could use what we call the unordered IO feature of CXL to go over and reach right into CXL Type-3 memory and therefore get more memory expansion. The second use case is the GPU actually using CXL coherency to communicate with the CPU so they are cache coherent going into the CPU and they can quickly exchange data back and forth. Again both require heavy lifting - lots of software enabling - but I I think those use cases do exist. It just takes effort to go enable those and get the benefits.

Finally, we hope you will join us for more SNIA webinars. Visit https://www.snia.org/webinars for the complete list of scheduled and on-demand webinars. And check out the SNIA Educational Library for great content on CXL, memory, and much more.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

CXL Memory

Blog

A Deep Look at New Memories Q&A

A Deep Look at New Memories Q&A

SNIA CMS Community

Feb 12, 2025

New memories like MRAM, ReRAM, PCM, and FRAM are vying to replace embedded flash and, eventually, even embedded SRAM. In our Deep Look at New Memories webinar, our speakers Arthur Sainio, SNIA Persistent Memory Special Interest Group Co-Chair, Tom Coughlin of Coughlin Associates, and Jim Handy of Objective Analysis took a present and future look, explaining the applications that have already adopted new memory technologies in the marketplace, their impact on computer architectures and AI, the outlook for important near-term changes, and how economics dictate success or failure. If you have not yet watched the webinar, check it out in our SNIA Educational Library! The audience was highly engaged and asked many interesting questions, some which were answered at the end of the webinar. However, we could not get to all of them, so our Q&A covers those that remained. Feel free to reach out to us at askcms@snia.org if you have more! Q: How long will it take for a new memory to replace DRAM? A: DRAM has a couple of things going for it that any prospective rival does not, and that’s the fact that it’s already produced in enormous volumes (about 20 billion chips per year) and it has more than five decades of learning behind it. Any rival will need to compete against DRAM on cost, and that will naturally take advantage of the new memory’s ability to go far beyond DRAM’s scaling limit, but the new memory will also need to be produced in high enough volume to supersede DRAM’s advantages in volume and learning. That’s going to take some time, but we think that the mid-2030s may see a transition underway. Q: You talked a lot about MRAM applications, how are other new memories being used? A. Our focus on MRAM is largely because of its widespread use right now. ReRAM is just beginning to find more applications and is a big focus of leading foundries. Panasonic introduced a ReRAM-based MCU way back in 2012, but they have been pretty alone. Another company that’s pretty alone STMicroelectronics, who ships the world’s only PCM-based MCU. Back when Intel was pursuing Optane we focused a lot of attention on PCM because Optane used PCM and ran in pretty high volume. From a unit-volume standpoint, though, FRAM beats all others in an extremely narrow application space: RFID fare cards for trains. These chips are really tiny, though, so they don’t consume many wafers. Q. Will any of these memories move from back end of line to front end of line production and why or why not? A. There’s a big advantage in being back end of line (BEOL) that has been used to reduce the cost of 3D NAND. With BEOL you can build the memory bits on top of the support logic to make a significantly smaller chip. Today companies are just starting to migrate from that to a hybrid-bonded approach, where two wafers on two different process lines are used, one to make the bits and one to make the logic. A BEOL-friendly bit cell lends itself to this approach too. Q. What role could these new non-volatile memories play in chiplet technology/heterogeneous integration? A. Future processors will have a logic chip for the processor and supporting chiplets for memory, whether it’s firmware memory, scratchpad memory, or a cache. These new technologies can support all three, although most of today’s research is focused on slower versions that won’t be too useful for the lowest-level caches closest to the processor. Over time we expect that to change, too. Q. What role will these memories play in CXL-based memory systems? A. CXL provides a wide variety of solutions to computing architecture. It supports memories of all kinds: Fast & slow, volatile & nonvolatile, byte write and block erase. Both CXL and NVMe can support any of these memory types, but NVMe is not fast enough to take advantage of really fast memories, so CXL is likely to be used in systems that need NVMe-like support at speed significantly faster than NVMe. Q. How important is radiation resistance in memory? A. That’s a tough question, because radiation barely makes a difference to a PC or smart phone user, but it’s a “Make or Break” issue for aerospace and certain other applications. There’s radiation everywhere, and it corrupts bits. Sometimes that just means that your PC bombs, resulting in “vocabulary enrichment” as you reboot. If a program bit is lost in a deep-space satellite, it’s likely that the entire billion-dollar mission will become a total loss. Also, there’s a lot of radiation in space, but the earth’s atmosphere absorbs a lot it before it gets down to the surface, so it’s less of an issue down here than it is in space. Radiation can have a significant impact on DRAM memory used in networking equipment. It can cause bit flips referred to as Single Event Upsets (SEUs) which requires network equipment to be restarted. Using memory that has more resistance to radiation is beneficial in this case. If you are interested in this topic, check out a blog post from Jim Handy on memory issues in space medical applications. Q. How much are these various new memories affected by electromagnetic fields? A. They’re not as susceptible to stray fields as many people think. While you can’t put an MRAM inside of the powerful magnetic coil of an MRI imager, a lot of other common sources of magnetism are not a concern. Jim Handy’s working with MRAM makers and leading MRAM researchers to put together a table that illustrates where this stands in real-life terms that anyone can understand. Q. What sort of manufacturing volume will new memory need to replace DRAM, assuming it had similar performance? A. Based on the NAND flash crossover with DRAM prices in 2004, and upon Intel’s trouble getting Optane costs competitive with DRAM, despite Optane’s significantly smaller die size, Objective Analysis estimates that the wafer volume of a competing technology must come within an order of magnitude of that of DRAM’s for its costs to fall below DRAM’s. Q. How will AI affect new memory demand, both in the data center and for consumer applications? A. For anything to play a part in AI or any other computing application, it must provide a compelling cost advantage over more established technologies. In the data center that cost includes energy costs as well as the cost of the computing equipment itself, so if a slightly more costly technology can reduce energy costs so much that the total cost of ownership (TCO) is reduced, then it will find broad acceptance. These technologies aren’t there yet. Portable applications are somewhat different, because these memories can often reduce the cost of the system’s battery, creating a lower TCO than established technologies can do. Q. You mentioned a 10nm limit for DRAM, are there ways that DRAM might get around that limit--such as with 3D memory? A. DRAM’s kind of 3D already, so the benefit of turning it on its side as the industry did with NAND flash doesn’t bring DRAM anywhere near the benefits that the 3D switch caused for NAND flash. One very promising approach is to take some of the FRAM materials and use them to shrink the DRAM’s capacitor, but if you do that, you may as well just build an FRAM. Another possibility is to convert to a gain cell, which uses 2-3 transistors to replace the DRAM’s 1-transistor, 1-capacitor cell. One huge advantage of the gain cell is that it can shrink with the process, rather than being limited by the size of the capacitor. It’s early in the game, though, and although we are certain that an ingenious solution will get us past this hurdle, it’s too early to tell what that solution will be.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Memory

Blog

Emerging Memories Branch Out – a Q&A

Emerging Memories Branch Out – a Q&A

SNIA CMS Community

Feb 19, 2024

Our recent SNIA Persistent Memory SIG webinar explored in depth the latest developments and futures of emerging memories – now found in multiple applications both as stand-alone chips and embedded into systems on chips. We got some great questions from our live audience, and our experts Arthur Sainio, Tom Coughlin, and Jim Handy have taken the time to answer them in depth in this blog. And if you missed the original live talk, watch the video and download the PDF here. Q: Do you expect Persistent Memory to eventually gain the speeds that exist today with DRAM? A: It appears that that has already happened with the hafnium ferroelectrics that SK Hynix and Micron have shown.Ferroelectric memory is a very fast technology and with very fast write cycles there should be every reason for it to go that way. With the hooks that are in CXL

, though, that shouldn’t be that much of a problem since it’s a transactional protocol. The reads, then, will probably rival DRAM speeds for MRAM and for resistive RAM (MRAM might get up to DRAM speeds with its writes too). In fact, there are technologies like spin-orbit torque and even voltage-controlled magnetic anisotropy that promise higher performance and also low write latency for MRAM technologies. I think that probably most applications are read intensive and so the read is the real place where the focus is, but it does look like we are going to get there. Q: Are all the new Memory technology protocols (electrically) compatible to DRAM interfaces like DDR4 or DDR5? If not, then shouldn’t those technologies have lower chances of adoption as they add dependency on custom in-memory controller? A: That’s just a logic problem. There’s nothing innate about any memory technology that couples it tightly with any kind of a bus, and so because NOR Flash and SRAM are the easy targets so far, most emerging technologies have used a NOR flash or SRAM type interface. However, in the future they could use DDR. There’re some special twists because you don’t have to refresh emerging memory technologies. but you know in general they could use DDR. But one of the beauties of CXL is that you put anything you want to with any kind of interface on the other side of CXL and CXL erases what the differences are. It moderates them so although they may have different performances it’s hidden behind the CXL network. Then the burden goes on to the CXL controller designers to make sure that those emerging technologies, whether it’s MRAM or others, can be adopted behind that CXL protocol. My expectation is for there to be a few companies early on who provide CXL controllers that that do have some kind of a specialty interface on them whether it’s for MRAM or for Resistive RAM or something like that, and then eventually for them to move their way into the mainstream. Another interesting thing about CXL is that we may even see a hierarchy of different memories within CXL itself which also includes as part of CXL including domain specific processors or accelerators that operate close to memory, and so there are very interesting opportunities there as well. If you can do processing close to memory you lower the amount of data you’re moving around and you’re saving a lot of power for the computing system. Q: Emerging memory technologies have a byte-level direct access programming model, which is in contrast to block-based NAND Flash. Do you think this new programming model will eventually replace NAND Flash as it reduces the overhead and reduces the power of transferring Data? A: It’s a question of cost and that’s something that was discussed very much in our webinar. If you haven’t got a cost that’s comparable to NAND Flash, then you can’t really displace it. But as far as the interface is concerned, the NAND interface is incredibly clumsy. All of these technologies do have both byte interfaces rather than a block interface but also, they can write in place – they don’t need to have a pre-erased block to write into. That from a technical standpoint is a huge advantage and now it’s just a question of whether or not they can get the cost down – which means getting the volume up. Q: Can you discuss the High Bandwidth Memory (HBM) trends? What about memories used with Graphic Processing Units (GPUs)? A: That topic isn’t the subject of this webinar as this webinar is about emerging memory technologies. But, to comment, we don’t expect to see emerging memory technologies adopt an HBM interface anytime in the really near future because HBM does springboard off DRAM and, as we discussed on one of the slides, DRAM has a transition that we don’t know when it’s going to happen that it goes to another emerging memory technology. We’ve put it into the early 2030s in our chart, but it could be much later than that and HBM won’t convert over to an emerging memory technology until long after that. However, HBM involves stacking of chips and that ultimately could happen. It’s a more expensive process right now – a way of getting a lot of memory very close to a processor – and if you look at some of the NVIDIA applications for example, this is an example of the Chiplet technology and HBM can play a role in those Chiplet technologies for GPUs.. That’s another area that’s going to be using emerging memories as well – in the Chiplets. While we didn’t talk about that so much in this webinar, it is another place for emerging memories to be playing a role. There’s one other advantage to using an emerging memory that we did not talk about: emerging memories don’t need refresh. As a matter of fact, none of the emerging memory technologies need refresh. More power is consumed by DRAM refreshing than by actual data accesses. And so, if you can cut that out of it, you might be able to stack more chips on top of each other and get even more performance, but we still wouldn’t see that as a reason for DRAM to be displaced early on in HBM and then later on in the mainstream DRAM market. Although, if you’re doing all those refreshes there’s a fair amount of potential of heat generation by doing that, which may have packaging implications as well. So, there may be some niche areas in there which could be some of the first ways in which some of these emerging memories are potentially used for those kinds of applications, if the performance is good enough. Q: Why have some memory companies failed? Apart from the cost/speed considerations you mention, what are the other minimum envelope features that a new emerging memory should have? Is capacity (I heard 32Gbit multiple times) one of those criteria? A: Shipping a product is probably the single most important activity for success. Companies don’t have to make a discrete or standalone SRAM or emerging memory chip but what they need to do is have their technology be adopted by somebody who is shipping something if they’re not going to ship it themselves. That’s what we see in the embedded market as a good path for emerging memory IP: To get used and to build up volume. And as the volume and comfort with manufacturing those memories increase, it opens up the possibility down the road of lower costs with higher volume standalone memory as well. Q: What are the trends in DRAM interfaces? Would you discuss CXL’s role in enabling composable systems with DRAM pooling? A: CXL, especially CXL 3.0, has particularly pointed at pooling. Pooling is going to be an extremely important development in memory with CXL, and it’s one of the reasons why CXL probably will proliferate. It allows you to be able to allocate memory which is not attached to particular server CPUs and therefore to make more efficient and effective use of those memories. We mentioned this earlier when we said that right now DRAM is that memory with some NAND Flash products out there too. But this could expand into other memory technologies behind CXL within the CXL pool as well as accelerators (domain specific processors) that do some operations closer to where the memory lives. So, we think there’s a lot of possibilities in that pooling for the development and growth of emerging memories as well as conventional memories. Q: Do you think molecular-based technologies (DNA or others) can emerge in the coming years as an alternative to some of the semiconductor-based memories? A: DNA and other memory technologies are in a relatively early stage but there are people who are making fairly aggressive plans on what they can do with those technologies. We think the initial market for those molecular memories are not in this high performance memory application; but especially with DNA, the potential density of storage and the fact that you can make lots of copies of content by using genetic genomic processes makes them very attractive potentially for archiving applications. The things we’ve seen are mostly in those areas because of the performance characteristics. But the potential density that they’re looking at is actually aimed at that lower part of the market, so it has to be very, very cost effective to be able to do that, but the possibilities are there. But again, as with the emerging high performance memories, you still have the economies of scale you have to deal with – if you can’t scale it fast enough the cost won’t go down enough that will actually will be able to compete in those areas. So it faces somewhat similar challenges, though in a different part of the market. Earlier in the webcast, we said when showing the orb chart, that for something to fit into the computing storage hierarchy it has to be cheaper than the next faster technology and faster than the next cheaper technology. DNA is not a very fast technology and so that automatically says it has to be really cheap for it to catch on and that puts it in a very different realm than the emerging memories that we’re talking about here. On the other hand, you never know what someone’s going to discover, but right now the industry doesn’t know how to make fast molecular memories. Q: What is your intuition on how tomorrow’s highly dense memories might impact non-load/store processing elements such as AI accelerators? As model sizes continue to grow and energy density becomes more of an issue, it would seem like emerging memories could thrive in this type of environment. Your thoughts? A: Any memory would thrive in an environment where there was an unbridled thirst for memory. as artificial intelligence (AI) currently is. But AI is undergoing some pretty rapid changes, not only in the number of the parameters that are examined, but also in the models that are being used for it. We recently read a paper that was written by Apple* where they actually found ways of winnowing down the data that was used for a large language model into something that would fit into an Apple MacBook Pro M2 and they were able to get good performance by doing that. They really accelerated things by ignoring data that didn’t really make any difference. So, if you take that kind of an approach and say: “Okay. If those guys keep working on that problem that way, and they take it to the extreme, then you might not need all that much memory after all.” But still, if memory were free, I’m sure that there’d be a ton of it out there and that is just a question of whether or not these memories can get cheaper than DRAM so that they can look like they’re free compared to what things look like today. There are three interesting elements of this: First, CXL, in addition allowing mixing of memory types, again allows you to put in those domain specific processors as well close to the memory. Perhaps those can do some of the processing that’s part of the model, in which case it would lower the energy consumption. The other thing it supports is different computing models than what we traditionally use. Of course there is quantum computing, but there also is something called neural networks which actually use the memory as a matrix multiplier, and those are using these emerging memories for that technology which could be used for AI applications. The other thing that’s sort of hidden behind this is that spin tunnelling is changing processing itself in that right now everything is current-based, but there’s work going on in spintronic based devices that instead of using current would use the spin of electrons for moving data around, in which case we can avoid resistive heating and our processing could run a lot cooler and use less energy to do so. So, there’s a lot of interesting things that are kind of buried in the different technologies being used for these emerging memories that actually could have even greater implications on the development of computing beyond just the memory application themselves. And to elaborate on spintronics, we’re talking about logic and not about spin memory – using spins rather than that of charge which is current. Q: Flash has an endurance issue (maximum number of writes before it fails). In your opinion, what is the minimum acceptable endurance (number of writes) that an emerging memory should support? It’s amazing how many techniques have fallen into place since wear was an issue in flash SSDs. Today’s software understands which loads have high write levels and which don’t, and different SSDs can be used to handle the two different kinds of load. On the SSD side, flash endurance has continually degraded with the adoption of MLC, TLC, and QLC, and is sometimes measured in the hundreds of cycles. What this implies is that any emerging memory can get by with an equally low endurance as long as it’s put behind the right controller. In high-speed environments this isn’t a solution, though, since controllers add latency, so “Near Memory” (the memory tied directly to the processor’s memory bus) will need to have higher endurance. Still, an area that can help to accommodate that is the practice of putting code into memories that have low endurance and data into higher-endurance memory (which today would be DRAM). Since emerging memories can provide more bits at a lower cost and power than DRAM, the write load to the code space should be lower, since pages will be swapped in and out more frequently. The endurance requirements will depend on this swapping, and I would guess that the lowest-acceptable level would be in the tens of thousands of cycles. Q: It seems that persistent memory is more of an enterprise benefit rather than a consumer benefit. And consumer acceptance helps the advancement and cost scaling issues. Do you agree? I use SSDs as an example. Once consumers started using them, the advancement and prices came down greatly. Anything that drives increased volume will help. In most cases any change to large-scale computing works its way down to the PC, so this should happen in time here, too. But today there’s a growing amount of MRAM use in personal fitness monitors, and this will help drive costs down, so initial demand will not exclusively come from enterprise computing. At the same time, the IBM FlashDrive that we mentioned uses MRAM, too, so both enterprise and consumer are already working to simultaneously grow consumption. Q: The CXL diagram (slide 22 in the PDF) has 2 CXL switches between the CPUs and the memory. How much latency do you expect the switches to add, and how does that change where CXL fits on the array of memory choices from a performance standpoint? The CXL delay goals are very aggressive, but I am not sure that an exact number has been specified. It’s on the order of 70ns per “Hop,” which can be understood as the delay of going through a switch or a controller. Naturally, software will evolve to work with this, and will move data that has high bandwidth requirements but is less latency-sensitive to more remote areas, while managing the more latency-sensitive data to near memory. Q: Where can I learn more about the topic of Emerging Memories? Here are some resources to review

Persistent Memory Summit Presentations
Blog post on the first FRAM, which was made in 1952, making it the first semiconductor memory: FRAM Turns 68
Blog Post on Gordon Moore 1970 PCM paper: Original PCM Article from 1970
Blog post with overview of all the emerging memory technologies: Emerging Memories Today: The Technologies: MRAM, ReRAM, PCM/XPoint, FRAM, etc.

* LLM in a Flash: Efficient Large Language Model Inference with Limited Memory, Kevin Avizalideh, et. al., arXiv:2312.11514 [cs.CL] The post Emerging Memories Branch Out – a Q&A first appeared on SNIA Compute, Memory and Storage Blog.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

CXL Memory SSD

Blog

Your Questions Answered on Persistent Memory, CXL, and Memory Tiering

Your Questions Answered on Persistent Memory, CXL, and Memory Tiering

SNIA CMS Community

Jul 10, 2023

With the persistent memory ecosystem continuing to evolve with new interconnects like CXL

and applications like memory tiering, our recent Persistent Memory, CXL, and Memory Tiering-Past, Present, and Future webinar was a big success. If you missed it, watch it on demand HERE! Many questions were answered live during the webinar, but we did not get to all of them. Our moderator Jim Handy from Objective Analysis, and experts Andy Rudoff and Bhushan Chithur from Intel, David McIntyre from Samsung, and Sudhir Balasubramanian and Arvind Jagannath from VMware have taken the time to answer them in this blog. Happy reading! Q: What features or support is required from a CXL capable endpoint to e.g. an accelerator to support the memory pooling? Any references? A: You will have two interfaces, one for the primary memory accesses and one for the management of the pooling device. The primary memory interface is the .mem and the management interface will be via the .io or via a sideband interface. In addition you will need to implement a robust failure recovery mechanism since the blast radius is much larger with memory pooling. Q: How do you recognize weak information security (in CXL)? A: CXL has multiple features around security and there is considerable activity around this in the Consortium. For specifics, please see the CXL Specification or send us a more specific question. Q: If the system (e.g. x86 host) wants to deploy CXL memory (Type 3) now, is there any OS kernel configuration, BIO configuration to make the hardware run with VMWare (ESXi)? How easy or difficult this setup process? A: A simple CXL Type 3 Memory Device providing volatile memory is typically configured by the pre-boot environment and reported to the OS along with any other main memory. In this way, a platform that supports CXL Type 3 Memory can use it without any additional setup and can run an OS that contains no CXL support and the memory will appear as memory belonging to another NUMA code. That said, using an OS that does support CXL enables more complex management, error handling, and more complex CXL devices. Q: There was a question on ‘Hop” length. Would you clarify? A: In the webinar around minute 48, it was stated that a Hop was 20ns, but this is not correct. A Hop is often spoken of as “Around 100ns.” The Microsoft Azure Pond paper quantifies it four different ways, which range from 85ns to 280ns. Q: Do we have any idea how much longer the latency will be? A: The language CXL folks use is “Hops.” An address going into CXL is one Hop, and data coming back is another. In a fabric it would be twice that, or four Hops. The latency for a Hop is somewhere around 100ns, although other latencies are accepted. Q: For memory semantic SSD: There appears to be a trend among 2LM device vendors to presume the host system will be capable of providing telemetry data for a device-side tiering mechanism to decide what data should be promoted and demoted. Meanwhile, software vendors seem to be focused on the devices providing telemetry for a host-side tiering mechanism to tell the device where to move the memory. What is your opinion on how and where tiering should be enforced for 2LM devices like a memory semantic SSD? A: Tiering can be managed both by the host and within computational storage drives that could have an integrated compute function to manage local tiering- think edge applications. Q: Re VM performance in Tiering: It appears you’re comparing the performance of 2 VM’s against 1. It looked like the performance of each individual VM on the tiering system was slower than the DRAM only VM. Can you explain why we should take the performance of 2 VMs against the 1 VM? Is the proposal that we otherwise would have required those 2 VM’s to run on separate NUMA node, and now they’re running on the same NUMA node? A: Here the use case was, lower TCO & increased capacity of memory along with aggregate performance of VM’s v/s running few VM’s on DRAM. In this use case, the DRAM per NUMA Node was 384GB, the Tier2 memory per NUMA node was 768GB. The VM RAM was 256GB. In the DRAM only case, if we have to run business critical workloads e.g., Oracle with VM RAM=256GB, we could only run 1 VM (256GB) per NUMA Node (DRAM=384GB), we cannot over-provision memory in the DRAM only case as every NUMA node has 384GB only. So potentially we could run 4 such VM’s (VM RAM=256Gb) in this case with NUMA node affinity set as we did in this use case OR if we don’t do NUMA node affinity, maybe 5 such VM’s without completely maxing out the server RAM. Remember, we did NUMA node affinity in this use case to eliminate any cross NUMA latency.78 Now with Tier2 memory in the mix, each NUMA node has 384GB DRAM and 768GB Tier2 Memory, so theoretically one could run 16-17 such VM’s (VM RAM =256GB), hence we are able to increase resource maximization, run more workloads, increase transactions etc , so lower TCO, increased capacity and aggregate performance improvement. Q: CXL is changing very fast, we have 3 protocol versions in 2 years, as a new consumer of CXL what are the top 3 advantages of adopting CXL right away v/s waiting for couple of more years? A: All versions of CXL are backward compatible. Users should have no problem using today’s CXL devices with newer versions of CXL, although they won’t be able to take advantage of any new features that are introduced after the hardware is deployed. Q: (What is the) ideal when using agilex FPGAs as accelerators? A: CXL 3.0 supports multiple accelerators via the CXL switching fabric. This is good for memory sharing across heterogeneous compute accelerators, including FPGAs. Thanks again for your support of SNIA education, and we invite you to write askcmsi@snia.org for your ideas for future webinars and blogs! The post Your Questions Answered on Persistent Memory, CXL, and Memory Tiering first appeared on SNIA Compute, Memory and Storage Blog.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Case Studies CXL Memory Standards

Blog

The Blurred Lines of Memory and Storage – A Q&A

The Blurred Lines of Memory and Storage – A Q&A

John Kim

Jul 22, 2019

The lines are blurring as new memory technologies are challenging the way we build and use storage to meet application demands. That’s why the SNIA Networking Storage Forum (NSF) hosted a “Memory Pod” webcast is our series, “Everything You Wanted to Know about Storage, but were too Proud to Ask.” If you missed it, you can watch it on-demand here along with the presentation slides. We promised Q. Do tools exist to do secure data overwrite for security purposes? A. Most popular tools are cryptographic signing of the data where you can effectively erase the data by throwing away the keys. There are a number of technologies available; for example, the usual ones like BitLocker (part of Windows 10, for example) where the NVDIMM-P is tied to a specific motherboard. There are others where the data is encrypted as it is moved from NVDIMM DRAM to flash for the NVDIMM-N type. Other forms of persistent memory may offer their own solutions. SNIA is working on a security model for persistent memory, and there is a presentation on our work here. Q. Do you need to do any modification on OS or application to support Direct Access (DAX)? A. No, DAX is a feature of the OS (both Windows and Linux support it). DAX enables direct access to files stored in persistent memory or on a block device. Without DAX support in a file system, the page cache is generally used to buffer reads and writes to files, and DAX avoids that extra copy operation by performing reads and writes directly to the storage device. Q. What is the holdup on finalizing the NVDIMM-P standard? Timeline? A. The DDR5 NVDIMM-P standard is under development. Q. Do you have a webcast on persistent memory (PM) hardware too? A. Yes. The snia.org website has an educational library with over 2,000 educational assets. You can search for material on any storage-related topic. For instance, a search on persistent memory will get you all the presentations about persistent memory. Q. Must persistent memory have Data Loss Protection (DLP) A. Since it’s persistent, then the kind of DLP is the kind relevant for other classes of storage. This presentation on the SNIA Persistent Memory Security Threat Model covers some of this. Q. Traditional SSDs are subject to “long tail” latencies, especially as SSDs fill and writes must be preceded by erasures. Is this “long-tail” issue reduced or avoided in persistent memory? A. As PM is byte addressable and doesn’t require large block erasures, the flash kind of long tail latencies will be avoided. However, there are a number of proposed technologies for PM, and the read and write latencies and any possible long tail “stutters” will depend on their characteristics. Q. Does PM have any Write Amplification Factor (WAF) issues similar to SSDs? A. The write amplification (WA) associated with non-volatile memory (NVM) technologies comes from two sources.

When the NVM material cannot be modified in place but requires some type of “erase before write” mechanism where the erasure domain (in bytes) is larger than the writes from the host to that domain.
When the atomic unit of data placement on the NVM is larger than the size of incoming writes. Note the term used to denote this atomic unit can differ but is often referred to as a page or sector.

NVM technologies like the NAND used in SSDs suffer from both sources 1 and 2. This leads to very high write amplification under certain workloads, the worst being small random writes. It can also require over provisioning; that is, requiring more NVM internally than is exposed to the user externally. Persistent memory technologies (for example Intel’s 3DXpoint) only suffer from source 2 and can in theory suffer WA when the writes are small. The severity of the write amplification is dependent on how the memory controller interacts with the media. For example, current PM technologies are generally accessed over a DDR-4 channel by an x86 processor. x86 processors send 64 bytes at a time down to a memory controller, and can send more in certain cases (e.g. interleaving, multiple channel parallel writes, etc.). This makes it far more complex to account for WA than a simplistic random byte write model or in comparison with writing to a block device. Q. Persistent memory can provide faster access in comparison to NAND FLASH, but the cost is more for persistent memory. What do you think on the usability for this technology in future? A. Very good. See this presentation “MRAM, XPoint, ReRAM PM Fuel to Propel Tomorrow’s Computing Advances” by analysts, Tom Coughlin and Jim Handy for an in-depth treatment. Q. Does PM have a ‘lifespan’ similar to SSDs (e.g. 3 years with heavy writes, 5 years)? A. Yes, but that will vary by device technology and manufacturer. We expect the endurance to be very high; comparable or better than the best of flash technologies. Q. What is the performance difference between fast SSD vs “PM as DAX?” A. As you might expect us to say; it depends. PM via DAX is meant as a bridge to using PM natively, but you might expect to have improved performance from PM over NVMe as compared with a flash based SSD, as the latency of PM is much lower than flash; micro-seconds as opposed to low milliseconds. Q. Does DAX work the same as SSDs? A. No, but it is similar. DAX enables efficient block operations on PM similar to block operations on an SSD. Q. Do we have any security challenges with PME? A. Yes, and JEDEC is addressing them. Also see the Security Threat Model presentation here. Q. On the presentation slide of what is or is not persistent memory, are you saying that in order for something to be PM it must follow the SNIA persistent memory programming model? If it doesn’t follow that model, what is it? A. No, the model is a way of consuming this new technology. PM is anything that looks like memory (it is byte addressable via CPU load and store operations) and is persistent (it doesn’t require any external power source to retain information). Q. DRAM is basically a capacitor. Without power, the capacitor discharges and so the data is volatile. What exactly is persistent memory? Does it store data inside DRAM or it will use FLASH to store data? A. The presentation discusses two types of NVDIMM; one is based on DRAM and a flash backup that provides the persistence (that is NVDIMM-N), and the other is based on PM technologies (that is NVDIMM-P) that are themselves persistent, unlike DRAM. Q. Slide 15: If Persistent memory is fast and can appear as byte-addressable memory to applications, why bother with PM needing to be block addressed like disks? A. Because it’s going to be much easier to support applications from day one if PM can be consumed like very fast disks. Eventually, we expect PM to be consumed directly by applications, but that will require them to be upgraded to take advantage of it. Q. Can you please elaborate on byte and block addressable? A. Block addressable is the way we do I/O; that is, data is read and written in large blocks of data, typically 4Kbytes in size. Disk interfaces like SCSI or NVMe take commands to read and write these blocks of data to the external device by transferring the data to and from CPU memory, normally DRAM. Byte addressable means that we’re not doing any I/O at all; the CPU instructions for loading & storing fast registers from memory are used directly on PM. This removes an entire software stack to do the I/O, and means we can efficiently work on much smaller units of data; down to the byte as opposed to the fixed 4Kb demanded by I/O interfaces. You can learn more in our presentation “File vs. Block vs. Object Storage.” There are now 10 installments of the “Too Proud to Ask” webcast series, covering these topics:

If you have an idea for an “Everything You Wanted to Know about Storage, but were too Proud to Ask” presentation, please let comment on this blog and the NSF team will put it up for consideration.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

ethernet Memory Networked Storage Solid State Storage

Blog

Everything You Wanted to Know about Memory

Everything You Wanted to Know about Memory

John Kim

Apr 9, 2019

Many followers (dare we say fans?) of the SNIA Networking Storage Forum (NSF) are familiar with our popular webcast series "Everything You Wanted To Know About Storage But Were Too Proud To Ask." If you've missed any of the nine episodes we've done to date, they are all available on-demand and provide a 101 lesson on a range of storage related topics like buffers, storage controllers, iSCSI and more. Our next "Too Proud to Ask" webcast on May 16, 2019 will be "Everything You Wanted To Know About Storage But Were Too Proud To Ask – Part Taupe – The Memory Pod." Traditionally, much of the IT infrastructure that we've built over the years can be divided fairly simply into storage (the place we save our persistent data), network (how we get access to the storage and get at our data) and compute (memory and CPU that crunches on the data). In fact, so successful has this model been that a trip to any cloud services provider allows you to order (and be billed for) exactly these three components. The only purpose of storage is to persist the data between periods of processing it on a CPU. And the only purpose of memory is to provide a cache of fast accessible data to feed the huge appetite of compute. Currently, we build effective systems in a cost-optimal way by using appropriate quantities of expensive and fast memory (DRAM for instance) to cache our cheaper and slower storage. But fast memory has no persistence at all; it's only storage that provides the application the guarantee that storing, modifying or deleting data does exactly that. Memory and storage differ in other ways. For example, we load from memory to registers on the CPU, perform operations there, and then store the results back to memory by loading from and storing to byte addresses. This load/store technology is different from storage, where we tend to move data back and fore between memory and storage in large blocks, by using an API (application programming interface). It's clear the lines between memory and storage are blurring as new memory technologies are challenging the way we build and use storage to meet application demands. New memory technologies look like storage in that they're persistent, if a lot faster than traditional disks or even Flash based SSDs, but we address them in bytes, as we do memory like DRAM, if more slowly. Persistent memory (PM) lies between storage and memory in latency, bandwidth and cost, while providing memory semantics and storage persistence. In this webcast, our SNIA experts will discuss:

Fundamental terminology relating to memory
Traditional uses of storage and memory as a cache
How can we build and use systems based on PM?
Persistent memory over a network
Do we need a new programming model to take advantage of PM?
Interesting use cases for systems equipped with PM
How we might take better advantage of this new technology

Register today for this live webcast on May 16^th. Our experts will be available to answer the questions that you should not be too proud to ask! And if you're curious to know why each of the webcasts in this series is associated with a different color (rather than a number), check out this SNIA NSF blog that explains it all.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

ethernet Memory Networked Storage

Blog

Everything You Wanted to Know about Memory

Everything You Wanted to Know about Memory

John Kim

Apr 9, 2019

Many followers (dare we say fans?) of the SNIA Networking Storage Forum (NSF) are familiar with our popular webcast series “Everything You Wanted To Know About Storage But Were Too Proud To Ask.” If you’ve missed any of the nine episodes we’ve done to date, they are all available on-demand and provide a 101 lesson on a range of storage related topics like buffers, storage controllers, iSCSI and more. Our next “Too Proud to Ask” webcast on May 16, 2019 will be “Everything You Wanted To Know About Storage But Were Too Proud To Ask – Part Taupe – The Memory Pod.” Traditionally, much of the IT infrastructure that we’ve built over the years can be divided fairly simply into storage (the place we save our persistent data), network (how we get access to the storage and get at our data) and compute (memory and CPU that crunches on the data). In fact, so successful has this model been that a trip to any cloud services provider allows you to order (and be billed for) exactly these three components. The only purpose of storage is to persist the data between periods of processing it on a CPU. And the only purpose of memory is to provide a cache of fast accessible data to feed the huge appetite of compute. Currently, we build effective systems in a cost-optimal way by using appropriate quantities of expensive and fast memory (DRAM for instance) to cache our cheaper and slower storage. But fast memory has no persistence at all; it’s only storage that provides the application the guarantee that storing, modifying or deleting data does exactly that. Memory and storage differ in other ways. For example, we load from memory to registers on the CPU, perform operations there, and then store the results back to memory by loading from and storing to byte addresses. This load/store technology is different from storage, where we tend to move data back and fore between memory and storage in large blocks, by using an API (application programming interface). It’s clear the lines between memory and storage are blurring as new memory technologies are challenging the way we build and use storage to meet application demands. New memory technologies look like storage in that they’re persistent, if a lot faster than traditional disks or even Flash based SSDs, but we address them in bytes, as we do memory like DRAM, if more slowly. Persistent memory (PM) lies between storage and memory in latency, bandwidth and cost, while providing memory semantics and storage persistence. In this webcast, our SNIA experts will discuss:

Fundamental terminology relating to memory
Traditional uses of storage and memory as a cache
How can we build and use systems based on PM?
Persistent memory over a network
Do we need a new programming model to take advantage of PM?
Interesting use cases for systems equipped with PM
How we might take better advantage of this new technology

Register today for this live webcast on May 16^th. Our experts will be available to answer the questions that you should not be too proud to ask! And if you’re curious to know why each of the webcasts in this series is associated with a different color (rather than a number), check out this SNIA NSF blog that explains it all.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

ethernet Memory Networked Storage

Subscribe to Memory

Unlocking CXL's Potential Q&A

Find a similar article by tags

Leave a Reply

A Deep Look at New Memories Q&A

Find a similar article by tags

Leave a Reply

Emerging Memories Branch Out – a Q&A

Find a similar article by tags

Leave a Reply

Your Questions Answered on Persistent Memory, CXL, and Memory Tiering

Find a similar article by tags

Leave a Reply

The Blurred Lines of Memory and Storage – A Q&A

Find a similar article by tags

Leave a Reply

Everything You Wanted to Know about Memory

Find a similar article by tags

Leave a Reply

Everything You Wanted to Know about Memory

Find a similar article by tags

Leave a Reply

Unlocking CXL's Potential Q&A

Find a similar article by tags

Leave a Reply

A Deep Look at New Memories Q&A

Find a similar article by tags

Leave a Reply

Emerging Memories Branch Out – a Q&A

Find a similar article by tags

Leave a Reply

Your Questions Answered on Persistent Memory, CXL, and Memory Tiering

Find a similar article by tags

Leave a Reply

The Blurred Lines of Memory and Storage &ndash; A Q&A

Find a similar article by tags

Leave a Reply

Everything You Wanted to Know about Memory

Find a similar article by tags

Leave a Reply

Everything You Wanted to Know about Memory

Find a similar article by tags

Leave a Reply

The Blurred Lines of Memory and Storage – A Q&A