Add How the Landscape of Memory is Evolving With CXL

Laurel Gairdner 2025-10-26 06:00:42 +08:00
parent c67430ec03
commit f0e16b30df

As datasets grow from megabytes to terabytes to petabytes, the cost of moving data from block storage devices across interconnects into system memory, performing computation, and then storing the massive dataset back to persistent storage is rising in terms of both time and power (watts). Additionally, heterogeneous computing hardware increasingly needs access to the same datasets. For example, a general-purpose CPU may be used for assembling and preprocessing a dataset and scheduling tasks, but a specialized compute engine (like a GPU) is far faster at training an AI model. A more efficient solution is needed, one that reduces the transfer of massive datasets by keeping them in processor-accessible memory. Several organizations have pushed the industry toward solutions to these problems by keeping datasets in large, byte-addressable, sharable memory. In the 1990s, the Scalable Coherent Interface (SCI) allowed multiple CPUs to access memory coherently within a system. The Heterogeneous System Architecture (HSA) specification allowed memory sharing between devices of different types on the same bus.
In the decade beginning in 2010, the Gen-Z standard delivered a memory-semantic bus protocol with high bandwidth, low latency and coherency. These efforts culminated in the widely adopted Compute Express Link (CXL™) standard in use today. Since the formation of the CXL consortium, Micron has been and remains an active contributor. CXL opens the door to saving time and energy. The new CXL 3.1 standard allows byte-addressable, load/store-accessible memory like DRAM to be shared between different hosts over a low-latency, high-bandwidth interface using industry-standard components. This sharing opens doors previously only possible with expensive, proprietary equipment. With shared memory systems, data can be loaded into shared memory once and then processed multiple times by multiple hosts and accelerators in a pipeline, without incurring the cost of copying data to local memory, block storage protocols and their latency. Furthermore, some network data transfers can be eliminated entirely.
For example, data can be ingested and stored in shared memory over time by a host connected to a sensor array. Once the data is resident in memory, a second host optimized for the purpose can clean and preprocess it, followed by a third host that processes it. Meanwhile, the first host has been ingesting a second dataset. The only information that must be passed between the hosts is a message pointing to the data to indicate it is ready for processing. The large dataset never has to move or be copied, saving bandwidth, power and memory space. Another example of zero-copy data sharing is a producer-consumer data model, in which a single host is responsible for collecting data in memory and several other hosts then consume the data after it is written. As before, the producer only needs to send a message pointing to the address of the data, signaling to the other hosts that it is ready for consumption.
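The producer-consumer handoff described above can be sketched in miniature. The following is an illustrative single-machine analogy using Python's `multiprocessing.shared_memory`; real CXL shared-memory programming interfaces differ by platform, and the `Descriptor` type here is a hypothetical stand-in for the small "pointer message" that crosses the fabric.

```python
from dataclasses import dataclass
from multiprocessing import shared_memory

@dataclass
class Descriptor:
    """The small 'pointer message' passed between hosts (hypothetical)."""
    name: str      # identifier of the shared region
    offset: int    # where the dataset starts
    length: int    # how many bytes it occupies

def produce(payload: bytes) -> Descriptor:
    """Write the dataset into shared memory once; return only a descriptor."""
    shm = shared_memory.SharedMemory(create=True, size=max(len(payload), 1))
    shm.buf[:len(payload)] = payload
    desc = Descriptor(shm.name, 0, len(payload))
    shm.close()  # producer drops its mapping; the data stays resident
    return desc

def consume(desc: Descriptor) -> bytes:
    """Attach to the same region and read in place - no bulk data transfer."""
    shm = shared_memory.SharedMemory(name=desc.name)
    data = bytes(shm.buf[desc.offset:desc.offset + desc.length])
    shm.close()
    return data

if __name__ == "__main__":
    desc = produce(b"sensor-frame-0001")
    # Any number of consumers can dereference the same descriptor.
    print(consume(desc))
    # Clean up the shared region once all consumers are done.
    shm = shared_memory.SharedMemory(name=desc.name)
    shm.close()
    shm.unlink()
```

Note that only the few bytes of the descriptor travel between producer and consumers; the payload itself is written once and read in place.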
Zero-copy data sharing can be further enhanced by CXL memory modules with built-in processing capabilities. For example, if a CXL memory module can perform a repetitive mathematical operation or data transformation on a data object entirely within the module, system bandwidth and power can be saved. These savings are achieved by commanding the memory module to execute the operation without the data ever leaving the module, a capability referred to as near memory compute (NMC). Additionally, the low-latency CXL fabric can be leveraged to send messages very quickly, with low overhead, from one host to another, between hosts and memory modules, or between memory modules. These connections can be used to synchronize steps and share pointers between producers and consumers. Beyond NMC and communication advantages, advanced memory telemetry can be added to CXL modules to provide a new window into real-world application traffic in the shared devices without burdening the host processors.
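The bandwidth saving from NMC can be illustrated with a toy model. The `MemoryModule` class and its command set below are hypothetical, not a real CXL API: the point is that the host issues one small command instead of reading the data out, transforming it, and writing it back.

```python
class MemoryModule:
    """Toy model of a memory module with near-memory compute (hypothetical)."""

    def __init__(self, size: int):
        self.cells = bytearray(size)
        self.bytes_moved_to_host = 0  # rough proxy for link traffic

    def write(self, offset: int, payload: bytes) -> None:
        self.cells[offset:offset + len(payload)] = payload

    def read(self, offset: int, length: int) -> bytes:
        self.bytes_moved_to_host += length  # data crossing the link to the host
        return bytes(self.cells[offset:offset + length])

    def nmc_scale(self, offset: int, length: int, factor: int) -> None:
        """Execute a repetitive operation entirely inside the module."""
        for i in range(offset, offset + length):
            self.cells[i] = (self.cells[i] * factor) % 256

mod = MemoryModule(1024)
mod.write(0, bytes([1, 2, 3, 4]))
mod.nmc_scale(0, 4, 10)            # one small command crosses the link
result = mod.read(0, 4)            # only the final result is fetched
print(result, mod.bytes_moved_to_host)
```

In this model, the link carries only the command and the final 4-byte result; a host-side read-modify-write loop would have moved the data across the link twice.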
With the insights gained, operating systems and management software can optimize data placement (memory tiering) and tune other system parameters to meet operating targets, from performance to energy consumption. More memory-intensive, value-add capabilities such as transactions are also well suited to NMC. Micron is excited to combine large, scale-out CXL global shared memory and enhanced memory features into our memory lake concept.
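As a sketch of how telemetry could drive tiering decisions, the function below maps pages to tiers by access frequency. The tier names, the threshold, and the shape of the telemetry data are all illustrative assumptions, not a real operating-system interface.

```python
def choose_tier(access_counts: dict[int, int],
                hot_threshold: int = 100) -> dict[int, str]:
    """Place frequently accessed pages in local DRAM, the rest in CXL memory.

    access_counts: page address -> observed access count (hypothetical
    telemetry, as a CXL module might report it).
    """
    return {page: ("local-dram" if count >= hot_threshold else "cxl-shared")
            for page, count in access_counts.items()}

telemetry = {0x1000: 500, 0x2000: 3, 0x3000: 150}
print(choose_tier(telemetry))
# {4096: 'local-dram', 8192: 'cxl-shared', 12288: 'local-dram'}
```

A real tiering policy would also weigh recency, page migration cost, and energy targets, but the structure is the same: telemetry in, placement decisions out.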