While Intel’s primary product focus is on the processors, or brains, that make computers work, system memory (that’s DRAM) is a critical component for performance. This is especially true in servers, where the multiplication of processing cores has outpaced the rise in memory bandwidth (in other words, the memory bandwidth available per core has fallen).
In heavy-duty computing jobs like weather modeling, computational fluid dynamics and certain types of AI, this mismatch could create a bottleneck — until now.
After several years of development with industry partners, Intel engineers have found a path to open that bottleneck, crafting a novel solution that has created the fastest system memory ever and is set to become a new open industry standard. The recently introduced Intel® Xeon® 6 data center processors are the first to benefit from this new memory, called MRDIMMs, for higher performance — in the most plug-and-play manner imaginable.
Intel’s Bhanu Jaiswal, Xeon product manager in Intel’s Data Center and AI (DCAI) group, explains that “a significant percentage of high-performance computing workloads are memory-bandwidth bound,” the kind most set to benefit from MRDIMMs.
It sounds almost too good to be true — here’s the story behind the DDR5 Multiplexed Rank Dual Inline Memory Module, or MRDIMM for storytelling efficiency.
Bringing Parallelism to System Memory, with Friends
It turns out the most common memory modules used for data center jobs, known as RDIMMs, do have parallel resources on board, like modern processors do. They’re just not used that way.
“Most DIMMs have two ranks for performance and capacity,” says George Vergis, a senior principal engineer in memory pathfinding in DCAI. “It’s the sweet spot.”
You can think of ranks like, well, banks: one set of memory chips on a module belongs to one rank and the rest belong to the other. With RDIMMs, data can be stored and accessed across multiple ranks independently but not simultaneously.
Considering this situation, recalls Vergis, “We’re like, ‘Wait a minute. We’ve got parallel resources that are unused. Why can’t we put them together?’” The idea that Vergis pursued was to put a small interface chip – a multiplexer or “mux” – on the DRAM module. It allows data to flow across both ranks of memory in the same unit of time.
The mux buffer consolidates each MRDIMM’s electrical load, which allows the interface to operate at a higher speed than an RDIMM’s. And because both ranks can now be accessed in parallel, the module can deliver data from two ranks in the time an RDIMM delivers it from one.
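The difference between taking turns and multiplexing can be sketched as a toy model. This is only an illustration: the function names, the abstract "time slot" and the word-by-word interleaving order are assumptions made for the example, not details of the actual mux chip or the JEDEC specification.

```python
from itertools import chain


def rdimm_read(rank0, rank1):
    """Toy RDIMM: one rank drives the bus at a time, so bursts are serialized."""
    stream = list(rank0) + list(rank1)
    return stream, len(stream)  # one word per (abstract) time slot


def mrdimm_read(rank0, rank1):
    """Toy MRDIMM: the mux takes a word from both ranks in every time slot."""
    if len(rank0) != len(rank1):
        raise ValueError("toy model assumes equal-length bursts")
    # Assumed interleaving order: rank0 word, then rank1 word, per slot.
    stream = list(chain.from_iterable(zip(rank0, rank1)))
    return stream, len(rank0)  # two words per (abstract) time slot


rank_a = ["a0", "a1", "a2", "a3"]
rank_b = ["b0", "b1", "b2", "b3"]

_, rdimm_slots = rdimm_read(rank_a, rank_b)
muxed, mrdimm_slots = mrdimm_read(rank_a, rank_b)

print(rdimm_slots, mrdimm_slots)  # 8 4
print(muxed)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2', 'a3', 'b3']
```

The slot counts describe only the memory side of this toy. How much faster a real module is depends on the speed of the interface to the processor, which the model does not capture.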
The result is the fastest system memory ever created, by a leap that would normally take several generations of memory technology to achieve. In this case, the peak data rate rises by almost 40%, from 6,400 megatransfers per second (MT/s) to 8,800 MT/s.
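For readers who think in gigabytes per second, the jump from 6,400 to 8,800 MT/s can be worked through in a few lines. The sketch assumes each transfer carries 8 bytes of data per module (a 64-bit data path, with ECC bits excluded), a figure that is standard for DDR5 modules but is not stated in this article. The function names are made up for the example.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64 data bits per module, ECC bits excluded


def peak_gb_per_s(mega_transfers_per_s):
    """Theoretical peak bandwidth of one module in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


def percent_gain(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


rdimm = peak_gb_per_s(6400)
mrdimm = peak_gb_per_s(8800)

print(f"RDIMM:  {rdimm:.1f} GB/s")   # RDIMM:  51.2 GB/s
print(f"MRDIMM: {mrdimm:.1f} GB/s")  # MRDIMM: 70.4 GB/s
print(f"Gain:   {percent_gain(6400, 8800):.1f}%")  # Gain:   37.5%
```

These are theoretical peaks per module. Sustained bandwidth in a real server is lower and depends on the workload and configuration.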
Same Standard Memory Module, Just Faster
At this point you might have your own “wait a minute” question: Is Intel getting back into the memory business? No. Although Intel started as a memory company, inventing the EPROM and producing the first commercially available DRAM chips, the company has exited its various memory product businesses over its history (some quite famously).
But Intel never stopped the “lifts all boats” efforts that make different computing components interoperable and higher performing. Vergis represents Intel on the board of JEDEC, which sets open standards for the microelectronics industry, most notably for memory. Vergis earned a JEDEC award in 2018 for his work on the DDR5 standard, and right now he’s devoting time to DDR6. (JEDEC also honored Intel CEO Pat Gelsinger this year for a career as “a strong proponent of open standards as evidenced by Intel’s historic leadership in developing standards.”)
Vergis and his colleagues started this work in 2018 and proved the concept with prototypes by 2021. Intel teamed up with the memory ecosystem to build the first components, and donated the specs to JEDEC as a new open standard in late 2022.
What stands out about the MRDIMM is its ease of use. It employs the same connector and form factor as a regular RDIMM (even the little mux chips fit in previously empty spots on the module), so it requires no changes to the motherboard.
MRDIMMs also bring along the same error correction and the same reliability, availability and serviceability (RAS) features as RDIMMs. Data integrity is maintained no matter how separate requests might be multiplexed across the data buffer, Vergis explains.
This all means data center customers can choose MRDIMMs when they order a new server, or later they can slide that server out of the rack and swap the RDIMMs for new MRDIMMs. Not a single line of code needs to change to enjoy newfound performance.
Xeon 6 + MRDIMM = 🚀🚀
What is required is a CPU that can work with MRDIMMs, and the first one available is the Intel Xeon 6 processor with Performance-cores, code-named Granite Rapids, which came to market this year.
Recent independent tests compared two identical Xeon 6 systems, one with MRDIMMs and the other with RDIMMs. The system with MRDIMMs completed jobs as much as 33% faster.
Jaiswal says the “improvement in bandwidth that MRDIMM delivers is very much applicable to small language models and traditional deep learning and recommendation system types of AI workloads that can easily run on Xeon and achieve a good performance boost with MRDIMM.”
Leading memory vendors have introduced MRDIMMs, with additional memory makers expected to launch more. High-performance computing labs – such as the National Institute for Quantum Science and Technology and National Institute for Fusion Science, among others – are actively adopting Xeon 6 with P-cores because of MRDIMMs, with support from OEMs like NEC.
“Intel definitely has a lead,” Jaiswal notes, “backed by a strong ecosystem of OEMs and memory vendors.”
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.