It often feels as though memory is an outlier in the technology world. While we’ve seen significant changes in compute power (both with CPU and GPU) and storage, memory development has been iterative rather than revolutionary.
While that approach has worked in the past, current memory technology is starting to cause challenges, due to an issue known as the “memory wall problem”. This occurs when a processor’s speed outpaces memory’s bandwidth and, as a result, the processor has to wait for data to be transferred from memory, introducing a bottleneck.
The performance restrictions caused by the memory wall problem are only getting worse, as CPU and GPU advancement continues to outpace improvements in memory architecture. And it’s being exacerbated by the growth of demanding, memory-intensive workloads such as high-performance computing (HPC) and AI, which didn’t exist at the same scales until relatively recently, but are now seeing rapid adoption.
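The effect is easy to see in miniature. The rough Python sketch below (the array sizes and loop counts are arbitrary, and results will vary by machine) runs the same arithmetic on a small array that fits in on-chip cache and on a large one that must be streamed from main memory; the drop in throughput on the large array is the memory wall in action.

```python
# An illustrative sketch of the memory wall (results vary by machine): the same
# multiply-add runs much faster on an array that fits in on-chip cache than on
# one that must be streamed from main memory, because the cores stall on DRAM.
import time
import numpy as np

def gflops(n, repeats):
    """Throughput of repeated element-wise multiply-adds on n doubles."""
    a = np.random.rand(n)
    b = np.random.rand(n)
    start = time.perf_counter()
    for _ in range(repeats):
        a = a * 1.000001 + b          # two floating-point ops per element
    elapsed = time.perf_counter() - start
    return (2 * n * repeats) / elapsed / 1e9

# Same total work in both cases, so only the working-set size differs.
cache_resident = gflops(50_000, 2_000)     # ~400 KB arrays: fit in cache
memory_bound = gflops(20_000_000, 5)       # ~160 MB arrays: streamed from DRAM

print(f"cache-resident: {cache_resident:.1f} GFLOP/s")
print(f"memory-bound:   {memory_bound:.1f} GFLOP/s")
```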
This issue is creating the need for new memory technologies, designed for modern workloads.
“Emerging memory technologies are being driven by the explosive growth in AI and machine learning, big data analytics, scientific computing, and hyperscale cloud datacentres,” said JB Baker, VP marketing and product management, ScaleFlux. “Traditional memory technologies like DRAM and NAND flash are reaching scaling, speed, and density limits that restrict the performance required for next-generation workloads.”
With AI demands highlighting the flaws in existing systems, Baker believes that new memory technologies are required.
“The needs for higher bandwidth, lower latency, greater capacity, and energy efficiency in AI and HPC applications are exposing the inadequacy of traditional solutions. We are indeed approaching the physical and economic end of the road for conventional memory scaling, making new architectures and technologies essential,” he noted.
While traditional memory technology still has a role to play, other factors are pushing the demand for emerging technologies.
David Norfolk, practice leader: development and governance, Bloor Research, explained: “AI hype is driving things - that and the need for vendors to sell something new with higher margins. I very much doubt that we are at the end of the road yet, but people always want more speed, scaling, and density. What may be a driver is more energy efficiency, less heat and more reliability - less waste.”
Defining the problem
High-performance workloads have numerous requirements, so no single emerging memory technology works across the board. For example, AI workloads require a significant volume of data, both for fresh processing and longer-term storage. That applies both to typical generative AI (Gen AI) services, such as ChatGPT, and to a growing number of physical machines that collect sensor data for decision-making. But it isn’t always clear what data is needed, when, and for how long it should be stored.
Martin Kunze, CMO, Cerabyte, explained, “It is not yet defined how much raw sensor data is needed for decision-making, and how long it needs to be retained when it comes to machine-human interaction. There have already been legal consequences for companies that didn’t keep enough data to reconstruct accidents caused by faulty AI decisions.”
Legal reasons, rather than purely technological ones, will have their part to play in how emerging memory technologies are provisioned and used.
“The ‘audit trail data’ will be one of many drivers that lead to the surging demand for data storage,” Kunze continued. “Current storage technologies are approaching their limits; analysts are forecasting a scenario where mainstream technologies can deliver only 50% of the required demand – a looming supply gap could put AI-related investments at risk.”
A tiered approach
Universal memory, which combines persistent memory and storage into a single unit, would seem to be the panacea, providing fixed storage for vast amounts of data and high speeds for processing on demand. However, that is unlikely to be a realistic proposition for some time, so tiered data using a variety of technologies will be the default in the short-to-medium term.
Arthur Sainio and Raghu Kulkarni, Persistent Memory Special Interest Group co-chairs, SNIA (The Storage Networking Industry Association), said, “Universal memory such as PCM and ULTRARAM promises to merge RAM speed with persistence, but faces manufacturing complexity, high costs, and scaling barriers. Tiered architectures will dominate short-to-medium term due to cost efficiency. Universal memory may see niche use (edge AI, aerospace) but requires material breakthroughs to displace tiered systems, likely post-2030. Hybrid solutions, like CXL PMEM + DRAM + SSDs, remain the pragmatic path.”
While technological hurdles stand in the way of such a technology, some are also concerned that collapsing memory and storage into a single type could inhibit performance.
“While the concept of universal memory is intellectually appealing, in practice we are likely to maintain a tiered storage architecture for the foreseeable future,” said Baker. “The technical gap between DRAM-class speeds and persistent storage-class latencies remains too large to collapse into a single layer without major compromises.”
While universal memory may not be an immediate solution to the memory wall problem, emerging memory technologies still have a big role to play, particularly in how tiers interoperate.
“Emerging memory technologies may narrow this gap, but they are more likely to create new tiers rather than eliminate the concept of tiered architectures altogether,” continued Baker.
Most experts agree that tiering for AI workloads will look different from tiering for traditional ones.
"In a future coined by AI, the typical segmentation of hot - warm - cool - cold data will very likely be increasingly blurry. Large chunks of cold data need to be warmed up quickly, and then after being processed, put back in cold storage. For example - to enhance AI training or AI-assisted search, for patterns in scientific data or to present sensor data was the basis for AI decisions in a liability case in court,” said Kunze.
When considering emerging memory, or indeed any future technology, it’s natural to expect higher performance will be one of the most significant new features being offered.
But that may not be the case. The combined demand for greater performance, higher capacity, and reduced overall costs means that vastly different technologies have their part to play, as Erfane Arwani, founder and CEO, Biomemory, explained: “DNA storage isn’t fast, but it’s insanely dense and lasts forever. Perfect for archiving AI models and massive datasets you don’t need to access often.”
Persistent memory
Fully fledged universal memory that can do everything required of traditional storage and memory may be a long way off, but there are alternative technologies, such as persistent memory, which is designed to retain data without requiring constant power (i.e. it’s non-volatile). Persistent memory promises to bridge the gap between storage and memory. But while this idea sounds great, it has been a rocky road for this technology so far, with the best-known example, Intel Optane, being abandoned.
As Joseph Lynn, executive VP of operations, Tarmin, explained, “Several factors contributed to this. Optane faced challenges due to its relatively high cost per GB compared to NAND flash, making it less appealing for capacity-sensitive applications. Further, when used in DIMM configuration, its performance, while better than NAND, did not fully match DRAM, so the performance/cost-benefit could not always be justified, when considering memory capacity and latency.”
These kinds of issues seem to be universal with current persistent memory technologies, preventing mass uptake and broad appeal, as Baker explained. “Major limitations include: high bit cost, which makes scaling economically challenging; long read/write latencies, which cannot match the speed requirements of latency-sensitive applications; and low read/write throughput, which bottlenecks throughput-intensive applications,” he said. “Emerging alternatives that offer better density, faster access, and lower energy per operation are increasingly attractive for AI and HPC workloads.”
So, again, we come back to the need for tiering, with emerging persistent memory technologies able to work in specific tiers. That includes slower, data-rich tiers, as Arwani noted: “DNA shows that tiered storage still makes sense. It’s ideal for the coldest layer - super dense, low-energy, and long-lasting.”
Faster persistent memory technologies have their place, although there are still hurdles that need to be overcome.
“Emerging alternatives like MRAM and ReRAM provide advantages such as near-SRAM speed, zero standby power in the case of MRAM, and analogue compute capabilities in the case of ReRAM, but face scalability and manufacturing hurdles. They are gaining some traction as they promise better scalability, energy efficiency, and performance for future HPC demands, but have hurdles to overcome,” said Sainio and Kulkarni. “CXL NV-CMM types of products offer DRAM-like speed and persistence, making them valuable for caching and checkpointing functions in HPC applications. High density hybrid CXL solutions are likely as well.”
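Whatever the underlying media, the programming model these persistent tiers expose is broadly the same: data is updated in place with ordinary loads and stores, and an explicit flush makes the update durable, which is what makes them attractive for checkpointing. The Python sketch below approximates that model with an ordinary memory-mapped file standing in for a persistent-memory region; the file name and size are illustrative.

```python
# A minimal sketch of the load/store-plus-flush model persistent memory exposes,
# using an ordinary memory-mapped file as a stand-in for a DAX-mapped
# persistent-memory region (the file name and size are illustrative).
import mmap
import os

PATH = "pmem_region.bin"
SIZE = 4096

# Create and size the backing region once.
with open(PATH, "wb") as f:
    f.truncate(SIZE)

fd = os.open(PATH, os.O_RDWR)
try:
    mm = mmap.mmap(fd, SIZE)
    # Updates are ordinary in-memory writes: no read()/write() system calls.
    mm[0:11] = b"hello world"
    # An explicit flush makes the update durable; with real persistent memory
    # this would be a cache-line flush rather than a page writeback.
    mm.flush()
    mm.close()
finally:
    os.close(fd)
```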
No new architectures
One thing that seems clear is that there will not be a new server architecture for HPC and AI workloads that replaces what we have today. Advances in CPU and GPU technology, and large investments in such platforms, still make general-purpose computing the best fit for most jobs.
As such, some emerging memory technologies are likely to be of more niche interest, suited to custom jobs that require the fastest speeds. Computational RAM (CRAM), where computations can take place directly in RAM, is a good example of this.
“Although CRAM offers compelling advantages for AI inference and acceleration in theory, it suffers from very limited programmability and restricted workload flexibility. As a result, CRAM is unlikely to replace the traditional server architecture for general HPC. Instead, it will at most be deployed selectively for niche applications,” said Baker.
Effective scaling and higher density
Irrespective of this, AI and HPC are pushing the requirements for more memory and require more flexible ways of using it. In that regard, continuing to push the boundaries of today’s memory technologies makes sense, as it can help maximise investment in current computing architecture.
At the core of memory development are two technologies that can help: 3D DRAM for increased capacity and Compute Express Link (CXL) for improved scaling and memory pooling.
“HPC and AI require both 3D DRAM for capacity and bandwidth, and CXL for scalable, cost-effective memory expansion. 3D DRAM, such as HBM3, is ideal for on-package, high-speed tasks like training large AI models due to its fast data access and energy efficiency. CXL will provide pooled memory for flexibility and persistent workloads,” said Sainio and Kulkarni. “A hybrid approach that combines these technologies is essential for efficiently meeting the growing demands of modern HPC and AI applications.”
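In practice, CXL-attached expansion memory typically appears to a Linux host as an additional, CPU-less NUMA node, so software can steer colder or capacity-hungry allocations towards it with standard NUMA APIs. The sketch below calls libnuma via Python's ctypes; it assumes libnuma is installed, and the node number is a placeholder that will differ per system.

```python
# A rough sketch of steering an allocation to a CXL expander exposed as a NUMA
# node (assumes Linux with libnuma installed; node 2 is a placeholder and will
# differ per system).
import ctypes
import ctypes.util

libnuma = ctypes.CDLL(ctypes.util.find_library("numa"))
libnuma.numa_available.restype = ctypes.c_int
libnuma.numa_alloc_onnode.restype = ctypes.c_void_p
libnuma.numa_alloc_onnode.argtypes = [ctypes.c_size_t, ctypes.c_int]
libnuma.numa_free.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

CXL_NODE = 2                 # placeholder: node ID of the CXL-attached memory
SIZE = 256 * 1024 * 1024     # 256 MB buffer for colder, capacity-hungry data

if libnuma.numa_available() != -1:
    buf = libnuma.numa_alloc_onnode(SIZE, CXL_NODE)
    # ... place colder tensors, caches, or pooled data in this region ...
    libnuma.numa_free(buf, SIZE)
```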
Emerging technologies also promise to maximise the investment in existing storage, which is particularly important given the need for a tiered approach to modern workloads.
Kunze gives an example: “Emerging technologies such as ceramic storage can free up expensive, higher-performing storage like HDDs, which today is used for storing cold data, for better uses.”
Emerging memory technologies also promise improved caching and access to data available on traditional storage technologies, such as flash and hard disks.
“Advanced caching strategies leveraging faster memory types - such as HBM or stacked DRAM - can significantly accelerate access to hot data, improving the performance of existing storage systems. Using persistent memory for metadata acceleration or tiered caching layers will continue to enhance storage efficiency without fundamentally redesigning architectures,” said Baker.
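In software terms, such a caching layer can be as simple as keeping hot metadata in fast memory in front of the slower store, as in the hypothetical sketch below; the function, path, and cache size are illustrative, and a real system would also handle invalidation.

```python
# A hypothetical sketch of a metadata-caching layer: hot lookups are served from
# an in-memory LRU cache (standing in for HBM, stacked DRAM, or persistent
# memory), and only misses fall through to the slower backing store.
from functools import lru_cache

@lru_cache(maxsize=65_536)
def read_metadata(object_id: str) -> str:
    # Cache miss: read from the slow tier (flash or disk in practice).
    with open(f"metadata/{object_id}.json") as f:
        return f.read()

# The first access for each object hits the backing store; repeat accesses are
# served from fast memory. (A real system would also handle invalidation.)
```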
Software is critical
While hardware may steal the limelight, software is essential in provisioning and managing data tiers. Crucially, software has to make life easier and work with what’s available, rather than changing how systems work.
This is a valuable lesson learned from Intel Optane, as Lynn explained: “A final hurdle for Optane was the slow adaptation of the software ecosystem. While, in theory, Optane DIMMs expanded memory transparently to the OS, in practice, optimising databases and file systems to take full advantage of its unique persistence and performance characteristics proved to be complex and time-consuming, further hindering its widespread and effective use.”
Software is critical to the success of any technology, particularly in a future where resources must be efficiently combined across different platforms.
“Software optimises workloads across CPUs, GPUs, TPUs, and CRAM by managing resources, scheduling tasks, and improving memory use. Tools like Kubernetes and TensorFlow ensure efficient hardware utilisation, while future innovations in AI-driven orchestration, unified APIs, and real-time monitoring will enhance performance and energy efficiency across heterogeneous platforms,” said Sainio and Kulkarni.
Barriers to uptake
While the AI explosion may make adoption of emerging memory technologies a foregone conclusion, there are still many risks, particularly around the investment in existing memory technologies. Demand for new technologies can be limited by what’s currently working.
A general resistance to new technology is something noted by Norfolk, who highlighted that one of the biggest barriers to adoption is “The amount of legacy tech still in use and working ‘well enough’ in many applications. Plus, general mistrust of anything too new unless there is no alternative.”
In a similar vein, new technology has to be demonstrably better than what’s available now. As Baker said, “New memory technologies must not only outperform but also offer acceptable economics compared to DRAM or NAND to achieve widespread adoption.”
These are factors that we’ve seen time and time again, but failure to invest in emerging technologies poses a risk of its own. As Kunze explained: “100x more money is invested in computation than in memory and storage. But without investment in newly scalable technologies, billions of investments in AI could be squandered due to lack of storage. This looming risk should be exposed to, and explored by, the AI and AI-investor community.”
The future is coming
Despite these warnings, the requirements of demanding computing workflows are only exacerbating the memory wall problem, increasing the need for novel solutions. Emerging memory technologies are required now more than ever, and wider adoption is only a matter of time.
"Looking five years ahead, the confluence of ever-increasing data intensity and the scaling of datasets suggests we are indeed on the cusp of a transformative period in memory technology, arguably the most significant in a generation. This relentless growth in data demands will necessitate radical advancements and new architectural approaches to overcome the limitations of current memory systems,” said Lynn.
Developments in scalability and density must be priorities for any new technology looking to successfully tackle the memory wall challenge. Thankfully, the building blocks of these technological advancements are already available.
“Breakthroughs in CXL-based memory and Racetrack memory could transform the industry. CXL will enable scalable, low-latency persistent memory integration, while Racetrack memory offers ultra-high density, faster speeds, and energy efficiency. These advancements can revolutionise AI, HPC, and edge computing performance,” said Sainio and Kulkarni.
It’s important to think about how data will be used to understand the future of emerging memory technologies, as Kunze explained: “There will be ‘hot storage’ and ‘not so hot storage.’ The distinction between hot and cold storage/data will disappear; rather, data will be classified by the need to make it immediately accessible or not.”
As a result, the future looks set to be based on multiple technologies, with tiering used to hit different requirements at different points in a system. That means emerging memory technologies, but also continuing to push the limits of what today’s technology can offer.
“We expect there will be more flavours of persistent and volatile memory. They will be based primarily on DDR cells but also NAND cells,” said Baker. “DDR-based memory will offer lower power, slower performance, and lower cost compared with standard DRAM, and will sit between DRAM and NAND in the compute hierarchy. Innovation in NAND memory will target expanding bandwidth across the overall compute hierarchy to meet the needs of AI and in-memory databases.”
Conclusion
You’ll probably be disappointed if you are expecting a new, emerging memory technology to become standardised in the near future. For the time being, the traditional tiered memory architecture isn’t going anywhere, and will continue to see iterative improvements to boost speed and capacity.
But, equally, the ever-growing demands of AI and HPC workloads mean that there’s a sense of urgency to solve the performance bottlenecks with current memory designs.
Held back by issues such as high costs, limited software support and a general resistance to technological change, emerging technologies have not caught on quite yet.
That said, there is clearly a sense that change is inevitable, sooner or later, and various approaches could be adopted to address the bottlenecks of current memory in the future.