A new research paper makes the case for a DRAM cache for GPUs, which could help enable higher performance at lower power.
The GPU industry, which spans consumer, workstation, and AI GPUs, continues to push memory capacity and bandwidth forward, but that trajectory isn't sustainable, and we could ultimately hit hard limits unless a more innovative approach is taken.
We have seen GPU makers advance this segment by incorporating large secondary LLCs (Last-Level Caches) or by increasing the size of their L2 caches. With that in mind, researchers have devised a new way of organizing GPU memory, particularly HBM, to break through today's capacity and bandwidth limits while making data transfer and management far more efficient.
In a research paper published on arXiv, researchers propose pairing GPU memory with a dedicated DRAM cache, similar in spirit to the DRAM caches found in modern SSDs. The DRAM cache serves as a high-speed staging area for memory, enabling an effective "fetch-and-execute" flow. Unlike the caches in SSDs, however, this design backs the cache with SCM (Storage-Class Memory), which is a more viable capacity alternative to modern-day HBM and also carries a lower per-bit dollar cost than DRAM.
The researchers propose a hybrid approach that uses SCM and DRAM together to reduce or avoid memory oversubscription and to deliver higher performance per unit of capacity.
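To illustrate the general idea, the sketch below models a direct-mapped DRAM cache sitting in front of a larger, slower SCM backing store. This is a hedged illustration, not the paper's actual design: the block size, cache capacity, and latency figures are all invented for the example.

```python
# Minimal sketch (not the paper's implementation): a direct-mapped DRAM
# cache in front of a slower, higher-capacity SCM backing store.
# All sizes and latencies below are illustrative assumptions.

BLOCK_SIZE = 64            # bytes per cache block (assumed)
NUM_BLOCKS = 1024          # DRAM cache capacity in blocks (assumed)
DRAM_LATENCY_NS = 50       # assumed DRAM hit latency
SCM_LATENCY_NS = 500       # assumed SCM access latency

class DramCache:
    def __init__(self):
        # Each slot records the tag of the SCM block currently cached there.
        self.tags = [None] * NUM_BLOCKS

    def access(self, addr: int) -> int:
        """Return the latency of serving one access, filling on a miss."""
        block = addr // BLOCK_SIZE
        slot = block % NUM_BLOCKS        # direct-mapped placement
        tag = block // NUM_BLOCKS
        if self.tags[slot] == tag:
            return DRAM_LATENCY_NS       # hit: served from the DRAM cache
        self.tags[slot] = tag            # miss: fetch the block from SCM
        return SCM_LATENCY_NS + DRAM_LATENCY_NS

cache = DramCache()
# Re-touching the same address shows the fetch-once, hit-after behavior.
print(cache.access(0x1000))  # miss -> 550 ns
print(cache.access(0x1000))  # hit  -> 50 ns
```

The point of the hybrid arrangement is that hot data is served at DRAM speed while SCM provides the bulk capacity, so the GPU spills less often to host memory.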
The research is quite in-depth, as expected, and it proposes multiple data-flow models to aid the SCM data-fetching process. One of them is the Aggregated Metadata-In-Last-column (AMIL) DRAM cache organization, which aims to speed up the lookup of "data tags," the metadata that tells the GPU where a given piece of data resides in the cache.
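As a rough illustration of the aggregation idea (the row geometry, field names, and values below are assumptions for the example, not the paper's actual layout), the sketch keeps all of a row's tags together in a single last column, so one column read resolves the hit/miss check instead of probing per-block metadata scattered across the row:

```python
# Minimal sketch of the AMIL concept as described in the article: tags for
# every block in a DRAM row are aggregated into that row's last column, so
# a single column read returns all the tags needed to resolve a lookup.

BLOCKS_PER_ROW = 31   # data blocks per DRAM row (assumed)

class AmilRow:
    def __init__(self):
        self.data = [None] * BLOCKS_PER_ROW        # cached data blocks
        self.tag_column = [None] * BLOCKS_PER_ROW  # aggregated tags (last column)

    def lookup(self, slot: int, tag: int):
        tags = self.tag_column        # one column access fetches every tag
        if tags[slot] == tag:
            return self.data[slot]    # hit: data is in this row
        return None                   # miss: caller fetches from SCM

    def fill(self, slot: int, tag: int, block):
        self.data[slot] = block
        self.tag_column[slot] = tag   # tag updated alongside the data

row = AmilRow()
row.fill(3, tag=0x2A, block=b"...payload...")
print(row.lookup(3, 0x2A))  # hit: resolved with a single tag-column read
print(row.lookup(3, 0x2B))  # miss: tag mismatch
```

Co-locating the aggregated tags with the data row is what lets the design avoid separate tag-fetch traffic, which is the overhead AMIL is meant to cut down.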