2024 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Abstract

DRAM-based processing-in-memory (PIM) architectures achieve high energy efficiency and throughput for data-intensive applications. However, due to the stringent area, power, and timing constraints of DRAM, only processing elements with limited functionality can be integrated close to the DRAM's memory banks. This paper presents CORD-PIM, an algorithm and architecture co-design that enhances the complexity and the set of functions implemented in a PIM architecture. CORD-PIM provides an energy- and performance-efficient implementation of transcendental functions (TFs) using the CORDIC algorithm for different input precisions (4, 8, and 16 bits). Simulation results on a vector of 1 million 16-bit elements show that CORD-PIM, on average, achieves 36× higher energy efficiency and about 286× higher throughput than a standard C/C++ CPU implementation. CORD-PIM is demonstrated on a 512×512 discrete Fourier transform (DFT), which requires a detailed computation of TFs. The results show an improvement of 20× in energy efficiency and 120× in throughput versus a CPU, and an average improvement of 3.6× in energy efficiency and 28× in throughput against ASICs.
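
For readers unfamiliar with CORDIC, the following is a minimal sketch of the textbook rotation-mode algorithm in fixed-point arithmetic. It is not the CORD-PIM implementation; the function name `cordic_sin_cos`, the 16 iterations, and the 14 fractional bits are assumptions chosen purely for illustration. It shows why the algorithm suits constrained processing elements: each iteration needs only shifts, additions, and one table lookup.

```python
import math

def cordic_sin_cos(theta, iterations=16, frac_bits=14):
    """Approximate (sin(theta), cos(theta)) for theta in [-pi/2, pi/2].

    Illustrative fixed-point CORDIC (rotation mode). The iteration count
    and word format are assumptions, not values taken from the paper.
    """
    scale = 1 << frac_bits
    # Table of atan(2^-i) in fixed point; hardware would store this as a ROM.
    atan_table = [round(math.atan(2.0 ** -i) * scale) for i in range(iterations)]
    # Each micro-rotation stretches the vector; starting x at the inverse
    # of the total gain cancels that stretch up front.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x = round(gain * scale)    # starts as K * 1.0
    y = 0
    z = round(theta * scale)   # residual angle still to rotate through
    for i in range(iterations):
        # Rotate toward z == 0 using only shifts and adds.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - atan_table[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + atan_table[i]
    return y / scale, x / scale
```

Calling `cordic_sin_cos(math.pi / 6)` returns values close to `math.sin` and `math.cos` of the same angle, with an error on the order of the fixed-point resolution. Reducing `frac_bits` and `iterations` trades accuracy for fewer operations, which is the kind of trade-off that varying input precision exposes.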
