Abstract
As the scaling of memory density slows physically, a promising solution is to scale memory logically by enhancing the CPU's memory controller to encode and store data more densely in memory. This is known as hardware memory compression. Hardware memory compression decouples OS-managed physical memory from actual memory (i.e., DRAM); the memory controller spends a dynamically varying amount of DRAM on each physical page, depending on the compressibility of the page's content. The newly-decoupled actual memory effectively forms a new layer of memory beyond the traditional layers of virtual, pseudo-physical, and physical memory. We note unlike these traditional memory layers, each with its own specialized allocation interface (e.g., malloc/mmap for virtual memory, page tables+MMU for physical memory), this new layer of memory introduced by hardware memory compression still awaits its own unique memory allocation interface; its absence makes the allocation of actual memory imprecise and, sometimes, even impossible. Imprecisely allocating less actual memory, and/or unable to allocate more, can harm performance. Even imprecisely allocating more actual memory to some jobs can be harmful as it can result in allocating less actual memory to other jobs in highly-occupied memory systems, where compression is useful. To restore precise memory allocation, we design a new memory allocation specialized for this new layer of memory and, subsequently, architect a new MMU-like component in the memory controller and tackle the corresponding design challenges. We create a full-system FPGA prototype of a hardware-compressed memory system with precise memory allocation. Our evaluations using the prototype show that jobs perform stably under colocation. The performance variation is only 1%-2%; in comparison, it is 19%-89% under the prior art.