
Abstract

The ever-widening processor-memory speed gap makes program performance increasingly dependent on on-chip memory hierarchy design and data management in chip multiprocessors. Traditional data management mechanisms for large distributed caches, however, account for neither the asymmetric distribution of on-chip memory accesses nor the non-uniform access latency, making it difficult to strike an intelligent trade-off between hit rate and hit latency, which strongly affects memory efficiency. To tackle this problem, this paper presents a novel pressure self-adapting dynamic non-uniform cache architecture (PSA-NUCA). By integrating the replica and activity aware pseudo-LRU replacement policy (RAA-LRU), the enhanced first-touch mapping policy based on selective victim retention (FT-SVR), and the pressure aware adaptive replication policy (PA-ARP) into a unified data management framework, PSA-NUCA effectively eases the tension between miss rate and hit latency while accounting for both the asymmetric distribution of memory accesses and the non-uniform access latency. Simulation results from a full-system simulator show that PSA-NUCA outperforms the baseline shared non-uniform cache architecture by an average of 7.78% on the multi-threaded benchmarks we examined, with negligible hardware overhead.
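
The abstract names three cooperating policies; as a concrete illustration of the first, the sketch below shows how a replica- and activity-aware victim choice could sit on top of a standard tree pseudo-LRU for one 8-way cache set. The metadata fields (is_replica, activity), the preference order, and the tie-breaking rules are assumptions made for illustration only, not the paper's actual RAA-LRU policy.

```cpp
// Illustrative sketch only: replica- and activity-aware victim selection for
// one 8-way set, loosely in the spirit of RAA-LRU as summarized above.
// Field names, thresholds, and preference order are assumptions, not the
// paper's actual policy.
#include <array>
#include <cstdint>
#include <iostream>

struct LineMeta {
    bool    valid      = false;
    bool    is_replica = false;  // block is a replica of a remote home copy
    uint8_t activity   = 0;      // saturating reuse counter
};

class RaaLruSet {
public:
    static constexpr int kWays = 8;

    // Pick a victim: prefer a free way, then a cold replica (cheap to drop,
    // since the home copy still exists), otherwise fall back to tree PLRU.
    int pick_victim() const {
        for (int w = 0; w < kWays; ++w)
            if (!meta_[w].valid) return w;                       // free way
        for (int w = 0; w < kWays; ++w)
            if (meta_[w].is_replica && meta_[w].activity == 0)
                return w;                                        // cold replica
        return plru_victim();                                    // plain PLRU
    }

    // Update metadata and the PLRU tree on a hit or fill.
    void touch(int way, bool is_replica) {
        meta_[way].valid      = true;
        meta_[way].is_replica = is_replica;
        if (meta_[way].activity < 255) ++meta_[way].activity;
        update_plru(way);
    }

private:
    // 7-bit tree pseudo-LRU for 8 ways: each node bit points toward the
    // less recently used subtree (0 = left, 1 = right).
    int plru_victim() const {
        int node = 0, way = 0;
        for (int level = 0; level < 3; ++level) {
            int bit = (plru_bits_ >> node) & 1;
            way  = (way << 1) | bit;
            node = 2 * node + 1 + bit;
        }
        return way;
    }

    void update_plru(int way) {
        int node = 0;
        for (int level = 2; level >= 0; --level) {
            int bit = (way >> level) & 1;
            // Point the node away from the way just touched.
            if (bit) plru_bits_ &= static_cast<uint8_t>(~(1u << node));
            else     plru_bits_ |= static_cast<uint8_t>(1u << node);
            node = 2 * node + 1 + bit;
        }
    }

    std::array<LineMeta, kWays> meta_{};
    uint8_t plru_bits_ = 0;
};

int main() {
    RaaLruSet set;
    for (int w = 0; w < 8; ++w) set.touch(w, /*is_replica=*/w % 2 == 0);
    std::cout << "victim way: " << set.pick_victim() << "\n";  // prints 0
    return 0;
}
```

The design intuition behind preferring a cold replica is that the home tile still holds the block, so evicting it loses neither the only on-chip copy nor any dirty data; only when no such line exists does the set fall back to plain pseudo-LRU.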