2014 IEEE International Conference on High Performance Computing and Communications (HPCC), 2014 IEEE 6th International Symposium on Cyberspace Safety and Security (CSS) and 2014 IEEE 11th International Conference on Embedded Software and Systems (ICESS)

Abstract

Memory accesses limit the performance of stream processors. The stream compiler exploits the reuse of records distributed across different ALU clusters by introducing inter-cluster communication, which degrades program performance. This paper presents the Stream Transpose (ST) approach to exploiting such reuse. By reorganizing the data, the approach places data that would otherwise be distributed across neighboring ALU clusters on the same ALU cluster, and so exploits the reuse without any inter-cluster communication. Experimental results show that the approach can exploit the reuse of records distributed among ALU clusters without any inter-cluster communication or any degradation of stream accesses, and achieves a speedup of up to 1.46 over the approach that uses inter-cluster communication.
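The data reorganization described in the abstract can be illustrated with a small sketch. It is not the paper's implementation. It assumes, hypothetically, that a stream of N records is dealt out to C ALU clusters in interleaved order (record i goes to cluster i mod C), and that the transpose treats the stream as a matrix and reorders it so that each cluster receives a contiguous block instead. The function names and the interleaved-distribution model are assumptions made for illustration only.

```python
def records_per_cluster(stream, num_clusters):
    """Assumed SIMD distribution: position p of the stream goes to cluster p % C."""
    return [stream[c::num_clusters] for c in range(num_clusters)]


def stream_transpose(records, num_clusters):
    """Reorder the stream so the assumed interleaved distribution hands each
    cluster a contiguous block of the original records (hypothetical sketch)."""
    n = len(records)
    if n % num_clusters != 0:
        raise ValueError("stream length must be a multiple of the cluster count")
    block = n // num_clusters
    # Position p lands on cluster p % C at iteration p // C, so place there
    # the original record number (cluster * block + iteration).
    return [records[(p % num_clusters) * block + p // num_clusters]
            for p in range(n)]


def cross_cluster_neighbor_pairs(per_cluster):
    """Count neighboring record pairs (i, i+1) that sit on different clusters.
    Assumes the records are the integers 0..N-1."""
    owner = {rec: c for c, recs in enumerate(per_cluster) for rec in recs}
    return sum(1 for i in range(len(owner) - 1) if owner[i] != owner[i + 1])


records = list(range(8))
interleaved = records_per_cluster(records, 4)
transposed = records_per_cluster(stream_transpose(records, 4), 4)
```

With 8 records and 4 clusters, the interleaved layout gives `[[0, 4], [1, 5], [2, 6], [3, 7]]`, so all 7 neighboring pairs straddle two clusters. The transposed layout gives `[[0, 1], [2, 3], [4, 5], [6, 7]]`, leaving only the 3 pairs at block boundaries split. The abstract does not say how the paper handles such boundary records, so the sketch leaves them as they are.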