2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)

Abstract

Due to recent progress in artificial intelligence (AI) technology, more and more applications exhibit all-to-all, irregular, or unpredictable communication patterns among compute nodes in high-performance computing (HPC) systems. Traditional communication infrastructures, e.g., torus or fat-tree interconnection networks, may not match these newly emerging applications well. For these typical non-random network topologies, many communication-efficient application mapping algorithms already exist. However, for the unpredictable communication patterns above, it is difficult to map applications efficiently onto non-random network topologies. In this case, a simple optimization is to map each application so that the diameter or average shortest path length (ASPL) among its assigned compute nodes is small. In this context, we recommend using random network topologies, which have drawn increasing attention as HPC interconnects, as the communication infrastructure. In this study, we compare non-random and random network topologies to analyze the performance impact of application mapping. We list several application mapping policies and compare their job scheduling performance, assuming that the communication patterns are unpredictable to the computing system. Evaluations with a large compound application workload show that, compared to non-random topologies, random topologies can reduce the average turnaround time by up to 39.3% with a random connected mapping method and by up to 72.1% with a diameter/ASPL-based mapping method.
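To make the two mapping metrics concrete, the following is a minimal sketch of how the diameter and ASPL among a set of assigned compute nodes might be computed. It is an illustration only, not the authors' implementation: the function names, the adjacency-list representation, the 8-node ring example, and the choice to measure hop counts over the whole network (so paths may pass through unassigned nodes) are all assumptions made here, and the network is assumed to be connected and unweighted.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Hop counts from `source` to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_aspl(adj, assigned):
    """Return (diameter, ASPL) over all pairs of assigned nodes.

    Assumption: distances are measured in the full network, and the
    network is connected, so every pair has a finite distance.
    """
    nodes = sorted(assigned)
    if len(nodes) < 2:
        return 0, 0.0
    dist_from = {s: bfs_distances(adj, s) for s in nodes}
    lengths = [dist_from[a][b] for a, b in combinations(nodes, 2)]
    return max(lengths), sum(lengths) / len(lengths)


def ring(n):
    """Hypothetical example topology: an n-node ring (a 1-D torus)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


if __name__ == "__main__":
    network = ring(8)
    # Two candidate placements for a 4-node job on the same network.
    for candidate in ({0, 1, 2, 3}, {0, 2, 4, 6}):
        d, aspl = diameter_and_aspl(network, candidate)
        print(sorted(candidate), "diameter:", d, "ASPL:", round(aspl, 3))
```

Under these assumptions, a diameter/ASPL-based policy would prefer the candidate with the smaller values, here the contiguous placement.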