2015 IEEE 9th International Conference on Self-Adaptive and Self-Organizing Systems (SASO)

Abstract

The Power3 processor is a 64-bit implementation of the PowerPC(TM) architecture and the successor to the Power2(TM) processor for workstations and servers that require high-performance floating-point capability. Previous processors used Newton-Raphson algorithms to implement divide and square root. The Power3 processor has a longer pipeline latency, which would substantially increase the latency of these instructions if Newton-Raphson were retained. Instead, new algorithms based on power series approximations were developed, which provide significantly better performance than the Newton-Raphson algorithm on this processor. This paper describes the algorithms and then shows how both the series-based algorithms and the Newton-Raphson algorithms are affected by pipeline length. For the Power3, the power series algorithms reduce divide latency by over 20% and square root latency by 35%.
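To make the contrast between the two families of algorithms concrete, here is a minimal software sketch for the reciprocal 1/b, from which a quotient a/b follows by one multiply. It is not the Power3 implementation, which the abstract does not specify. The function names, the initial estimate `x0`, and the iteration and term counts are all assumptions chosen for illustration. Newton-Raphson refines the estimate with x ← x(2 − bx), and each step depends on the previous one. The series form computes the error e = 1 − b·x0 once and then evaluates x0(1 + e + e² + …).

```python
def reciprocal_newton(b, x0, iterations):
    """Approximate 1/b by Newton-Raphson, starting from the estimate x0.

    Each pass uses the result of the previous pass, so the multiplies
    form one serial dependency chain.
    """
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - b * x)
    return x


def reciprocal_series(b, x0, terms):
    """Approximate 1/b as x0 * (1 + e + e^2 + ... + e^terms), e = 1 - b*x0.

    Since b*x0 = 1 - e, we have 1/b = x0 / (1 - e), and the geometric
    series for 1/(1 - e) converges when |e| < 1.
    """
    e = 1.0 - b * x0
    # Horner evaluation of the truncated series. This Python loop is
    # sequential; the powers of e could instead be computed as
    # independent products, which is an assumption about how hardware
    # might overlap them, not a statement about the Power3 design.
    s = 1.0
    for _ in range(terms):
        s = 1.0 + e * s
    return x0 * s


def divide(a, b, x0):
    """Hypothetical helper: a/b as a times the series reciprocal of b."""
    return a * reciprocal_series(b, x0, 16)
```

With b = 3 and x0 = 0.3, the initial error is e = 0.1. Newton-Raphson squares the error on every pass, while the truncated series leaves a relative error of about e^(terms+1), so both converge to 1/3 in this example.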