2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Abstract

Existing implementations of transformer networks on field-programmable gate arrays (FPGAs) either focus only on attention computation or suffer from a fixed model structure that lacks flexibility. In this article, we propose an FPGA-based overlay processor, named Transformer-OPU, for general acceleration of transformer networks. Experimental results show that our Transformer-OPU achieves 5.19-15.06× and 1.14-2.89× speedup compared with CPU and GPU, respectively. It also achieves 1.10-2.47× better latency than previous customized FPGA accelerators and is 1.45× faster than NPE.