Abstract
Artificial intelligence (AI) hardware accelerator is an emerging research for several applications and domains. The hardware accelerator's direction is to provide high computational speed with retaining low-cost and high learning performance. The main challenge is to design complex machine learning models on hardware with high performance. This article presents a thorough investigation into machine learning accelerators and associated challenges. It describes a hardware implementation of different structures such as convolutional neural network (CNN), recurrent neural network (RNN), and artificial neural network (ANN). The challenges such as speed, area, resource consumption, and throughput are discussed. It also presents a comparison between the existing hardware design. Last, the article describes the evaluation parameters for a machine learning accelerator in terms of learning and testing performance and hardware design.