2019 IEEE/ACIS 18th International Conference on Computer and Information Science (ICIS)

Abstract

The application of Deep Neural Networks (DNNs) has greatly improved speech recognition accuracy. A variety of DNN-based speaker recognition methods have recently been proposed, and they substantially outperform traditional recognition methods. However, DNN-based speaker recognition models are usually used only to extract bottleneck features or to compute the sufficient statistics in the traditional i-vector model. Such models train slowly, and their recognition accuracy leaves room for improvement. In this paper, a Convolutional Neural Network (CNN) is used for speaker recognition, which reduces the complexity of model training. Skip connections are used to mitigate the network degradation problem, and joint supervision with softmax loss and center loss is introduced to further improve the performance of the recognition model. Experimental results show that the end-to-end CNN for speaker recognition improves the Equal Error Rate (EER) by 13.4% compared with the baseline method.
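
The joint supervision by softmax loss and center loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation in which the total loss is the softmax cross-entropy plus a weighted half squared distance between each embedding and the center of its class. The abstract does not give the exact formulation, so the function names, the weight `lam`, and its default value are all hypothetical and not taken from the paper.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def center_loss(embedding, center):
    """Half the squared Euclidean distance between an embedding and its class center."""
    return 0.5 * sum((e - c) ** 2 for e, c in zip(embedding, center))


def joint_loss(logits, embeddings, labels, centers, lam=0.01):
    """Batch mean of softmax loss + lam * center loss.

    lam is an assumed weighting value, not one reported in the paper.
    """
    total = 0.0
    for z, x, y in zip(logits, embeddings, labels):
        total += softmax_cross_entropy(z, y) + lam * center_loss(x, centers[y])
    return total / len(labels)


# Toy batch: one utterance, two speakers, 2-dimensional embeddings.
loss = joint_loss(
    logits=[[0.0, 0.0]],
    embeddings=[[1.0, 0.0]],
    labels=[0],
    centers=[[0.0, 0.0], [1.0, 1.0]],
    lam=1.0,
)
```

In this sketch the softmax term separates speakers, while the center term pulls embeddings of the same speaker toward a shared center. In a real training loop the centers would also be updated as training proceeds, which is omitted here.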