A speaker recognition system based on joint factor analysis (JFA) is proposed to improve the recognition rate for whispering speakers under channel mismatch. The system estimates the eigenvoices and the eigenchannels separately and then computes the corresponding speaker and channel factors. Finally, model compensation is used to build a channel-free speaker model that describes each speaker accurately. Tests on whispered speech databases recorded over eight different channels show that the JFA-based system achieves a higher correct recognition rate than a system based on the Gaussian mixture model-universal background model (GMM-UBM). In particular, the recognition rate in the cellphone channel tests increases significantly.
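The abstract gives no equations, so the following is only the usual textbook form of the JFA decomposition, written out to make the roles of the eigenvoices, eigenchannels, and the two sets of factors concrete. The symbols are assumptions rather than the authors' notation: m is the UBM mean supervector, V and U are the eigenvoice and eigenchannel matrices, y_s and x_{s,h} are the speaker and channel factors for speaker s in session h, and D z_s is an optional speaker residual term that the abstract does not mention.

```latex
% Standard JFA supervector model (assumed notation, not taken from the abstract)
\begin{align}
  M_{s,h}   &= m + V y_s + U x_{s,h} + D z_s && \text{(speaker- and channel-dependent supervector)} \\
  \hat{M}_s &= m + V y_s + D z_s             && \text{(channel-free speaker model after compensation)}
\end{align}
```

Under this reading, model compensation amounts to dropping the channel term U x_{s,h} once the channel factors have been estimated, so that only speaker-dependent components remain in the model.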
Despite the growing importance of packet-switched systems, thorough analyses of the effect of VoIP transmission on speech and speaker recognition performance are still scarce. Voice over IP (VoIP) transmission relies on packet switching, which offers no guarantee of delivery. Its main disadvantage is packet loss, which has a major impact on the performance experienced by network users. Several techniques, collectively referred to as packet loss concealment, exist to mask the effects of packet loss. In this study, the effect of voice transmission over IP on the performance of an automatic speaker verification system was investigated. The analyzed system was based on MAP-EM-GMM modelling. Four speech codecs of the H.323 standard were investigated, with special emphasis on packet loss and on various packet loss concealment techniques.
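To illustrate what packet loss and its concealment look like at the signal level, here is a minimal Python sketch. The abstract does not say which concealment techniques or loss model were studied, so everything here is an assumption: independent random losses, and two simple, commonly described strategies (silence substitution and repetition of the last good packet). The function names and the toy packets are hypothetical.

```python
import random


def simulate_packet_loss(packets, loss_rate, seed=0):
    """Return a copy of `packets` in which lost packets are replaced by None.

    Losses are drawn independently per packet (an assumed loss model).
    """
    rng = random.Random(seed)
    return [None if rng.random() < loss_rate else p for p in packets]


def conceal_with_silence(received, packet_len):
    """Silence substitution: fill every lost packet with zeros."""
    return [p if p is not None else [0.0] * packet_len for p in received]


def conceal_with_repetition(received, packet_len):
    """Packet repetition: fill every lost packet with the last good packet.

    If no packet has arrived yet, zeros are used instead.
    """
    out = []
    last_good = [0.0] * packet_len
    for p in received:
        if p is not None:
            last_good = p
        out.append(list(last_good))
    return out


# Toy stream: ten hypothetical packets of four samples each.
packets = [[float(i)] * 4 for i in range(1, 11)]
received = simulate_packet_loss(packets, loss_rate=0.3, seed=1)
silence_filled = conceal_with_silence(received, packet_len=4)
repeat_filled = conceal_with_repetition(received, packet_len=4)
```

In a speaker verification experiment, the concealed stream would be passed to feature extraction in place of the original signal, so the two strategies can be compared by their effect on verification error. Real codecs typically use more elaborate concealment than either of these.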