A Comparative Study of Classifier Based Mispronunciation Detection System for Confusing

M. Maqsood, H. A. Habib, S. M. Anwar, M. A. Ghazanfar, T. Nawaz


Pronunciation training systems detect mispronunciations in a language learner's speech and provide useful feedback. Mispronunciation detection systems can be built either on Confidence Measures (CM) or on classifiers that use Acoustic Phonetic Features (APF). This paper presents an APF-based computer-assisted pronunciation training (CAPT) system for the most confusing Arabic phoneme pairs, (/ط/ vs. /ت/) and (/ح/ vs. /خ/ or /ه/), developed for subjects of Pakistani origin. A super-vector is formed from the APF, consisting of Mel-frequency cepstral coefficients (MFCCs) with their first and second derivatives, energy, zero-crossing rate, spectral features, and pitch. A large dataset was recorded from 200 speakers of Pakistani origin learning Arabic as a second language. Four machine learning classifiers, Random Forest, Naïve Bayes, AdaBoost, and K-NN, were used for mispronunciation detection and compared with the standard Goodness of Pronunciation (GOP) method. The results show that Random Forest outperforms all other methods by a significant margin.
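As a rough illustration of how a fixed-length feature super-vector of the kind described above might be assembled, the following Python sketch computes two of the listed features (short-time energy and zero-crossing rate) per frame, derives first and second differences, and averages each track over the utterance. It is not the authors' implementation. The frame length, hop size, central-difference delta formula, and averaging step are all assumptions; the derivatives are applied to the energy track only as a stand-in for the MFCC derivatives; and MFCC, spectral, and pitch extraction are omitted. All function names are hypothetical.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def deltas(track):
    """Central difference over a feature track, with edge values replicated.

    Assumption: a simple two-point difference; real systems often use a
    regression window over several frames.
    """
    n = len(track)
    return [(track[min(i + 1, n - 1)] - track[max(i - 1, 0)]) / 2.0
            for i in range(n)]


def super_vector(samples, frame_len=400, hop=160):
    """Build a 4-dimensional vector: mean energy, mean ZCR, mean delta, mean delta-delta.

    Assumption: 400-sample frames with a 160-sample hop (25 ms / 10 ms at 16 kHz).
    """
    frames = frame_signal(samples, frame_len, hop)
    if not frames:
        raise ValueError("signal is shorter than one frame")
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]
    d_energy = deltas(energy)
    dd_energy = deltas(d_energy)
    # Averaging over frames yields a fixed-length vector regardless of duration.
    return [sum(t) / len(t) for t in (energy, zcr, d_energy, dd_energy)]


if __name__ == "__main__":
    # Synthetic 100 Hz tone sampled at 16 kHz for 0.1 s, used only as dummy input.
    tone = [math.sin(2 * math.pi * 100 * n / 16000) for n in range(1600)]
    print(super_vector(tone))
```

A vector built this way (extended with the remaining features) could then be passed to any of the classifiers named in the abstract; the choice of averaging over frames is only one of several ways to obtain a fixed-length input.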
