JAIT 2025 Vol.16(3): 330-341
doi: 10.12720/jait.16.3.330-341

Text-Independent Speaker Identification Using Arabic Phonemes

Samiha R. Alarjani 1,*, Imran Rao 2, Iram Fatima 3, and Hafiz Farooq Ahmad 1
1. Computer Science Department, College of Computer Sciences and Information Technology (CCSIT), King Faisal University, P.O. Box 400, Al-Ahsa 31982, Saudi Arabia
2. Blue Brackets
3. Independent Researcher
Email: 222401728@student.kfu.edu.sa, Same7a.567@gmail.com (S.R.A.); imranrao@gmail.com (I.R.); iram.fa@gmail.com (I.F.); hfahmad@kfu.edu.sa (H.F.A.)
*Corresponding author

Manuscript received August 25, 2024; revised October 21, 2024; accepted November 15, 2024; published March 6, 2025.

Abstract—Speaker identification has become a fundamental part of digital speech processing and user authentication. However, challenges persist, including noisy environments, variations in speakers' emotions, and differences in utterance length. Despite improvements in speech processing techniques, further enhancements are needed, especially in text-independent Arabic speaker identification. This research therefore investigates the use of phonemes for speaker identification, which has not been thoroughly explored for the Arabic language. The study uses three machine learning models: the Gaussian Mixture Model (GMM), the Support Vector Machine (SVM), and the Gaussian Mixture Model-Universal Background Model (GMM-UBM). It employs Mel-Frequency Cepstral Coefficient (MFCC) features and specially constructed datasets of short Arabic phonemes collected from 104 speakers. The models are evaluated using accuracy, precision, recall, and F1-score. The results show that the GMM performs best when speakers are identified from 20 MFCCs, while the SVM performs best with 40 MFCCs, with accuracies of 96.2% and 95.7%, respectively. The analysis indicates that at least three Arabic phonemes are required to identify speakers accurately. This research provides insights into Arabic speaker identification from short utterances, which can support the future development of reliable speaker identification systems that work with short audio samples.
 
Keywords—speaker identification, Arabic, phonemes, Mel-Frequency Cepstral Coefficients (MFCC), machine learning, Gaussian Mixture Model (GMM), Support Vector Machine (SVM), Gaussian Mixture Model-Universal Background Model (GMM-UBM)
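
The abstract names accuracy, precision, recall, and F1-score as evaluation metrics but does not say how the per-speaker scores are averaged. The following is a minimal sketch, assuming macro averaging (each speaker weighted equally); the function name `macro_metrics` and the speaker labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro averaging over speakers. The paper's abstract
    does not state which averaging scheme the authors used.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct identification of this speaker
        else:
            fp[pred] += 1    # predicted speaker was wrongly claimed
            fn[truth] += 1   # true speaker was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical labels for six test utterances from three speakers.
    y_true = ["spk1", "spk1", "spk2", "spk2", "spk3", "spk3"]
    y_pred = ["spk1", "spk2", "spk2", "spk2", "spk3", "spk3"]
    for name, value in macro_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

With many speakers and few utterances per speaker, macro averaging keeps a single poorly recognized speaker visible in the summary figures, whereas a micro average of single-label predictions would simply equal the accuracy.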

Cite: Samiha R. Alarjani, Imran Rao, Iram Fatima, and Hafiz Farooq Ahmad, "Text-Independent Speaker Identification Using Arabic Phonemes," Journal of Advances in Information Technology, Vol. 16, No. 3, pp. 330-341, 2025. doi: 10.12720/jait.16.3.330-341

Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
