JAIT 2024 Vol.15(11): 1264-1272
doi: 10.12720/jait.15.11.1264-1272

Advanced Hybrid and Preprocessing Models for Diagnosis Challenges in Data Classification

Mustafa Adil Fayez * and Sefer Kurnaz
Computer Engineering Department, Science Institute, Altinbas University, Istanbul, Türkiye
Email: mustafaadil302@gmail.com (M.A.F.); sefer.kurnaz@altinbas.edu.tr (S.K.)
*Corresponding author

Manuscript received February 20, 2024; revised May 9, 2024; accepted August 5, 2024; published November 17, 2024.

Abstract—Machine Learning (ML) is often viewed as a cutting-edge technology best suited to qualified specialists, which limits its accessibility to physicians and scientists in the medical profession. In this work, we present a new approach for medical applications, with a focus on cardiac diagnostics. We propose an advanced hybrid optimization model with two essential parts. First, we apply a hybrid resampling technique for feature engineering and pre-processing; this technique combines Synthetic Minority Oversampling Technique Edited Nearest Neighbors (SMOTEENN) with Neighborhood Cleaning Rules (NCL) to address class imbalance in the data. Second, we develop a hybrid optimization model that incorporates hyper-parameter optimization, advanced Application Programming Interface (API) functions, and a super-learner ensemble model to improve diagnostic accuracy on imbalanced datasets. In addition, we develop prediction models based on Support Vector Machines (SVMs). On re-sampled Cardiovascular Disease (CVD) data, the advanced hybrid optimization model attained an accuracy of 98%. By comparison, an advanced SVM model obtained 96% accuracy and an advanced deep learning model obtained 95.5%. The proposed hybrid optimization models may significantly improve physicians’ interpretation of ML results. This strategy could make it easier to apply AI methods at scale in the clinic, which could eventually improve patient outcomes and diagnostic accuracy.
 
Keywords—Cardiovascular Disease (CVD), Neighborhood Cleaning Rules (NCL), Synthetic Minority Oversampling Technique Edited Nearest Neighbors (SMOTEENN), hybrid advanced models, optimization, Application Programming Interface (API) function
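
The resampling idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the SMOTE-then-ENN part of the pipeline (the NCL step is omitted), uses a hypothetical toy dataset, and all function names are assumptions made for illustration. It also assumes the common majority-vote form of ENN with an odd k. In practice, the imbalanced-learn library provides `SMOTEENN` and `NeighbourhoodCleaningRule` classes for this purpose.

```python
import math
import random


def k_nearest(idx, X, k):
    """Indices of the k points in X closest to X[idx], excluding idx itself."""
    others = [i for i in range(len(X)) if i != idx]
    others.sort(key=lambda i: math.dist(X[idx], X[i]))
    return others[:k]


def smote(X_min, n_new, k, rng):
    """Create n_new synthetic minority points by interpolating between a random
    minority point and one of its k nearest minority neighbours."""
    synthetic = []
    for _ in range(max(0, n_new)):
        i = rng.randrange(len(X_min))
        j = rng.choice(k_nearest(i, X_min, k))
        gap = rng.random()  # position along the segment from point i to point j
        synthetic.append([a + gap * (b - a) for a, b in zip(X_min[i], X_min[j])])
    return synthetic


def edited_nearest_neighbours(X, y, k):
    """Drop every sample whose label disagrees with the majority vote of its
    k nearest neighbours (assumed ENN variant; k should be odd)."""
    keep = []
    for i in range(len(X)):
        votes = [y[j] for j in k_nearest(i, X, k)]
        if max(set(votes), key=votes.count) == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority=1, k=3, seed=0):
    """Oversample the minority class to the majority size, then clean with ENN."""
    rng = random.Random(seed)
    X_min = [x for x, label in zip(X, y) if label == minority]
    n_new = (len(X) - len(X_min)) - len(X_min)
    X_syn = smote(X_min, n_new, k, rng)
    return edited_nearest_neighbours(X + X_syn, y + [minority] * len(X_syn), k)


# Hypothetical toy data: a 6x6 grid of majority points (label 0) ...
X = [[0.5 * i, 0.5 * j] for i in range(6) for j in range(6)]
y = [0] * 36
# ... a small, distant minority cluster (label 1) ...
X += [[6.0, 6.0], [6.5, 6.0], [6.0, 6.5], [6.5, 6.5], [5.5, 6.0], [6.0, 5.5]]
y += [1] * 6
# ... and one noisy majority-labelled point inside the minority cluster.
X.append([6.2, 6.1])
y.append(0)

X_res, y_res = smote_enn(X, y)
print("before:", y.count(0), "vs", y.count(1))          # before: 37 vs 6
print("after: ", y_res.count(0), "vs", y_res.count(1))  # after:  36 vs 37
```

In this toy case the oversampling step raises the minority class from 6 to 37 samples, and the cleaning step removes only the noisy majority-labelled point that sits inside the minority cluster.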

Cite: Mustafa Adil Fayez and Sefer Kurnaz, "Advanced Hybrid and Preprocessing Models for Diagnosis Challenges in Data Classification," Journal of Advances in Information Technology, Vol. 15, No. 11, pp. 1264-1272, 2024.

Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.