From Data to Prediction: Comparative Analysis of Machine Learning Classifiers for Type 2 Diabetes

Samet, Sarra; Samet, Ahmed

National Conference on Artificial Intelligence and Its Applications (NCAIA 2024)

URI: http://depot.umc.edu.dz/handle/123456789/14651

Date: 2024-10-25

Abstract:

Timely identification and diagnosis of medical conditions hold paramount importance in averting severe health complications and optimizing healthcare efficacy. Machine Learning, an offshoot of Artificial Intelligence, possesses considerable potential in anticipatory analysis through the integration of Data Mining. The objective of our investigation is to establish a streamlined mechanism for the prompt and precise identification of Type 2 diabetes by utilizing the widely recognized Pima dataset, which encompasses eight clinical parameters. To ensure equitable consideration of all features, we employ the "Standard scaler" technique for feature scaling. Our primary focus lies in enhancing the accuracy of diabetes prognosis by employing supervised machine learning methods, namely Decision Tree, Random Forest, Gradient Boosting algorithms, and Support Vector Machine. Performance evaluation encompasses various metrics such as F1-score, MCC, and other relevant indicators. Notably, Random Forest emerges as the most accurate model, attaining an impressive accuracy rate of 95.24%. Moreover, to mitigate overfitting, we conduct a 5-fold cross-validation, which further affirms an accuracy rate of 92.55%. It is worth highlighting that our proposed models exhibit superior accuracy in predicting diabetes mellitus when compared to previous endeavors employing the Pima dataset.
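
The abstract names standard scaling and the F1-score and MCC metrics without defining them. The sketch below shows what these quantities compute, using only the Python standard library. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the feature values and labels are toy data invented for the example, not drawn from the Pima dataset or from the paper's results.

```python
import math
from statistics import fmean, pstdev


def standardize(column):
    """Z-score one feature column: subtract the mean, divide by the
    population standard deviation (the usual 'standard scaler' rule)."""
    mu = fmean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = diabetic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy values (hypothetical, for illustration only).
scaled = standardize([2.0, 4.0, 6.0])
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)

print("scaled feature:", scaled)
print("accuracy:", accuracy(tp, tn, fp, fn))
print("F1:", f1_score(tp, fp, fn))
print("MCC:", mcc(tp, tn, fp, fn))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy and F1 when the two classes are unequal in size.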
