Optimal Using of Machine Learning Algorithms Hyperparameters for Diabetes Prediction

Ismail I. Al-Ahmeda, Yousif A. Al-Hajb, Marwan M. Al-Falahc, Khadeja M. Al-Nashadc, Naif M. Al-Falahd

Abstract: Diabetes Mellitus (DM) is a growing health concern worldwide because of its high prevalence and chronic nature. Early diagnosis is crucial for effective management and better patient outcomes. DM has two forms: type 1, which presents clear symptoms, and type 2, which is often asymptomatic in its early stages, making early detection challenging. To address this challenge, we applied machine learning algorithms, including Support Vector Machine (SVM), Random Forest (RF), and Logistic Regression (LR), to the Pima Indian Diabetes Dataset (PIDD). The algorithms were evaluated with stratified K-fold cross-validation (K = 10), and hyperparameter tuning was performed to optimize the models. Logistic regression achieved the highest accuracy, 79%. The study highlights the potential of machine learning for the early detection and diagnosis of DM, especially where traditional methods are limited. The results also show the significant impact of hyperparameter tuning and feature engineering on the accuracy of diabetes prediction models: the initial algorithms performed less effectively before these techniques were applied. These findings have important implications for developing more accurate and reliable prediction models that can help medical professionals provide timely diagnosis and effective treatment to patients with diabetes.

Keywords: Machine learning algorithms, Diabetes Mellitus (DM), Pima Indian Diabetes Dataset, Hyperparameter tuning.
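The evaluation protocol named in the abstract (stratified 10-fold cross-validation combined with hyperparameter tuning for LR, SVM, and RF) could look roughly like the following scikit-learn sketch. It is an illustration under stated assumptions, not the authors' code: the data is a synthetic stand-in with the same shape as PIDD (768 rows, 8 features), and the search grids, the scaling step, and the random seeds are hypothetical because the abstract does not specify them. The scores it prints therefore say nothing about the 79% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in shaped like PIDD (768 samples, 8 features).
# Swap in the real dataset to study the actual problem.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           weights=[0.65, 0.35], random_state=0)

# Stratified 10-fold CV keeps the class ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# ASSUMPTION: hypothetical grids; the study's actual grids are not given.
candidates = {
    "LR": (LogisticRegression(max_iter=1000),
           {"clf__C": [0.01, 0.1, 1, 10]}),
    "SVM": (SVC(),
            {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
    "RF": (RandomForestClassifier(random_state=0),
           {"clf__n_estimators": [50, 100], "clf__max_depth": [3, None]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # The scaler sits inside the pipeline so it is refit on each training
    # fold, which avoids leaking validation-fold statistics.
    pipe = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

for name, (score, params) in results.items():
    print(f"{name}: mean CV accuracy = {score:.3f}, best params = {params}")
```

Because the best configuration is selected on the same folds that produce the score, `best_score_` tends to be slightly optimistic. A held-out test set or nested cross-validation gives a less biased estimate.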


DOI: https://aif-doi.org/