DIAGNOSIS OF DIABETES DISEASE USING MACHINE LEARNING METHODS IN AN IMBALANCED DIABETES DATASET

İsmail Buğra Bölükbaşı; Betül Yağmahan

doi:10.48623/aperta.286136

22 October 2022, Conference paper, Open Access

JSON-LD (schema.org)

{
  "@context": "https://schema.org/", 
  "@id": 286136, 
  "@type": "ScholarlyArticle", 
  "creator": [
    {
      "@id": "https://orcid.org/0000-0002-9405-0900", 
      "@type": "Person", 
      "affiliation": "Yalova \u00dcniversitesi", 
      "name": "\u0130smail Bu\u011fra B\u00f6l\u00fckba\u015f\u0131"
    }, 
    {
      "@id": "https://orcid.org/0000-0003-1744-3062", 
      "@type": "Person", 
      "affiliation": "Bursa Uluda\u011f \u00dcniversitesi", 
      "name": "Bet\u00fcl Ya\u011fmahan"
    }
  ], 
  "datePublished": "2022-10-22", 
  "description": "<p>In recent years, the number of people with diabetes has risen steadily. Diabetes is a serious disease that can cause severe damage to the body, and even death, if precautions are not taken. Early and accurate detection of diabetes is therefore gaining importance in the medical world, and the number of studies in the literature that use machine learning methods to diagnose diabetes has increased significantly.</p><p>In this study, type-2 diabetes was classified using different data preprocessing and machine learning methods on real-world data taken from a public hospital in Turkey. Logistic regression, Naive Bayes, C4.5, and Random Forest classification models were used. The patient&#39;s age, gender, complete blood count, biochemistry, and hormone test results were used as input variables, and the diagnosis made by specialist doctors was used as the output variable. In total, 43 different variables were studied. Examination of the dataset revealed an imbalance between the classes of the target variable. When classes are imbalanced, classification models can make incorrect class assignments. To eliminate the class imbalance in the dataset, three resampling methods were used: random undersampling (RUS), random oversampling (ROS), and the synthetic minority oversampling technique (SMOTE).</p><p>The performance of the four machine learning methods was compared on the original training dataset, the randomly undersampled training dataset, the randomly oversampled training dataset, and the SMOTE-oversampled training dataset, giving a total of 16 scenarios.</p><p>Analysis of all scenarios identified the four best-performing combinations: Naive Bayes with the original training dataset, Random Forest with the randomly undersampled training dataset, Random Forest with the SMOTE-oversampled training dataset, and C4.5 with the randomly oversampled training dataset. Among these four, Random Forest trained on the randomly undersampled dataset ranked first.</p>", 
  "headline": "DIAGNOSIS OF DIABETES DISEASE USING MACHINE LEARNING METHODS IN AN IMBALANCED DIABETES DATASET", 
  "identifier": 286136, 
  "image": "https://aperta.ulakbim.gov.tr/static/img/logo/aperta_logo_with_icon.svg", 
  "keywords": [
    "Diabetes Diagnosis", 
    "Type-2 Diabetes", 
    "Machine Learning", 
    "Classification", 
    "Imbalanced Dataset", 
    "Resampling Methods"
  ], 
  "license": "http://www.opendefinition.org/licenses/cc-by-sa", 
  "name": "DIAGNOSIS OF DIABETES DISEASE USING MACHINE LEARNING METHODS IN AN IMBALANCED DIABETES DATASET", 
  "url": "https://aperta.ulakbim.gov.tr/record/286136"
}
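
The resampling step summarized in the abstract (the "description" field above) can be illustrated with a minimal Python sketch. The record does not say which software the authors used, so everything here is an assumption made for illustration: the function names (`random_undersample`, `random_oversample`, `smote_oversample`) are hypothetical, the toy data is made up, and the SMOTE variant is a simplified version that handles numeric features only and needs at least two rows in each minority class. The 16 scenarios are reproduced as 4 classifiers times 4 training sets.

```python
import math
import random
from collections import Counter
from itertools import product


def group_by_class(rows, labels):
    """Collect the feature rows of each class label into its own list."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def flatten(groups):
    """Turn {label: rows} back into parallel row and label lists."""
    rows, labels = [], []
    for label, members in groups.items():
        rows.extend(members)
        labels.extend([label] * len(members))
    return rows, labels


def random_undersample(rows, labels, rng):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    groups = group_by_class(rows, labels)
    target = min(len(m) for m in groups.values())
    return flatten({c: rng.sample(m, target) for c, m in groups.items()})


def random_oversample(rows, labels, rng):
    """ROS: add random duplicates to each class until it matches the largest class."""
    groups = group_by_class(rows, labels)
    target = max(len(m) for m in groups.values())
    return flatten({
        c: m + [rng.choice(m) for _ in range(target - len(m))]
        for c, m in groups.items()
    })


def smote_oversample(rows, labels, rng, k=2):
    """Simplified SMOTE: add points interpolated toward a near same-class neighbour."""
    groups = group_by_class(rows, labels)
    target = max(len(m) for m in groups.values())
    balanced = {}
    for c, m in groups.items():
        synthetic = []
        for _ in range(target - len(m)):
            base = rng.choice(m)
            # k nearest same-class neighbours of the base row (Euclidean distance)
            others = [r for r in m if r is not base]
            neighbours = sorted(others, key=lambda r: math.dist(r, base))[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from base to other
            synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
        balanced[c] = m + synthetic
    return flatten(balanced)


# Made-up training data: 8 rows of class 0 (majority), 3 rows of class 1 (minority).
rows = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5), (1.8, 1.5), (2.2, 2.8),
        (1.1, 1.9), (1.7, 2.4), (5.0, 6.0), (5.5, 6.5), (6.0, 5.8)]
labels = [0] * 8 + [1] * 3

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rus_rows, rus_labels = random_undersample(rows, labels, rng)
ros_rows, ros_labels = random_oversample(rows, labels, rng)
smote_rows, smote_labels = smote_oversample(rows, labels, rng)

# 4 classifiers x 4 training sets = the 16 scenarios mentioned in the abstract.
classifiers = ["Logistic Regression", "Naive Bayes", "C4.5", "Random Forest"]
training_sets = ["original", "RUS", "ROS", "SMOTE"]
scenarios = list(product(classifiers, training_sets))

print(Counter(rus_labels), Counter(ros_labels), Counter(smote_labels))
print(len(scenarios), "scenarios")
```

As in the abstract, resampling in this sketch is applied to the training data only; a held-out test set would keep its original class distribution so that the reported performance reflects the real imbalance.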

Record Information

Publication date: 22/10/2022
ISBN: 987-625-8246-29-2
Fields of science: Health Sciences > Medicine > Internal Medical Sciences > Internal Medicine > Endocrinology and Metabolic Diseases
Keywords: Diabetes Diagnosis, Type-2 Diabetes, Machine Learning, Classification, Imbalanced Dataset, Resampling Methods
Published in: ABSTRACT BOOK, IKSAD Publishing, Adana, pp. 330-331 (987-625-8246-29-2).
Conference: CUKUROVA 9th INTERNATIONAL SCIENTIFIC RESEARCHES CONFERENCE, Adana, October 09-11, 2022
License: Creative Commons Attribution Share-Alike

Versions

Version 1

22/10/2022

Do you want to cite a specific version?

The DOI shown represents all versions but resolves to the latest one. To cite a specific version, the version number must therefore be stated.
