DIAGNOSIS OF DIABETES DISEASE USING MACHINE LEARNING METHODS IN AN IMBALANCED DIABETES DATASET

İsmail Buğra Bölükbaşı; Betül Yağmahan

doi:10.48623/aperta.286136

22 October 2022, Conference paper, Open Access

DataCite XML

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.48623/aperta.286136</identifier>
  <creators>
    <creator>
      <creatorName>İsmail Buğra Bölükbaşı</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-9405-0900</nameIdentifier>
      <affiliation>Yalova Üniversitesi</affiliation>
    </creator>
    <creator>
      <creatorName>Betül Yağmahan</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0003-1744-3062</nameIdentifier>
      <affiliation>Bursa Uludağ Üniversitesi</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Diagnosis Of Diabetes Disease Using Machine Learning Methods In An Imbalanced Diabetes Dataset</title>
  </titles>
  <publisher>Aperta</publisher>
  <publicationYear>2022</publicationYear>
  <subjects>
    <subject>Diabetes Diagnosis</subject>
    <subject>Type-2 Diabetes</subject>
    <subject>Machine Learning</subject>
    <subject>Classification</subject>
    <subject>Imbalanced Dataset</subject>
    <subject>Resampling Methods</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2022-10-22</date>
  </dates>
  <resourceType resourceTypeGeneral="Text">Conference paper</resourceType>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://aperta.ulakbim.gov.tr/record/286136</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.48623/aperta.286135</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="http://www.opendefinition.org/licenses/cc-by-sa">Creative Commons Attribution Share-Alike</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">In recent years, the number of people with diabetes has been rising steadily. Diabetes is a serious disease that can cause severe damage to the body, and even death, if precautions are not taken. Early and accurate detection of diabetes is therefore gaining importance in the medical world, and the number of studies that use machine learning methods to diagnose diabetes has increased significantly in the literature.
In this study, type-2 diabetes was classified using different data preprocessing and machine learning methods on real-world data taken from a public hospital in Turkey. Logistic regression, Naive Bayes, C4.5, and Random Forest classification models were used. The patient's age, gender, complete blood count, biochemistry, and hormone test results were used as input variables, and the diagnosis made by specialist doctors was used as the output variable. In total, 43 variables were studied. Examination of the dataset revealed an imbalance between the classes of the target variable. When there is a class imbalance, classification models can assign samples to the wrong class. To address the class imbalance in the dataset, three resampling methods were used: random undersampling (RUS), random oversampling (ROS), and synthetic minority oversampling (SMOTE).
The performance of the four machine learning methods was compared on the original training dataset and on the randomly undersampled, randomly oversampled, and synthetic minority oversampled training datasets, giving 16 scenarios in total.
Analysis of all scenarios identified the four best-performing combinations: Naive Bayes with the original training dataset, Random Forest with the randomly undersampled training dataset, Random Forest with the synthetic minority oversampled training dataset, and C4.5 with the randomly oversampled training dataset. Among these four, Random Forest with the randomly undersampled training dataset ranked first.</description>
  </descriptions>
</resource>
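
The abstract in the record above names random undersampling (RUS) and random oversampling (ROS) as two of the three resampling methods, and counts 16 scenarios from four classifiers and four training-set variants. The following is a minimal sketch of those two ideas using only the Python standard library. It is not the authors' implementation: the function names `random_undersample` and `random_oversample`, the toy data, and the fixed seed are all assumptions made for illustration. SMOTE is left out because it needs nearest-neighbour interpolation, which does not fit in a short sketch.

```python
import random
from collections import Counter
from itertools import product


def _group_by_class(rows, labels):
    """Collect the rows belonging to each class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_undersample(rows, labels, seed=0):
    """Hypothetical helper: keep a random subset of each class so that
    every class ends up with the minority-class count."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):  # sampling without replacement
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_oversample(rows, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen rows of each class
    until every class reaches the majority-class count."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy training data (made-up values): 8 negative and 2 positive records.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

_, rus_labels = random_undersample(rows, labels)
_, ros_labels = random_oversample(rows, labels)
print("RUS class counts:", Counter(rus_labels))
print("ROS class counts:", Counter(ros_labels))

# 4 classifiers x 4 training-set variants = 16 scenarios, as in the abstract.
classifiers = ["Logistic regression", "Naive Bayes", "C4.5", "Random Forest"]
training_sets = ["original", "RUS", "ROS", "SMOTE"]
scenarios = list(product(classifiers, training_sets))
print("Number of scenarios:", len(scenarios))
```

Resampling in this sketch is applied to the training data only, which matches the abstract's wording ("training dataset"); the test data would keep its original class distribution.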


Record Details

Publication date: 22/10/2022
ISBN: 987-625-8246-29-2
Subject areas: Health Sciences > Medicine > Internal Medical Sciences > Internal Medicine > Endocrinology and Metabolic Diseases
Keywords: Diabetes Diagnosis; Type-2 Diabetes; Machine Learning; Classification; Imbalanced Dataset; Resampling Methods
Published in: ABSTRACT BOOK, IKSAD Publishing, Adana, pp. 330-331 (987-625-8246-29-2).
Conference information: CUKUROVA 9th INTERNATIONAL SCIENTIFIC RESEARCHES CONFERENCE, Adana, October 09-11, 2022
License: Creative Commons Attribution Share-Alike

Versions

Version 1

22/10/2022

Citing a specific version

The DOI shown represents all versions but resolves to the latest one. To cite a specific version, state the version number.
