Journal article Open Access

A Methodology for Comparing the Reliability of GPU-Based and CPU-Based HPCs

Cini, Nevin; Yalcin, Gülay


MARC21 XML

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="n">No. 1</subfield>
    <subfield code="c">Article 22</subfield>
    <subfield code="v">Vol. 53</subfield>
    <subfield code="p">ACM Computing Surveys</subfield>
  </datafield>
  <controlfield tag="005">20240912103243.0</controlfield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:aperta.ulakbim.gov.tr:273917</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="0">(orcid)0000-0001-5348-4043</subfield>
    <subfield code="a">Cini, Nevin</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Today, GPUs are widely used as coprocessors/accelerators in High-Performance Heterogeneous Computing&lt;br&gt;
due to their many advantages. However, many researches emphasize that GPUs are not as reliable as desired&lt;br&gt;
yet. Despite the fact that GPUs are more vulnerable to hardware errors than CPUs, the use of GPUs in HPCs&lt;br&gt;
is increasing more and more. Moreover, due to native reliability problems of GPUs, combining a great number&lt;br&gt;
of GPUs with CPUs can significantly increase HPCs&amp;rsquo; failure rates. For this reason, analyzing the reliability&lt;br&gt;
characteristics of GPU-based HPCs has become a very important issue. Therefore, in this study we evaluate&lt;br&gt;
the reliability of GPU-based HPCs. For this purpose, we first examined field data analysis studies for GPU-&lt;br&gt;
based and CPU-based HPCs and identified factors that could increase systems failure/error rates. We then&lt;br&gt;
compared GPU-based HPCs with CPU-based HPCs in terms of reliability with the help of these factors in&lt;br&gt;
order to point out reliability challenges of GPU-based HPCs. Our primary goal is to present a study that can&lt;br&gt;
guide the researchers in this field by indicating the current state of GPU-based heterogeneous HPCs and&lt;br&gt;
requirements for the future, in terms of reliability. Our second goal is to offer a methodology to compare the&lt;br&gt;
reliability of GPU-based HPCs and CPU-based HPCs. To the best of our knowledge, this is the first survey&lt;br&gt;
study to compare the reliability of GPU-based and CPU-based HPCs in a systematic manner.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="2">opendefinition.org</subfield>
    <subfield code="a">cc-by</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by-nc/4.0/</subfield>
    <subfield code="a">Creative Commons Attribution-NonCommercial</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Computer systems organization</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Reliability</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">System failure</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">log file analysis</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">checkpoint/recovery</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Graphics Processing Unit</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Yüksek başarımlı hesaplama</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">High Performance Computing</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Dependable and fault-tolerant systems and networks</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Hardware</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Hardware test</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Robustness</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Computer systems organization</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Cross-computing tools and techniques</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Software and its engineering</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Software organization and properties</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Extra-functional properties</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">failure prediction</subfield>
  </datafield>
  <controlfield tag="001">273917</controlfield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">A Methodology for Comparing the Reliability of GPU-Based and CPU-Based HPCs</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-02-06</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Yalcin, Gülay</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="u">https://aperta.ulakbim.gov.tr/record/273917/files/Makale1.pdf</subfield>
    <subfield code="s">516140</subfield>
    <subfield code="z">md5:ba5273044fb4ade31a70b24dda0199ee</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1145/3372790</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
</record>
Views 249
Downloads 93
Data volume 48.0 MB
Unique views 220
Unique downloads 88
