Journal article, Open Access

A Methodology for Comparing the Reliability of GPU-Based and CPU-Based HPCs

Cini, Nevin; Yalcin, Gülay


JSON-LD (schema.org)

{
  "@context": "https://schema.org/", 
  "@id": 273917, 
  "@type": "ScholarlyArticle", 
  "creator": [
    {
      "@id": "https://orcid.org/0000-0001-5348-4043", 
      "@type": "Person", 
      "name": "Cini, Nevin"
    }, 
    {
      "@type": "Person", 
      "name": "Yalcin, G\u00fclay"
    }
  ], 
  "datePublished": "2020-02-06", 
  "description": "Today, GPUs are widely used as coprocessors/accelerators in High-Performance Heterogeneous Computing due to their many advantages. However, many studies emphasize that GPUs are not yet as reliable as desired. Although GPUs are more vulnerable to hardware errors than CPUs, the use of GPUs in HPCs continues to grow. Moreover, due to the native reliability problems of GPUs, combining a great number of GPUs with CPUs can significantly increase HPCs' failure rates. For this reason, analyzing the reliability characteristics of GPU-based HPCs has become a very important issue. Therefore, in this study we evaluate the reliability of GPU-based HPCs. For this purpose, we first examined field data analysis studies for GPU-based and CPU-based HPCs and identified factors that could increase system failure/error rates. We then compared GPU-based HPCs with CPU-based HPCs in terms of reliability with the help of these factors in order to point out the reliability challenges of GPU-based HPCs. Our primary goal is to present a study that can guide researchers in this field by indicating the current state of GPU-based heterogeneous HPCs and the requirements for the future, in terms of reliability. Our second goal is to offer a methodology to compare the reliability of GPU-based HPCs and CPU-based HPCs. To the best of our knowledge, this is the first survey study to compare the reliability of GPU-based and CPU-based HPCs in a systematic manner.",
  "headline": "A Methodology for Comparing the Reliability of GPU-Based and CPU-Based HPCs", 
  "identifier": 273917, 
  "image": "https://aperta.ulakbim.gov.tr/static/img/logo/aperta_logo_with_icon.svg", 
  "inLanguage": {
    "@type": "Language", 
    "alternateName": "eng", 
    "name": "English"
  }, 
  "keywords": [
    "Computer systems organization", 
    "Reliability", 
    "System failure", 
    "log file analysis", 
    "checkpoint/recovery", 
    "Graphics Processing Unit", 
    "Y\u00fcksek ba\u015far\u0131ml\u0131 hesaplama", 
    "High Performance Computing", 
    "Dependable and fault-tolerant systems and networks", 
    "Hardware", 
    "Hardware test", 
    "Robustness", 
    "Cross-computing tools and techniques", 
    "Software and its engineering", 
    "Software organization and properties", 
    "Extra-functional properties", 
    "failure prediction"
  ], 
  "license": "https://creativecommons.org/licenses/by-nc/4.0/", 
  "name": "A Methodology for Comparing the Reliability of GPU-Based and CPU-Based HPCs", 
  "url": "https://aperta.ulakbim.gov.tr/record/273917"
}
Views 249
Downloads 93
Data volume 48.0 MB
Unique views 220
Unique downloads 88
