Journal article Open Access

Architectural Trade-Off Analysis for Accelerating LSTM Network Using Radix-<i>r</i> OBC Scheme

Khan, Mohd Tasleem; Yantir, Hasan Erdem; Salama, Khaled Nabil; Eltawil, Ahmed M.


Dublin Core

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Khan, Mohd Tasleem</dc:creator>
  <dc:creator>Yantir, Hasan Erdem</dc:creator>
  <dc:creator>Salama, Khaled Nabil</dc:creator>
  <dc:creator>Eltawil, Ahmed M.</dc:creator>
  <dc:date>2023-01-01</dc:date>
  <dc:description>This paper presents an architectural trade-off analysis of two architectures (Type I and Type II) for accelerating a fixed-point long short-term memory (LSTM) network based on circulant matrix-vector multiplications (MVMs) using a radix-r offset binary coding (OBC) scheme. The Type I MVM architecture rotates the weights with the proposed modulo-cum interleaver and uses partial product generators (PPGs) with a single generation unit across a column. It is hardware-optimized, using a single adder tree through time multiplexing. The Type II MVM architecture, by contrast, rotates the inputs with the proposed store-cum interleaver and uses single PPGs with a single generation unit across a row. It is time-optimized by unfolding the shift-accumulate unit into a shift-add tree, followed by pipelining. A new design for element-wise multiplication using a radix-r PPG is also presented. Both designs are extended to their block-circulant variants for certain accuracy requirements. Post-synthesis results for the Type I and II architectures across different model, kernel, and radix sizes and clock frequencies yield several efficient designs. Compared with the prior scheme, the Type I architecture for 128x128 with r = 2 on 28 nm FDSOI technology at 800 MHz occupies 32.27% less area and consumes 67.89% less power at the same throughput, while the Type II architecture provides 40x higher throughput at the expense of area and power.</dc:description>
  <dc:identifier>https://aperta.ulakbim.gov.tr/record/265184</dc:identifier>
  <dc:identifier>oai:aperta.ulakbim.gov.tr:265184</dc:identifier>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>http://www.opendefinition.org/licenses/cc-by</dc:rights>
  <dc:source>IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS 70(1) 14</dc:source>
  <dc:title>Architectural Trade-Off Analysis for Accelerating LSTM Network Using Radix-&lt;i&gt;r&lt;/i&gt; OBC Scheme</dc:title>
  <dc:type>info:eu-repo/semantics/article</dc:type>
  <dc:type>publication-article</dc:type>
</oai_dc:dc>
Views 35
Downloads 4
Data volume 916 Bytes
Unique views 29
Unique downloads 4
