Accelerating deterministic and stochastic binarized neural networks on FPGAs using OpenCL

Lammie, Corey, Xiang, Wei, and Azghadi, Mostafa Rahimi (2019) Accelerating deterministic and stochastic binarized neural networks on FPGAs using OpenCL. In: Proceedings of the International Midwest Symposium on Circuits and Systems. pp. 626-629. From: MWSCAS 2019: IEEE 62nd International Midwest Symposium on Circuits and Systems, 4-7 August 2019, Dallas, TX, USA.

Recent technological advances have proliferated the available computing power, memory, and speed of modern Central Processing Units (CPUs), Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Consequently, the performance and complexity of Artificial Neural Networks (ANNs) are burgeoning. While GPU-accelerated Deep Neural Networks (DNNs) currently offer state-of-the-art performance, they consume large amounts of power. Training such networks on CPUs is inefficient, as data throughput and parallel computation are limited. FPGAs are considered a suitable candidate for performance-critical, low-power systems, e.g., Internet of Things (IoT) edge devices. Using the Xilinx SDAccel or Intel FPGA SDK for OpenCL development environment, networks described using the high-level OpenCL framework can be accelerated on heterogeneous platforms. Moreover, the resource utilization and power consumption of DNNs can be further improved by utilizing regularization techniques that binarize network weights. In this paper, we introduce, to the best of our knowledge, the first FPGA-accelerated stochastically binarized DNN implementations, and compare them to implementations accelerated on both GPUs and FPGAs. All our developed networks are trained and benchmarked using the popular MNIST and CIFAR-10 datasets. For our binarized and conventional FPGA-based networks, we achieve a >16-fold improvement in power consumption, compared to their GPU-accelerated counterparts. Also, our binarized FPGA-based networks require >25% shorter inference times, compared to their GPU-based counterparts.
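The abstract contrasts deterministic and stochastic weight binarization. As a point of reference, the two schemes are commonly formulated as in BinaryConnect: deterministic binarization takes the sign of each weight, while stochastic binarization sets a weight to +1 with probability given by a hard sigmoid of its value. The sketch below illustrates these two formulations in NumPy; the function names are hypothetical and this is not the paper's own implementation.

```python
import numpy as np

def binarize_deterministic(w):
    # Deterministic binarization: the sign of each real-valued weight,
    # mapping w >= 0 to +1 and w < 0 to -1.
    return np.where(w >= 0, 1.0, -1.0)

def binarize_stochastic(w, rng=None):
    # Stochastic binarization: each weight becomes +1 with probability
    # p = hard_sigmoid(w) = clip((w + 1) / 2, 0, 1), and -1 otherwise.
    rng = np.random.default_rng() if rng is None else rng
    p = np.clip((w + 1.0) / 2.0, 0.0, 1.0)  # hard sigmoid
    return np.where(rng.random(w.shape) < p, 1.0, -1.0)
```

In both cases the binarized weights take only the values +1 and -1, which is what allows FPGA implementations to replace multiplications with cheap sign flips and accumulations.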

Item ID: 61750
Item Type: Conference Item (Research - E1)
ISBN: 978-1-7281-2787-3
ISSN: 1548-3746
Copyright Information: © 2019 IEEE.
Date Deposited: 03 Feb 2020 01:24
FoR Codes: 40 ENGINEERING > 4009 Electronics, sensors and digital hardware > 400902 Digital electronic devices @ 25%
46 INFORMATION AND COMPUTING SCIENCES > 4606 Distributed computing and systems software > 460606 Energy-efficient computing @ 25%
40 ENGINEERING > 4008 Electrical engineering > 400801 Circuits and systems @ 50%
SEO Codes: 97 EXPANDING KNOWLEDGE > 970109 Expanding Knowledge in Engineering @ 50%
97 EXPANDING KNOWLEDGE > 970110 Expanding Knowledge in Technology @ 50%