Low-power and high-speed deep FPGA inference engines for weed classification at the edge
Lammie, Corey, Olsen, Alex, Carrick, Tony, and Rahimi Azghadi, Mostafa (2019) Low-power and high-speed deep FPGA inference engines for weed classification at the edge. IEEE Access, 7. pp. 51171-51184.
Abstract
Deep neural networks (DNNs) have recently achieved remarkable performance in a myriad of applications, ranging from image recognition to language processing. Training such networks on graphics processing units (GPUs) currently offers unmatched levels of performance; however, GPUs have high power requirements. With recent advancements in high-level synthesis (HLS) techniques, new methods for accelerating deep networks using field programmable gate arrays (FPGAs) are emerging. FPGA-based DNNs present substantial advantages in energy efficiency over conventional CPU- and GPU-accelerated networks. Using the Intel FPGA software development kit (SDK) for OpenCL development environment, networks described using the high-level OpenCL framework can be accelerated targeting heterogeneous platforms including CPUs, GPUs, and FPGAs. These networks, if properly customized on GPUs and FPGAs, can be ideal candidates for learning and inference in resource-constrained portable devices such as robots and Internet of Things (IoT) edge devices, where power is limited and performance is critical. Here, we introduce GPU- and FPGA-accelerated deterministically binarized DNNs, tailored toward weed species classification for robotic weed control. Our developed networks are trained and benchmarked using a publicly available weed species dataset, named DeepWeeds, which includes close to 18 000 weed images. We demonstrate that our FPGA-accelerated binarized networks significantly outperform their GPU-accelerated counterparts, achieving a more than 7-fold decrease in power consumption, while performing inference on weed images 2.86 times faster than our best performing baseline full-precision GPU implementation. These significant benefits are gained whilst losing only 1.17% of validation accuracy. This work is a significant step toward enabling deep inference and learning on IoT edge devices and smart portable machines such as agricultural robots, which are the target application of this paper.
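As a rough illustration of what "deterministically binarized" can mean, the sketch below maps real values to +1 or -1 by sign and then computes a dot product on the packed bits with XNOR and a population count. This is a minimal sketch under stated assumptions, not the authors' implementation: the sign rule (zero mapped to +1), the XNOR/popcount formulation, and the helper names `binarize`, `pack_bits`, and `xnor_popcount_dot` are all illustrative and do not come from the abstract.

```python
def binarize(values):
    """Deterministic binarization (assumed rule): +1 if v >= 0, else -1."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an int, one bit per element (bit set for +1)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks positions where the signs agree; each agreement contributes
    +1 and each disagreement -1, so the dot product is 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask
    matches = bin(agree).count("1")
    return 2 * matches - n


# Hypothetical example values, chosen only for illustration.
weights = [0.42, -1.3, 0.0, -0.07, 2.5]
activations = [-0.9, -0.2, 0.6, 0.3, 1.1]

w_b = binarize(weights)
a_b = binarize(activations)

# Reference result using ordinary multiply-accumulate on the +1/-1 values.
reference = sum(w * a for w, a in zip(w_b, a_b))

# Same result using only bitwise operations on the packed words.
fast = xnor_popcount_dot(pack_bits(w_b), pack_bits(a_b), len(w_b))

print(w_b, a_b, reference, fast)
```

Replacing multiply-accumulate with bitwise operations of this kind is one commonly cited reason binarized networks map well onto FPGA logic; whether the paper's kernels use this exact formulation is not stated in the abstract.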
Item ID: | 57387 |
---|---|
Item Type: | Article (Research - C1) |
ISSN: | 2169-3536 |
Copyright Information: | © IEEE |
Funders: | Nvidia |
Date Deposited: | 15 Mar 2020 22:39 |
FoR Codes: | 30 AGRICULTURAL, VETERINARY AND FOOD SCIENCES > 3002 Agriculture, land and farm management > 300207 Agricultural systems analysis and modelling @ 30%; 40 ENGINEERING > 4008 Electrical engineering > 400801 Circuits and systems @ 30%; 40 ENGINEERING > 4007 Control engineering, mechatronics and robotics > 400706 Field robotics @ 40% |
SEO Codes: | 86 MANUFACTURING > 8614 Machinery and Equipment > 861401 Agricultural Machinery and Equipment @ 100% |