Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/37822
Title: Structured Dynamic Precision for Deep Neural Networks Quantization
Authors: Huang, Kai
Li, Bowen
Xiong, Dongliang
Jiang, Haitian
Jiang, Xiaowen
Yan, Xiaolang
CLAESEN, Luc 
Liu, Dehong
Chen, Junjian
Liu, Zhili
Issue Date: 2022
Publisher: ACM
Source: ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS,
Abstract: Deep Neural Networks (DNNs) have achieved remarkable success in various Artificial Intelligence (AI) applications. Quantization is a critical step in DNN compression and acceleration for deployment. To further boost DNN execution efficiency, many works exploit input-dependent redundancy by applying dynamic quantization to different regions. However, the sensitive regions in the feature map are irregularly distributed, which restricts the real speedup on existing accelerators. To this end, we propose an algorithm-architecture co-design, named Structured Dynamic Precision (SDP). Specifically, we propose a quantization scheme in which the high-order bit part and the low-order bit part of the data can be masked independently. A fixed number of term parts is dynamically selected for computation based on the importance of each term in the group. We also present a hardware design that supports the algorithm efficiently with small overheads, and whose inference time scales mainly in proportion to the precision. Experiments on a wide range of networks demonstrate that, compared to the state-of-the-art dynamic quantization accelerator DRQ, our SDP can achieve a 29% performance gain and a 51% energy reduction at the same level of model accuracy.
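Illustrative note (not part of the published record): the selection step described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The 8-bit unsigned values, the 4/4 split between high-order and low-order parts, the group size, the use of term magnitude as the importance measure, and the function names `split_terms` and `select_terms` are all hypothetical choices made here for illustration.

```python
def split_terms(value, low_bits=4):
    """Split an unsigned integer into (high_part, low_part) contributions.

    Assumption: a 4-bit low-order part; the paper's actual split may differ.
    """
    low = value & ((1 << low_bits) - 1)
    high = value - low  # high-order bits, kept at their original weight
    return high, low


def select_terms(group, keep, low_bits=4):
    """Keep the `keep` most important terms in a group; mask (zero) the rest."""
    terms = []  # (magnitude, index of the value in the group)
    for i, v in enumerate(group):
        high, low = split_terms(v, low_bits)
        terms.append((high, i))
        terms.append((low, i))
    # Hypothetical importance measure: the magnitude of each term.
    terms.sort(key=lambda t: t[0], reverse=True)
    out = [0] * len(group)
    for magnitude, i in terms[:keep]:
        out[i] += magnitude  # high and low parts survive independently
    return out


group = [200, 3, 17, 130]
approx = select_terms(group, keep=4)
print(approx)
```

Because the number of kept terms per group is fixed, the work per group is constant regardless of where the large values fall, which is the structural property the abstract contrasts with irregularly distributed sensitive regions.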
Keywords: Neural Networks; compression and acceleration; systolic array; algorithm-architecture co-design
Document URI: http://hdl.handle.net/1942/37822
ISSN: 1084-4309
e-ISSN: 1557-7309
DOI: 10.1145/3549535
ISI #: 000917034400012
Rights: © 2022 Association for Computing Machinery
Category: A1
Type: Journal Contribution
Appears in Collections: Research publications

Files in This Item:
3549535.pdf (Published version, 1.46 MB, Adobe PDF, restricted access)

Web of Science citations: 1 (checked on Apr 24, 2024)
