Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/37822
Title: | Structured Dynamic Precision for Deep Neural Networks Quantization |
Authors: | Huang, Kai; Li, Bowen; Xiong, Dongliang; Jiang, Haitian; Jiang, Xiaowen; Yan, Xiaolang; CLAESEN, Luc; Liu, Dehong; Chen, Junjian; Liu, Zhili |
Issue Date: | 2022 |
Publisher: | ACM |
Source: | ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, |
Abstract: | Deep Neural Networks (DNNs) have achieved remarkable success in various Artificial Intelligence (AI) applications. Quantization is a critical step in DNNs compression and acceleration for deployment. To further boost DNN execution efficiency, many works explore to leverage the input-dependent redundancy with dynamic quantization for different regions. However, the sensitive regions in the feature map are irregularly distributed, which restricts the real speed up for existing accelerators. To this end, we propose an algorithm-architecture co-design, named Structured Dynamic Precision (SDP). In specific, we propose a quantization scheme in which the high-order bit part and the low-order bit part of data can be masked independently. And a fixed number of term parts are dynamically selected for computation based on the importance of each term in the group. We also present a hardware design to enable the algorithm efficiently with small overheads, whose inference time mainly scales with the precision proportionally. Evaluation experiments on extensive networks demonstrate that compared to the state-of-the-art dynamic quantization accelerator DRQ, our SDP can achieve 29% performance gain and 51% energy reduction for the same level of model accuracy. |
Keywords: | Neural Networks; compression and acceleration; systolic array; algorithm-architecture co-design |
Document URI: | http://hdl.handle.net/1942/37822 |
ISSN: | 1084-4309 |
e-ISSN: | 1557-7309 |
DOI: | 10.1145/3549535 |
ISI #: | 000917034400012 |
Rights: | © 2022 Association for Computing Machinery |
Category: | A1 |
Type: | Journal Contribution |
Appears in Collections: | Research publications |
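The abstract describes masking the high-order and low-order bit parts of each value independently and keeping a fixed number of term parts per group. The sketch below is only an illustration of that idea, not the authors' implementation. The 4-bit split of an unsigned 8-bit value, the group layout, the magnitude-based importance measure, and the names `split_terms`, `select_terms`, and `masked_dot` are all assumptions, since the record does not specify them.

```python
from typing import List, Tuple

HALF_BITS = 4  # assumed width of each part; the abstract gives no bit widths


def split_terms(x: int) -> Tuple[int, int]:
    """Split an unsigned 8-bit value into its (high, low) 4-bit parts."""
    if not 0 <= x < 256:
        raise ValueError("expected an unsigned 8-bit value")
    return x >> HALF_BITS, x & ((1 << HALF_BITS) - 1)


def select_terms(group: List[int], keep: int) -> List[int]:
    """Mask every term part in the group except the `keep` most important ones.

    Importance is taken to be the magnitude a part contributes to its value
    (high << 4, or low). This is an assumed stand-in for the paper's criterion.
    """
    terms = []  # (contribution, index of the value within the group)
    for i, x in enumerate(group):
        hi, lo = split_terms(x)
        terms.append((hi << HALF_BITS, i))
        terms.append((lo, i))
    # sort() is stable, so ties keep their original order and the result is deterministic.
    terms.sort(key=lambda t: t[0], reverse=True)
    out = [0] * len(group)
    for contribution, i in terms[:keep]:
        out[i] += contribution
    return out


def masked_dot(activations: List[int], weights: List[int], keep: int) -> int:
    """Dot product that uses only the selected term parts of the activations."""
    approx = select_terms(activations, keep)
    return sum(a * w for a, w in zip(approx, weights))


group = [200, 3, 17, 130]
print(select_terms(group, keep=4))              # [200, 0, 16, 128]
print(masked_dot(group, [1, 2, 3, 4], keep=4))  # 760 (the unmasked result is 777)
```

Because `keep` is the same for every group, each group costs the same number of partial products no matter where the large values fall, which is the "structured" property the abstract contrasts with irregularly distributed sensitive regions.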
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
3549535.pdf (Restricted Access) | Published version | 1.46 MB | Adobe PDF |