Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/36647
Full metadata record
DC Field: Value
dc.contributor.author: Huang, Kai
dc.contributor.author: CHEN, Siang
dc.contributor.author: LI, Bowen
dc.contributor.author: CLAESEN, Luc
dc.contributor.author: Yao, Hao
dc.contributor.author: Chen, Junjian
dc.contributor.author: Jiang, Xiaowen
dc.contributor.author: Liu, Zhili
dc.contributor.author: Xiong, Dongliang
dc.date.accessioned: 2022-02-14T14:31:02Z
dc.date.available: 2022-02-14T14:31:02Z
dc.date.issued: 2022
dc.date.submitted: 2022-02-14T14:27:09Z
dc.identifier.citation: JOURNAL OF SYSTEMS ARCHITECTURE, 124 (Art N° 102403)
dc.identifier.issn: 1383-7621
dc.identifier.uri: http://hdl.handle.net/1942/36647
dc.description.abstract: Despite the remarkable advances in various intelligence tasks achieved by convolutional neural networks, their massive computation and storage costs limit applications on resource-constrained devices. Existing works explore reducing the computation cost by leveraging input-dependent redundancy at runtime. The irregular distribution of dynamic sparsity, however, limits the real speedup of dynamic models deployed on traditional neural network accelerators. To solve this problem, we propose an algorithm-architecture co-design, named structured precision skipping (SPS), to exploit the dynamic precision redundancy in statically quantized models. SPS computes most neurons at a lower precision and only a small portion of important neurons at a higher precision to preserve accuracy. Specifically, we first propose the structured dynamic block to exploit dynamic sparsity in a structured manner. Based on this block, we then apply a budget-aware training method, introducing a budget regularization term to learn precision skipping under a target resource constraint. Finally, we present an architecture design based on a bit-serial architecture with support for SPS models, where only a prediction controller module with small overhead is introduced. Extensive evaluation results demonstrate that SPS can achieve up to 1.5× speedup and 1.4× energy saving on various models and datasets with marginal accuracy loss. (An illustrative sketch of this scheme follows the metadata record below.)
dc.language.iso: en
dc.publisher:
dc.subject.other: Convolutional neural networks
dc.subject.other: Algorithm-architecture co-design
dc.subject.other: Model compression and acceleration
dc.subject.other: Dynamic quantization
dc.title: Structured precision skipping: Accelerating convolutional neural networks with budget-aware dynamic precision selection
dc.type: Journal Contribution
dc.identifier.spage: 102403
dc.identifier.volume: 124
local.bibliographicCitation.jcat: A1
local.publisher.place: RADARWEG 29, 1043 NX AMSTERDAM, NETHERLANDS
local.type.refereed: Refereed
local.type.specified: Article
local.bibliographicCitation.artnr: 102403
dc.identifier.doi: 10.1016/j.sysarc.2022.102403
dc.identifier.isi: 000782573200004
dc.identifier.eissn: 1873-6165
local.provider.type: Pdf
local.uhasselt.international: yes
item.validation: ecoom 2023
item.fullcitation: Huang, Kai; CHEN, Siang; LI, Bowen; CLAESEN, Luc; Yao, Hao; Chen, Junjian; Jiang, Xiaowen; Liu, Zhili & Xiong, Dongliang (2022) Structured precision skipping: Accelerating convolutional neural networks with budget-aware dynamic precision selection. In: JOURNAL OF SYSTEMS ARCHITECTURE, 124 (Art N° 102403).
item.contributor: Huang, Kai
item.contributor: CHEN, Siang
item.contributor: LI, Bowen
item.contributor: CLAESEN, Luc
item.contributor: Yao, Hao
item.contributor: Chen, Junjian
item.contributor: Jiang, Xiaowen
item.contributor: Liu, Zhili
item.contributor: Xiong, Dongliang
item.accessRights: Restricted Access
item.fulltext: With Fulltext
crisitem.journal.issn: 1383-7621
crisitem.journal.eissn: 1873-6165
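
The abstract describes the SPS scheme at a high level: compute everything at low precision, recompute only a small, structurally grouped set of important output blocks at high precision, and train with a budget regularization term that holds the high-precision fraction to a target. The NumPy sketch below is a minimal illustration of that idea, not the authors' implementation: all names (quantize, sps_forward, budget_loss), the block size of 16, the mean-absolute-response importance score, and the 25% budget are assumptions made for illustration.

```python
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization of x to a given bit width (assumed scheme)."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1) + 1e-12
    return np.round(x / scale) * scale

def sps_forward(activations, weights, block=16, high_frac=0.25,
                low_bits=4, high_bits=8):
    """Hypothetical structured-precision-skipping layer: compute every output
    block at low precision, then recompute only the blocks scored as
    important at high precision."""
    # Cheap low-precision pass over all output neurons.
    low_out = quantize(activations, low_bits) @ quantize(weights, low_bits)

    # Group outputs into structured blocks and score each block's importance.
    # The mean absolute low-precision response is a stand-in for the paper's
    # learned predictor; for simplicity the same blocks are chosen batch-wide.
    n_blocks = low_out.shape[-1] // block
    grouped = low_out[..., :n_blocks * block].reshape(-1, n_blocks, block)
    scores = np.abs(grouped).mean(axis=-1).mean(axis=0)  # one score per block

    # Spend the high-precision budget on the top-scoring fraction of blocks.
    k = max(1, int(high_frac * n_blocks))
    important = np.argsort(scores)[-k:]

    out = low_out.copy()
    high_act = quantize(activations, high_bits)
    high_w = quantize(weights, high_bits)
    for b in important:
        cols = slice(b * block, (b + 1) * block)
        out[..., cols] = high_act @ high_w[:, cols]
    return out

def budget_loss(gate_probs, target_frac, weight=1.0):
    """Hypothetical budget regularizer: penalize deviation of the expected
    high-precision fraction from the target resource budget."""
    return weight * (np.mean(gate_probs) - target_frac) ** 2

# Demo with made-up shapes: 128 output neurons form 8 blocks of 16,
# of which 2 are recomputed at 8 bits.
acts = np.random.randn(8, 64)
w = np.random.randn(64, 128)
y = sps_forward(acts, w)
```

The structured grouping is what plausibly makes the scheme hardware-friendly: in a bit-serial design of the kind the abstract mentions, compute time scales with operand bit-width, so a block flagged as low precision simply finishes after fewer bit-cycles, and the only added hardware is the small prediction controller that produces the per-block flags.
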
Appears in Collections: Research publications
Files in This Item:
File: 1-s2.0-S1383762122000078-main.pdf (Restricted Access)
Description: Published version
Size: 1.45 MB
Format: Adobe PDF

WEB OF SCIENCE™ citations: 1 (checked on May 2, 2024)
Page view(s): 44 (checked on Sep 6, 2022)
Download(s): 8 (checked on Sep 6, 2022)
