Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/48256
Full metadata record
DC Field: Value
dc.contributor.author: LI, Bowen
dc.contributor.author: Huang, Kai
dc.contributor.author: CHEN, Siang
dc.contributor.author: Xiong, Dongliang
dc.contributor.author: CLAESEN, Luc
dc.date.accessioned: 2026-01-26T14:04:56Z
dc.date.available: 2026-01-26T14:04:56Z
dc.date.issued: 2021
dc.date.submitted: 2026-01-15T09:21:14Z
dc.identifier.citation: Balasubramanian, Vineeth N.; Tsang, Ivor (Ed.). Proceedings of Machine Learning Research PMLR, p. 502-517
dc.identifier.issn: 2640-3498
dc.identifier.uri: http://hdl.handle.net/1942/48256
dc.description.abstract: As deployment platforms and application scenarios for deep neural networks diversify, traditional fixed network architectures can no longer meet all requirements, and dynamic network inference has become a new research trend. Many slimmable and scalable networks have been proposed to satisfy different resource constraints (e.g., storage, latency, and energy), and a single network may support versatile architectural configurations, including depth, width, kernel size, and resolution. In this paper, we propose a novel network-architecture reuse strategy that enables dynamic precision in parameters. Since our low-precision networks are wrapped inside the high-precision networks like the layers of an onion, we name the method dynamic precision onion quantization (DPOQ). We train the network using a joint loss with scaled gradients. To further improve performance and make networks of different precisions compatible with each other, we propose precision shift batch normalization (PSBN). We also propose a scalable input-specific inference mechanism based on this architecture, making the network more adaptable. Experiments on the CIFAR and ImageNet datasets show that DPOQ achieves not only better flexibility but also higher accuracy than individually quantized networks.
dc.language.iso: en
dc.subject.other: Dynamic Quantization
dc.subject.other: Neural Network
dc.title: DPOQ: Dynamic Precision Onion Quantization
dc.type: Proceedings Paper
local.bibliographicCitation.authors: Balasubramanian, Vineeth N.
local.bibliographicCitation.authors: Tsang, Ivor
local.bibliographicCitation.conferencedate: 2021, November 17-19
local.bibliographicCitation.conferencename: Asian Conference on Machine Learning
local.bibliographicCitation.conferenceplace: Online
dc.identifier.epage: 517
dc.identifier.spage: 502
local.format.pages: 16
local.bibliographicCitation.jcat: C1
local.type.refereed: Refereed
local.type.specified: Proceedings Paper
local.relation.ispartofseriesnr: 157
local.provider.type: Pdf
local.bibliographicCitation.btitle: Proceedings of Machine Learning Research PMLR
local.uhasselt.international: yes
item.contributor: LI, Bowen
item.contributor: Huang, Kai
item.contributor: CHEN, Siang
item.contributor: Xiong, Dongliang
item.contributor: CLAESEN, Luc
item.accessRights: Open Access
item.fulltext: With Fulltext
item.fullcitation: LI, Bowen; Huang, Kai; CHEN, Siang; Xiong, Dongliang & CLAESEN, Luc (2021) DPOQ: Dynamic Precision Onion Quantization. In: Balasubramanian, Vineeth N.; Tsang, Ivor (Ed.). Proceedings of Machine Learning Research PMLR, p. 502-517.
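The "onion" wrapping described in the abstract implies that every low-precision network can be read directly out of the shared high-precision weights rather than stored separately. Below is a minimal sketch of one plausible realization, assuming nesting is achieved by truncating the low-order bits of a shared integer weight code; the function quantize_nested and all parameter names are illustrative assumptions for this sketch, not the paper's actual method.

```python
# Hypothetical sketch of onion-style nested quantization: a k-bit weight
# tensor is obtained from one shared max_bits-wide integer code by dropping
# its low-order bits, so every lower-precision network is nested inside the
# higher-precision ones. Names are illustrative, not from the paper.
import torch

def quantize_nested(w: torch.Tensor, bits: int, max_bits: int = 8) -> torch.Tensor:
    """Quantize w to `bits` bits by truncating a shared max_bits integer code."""
    scale = w.abs().max() / (2 ** (max_bits - 1) - 1)   # one scale shared by all precisions
    code = torch.round(w / scale).to(torch.int32)       # shared max_bits-wide code
    step = 2 ** (max_bits - bits)                       # zero out the low-order bits
    code = torch.div(code, step, rounding_mode="trunc") * step
    return code.float() * scale                         # dequantize for inference

w = torch.randn(64, 64)
for b in (8, 6, 4, 2):  # each lower precision is contained in the higher ones
    err = (w - quantize_nested(w, b)).abs().mean().item()
    print(f"{b}-bit mean reconstruction error: {err:.4f}")
```

Because every precision is derived from the same stored tensor, switching bit-width at inference time under such a scheme costs no extra memory, which matches the flexibility the abstract claims for DPOQ.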
Appears in Collections: Research publications
Files in This Item:
File: DPQQ - Dynamic Precision Onion Quantization.pdf
Description: Published version
Size: 419.97 kB
Format: Adobe PDF