A novel feature representation: Aggregating convolution kernels for image retrieval

WANG, Qi; Lai, Jinxing; CLAESEN, Luc; Yang, Zhenguo; Lei, Liang; Liu, Wenyin

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/31322

Full metadata record

DC Field	Value	Language
dc.contributor.author	WANG, Qi	-
dc.contributor.author	Lai, Jinxing	-
dc.contributor.author	CLAESEN, Luc	-
dc.contributor.author	Yang, Zhenguo	-
dc.contributor.author	Lei, Liang	-
dc.contributor.author	Liu, Wenyin	-
dc.date.accessioned	2020-06-30T05:50:35Z	-
dc.date.available	2020-06-30T05:50:35Z	-
dc.date.issued	2020	-
dc.date.submitted	2020-06-27T10:47:51Z	-
dc.identifier.citation	Neural Networks, 130 (2020), p. 1-10	-
dc.identifier.issn	0893-6080	-
dc.identifier.uri	http://hdl.handle.net/1942/31322	-
dc.description.abstract	Activated hidden units in convolutional neural networks (CNNs), known as feature maps, dominate image representation because they are compact and discriminative. For ultra-large data sets, however, high-dimensional feature maps in floating-point format not only incur high computational complexity but also occupy massive memory. To this end, a new image representation based on aggregating convolution kernels (ACK) is proposed, in which the convolution kernels that capture certain patterns are activated. The top-n index numbers of these convolution kernels are extracted directly as the image representation in discrete integer values, which rebuilds the relationship between convolution kernels and the image. Furthermore, a distance measurement is defined from the perspective of ordered sets to calculate position-sensitive similarities between image representations. Extensive experiments on Oxford Buildings, Paris, Holidays, and other data sets show that the proposed ACK achieves competitive image retrieval performance at much lower computational cost, outperforming methods that use feature maps for image representation.	-
dc.description.sponsorship	This work is supported by the National Natural Science Foundation of China (No. 61703109, No. 91748107, No. 61902077, No. 61675050), Guangdong Basic and Applied Basic Research Foundation (No. 2020A1515010616), and Guangdong Innovative Research Team Program (No. 2014ZT05G157). This study was supported by the Special Research Fund of Hasselt University (No. BOF20BL01).	-
dc.language.iso	en	-
dc.publisher	PERGAMON-ELSEVIER SCIENCE LTD	-
dc.rights	2020 Elsevier Ltd.	-
dc.subject.other	Image Retrieval	-
dc.subject.other	Image Representation	-
dc.subject.other	Feature Aggregating	-
dc.subject.other	Distance Measurement	-
dc.subject.other	Convolutional Neural Networks	-
dc.title	A novel feature representation: Aggregating convolution kernels for image retrieval	-
dc.type	Journal Contribution	-
dc.identifier.epage	10	-
dc.identifier.issue	2020	-
dc.identifier.spage	1	-
dc.identifier.volume	130	-
local.format.pages	10	-
local.bibliographicCitation.jcat	A1	-
local.publisher.place	THE BOULEVARD, LANGFORD LANE, KIDLINGTON, OXFORD OX5 1GB, ENGLAND	-
local.type.refereed	Refereed	-
local.type.specified	Article	-
dc.identifier.doi	10.1016/j.neunet.2020.06.010	-
dc.identifier.pmid	32589586	-
dc.identifier.isi	WOS:000567813200001	-
dc.identifier.eissn	1879-2782	-
local.provider.type	Pdf	-
local.uhasselt.uhpub	yes	-
local.uhasselt.international	yes	-
item.fullcitation	WANG, Qi; Lai, Jinxing; CLAESEN, Luc; Yang, Zhenguo; Lei, Liang & Liu, Wenyin (2020) A novel feature representation: Aggregating convolution kernels for image retrieval. In: Neural Networks, 130 (2020), p. 1-10.	-
item.contributor	WANG, Qi	-
item.contributor	Lai, Jinxing	-
item.contributor	CLAESEN, Luc	-
item.contributor	Yang, Zhenguo	-
item.contributor	Lei, Liang	-
item.contributor	Liu, Wenyin	-
item.validation	ecoom 2021	-
item.accessRights	Open Access	-
item.fulltext	With Fulltext	-
crisitem.journal.issn	0893-6080	-
crisitem.journal.eissn	1879-2782	-
Appears in Collections:	Research publications

Files in This Item:

File	Description	Size	Format
WangQ_2020.pdf (Restricted Access)	Published version	2.37 MB	Adobe PDF
A Novel Feature Representation Aggregating Convolution Kernels for Image Retrieval.pdf	Peer-reviewed author version	12.55 MB	Adobe PDF

Scopus citations: 21 (checked on Feb 8, 2026)

Web of Science citations: 19 (checked on Feb 10, 2026)