Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/26396
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chakroun, Imen | - |
dc.contributor.author | HABER, Tom | - |
dc.contributor.author | Ashby, Thomas J. | - |
dc.date.accessioned | 2018-07-20T15:15:48Z | - |
dc.date.available | 2018-07-20T15:15:48Z | - |
dc.date.issued | 2017 | - |
dc.identifier.citation | Koumoutsakos, Pedro; Lees, Michael; Krzhizhanovskaya, Valeria; Dongarra, Jack; Sloot, Peter M. A. (Ed.). International conference on computational science (ICCS 2017), Elsevier Science BV, p. 2318-2322 | - |
dc.identifier.issn | 1877-0509 | - |
dc.identifier.uri | http://hdl.handle.net/1942/26396 | - |
dc.description.abstract | Stochastic Gradient Descent (SGD, or 1-SGD in our notation) is probably the most popular family of optimisation algorithms used in machine learning on large data sets, due to its ability to optimise efficiently with respect to the number of complete passes over the training set (epochs). Various authors have worked on data or model parallelism for SGD, but there is little work on how SGD fits with the memory hierarchies ubiquitous in HPC machines. Standard practice suggests randomising the order of training points and streaming the whole set through the learner, which results in extremely low temporal locality of access to the training set and thus, for large data sets, makes minimal use of the small, fast layers of memory in an HPC memory hierarchy. Mini-batch SGD with batch size n (n-SGD) is often used to control the noise on the gradient, making convergence smoother and easier to identify, but this can reduce learning efficiency with respect to epochs compared to 1-SGD, while having the same extremely low temporal locality. In this paper we introduce Sliding Window SGD (SW-SGD), which exploits temporal locality of training point access to combine the advantages of 1-SGD (epoch efficiency) with those of n-SGD (smoother, more easily identified convergence) by leveraging HPC memory hierarchies. We give initial results on part of the Pascal dataset that show that memory hierarchies can be used to improve SGD performance. (An illustrative sketch of the sliding-window idea follows the metadata table below.) (C) 2017 The Authors. Published by Elsevier B.V. | - |
dc.description.sponsorship | European project ExCAPE, funded by the European Union's Horizon 2020 Research and Innovation programme (grant agreement no. 671555) | - |
dc.language.iso | en | - |
dc.publisher | Elsevier Science BV | - |
dc.relation.ispartofseries | Procedia Computer Science | - |
dc.rights | © 2017 The Authors. Published by Elsevier B.V. | - |
dc.subject.other | SGD; sliding window; machine learning; SVM; logistic regression | - |
dc.title | SW-SGD: The Sliding Window Stochastic Gradient Descent Algorithm | - |
dc.type | Proceedings Paper | - |
local.bibliographicCitation.authors | Koumoutsakos, Pedro | - |
local.bibliographicCitation.authors | Lees, Michael | - |
local.bibliographicCitation.authors | Krzhizhanovskaya, Valeria | - |
local.bibliographicCitation.authors | Dongarra, Jack | - |
local.bibliographicCitation.authors | Sloot, Peter M. A. | - |
local.bibliographicCitation.conferencedate | 12-14/07/2017 | - |
local.bibliographicCitation.conferencename | International Conference on Computational Science (ICCS) | - |
local.bibliographicCitation.conferenceplace | Zurich, Switzerland | - |
dc.identifier.epage | 2322 | - |
dc.identifier.spage | 2318 | - |
dc.identifier.volume | 108 | - |
local.format.pages | 5 | - |
local.bibliographicCitation.jcat | C1 | - |
dc.description.notes | [Chakroun, Imen; Ashby, Thomas J.] IMEC, Kapeldreef 75, B-3001 Leuven, Belgium. [Haber, Tom] Expertise Ctr Digital Media, Wetenschapspk 2, B-3590 Diepenbeek, Belgium. [Chakroun, Imen; Haber, Tom; Ashby, Thomas J.] ExaSci Life Lab, Kapeldreef 75, B-3001 Leuven, Belgium. | - |
local.publisher.place | Amsterdam, The Netherlands | - |
local.type.refereed | Refereed | - |
local.type.specified | Proceedings Paper | - |
local.relation.ispartofseriesnr | 108 | - |
local.class | dsPublValOverrule/author_version_not_expected | - |
local.type.programme | H2020 | - |
local.relation.h2020 | 671555 | - |
dc.identifier.doi | 10.1016/j.procs.2017.05.082 | - |
dc.identifier.isi | 000404959000243 | - |
local.bibliographicCitation.btitle | International conference on computational science (ICCS 2017) | - |
item.accessRights | Open Access | - |
item.validation | ecoom 2018 | - |
item.fulltext | With Fulltext | - |
item.contributor | Chakroun, Imen | - |
item.contributor | HABER, Tom | - |
item.contributor | Ashby, Thomas J. | - |
item.fullcitation | Chakroun, Imen; HABER, Tom & Ashby, Thomas J. (2017) SW-SGD: The Sliding Window Stochastic Gradient Descent Algorithm. In: Koumoutsakos, Pedro; Lees, Michael; Krzhizhanovskaya, Valeria; Dongarra, Jack; Sloot, Peter M. A. (Ed.). International conference on computational science (ICCS 2017), Elsevier Science BV, p. 2318-2322. | - |
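
The abstract above describes SW-SGD only at a high level, so the following is a minimal sketch of the sliding-window idea, not the paper's exact algorithm. It assumes a binary logistic-regression objective (one of the record's keywords); the names `sw_sgd`, `window_size`, and `lr` are illustrative choices, not taken from the paper.

```python
# Minimal SW-SGD sketch for binary logistic regression (illustrative only).
from collections import deque

import numpy as np


def sw_sgd(X, y, window_size=32, lr=0.1, epochs=5, seed=0):
    """Sliding Window SGD, reconstructed from the abstract's description.

    One new point is streamed in per step (as in 1-SGD), but the gradient
    is averaged over a small window of the most recently seen points, so
    each point is reused ~window_size times while it would still sit in
    the fast levels of an HPC memory hierarchy (n-SGD-like smoothing).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    window = deque(maxlen=window_size)  # oldest index drops out automatically

    for _ in range(epochs):
        for i in rng.permutation(n_samples):  # stream points in random order
            window.append(i)
            idx = list(window)
            p = 1.0 / (1.0 + np.exp(-X[idx] @ w))      # sigmoid predictions
            grad = X[idx].T @ (p - y[idx]) / len(idx)  # window-averaged gradient
            w -= lr * grad
    return w
```

With `window_size=1` this reduces to plain 1-SGD; larger windows mimic the gradient smoothing of n-SGD without the extra passes over the data, since only the cache-resident window is revisited.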
Appears in Collections: | Research publications |
Files in This Item:
File | Description | Size | Format
---|---|---|---
Haber.pdf | Published version | 473.7 kB | Adobe PDF
Scopus™ citations: 8 (checked on Sep 3, 2020)
Web of Science™ citations: 12 (checked on Apr 30, 2024)
Page view(s): 132 (checked on Sep 5, 2022)
Download(s): 172 (checked on Sep 5, 2022)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.