Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/10835
Full metadata record
DC Field | Value | Language
dc.contributor.author | GOORTS, Patrik | -
dc.contributor.author | ROGMANS, Sammy | -
dc.contributor.author | BEKAERT, Philippe | -
dc.date.accessioned | 2010-04-06T11:37:44Z | -
dc.date.available | 2010-04-06T11:37:44Z | -
dc.date.issued | 2009 | -
dc.identifier.citation | 15th International Conference on Parallel and Distributed Systems. p. 300-307. | -
dc.identifier.isbn | 978-0-7695-3900-3 | -
dc.identifier.uri | http://hdl.handle.net/1942/10835 | -
dc.description.abstract | In this paper, we investigate discrete finite impulse response (FIR) filtering of images, while harnessing the powerful computational resources of next-generation GPUs. These novel platforms exhibit a massive data parallel architecture with an advanced SIMT execution model and thread management, to enable designers to better cope with the infamous memory wall, i.e. the growing gap between the cost of data communication and computational processing. However, the concerning platforms still have hard constraints that prevent trivial optimization of convolution filtering. Although automatic (compiler) optimization is available, we investigate and explain the speedup potential considering manual intervention, given the context of FIR kernels. Furthermore, we present multiple convolution implementation techniques that are able to cope with the hard platform constraints in different situations, while still being able to optimize the implementation to the underlying architecture. Utilizing the acquired insights, a view is given on the impact for possible optimization when loosening these hard constraints in the near future. | -
dc.language.iso | en | -
dc.publisher | IEEE Computer Society | -
dc.relation.ispartofseries | Parallel and Distributed Systems, International Conference on | -
dc.subject.other | data distribution, FIR, convolution, CUDA | -
dc.title | Optimal Data Distribution for Versatile Finite Impulse Response Filtering on Next-Generation Graphics Hardware Using CUDA | -
dc.type | Proceedings Paper | -
local.bibliographicCitation.conferencename | 15th International Conference on Parallel and Distributed Systems | -
local.bibliographicCitation.conferenceplace | Shenzhen, Guangdong, China, December 09-December 11 2009 | -
dc.identifier.epage | 307 | -
dc.identifier.spage | 300 | -
local.bibliographicCitation.jcat | C1 | -
local.type.specified | Proceedings Paper | -
dc.bibliographicCitation.oldjcat | C2 | -
dc.identifier.url | http://doi.ieeecomputersociety.org/10.1109/ICPADS.2009.79 | -
local.bibliographicCitation.btitle | 15th International Conference on Parallel and Distributed Systems | -
item.fulltext | No Fulltext | -
item.fullcitation | GOORTS, Patrik; ROGMANS, Sammy & BEKAERT, Philippe (2009) Optimal Data Distribution for Versatile Finite Impulse Response Filtering on Next-Generation Graphics Hardware Using CUDA. In: 15th International Conference on Parallel and Distributed Systems. p. 300-307. | -
item.contributor | GOORTS, Patrik | -
item.contributor | ROGMANS, Sammy | -
item.contributor | BEKAERT, Philippe | -
item.accessRights | Closed Access | -
Appears in Collections: Research publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.