Optimal Data Distribution for Versatile Finite Impulse Response Filtering on Next-Generation Graphics Hardware Using CUDA

GOORTS, Patrik; ROGMANS, Sammy; BEKAERT, Philippe

Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/10835

Full metadata record

DC Field	Value	Language
dc.contributor.author	GOORTS, Patrik	-
dc.contributor.author	ROGMANS, Sammy	-
dc.contributor.author	BEKAERT, Philippe	-
dc.date.accessioned	2010-04-06T11:37:44Z	-
dc.date.available	2010-04-06T11:37:44Z	-
dc.date.issued	2009	-
dc.identifier.citation	15th International Conference on Parallel and Distributed Systems. p. 300-307.	-
dc.identifier.isbn	978-0-7695-3900-3	-
dc.identifier.uri	http://hdl.handle.net/1942/10835	-
dc.description.abstract	In this paper, we investigate discrete finite impulse response (FIR) filtering of images, while harnessing the powerful computational resources of next-generation GPUs. These novel platforms exhibit a massive data parallel architecture with an advanced SIMT execution model and thread management, to enable designers to better cope with the infamous memory wall, i.e. the growing gap between the cost of data communication and computational processing. However, the concerning platforms still have hard constraints that prevent trivial optimization of convolution filtering. Although automatic (compiler) optimization is available, we investigate and explain the speedup potential considering manual intervention, given the context of FIR kernels. Furthermore, we present multiple convolution implementation techniques that are able to cope with the hard platform constraints in different situations, while still being able to optimize the implementation to the underlying architecture. Utilizing the acquired insights, a view is given on the impact for possible optimization when loosening these hard constraints in the near future.	-
dc.language.iso	en	-
dc.publisher	IEEE Computer Society	-
dc.relation.ispartofseries	Parallel and Distributed Systems, International Conference on	-
dc.subject.other	data distribution, FIR, convolution, CUDA	-
dc.title	Optimal Data Distribution for Versatile Finite Impulse Response Filtering on Next-Generation Graphics Hardware Using CUDA	-
dc.type	Proceedings Paper	-
local.bibliographicCitation.conferencename	15th International Conference on Parallel and Distributed Systems	-
local.bibliographicCitation.conferenceplace	Shenzhen, Guangdong, China, December 09-December 11 2009	-
dc.identifier.epage	307	-
dc.identifier.spage	300	-
local.bibliographicCitation.jcat	C1	-
local.type.specified	Proceedings Paper	-
dc.bibliographicCitation.oldjcat	C2	-
dc.identifier.url	http://doi.ieeecomputersociety.org/10.1109/ICPADS.2009.79	-
local.bibliographicCitation.btitle	15th International Conference on Parallel and Distributed Systems	-
item.fullcitation	GOORTS, Patrik; ROGMANS, Sammy & BEKAERT, Philippe (2009) Optimal Data Distribution for Versatile Finite Impulse Response Filtering on Next-Generation Graphics Hardware Using CUDA. In: 15th International Conference on Parallel and Distributed Systems. p. 300-307.	-
item.contributor	GOORTS, Patrik	-
item.contributor	ROGMANS, Sammy	-
item.contributor	BEKAERT, Philippe	-
item.accessRights	Closed Access	-
item.fulltext	No Fulltext	-
Appears in Collections:	Research publications
