English   español  
Please use this identifier to cite or link to this item: http://hdl.handle.net/10261/139427
logo share SHARE logo core CORE   Add this article to your Mendeley library MendeleyBASE

Visualizar otros formatos: MARC | Dublin Core | RDF | ORE | MODS | METS | DIDL
Exportar a otros formatos:

Exploiting task and data parallelism in ILUPACK's preconditioned CG solver on NUMA architectures and many-core accelerators

AuthorsAliaga, Jose I.; Badia, Rosa M.; Barreda, Maria; Bollhöfer, Matthias; Dufrechou, Ernesto; Ezzatti, Pablo; Quintana-Orti, Enrique S.
KeywordsSparse linear systems
Task and data parallelism
Reconditioned Conjugate Gradient solver
Multi-core processors
Intel Xeon Phi
Graphics processing units (GPUs)
Issue Date2016
CitationParallel Computing 54: 97- 107 (2016)
AbstractWe present specialized implementations of the preconditioned iterative linear system solver in ILUPACK for Non-Uniform Memory Access (NUMA) platforms and many-core hardware co-processors based on the Intel Xeon Phi and graphics accelerators. For the conventional x86 architectures, our approach exploits task parallelism via the OmpSs runtime as well as a message-passing implementation based on MPI, respectively yielding a dynamic and static schedule of the work to the cores, with different numeric semantics to those of the sequential ILUPACK. For the graphics processor we exploit data parallelism by off-loading the computationally expensive kernels to the accelerator while keeping the numeric semantics of the sequential case.
Identifiersdoi: 10.1016/j.parco.2015.12.004
issn: 0167-8191
Appears in Collections:(IIIA) Artículos
Files in This Item:
File Description SizeFormat 
accesoRestringido.pdf15,38 kBAdobe PDFThumbnail
Show full item record
Review this work

Related articles:

WARNING: Items in Digital.CSIC are protected by copyright, with all rights reserved, unless otherwise indicated.