2024-03-28T18:25:41Z
http://digital.csic.es/dspace-oai/request
oai:digital.csic.es:10261/101142
2021-12-28T15:42:19Z
com_10261_74
com_10261_6
col_10261_327
MC64-ClustalWP2: A highly-parallel hybrid strategy to align multiple sequences in many-core architectures
Díaz, David
Esteban, Francisco J.
Hernández Molina, Pilar
Caballero, Juan Antonio
Guevara, Antonio
Dorado, Gabriel
Gálvez, Sergio
Ministerio de Economía y Competitividad (España)
Junta de Andalucía
CSIC - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA)
We have developed MC64-ClustalWP2, a new implementation of the Clustal W algorithm that integrates a novel parallelization strategy and significantly increases performance when aligning long sequences on many-core architectures. In such a process, a detailed analysis of both the software and the hardware features and peculiarities is of paramount importance to reveal the key points at which the full potential of parallelism in many-core CPU systems can be exploited and optimized. The new parallelization approach focuses on the most time-consuming stages of the algorithm. In particular, the performance of the so-called progressive alignment stage has improved drastically, owing to a fine-grained approach in which the forward and backward loops were unrolled and parallelized. Another key approach has been the implementation of the new algorithm on a hybrid-computing system that integrates an Intel Xeon multi-core CPU and a Tilera Tile64 many-core card. A comparison with other Clustal W implementations reveals the high performance of the new algorithm and strategy on many-core CPU architectures, in a scenario where the sequences to align are relatively long (more than 10 kb) and, hence, many-core GPU hardware cannot be used. Thus, MC64-ClustalWP2 runs multiple alignments more than 18x faster than the original Clustal W algorithm and more than 7x faster than the best x86 parallel implementation to date, and it is publicly available through a web service.
In addition, these developments have been deployed on cost-effective personal computers and should be useful for life-science researchers. Applications include the identification of identities and differences for mutation/polymorphism analyses; biodiversity and evolutionary studies; and the development of molecular markers for paternity testing, germplasm management and protection, breeding assistance, illegal traffic control, fraud prevention and the protection of intellectual property (identification/traceability), including protected designations of origin, among others. © 2014 Díaz et al.
2014-08-25T12:13:20Z
2014-08-25T12:13:20Z
2014-04-07
2014-08-25T12:13:20Z
article
PLoS ONE 9(4): e94044 (2014)
http://hdl.handle.net/10261/101142
10.1371/journal.pone.0094044
http://dx.doi.org/10.13039/501100003329
http://dx.doi.org/10.13039/100007652
http://dx.doi.org/10.13039/501100011011
24710354
eng
Publisher’s version
http://dx.doi.org/10.1371/journal.pone.0094044
Yes
http://creativecommons.org/licenses/by/4.0/
openAccess
Public Library of Science