Accelerating the computation of FLAPW methods on heterogeneous architectures

Davidović, Davor; Fabregat-Traver, Diego; Höhnerbach, Markus; Di Napoli, Edoardo

Data source: CROSBI

Accelerating the computation of FLAPW methods on heterogeneous architectures (CROSBI ID 254232)

Journal contribution | original scientific paper | international peer review

Davidović, Davor ; Fabregat-Traver, Diego ; Höhnerbach, Markus ; Di Napoli, Edoardo Accelerating the computation of FLAPW methods on heterogeneous architectures // Concurrency and computation, 30 (2018), 24; e4905, 14. doi: 10.1002/cpe.4905

Responsibility data

Authors

Davidović, Davor ; Fabregat-Traver, Diego ; Höhnerbach, Markus ; Di Napoli, Edoardo

Basic data in the original language
Basic data in other languages

Language

English

Title

Accelerating the computation of FLAPW methods on heterogeneous architectures

Abstract

Legacy codes in computational science and engineering have been very successful in providing essential functionality to researchers. However, they are not capable of exploiting the massive parallelism provided by emerging heterogeneous architectures. The lack of portable performance and scalability puts them at high risk, i.e., either they evolve or they are destined to be executed on older platforms and small clusters. One example of a legacy code which would heavily benefit from a modern redesign is FLEUR, a software package for electronic structure calculations. In previous work, the computational bottleneck of FLEUR was partially re-engineered to have a modular design that relies on standard building blocks, namely, BLAS and LAPACK libraries. In this paper, we demonstrate how the initial redesign enables the portability to heterogeneous architectures. More specifically, we study different approaches to port the code to architectures consisting of multi-core CPUs equipped with one or more coprocessors such as Nvidia GPUs and Intel Xeon Phis. Our final code attains over 70% of the architectures' peak performance and outperforms Nvidia's and Intel's libraries. On JURECA, the large tier-0 cluster where FLEUR is often executed, the code takes advantage of the full power of the computing nodes, attaining 5× speedup over the sole use of the CPUs.

Keywords

FLAPW ; FLEUR ; hybrid BLAS ; multiGPU ; Phi ; portability ; scalability

Note

not recorded

Language

not recorded

Title

not recorded

Abstract

not recorded

Keywords

not recorded

Note

not recorded

Publication data

Journal

Concurrency and computation

Volume (issue)

30 (24)

Year

2018

Article number

e4905

Number of pages

Publication status

published

ISSN

1532-0626

e-ISSN

1532-0634

DOI

10.1002/cpe.4905

Related to this work

Related persons

Davor Davidović (author)

Related institutions

Institut Ruđer Bošković (098) (author's institution)

Field

Mathematics, Physics, Computer Science

Links

doi.org

onlinelibrary.wiley.com

Indexed in

Scopus

Current Contents Connect (CCC)

Web of Science Core Collection, Science Citation Index Expanded (WoSCC-SCI-Exp)

Web of Science Core Collection, SCI-Exp, SSCI & A&HCI (WoSCC-SCI-Exp, SSCI, A&HCI)