Talks - Jan Winkelmann

Optimizing the ChASE eigensolver for Bethe-Salpeter computations
Jan Winkelmann, Edoardo Di Napoli and André Schleife
7th Workshop of the Joint Laboratory for Extreme Scale Computing.
17 July 2017.
The Chebyshev Accelerated Subspace iteration Eigensolver (ChASE) is an iterative eigensolver developed at the JSC by the SimLab Quantum Materials. The solver mainly targets sequences of dense eigenvalue problems as they arise in Density Functional Theory, but it can also be applied to a single eigenproblem. ChASE relies predominantly on BLAS 3 subroutines to achieve close-to-peak performance and, potentially, scalability over hundreds if not thousands of computing nodes. We have recently succeeded in integrating a version of the ChASE library into the Jena BSE code. A preliminary comparison between ChASE and the conjugate gradient eigensolver (KSCG) previously used by the Jena BSE code shows that ChASE can outperform KSCG with speedups of up to 5x. In this talk we illustrate our latest results and give an outlook on the scientific problems that can be tackled once the integration is successfully completed.
Towards Automated Load Balancing via Spectrum Slicing for FEAST-like Solvers
Jan Winkelmann and Edoardo Di Napoli
6th Workshop of the Joint Laboratory for Extreme Scale Computing.
30 November 2016.
Optimizing Least-Squares Rational Filters for Solving Interior Eigenvalue Problems
Jan Winkelmann and Edoardo Di Napoli
International Workshop on Parallel Matrix Algorithms and Applications.
Bordeaux, France, 6 July 2016.
Exploring OpenMP Task Priorities on the MR3 Eigensolver
Jan Winkelmann and Paolo Bientinesi
SIAM Conference on Parallel Processing for Scientific Computing.
Université Pierre et Marie Curie, Paris, 12 April 2016.
As part of the OpenMP 4.1 draft, the runtime incorporates task priorities. We use the Method of Multiple Relatively Robust Representations (MR3), for which a pthreads-based task-parallel version already exists (MR3SMP), to analyze and compare the performance of MR3SMP with three different OpenMP runtimes, with and without support for priorities. On a dataset of application matrices, OpenMP appears to be always on par with or better than the pthreads implementation.