Home People Publications Talks Teaching Contact Github

Publications - Matthias Petschow

Journal Articles

  1. Improved Accuracy and Parallelism for MRRR-based Eigensolvers
    Matthias Petschow, Enrique S. Quintana-Orti and Paolo Bientinesi
    SIAM Journal of Scientific Computing, Volume 36(2), pp. 240-263, April 2014.
        author  = "Matthias Petschow and {Enrique S.} Quintana-Orti and Paolo Bientinesi",
        title   = "Improved Accuracy and Parallelism for MRRR-based Eigensolvers",
        journal = "SIAM Journal of Scientific Computing",
        year    = 2014,
        volume  = 36,
        number  = 2,
        pages   = "240--263",
        month   = apr,
        url     = "http://arxiv.org/pdf/1304.1864v3"
    The real symmetric tridiagonal eigenproblem is of outstanding importance in numerical computations; it arises frequently as part of eigensolvers for standard and generalized dense Hermitian eigenproblems that are based on a reduction to real tridiagonal form. For its solution, the algorithm of Multiple Relatively Robust Representations (MRRR) is among the fastest methods. Although fast, the solvers based on MRRR do not deliver the same accuracy as competing methods like Divide & Conquer or the QR algorithm. In this paper, we demonstrate that the use of mixed precisions leads to improved accuracy of MRRR-based eigensolvers with limited or no performance penalty. As a result, we obtain eigensolvers that are not only equally or more accurate than the best available methods, but also - under most circumstances - faster and more scalable than the competition.
  2. High-Performance Solvers for Dense Hermitian Eigenproblems
    SIAM Journal on Scientific Computing, Volume 35(1), pp. C1-C22, January 2013.
        author    = "Matthias Petschow and Elmar Peise and Paolo Bientinesi",
        title     = "High-Performance Solvers for Dense Hermitian Eigenproblems",
        journal   = "SIAM Journal on Scientific Computing",
        year      = 2013,
        volume    = 35,
        number    = 1,
        pages     = "C1-C22",
        month     = jan,
        doi       = "10.1137/110848803",
        publisher = "Society for Industrial and Applied Mathematics",
        url       = "http://arxiv.org/pdf/1205.2107.pdf"
    We introduce a new collection of solvers - subsequently called EleMRRR - for large-scale dense Hermitian eigenproblems. EleMRRR solves various types of problems: generalized, standard, and tridiagonal eigenproblems. Among these, the last is of particular importance as it is a solver on its own right, as well as the computational kernel for the first two; we present a fast and scalable tridiagonal solver based on the Algorithm of Multiple Relatively Robust Representations - referred to as PMRRR. Like the other EleMRRR solvers, PMRRR is part of the freely available Elemental library, and is designed to fully support both message-passing (MPI) and multithreading parallelism (SMP). As a result, the solvers can equally be used in pure MPI or in hybrid MPI-SMP fashion. We conducted a thorough performance study of EleMRRR and ScaLAPACK's solvers on two supercomputers. Such a study, performed with up to 8,192 cores, provides precise guidelines to assemble the fastest solver within the ScaLAPACK framework; it also indicates that EleMRRR outperforms even the fastest solvers built from ScaLAPACK's components.
  3. MR3-SMP: A Symmetric Tridiagonal Eigensolver for Multi-Core Architectures
    Parallel Computing, Volume 37(12), December 2011.
        author  = "Matthias Petschow and Paolo Bientinesi",
        title   = "MR3-SMP: A Symmetric Tridiagonal Eigensolver for Multi-Core Architectures",
        journal = "Parallel Computing",
        year    = 2011,
        volume  = 37,
        number  = 12,
        month   = dec,
        url     = "http://hpac.cs.umu.se/~pauldj/pubs/MR3-SMP-TR.pdf"
    The computation of eigenvalues and eigenvectors of symmetric tridiagonal matrices arises frequently in applications; often as one of the steps in the solution of Hermitian and symmetric eigenproblems. While several accurate and efficient methods for the tridiagonal eigenproblem exist, their corresponding implementations usually target uni-processors or large distributed memory systems. Our new eigensolver MR3-SMP is instead specifically designed for multi-core and many-core general purpose processors, which today have effectively replaced uni-processors. We show that in most cases MR3-SMP is faster and achieves better speedups than state-of-the-art eigensolvers for uni-processors and distributed-memory systems.
  4. Condensed Forms for the Symmetric Eigenvalue Problem on Multi-threaded Architectures
    Paolo Bientinesi, Francisco Igual, Daniel Kressner, Matthias Petschow and Enrique S. Quintana-Orti
    Concurrency and Computation: Practice and Experience, Volume 23(7), May 2011.
        author  = "Paolo Bientinesi and Francisco Igual and Daniel Kressner and Matthias Petschow and {Enrique S.} Quintana-Orti",
        title   = "Condensed Forms for the Symmetric Eigenvalue Problem on Multi-threaded Architectures",
        journal = "Concurrency and Computation: Practice and Experience",
        year    = 2011,
        volume  = 23,
        number  = 7,
        month   = may,
        url     = "http://hpac.cs.umu.se/~pauldj/pubs/red-multi-TR.pdf"
    We investigate the performance of the routines in LAPACK and the Successive Band Reduction (SBR) toolbox for the reduction of a dense matrix to tridiagonal form, a crucial preprocessing stage in the solution of the symmetric eigenvalue problem, on general-purpose multi-core processors. In response to the advances of hardware accelerators, we also modify the code in the SBR toolbox to accelerate the computation by off-loading a significant part of the operations to a graphics processor (GPU). The performance results illustrate the parallelism and scalability of these algorithms on current high-performance multi-core and many-core architectures.
  5. Effects of Pupil Discretization and Littrow Illumination in the Simulation of Bright-Field Defect Detection
    Stephan Rafler, Matthias Petschow, Uwe Seifert, Karsten Frenner, Jens Goeckeritz and Wolfgang Osten
    Optics Letters, Volume 34(12), pp. 1840-1842, 2009.
        author  = "Stephan Rafler and Matthias Petschow and Uwe Seifert and Karsten Frenner and Jens Goeckeritz and Wolfgang Osten",
        title   = "Effects of Pupil Discretization and Littrow Illumination in the Simulation of Bright-Field Defect Detection",
        journal = "Optics Letters",
        year    = 2009,
        volume  = 34,
        number  = 12,
        pages   = "1840--1842"
    The simulation of optical defect detection on wafers by microscopy requires several approximations. For the simulation with Fourier modal methods, the number of modes is limited by memory and time consumption. Furthermore, the illumination pupil has to be divided into discrete incidence angles for plane waves. We present a way to save the computation of most of these incidence angles. It works only for certain configurations of grating period and illumination wavelength but is an accurate and robust approximation in these cases. We present sample calculations for one- and two-dimensional periodic structures with defects.

Peer Reviewed Conference Publications

  1. An Example of Symmetry Exploitation for Energy-related Eigencomputations
    INTERNATIONAL CONFERENCE OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING 2009: (ICCMSE 2009), AIP Conference Proceedings, Volume 1504, pp. 1134-1137, 2012.
        author    = "Matthias Petschow and Edoardo {Di Napoli} and Paolo Bientinesi",
        title     = "An Example of Symmetry Exploitation for Energy-related Eigencomputations",
        year      = 2012,
        volume    = 1504,
        series    = "AIP Conference Proceedings",
        pages     = "1134--1137",
        doi       = "10.1063/1.4772126",
        url       = "http://link.aip.org/link/doi/10.1063/1.4772126"
    One of the most used approaches in simulating materials is the tight-binding approximation. When using this method in a material simulation, it is necessary to compute the eigenvalues and eigenvectors of the Hamiltonian describing the system. In general, the system possesses few explicit symmetries. Due to them, the problem has many degenerate eigenvalues. The ambiguity in choosing a orthonormal basis of the invariant subspaces, associated with degenerate eigenvalues, will result in eigenvectors which are not invariant under the action of the symmetry operators in matrix form. A meaningful computation of the eigenvectors needs to take those symmetries into account. A natural choice is a set of eigenvectors, which simultaneously diagonalizes the Hamiltonian and the symmetry matrices. This is possible because all the matrices commute with each other. The simultaneous eigenvectors and the corresponding eigenvalues will be in a parametrized form in terms of the lattice momentum components. This functional dependence of the eigenvalues is the dispersion relation and describes the band structure of a material. Therefore it is important to find this functional dependence in any numerical computation related to material properties.