Martin Kronbichler
Lehrstuhl für Numerische Mechanik
Technische Universität München
Munich, Germany
My talk will present our work on fast matrix-free methods in the deal.II finite element library. The implementation is based on sum factorization techniques for quadrilateral and hexahedral elements, which combine low operation counts with a memory-efficient data layout. With this design, matrix-free operator evaluation is two to five times faster than sparse matrix-vector products even for quadratic elements. Because sum factorization has lower computational complexity, the gap widens as the polynomial degree increases. In particular, the cost of matrix-free operator evaluation per degree of freedom is almost constant for element degrees between two and ten. The algorithms are designed to support adaptive meshes with hanging nodes. In terms of generic finite element design, we aim to use similar implementations for continuous and discontinuous bases. For both continuous and discontinuous finite element formulations, our methods reach 25 to 70 percent of the arithmetic peak performance of modern CPUs, which is more than an order of magnitude higher than matrix-based methods and very competitive in the context of PDE solvers. Despite the high level of optimization, we aim for a generic programming interface in which the weak form at quadrature points, the access to source and destination vectors, and the desired order of derivatives are specified in a transparent way. The matrix-free kernels have been used in several complex application scenarios in computational fluid dynamics and wave propagation, including matrix-free multigrid methods, and have run on up to 150,000 processor cores.
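
To illustrate the idea behind sum factorization, the following is a minimal sketch in plain Python, not the deal.II implementation. It interpolates nodal coefficients on a single 2D tensor-product element to a tensor grid of evaluation points, once by applying the full Kronecker-product operator and once by two sweeps of one-dimensional contractions. The function names, the matrix `S` of 1D shape values, and the sample data are assumptions chosen for illustration. With n basis functions and n points per direction, the naive variant needs on the order of n^(2d) operations in d dimensions, whereas the factorized variant needs on the order of d n^(d+1).

```python
def matmul(A, B):
    """Dense product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def evaluate_naive(S, U):
    """Apply the full tensor-product operator entry by entry.

    S[q][i] is the value of 1D shape function i at 1D point q.
    U[i1][i0] is the coefficient of the basis function with index i0
    in x and i1 in y. Cost: nq^2 * n^2 multiply-adds.
    """
    nq, n = len(S), len(S[0])
    V = [[0.0] * nq for _ in range(nq)]
    for q1 in range(nq):
        for q0 in range(nq):
            for i1 in range(n):
                for i0 in range(n):
                    V[q1][q0] += S[q1][i1] * S[q0][i0] * U[i1][i0]
    return V


def evaluate_sum_factorized(S, U):
    """Same result via two sweeps of 1D contractions.

    Cost: n * n * nq + nq * n * nq multiply-adds.
    """
    # Sweep in x: tmp[i1][q0] = sum_i0 U[i1][i0] * S[q0][i0]
    tmp = matmul(U, transpose(S))
    # Sweep in y: V[q1][q0] = sum_i1 S[q1][i1] * tmp[i1][q0]
    return matmul(S, tmp)


# Hypothetical sample data: linear Lagrange basis on [0, 1] evaluated at
# the points 0.25 and 0.75, and nodal values of f(x, y) = 1 + x + 2y.
S = [[0.75, 0.25],
     [0.25, 0.75]]
U = [[1.0, 2.0],
     [3.0, 4.0]]

V = evaluate_sum_factorized(S, U)
```

Both functions return the values of the interpolated function at the tensor grid of points. Only the second one exploits the tensor-product structure, which is what keeps the cost per degree of freedom growing only linearly with the polynomial degree.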