For each project, I provide a number of questions and key points to be answered/addressed in your reports. Use the given links as starting points. Do some research and find more -good- references to deepen your knowledge on the matter.

---------------------------
Expression templates and their use in linear algebra:
- What are expression templates (ET)?
- How are they implemented? (Give details and examples)
- Which tools/libraries make use of them? (Give a general overview of these libraries)
- Which linear algebra operations benefit from this technique? (again, give examples)
- What are the limitations of ETs?
* Keywords: Expression Templates, lazy evaluation, memory bound, compute bound.
* Links:
    * Expression Templates by Todd Veldhuizen: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.248
    * Techniques for Scientific C++ by Todd Veldhuizen: www.cs.indiana.edu/pub/techreports/TR542.pdf

---------------------------
Smart Expression Templates and their use in linear algebra:
- What are expression templates (ET) / smart expression templates (SET)?
- What is the implementation idea/technique behind them? (Give details and examples)
- Which tools/libraries make use of them? (Give a general overview of these libraries)
- Which linear algebra operations benefit from this technique? (again, give examples)
- Which limitations of ETs do SETs overcome and by means of what techniques?
* Keywords: Expression Templates, Smart Expression Templates, memory bound, compute bound.
* Links:
    * Expression Templates by Todd Veldhuizen: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.248
    * arxiv.org/pdf/1104.1729

---------------------------
Blitz / Blaze / Eigen / Armadillo
- Introduction: What is it? Motivation for the project to start? Target optimizations / use cases?
- API
    + Operands and Datatypes
    + Classes of operations provided
    + Features (triangularity, symmetry, etc)
    + Examples
- Which problems/operations are suitable? Which are not?
- Technology behind it / which optimizations are applied?
- Custom and/or own code? Relies on the BLAS library?
- Comparison of performance results.
* Keywords: Blitz/Blaze/Eigen/Armadillo, (expression) templates, BLAS.
* Links:
    * http://blitz.sourceforge.net/
    * https://code.google.com/p/blaze-lib/
    * http://eigen.tuxfamily.org/index.php?title=Main_Page
    * http://arma.sourceforge.net/

---------------------------
TCE: loop fusion and loop tiling
- Intro: what is TCE?
- Which problem do they address? Main ideas/optimizations to address the problem?
- Discuss the details of some of these ideas. Focus on minimization of operation count and memory requirements, and especially on the decision algorithms for the loop transformations.
- Show some performance results.
* Keywords: tensor contraction engine, loop fusion, loop tiling.
* Links:
    * http://www.csc.lsu.edu/~gb/TCE/
    * http://www.csc.lsu.edu/~gb/TCE/Publications/SpaceTime-PLDI02.pdf

---------------------------
Spiral: superoptimization
- Introduction to Spiral.
- General overview of its architecture: phases, transformation rules, search, etc.
- Focus and details on the generation of small kernels for permutations based on the superoptimization technique (see link below).
- Performance results.
* Keywords: Spiral, FFT, superoptimization, SIMD, intrinsics.
* Links:
    * http://spiral.net/
    * http://spiral.ece.cmu.edu:8080/pub-spiral/abstract.jsp?id=160

---------------------------
LLVM:
    a) Code generation and auto-vectorization
    b) Intermediate Representation (IR) and target description (.td) files
- Introduction to LLVM.
- General overview of the LLVM framework/architecture: from program to machine code.
- Focus on the specific module(s) for each topic.
    a) Code generation and auto-vectorization.
    b) Intermediate representation and the description of the target architecture in the td files.
* Keywords: LLVM, vectorization, SIMD, intermediate representation, target description files.
* Links:
    * http://llvm.org/
    * http://llvm.org/docs/
    * (a) http://llvm.org/docs/Vectorizers.html
    * (b) http://llvm.org/docs/TableGen/LangIntro.html

---------------------------
Peephole optimization
- General introduction to code generation and compiler optimizations.
- What is the idea behind peephole optimization?
- Give examples of characteristic peephole (local) optimizations.
- Automatic generation of peephole optimizers.
* Keywords: Code generation, peephole optimizers.
* Links:
    * Compilers: Principles, Techniques, and Tools (Chapter 8). Available in the CS library.
    * Using Peephole Optimization on Intermediate Code: http://dspace.ubvu.vu.nl/bitstream/handle/1871/2606/11047.pdf
    * Automatic Generation of Peephole Superoptimizers: http://theory.stanford.edu/~aiken/publications/papers/asplos06.pdf

---------------------------
Roofline model: memory-bound operations
- What is the roofline model?
- Goals of the roofline model?
- Which architecture information/knowledge is required?
- How to get that information? (manufacturer, empirical, tools, ...).
- Concept of arithmetic intensity, compute-bound vs memory-bound, ...
- Give an idea and examples of at least 3 optimizations to improve memory-bound code.
* Keywords: roofline model, arithmetic intensity, memory- vs compute-bound, stream.
* Links:
    * An Insightful Visual Performance Model for Multicore Architectures: http://www.eecs.berkeley.edu/~waterman/papers/roofline.pdf
    * http://www.spiral.net/software/roofline.html
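
As a starting point for the roofline questions above, the basic bound can be sketched in a few lines: attainable performance is the minimum of peak compute and memory bandwidth times arithmetic intensity. The machine numbers (`PEAK_GFLOPS`, `BANDWIDTH_GBS`) and the function names are made up for illustration, and the daxpy traffic count assumes no cache reuse and ignores write-allocate effects. In your report, replace them with figures from the manufacturer or from a measured benchmark such as stream.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound in GFlop/s for a kernel with the given
    arithmetic intensity (flops per byte of memory traffic)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which the bound switches from
    memory-bound to compute-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine, values chosen only for illustration.
PEAK_GFLOPS = 100.0
BANDWIDTH_GBS = 25.0

# daxpy, y[i] = a * x[i] + y[i], in double precision:
# 2 flops per element; load x, load y, store y = 3 * 8 bytes
# (assumption: no cache reuse, write-allocate traffic ignored).
daxpy_intensity = 2 / 24

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS)
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, daxpy_intensity)
    kind = "memory-bound" if daxpy_intensity < ridge else "compute-bound"
    print(f"ridge point: {ridge:.2f} flops/byte")
    print(f"daxpy: {daxpy_intensity:.3f} flops/byte, "
          f"bound {bound:.2f} GFlop/s ({kind})")
```

Kernels whose intensity lies left of the ridge point are limited by memory bandwidth, so the optimizations you discuss should either raise the intensity or bring achieved bandwidth closer to the machine's limit.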