In the context of the genome-wide association study (GWAS), one has to solve long sequences of generalized least-squares problems; such a problem presents two limiting factors: execution time (often in the range of days or weeks) and data management (in the order of
Terabytes).
We present algorithms that obviate both issues.
By taking advantage of domain-specific knowledge, exploiting parallelism provided by multicores and GPU, and handling data effciently, our algorithms attain unequalled high performance. When compared to GenABEL, one of the most widely used libraries for GWAS, we obtain speedups up to a factor of 50.