PhD candidate at Radboud University. thomas.koopman@ru.nl
(GPG: 964110A2)
Research Interests
Broadly speaking, I am interested in scientific computing.
More specifically, my work can be divided into three categories:
Parallel Algorithms. I have worked on parallel algorithms for
clusters, GPUs, multiprocessors, and vector processors (SIMD).
Productivity and Portability through
SaC. Single-Assignment
C (SaC) is a minimalistic language for scientific computing that does
not expose memory or parallelism. This makes it easy to use. The
compiler can generate parallel code for CPU, GPU, and clusters from a single
specification. If the parallelisation is not too sophisticated, the
performance is comparable to code written in C or Fortran.
Numerical Stability. This is a new interest of mine: getting a
grip on the accuracy of numerical algorithms. I am currently
working on a problem in computational geometry.
Publications
Minimizing Communication in the Multidimensional FFT.
A new parallel algorithm for computing the discrete Fourier transform
on supercomputers.
Rank-Polymorphism for Shape-Guided Blocking.
Shows how locality optimisations can be elegantly expressed in
a language with support for arrays of arbitrary dimension.
Shray: An Owner-Compute Distributed Shared-Memory System.
A software system for easier programming of supercomputers.
Modulo in high-performance code: strength reduction for
modulo-based array indexing in loops.
A compiler optimisation that translates elegant, modulo-based
formulations of a certain class of computations (stencils) into
efficient code.
Under Submission
Multi-GPU Code Generation for Out-Of-Core Problems.
A compiler backend for the programming language SaC that can leverage
multiple GPUs, even for data structures that do not fit on any single
GPU. Submitted to TFP.
Partitioning In-Place on Massively Parallel Systems.
An algorithm for splitting an array in two on a GPU without needing
a significant amount of additional memory. Submitted to Euro-Par.
Comparing Functional Array Languages.
A large collaborative paper covering Futhark, Accelerate, APL, DaCe, and SaC.
Submitted to JPDC.
Not Peer-Reviewed
The Tensor Product of Bulk Synchronous Parallel Algorithms.
Derives a multidimensional parallel FFT algorithm from a
one-dimensional parallel one using category theory.
This is my master's thesis; the paper
Minimizing Communication in the Multidimensional FFT
is essentially this thesis with the category theory stripped out.