Date: 03 Dec 2002
From: jyrki@diku.dk
Subject: Oral defence of M.Sc. thesis
Oral defence of M.Sc. thesis: Cache-oblivious algorithms in practice
Defenders: Jesper Holm Olsen & Søren Christian Skov
Censor: Jørgen Staunstrup
Supervisor: Jyrki Katajainen

Time: Wednesday, 18 December 2002, 15.15–17.00
Place: Auditorium Lille-UP1 at DIKU

The defence will be held in Danish.

Abstract:

The memory of contemporary computers is structured in a hierarchy of successively larger, slower, and cheaper memory levels. Each level contains a working copy---or cache---of the level above. Recent developments in processor and memory technology imply an increasing penalty if programs do not take optimal advantage of the memory hierarchy. To alleviate this, the notion of cache-oblivious algorithms has been developed. The main idea behind cache-oblivious algorithms is to achieve optimal use of caches on all levels of a memory hierarchy without knowledge of their sizes. The purpose of this thesis is to examine cache-oblivious algorithms from a practical point of view. The thesis consists of a discussion of the theory, and a thorough benchmarking of some of the most recently published cache-oblivious algorithms, in the form of two priority-queue algorithms.

The cache-oblivious theory has, so far, not incorporated the virtual memory system. We show that each of the levels in the virtual memory system can be seen as a separate cache level, and is therefore also encompassed by the theoretical model.

Several practical details that are not explained in the original articles had to be handled in order to implement the selected algorithms. Most importantly, we clarify how the algorithms should be constructed to use linear space.

We furthermore develop a new optimal cache-oblivious priority deque, based on one of the cache-oblivious priority queues.

The benchmarking results show that, for the cache-oblivious algorithms implemented, the extra work incurred by making algorithms cache oblivious is too big to make them competitive on all levels in the memory hierarchy.

Cache-oblivious algorithms do outperform traditional RAM algorithms when working on data sets larger than main memory, but in the lower levels of the memory hierarchy, the traditional RAM algorithms are faster, due to a smaller amount of work. The cache-aware implementations exhibit good use of the caches without too much extra work, and are therefore significantly better.

Based on our investigation of two priority-queue algorithms, we conclude that the cache-oblivious approach is currently not able to offer a competitive priority queue, except when working on data sets larger than main memory.

The text, the source code, and all benchmark results can be downloaded from: http://www.dunkel.dk/thesis

PE-lab's homepage: http://www.diku.dk/~jyrki/PE-lab/