Date: 21 Jan 2005
From: jyrki@diku.dk
Subject: PE-lab: Annual Report 2004

Performance Engineering Laboratory (PE-lab)
http://www.diku.dk/~jyrki/PE-lab/

* Annual Report 2004

The following people have been involved in PE-lab's activities in 2004:

Hervé Brönnimann (External collaborator, Polytechnic University)
Christopher Derek Curry (M.Sc. student)
Amr Elmasry (External collaborator, Alexandria University)
Andrei Bjarke Moskvitin Josephsen (M.Sc. student)
Claus Jensen (M.Sc. student)
Jyrki Katajainen (Group leader)
Tomi Armas Pasanen (External collaborator, University of Helsinki)
Fabio Vitale (External M.Sc. student, University of Insubria).

Most of the work done in the group is on empirical or theoretical
algorithmics. The projects mentioned below were completed or started
in 2004.


* Scientosocial service [Jyrki]

Scandinavian Algorithm Week was organized in collaboration with three
Danish universities, the University of Copenhagen being the main
organizer. The 9th Scandinavian Workshop on Algorithm Theory and its
satellite events attracted more than 120 algorithmists and computer
scientists to Denmark, and about 80 talks were given during the six
days. A longer account of the events can be found at

http://www.diku.dk/~jyrki/SWAT/press-release.html
http://www.diku.dk/~jyrki/PE-lab/News/All-news/2004.09.10-13%3A13%3A57.html

I would like to take this opportunity to thank the Department of
Computing for the support we received.


* The cost of iterator validity [Jyrki]

Iterators are objects that point to other objects.  An iterator can be
used to iterate over a collection of elements stored in a data
structure, or simply to access the element pointed to. An iterator and
the element pointed to live in a close symbiosis; when the element is
moved, the iterator may become invalid if it is not updated
accordingly. A data structure is said to provide iterator validity if
the iterators to its elements are kept valid at all times,
independently of how the elements are moved. As to the cost of
keeping iterators valid, it can be shown that for all fundamental
data structures (linked lists, resizable arrays, unordered
dictionaries, ordered dictionaries, and priority queues) an
implementation exists that supports bidirectional iterators, keeps
the iterators valid under modifications, executes all iterator
operations in constant time in the worst case, and uses space linear
in the number of elements stored. For random-access iterators the
problem appears to be more difficult. In particular, for vectors the
general modification operations insert() and erase(), which the C++
standard requires to be supported, are too strong. Efficient
solutions under these strong modification operations can be provided
too, though most of them are mainly of theoretical interest.
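
To make the notion concrete, the following small C++ program (an
illustrative sketch, not code from any of our implementations)
contrasts std::list, whose iterators stay valid when other elements
are inserted or erased, with std::vector, whose insert() may
reallocate the underlying array, move the elements, and thereby
invalidate every iterator into it:

  #include <iostream>
  #include <iterator>
  #include <list>
  #include <vector>

  int main() {
      // std::list provides iterator validity: each element lives in
      // its own node, so insertions elsewhere do not move it.
      std::list<int> l = {1, 2, 3};
      std::list<int>::iterator lit = l.begin();   // points to 1
      l.insert(std::next(lit), 42);               // lit stays valid
      std::cout << *lit << '\n';                  // prints 1

      // std::vector gives no such guarantee: insert() may reallocate
      // the underlying array and move the elements, which invalidates
      // all iterators into the vector.
      std::vector<int> v = {1, 2, 3};
      std::vector<int>::iterator vit = v.begin(); // points to 1
      v.insert(v.end(), 42);                      // may invalidate vit
      vit = v.begin();                            // must be re-fetched
      std::cout << *vit << '\n';                  // prints 1
      return 0;
  }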


* An experimental evaluation of navigation piles [Claus, Jyrki]

A navigation pile is a priority-queue structure proposed recently by
Katajainen and Vitale. It has many nice theoretical features; for
example, in many cases it can be used as a substitute for a binomial
queue even though it is only a binary tree. We have evaluated the
practical efficiency of navigation piles and made several interesting
findings, such as the two mentioned below:

1) Pointer-based navigation piles have difficulty matching the
   performance of classical heaps, which are implicit data
   structures. However, the difference is not large, and when the
   elements being manipulated are large, piles are clearly faster
   than heaps.

2) Compact navigation piles that rely on bit packing can be
   hopelessly slow on contemporary computers (see the sketch after
   this list). This led us to conclude that most packed data
   structures and algorithms relying on packing have very little
   practical relevance.
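
To give a feeling for why bit packing is costly, the sketch below
shows a hypothetical packed_array of our own (it is not the compact
navigation pile itself) that stores unsigned b-bit fields inside
64-bit words. Every read and write needs shifts and masks, and a
field may even straddle a word boundary, whereas an ordinary array
access is a single load or store:

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Minimal sketch of a packed array of n unsigned fields of 'bits'
  // bits each, stored consecutively in 64-bit words.
  class packed_array {
  public:
      packed_array(std::size_t n, unsigned b)
          : bits(b), words((n * b + 63) / 64, 0) {}

      std::uint64_t get(std::size_t i) const {
          std::size_t pos = i * bits;
          std::size_t w = pos / 64, off = pos % 64;
          std::uint64_t value = words[w] >> off;
          if (off + bits > 64)                    // field straddles two words
              value |= words[w + 1] << (64 - off);
          return value & mask();
      }

      void set(std::size_t i, std::uint64_t v) {
          std::size_t pos = i * bits;
          std::size_t w = pos / 64, off = pos % 64;
          words[w] = (words[w] & ~(mask() << off)) | ((v & mask()) << off);
          if (off + bits > 64) {                  // spill the high part over
              unsigned low = 64 - off;            // bits kept in the first word
              std::uint64_t high_mask = mask() >> low;
              words[w + 1] = (words[w + 1] & ~high_mask) | ((v & mask()) >> low);
          }
      }

  private:
      std::uint64_t mask() const {
          return (bits == 64) ? ~0ULL : ((1ULL << bits) - 1);
      }

      unsigned bits;
      std::vector<std::uint64_t> words;
  };

  int main() {
      packed_array a(1000, 5);                    // 1000 five-bit fields
      a.set(123, 17);
      return a.get(123) == 17 ? 0 : 1;            // 0 on success
  }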


* The comparison complexity of priority-queue operations [Amr, Claus, Jyrki]

A priority queue is an abstract data type that should be provided as
a standard library component. In spite of its practical importance,
it has long been an open problem how to achieve the best possible
bounds, taking constant factors into account, for the various
priority-queue operations. We have been able to devise a realization
of a priority queue which has optimal performance when the operations
to be supported are insert(), find-min(), delete-min(), and delete(),
and almost optimal performance if, in addition, decrease-key() is to
be supported.
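
The interface we have in mind can be sketched as follows (the names
are hypothetical and chosen for this report only, not those of a
finished library component). The point is that delete() and
decrease-key() act on one specific element, so they must take a
handle that insert() hands back:

  #include <cstddef>

  // Interface sketch of an addressable priority queue; delete() is
  // spelled erase() because delete is a keyword in C++.
  template <typename Value, typename Priority>
  class addressable_priority_queue {
  public:
      class node;                    // internal representation left open
      using handle = node*;          // opaque locator of a stored element

      handle insert(Value const&, Priority const&);
      handle find_min() const;       // locate an element of minimum priority
      void delete_min();             // remove an element of minimum priority
      void erase(handle);            // remove the given element: delete()
      void decrease(handle, Priority const&);   // decrease-key()
      std::size_t size() const;
  };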


* A critical analysis of randomized data structures [Hervé, Tomi, Jyrki]

The main virtue of randomized algorithms is that they often offer
good average-case performance without any assumptions about the
elements manipulated. In spite of this, randomized data structures,
in our study randomized search trees, are normally analysed under the
assumption that all elements are accessed with equal probability.
This gives an overly optimistic picture of their performance.
According to our theoretical and experimental analysis, the expected
worst-case behaviour of randomized search trees is much worse than
that of their best deterministic counterparts.
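
The phenomenon can be illustrated with a small simulation (a sketch
under our own simplifying assumptions, not the analysis itself). The
shape of a treap with random priorities, one standard form of
randomized search tree, is distributed like that of a binary search
tree built by inserting the keys in uniformly random order, so the
length of its longest search path can be estimated as follows and
compared with the ceil(log2(n+1)) levels of a perfectly balanced
tree:

  #include <algorithm>
  #include <cmath>
  #include <cstddef>
  #include <iostream>
  #include <numeric>
  #include <random>
  #include <vector>

  int main() {
      const std::size_t n = 1u << 20;            // number of keys
      std::vector<int> keys(n);
      std::iota(keys.begin(), keys.end(), 0);
      std::mt19937 generator(2004);
      std::shuffle(keys.begin(), keys.end(), generator);

      struct node { int key; node* left = nullptr; node* right = nullptr; };
      std::vector<node> pool(n);                 // preallocated nodes
      node* root = nullptr;
      std::size_t longest = 0;

      for (std::size_t i = 0; i != n; ++i) {
          pool[i].key = keys[i];
          node** link = &root;
          std::size_t depth = 1;
          while (*link != nullptr) {             // plain unbalanced insertion
              link = (keys[i] < (*link)->key) ? &(*link)->left : &(*link)->right;
              ++depth;
          }
          *link = &pool[i];
          longest = std::max(longest, depth);
      }

      std::size_t balanced =
          static_cast<std::size_t>(std::ceil(std::log2(n + 1.0)));
      std::cout << "longest search path, random shape:    " << longest << '\n'
                << "longest search path, perfect balance: " << balanced << '\n';
      return 0;
  }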


* Building heaps faster [Jyrki]

The construction of a heap is a classic data-structuring problem. For
all practical purposes the problem has been solved satisfactorily, but
as to its comparison complexity there is still a gap between the best
known lower bounds and upper bounds. Several conjectures have been
made about the optimal behaviour in the worst case and in the average
case. We have tried hard to disprove the stated conjectures, and have
been able to make some progress in the area of average-case algorithms.
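
As a point of reference, the classic bottom-up construction due to
Floyd is sketched below with a counter attached to the element
comparisons (a textbook baseline, not one of the algorithms
investigated in this project); the number of comparisons per element
that it reports is what improved constructions try to beat:

  #include <cstddef>
  #include <iostream>
  #include <random>
  #include <utility>
  #include <vector>

  // Element comparisons are routed through less_than() so that they
  // can be counted.
  static std::size_t comparisons = 0;

  static bool less_than(int a, int b) { ++comparisons; return a < b; }

  // Restore max-heap order below position i in an array of size n.
  static void sift_down(std::vector<int>& a, std::size_t i, std::size_t n) {
      while (2 * i + 1 < n) {
          std::size_t child = 2 * i + 1;              // left child
          if (child + 1 < n && less_than(a[child], a[child + 1]))
              ++child;                                // pick the larger child
          if (!less_than(a[i], a[child]))
              break;                                  // heap order holds
          std::swap(a[i], a[child]);
          i = child;
      }
  }

  int main() {
      const std::size_t n = 1u << 20;
      std::vector<int> a(n);
      std::mt19937 generator(2004);
      for (int& x : a)
          x = static_cast<int>(generator() & 0x7fffffff);

      for (std::size_t i = n / 2; i-- > 0; )          // internal nodes, bottom up
          sift_down(a, i, n);

      std::cout << "elements: " << n << ", comparisons: " << comparisons
                << " (about " << static_cast<double>(comparisons) / n
                << " per element)" << '\n';
      return 0;
  }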