Master's thesis by Ramón Salvador Soto Rouco (August 2010)
The increasing computation power in modern computers in the form of several cores per processor and more processors, makes it necessary to rethink or to redesign sequential algorithms and data structures. An obvious approach would be to use parallelism. Since there are several different programming models to implement parallelism, which one should be used?
In this thesis various parallel programming models are studied from a practical approach and evaluated with emphasis on efficiency on multi-core computers. A multi-threaded framework, based on a shared-memory model, is designed and implemented. The framework offers a pure and simple C++ API, a limit on usage of threads, load-balance between work performed by threads, atomicity when a shared data structure state is changed and low memory usage.
A helping technique is introduced. This technique can distribute work between inactive processors in logarithmic time. Also two types of barriers are introduced: a global thread barrier, which reduces the use of synchronization barriers; and a job barrier, which can execute the work binded to a barrier if no other thread is performing the task instead of just waiting.
Satisfactory results are presented when comparing the presented framework to other available frameworks and libraries. The comparison shows that similar practical efficiency on the parallel implementation of sorting algorithms is obtained, sometimes even better, with use of less memory.
The whole thesis is available for download: [pdf]