The external heaps described above has been implemented in heaplab (extsort1/, extsort3/, exthole/) using a 4-heap as the internal sorting method. The program in extsort1/ is a plain external sort with no optimizations. The one in extsort3/ includes the optimizations in section 6.4.2 to reduce the internal cost, and the last implementation in exthole/ implements a heap with holes as described in 6.4.3, but without indirect addressing of pages, so it is not inplace.
The page size was chosen as 1/4 of the first level cache size and the elements sorted was 8 byte floating point values.
The programs were run on the two standard test machines and the results for the are shown in figure 6.2 and 6.3 respectively. The standard test machines are an 332MHz PowerPC with 32 KB first level cache and 256KB second level cache, and an 160MHz P2SC with 128KB cache.
Since the problem sizes used for the test fit in main memory, main memory should be considered external for the purpose of this test.
Figure 6.2: External heap relative to aligned d-heaps on PPC
Figure 6.3: External heap relative to aligned d-heaps on P2SC
The observations that can be made from these graphs are:
Another test was performed to verify the claim (section 6.3) that the number of comparisons executed during a sort does not depend on the page size.
The numbers of comparisons per element sorted, for a sort of 4096K elements, are shown in figure 6.1, for a number of page sizes. The optimized version includes the optimizations of section 6.4.2 while the unoptimized version does not.
|Heap w. Holes||42.18||33.80||34.75||35.74||36.73||41.61|
The number of comparisons executed by the basic heap when sorting 4096K elements is 40.85. The reason why the external heap with a page size of 1 element does not match this value, is that the current implementation does a few comparisons per page that are unnecessary when the pagesize is only one element. With careful coding this can be avoided, but since it makes absolutely no difference it has not been done.
One may wonder how effective the optimization of section 6.4.2 is. A quick test shows that a sort of 8192 pages with a page size of 500 elements, resulted in 94461 iterations in siftdown with the optimization activated in the 81380 cases. This corresponds to a hit rate of 86%.