Heap Sort is a comparison-based sorting algorithm that utilizes a Max Heap data structure to sort elements with a time complexity of O(n log n) and can be implemented to sort in place, requiring no additional space.
Abstract
Heap Sort is an optimization of the selection sort algorithm that operates by initially building a Max Heap from the input data. The algorithm iteratively removes the maximum element from the heap and places it into the sorted portion of the array. It then rebuilds the heap by shifting down the remaining elements to maintain the heap property. The process continues until the unsorted portion is empty. The article describes two implementations: one that uses a subclassed "SortingHeap" with an updated "Shift down" method to manage the boundary between sorted and unsorted elements, and another that sorts the array in place without additional space overhead. Both methods rely on the "heapify" function to maintain the heap property during the sorting process. The in-place sorting method is particularly advantageous as it achieves the O(n log n) time complexity without requiring extra space, making it an efficient choice for sorting large datasets.
Opinions
The author suggests that Heap Sort is an improvement over the traditional selection sort due to its use of the heap data structure.
By using a Max Heap, the algorithm ensures that the maximum element is always at the root, allowing for efficient extraction of the maximum element during the sorting process.
The author emphasizes the importance of the "Shift down" method in the Heap Sort algorithm, which is modified to handle the boundary between the sorted and unsorted parts of the array.
The article advocates for the in-place sorting method as a space-efficient alternative to the traditional Heap Sort, which requires additional space proportional to the size of the input data.
The author provides a comparative perspective by mentioning Quick Sort and Bucket Sort, hinting at the broader context of sorting algorithms and their respective advantages and trade-offs.
Heap Sort
Heap Sort is an improved Selection Sort based on the (max) Heap data structure;
It divides elements into two subparts (sorted and unsorted). Heap Sort iteratively reduces the unsorted part, removing the “max” element and moving it to the sorted one until the unsorted part is empty.
First, build the Max Heap:
Then sort the heap elements:
There are two ways, use a Heap directly (with some update) or create some functions without the Heap (based on Heap code);
To differentiate the sorted and unsorted parts, we need to update the Heap we created in an earlier article, let subclass it “SortingHeap,” and redefine the “Shift down” method to mark this border between unsorted and sorted parts.
We need a “Shift down” method that takes an extra parameter to define the stop limit when shifting down. But first, let’s override the actual “Shift down” method to call the new “Shift down” method with the extra parameter; in this case, let’s give the count to shift down in the full array.
The original method, “Shift down,” was checking for a given node the priority with its children and was swapping it with a child if one of them had a higher priority, and calling “Shift down” to the updated child. We want the same logic, but only if the child is under the limit. Then now, we check if any child is under that limit and has a higher priority before swapping and shifting down.
Now, we can create a function that receives an array, then create a Max Heap with this array.
Because the Max Heap always has the max element in the first position. We can iterate through each element in a reverse way (from the end to the beginning of the array), swap it with the first element (max), and rebuild the Heap (“heapify” to make the first element the maximum again) down to the swapped element (but not included). Eventually, we return the sorted elements.
This sorting method has a time complexity O(n log n) but also requires extra space O(n) to store the element in the Heap. To improve this, we could create a function that takes an array and “heapify” and sorts in place; it will be the same logic of using a heap but trimmed of the unnecessary code plus updating the given array; this is the code:
Like the sort method using a heap, we need to “build” the Heap and then sort the elements. To build the Heap, we call the “heapify method” on the first half of the array (as we remember for the Heap chapter, the second part of the elements is the last level of the binary tree and will be updated by updating the parents in the first half of the array).
Then, like the first sorting function, we will iterate through all elements, from the end to the beginning of the array, swapping the first element (max) with the current element and heapify the unsorted part again (up to but not including the current element).
All the magic happens in the “heapify” function below:
This function receives three parameters: the array to update, the count (limit to stop shifting down, separating the unsorted and sorted parts), and the current element’s index.
We set the current index with the given index, then calculate its children’s indexes.
If the left child is under the given limit and has higher priority, we update the current index with the left index and repeat for the right index.
If the current index does not change, this element is already in place; there is no need to do anything else. Otherwise, we swap elements at the current and given index and continue to shift down by calling the “heapify” function with the current index.
Using these sorting methods, we get a time complexity O(n log n) and do not require any extra space O(1). Most of the code from the Heap has been trimmed to get a relatively straightforward logic to make this sorting in place.