A Deep Dive into Heap and Heap Sort in Python: From Beginner to Expert
Master the heap and heap sort and take your Python skills to the next level
Are you tired of slow, inefficient sorting algorithms that take forever to process large datasets?
It’s time to check out heap sort, a high-performance sorting algorithm that sorts a dataset in O(N log N) time while using only O(1) extra space.
In this article, we’ll take a deep dive into heap and heap sort in Python, exploring the inner workings of this powerful algorithm and learning how to implement it step by step.
Table of Contents
- Introduction to Heap: A Tree-Based Data Structure
- How Does Heap Sort Work
  - How to represent a heap in Python?
  - How to build a heap?
  - How to update the heap after extracting one element?
- Implementing Heap Sort in Python
- Use the heapq Module of Python
- Time and Space Complexity of Heap Sort
Introduction to Heap: A Tree-Based Data Structure
A heap is a complete binary tree that satisfies the heap property: the value of each node is ordered consistently with respect to the values of its children, either always greater than or equal to them or always less than or equal to them.
There are two types of heaps:
- Max heap: In a max heap, the value of each node is greater than or equal to the values of its children, and the root node has the maximum value in the tree.
- Min heap: In a min heap, the value of each node is less than or equal to the values of its children, and the root node has the minimum value in the tree.
How Does Heap Sort Work
Heap sort is a comparison-based sorting algorithm that uses a heap data structure to sort a list of elements. It works by building a heap from the input list, and then repeatedly extracting the root element (the maximum or minimum value) and placing it at the end of the sorted list. This process is repeated until the heap is empty and the list is fully sorted.
Simply put, there are two steps:
- Building a heap
- Extracting the elements from the heap
How to represent a heap in Python?
First and foremost, we need to know how to represent a heap properly before building it.
Given that a heap is a complete binary tree, we can just use a Python list to represent the heap.
With the list representation, the tree itself exists only in our mind. In the real code, we always operate on the list.
It works because of the following special relations between the list and the tree:
- The root of the tree is always the element at index 0.
- The left child of a node at index i is stored at index 2i+1, and the right child is stored at index 2i+2.
Therefore, we can always access the tree’s nodes easily through the index of the list.
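The index relations above can be written as small helper functions. This is a minimal sketch; the function names are illustrative, and the parent formula `(i - 1) // 2` is an addition that follows from inverting the two child formulas.

```python
def left_child(i: int) -> int:
    """Index of the left child of the node stored at index i."""
    return 2 * i + 1


def right_child(i: int) -> int:
    """Index of the right child of the node stored at index i."""
    return 2 * i + 2


def parent(i: int) -> int:
    """Index of the parent of the node stored at index i (for i > 0)."""
    return (i - 1) // 2
```

A child index is only valid if it is smaller than the length of the list, so code that uses these helpers has to check that bound before reading the element.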
For example, here is an input list named arr.
Its elements are [5, 2, 7, 1, 3].
It represents a complete binary tree, which we need to convert into a heap like the following:

- The root of the tree is arr[0].
- The left child of the root is arr[2*0+1], and the right child is arr[2*0+2].
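To make the example concrete, the following sketch reads the root and its children from `arr` and checks whether the list already satisfies the max heap property. The helper `is_max_heap` is a hypothetical name introduced here for illustration only.

```python
arr = [5, 2, 7, 1, 3]

root = arr[0]           # 5
left = arr[2 * 0 + 1]   # 2
right = arr[2 * 0 + 2]  # 7


def is_max_heap(a):
    """Return True if every node is >= each of its existing children."""
    n = len(a)
    for i in range(n):
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[i]:
                return False
    return True


print(root, left, right)  # 5 2 7
print(is_max_heap(arr))   # False, because the right child 7 is larger than the root 5
```

The input list is a valid complete binary tree, but it is not yet a max heap, which is why the building step is needed.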
How to build a heap?
The process of building a heap from a list is named heapify.
The original list is already a representation of a binary tree (the tree is in our mind) as shown above. What we need to do is convert the original tree into a heap.
Since the max heap and the min heap share the same idea, let’s just talk about how to build a max heap.
The idea is to traverse the non-leaf nodes of the binary tree from bottom to top. For each non-leaf node, compare it with its left and right children, and if a child is larger, swap the node with its largest child. After a swap, the value that moved down may still be smaller than its new children, so the comparison is repeated in that subtree until the value settles.
Why start from non-leaf nodes rather than the last node?
Because leaf nodes have no children, there is nothing to compare or swap.
Why traverse from bottom to top rather than from top to bottom?
If we go up from the bottom, every subtree below the current node is already a max heap when we reach it, so by the time we finish at the root, the largest value is guaranteed to sit there (a similar idea to bubble sort).
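The bottom-up construction described above can be sketched as follows. The names `sift_down` and `build_max_heap` are illustrative choices, and the sketch assumes the 2i+1 / 2i+2 list layout from the previous section.

```python
def sift_down(a, i, n):
    """Move a[i] down within a[:n] until it is >= both of its children."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return  # the node is already in the right place
        a[i], a[largest] = a[largest], a[i]
        i = largest  # keep checking the subtree the value moved into


def build_max_heap(a):
    """Rearrange the list a in place so that it becomes a max heap."""
    n = len(a)
    # n // 2 - 1 is the index of the last non-leaf node.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)


arr = [5, 2, 7, 1, 3]
build_max_heap(arr)
print(arr)  # [7, 3, 5, 1, 2]
```

For the example list, the node at index 1 (value 2) is swapped with its larger child 3, and then the root 5 is swapped with its larger child 7.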
How to update the heap after extracting one element?
After converting the original list into a heap, the rest is simple. We just need to repeatedly extract the root element of the heap, which is the maximum/minimum element, and put it at the end of the sorted list that needs to be returned.
However, there is one thing we need to do: update the whole heap after extracting the root node.
It needs two steps:
- Replace the root node with the last element in the heap: remove the root node (which has the largest value in a max heap or the smallest value in a min heap) and move the last element into its place.
- Heapify the heap again. Since only the new root is out of place, it is enough to sift it down until the heap property is restored.
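The two steps above can be sketched for a max heap as follows. The function names are illustrative, and `sift_down` is repeated here only so that the example runs on its own.

```python
def sift_down(a, i, n):
    """Move a[i] down within a[:n] until it is >= both of its children."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def extract_max(heap):
    """Remove and return the root of a max heap stored in a list."""
    if not heap:
        raise IndexError("extract from an empty heap")
    top = heap[0]
    last = heap.pop()        # step 1: take the last element ...
    if heap:
        heap[0] = last       # ... and put it where the root was
        sift_down(heap, 0, len(heap))  # step 2: restore the heap property
    return top


heap = [7, 3, 5, 1, 2]
print(extract_max(heap))  # 7
print(heap)               # [5, 3, 2, 1]
```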
Here is a visualized example of the whole heap building and sorting process:

Implementing Heap Sort in Python
Talk is cheap, let’s see the code:
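A minimal sketch of an in-place heap sort that follows the two steps described above might look like this. It is one possible implementation rather than a definitive one: it assumes a max heap, sorts in ascending order, and uses the illustrative names `sift_down` and `heap_sort`.

```python
def sift_down(a, i, n):
    """Move a[i] down within a[:n] until it is >= both of its children."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_sort(a):
    """Sort the list a in place in ascending order and return it."""
    n = len(a)
    # Step 1: build a max heap, visiting non-leaf nodes from bottom to top.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Step 2: repeatedly move the root (current maximum) to the end of the
    # unsorted part, shrink the heap by one, and repair it from the root.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a


print(heap_sort([5, 2, 7, 1, 3]))  # [1, 2, 3, 5, 7]
```

Because the extracted maximum is swapped into the slot that the shrinking heap just vacated, the sorted result is built inside the same list and no second list is needed.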





