avatarbtd

Summary

The provided content offers comprehensive strategies and practical examples for effectively solving problems involving heaps (priority queues) in programming, particularly in the context of LeetCode challenges.

Abstract

The web content titled "LeetCode Pattern: 18 Tips & Strategies for Solving Heap (Priority Queue) Problems (Including 10 Classic Problems & Solutions)" provides an in-depth guide to understanding and utilizing heaps in algorithmic problems. It begins by explaining the fundamental concepts of heaps and their typical use cases, such as maintaining a collection of elements with associated priorities. The guide emphasizes the importance of understanding the problem, choosing the right type of heap, and maintaining heap invariants during operations. It covers various scenarios where heaps are useful, including finding the k-th smallest or largest element, merging sorted lists, and implementing priority queues. The content also delves into advanced topics such as using heaps for sorting (Heapsort), dynamic updates, graph algorithms like Dijkstra's and Prim's, median maintenance, sliding window problems, and custom priority queue implementations. Throughout the text, the author provides code snippets in Python to illustrate the concepts and offers tips for handling edge cases and duplicate elements.

Opinions

  • The author suggests that a clear understanding of the problem statement is crucial for determining the appropriate heap operations to use.
  • The choice between a min-heap and a max-heap is problem-dependent, and making the right choice is key to an efficient solution.
  • Heaps are particularly useful for problems that require efficient retrieval of the smallest or largest element, such as finding the k-th smallest or largest element in a collection.
  • The use of built-in libraries like Python's heapq is recommended for ease of implementation and efficiency.
  • The guide conveys that maintaining heap invariants during insertion, deletion, and heapify operations is essential for the correctness of heap-based algorithms.
  • Dijkstra's and Prim's algorithms are highlighted as classic examples where priority queues (heaps) are used to find the shortest path and minimum spanning tree, respectively.
  • The author emphasizes the efficiency of heaps for dynamic updates, such as insertions and deletions, due to their logarithmic time complexity.
  • For problems involving a stream of elements, such as maintaining the median, the author recommends using two heaps: one for the lower half and one for the upper half of the data.
  • The guide suggests that when dealing with sliding window problems, a heap can be instrumental in tracking the maximum or minimum element within the window.
  • The author provides a custom implementation of a priority queue using a binary heap for languages that do not have a built-in priority queue data structure.
  • The content acknowledges the challenge of handling duplicate elements in heaps and offers solutions such as using additional information or modifying comparison functions.

LeetCode Pattern: 18 Tips & Strategies for Solving Heap (Priority Queue) Problems (Including 10 Classic Problems & Solutions)

Photo by Linus Belanger on Unsplash

I. OVERVIEW:

Heaps, particularly Priority Queues, are data structures that maintain a set of elements with associated priorities.

  • A specialized tree-based data structure that satisfies the heap property.
  • A Heap, often implemented as a Priority Queue, is a data structure that maintains a partially ordered tree structure. The key feature of a heap is that at any node, the key of that node is less than or equal to the keys of its children. This makes it suitable for efficiently maintaining the maximum or minimum element in a collection.
  • Heaps can be implemented as a binary heap or a more generalized structure like a Fibonacci heap. The most common use case is implementing priority queues, where elements with higher priority (or lower priority depending on the use case) are dequeued before those with lower priority.

II. TIPS & STRATEGIES FOR SOLVING HEAP (PRIORITY QUEUE) PROBLEMS:

1. Understand the Problem:

  • Clearly understand the problem statement and the operations you need to perform using a heap. Identify whether it involves finding the maximum, minimum, or k-th element.

1.1. Common Situations Where Heaps Are Useful and How to Identify Them:

i. Finding Kth Smallest/Largest Element:

  • If the problem involves finding the Kth smallest or largest element in a collection, consider using a min-heap for Kth smallest and max-heap for Kth largest.

ii. Merge K Sorted Lists:

  • If you need to merge K sorted lists, a min-heap can help efficiently get the smallest element among the current elements from all lists.

iii. Top K Elements:

  • When you are asked to find the top K elements based on some criteria, a heap can be useful. For example, finding the K most frequent elements or K largest elements.

iv. Dijkstra’s Algorithm:

  • If the problem involves finding the shortest path in a weighted graph, Dijkstra’s algorithm typically uses a priority queue (min-heap).

v. Prim’s Algorithm:

  • When solving minimum spanning tree problems, Prim’s algorithm also uses a priority queue.

vi. Identify Heaps:

  • Look for keywords or phrases like: “smallest,” “largest,” “top K,” “minimum,” “maximum,” “priority,” “shortest path,” or “minimum spanning tree.”

1.2. General Approach for Solving Heaps:

i. Choose the Appropriate Heap Type:

  • Min-heap for finding smallest elements.
  • Max-heap for finding largest elements.

ii. Operations on Heap:

  • Insert elements into the heap.
  • Extract the minimum/maximum element from the heap.
  • Update the heap if necessary.

iii. Use Heap Libraries:

  • In languages like Python and Java, there are built-in libraries for heaps. For example, in Python, you can use the heapq module.

iv. Handle Edge Cases:

  • Ensure you handle edge cases properly, such as an empty heap or when the input size is less than K.

2. Choose the Right Heap Type:

  • Choose the appropriate type of heap based on the problem requirements. A max heap is suitable for finding the maximum element, while a min heap is used for finding the minimum element.

2.1. Min-Heap:

  • Use a min-heap when you need to find the smallest element or the K smallest elements.
  • Commonly used for problems such as finding the Kth smallest element, merging K sorted lists, or extracting the minimum element in a continuous stream of data.
import heapq

# Example: Creating a min-heap
min_heap = [3, 1, 4, 1, 5, 9, 2, 6]
heapq.heapify(min_heap)

2.2. Max-Heap:

  • Use a max-heap when you need to find the largest element or the K largest elements.
  • Commonly used for problems such as finding the Kth largest element, implementing a priority queue for maximum priority, or extracting the maximum element in a continuous stream of data.
import heapq

# Example: Creating a max-heap
max_heap = [-3, -1, -4, -1, -5, -9, -2, -6]
heapq.heapify(max_heap)

3. Use Priority Queues for k-th Element Problems:

  • Priority queues are often used to find the k-th smallest or largest element efficiently. Familiarize yourself with operations like push, pop, and top for min and max heaps.

3.1. Min-Heap for k-th Smallest Element:

  • Use a min-heap to efficiently find the k-th smallest element in a collection.
  • The top of the min-heap will always contain the smallest element.
  • Perform k-1 pop operations to get the k-th smallest element.
import heapq

# Example: Using min-heap for k-th smallest element
def kth_smallest(nums, k):
    min_heap = nums[:k]
    heapq.heapify(min_heap)

    for num in nums[k:]:
        if num > min_heap[0]:
            heapq.heappop(min_heap)
            heapq.heappush(min_heap, num)

    return min_heap[0]

# Usage
nums = [3, 1, 4, 1, 5, 9, 2, 6]
k = 3
result = kth_smallest(nums, k)
print(result)  # Output: 3

3.2. Max-Heap for k-th Largest Element:

  • Use a max-heap to efficiently find the k-th largest element in a collection.
  • The top of the max-heap will always contain the largest element.
  • Perform k-1 pop operations to get the k-th largest element.
import heapq

# Example: Using max-heap for k-th largest element
def kth_largest(nums, k):
    max_heap = [-num for num in nums[:k]]
    heapq.heapify(max_heap)

    for num in nums[k:]:
        if -num > max_heap[0]:
            heapq.heappop(max_heap)
            heapq.heappush(max_heap, -num)

    return -max_heap[0]

# Usage
nums = [3, 1, 4, 1, 5, 9, 2, 6]
k = 3
result = kth_largest(nums, k)
print(result)  # Output: 4

4. Heapify Existing Data:

  • If the problem involves working with an existing array, consider heapifying the array initially to build a heap efficiently.
  • Heapify is often done bottom-up, starting from the last non-leaf node and moving towards the root.
  • Heapifying an existing array is an efficient way to convert the array into a heap, either a min-heap or a max-heap, depending on the problem requirements.
# Example of heapifying an existing array to build a min-heap

import heapq

def heapify(nums):
    # Start from the last non-leaf node and move towards the root
    n = len(nums)
    start = (n // 2) - 1

    # Perform heapify for each non-leaf node
    for i in range(start, -1, -1):
        min_heapify(nums, n, i)

def min_heapify(nums, n, i):
    smallest = i
    left_child = 2 * i + 1
    right_child = 2 * i + 2

    # Compare with left child
    if left_child < n and nums[left_child] < nums[smallest]:
        smallest = left_child

    # Compare with right child
    if right_child < n and nums[right_child] < nums[smallest]:
        smallest = right_child

    # Swap if needed and continue heapify
    if smallest != i:
        nums[i], nums[smallest] = nums[smallest], nums[i]
        min_heapify(nums, n, smallest)

# Example usage
nums = [3, 1, 4, 1, 5, 9, 2, 6]
heapify(nums)
print(nums)  # Output: [1, 1, 2, 3, 5, 9, 4, 6]
  • After heapification, the array satisfies the heap property, making it easier to perform efficient operations such as extracting the minimum element or inserting elements.

5. Maintain Invariants:

  • While performing operations on the heap, ensure that the heap properties (min heap or max heap) are maintained. This is crucial for the correctness of your solution.

5.1. Key Points for Heap Operations:

i. Insertion:

  • When inserting an element into the heap, ensure that the heap property is maintained.
  • For a min-heap, the newly inserted element should be swapped with its parent until the heap property is satisfied.
  • For a max-heap, the process is similar, but the element is swapped with its parent until it is greater than or equal to its parent.

ii. Deletion:

  • When extracting the minimum or maximum element from the heap, ensure that the heap property is restored after the extraction.
  • For a min-heap, replace the root with the last element and then perform a heapify operation to maintain the min-heap property.
  • For a max-heap, the process is similar, but you replace the root with the last element and then perform a max-heapify operation.

iii. Heapify:

  • When heapifying an array or subtree, make sure to follow the proper heapify procedure for the specific type of heap (min-heap or max-heap).
  • Heapify should be done bottom-up, starting from the last non-leaf node and moving towards the root.

iv. Updating Elements:

  • If an element’s value is updated in a way that may violate the heap property, adjust the element’s position accordingly.
  • For example, if you decrease the value of an element in a min-heap, you may need to move it up in the heap.
#  Example illustrating the importance of maintaining heap invariants during insertion

import heapq
# Inserting an element into a min-heap
min_heap = [3, 1, 4, 1, 5, 9, 2, 6]
# Inserting a new element (7) into the min-heap
heapq.heappush(min_heap, 7)
print(min_heap)  # Output: [1, 1, 2, 3, 5, 9, 4, 6, 7]
  • heapq.heappush is used to insert a new element into the min-heap, and the function automatically maintains the min-heap property.

6. Use Heap for Sorting:

  • Heapsort is an in-place sorting algorithm that uses a binary heap. It has a time complexity of O(n log n) and can be beneficial for certain scenarios.

6.1. How Heapsort Works:

i. Heap Construction:

  • Build a max-heap (for ascending order) or a min-heap (for descending order) from the input array. This is done in a bottom-up manner, starting from the last non-leaf node and moving towards the root.

ii. Sorting:

  • After constructing the heap, repeatedly extract the maximum (for ascending order) or minimum (for descending order) element from the heap and place it at the end of the array.
  • After each extraction, the heap property is restored, allowing for the next maximum or minimum element to be efficiently extracted.

iii. In-Place Sorting:

  • Since heapsort operates in-place, no additional data structures are required other than the input array.

iv. Time Complexity:

  • The time complexity of heapsort is O(n log n) in the worst case.
def heapsort(arr):
    n = len(arr)

    # Build max-heap
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(arr, n, i)

    # Extract elements from the heap one by one
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]  # Swap the maximum element with the last element
        max_heapify(arr, i, 0)  # Restore max-heap property

def max_heapify(arr, n, i):
    largest = i
    left_child = 2 * i + 1
    right_child = 2 * i + 2

    if left_child < n and arr[left_child] > arr[largest]:
        largest = left_child

    if right_child < n and arr[right_child] > arr[largest]:
        largest = right_child

    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        max_heapify(arr, n, largest)

# Example usage
arr = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
heapsort(arr)
print(arr)  # Output: [1, 1, 2, 3, 3, 4, 5, 5, 5, 6, 9]
  • This example demonstrates the use of heapsort to sort an array in ascending order. Adjustments can be made for descending order sorting by building a min-heap and extracting elements in reverse order.

7. Dynamic Updates:

  • If the problem involves dynamic updates (insertions or deletions), use a data structure that supports efficient operations. Binary Heaps are good for this purpose.

i. Insertions:

  • For min-heaps, insertions involve adding a new element to the heap and then restoring the min-heap property.
  • For max-heaps, insertions are similar, but the max-heap property needs to be maintained.
import heapq

# Example: Inserting an element into a min-heap
min_heap = [3, 1, 4, 1, 5, 9, 2, 6]
heapq.heappush(min_heap, 7)

ii. Deletions:

  • For min-heaps, deleting the minimum element involves swapping it with the last element, removing the last element, and then restoring the min-heap property.
  • For max-heaps, the process is similar but involves the maximum element.
import heapq

# Example: Deleting the minimum element from a min-heap
min_heap = [1, 1, 2, 3, 5, 9, 4, 6, 7]
heapq.heappop(min_heap)

iii. Efficiency of Heap Operations:

  • Both insertions and deletions in binary heaps have logarithmic time complexity (O(log n)), making them efficient for dynamic updates.

iv. Priority Queues for Dynamic Updates:

  • If dynamic updates involve maintaining a collection of elements with priorities, using a priority queue (often implemented with heaps) is a suitable choice.
import heapq

# Example: Using a priority queue for dynamic updates
priority_queue = []
heapq.heappush(priority_queue, 3)
heapq.heappush(priority_queue, 1)
heapq.heappush(priority_queue, 4)

Binary heaps are beneficial for dynamic updates because their structure allows for efficient operations. The logarithmic time complexity ensures that even with a large number of elements, these updates can be performed efficiently, making heaps a valuable tool in various algorithms and data structures.

8. Dijkstra’s Algorithm:

  • If the problem involves finding the shortest path in a weighted graph, consider using Dijkstra’s algorithm, which relies on a priority queue (usually implemented as a min heap).

8.1. Outline for Dijkstra’s Algorithm:

i. Initialization:

  • Assign tentative distances to all vertices. Initialize the distance to the source vertex as 0 and all other distances as infinity.
  • Add all vertices to a priority queue with their tentative distances as keys.

ii. Main Loop:

  • While the priority queue is not empty:
  • Extract the vertex with the smallest tentative distance from the priority queue (min heap).
  • Relax the edges outgoing from this vertex. Update the tentative distances of neighboring vertices if a shorter path is found.

iii. Termination:

  • The algorithm terminates when the priority queue is empty, and all vertices have been explored.

Dijkstra’s algorithm is efficient when used with a priority queue, especially implemented as a min heap, as it allows for quick extraction of the vertex with the smallest tentative distance.

import heapq

def dijkstra(graph, start):
    # Initialize distances and priority queue
    distances = {vertex: float('infinity') for vertex in graph}
    distances[start] = 0
    priority_queue = [(0, start)]

    while priority_queue:
        current_distance, current_vertex = heapq.heappop(priority_queue)

        # Check if a shorter path to the current vertex is found
        if current_distance > distances[current_vertex]:
            continue

        # Relaxation - update distances of neighboring vertices
        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(priority_queue, (distance, neighbor))

    return distances

# Example usage
graph = {
    'A': {'B': 1, 'C': 4},
    'B': {'A': 1, 'C': 2, 'D': 5},
    'C': {'A': 4, 'B': 2, 'D': 1},
    'D': {'B': 5, 'C': 1}
}

start_vertex = 'A'
result = dijkstra(graph, start_vertex)
print(result)
  • This example demonstrates Dijkstra’s algorithm using a priority queue implemented as a min heap to find the shortest paths from a specified starting vertex in a weighted graph.

9. Prim’s Algorithm:

  • For finding the minimum spanning tree in a connected, undirected graph with weighted edges, use Prim’s algorithm, which also employs a priority queue.

9.1. Outline for Prim’s Algorithm:

i. Initialization:

  • Start with an arbitrary vertex as the initial vertex of the MST.
  • Initialize a priority queue (min-heap) with the edges and their weights connected to the chosen vertex.

ii. Main Loop:

  • While the priority queue is not empty:
  • Extract the edge with the smallest weight from the priority queue.
  • If adding this edge does not form a cycle in the current MST, add the edge to the MST.
  • Add the vertices at the other end of the edge to the MST.
  • Add the edges connected to the new vertex (if not already in the MST) to the priority queue.

iii. Termination:

  • The algorithm terminates when all vertices are included in the MST.
import heapq

def prim(graph, start_vertex):
    mst = []
    visited = set()
    priority_queue = [(0, start_vertex, None)]  # (weight, vertex, parent)

    while priority_queue:
        weight, current_vertex, parent = heapq.heappop(priority_queue)

        if current_vertex not in visited:
            visited.add(current_vertex)
            if parent is not None:
                mst.append((parent, current_vertex, weight))

            for neighbor, edge_weight in graph[current_vertex].items():
                if neighbor not in visited:
                    heapq.heappush(priority_queue, (edge_weight, neighbor, current_vertex))

    return mst

# Example usage
graph = {
    'A': {'B': 2, 'D': 5},
    'B': {'A': 2, 'C': 3, 'D': 1},
    'C': {'B': 3, 'D': 4},
    'D': {'A': 5, 'B': 1, 'C': 4}
}

start_vertex = 'A'
result = prim(graph, start_vertex)
print(result)
  • This example demonstrates Prim’s algorithm using a priority queue implemented as a min-heap to find the minimum spanning tree of a connected, undirected graph with weighted edges.
  • The result will contain the edges of the minimum spanning tree along with their weights.

10. Use Heaps for Median Maintenance:

  • Heaps are useful for efficiently maintaining the median of a stream of elements. Use two heaps, a max heap for the left half and a min heap for the right half.
  • This approach ensures that the median can be retrieved with a time complexity of O(1).

10.1. Outline on How to Use Heaps to Maintain Median of a Stream of Elements:

i. Initialization:

  • Create a max heap for the left half of the elements (containing the smaller elements).
  • Create a min heap for the right half of the elements (containing the larger elements).

ii. Insertion:

  • Compare the new element with the current median.
  • If the element is smaller than the current median, insert it into the max heap.
  • If the element is larger than the current median, insert it into the min heap.

iii. Balance Heaps:

  • Ensure that the size difference between the max heap and min heap is at most 1.
  • If the sizes differ by more than 1, move the root of the larger heap to the other heap.

iv. Median Retrieval:

  • If the sizes of the heaps are equal, the median is the average of the roots of the max and min heaps.
  • If the sizes are unequal, the median is the root of the heap with the larger size.
import heapq

class MedianFinder:
    def __init__(self):
        self.max_heap = []  # Left half (smaller elements)
        self.min_heap = []  # Right half (larger elements)

    def addNum(self, num):
        # Insert into the appropriate heap
        if not self.max_heap or num <= -self.max_heap[0]:
            heapq.heappush(self.max_heap, -num)
        else:
            heapq.heappush(self.min_heap, num)

        # Balance the heaps
        if len(self.max_heap) > len(self.min_heap) + 1:
            heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))
        elif len(self.min_heap) > len(self.max_heap):
            heapq.heappush(self.max_heap, -heapq.heappop(self.min_heap))

    def findMedian(self):
        if len(self.max_heap) == len(self.min_heap):
            return (-self.max_heap[0] + self.min_heap[0]) / 2.0
        else:
            return -self.max_heap[0] / 1.0

# Example usage
median_finder = MedianFinder()
median_finder.addNum(1)
median_finder.addNum(2)
median_finder.addNum(3)
print(median_finder.findMedian())  # Output: 2.0
median_finder.addNum(4)
print(median_finder.findMedian())  # Output: 2.5
  • The MedianFinder class maintains a max heap for the left half of the elements and a min heap for the right half.
  • The addNum method inserts a new element, and the findMedian method retrieves the current median.
  • The heaps are balanced to ensure efficient median maintenance.

11. Efficiently Merge Sorted Arrays:

  • Merge k sorted arrays efficiently using a heap. This involves inserting the first element from each array into the heap and popping the smallest element iteratively.

11.1. Outline for Merging Sorted Arrays:

i. Initialization:

  • Create a min heap to store elements along with their array index and position within the array.
  • Insert the first element from each sorted array into the min heap.

ii. Pop and Insert:

  • While the min heap is not empty:
  • Pop the smallest element from the heap (which includes the value, array index, and position in the array).
  • Add this element to the result array.
  • If there are more elements in the same array, insert the next element from the same array into the min heap.

iii. Repeat:

  • Repeat the process until all elements from all arrays are processed.
import heapq

def merge_sorted_arrays(arrays):
    min_heap = []  # (value, array index, position in array)
    result = []

    # Initialize the heap with the first element from each array
    for i, array in enumerate(arrays):
        if array:  # Ensure the array is not empty
            heapq.heappush(min_heap, (array[0], i, 0))

    # Pop and insert until the heap is empty
    while min_heap:
        val, arr_idx, pos_in_arr = heapq.heappop(min_heap)
        result.append(val)

        # If there are more elements in the same array, insert the next element
        if pos_in_arr + 1 < len(arrays[arr_idx]):
            heapq.heappush(min_heap, (arrays[arr_idx][pos_in_arr + 1], arr_idx, pos_in_arr + 1))

    return result

# Example usage
arrays = [
    [1, 3, 5],
    [2, 4, 6],
    [0, 7, 8, 9]
]

result = merge_sorted_arrays(arrays)
print(result)  # Output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
  • The min heap is used to keep track of the smallest element from each array along with its array index and position in the array.
  • The algorithm efficiently merges the k sorted arrays by repeatedly popping the smallest element from the heap and inserting the next element from the same array until all elements are processed.

12. Sliding Window Problems:

  • For problems involving sliding windows, use a heap to efficiently track the maximum or minimum element in the window as it slides.
  • This is particularly useful for problems where you need to maintain a dynamic range of elements and update it as the window moves.

12.1. Outline for Using Heaps on Sliding Window Problems:

i. Initialization:

  • Create a heap to track the elements within the sliding window.
  • Insert the initial elements of the window into the heap.

ii. Maintain Heap Properties:

  • While the window is sliding:
  • Insert the next element into the heap.
  • Remove the element that is no longer in the window (if needed).
  • Ensure the heap properties are maintained.

iii. Retrieve Maximum or Minimum:

  • Retrieve the maximum or minimum element from the heap, depending on the problem requirements.
import heapq

def max_in_sliding_window(nums, k):
    max_values = []
    max_heap = []

    for i in range(len(nums)):
        # Remove elements outside the window
        while max_heap and max_heap[0][1] < i - k + 1:
            heapq.heappop(max_heap)

        # Insert the current element into the heap
        heapq.heappush(max_heap, (-nums[i], i))

        # Retrieve the maximum element for the current window
        if i >= k - 1:
            max_values.append(-max_heap[0][0])

    return max_values

# Example usage
nums = [1, 3, -1, -3, 5, 3, 6, 7]
k = 3
result = max_in_sliding_window(nums, k)
print(result)  # Output: [3, 3, 5, 5, 6, 7]
  • The max heap is used to efficiently track the maximum element within the sliding window.
  • The max_in_sliding_window function iterates through the array, updates the heap as the window slides, and retrieves the maximum element for each window.

13. Implement a Priority Queue:

  • If your programming language does not provide a built-in priority queue, consider implementing one using a binary heap.
class PriorityQueue:
    def __init__(self):
        self.heap = []

    def push(self, value):
        # Append the value to the end of the heap
        self.heap.append(value)
        self._heapify_up()

    def pop(self):
        if not self.heap:
            raise IndexError("pop from an empty priority queue")

        # Swap the root (minimum element) with the last element
        self.heap[0], self.heap[-1] = self.heap[-1], self.heap[0]
        popped_value = self.heap.pop()

        # Restore the min-heap property
        self._heapify_down()

        return popped_value

    def peek(self):
        if not self.heap:
            raise IndexError("peek from an empty priority queue")

        return self.heap[0]

    def is_empty(self):
        return not bool(self.heap)

    def _heapify_up(self):
        index = len(self.heap) - 1

        while index > 0:
            parent_index = (index - 1) // 2
            if self.heap[index] < self.heap[parent_index]:
                self.heap[index], self.heap[parent_index] = self.heap[parent_index], self.heap[index]
                index = parent_index
            else:
                break

    def _heapify_down(self):
        index = 0
        size = len(self.heap)

        while True:
            left_child_index = 2 * index + 1
            right_child_index = 2 * index + 2
            smallest = index

            if left_child_index < size and self.heap[left_child_index] < self.heap[smallest]:
                smallest = left_child_index

            if right_child_index < size and self.heap[right_child_index] < self.heap[smallest]:
                smallest = right_child_index

            if smallest != index:
                self.heap[index], self.heap[smallest] = self.heap[smallest], self.heap[index]
                index = smallest
            else:
                break

# Example usage:
priority_queue = PriorityQueue()
priority_queue.push(3)
priority_queue.push(1)
priority_queue.push(4)
priority_queue.push(1)
priority_queue.push(5)

print(priority_queue.pop())  # Output: 1
print(priority_queue.pop())  # Output: 1
print(priority_queue.pop())  # Output: 3
print(priority_queue.pop())  # Output: 4
print(priority_queue.pop())  # Output: 5
  • push to insert elements, pop to extract the minimum element, peek to view the minimum element without removal, and is_empty to check if the priority queue is empty.
  • Adjustments can be made for a max-heap or for a priority queue that supports custom priorities.

14. Handle Duplicate Elements:

  • If the problem involves handling duplicate elements, consider using additional information in the heap nodes or modifying the comparison function accordingly.
  • The standard implementation of heaps usually assumes that all elements are unique, and duplicate elements can pose challenges. Here are a few approaches you can consider:

i. Using Additional Information:

  • Include additional information in the heap nodes to differentiate between elements with the same value.
  • For example, you can use a tuple with both the element value and a unique identifier.
import heapq

class Node:
    def __init__(self, value, identifier):
        self.value = value
        self.identifier = identifier

    def __lt__(self, other):
        # Customize the comparison function to handle duplicate values
        if self.value == other.value:
            return self.identifier < other.identifier
        return self.value < other.value

class PriorityQueue:
    def __init__(self):
        self.heap = []

    def push(self, value, identifier):
        node = Node(value, identifier)
        heapq.heappush(self.heap, node)

    def pop(self):
        if not self.heap:
            raise IndexError("pop from an empty priority queue")
        return heapq.heappop(self.heap).value

# Example usage
priority_queue = PriorityQueue()
priority_queue.push(3, 1)
priority_queue.push(1, 2)
priority_queue.push(4, 3)
priority_queue.push(1, 4)
priority_queue.push(5, 5)

print(priority_queue.pop())  # Output: 1
print(priority_queue.pop())  # Output: 1
print(priority_queue.pop())  # Output: 3
print(priority_queue.pop())  # Output: 4
print(priority_queue.pop())  # Output: 5

ii. Modifying the Comparison Function:

  • Customize the comparison function to consider additional factors when elements are equal.
import heapq

class PriorityQueue:
    def __init__(self):
        self.heap = []
        self.counter = 0

    def push(self, value):
        # Include a counter to differentiate between elements with the same value
        heapq.heappush(self.heap, (value, self.counter))
        self.counter += 1

    def pop(self):
        if not self.heap:
            raise IndexError("pop from an empty priority queue")
        return heapq.heappop(self.heap)[0]

# Example usage
priority_queue = PriorityQueue()
priority_queue.push(3)
priority_queue.push(1)
priority_queue.push(4)
priority_queue.push(1)
priority_queue.push(5)

print(priority_queue.pop())  # Output: 1
print(priority_queue.pop())  # Output: 1
print(priority_queue.pop())  # Output: 3
print(priority_queue.pop())  # Output: 4
print(priority_queue.pop())  # Output: 5

15. Handle Edge Cases:

  • Be mindful of edge cases, such as an empty heap or handling situations where the number of elements is less than the specified k.

15.1. Edge Cases of Heaps:

i. Empty Heap:

  • Before performing any operations that involve extracting or accessing elements from the heap, check if the heap is empty.
  • For example, when popping from an empty heap, it’s common to raise an exception or return a special value indicating an empty heap.
import heapq

def pop_from_heap(heap):
    if not heap:
        raise IndexError("pop from an empty heap")
    return heapq.heappop(heap)

ii. Insufficient Elements for k:

  • When working with problems that involve extracting the top k elements from a heap, check if there are enough elements in the heap before attempting to extract.
  • Adjust the algorithm to handle cases where the number of elements is less than the specified k.
import heapq

def top_k_elements(heap, k):
    if len(heap) < k:
        raise ValueError("Insufficient elements for k")
    result = []
    for _ in range(k):
        result.append(heapq.heappop(heap))
    return result

iii. Handling None or Null Values:

  • If the elements being inserted into the heap may be None or have a special meaning, ensure that your comparison function or additional information in the nodes handles such cases appropriately.
import heapq

class Node:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # Handle None values
        if self.value is None:
            return False
        if other.value is None:
            return True
        return self.value < other.value

# Example usage
heap = []
heapq.heappush(heap, Node(3))
heapq.heappush(heap, Node(None))
heapq.heappush(heap, Node(1))
print(heapq.heappop(heap).value)  # Output: 1

iv. Edge Cases in Custom Comparisons:

  • If you’re using a custom comparison function in your heap, make sure it handles edge cases correctly. For example, if you’re comparing strings and some of them are empty strings, ensure the comparison logic handles these cases appropriately.
import heapq

def custom_comparison(a, b):
    # Custom comparison function handling edge cases
    if not a and b:
        return True
    elif a and not b:
        return False
    else:
        return a < b

# Example usage
heap = []
heapq.heappush(heap, "apple")
heapq.heappush(heap, "")
heapq.heappush(heap, "banana")
print(heapq.heappop(heap))  # Output: ""

16. Use Heaps for Huffman Coding:

  • Heaps are used in Huffman coding for efficient compression of data. Familiarize yourself with this application.
  • Huffman coding is a popular algorithm used for lossless data compression, and it relies on binary heaps for efficient implementation. The main idea behind Huffman coding is to assign variable-length codes to input characters, with shorter codes assigned to more frequent characters. This allows for efficient compression, as more common characters have shorter codes.

16.1. Overview of How Heaps Are Used in Huffman Coding:

i. Frequency Table:

  • Begin by constructing a frequency table that represents the occurrence of each character in the input data.

ii. Build Huffman Tree:

  • Create a priority queue (min heap) where each node represents a character along with its frequency.
  • Insert all the characters and their frequencies into the priority queue.

iii. Merge Nodes:

  • Repeatedly extract two nodes with the lowest frequencies from the priority queue.
  • Merge them into a new node whose frequency is the sum of the frequencies of the two nodes.
  • Insert the merged node back into the priority queue.
  • Repeat this process until only one node remains in the priority queue, creating a Huffman tree.

iv. Generate Codes:

  • Traverse the Huffman tree to generate variable-length codes for each character.
  • Assign shorter codes to more frequent characters, ensuring that no code is a prefix of another (prefix-free property).

v. Encode and Decode:

  • Use the generated Huffman codes to encode the input data, replacing each character with its corresponding code.
  • Decode the encoded data by traversing the Huffman tree based on the encoded bits.
import heapq
from collections import defaultdict

class Node:
    def __init__(self, char, freq):
        self.char = char
        self.freq = freq
        self.left = None
        self.right = None

    def __lt__(self, other):
        return self.freq < other.freq

def build_huffman_tree(freq_table):
    heap = [Node(char, freq) for char, freq in freq_table.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        left = heapq.heappop(heap)
        right = heapq.heappop(heap)
        merged = Node(None, left.freq + right.freq)
        merged.left, merged.right = left, right
        heapq.heappush(heap, merged)

    return heap[0]

def generate_huffman_codes(node, code="", codes=None):
    if codes is None:
        codes = {}
    if node is not None:
        if node.char is not None:
            codes[node.char] = code
        generate_huffman_codes(node.left, code + "0", codes)
        generate_huffman_codes(node.right, code + "1", codes)
    return codes

# Example usage
input_data = "hello, world!"
frequency_table = defaultdict(int)

for char in input_data:
    frequency_table[char] += 1

huffman_tree = build_huffman_tree(frequency_table)
huffman_codes = generate_huffman_codes(huffman_tree)

print("Huffman Codes:")
for char, code in huffman_codes.items():
    print(f"{char}: {code}")
  • The build_huffman_tree function creates a Huffman tree using a priority queue (min heap), and the generate_huffman_codes function traverses the tree to generate Huffman codes for each character.
  • The resulting Huffman codes can be used for efficient compression and decompression of data.

17. Explore Fibonacci Heaps:

  • For certain scenarios involving a mix of operations, explore advanced data structures like Fibonacci Heaps that provide improved time complexities for certain operations.
  • Fibonacci Heaps are a type of heap data structure that was designed to improve the time complexity of certain operations compared to traditional binary heaps, especially in scenarios involving a mix of operations.

17.1. Key Characteristics and Advantages of Fibonacci Heaps:

i. Amortized Time Complexity:

  • Fibonacci Heaps provide amortized constant time complexity for many operations, making them particularly efficient in scenarios where there is a mix of insertions, decrease key operations, and deletions.

ii. Decrease Key Operation:

  • The decrease key operation in Fibonacci Heaps has an amortized time complexity of O(1), making it more efficient than binary heaps, which have O(log n) time complexity for the same operation.

iii. Efficient Melding:

  • Melding, or merging, two Fibonacci Heaps can be done in constant time. This is an advantage over binary heaps, where merging requires copying elements from one heap to another, resulting in O(m + n) time complexity, where m and n are the sizes of the heaps being merged.

iv. Lazy Evaluation:

  • Fibonacci Heaps use lazy evaluation to delay some of the work until it is necessary. This allows for better amortized time complexity.

v. Applications:

  • Fibonacci Heaps are often used in graph algorithms such as Dijkstra’s algorithm and Prim’s algorithm, where a mix of operations like insertions, decrease key, and deletions are performed.

While Fibonacci Heaps offer improved time complexities for certain operations, it’s essential to note that they come with increased complexity in terms of implementation and space requirements. The constant factors in the time complexity for Fibonacci Heaps are often larger than those in binary heaps, making them less efficient for small inputs or cases where the operations are not well-balanced.

Due to the complexity of implementation and the overhead associated with Fibonacci Heaps, they are not always the best choice for every scenario. In practice, binary heaps are often preferred for simplicity and efficiency in scenarios where the advantages of Fibonacci Heaps are not critical. However, in specialized scenarios with a mix of operations and a need for optimal time complexities, Fibonacci Heaps can be a valuable tool.

18. Practice with Classic Problems:

  • Practice solving classic problems that involve heaps, such as finding the k-th largest element, sorting a nearly sorted array, or implementing efficient algorithms for graph problems.

II. COMMON PROBLEM TYPES:

1. Top K Elements:

  • Problem Type: Find the top k elements in an array.
  • To solve the “Top K Frequent Elements” problem using heaps, we can use a min heap to keep track of the k most frequent elements. We’ll follow these steps:
  1. Create a frequency map to store the frequency of each element in the array.
  2. Use a min heap to maintain the k elements with the highest frequencies.
  3. Iterate through the frequency map, adding elements to the min heap.
  4. If the size of the heap exceeds k, remove the element with the lowest frequency.
  5. The heap will contain the k most frequent elements when the iteration is complete.
# Problem: Top K Frequent Elements 
# Given a non-empty array of integers, return the k most frequent elements.

import heapq
from collections import Counter

def topKFrequent(nums, k):
    # Step 1: Create a frequency map
    frequency_map = Counter(nums)

    # Step 2: Use a min heap to maintain the k most frequent elements
    min_heap = []

    # Step 3: Iterate through the frequency map
    for num, freq in frequency_map.items():
        # Step 4: Add elements to the min heap
        heapq.heappush(min_heap, (freq, num))

        # If the size of the heap exceeds k, remove the element with the lowest frequency
        if len(min_heap) > k:
            heapq.heappop(min_heap)

    # Step 5: The heap contains the k most frequent elements
    result = [element[1] for element in min_heap]

    return result

# Example usage
nums = [1, 1, 1, 2, 2, 3]
k = 2
result = topKFrequent(nums, k)
print(result)
  • We use the Counter class from the collections module to create a frequency map (frequency_map) of the input array nums.
  • We initialize an empty min heap (min_heap) to keep track of the k most frequent elements. The heap will store tuples of the form (frequency, element).
  • We iterate through the frequency map, adding elements to the min heap.
  • If the size of the heap exceeds k, we remove the element with the lowest frequency using heapq.heappop(min_heap).
  • After the iteration is complete, the min heap contains the k most frequent elements, and we extract them to the result list.

This approach ensures that we maintain the k most frequent elements efficiently using a min heap. The time complexity of this solution is O(n log k), where n is the size of the input array. The heapq operations have logarithmic complexity, and we perform them for each element in the frequency map.

2. Priority Queue Operations:

  • Problem Type: Perform operations like insertion and extraction of the minimum or maximum element.
  • To solve the “Kth Largest Element in an Array” problem using heaps, we can use a min heap or a max heap, depending on the specific requirements. Here, I’ll provide a solution using a max heap to find the kth largest element. The basic idea is to maintain a max heap of size k, and after processing the entire array, the heap will contain the k largest elements, with the top element being the kth largest.
# Problem: Kth Largest Element in an Array 
# Find the kth largest element in an unsorted array.

import heapq

def findKthLargest(nums, k):
    # Create a max heap to store the k largest elements
    max_heap = []

    # Insert the first k elements into the max heap
    for i in range(k):
        heapq.heappush(max_heap, -nums[i])

    # Iterate through the remaining elements in the array
    for i in range(k, len(nums)):
        # If the current element is larger than the smallest element in the max heap,
        # replace the smallest element with the current element
        if nums[i] > -max_heap[0]:
            heapq.heappop(max_heap)
            heapq.heappush(max_heap, -nums[i])

    # The top of the max heap is the kth largest element
    return -max_heap[0]

# Example usage
nums = [3, 2, 1, 5, 4, 6]
k = 2
result = findKthLargest(nums, k)
print(result)
  • We use a max heap (max_heap) to store the k largest elements. To achieve this, we negate the values and use a min heap.
  • We insert the first k elements from the array into the max heap, negating each element before insertion.
  • We iterate through the remaining elements in the array and compare them with the smallest element in the max heap.
  • If the current element is larger than the smallest element in the max heap, we replace the smallest element with the current element.
  • After processing the entire array, the max heap will contain the k largest elements, and the top element is the kth largest.
  • We return the negation of the top element since we negated the values during insertion.

This solution has a time complexity of O(n log k), where n is the size of the input array. The heapq operations have logarithmic complexity, and we perform them for each element in the array. The space complexity is O(k) for the max heap.

3. Merge K Sorted Lists:

  • Problem Type: Merge k sorted linked lists into one sorted list.
  • To solve the “Merge k Sorted Lists” problem efficiently, we can use a min heap. The idea is to push the first element from each linked list into the min heap. As we pop elements from the heap, we insert the next element from the same linked list into the heap if it exists. This process ensures that we always have the smallest remaining element from all the linked lists at the top of the heap.
# Problem: Merge k Sorted Lists 
# Merge k sorted linked lists and return it as one sorted list.

import heapq

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def mergeKLists(lists):
    # Create a min heap to store tuples (value, node)
    min_heap = []

    # Push the first element from each linked list into the min heap
    for linked_list in lists:
        if linked_list:
            heapq.heappush(min_heap, (linked_list.val, linked_list))

    # Dummy node to simplify the result list construction
    dummy = ListNode()
    current = dummy

    # Continue until the min heap is empty
    while min_heap:
        # Pop the smallest element (value, node) from the min heap
        val, node = heapq.heappop(min_heap)

        # Append the popped element to the result list
        current.next = ListNode(val)
        current = current.next

        # If there is a next node in the linked list, push it into the heap
        if node.next:
            heapq.heappush(min_heap, (node.next.val, node.next))

    return dummy.next  # The result list starts from the next node of the dummy node

# Example usage
# Create linked lists
list1 = ListNode(1, ListNode(4, ListNode(5)))
list2 = ListNode(1, ListNode(3, ListNode(4)))
list3 = ListNode(2, ListNode(6))

lists = [list1, list2, list3]
result = mergeKLists(lists)

# Print the merged result
while result:
    print(result.val, end=" ")
    result = result.next
  • We use a min heap (min_heap) to store tuples (value, node) where value is the value of the node, and node is the actual node.
  • We push the first element from each linked list into the min heap.
  • We create a dummy node to simplify the construction of the result list.
  • We iterate until the min heap is empty:
  • Pop the smallest element from the heap.
  • Append the popped element to the result list.
  • If there is a next node in the linked list from which the element was popped, push it into the heap.
  • The result is a merged sorted linked list.

This solution has a time complexity of O(N log k), where N is the total number of nodes across all linked lists, and k is the number of linked lists. The heapq operations have logarithmic complexity, and we perform them for each node in the linked lists. The space complexity is O(k) for the min heap.

4. Sort Almost Sorted Array:

  • Problem Type: Sort an almost sorted array using a heap.
  • To solve the “Sort an Almost Sorted Array” problem efficiently, we can use a min heap to keep track of the elements within a window of size k. The basic idea is to insert the first k + 1 elements into the heap. As we extract elements from the heap, we insert the next element from the array into the heap. This way, the heap always maintains a window of size k, ensuring that the elements are sorted within this window.
# Problem: Sort an Almost Sorted Array 
# Given an almost sorted array where each element is at most k positions away from its sorted position, sort the array.

import heapq

def sortAlmostSortedArray(arr, k):
    n = len(arr)

    # Create a min heap to maintain the window of size k + 1
    min_heap = []

    # Insert the first k + 1 elements into the min heap
    for i in range(min(k + 1, n)):
        heapq.heappush(min_heap, arr[i])

    result = []  # Result array to store the sorted elements

    # Continue until the heap is empty
    for i in range(k + 1, n):
        # Extract the smallest element from the heap
        result.append(heapq.heappop(min_heap))

        # Insert the next element from the array into the heap
        heapq.heappush(min_heap, arr[i])

    # Extract the remaining elements from the heap
    while min_heap:
        result.append(heapq.heappop(min_heap))

    return result

# Example usage
arr = [6, 5, 3, 2, 8, 10, 9]
k = 3
result = sortAlmostSortedArray(arr, k)
print(result)
  • We create a min heap (min_heap) to maintain the window of size k + 1.
  • We insert the first k + 1 elements from the array into the min heap.
  • We iterate from index k + 1 to the end of the array:
  • Extract the smallest element from the heap and append it to the result array.
  • Insert the next element from the array into the heap.
  • After the loop, extract the remaining elements from the heap and append them to the result array.

This solution has a time complexity of O(n log k), where n is the size of the input array. The heapq operations have logarithmic complexity, and we perform them for each element in the array. The space complexity is O(k) for the min heap. The algorithm efficiently handles the “almost sorted” property of the array.

5. Running Median:

  • Problem Type: Compute the running median of a sequence of numbers.
  • To solve the “Find Median from Data Stream” problem efficiently, we can use two heaps: a max heap to store the smaller half of the elements and a min heap to store the larger half of the elements. The median will be the average of the tops of the two heaps if the number of elements is even, or the top of the max heap if the number of elements is odd.
# Problem: Find Median from Data Stream 
# Design a data structure that supports the following two operations: addNum and findMedian.

import heapq

class MedianFinder:
    def __init__(self):
        # Max heap for the smaller half of the elements
        self.max_heap = []
        # Min heap for the larger half of the elements
        self.min_heap = []

    def addNum(self, num):
        # Insert the number into the max heap
        heapq.heappush(self.max_heap, -num)

        # Move the top element from the max heap to the min heap
        heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))

        # Balance the sizes of the heaps
        if len(self.min_heap) > len(self.max_heap):
            heapq.heappush(self.max_heap, -heapq.heappop(self.min_heap))

    def findMedian(self):
        if len(self.max_heap) == len(self.min_heap):
            # If the number of elements is even, return the average of the tops
            return (self.min_heap[0] - self.max_heap[0]) / 2.0
        else:
            # If the number of elements is odd, return the top of the max heap
            return -self.max_heap[0]

# Example usage
median_finder = MedianFinder()
median_finder.addNum(1)
median_finder.addNum(2)
result = median_finder.findMedian()
print(result)  # Output: 1.5
median_finder.addNum(3)
result = median_finder.findMedian()
print(result)  # Output: 2.0
  • We use a max heap (max_heap) to store the smaller half of the elements and a min heap (min_heap) to store the larger half of the elements.
  • The addNum method inserts the new number into the max heap and then moves the top element from the max heap to the min heap.
  • The sizes of the two heaps are balanced to ensure that the difference in sizes is at most 1.
  • The findMedian method calculates and returns the median based on the tops of the two heaps.
  • If the number of elements is even, the median is the average of the tops of the two heaps.
  • If the number of elements is odd, the median is the top of the max heap.

This solution has an amortized time complexity of O(log n) for the addNum operation and a constant time complexity of O(1) for the findMedian operation. The space complexity is O(n), where n is the total number of elements added. The two-heap approach efficiently maintains the median as elements are added to the data stream.

6. Dijkstra’s Algorithm:

  • Problem Type: Find the shortest path in a graph with non-negative weights.
  • To solve the “Network Delay Time” problem, we can use Dijkstra’s algorithm, which is well-suited for finding the shortest path in a graph with non-negative weights. Here, we’ll implement Dijkstra’s algorithm to find the minimum time it takes for a signal to travel from a source node to all other nodes in the network.
# Problem: Network Delay Time 
# There are N network nodes, labeled 1 to N. 
# You are given times, a list of travel times as directed edges times[i] = (u, v, w), where u is the source node, v is the target node, and w is the time it takes for a signal to travel from source to target.

import heapq
from collections import defaultdict

def networkDelayTime(times, N, K):
    # Create an adjacency list to represent the graph
    graph = defaultdict(list)
    for u, v, w in times:
        graph[u].append((v, w))

    # Priority queue to store (time, node) tuples
    priority_queue = [(0, K)]
    
    # Dictionary to store the minimum time for each node
    min_times = {}
    
    while priority_queue:
        time, node = heapq.heappop(priority_queue)
        
        # Skip if the node has already been visited
        if node in min_times:
            continue

        # Update the minimum time for the current node
        min_times[node] = time

        # Explore neighbors and update their arrival times in the priority queue
        for neighbor, neighbor_time in graph[node]:
            heapq.heappush(priority_queue, (time + neighbor_time, neighbor))

    # Check if all nodes have been visited
    if len(min_times) == N:
        return max(min_times.values())
    else:
        return -1  # Some nodes are not reachable

# Example usage
times = [(2, 1, 1), (2, 3, 1), (3, 4, 1)]
N = 4
K = 2
result = networkDelayTime(times, N, K)
print(result)  # Output: 2
  • We represent the graph as an adjacency list (graph) using a defaultdict.
  • We use a priority queue (priority_queue) to store tuples of the form (time, node) representing the time it takes to reach a node.
  • We maintain a dictionary (min_times) to store the minimum arrival time for each node.
  • The algorithm iterates until the priority queue is empty, exploring nodes and updating arrival times.
  • We return the maximum arrival time among all nodes if all nodes are reachable; otherwise, we return -1.

This solution has a time complexity of O(E + V log V), where E is the number of edges and V is the number of vertices. The priority queue operations dominate the time complexity. The space complexity is O(V + E) to store the graph and the priority queue. Dijkstra’s algorithm is efficient for graphs with non-negative weights.

7. Task Scheduling:

  • Problem Type: Schedule tasks to maximize profit or minimize completion time.
  • The “Task Scheduler” problem can be effectively solved using heaps. A priority queue (min heap) can be employed to keep track of the tasks based on their frequencies. The idea is to repeatedly select and execute the task with the highest frequency until all tasks are completed.

Step 1:

  • Count the frequency of each task using Counter.

Step 2:

  • Create a max heap (max_heap) to store (-frequency, task) tuples.
  • Use heapify to convert the list into a max heap.

Step 3:

  • Iterate until the max heap is empty:
  • Execute tasks in the current interval.
  • Decrease the frequency of executed tasks and reinsert them back into the max heap.
  • Increment the total intervals by the number of tasks executed in the current interval.
  • If there are still tasks to execute, add idle intervals.
  • Add the number of idle intervals to the total intervals.

Step 4:

  • Output the result, representing the minimum time needed to execute all tasks.
# Problem: Task Scheduler 
# Given a characters array tasks, representing the tasks a CPU needs to do, where each letter represents a different task.

import heapq
from collections import Counter

def leastInterval(tasks, n):
    # Count the frequency of each task
    task_counts = Counter(tasks)

    # Create a max heap to store (-frequency, task) tuples
    max_heap = [(-count, task) for task, count in task_counts.items()]
    heapq.heapify(max_heap)

    total_intervals = 0

    while max_heap:
        # Temporary list to store tasks to be reinserted
        temp = []

        # Execute tasks in the current interval (up to n+1 tasks)
        for _ in range(n + 1):
            if max_heap:
                count, task = heapq.heappop(max_heap)
                count += 1  # Decrease frequency
                if count < 0:
                    temp.append((count, task))  # Reinsert the task with decreased frequency
            total_intervals += 1

        # Reinsert tasks back into the max heap
        for entry in temp:
            heapq.heappush(max_heap, entry)

        # If the heap is empty, we have completed all tasks
        if not max_heap:
            break

        # If there are still tasks to execute, add idle intervals
        total_intervals += len(temp) if len(max_heap) >= n + 1 else n + 1

    return total_intervals

# Example usage
tasks = ['A', 'A', 'A', 'B', 'B', 'B']
n = 2
result = leastInterval(tasks, n)
print(result)  # Output: 8

Count the Frequency of Each Task:

  • task_counts = Counter(tasks) creates a Counter object to count the frequency of each task in the input list tasks.

Create a Max Heap:

  • max_heap = [(-count, task) for task, count in task_counts.items()] creates a max heap (max_heap) by storing (-frequency, task) tuples. The negative frequency is used to create a max heap that prioritizes tasks with higher frequencies. heapify is then applied to convert the list into a max heap.

Execute Tasks in Intervals:

  • A while loop runs until the max heap is empty.
  • Inside the loop:
  • A temporary list (temp) is used to store tasks that will be reinserted into the max heap.
  • A for loop executes tasks in the current interval (up to n + 1 tasks).
  • For each task popped from the max heap:
  • The frequency is decreased.
  • If the frequency is still greater than zero, the task is added to the temporary list with the updated frequency.
  • The total intervals are incremented by the number of tasks executed in the current interval.

Reinsert Tasks and Handle Idle Intervals:

  • Tasks in the temporary list (temp) are reinserted back into the max heap with their updated frequencies.
  • If the max heap is empty, the loop breaks, indicating that all tasks have been executed.
  • If there are still tasks in the max heap, idle intervals are added to the total intervals. The number of idle intervals is determined by the remaining tasks in the max heap.

Return Result:

  • The function returns the total intervals, representing the minimum time needed to execute all tasks.

8. Running Maximum:

  • Problem Type: Compute the running maximum of a sequence of numbers.
  • To solve the “Sliding Window Maximum” problem efficiently, we can use a double-ended queue (deque) to maintain the indices of elements in the current window in a way that allows us to efficiently find the maximum in each window.
# Problem: Sliding Window Maximum 
# Given an array nums, there is a sliding window of size k which is moving from the very left of the array to the very right.

from collections import deque

def maxSlidingWindow(nums, k):
    result = []
    window = deque()

    for i in range(len(nums)):
        # Remove elements that are out of the current window
        while window and window[0] < i - k + 1:
            window.popleft()

        # Remove elements that are smaller than the current element from the back
        while window and nums[window[-1]] < nums[i]:
            window.pop()

        # Add the current index to the window
        window.append(i)

        # Add the maximum of the current window to the result
        if i >= k - 1:
            result.append(nums[window[0]])

    return result

# Example usage
nums = [1,3,-1,-3,5,3,6,7]
k = 3
result = maxSlidingWindow(nums, k)
print(result)  # Output: [3,3,5,5,6,7]
  • We use a deque (window) to store the indices of elements within the current window.
  • For each element in the array, we perform the following steps:
  1. Remove elements from the front of the deque that are outside the current window.
  2. Remove elements from the back of the deque that are smaller than the current element.
  3. Add the current index to the deque.
  4. If the index is greater than or equal to k — 1, add the maximum of the current window to the result.
  • The result is the list of maximum values for each window as it slides through the array.

This solution has a time complexity of O(n), where n is the size of the input array. The deque operations maintain the indices within the window efficiently. The space complexity is O(k) for the deque, as it stores at most k indices.

9. Connect Ropes with Minimum Cost:

  • Problem Type: Connect ropes with minimum cost.
  • To solve the “Minimum Cost to Connect Sticks” problem efficiently, we can use a priority queue (min heap) to always select the two smallest sticks and connect them, updating the total cost. This ensures that we always connect the sticks with the minimum lengths, minimizing the overall cost.
# Problem: Minimum Cost to Connect Sticks 
# You have some sticks with positive integer lengths. 
# You can connect any two sticks of lengths X and Y into one stick by paying a cost of X + Y.

import heapq

def connectSticks(sticks):
    # Use a min heap to maintain the smallest sticks
    heapq.heapify(sticks)
    
    total_cost = 0
    
    # Continue until only one stick remains
    while len(sticks) > 1:
        # Extract the two smallest sticks from the heap
        stick1 = heapq.heappop(sticks)
        stick2 = heapq.heappop(sticks)
        
        # Connect the sticks and add the cost to the total
        cost = stick1 + stick2
        total_cost += cost
        
        # Insert the connected stick back into the heap
        heapq.heappush(sticks, cost)

    return total_cost

# Example usage
sticks = [2, 4, 3]
result = connectSticks(sticks)
print(result)  # Output: 14
  • We use a min heap (sticks) to store the lengths of sticks.
  • We convert the list of sticks into a min heap using heapify.
  • We continue the process until only one stick remains in the heap.
  • In each iteration, we extract the two smallest sticks from the heap, connect them, and add the cost to the total.
  • We insert the connected stick (resulting from the addition) back into the heap.
  • The final result is the total cost of connecting all the sticks.

This solution has a time complexity of O(n log n), where n is the number of sticks. The heapq operations dominate the time complexity. The space complexity is O(n) to store the min heap. The algorithm efficiently connects the sticks with the minimum lengths, minimizing the overall cost.

10. Merging Files with Minimum Cost:

  • Problem Type: Merge files with minimum cost.
  • The “Minimum Cost to Merge Stones” problem can be solved using dynamic programming. This problem is a variation of the classical “Matrix Chain Multiplication” problem. To solve it efficiently, we can use a bottom-up dynamic programming approach.
# Problem: Minimum Cost to Merge Stones 
# There are N piles of stones arranged in a row. The i-th pile has stones[i] stones.

def mergeStones(stones):
    n = len(stones)

    # Check if it is possible to merge the stones into one pile
    if (n - 1) % (2 - 1) != 0:
        return -1

    # Prefix sum to quickly calculate the sum of a subarray
    prefix_sum = [0] + [sum(stones[:i]) for i in range(1, n + 1)]

    # Initialize the DP table with zeros
    dp = [[0] * n for _ in range(n)]

    # Iterate over the subarray lengths
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i][j] = float('inf')

            # Iterate over possible partitions
            for k in range(i, j, 1):
                dp[i][j] = min(dp[i][j], dp[i][k] + dp[k + 1][j] + prefix_sum[j + 1] - prefix_sum[i])

    return dp[0][n - 1]

# Example usage
stones = [3, 2, 4, 1]
result = mergeStones(stones)
print(result)  # Output: 20
  • We first check if it is possible to merge the stones into one pile. If the condition (n - 1) % (2 - 1) != 0 is not satisfied, it means we cannot merge the stones, and we return -1.
  • We use a prefix sum array (prefix_sum) to quickly calculate the sum of any subarray.
  • We initialize a DP table (dp) with zeros, where dp[i][j] represents the minimum cost to merge the stones from pile i to pile j.
  • We iterate over the subarray lengths, filling in the DP table by considering all possible partitions.
  • The final result is stored in dp[0][n - 1], representing the minimum cost to merge all stones into one pile.

This solution has a time complexity of O(n³), where n is the number of piles of stones. The space complexity is O(n²) for the DP table. The dynamic programming approach efficiently computes the minimum cost to merge the stones.

Leetcode
Leetcode Solution
Coding Interviews
Python
Heap
Recommended from ReadMedium