1. What is Heap Sort?¶
Heap sort is a sorting algorithm that utilizes the “heap” data structure. A heap is a special type of complete binary tree, which can be visualized as a “pyramid”: in a max-heap, the top element is the largest, and all child nodes are smaller; in a min-heap, the top is the smallest, and all child nodes are larger. Heap sort typically uses a max-heap, repeatedly extracting the maximum element from the top to gradually sort the entire array.
2. Basic Concepts of Heap¶
-
Heap Definition: A heap is a complete binary tree (each level is filled from left to right, with the last level possibly incomplete) satisfying the following properties:
- Max-Heap: Each parent node’s value is greater than or equal to its children’s values (the root is the maximum).
- Min-Heap: Each parent node’s value is less than or equal to its children’s values (the root is the minimum). -
Array Representation of Heap: For easy manipulation, heaps are often represented as arrays. For a node at index
iin the array:
- Left child index:2i + 1
- Right child index:2i + 2
- Parent index:(i - 1) // 2(integer division)
3. Core Idea of Heap Sort¶
The core of heap sort is “Build the heap first, then sort”:
1. Build the Heap: Convert the unsorted array into a max-heap (or min-heap), where the top element is the maximum (or minimum) of the array.
2. Extract Top and Adjust: Swap the top element (maximum) with the last unsorted element in the array. The last element is now sorted. Reduce the heap size, re-heapify the remaining elements, and repeat until all elements are sorted.
4. Implementation Steps of Heap Sort¶
Taking max-heap as an example, the steps are as follows:
Step 1: Build the Max-Heap
- Start from the last non-leaf node and adjust upward layer by layer to ensure each node satisfies the max-heap property (parent ≥ children).
- Key Operation: Heapify. For a node, compare it with its left and right children, swap the maximum to the current node, then recursively adjust the child subtree until the subtree also satisfies the max-heap property.
Step 2: Sorting Process
- Swap the top element (maximum) with the last unsorted element. The last element is now sorted.
- Decrease the heap size (exclude the sorted element at the end), re-heapify the remaining elements, and repeat the swap and heapify until only one element remains in the heap.
5. Code Example for Heap Sort (Python)¶
For clarity, here’s the core implementation of heap sort in Python:
def heap_sort(arr):
n = len(arr)
# Step 1: Build the max-heap (start from the last non-leaf node and heapify upward)
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i) # Heapify function
# Step 2: Sorting process
for i in range(n - 1, 0, -1):
arr[i], arr[0] = arr[0], arr[i] # Swap the max (top) with the last unsorted element
heapify(arr, i, 0) # Re-heapify the remaining elements
return arr
def heapify(arr, n, i):
# Heapify function: Adjust the subtree rooted at index i to max-heap
largest = i # Assume current node is the largest
left = 2 * i + 1 # Left child index
right = 2 * i + 2 # Right child index
# Update largest if left child is larger
if left < n and arr[left] > arr[largest]:
largest = left
# Update largest if right child is larger
if right < n and arr[right] > arr[largest]:
largest = right
# Swap and recursively heapify if the largest is not the current node
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i]
heapify(arr, n, largest) # Recursively adjust the subtree
6. Time Complexity Analysis¶
Heap sort’s time complexity is analyzed in two parts:
-
Heap Construction Time Complexity:
Building the heap requiresn/2heapify operations starting from the last non-leaf node. Each heapify takesO(log k)time (wherekis the height of the subtree). The total time for construction is O(n) (sum of all heapify operations is approximatelyn). -
Sorting Process Time Complexity:
The sorting process involvesn-1heapify operations (after each swap, the remaining elements are re-heapified). Each heapify takesO(log n)time (since the heap height islog n). Thus, the total sorting time is O(n log n).
Summary: The total time complexity of heap sort is O(n log n) (best, worst, and average cases are all O(n log n)), with a space complexity of O(1) (in-place sorting with constant extra space).
7. Characteristics of Heap Sort¶
- Stability: Unstable (e.g., in
[3, 2, 2], the relative order of the two2s may change after sorting). - Applicable Scenarios: Suitable for large-scale data sorting, especially when memory is limited (in-place sorting).
8. Conclusion¶
Heap sort converts an unsorted array into a sorted array through the cycle of “build heap → extract top → adjust heap”. Its core advantage is a stable time complexity of O(n log n), making it ideal for large datasets. Although the implementation is slightly complex, the logic is clear, and it remains a classic choice in sorting algorithms.