- Published on
How to Approach Finding Top K Frequent Elements in Python
- Authors

- Name
- Ina Lee
What does “Top K Frequent” mean?
Given an array of integers nums and an integer k,
the goal is to find the k elements that appear most frequently in the array.
For example:
nums = [1, 1, 1, 2, 2, 3]
k = 2
The result should be:
[1, 2]
Because 1 appears 3 times and 2 appears 2 times.
First thought: count everything
No matter which approach we use, it's unavoidable to count how many times each number appeared in nums. Python's dictionary is perfect for this.
from collections import defaultdict
freq = defaultdict(int)
for num in nums:
freq[num] += 1
Now, freq should have each number as key and its frequency as value. It looks like this: Now freq looks like:
{
1: 3,
2: 2,
3: 1
}
Solution 1: Sort by frequency (O(n log n))
The most straightforward idea is:
- Count frequencies
- Sort numbers by frequency
- Take the first
k
from collections import defaultdict
from typing import List
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
freq = defaultdict(int)
for num in nums:
freq[num] += 1
arr = sorted(freq.items(), key=lambda item: item[1], reverse=True)
result = []
for i in range(k):
result.append(arr[i][0])
return result
Why this works
freq.items()produces(number, frequency)pairskey=lambda item: item[1]sorts by frequencyreverse=Truesorts in descending order
Complexity
- Time:
O(n log n)due to sorting - Space:
O(n)for the dictionary and array
Solution 2: Using a Heap (O(n + k log n))
Sorting everything is sometimes unnecessary.
We only care about the top k, not the full order.
This is where a heap helps.
Python’s heapq is a min-heap, so we store negative frequencies to simulate a max-heap.
import heapq
from collections import defaultdict
from typing import List
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
freq = defaultdict(int)
for num in nums:
freq[num] += 1
heap = []
for num, count in freq.items():
heap.append((-count, num))
heapq.heapify(heap)
result = []
for _ in range(k):
result.append(heapq.heappop(heap)[1])
return result
Key idea
- Each heap element is
(-frequency, number) - The smallest element in the heap actually represents the largest frequency
heappop()removes the most frequent element each time
Complexity
- Time:
O(n + k log n) - Space:
O(n)
Solution 3: Bucket Sort (O(n))
There’s an even faster approach if we notice this:
The maximum possible frequency of any number is len(nums).
So we can create an array where:
- index = frequency
- value = list of numbers with that frequency
from collections import defaultdict
from typing import List
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
freq = defaultdict(int)
for num in nums:
freq[num] += 1
buckets = [[] for _ in range(len(nums) + 1)]
for num, count in freq.items():
buckets[count].append(num)
result = []
for i in range(len(buckets) - 1, 0, -1):
for num in buckets[i]:
result.append(num)
if len(result) == k:
return result
Why this works
- Frequencies are bounded → perfect for bucket indexing
- We iterate from highest frequency down
- No sorting or heap needed
Complexity
- Time:
O(n) - Space:
O(n)
Python syntax notes used in this problem
defaultdict(int)
- Automatically initializes missing keys with
0 - Ideal for frequency counting
sorted(freq.items(), key=lambda item: item[1], reverse=True)
freq.items()produces(number, frequency)tupleslambda item: item[1]extracts the frequency from each tuple and uses it as the sorting keyreverse=Truesorts the result in descending order
(-value, key) in heap
- Python heap is a min-heap
- Negating values simulates a max-heap
List initialization with comprehension
[[] for _ in range(n)]
- Creates
nindependent empty lists, because a new list is created on each iteration - Modifying one sublist does not affect the others
- Avoids the common bug from
[[]] * n, where all elements reference the same list object
Backward iteration
for i in range(n - 1, 0, -1):
- Iterates from large to small
- Useful when higher indices has higher priority
Comparing the approaches
| Method | Time Complexity | Space | Notes |
|---|---|---|---|
| Sorting | O(n log n) | O(n) | Simple and readable |
| Heap | O(n + k log n) | O(n) | Efficient when k is small |
| Bucket Sort | O(n) | O(n) | Fastest, but less intuitive |
Final thoughts
- Hashmaps are essential whenever counting is involved
- Heaps are great when you need top / min / max repeatedly
- Bucket sort is helpful when value ranges are limited
- Understanding multiple solutions matters more than memorizing one!