How to Approach Finding Top K Frequent Elements in Python

What does “Top K Frequent” mean?

Given an array of integers nums and an integer k,
the goal is to find the k elements that appear most frequently in the array.

For example:

nums = [1, 1, 1, 2, 2, 3]
k = 2

The result should be:

[1, 2]

Because `1` appears 3 times and `2` appears 2 times.

First thought: count everything

No matter which approach we use, it's unavoidable to count how many times each number appeared in nums. Python's dictionary is perfect for this.

from collections import defaultdict

freq = defaultdict(int)
for num in nums:
    freq[num] += 1

Now, freq should have each number as key and its frequency as value. It looks like this: Now freq looks like:

{
    1: 3,
    2: 2,
    3: 1
}

Solution 1: Sort by frequency (`O(n log n)`)

The most straightforward idea is:

Count frequencies
Sort numbers by frequency
Take the first k

from collections import defaultdict
from typing import List

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        freq = defaultdict(int)
        for num in nums:
            freq[num] += 1

        arr = sorted(freq.items(), key=lambda item: item[1], reverse=True)

        result = []
        for i in range(k):
            result.append(arr[i][0])
        return result

Why this works

freq.items() produces (number, frequency) pairs
key=lambda item: item[1] sorts by frequency
reverse=True sorts in descending order

Complexity

Time: O(n log n) due to sorting
Space: O(n) for the dictionary and array

Solution 2: Using a Heap (`O(n + k log n)`)

Sorting everything is sometimes unnecessary.
We only care about the top k, not the full order.

This is where a heap helps.

Python’s heapq is a min-heap, so we store negative frequencies to simulate a max-heap.

import heapq
from collections import defaultdict
from typing import List

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        freq = defaultdict(int)
        for num in nums:
            freq[num] += 1

        heap = []
        for num, count in freq.items():
            heap.append((-count, num))

        heapq.heapify(heap)

        result = []
        for _ in range(k):
            result.append(heapq.heappop(heap)[1])

        return result

Key idea

Each heap element is (-frequency, number)
The smallest element in the heap actually represents the largest frequency
heappop() removes the most frequent element each time

Complexity

Time: O(n + k log n)
Space: O(n)

Solution 3: Bucket Sort (`O(n)`)

There’s an even faster approach if we notice this:

The maximum possible frequency of any number is len(nums).

So we can create an array where:

index = frequency
value = list of numbers with that frequency

from collections import defaultdict
from typing import List

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        freq = defaultdict(int)
        for num in nums:
            freq[num] += 1

        buckets = [[] for _ in range(len(nums) + 1)]
        for num, count in freq.items():
            buckets[count].append(num)

        result = []
        for i in range(len(buckets) - 1, 0, -1):
            for num in buckets[i]:
                result.append(num)
                if len(result) == k:
                    return result

Why this works

Frequencies are bounded → perfect for bucket indexing
We iterate from highest frequency down
No sorting or heap needed

Complexity

Time: O(n)
Space: O(n)

Python syntax notes used in this problem

`defaultdict(int)`

Automatically initializes missing keys with 0
Ideal for frequency counting

`sorted(freq.items(), key=lambda item: item[1], reverse=True)`

freq.items() produces (number, frequency) tuples
lambda item: item[1] extracts the frequency from each tuple and uses it as the sorting key
reverse=True sorts the result in descending order

`(-value, key)` in heap

Python heap is a min-heap
Negating values simulates a max-heap

List initialization with comprehension

[[] for _ in range(n)]

Creates n independent empty lists, because a new list is created on each iteration
Modifying one sublist does not affect the others
Avoids the common bug from [[]] * n, where all elements reference the same list object

Backward iteration

for i in range(n - 1, 0, -1):

Iterates from large to small
Useful when higher indices has higher priority

Comparing the approaches

Method	Time Complexity	Space	Notes
Sorting	O(n log n)	O(n)	Simple and readable
Heap	O(n + k log n)	O(n)	Efficient when `k` is small
Bucket Sort	O(n)	O(n)	Fastest, but less intuitive

Final thoughts

Hashmaps are essential whenever counting is involved
Heaps are great when you need top / min / max repeatedly
Bucket sort is helpful when value ranges are limited
Understanding multiple solutions matters more than memorizing one!

What does “Top K Frequent” mean?

Because 1 appears 3 times and 2 appears 2 times.

First thought: count everything

Solution 1: Sort by frequency (O(n log n))

Why this works

Complexity

Solution 2: Using a Heap (O(n + k log n))

Key idea

Complexity

Solution 3: Bucket Sort (O(n))

Why this works

Complexity

Python syntax notes used in this problem

defaultdict(int)

sorted(freq.items(), key=lambda item: item[1], reverse=True)

(-value, key) in heap

List initialization with comprehension

Backward iteration

Comparing the approaches

Final thoughts

Because `1` appears 3 times and `2` appears 2 times.

Solution 1: Sort by frequency (`O(n log n)`)

Solution 2: Using a Heap (`O(n + k log n)`)

Solution 3: Bucket Sort (`O(n)`)

`defaultdict(int)`

`sorted(freq.items(), key=lambda item: item[1], reverse=True)`

`(-value, key)` in heap