Hash Table

Say you have a list of students and their phone numbers. You want to find Riya’s number fast. If you keep them in a plain array, you have to walk through every student one by one until you reach Riya. With a million students that is slow. A hash table fixes this. A hash table is a data structure that stores key-value pairs. It lets you find a value by its key almost instantly.

In this tutorial we build one from scratch. We will store keys like “riya” and values like a phone number. Then we add three operations. They are insert, search, and delete.

🪣 The Real-World Picture

Think of a wall of post office boxes. Each box has a number. When a letter comes in, the clerk does a quick calculation on the name. That tells the clerk which box the letter goes into. Next time you want that letter, you do the same calculation. Then you walk straight to that one box and grab it. You do not check every box.

A hash table works the same way:

  • The boxes are slots in an array. We call each slot a bucket.
  • The “quick calculation” is the hash function. A hash function turns a key into a number.
  • That number tells us which bucket to use.

🤔 What Is Hashing and Why We Need It

Hashing means turning a key into a small number that we use as an array index. That number is called the hash. Because we jump straight to an index, we do not search the whole array. That is why a hash table is fast.

Here is the small problem we have to handle. Two different keys can sometimes produce the same bucket number. That clash is called a collision. We cannot just throw away one value. So we need a plan.

The plan we use here is called chaining. With chaining, each bucket does not hold one item. It holds a small list of items. So two keys can share a bucket. Both items live in that bucket’s list. To find a key, you go to its bucket. Then you check the few items in that small list.

key: riya

hash function

index 7

bucket 7

riya | 98765

dan | 33445

In the picture “riya” and “dan” both landed in bucket 7. They sit together in that bucket’s list. No data is lost. Later in this tutorial the runnable code inserts these exact two keys. You will see them collide and chain for real.

🧮 The Hash Function

Our hash function is simple. We take each character of the key. We add up its character codes. Then we take the remainder when divided by the number of buckets. The remainder is always a valid bucket index.

The remainder trick uses the modulo operator (%). If we have 10 buckets, then sum % 10 is always a number from 0 to 9. So it always points at a real bucket.

Note

A real hash table uses a much stronger hash function so keys spread out evenly. Ours is kept simple so you can see exactly what happens. The idea is identical.

🛠️ Steps to Build the Hash Table

Before we look at code, here is the plan in order.

  1. Create an array of buckets. Each bucket starts as an empty list.
  2. Write the hash function. It turns a key into a bucket index.
  3. Insert(key, value): hash the key to get the bucket. If the key is already in that bucket, update its value. Otherwise add the new pair to the bucket’s list.
  4. Search(key): hash the key, go to that bucket, and scan its small list for the key. Return the value if found.
  5. Delete(key): hash the key, go to that bucket, find the key in the list, and remove that pair.

The important part is that insert, search, and delete all start the same way. Hash the key, then jump to one bucket. After that we only deal with the few items inside that bucket.

💻 Building It in Code

Below is the full hash table in all five languages. We create a table with a handful of buckets. We insert “riya” and “dan”, which both hash to bucket 7, so they share that bucket through chaining. We add “alex” too, which lands in a different bucket. Then we search for both colliding keys, delete one, and search again. You will see that “dan” is still found after “riya” is gone. That proves chaining kept both keys safe in the same bucket.

hash_table.py
SIZE = 10
class HashTable:
def __init__(self):
# Each bucket is a list of [key, value] pairs.
self.buckets = [[] for _ in range(SIZE)]
# Hash function: sum character codes, then take remainder by SIZE.
def hash(self, key):
total = 0
for ch in key:
total += ord(ch)
return total % SIZE
# Insert or update a key.
def insert(self, key, value):
index = self.hash(key)
for pair in self.buckets[index]:
if pair[0] == key: # key exists, update it
pair[1] = value
return
self.buckets[index].append([key, value]) # new pair in this bucket
# Search returns value, or -1 if the key is missing.
def search(self, key):
index = self.hash(key)
for pair in self.buckets[index]:
if pair[0] == key:
return pair[1]
return -1
# Delete removes a key from its bucket.
def delete(self, key):
index = self.hash(key)
bucket = self.buckets[index]
for i in range(len(bucket)):
if bucket[i][0] == key:
bucket.pop(i)
return
table = HashTable()
table.insert("riya", 98765) # riya and dan both hash to bucket 7
table.insert("dan", 33445) # they chain together in the same bucket
table.insert("alex", 30099)
print("Search riya:", table.search("riya"))
print("Search dan:", table.search("dan"))
table.delete("riya") # remove riya, dan stays in the same bucket
print("Search riya after delete:", table.search("riya"))
print("Search dan after delete:", table.search("dan"))

The output of the above code will be:

Search riya: 98765
Search dan: 33445
Search riya after delete: -1
Search dan after delete: 33445

Tip

Notice all five versions do the same thing. Hash the key to pick a bucket, then work inside that one small bucket. The language changes but the idea stays the same.

⏱️ How Fast Is It?

Now let us talk about speed, which is the part students always ask about.

In the normal case the buckets are spread out evenly. Each bucket holds very few items. So when you insert, search, or delete, you hash once and check one or two items. That is constant time, written O(1). Constant time means the work does not grow as you add more keys.

But there is a bad case. Suppose your hash function is weak and every key lands in the same bucket. Now that one bucket holds all n items in a long list. To find a key you walk the whole list. That is O(n), the same slow speed as a plain array.

So the speed depends on the buckets staying spread out. A good hash function keeps them spread out. That keeps you in the fast O(1) case.

Caution

The worst case O(n) happens only with a poor hash function or a very full table. Pick a good hash function and keep the table from getting too crowded, and you stay in the fast average case.

Here is the full speed picture.

Operation Average Case Worst Case
Insert O(1) O(n)
Search O(1) O(n)
Delete O(1) O(n)

⚠️ Common Mistakes

A few things trip people up when they build their first hash table.

  • Forgetting collisions. Two keys can share a bucket. If your bucket holds only one item, you lose data. Always let a bucket hold a small list.
  • Not checking for an existing key on insert. If you skip that check, inserting the same key twice creates two copies. Then search may return the old value.
  • Returning the wrong “not found” signal. Decide on one missing-key answer (here we use -1) and use it everywhere so the caller knows the key was absent.
  • Using a hash function that ignores part of the key. If two different keys hash the same way too often, you get many collisions and slow searches.

🧩 What You’ve Learned

✅ A hash table stores key-value pairs and finds them fast using a hash function.

✅ A hash function turns a key into a bucket index using a sum of character codes and the modulo operator.

✅ Collisions happen when two keys land in the same bucket, and chaining solves this by keeping a small list in each bucket.

✅ Insert, search, and delete all start by hashing the key to jump to one bucket.

✅ The speed is O(1) on average and O(n) in the worst case when too many keys collide.

Check Your Knowledge

Test what you learned. Pick an answer for each question, then click Check.

  1. 1

    What is the main job of a hash function in a hash table?

    Why: The hash function converts a key into a number that points to a bucket, so we can jump straight to it.

  2. 2

    What is a collision in a hash table?

    Why: A collision is when two different keys produce the same bucket index, and chaining lets both live in that bucket.

  3. 3

    How does chaining handle a collision?

    Why: With chaining each bucket holds a small list, so several colliding pairs can sit together without losing data.

  4. 4

    When does a hash table fall to O(n) worst-case time?

    Why: If a weak hash function piles every key into one bucket, that bucket becomes a long list you must scan fully.

🚀 What’s Next?

Now you know how a hash table works inside. Go deeper on the two ideas that make it fast and reliable.

  • Hash Functions — learn how to design a hash function that spreads keys out evenly.
  • Collision Handling — explore other ways to deal with collisions beyond chaining.

Share & Connect