Hash Functions

Say you have a name like β€œRiya” and you want to store it in an array. But arrays only understand numbers. You can’t tell an array β€œput this at slot Riya”. So how do you turn a word into a slot number? That is the exact job of a hash function. A hash function takes a key and gives back a number. You use that number as an array index.

πŸ“š What Is a Hash Function?

A hash function is just a small piece of code. You give it a key. A key is the thing you want to store, like a name or a word. The function gives you back a number. That number tells you which slot in the array to use.

So the flow is simple:

  • You start with a key. This is the thing you want to store, like β€œRiya”.
  • The hash function reads that key.
  • It returns an index. This is a plain number, like 3.
  • You store the value at that index in the array.

The array we store things in is called a hash table. It is just a normal array. The only difference is that we decide the slot using a hash function. We do not pick the slot ourselves.

Key: Apple

Hash Function

Index: 3

Store at slot 3

🏠 A Real-World Way to Think About It

Think about a big locker room at a gym. There are 100 lockers. When you walk in, the staff does not let you pick any locker you like. They look at the first letter of your name. Then they send you to a fixed locker.

So β€œArjun” always goes to the same locker. Every single time. You never search around. You go straight there.

That is hashing. The staff’s rule is the hash function. The locker number is the index. The rule always sends β€œArjun” to the same place. So you can find your stuff fast later.

Note

The big win of hashing is speed. Once the hash function tells you the index, you go straight to that slot. You do not check every item one by one. That is why hash tables are so fast for lookups.

🎯 Why We Even Need a Hash Function

Here is the pain. Imagine you store names in a normal array with no rule. Later you want to find β€œRiya”. You have to walk through the array. You compare every name until you find it. If there are a million names, that is slow.

The hash function fixes this. You do not search at all. You just compute the index from the key and jump straight there.

But the hash function has to follow some rules to be useful. Let us look at those next.

βœ… What Makes a Good Hash Function

A hash function is not just any random math. A good one has a few clear qualities.

  • It is fast. You call it on every store and every lookup. So it must run quickly. No heavy work inside.
  • It is consistent. The same key must always give the same index. Say β€œRiya” gives 3 today and 7 tomorrow. Then you can never find your data again.
  • It spreads keys evenly. Different keys should land in different slots as much as possible. You do not want everyone crowding into slot 0.

That last point is the one people get wrong. Sometimes two different keys land on the same index. We call that a collision. A collision is when two keys want the same slot. Some collisions are normal. But a good hash function keeps them rare. It does this by spreading keys all over the table.

Tip

Here is a simple test for even spreading. Imagine throwing 100 keys at a table of 100 slots. A good hash function fills the slots roughly one each. A bad one dumps 90 keys into 10 slots and leaves the rest empty.

πŸ”’ A Simple Hash Function, Step by Step

Let us build the most common beginner hash function. It works on strings. The idea has two small steps.

First, turn the string into a number. Every character has a number code behind it. For example, β€œA” is 65 and β€œB” is 66, and so on. This is called the character code. So we walk through the string and add up all the character codes. Now we have one big number for the whole string.

Second, that big number could be huge. But our table only has, say, 10 slots. So we use modulo to fold it into range. Modulo is the remainder after division. We do sum % tableSize. The remainder is always between 0 and tableSize minus 1. That is perfect for an index.

So the full recipe is:

  1. Start a running total at 0.
  2. For each character in the key, add its character code to the total.
  3. Return total % tableSize.

Let us trace it by hand for the word β€œcat” with a table size of 10. The codes are c = 99, a = 97, t = 116. The sum is 99 + 97 + 116 = 312. Then 312 % 10 = 2. So β€œcat” goes to slot 2.

πŸ’» The Hash Function in Code

Here is that same hash function in all five languages. Each one takes a string and a table size. Then it returns an index. We test it with a few words so you can see the numbers it produces.

hash_function.py
# Add up character codes, then fold into table range with modulo
def hash_key(key, table_size):
total = 0
for ch in key:
total += ord(ch) # ord gives the character code
return total % table_size # squeeze into 0..table_size-1
table_size = 10
print("cat ->", hash_key("cat", table_size))
print("dog ->", hash_key("dog", table_size))
print("Riya ->", hash_key("Riya", table_size))

The output of the above code will be:

cat -> 2
dog -> 4
Riya -> 5

Note

See how each word gets a steady index? Run it again and β€œcat” is still 2. That is the consistent part. Same key, same index, every time.

⚠️ When a Hash Function Is Bad

Now let us see the other side. A bad hash function still returns a number. But it returns bad numbers. They pile up in a few slots.

Take our sum-of-codes idea but with a tiny twist. What if you only used the first letter of the string? Then β€œcat”, β€œcar”, and β€œcup” all start with β€œc”. They all give the same index. Three keys, one slot. That is a lot of collisions right away.

Here is what bad hash functions usually do wrong:

  • They ignore most of the key. Using only the first letter throws away information. Less information means more keys clashing.
  • They pick a weak table size. Say your table size shares factors with your sums. Then keys bunch up. Many people use a prime number as the table size. A prime spreads things better.
  • They return the same index too often. When most keys map to a handful of slots, your fast lookup turns slow again. You are back to searching long chains.

Caution

A collision is not a crash. The hash table keeps working. But every collision adds extra work. Now two keys share a slot, so the table has to tell them apart. Too many collisions, and your hash table is no faster than a plain list.

Here is a small picture of the difference. On the left, a good hash function spreads keys out. On the right, a bad one crowds them.

Bad

cat

Slot 0

dog

Riya

Good

cat

Slot 2

dog

Slot 4

Riya

Slot 5

πŸ“Š Good vs Bad Hash Function

Here is a quick side-by-side so the qualities stick.

Quality Good Hash Function Bad Hash Function
Speed Runs fast on every call Does heavy work each time
Consistency Same key gives same index Index can change for same key
Spreading Keys land all over the table Keys pile into a few slots
Collisions Rare Very common

🧩 What You’ve Learned

  • βœ… A hash function takes a key and returns a number you use as an array index.
  • βœ… The array we store into is called a hash table.
  • βœ… A good hash function is fast, gives the same index for the same key, and spreads keys evenly.
  • βœ… A common beginner hash sums the character codes of a string, then takes sum % tableSize.
  • βœ… When two keys land on the same index, that is a collision.
  • βœ… A bad hash function causes many collisions, which makes lookups slow again.

Check Your Knowledge

Test what you learned. Pick an answer for each question, then click Check.

  1. 1

    What does a hash function return?

    Why: A hash function turns a key into a number that tells you which slot of the array to use.

  2. 2

    Why do we use modulo (sum % tableSize) in a string hash function?

    Why: Modulo gives the remainder, which always falls inside the table's index range.

  3. 3

    Which is a quality of a GOOD hash function?

    Why: Spreading keys evenly keeps collisions rare and lookups fast.

  4. 4

    What is a collision?

    Why: A collision happens when two different keys are sent to the same slot by the hash function.

πŸš€ What’s Next?

Share & Connect