Database Selection Strategy
Table of Contents + −
You’re cruising through a system design interview. You’ve drawn your boxes, you’ve got a load balancer, some app servers, and then you draw one more box and label it “Database”. The interviewer leans in and asks:
- “Okay, which database would you use here?”
- “And why that one?”
And suddenly your mind goes blank, right? You know a few names. MySQL, MongoDB, Redis, maybe Cassandra. But which one fits this problem? In this lesson we’ll build a simple way to pick a database on purpose, and to explain your choice out loud like an engineer who’s done it before.
🎯 There’s No Default Answer
The first thing to get straight is that there’s no database that’s always right. People want a safe default they can name every time, but that’s exactly the trap:
- The “best” database depends entirely on your data and how you use it.
- A choice that’s perfect for a banking app can be a terrible fit for a chat app.
- So the interviewer isn’t checking which database you like. They’re checking whether you can match a database to the problem in front of you.
This builds directly on the SQL vs NoSQL topic, which compares the two big families. Here we go one step further and turn that comparison into an actual decision you can defend.
Say it depends, then explain
If you blurt out one database name instantly, you look like you memorized it. The strong move is to say “it depends on the data, let me think through it” and then walk the interviewer through your reasoning. The reasoning is the answer they want.
🔍 Start With Access Patterns
Before you name a single database, look at how the data gets used. This is called the access pattern, and getting clear on it does more for your choice than anything else:
- An access pattern is simply how you read and write the data. What questions you ask of it, and how often.
- Are you looking up one thing by its ID, or searching across millions of rows? Are you mostly reading, or mostly writing?
- Different databases are built for different patterns, so once you know the pattern, the right tool often becomes obvious.
Here’s the thing, two systems can store the exact same data and still need different databases, just because they use it differently:
- A store that looks up an order by its ID a million times a day reads in a simple, predictable way.
- A reports tool that scans every order to add up monthly sales reads in a totally different way.
- Same orders, different access patterns, and that difference can point you to two different databases.
Access pattern in one line
An access pattern is just the shape of how your app touches the data: read or write, one record or many, by key or by search. Nail this down first and the rest of the decision gets a lot easier.
🧩 Key Questions to Ask
So how do you figure out the access pattern and everything around it? You ask a few simple questions, the same way you’d clarify requirements at the start of any design. Here are the ones that matter:
- Is the data structured with relationships? Structured means it fits neatly into tables with fixed columns, like rows in a spreadsheet. Relationships mean records link to each other, like an order belonging to a customer. If yes, that leans toward a relational database.
- Do you need strong consistency or transactions? Strong consistency means everyone always sees the latest data the instant it’s written, no stale copies. A transaction means a group of changes all succeed together or all fail together, like moving money between two accounts. Money systems need this.
- How much scale are we talking? Scale here means the amount of data and traffic. Thousands of records is one world, billions is another. Huge scale pushes you toward databases built to spread across many machines.
- Is it read-heavy or write-heavy? Read-heavy means far more people fetch data than change it, like a news site. Write-heavy means lots of new data pours in constantly, like sensors logging readings. This shapes what you optimize for.
Notice these are the same flavors of question from requirements gathering. You’re just pointing them at the data. Answer these four and the database almost picks itself.
🗄️ Matching Needs to Databases
Now let’s turn those answers into a pick. Each kind of database is good at a certain job, so once you know your need, you can map it to a type. Here’s the quick mental flow you can run in your head:
Walk that flow out loud in the interview and you’ll never freeze. Now here’s the same idea as a lookup table, mapping a common need to the database type that fits it best.
| What you need | Database type | Why it fits |
|---|---|---|
| Structured data, relationships, transactions | Relational (SQL) | Strict tables and rock-solid transactions, like orders and payments |
| Flexible or nested records, fast changes | Document | Stores whole objects without a fixed shape, like a user profile |
| Super-fast lookups by a key, caching | Key-value | Grabs a value by its key in a blink, like a session or a cache |
| Heavy analytics over huge data | Column store | Reads one column across billions of rows fast, like reports |
A quick word on each type so the table actually means something:
- Relational, also called SQL. Keeps data in strict tables with fixed columns and links between them. It’s the gold standard when correctness and relationships matter.
- Document. Stores each record as a flexible document, like a little bundle of data that can hold nested bits inside it. Great when the shape of your data keeps changing.
- Key-value. The simplest of all. You hand it a key, it hands back the value, blazingly fast. Perfect for caches and sessions.
- Column store. Stores data by column instead of by row, which makes scanning one field across mountains of data quick. Built for analytics, not for everyday app reads.
You can dig into these in the database tutorials later. For the interview, knowing which job each type is built for is enough to make a confident pick.
🧪 A Worked Example
Let’s make this real. Imagine Alex is designing a shopping app and needs to store a few different things. Same app, but each piece has its own access pattern, so each gets its own database. Watch how the questions guide each choice:
- The orders table. Orders link to customers and products, and an order plus its payment must succeed together or not at all. That’s structured data with relationships and a real need for transactions. So Alex picks a relational database. Losing or half-saving an order is not an option.
- The social feed. Posts have likes, comments, images, tags, and the shape keeps growing as features get added. It’s flexible, nested, and read-heavy at huge scale. So Alex picks a document database, which happily stores each post as one flexible bundle and scales out.
- The session store. When a user logs in, the app needs to check “is this person still logged in?” on every single request, instantly, by a session ID. That’s a pure fast lookup by a key. So Alex picks a key-value store, which answers that in microseconds.
See the pattern? Alex never asked “what’s the best database?” once. For each piece, the answer fell out of its access pattern, consistency need, and scale. That’s exactly the reasoning to say out loud.
🧰 Using More Than One
That worked example just showed something important. One app used three different databases, and that’s completely normal. This idea has a name:
- It’s called polyglot persistence. “Polyglot” means speaking many languages, and “persistence” just means storing data. So it means storing your data across several kinds of databases.
- Real systems do this all the time, because no single database is great at everything.
- You use a relational database for the orders, a key-value store for sessions, maybe a column store for analytics, each doing the job it’s best at.
So in an interview, don’t feel you must crown one winner for the whole system. Saying “I’d use a relational database for orders and a key-value cache for sessions” shows mature thinking. The trade-off is more moving parts to run and keep in sync, so only reach for a new database when a real need calls for it.
More databases, more to manage
Every extra database is one more thing to deploy, back up, monitor, and keep consistent. Polyglot persistence is powerful, but each one you add has a real operational cost. Add a database because the data demands it, not because it sounds clever.
⚠️ Common Mistakes and Misconceptions
A few habits trip people up the moment this question comes. Let’s clear them out:
- “Always pick SQL” or “always pick NoSQL.” Neither is a universal answer. The right pick depends on the data and access pattern, so a fixed default will sometimes be flat wrong.
- “Ignore access patterns.” Choosing by gut feel, before thinking about how the data is read and written, is the biggest miss. The access pattern should drive the decision, not be an afterthought.
- “One database for everything.” Forcing every piece of data into a single database often means a poor fit somewhere. Different jobs can want different tools, and that’s fine.
- “Pick by hype.” Choosing a database because it’s trendy, not because it fits, is a classic blunder. The interviewer wants a reason rooted in the problem, not in what’s popular this year.
🛠️ Practice Challenge
Time to run this yourself. Imagine you’re designing a ride-sharing app, and you need to store these pieces of data:
- Trip records. Each trip links a rider, a driver, and a payment, and the money has to be exactly right.
- Driver locations. Live positions updated every few seconds, looked up instantly by driver ID.
- App settings per user. Flexible preferences that change shape as new features ship.
For each one, walk through the key questions on paper. Is it structured with relationships? Does it need transactions? How much scale? Read-heavy or write-heavy? Then name the database type you’d pick and say why in one sentence, just like Alex did. Do it out loud, as if an interviewer is listening, because that’s exactly the skill you’re building.
🧩 What You’ve Learned
You now have a repeatable way to pick a database and defend it. Here’s what you’ve picked up.
- ✅ There’s no default database. The right one depends on your data and how you use it.
- ✅ Start with the access pattern, how the data is read and written, since it drives the choice.
- ✅ Ask about structure and relationships, consistency and transactions, scale, and read-versus-write balance.
- ✅ Relational fits structured data with transactions, document fits flexible or nested data, key-value fits fast lookups, and column stores fit heavy analytics.
- ✅ Polyglot persistence is normal: real systems use several databases, each for its best job.
- ✅ Always justify your pick out loud, and name the cost of adding another database.
Check Your Knowledge
Test what you learned. Pick an answer for each question, then click Check.
- 1
What should you look at first when choosing a database?
Why: The access pattern drives the choice, because the same data can need different databases depending on how it is used.
- 2
When does a relational (SQL) database fit best?
Why: Relational databases shine with structured data, clear relationships, and reliable transactions, like orders and payments.
- 3
Which database type is best for instant lookups by a session ID?
Why: A key-value store hands back a value by its key in microseconds, which is perfect for sessions and caches.
- 4
What is polyglot persistence?
Why: It means storing data across several database types, such as relational for orders and key-value for sessions, each doing what it does best.
🚀 What’s Next?
You can now choose a database on purpose. Next, go deeper on the comparisons and the bigger picture.
- SQL vs NoSQL breaks down the two big database families side by side, so your picks rest on a solid foundation.
- Identifying Bottlenecks shows how to spot where a system, including its database, strains first as traffic grows.
Run the four questions on a few sample systems, and “which database, and why?” will turn from a scary opener into a checklist you can work through calmly.