The Day I Learned SQL Doesn’t Care About My Feelings

Last Thursday, right after standup, one of our senior engineers walked over to my desk.

“Got an hour?” he asked.
“Let’s look at how queries actually run.”

No alarms. No urgency. Just curiosity.

What followed wasn’t a lecture or a code review. It was a quiet, focused session—one of those conversations that subtly rewires how you think about a core tool you use every day.

We didn’t talk about syntax.
We didn’t talk about ORMs.

We talked about how the database thinks.

This is what I learned.

The Ghost in the Machine

What Is the Query Planner, Really?

When you write SQL, you’re not writing instructions—you’re writing a wish list.


SELECT *
FROM orders
WHERE customer_id = 42;

This doesn’t tell PostgreSQL how to find those rows.
It just says what you want.

The query planner is the invisible architect that figures out the how.

It evaluates multiple strategies:

  • Should I scan the whole table?
  • Should I use an index?
  • In what order should I join tables?

Then it picks the cheapest option based on:

  • Statistics
  • Row estimates
  • Available indexes
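
You can ask PostgreSQL which strategy it picked. A minimal sketch using EXPLAIN (the index name and cost numbers below are illustrative, not taken from a real database):

EXPLAIN
SELECT *
FROM orders
WHERE customer_id = 42;

-- Illustrative output; yours will differ:
-- Index Scan using orders_customer_id_idx on orders
--   (cost=0.43..8.45 rows=3 width=64)
--   Index Cond: (customer_id = 42)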

The Catch

The planner doesn’t always get it right.

It makes educated guesses using:

  • Outdated statistics
  • Incomplete information
  • Assumptions about data distribution
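
Stale statistics are the easiest of the three to fix. Running ANALYZE (which autovacuum normally does for you anyway) refreshes the planner's view of the data:

-- Refresh statistics for one table
ANALYZE users;

-- Or for every table in the current database
ANALYZE;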

Two queries that look similar can behave very differently.

-- Query A
SELECT * FROM users WHERE email = 'alice@example.com';

-- Query B
SELECT * FROM users WHERE status = 'active';

If email has an index and status doesn’t:

  • Query A → instant
  • Query B → scans millions of rows

The planner knows this.
You need to know it too.

Key Insight
The planner is cost-based, not rule-based.
Having an index does not guarantee it’ll be used.

If status = 'active' matches 90% of rows, PostgreSQL may ignore your beautiful index entirely.
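
You can watch this happen. A sketch, assuming a hypothetical index on status and a table where roughly 90% of rows are active:

CREATE INDEX idx_users_status ON users (status);  -- hypothetical index

EXPLAIN SELECT * FROM users WHERE status = 'active';

-- Likely output (illustrative): the index is ignored
-- Seq Scan on users  (cost=0.00..35811.00 rows=900000 width=120)
--   Filter: (status = 'active'::text)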

CTEs: The Helper You Trust a Bit Too Much

What’s a CTE?

A Common Table Expression (CTE) is a named subquery you can reference later.


WITH recent_orders AS (
  SELECT user_id, order_id, total_amount
  FROM orders
  WHERE created_at > NOW() - INTERVAL '7 days'
)

SELECT user_id,
  COUNT(*) AS order_count,
  SUM(total_amount) AS total_spent
FROM recent_orders
GROUP BY user_id;

Developers love CTEs because they:

  • Improve readability
  • Replace deeply nested subqueries
  • Make complex logic feel clean and logical

But here’s what many junior devs miss:

CTEs can lie to the optimizer.

Optimization Fences (The Old Days)

In PostgreSQL versions before 12, CTEs were always materialized.

That means:

  1. PostgreSQL executes the CTE
  2. Stores the result
  3. Uses that stored result—even if it blocks better optimizations

Modern PostgreSQL is smarter, but materialization still matters.

When the Database Takes Your CTE Literally

What Is Materialization?

Materialization means:

  • The CTE runs once
  • Results are written to memory (or disk)
  • The rest of the query treats it like a physical table

When It’s Good


WITH expensive_calculation AS (
  SELECT product_id, AVG(rating) AS avg_rating
  FROM reviews
  WHERE review_date > '2024-01-01'
  GROUP BY product_id
)

SELECT p.name, ec.avg_rating
FROM products p
JOIN expensive_calculation ec
  ON p.id = ec.product_id
WHERE ec.avg_rating > 4.5;

If expensive_calculation is reused multiple times:

  • Compute once
  • Reuse everywhere
  • 👍 Worth it
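
Here's a sketch of that reuse, building on the query above; the above-average filter is invented for illustration. The aggregation over reviews runs once, and both references read the stored result:

WITH expensive_calculation AS MATERIALIZED (
  SELECT product_id, AVG(rating) AS avg_rating
  FROM reviews
  WHERE review_date > '2024-01-01'
  GROUP BY product_id
)
SELECT p.name, ec.avg_rating
FROM products p
JOIN expensive_calculation ec
  ON p.id = ec.product_id
WHERE ec.avg_rating > (SELECT AVG(avg_rating) FROM expensive_calculation);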

When It’s Terrible


WITH all_users AS (
  SELECT * FROM users -- 10 million rows
)

SELECT * FROM all_users
WHERE account_type = 'premium';

Here:

  • PostgreSQL materializes 10 million rows
  • Then applies the filter

Without the CTE, PostgreSQL could:

  • Push the filter down
  • Use an index

  • Avoid loading everything
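
Often the simplest fix is to drop the wrapper entirely so the filter reaches the base table (and, assuming an index on account_type exists, it can now be used):

-- Same result, no 10-million-row detour
SELECT * FROM users
WHERE account_type = 'premium';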

PostgreSQL 12+ Controls

-- Force materialization
WITH active_users AS MATERIALIZED (...)

-- Let the planner inline it
WITH active_users AS NOT MATERIALIZED (...)

Rule of Thumb

  • Used once + downstream filters → NOT MATERIALIZED
  • Expensive + reused → MATERIALIZED
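
Applied to the earlier all_users example, the hint looks like this; with NOT MATERIALIZED the planner is free to inline the CTE and push the filter down (in PostgreSQL 12+ that is also the default for a CTE referenced only once):

WITH all_users AS NOT MATERIALIZED (
  SELECT * FROM users
)
SELECT * FROM all_users
WHERE account_type = 'premium';
-- Effectively planned like:
-- SELECT * FROM users WHERE account_type = 'premium';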

Two Ways to Dance: Nested Loop vs Hash Join

Why Joins Decide Performance

Joins are where queries live or die.

PostgreSQL chooses join strategies based on:

  • Data size
  • Indexes
  • Available memory

Nested Loop Join

Conceptually:

for each row in outer_table:
    find matching rows in inner_table

Example:


SELECT o.order_id, c.name
FROM orders o
JOIN customers c
  ON o.customer_id = c.id
WHERE o.created_at = '2025-01-10';

If:

  • The date filter returns ~50 rows
  • id is indexed

→ Perfect use case.

When it’s good

  • Small outer table
  • Indexed inner table
  • High selectivity

When it’s a disaster

  • Large outer table
  • No index
  • Two big tables joined blindly
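
In EXPLAIN ANALYZE output, the number to watch is loops on the inner side. A sketch of the healthy case for the query above (plan abbreviated, numbers invented):

EXPLAIN ANALYZE
SELECT o.order_id, c.name
FROM orders o
JOIN customers c
  ON o.customer_id = c.id
WHERE o.created_at = '2025-01-10';

-- Nested Loop                       (actual rows=50 loops=1)
--   -> Index Scan on orders o       (actual rows=50 loops=1)
--   -> Index Scan on customers c    (actual rows=1  loops=50)
-- loops=50 on the inner scan: it ran once per outer row.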

Hash Join


SELECT u.name, COUNT(p.post_id) AS post_count
FROM users u
JOIN posts p 
  ON u.id = p.author_id
GROUP BY u.name;

How it works:

  1. Build a hash table in memory
  2. Probe it for matches

Time complexity: O(n + m)

When it’s good

  • Large tables
  • Equality joins
  • Enough work_mem

The catch
If the hash doesn’t fit in memory → disk spill → slow query.
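
EXPLAIN ANALYZE shows whether the hash stayed in memory: look at the Batches figure on the Hash node (Batches: 1 means it fit; more means a spill). work_mem is the knob, and the value below is only an example:

EXPLAIN (ANALYZE, BUFFERS)
SELECT u.name, COUNT(p.post_id) AS post_count
FROM users u
JOIN posts p
  ON u.id = p.author_id
GROUP BY u.name;
-- Look for:  Hash ... Buckets: ... Batches: 1 Memory Usage: ...

-- Give this session more room for hashing (example value)
SET work_mem = '256MB';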

Breaking the Fourth Wall: LATERAL Joins

What Is LATERAL?

A LATERAL join allows a subquery to reference columns from earlier tables.

Perfect for Top-N per group problems.


SELECT c.name, recent.order_id, recent.created_at
FROM customers c
CROSS JOIN LATERAL (
  SELECT order_id, created_at
  FROM orders
  WHERE orders.customer_id = c.id
  ORDER BY created_at DESC
  LIMIT 3
) recent;

Why it’s powerful

  • Clean syntax
  • Avoids window-function complexity
  • Planner can optimize it well

Why it’s risky

  • Subquery runs once per outer row
  • 100k customers → 100k subqueries

Use it when

  • Outer table is small or heavily filtered
  • Proper indexes exist
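
The index that makes this pattern cheap mirrors the subquery's filter and sort; the name below is illustrative:

-- Each per-customer lookup can then read just its newest three orders
CREATE INDEX idx_orders_customer_created
  ON orders (customer_id, created_at DESC);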

Why GROUP BY Is Quietly Expensive

The Hidden Cost of Aggregation

GROUP BY does more than it looks:

  1. Groups rows
  2. Applies aggregates
  3. Returns one row per group

Two strategies:

Hash Aggregation

  • Fast
  • Memory-heavy

Group Aggregation

  • Sorts first
  • Slower without pre-sorted data

SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

If there are 50 departments → tiny hash table → efficient.

But grouping by a unique column?
You’re rebuilding the entire table in memory.

Optimization tip
Only group by what you need.
Group by IDs, join names later.

Indexes that match GROUP BY columns can eliminate sorting entirely.
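
Put together, that might look like the sketch below; the departments table and department_id column are assumptions for the example:

-- Group on the compact ID, attach the names afterwards
SELECT d.name, e.employee_count
FROM (
  SELECT department_id, COUNT(*) AS employee_count
  FROM employees
  GROUP BY department_id
) e
JOIN departments d
  ON d.id = e.department_id;

-- An index on the grouped column can feed rows to the aggregate
-- already sorted, so no separate sort step is needed
CREATE INDEX idx_employees_department_id ON employees (department_id);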

The Autopsy Report

Why EXPLAIN ANALYZE Is Your Best Friend

You can’t optimize what you don’t measure.


EXPLAIN (ANALYZE, BUFFERS)
SELECT department, COUNT(*) FROM employees
WHERE hire_date > '2024-01-01'
GROUP BY department;

Key things to watch:

  • Seq Scan on large tables → missing index
  • Estimated vs actual rows → stale statistics
  • Nested Loop with many loops → danger
  • High buffer reads → disk I/O

The most important line:

actual time=45.231..45.298 rows=48 loops=1

If loops = 10,000, even “fast” steps become slow.
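
The arithmetic is the reason. An invented fragment:

Index Scan using idx_orders_customer (actual time=0.010..0.012 rows=2 loops=10000)

About 0.012 ms per loop times 10,000 loops is roughly 120 ms spent in that single node.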

Before vs After

CREATE INDEX idx_hire_date ON employees (hire_date);

Rerun:

Index Scan using idx_hire_date (actual time=0.045..12.123)

45ms → 12ms

Measurement wins.

Closing Thoughts: Thinking Like the Database

What stayed with me wasn’t a trick or command—it was a shift in perspective.

SQL isn’t a black box.
Performance issues aren’t random.

Every slow query is the result of a logical decision made by the planner using the information it had.

Senior developers don’t fight the database.
They listen to it.

They:

  • Open EXPLAIN ANALYZE
  • Read the plan top to bottom
  • Ask simple questions:
    • Why this scan?
    • Why this join?
    • Why this estimate?

Then they change one thing and measure again.

That’s the real skill.

Not memorizing syntax—but learning how to have a conversation with the planner.

And once you do, SQL stops hurting your feelings.
