The Day I Learned SQL Doesn’t Care About My Feelings

Last Thursday, right after standup, one of our senior engineers walked over to my desk.

“Got an hour?” he asked.
“Let’s look at how queries actually run.”

No alarms. No urgency. Just curiosity.

What followed wasn’t a lecture or a code review. It was a quiet, focused session—one of those conversations that subtly rewires how you think about a core tool you use every day.

We didn’t talk about syntax.
We didn’t talk about ORMs.

We talked about how the database thinks.

This is what I learned.

The Ghost in the Machine

What Is the Query Planner, Really?

When you write SQL, you’re not writing instructions—you’re writing a wish list.


SELECT *
FROM orders
WHERE customer_id = 42;

This doesn’t tell PostgreSQL how to find those rows.
It just says what you want.

The query planner is the invisible architect that figures out the how.

It evaluates multiple strategies:

  • Should I scan the whole table?
  • Should I use an index?
  • In what order should I join tables?

Then it picks the cheapest option based on:

  • Statistics
  • Row estimates
  • Available indexes
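
You can ask PostgreSQL which strategy it picked. A minimal sketch using EXPLAIN (the index name and cost numbers below are illustrative, not taken from a real database):

EXPLAIN
SELECT *
FROM orders
WHERE customer_id = 42;

-- Illustrative output; yours will differ:
-- Index Scan using orders_customer_id_idx on orders
--   (cost=0.43..8.45 rows=3 width=64)
--   Index Cond: (customer_id = 42)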

The Catch

The planner doesn’t always get it right.

It makes educated guesses using:

  • Outdated statistics
  • Incomplete information
  • Assumptions about data distribution
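
Stale statistics are the easiest of the three to fix. Running ANALYZE (which autovacuum normally does for you anyway) refreshes the planner's view of the data:

-- Refresh statistics for one table
ANALYZE users;

-- Or for every table in the current database
ANALYZE;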

Two queries that look similar can behave very differently.

-- Query A
SELECT * FROM users WHERE email = 'alice@example.com';

-- Query B
SELECT * FROM users WHERE status = 'active';

If email has an index and status doesn’t:

  • Query A → instant
  • Query B → scans millions of rows

The planner knows this.
You need to know it too.

Key Insight
The planner is cost-based, not rule-based.
Having an index does not guarantee it’ll be used.

If status = 'active' matches 90% of rows, PostgreSQL may ignore your beautiful index entirely.
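
You can watch this happen. A sketch, assuming a hypothetical index on status and a table where roughly 90% of rows are active:

CREATE INDEX idx_users_status ON users (status);  -- hypothetical index

EXPLAIN SELECT * FROM users WHERE status = 'active';

-- Likely output (illustrative): the index is ignored
-- Seq Scan on users  (cost=0.00..35811.00 rows=900000 width=120)
--   Filter: (status = 'active'::text)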

CTEs: The Helper You Trust a Bit Too Much

What’s a CTE?

A Common Table Expression (CTE) is a named subquery you can reference later.


WITH recent_orders AS (
  SELECT user_id, order_id, total_amount
  FROM orders
  WHERE created_at > NOW() - INTERVAL '7 days'
)

SELECT user_id,
  COUNT(*) AS order_count,
  SUM(total_amount) AS total_spent
FROM recent_orders
GROUP BY user_id;

Developers love CTEs because they:

  • Improve readability
  • Replace deeply nested subqueries
  • Make complex logic feel clean and logical

But here’s what many junior devs miss:

CTEs can lie to the optimizer.

Optimization Fences (The Old Days)

In PostgreSQL versions before 12, CTEs were always materialized.

That means:

  1. PostgreSQL executes the CTE
  2. Stores the result
  3. Uses that stored result—even if it blocks better optimizations

Modern PostgreSQL is smarter, but materialization still matters.

When the Database Takes Your CTE Literally

What Is Materialization?

Materialization means:

  • The CTE runs once
  • Results are written to memory (or disk)
  • The rest of the query treats it like a physical table

When It’s Good


WITH expensive_calculation AS (
  SELECT product_id, AVG(rating) AS avg_rating
  FROM reviews
  WHERE review_date > '2024-01-01'
  GROUP BY product_id
)

SELECT p.name, ec.avg_rating
FROM products p
JOIN expensive_calculation ec
  ON p.id = ec.product_id
WHERE ec.avg_rating > 4.5;

If expensive_calculation is reused multiple times:

  • Compute once
  • Reuse everywhere
  • 👍 Worth it
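
Here's a sketch of that reuse, building on the query above; the above-average filter is invented for illustration. The aggregation over reviews runs once, and both references read the stored result:

WITH expensive_calculation AS MATERIALIZED (
  SELECT product_id, AVG(rating) AS avg_rating
  FROM reviews
  WHERE review_date > '2024-01-01'
  GROUP BY product_id
)
SELECT p.name, ec.avg_rating
FROM products p
JOIN expensive_calculation ec
  ON p.id = ec.product_id
WHERE ec.avg_rating > (SELECT AVG(avg_rating) FROM expensive_calculation);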

When It’s Terrible


WITH all_users AS (
  SELECT * FROM users -- 10 million rows
)

SELECT * FROM all_users
WHERE account_type = 'premium';

Here:

  • PostgreSQL materializes 10 million rows
  • Then applies the filter

Without the CTE, PostgreSQL could:

  • Push the filter down
  • Use an index

  • Avoid loading everything
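
Often the simplest fix is to drop the wrapper entirely so the filter reaches the base table (and, assuming an index on account_type exists, it can now be used):

-- Same result, no 10-million-row detour
SELECT * FROM users
WHERE account_type = 'premium';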

PostgreSQL 12+ Controls

-- Force materialization
WITH active_users AS MATERIALIZED (...)

-- Let the planner inline it
WITH active_users AS NOT MATERIALIZED (...)

Rule of Thumb

  • Used once + downstream filters → NOT MATERIALIZED
  • Expensive + reused → MATERIALIZED
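
Applied to the earlier all_users example, the hint looks like this; with NOT MATERIALIZED the planner is free to inline the CTE and push the filter down (in PostgreSQL 12+ that is also the default for a CTE referenced only once):

WITH all_users AS NOT MATERIALIZED (
  SELECT * FROM users
)
SELECT * FROM all_users
WHERE account_type = 'premium';
-- Effectively planned like:
-- SELECT * FROM users WHERE account_type = 'premium';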

Two Ways to Dance: Nested Loop vs Hash Join

Why Joins Decide Performance

Joins are where queries live or die.

PostgreSQL chooses join strategies based on:

  • Data size
  • Indexes
  • Available memory

Nested Loop Join

Conceptually:

for each row in outer_table:
    find matching rows in inner_table

Example:


SELECT o.order_id, c.name
FROM orders o
JOIN customers c
  ON o.customer_id = c.id
WHERE o.created_at = '2025-01-10';

If:

  • The date filter returns ~50 rows
  • id is indexed

→ Perfect use case.

When it’s good

  • Small outer table
  • Indexed inner table
  • High selectivity

When it’s a disaster

  • Large outer table
  • No index
  • Two big tables joined blindly
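
In EXPLAIN ANALYZE output, the number to watch is loops on the inner side. A sketch of the healthy case for the query above (plan abbreviated, numbers invented):

EXPLAIN ANALYZE
SELECT o.order_id, c.name
FROM orders o
JOIN customers c
  ON o.customer_id = c.id
WHERE o.created_at = '2025-01-10';

-- Nested Loop                       (actual rows=50 loops=1)
--   -> Index Scan on orders o       (actual rows=50 loops=1)
--   -> Index Scan on customers c    (actual rows=1  loops=50)
-- loops=50 on the inner scan: it ran once per outer row.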

Hash Join


SELECT u.name, COUNT(p.post_id) AS post_count
FROM users u
JOIN posts p 
  ON u.id = p.author_id
GROUP BY u.name;

How it works:

  1. Build a hash table in memory
  2. Probe it for matches

Time complexity: O(n + m)

When it’s good

  • Large tables
  • Equality joins
  • Enough work_mem

The catch
If the hash doesn’t fit in memory → disk spill → slow query.
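
EXPLAIN ANALYZE shows whether the hash stayed in memory: look at the Batches figure on the Hash node (Batches: 1 means it fit; more means a spill). work_mem is the knob, and the value below is only an example:

EXPLAIN (ANALYZE, BUFFERS)
SELECT u.name, COUNT(p.post_id) AS post_count
FROM users u
JOIN posts p
  ON u.id = p.author_id
GROUP BY u.name;
-- Look for:  Hash ... Buckets: ... Batches: 1 Memory Usage: ...

-- Give this session more room for hashing (example value)
SET work_mem = '256MB';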

Breaking the Fourth Wall: LATERAL Joins

What Is LATERAL?

A LATERAL join allows a subquery to reference columns from earlier tables.

Perfect for Top-N per group problems.


SELECT c.name, recent.order_id, recent.created_at
FROM customers c
CROSS JOIN LATERAL (
  SELECT order_id, created_at
  FROM orders
  WHERE orders.customer_id = c.id
  ORDER BY created_at DESC
  LIMIT 3
) recent;

Why it’s powerful

  • Clean syntax
  • Avoids window-function complexity
  • Planner can optimize it well

Why it’s risky

  • Subquery runs once per outer row
  • 100k customers → 100k subqueries

Use it when

  • Outer table is small or heavily filtered
  • Proper indexes exist
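
The index that makes this pattern cheap mirrors the subquery's filter and sort; the name below is illustrative:

-- Each per-customer lookup can then read just its newest three orders
CREATE INDEX idx_orders_customer_created
  ON orders (customer_id, created_at DESC);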

Why GROUP BY Is Quietly Expensive

The Hidden Cost of Aggregation

GROUP BY does more than it looks:

  1. Groups rows
  2. Applies aggregates
  3. Returns one row per group

Two strategies:

Hash Aggregation

  • Fast
  • Memory-heavy

Group Aggregation

  • Sorts first
  • Slower without pre-sorted data

SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

If there are 50 departments → tiny hash table → efficient.

But grouping by a unique column?
You’re rebuilding the entire table in memory.

Optimization tip
Only group by what you need.
Group by IDs, join names later.

Indexes that match GROUP BY columns can eliminate sorting entirely.
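
Put together, that might look like the sketch below; the departments table and department_id column are assumptions for the example:

-- Group on the compact ID, attach the names afterwards
SELECT d.name, e.employee_count
FROM (
  SELECT department_id, COUNT(*) AS employee_count
  FROM employees
  GROUP BY department_id
) e
JOIN departments d
  ON d.id = e.department_id;

-- An index on the grouped column can feed rows to the aggregate
-- already sorted, so no separate sort step is needed
CREATE INDEX idx_employees_department_id ON employees (department_id);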

The Autopsy Report

Why EXPLAIN ANALYZE Is Your Best Friend

You can’t optimize what you don’t measure.


EXPLAIN (ANALYZE, BUFFERS)
SELECT department, COUNT(*) FROM employees
WHERE hire_date > '2024-01-01'
GROUP BY department;

Key things to watch:

  • Seq Scan on large tables → missing index
  • Estimated vs actual rows → stale statistics
  • Nested Loop with many loops → danger
  • High buffer reads → disk I/O

The most important line:

actual time=45.231..45.298 rows=48 loops=1

If loops = 10,000, even “fast” steps become slow.
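
The arithmetic is the reason. An invented fragment:

Index Scan using idx_orders_customer (actual time=0.010..0.012 rows=2 loops=10000)

About 0.012 ms per loop times 10,000 loops is roughly 120 ms spent in that single node.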

Before vs After

CREATE INDEX idx_hire_date ON employees (hire_date);

Rerun:

Index Scan using idx_hire_date (actual time=0.045..12.123)

45ms → 12ms

Measurement wins.

Closing Thoughts: Thinking Like the Database

What stayed with me wasn’t a trick or command—it was a shift in perspective.

SQL isn’t a black box.
Performance issues aren’t random.

Every slow query is the result of a logical decision made by the planner using the information it had.

Senior developers don’t fight the database.
They listen to it.

They:

  • Open EXPLAIN ANALYZE
  • Read the plan top to bottom
  • Ask simple questions:
    • Why this scan?
    • Why this join?
    • Why this estimate?

Then they change one thing and measure again.

That’s the real skill.

Not memorizing syntax—but learning how to have a conversation with the planner.

And once you do, SQL stops hurting your feelings.
