João Duarte

Why I'm building a database from scratch (and why you should too)

This article is part of a series.

Series: How Databases Work: Building a PostgreSQL-Inspired DB in Rust

The production incident that ignited this idea

A few months ago, we had a database performance crisis at work. It wasn’t your typical “missing index” problem. The system would run fine for hours, handling thousands of queries across multiple indexes. Then, seemingly at random, query times would degrade from milliseconds to seconds. The database wasn’t out of memory. CPU wasn’t maxed. The number of rows fetched peaked afterwards, which looked… weird, but not obviously wrong.

After days of investigation, we found the culprit hiding deep in PostgreSQL’s internals. It was a combination of factors that created the perfect storm. We were able to handle it, but I walked away with an uncomfortable realization:

I’d been using PostgreSQL for years, but I didn’t actually understand how it worked.

Sure, I knew the concepts. B+ trees, ACID properties, indexes. I could draw you a high-level diagram of how a query planner works. But when faced with real performance mysteries, my knowledge was too shallow. It was like being a pilot who knows which buttons to press but doesn’t understand aerodynamics.

So I decided to fix that the only way I know how: by building my own database from scratch.

Why PostgreSQL?

I’ve built simple key-value stores before. You know how it goes: throw a hash map at the problem, add some persistence, call it a day. They’re fun weekend projects, but they don’t teach you the nuances that a large production database has to deal with.

By following PostgreSQL’s architecture, I’ll learn:

Actually, these aren’t PostgreSQL-specific ideas. They’re the foundations of almost every relational database. Learn them once, understand databases forever.

Why Rust?

I come from a web development background: Ruby, Elixir, JavaScript, Python, and the occasional Go service. These languages are fantastic for applications, but they hide the exact details I need to understand.

Rust forces you to think about things databases care about:

# In Python, this is just a string
data = "Hello, World!"

// In Rust, everything is explicit
let data: &[u8] = b"Hello, World!";  // Immutable bytes
let page: [u8; 8192] = [0; 8192];    // Exactly 8KB, stack allocated
let offset: usize = 42;               // Platform-specific size
unsafe {
    // Sometimes you need to reinterpret bytes as a struct. A byte array is
    // only 1-byte aligned, so copy the header out instead of referencing it.
    let header = std::ptr::read_unaligned(page.as_ptr() as *const PageHeader);
}
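To make the fragment above concrete, here is a minimal, self-contained sketch of reading a header out of a page buffer. The `PageHeader` fields, the little-endian layout, and the function names are assumptions made for illustration; they are not InoxDB's actual on-disk format.

```rust
pub const PAGE_SIZE: usize = 8192;

/// Hypothetical page header; the field names are illustrative only.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PageHeader {
    pub page_id: u32,
    pub free_start: u16, // offset where free space begins
    pub free_end: u16,   // offset where free space ends
}

/// Safe route: write each field explicitly as little-endian bytes.
pub fn write_header(page: &mut [u8; PAGE_SIZE], header: &PageHeader) {
    page[0..4].copy_from_slice(&header.page_id.to_le_bytes());
    page[4..6].copy_from_slice(&header.free_start.to_le_bytes());
    page[6..8].copy_from_slice(&header.free_end.to_le_bytes());
}

/// Safe route: decode each field explicitly from little-endian bytes.
pub fn read_header(page: &[u8; PAGE_SIZE]) -> PageHeader {
    PageHeader {
        page_id: u32::from_le_bytes([page[0], page[1], page[2], page[3]]),
        free_start: u16::from_le_bytes([page[4], page[5]]),
        free_end: u16::from_le_bytes([page[6], page[7]]),
    }
}

/// Unsafe route: copy the struct straight out of the buffer.
/// The result uses the machine's native byte order.
pub fn read_header_unaligned(page: &[u8; PAGE_SIZE]) -> PageHeader {
    // SAFETY: the page is far larger than the header, `PageHeader` is
    // `repr(C)` with only integer fields (every bit pattern is valid),
    // and `read_unaligned` has no alignment requirement.
    unsafe { std::ptr::read_unaligned(page.as_ptr() as *const PageHeader) }
}
```

The explicit version costs a few more lines but pins down byte order, so a file written on one machine reads the same on another. The unsafe version only agrees with it on little-endian hardware.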

This explicitness is perfect for database work. You’re always thinking about:

Plus, Rust gives us modern tooling (cargo, great compiler errors, built-in testing) and safety guarantees that make development faster, not slower. As long as we stay in safe code, that means no segfaults when manipulating page buffers and no data races when we eventually add concurrency.

Is Rust the “best” language for databases? Maybe not. But I think it’s a good language for learning to build databases.

What we’re building

Meet InoxDB, the database we’re building together!

InoxDB is our PostgreSQL-inspired database written in Rust. The name? Stainless steel for a Rust project, because even our puns are over-engineered.

Over the next few weeks, we’ll build a real SQL database that:

Core Features

Architecture Overview

┌────────────────────────────────────────────┐
│              SQL Layer                     │
│ Parser (sqlparser-rs) → Planner → Executor │
└────────────────────┬───────────────────────┘
                     │
┌────────────────────▼───────────────────────┐
│          Access Methods                    │  
│   Sequential Scan, Index Scan (B+ Tree)    │
└────────────────────┬───────────────────────┘
                     │
┌────────────────────▼───────────────────────┐
│          Buffer Pool Manager               │
│   Page Cache, Eviction (LRU), Dirty Pages  │
└────────────────────┬───────────────────────┘
                     │
┌────────────────────▼───────────────────────┐
│           Storage Engine                   │
│   Heap Files, Pages (8KB), WAL, Records    │
└────────────────────────────────────────────┘
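To give a feel for the Buffer Pool Manager box in the diagram, here is a rough sketch of a page cache with LRU eviction and dirty-page write-back. Everything in it is an assumption for illustration: `FakeDisk` is an in-memory stand-in for the storage engine, and the `BufferPool` API is hypothetical, not InoxDB's real interface.

```rust
use std::collections::{HashMap, VecDeque};

pub const PAGE_SIZE: usize = 8192;
pub type PageId = u32;

struct Frame {
    data: Box<[u8; PAGE_SIZE]>,
    dirty: bool,
}

/// Stand-in for the storage engine: page id -> bytes, plus I/O counters.
#[derive(Default)]
pub struct FakeDisk {
    pub pages: HashMap<PageId, Box<[u8; PAGE_SIZE]>>,
    pub reads: usize,
    pub writes: usize,
}

pub struct BufferPool {
    capacity: usize,
    frames: HashMap<PageId, Frame>,
    lru: VecDeque<PageId>, // front = least recently used
    pub disk: FakeDisk,
}

impl BufferPool {
    pub fn new(capacity: usize, disk: FakeDisk) -> Self {
        assert!(capacity > 0);
        BufferPool { capacity, frames: HashMap::new(), lru: VecDeque::new(), disk }
    }

    /// Move a page to the most-recently-used end (O(n); fine for a sketch).
    fn touch(&mut self, id: PageId) {
        self.lru.retain(|&p| p != id);
        self.lru.push_back(id);
    }

    /// Drop the least recently used page, writing it back only if dirty.
    fn evict(&mut self) {
        if let Some(victim) = self.lru.pop_front() {
            if let Some(frame) = self.frames.remove(&victim) {
                if frame.dirty {
                    self.disk.writes += 1;
                    self.disk.pages.insert(victim, frame.data);
                }
            }
        }
    }

    /// Make sure the page is cached, evicting first if the pool is full.
    fn load(&mut self, id: PageId) {
        if !self.frames.contains_key(&id) {
            if self.frames.len() == self.capacity {
                self.evict();
            }
            self.disk.reads += 1;
            // Pages never written before read back as zeroes.
            let data = self.disk.pages.get(&id).cloned()
                .unwrap_or_else(|| Box::new([0u8; PAGE_SIZE]));
            self.frames.insert(id, Frame { data, dirty: false });
        }
        self.touch(id);
    }

    pub fn read(&mut self, id: PageId) -> &[u8; PAGE_SIZE] {
        self.load(id);
        &*self.frames[&id].data
    }

    pub fn write(&mut self, id: PageId) -> &mut [u8; PAGE_SIZE] {
        self.load(id);
        let frame = self.frames.get_mut(&id).unwrap();
        frame.dirty = true; // must be flushed before the frame is reused
        &mut *frame.data
    }
}
```

With a capacity of two, writing page 1 and then reading pages 2 and 3 pushes page 1 out and costs exactly one disk write; evicting a clean page costs none. A real pool would also need pinning, so a page in use can't be evicted from under its reader.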

Smart decisions

One key decision: we’re using sqlparser-rs instead of building our own SQL parser. Why? Parsing is a fascinating but separate problem domain. By using a battle-tested parser, we can focus 100% on database internals: the storage engine, query execution, buffer management. That’s where the real database magic happens.

Not building (yet)

This scope is carefully chosen. Every feature teaches a fundamental concept without getting lost in complexity.

Demo: What InoxDB will look like

Here’s what we’re working toward:

$ cargo run --release
InoxDB v0.1.0 (PostgreSQL-inspired database)

Type \h for help, \q to quit

inox> CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    department VARCHAR(50),
    salary INTEGER,
    active BOOLEAN DEFAULT true
);
CREATE TABLE
Time: 12ms

inox> INSERT INTO employees (id, name, department, salary, active) VALUES
    (1, 'Alice Johnson', 'Engineering', 95000, true),
    (2, 'Bob Smith', 'Sales', 65000, true),
    (3, 'Charlie Brown', 'Engineering', 85000, false);
INSERT 3
Time: 8ms

inox> SELECT name, salary FROM employees
      WHERE department = 'Engineering' AND active = true
      ORDER BY salary DESC;
┌───────────────┬────────┐
│ name          │ salary │
├───────────────┼────────┤
│ Alice Johnson │ 95000  │
└───────────────┴────────┘
(1 row)
Time: 3ms

inox> \d employees
Table "employees"
┌────────────┬──────────────┬──────────┐
│ Column     │ Type         │ Nullable │
├────────────┼──────────────┼──────────┤
│ id         │ INTEGER      │ NOT NULL │
│ name       │ VARCHAR(100) │ NOT NULL │
│ department │ VARCHAR(50)  │ NULL     │
│ salary     │ INTEGER      │ NULL     │
│ active     │ BOOLEAN      │ NULL     │
└────────────┴──────────────┴──────────┘
Indexes:
  "employees_pkey" PRIMARY KEY (id)

inox> \q
Goodbye!

The Roadmap

Each major milestone builds on the previous one:

Milestone 1: Storage Foundations
Build heap files and page management. Watch your data survive a restart.

Milestone 2: Write-Ahead Logging
Implement durability. Crash InoxDB mid-write and see it recover perfectly.

Milestone 3: Buffer Pool Magic
Add caching. Watch query performance improve dramatically without changing query logic.

Milestone 4: System Catalogs
Make the database self-aware. Store table definitions… in tables. (This one will bend your brain!)

Milestone 5: SQL Comes Alive
Connect everything. Parse SQL, plan queries, execute them against real data.
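As a taste of the write-ahead logging milestone, here is a minimal sketch of the core idea: append a record, force it to disk, and on restart replay every record that made it out intact. The record format (length, checksum, payload), the toy checksum, and the function names are all assumptions for illustration; a real WAL would use a proper CRC and tie records to page changes.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::Path;

/// Toy checksum for illustration only; a real WAL would use a CRC.
fn checksum(bytes: &[u8]) -> u32 {
    bytes.iter().fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
}

fn read_u32(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

/// Append one record as [len: u32 LE][checksum: u32 LE][payload], then fsync.
pub fn wal_append(path: &Path, payload: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    let mut buf = Vec::with_capacity(8 + payload.len());
    buf.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    buf.extend_from_slice(&checksum(payload).to_le_bytes());
    buf.extend_from_slice(payload);
    file.write_all(&buf)?;
    file.sync_all() // the record only counts as durable once this returns
}

/// Return every complete, checksum-valid record, stopping at the first
/// torn or corrupt one (for example, a write cut short by a crash).
pub fn wal_replay(path: &Path) -> io::Result<Vec<Vec<u8>>> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;
    let mut records = Vec::new();
    let mut pos = 0;
    while pos + 8 <= bytes.len() {
        let len = read_u32(&bytes[pos..pos + 4]) as usize;
        let sum = read_u32(&bytes[pos + 4..pos + 8]);
        let start = pos + 8;
        let end = match start.checked_add(len) {
            Some(e) if e <= bytes.len() => e,
            _ => break, // header promises more bytes than the file holds
        };
        if checksum(&bytes[start..end]) != sum {
            break;
        }
        records.push(bytes[start..end].to_vec());
        pos = end;
    }
    Ok(records)
}
```

The design choice worth noticing is that recovery never trusts the tail of the log: a half-written record fails the length or checksum test and replay simply stops there, leaving only records that were fully written.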

Join the InoxDB journey

Next week, we dive into the deep end:

Storage Foundations: How to Persist a Row

We’ll answer questions like:

We’ll write actual Rust code that takes a row and turns it into bytes on disk. Bytes that survive process restarts, system crashes, and even me accidentally deleting the wrong files (there will be stories).
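As a preview, here is one possible way to turn a row from the demo's `employees` table into bytes and back. The layout is an assumption made up for this sketch (a null bitmap byte, fixed-width fields, then strings with a 2-byte length prefix); the next post may well land on something different.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Employee {
    pub id: i32,
    pub name: String,
    pub department: Option<String>,
    pub salary: Option<i32>,
    pub active: bool,
}

/// Strings are stored as [len: u16 LE][utf-8 bytes].
/// Assumes len <= 65535, which VARCHAR(100) comfortably satisfies.
fn put_str(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u16).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

pub fn encode(row: &Employee) -> Vec<u8> {
    let mut out = Vec::new();
    // Null bitmap: bit 0 = department is NULL, bit 1 = salary is NULL.
    let nulls = (row.department.is_none() as u8) | ((row.salary.is_none() as u8) << 1);
    out.push(nulls);
    out.extend_from_slice(&row.id.to_le_bytes());
    out.push(row.active as u8);
    put_str(&mut out, &row.name);
    if let Some(d) = &row.department {
        put_str(&mut out, d);
    }
    if let Some(s) = row.salary {
        out.extend_from_slice(&s.to_le_bytes());
    }
    out
}

/// Cursor over a byte slice; every read returns None if bytes run out.
struct Reader<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let buf: &'a [u8] = self.buf;
        let end = self.pos.checked_add(n)?;
        let slice = buf.get(self.pos..end)?;
        self.pos = end;
        Some(slice)
    }
    fn read_u8(&mut self) -> Option<u8> {
        Some(self.take(1)?[0])
    }
    fn read_i32(&mut self) -> Option<i32> {
        let b = self.take(4)?;
        Some(i32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    }
    fn read_string(&mut self) -> Option<String> {
        let b = self.take(2)?;
        let len = u16::from_le_bytes([b[0], b[1]]) as usize;
        String::from_utf8(self.take(len)?.to_vec()).ok()
    }
}

/// Inverse of `encode`; returns None on truncated or malformed input.
pub fn decode(bytes: &[u8]) -> Option<Employee> {
    let mut r = Reader { buf: bytes, pos: 0 };
    let nulls = r.read_u8()?;
    let id = r.read_i32()?;
    let active = r.read_u8()? != 0;
    let name = r.read_string()?;
    let department = if nulls & 1 == 0 { Some(r.read_string()?) } else { None };
    let salary = if nulls & 2 == 0 { Some(r.read_i32()?) } else { None };
    Some(Employee { id, name, department, salary, active })
}
```

Passing the result of `encode` to `std::fs::write` and reading it back with `std::fs::read` is enough to survive a process restart. Surviving a crash halfway through the write is a different problem, and that is what the WAL milestone is for.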

The InoxDB code for each post will be available on GitHub, tagged by milestone so you can:

Ready to understand databases from the inside out? Welcome to the InoxDB project. See you next week.


Building InoxDB alongside me? Have questions or insights to share? Find me on Twitter/X or jump into discussions on the GitHub repo. This is more fun when we learn together.
