Databases — Guide

Databases generally fall into two broad categories: relational databases, which store data in structured tables with defined relationships, and non-relational (NoSQL) databases, which store data in flexible formats like documents, key-value pairs, or graphs.

Relational databases use SQL and enforce strong consistency through transactions, making them a natural fit for structured, transactional data. Non-relational databases trade some of that structure for flexibility and horizontal scalability, which suits high-volume or rapidly changing data.
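To make the contrast concrete, here is a minimal sketch that stores similar user records both ways. It assumes Python's built-in `sqlite3` module as a stand-in for a relational database and a plain dictionary of JSON strings as a stand-in for a document store. The `users` table, the `user:1` style keys, and all field names are hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Relational: the schema is declared up front and every row must fit it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.execute(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)", (1, "Ada", "London")
)
relational_row = conn.execute(
    "SELECT name, city FROM users WHERE id = ?", (1,)
).fetchone()

# Document-style: each record is a self-describing JSON object, so two
# records in the same collection can carry different fields.
# (A dict of JSON strings is only a stand-in for a real document store.)
documents = {
    "user:1": json.dumps({"name": "Ada", "city": "London"}),
    "user:2": json.dumps({"name": "Lin", "tags": ["admin"]}),
}
document_record = json.loads(documents["user:2"])
```

The relational side rejects a row that lacks a `name`, while the document side accepts any shape and leaves validation to the application.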

Understanding database fundamentals — indexing, normalization, and transactions — helps data engineers design schemas that are both fast to query and safe to update.

Database Types

Relational vs Non-Relational

Two broad families of databases, each suited to different workloads.

Relational (SQL)

Structured tables with defined schemas and relationships, queried using SQL.

Non-Relational (NoSQL)

Flexible document, key-value, or graph stores optimized for scale and flexibility.

Indexing

Data structures that speed up lookups at the cost of extra storage and write overhead.
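The effect of an index can be observed by asking the database how it plans to run a query. The sketch below assumes SQLite through Python's `sqlite3` module; the `orders` table, its columns, and the index name `idx_orders_customer` are hypothetical. Other engines expose similar information through their own `EXPLAIN` variants, and their output will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

query = "SELECT total FROM orders WHERE customer_id = ?"


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable step description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,))
    return " ".join(row[-1] for row in rows)


plan_before = plan(query)  # without an index, SQLite scans the whole table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # with the index, the plan names idx_orders_customer

print(plan_before)
print(plan_after)
```

The trade-off is that every insert or update on `orders` must now also maintain the index, which is the extra storage and write overhead described above.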

Why It Matters

Core Database Concepts

Concepts every data engineer should be comfortable with.

Normalization — organizing tables to reduce data duplication
Primary and foreign keys — defining relationships between tables
Transactions — grouping operations so they succeed or fail together
Indexes — speeding up common query patterns
ACID properties — atomicity, consistency, isolation, durability
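Several of these concepts can be seen working together in a small sketch. It assumes SQLite through Python's `sqlite3` module; the `accounts` and `transfers` tables and the `transfer` helper are hypothetical. Primary and foreign keys define the relationship between the tables, and a transaction ensures that a transfer is either fully applied or not applied at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute(
    "CREATE TABLE transfers ("
    " id INTEGER PRIMARY KEY,"
    " from_id INTEGER NOT NULL REFERENCES accounts(id),"
    " to_id INTEGER NOT NULL REFERENCES accounts(id),"
    " amount INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(from_id, to_id, amount):
    """Move funds atomically. Returns True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, to_id)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, from_id)
            )
            conn.execute(
                "INSERT INTO transfers (from_id, to_id, amount) VALUES (?, ?, ?)",
                (from_id, to_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(1, 2, 30)         # both updates and the log row are committed
rejected = transfer(1, 2, 500)  # the debit violates the CHECK, so the credit is undone too
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

In the rejected transfer the credit to account 2 runs first and succeeds, yet it does not survive: the failed debit rolls back the whole group, which is the atomicity part of ACID.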

FAQ

Databases — Common Questions

Quick answers to frequent questions on this topic.

When should I use a relational database?

Relational databases are ideal when data has a clear structure and relationships, and when strong consistency and transactions matter.

When should I use a NoSQL database?

NoSQL databases work well for flexible or rapidly evolving data models, very high write volumes, or simple key-value access patterns.

What is normalization?

Normalization is the process of structuring tables to minimize data duplication and maintain consistency across related records.
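As an illustration, the sketch below contrasts a flat table that repeats customer details on every order with a normalized pair of tables. It assumes SQLite through Python's `sqlite3` module, and every table and column name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: the customer's email is repeated on every order row,
# so changing it means updating many rows and risking mismatches.
conn.execute(
    "CREATE TABLE orders_flat ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_name TEXT, customer_email TEXT, item TEXT)"
)
conn.executemany(
    "INSERT INTO orders_flat (customer_name, customer_email, item) VALUES (?, ?, ?)",
    [("Ada", "ada@example.com", "keyboard"), ("Ada", "ada@example.com", "monitor")],
)
duplicated = conn.execute(
    "SELECT COUNT(*) FROM orders_flat WHERE customer_email = 'ada@example.com'"
).fetchone()[0]

# Normalized: customer details live in one row and orders refer to it by key.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customers(id), item TEXT)"
)
conn.execute("INSERT INTO customers (id, name, email) VALUES (1, 'Ada', 'ada@example.com')")
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "monitor")],
)

# A single UPDATE changes the email for every order that reads it.
conn.execute("UPDATE customers SET email = 'ada@new.example.com' WHERE id = 1")
emails = conn.execute(
    "SELECT DISTINCT c.email FROM orders o JOIN customers c ON c.id = o.customer_id"
).fetchall()
```

The cost of the normalized design is the join needed to read an order together with its customer, which is why analytical systems sometimes denormalize on purpose.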
