Databases generally fall into two broad categories: relational databases, which store data in structured tables with defined relationships, and non-relational (NoSQL) databases, which store data in flexible formats like documents, key-value pairs, or graphs.

Relational databases use SQL and enforce strong consistency through transactions, making them a natural fit for structured, transactional data. Non-relational databases trade some of that structure for flexibility and horizontal scalability, which suits high-volume or rapidly changing data.
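
To make the contrast concrete, here is a minimal sketch that models the same data both ways. It uses Python's built-in sqlite3 module for the relational side and a plain dictionary to stand in for a document store. The table names, field names, and sample values are illustrative assumptions, not taken from any particular system.

```python
import json
import sqlite3

# Relational: fixed columns, with orders pointing at customers by key.
# (SQLite only enforces the REFERENCES clause if PRAGMA foreign_keys = ON.)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0);
""")

# Related rows are recombined at query time with a join.
rows = conn.execute("""
    SELECT c.name, o.id, o.total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

# Document style: the same data nested inside one self-describing record.
# Hypothetical shape; real document stores each have their own conventions.
customer_doc = {
    "_id": 1,
    "name": "Ada",
    "orders": [{"id": 10, "total": 25.0}, {"id": 11, "total": 40.0}],
}
serialized = json.dumps(customer_doc)
```

The relational version keeps each fact in one place and joins on demand, while the document version reads back in a single lookup but has no schema stopping two records from having different shapes.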

Understanding database fundamentals — indexing, normalization, and transactions — helps data engineers design schemas that are both fast to query and safe to update.

Database Types

Relational vs Non-Relational

Two broad families of databases, each suited to different workloads, plus the indexing technique that both rely on.

Relational (SQL)

Structured tables with defined schemas and relationships, queried using SQL.

Non-Relational (NoSQL)

Document, key-value, or graph stores with flexible schemas, built to scale horizontally.

Indexing

Data structures that speed up lookups at the cost of extra storage and write overhead.
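
One way to see an index at work is to ask the database for its query plan before and after creating one. The sketch below uses SQLite through Python's sqlite3 module. The `events` table, the `idx_events_user` index name, and the `plan` helper are hypothetical names chosen for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 100, f"event-{i}") for i in range(1000)],
)

def plan(sql):
    """Return SQLite's query plan for `sql` as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text

query = "SELECT payload FROM events WHERE user_id = 42"

# Without an index, filtering on user_id means scanning every row.
plan_before = plan(query)

# The index costs extra storage and must be updated on every write.
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")

# With the index, SQLite can search directly for matching rows.
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

The first plan should report a full table scan and the second a search using the new index, which is the lookup speedup paid for with the storage and write overhead described above.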

Why It Matters

Core Database Concepts

Concepts every data engineer should be comfortable with.

  • Normalization — organizing tables to reduce data duplication
  • Primary and foreign keys — defining relationships between tables
  • Transactions — grouping operations so they succeed or fail together
  • Indexes — speeding up common query patterns
  • ACID properties — atomicity, consistency, isolation, durability
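
As a small illustration of transactions and atomicity from the list above, the sketch below moves money between two rows and shows that a failing step undoes the whole group. It assumes SQLite via Python's sqlite3 module; the `accounts` table and the `transfer` and `balances` helpers are hypothetical names used only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft, so the credit
        # applied a moment earlier has been rolled back as well.
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

first = transfer(conn, 1, 2, 30)     # succeeds
second = transfer(conn, 1, 2, 500)   # would overdraw account 1, so it fails
final = balances(conn)               # reflects only the first transfer
```

The credit is deliberately applied before the debit here so that the failure happens after a write has already been made, which is the case a transaction exists to protect against.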

FAQ

Databases — Common Questions

Quick answers to frequent questions on this topic.

When should I use a relational database?
Relational databases are ideal when data has a clear structure and relationships, and when strong consistency and transactions matter.
When should I use a NoSQL database?
NoSQL databases work well for flexible or rapidly evolving data models, very high write volumes, or simple key-value access patterns.
What is normalization?
Normalization is the process of structuring tables to minimize data duplication and maintain consistency across related records.
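
A rough sketch of what that looks like in practice: the flat table below repeats a customer's email on every order, and the normalized version stores it once and references it by key. This assumes SQLite via Python's sqlite3 module, and all table names, column names, and sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized: customer details are repeated on every order row.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY,
        customer_name TEXT,
        customer_email TEXT,
        total REAL
    );
    INSERT INTO orders_flat VALUES
        (1, 'Ada', 'ada@example.com', 25.0),
        (2, 'Ada', 'ada@example.com', 40.0),
        (3, 'Lin', 'lin@example.com', 15.0);

    -- Normalized: each customer is stored once; orders point to it by key.
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT,
        email TEXT UNIQUE
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL
    );
    INSERT INTO customers (name, email)
        SELECT DISTINCT customer_name, customer_email FROM orders_flat;
    INSERT INTO orders (id, customer_id, total)
        SELECT f.order_id, c.id, f.total
        FROM orders_flat AS f
        JOIN customers AS c ON c.email = f.customer_email;
""")

# A change of email is now a single-row update rather than one per order.
conn.execute(
    "UPDATE customers SET email = 'ada@new.example.com' WHERE name = 'Ada'"
)

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
ada_emails = conn.execute("""
    SELECT DISTINCT c.email
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE c.name = 'Ada'
""").fetchall()
```

In the flat table the same update would have to touch every one of Ada's orders, and missing one would leave the records inconsistent with each other.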

Keep Learning

Related Guides

Continue building context around this topic.

Data Warehousing

See how databases evolve into analytical warehouses for reporting.

SQL Guides

Practice writing queries against relational database structures.

Big Data

Learn how databases scale to handle very large datasets.