Skip to main content

🏛️ Databases

Organized storage for your application data

The Spreadsheet Analogy

You've used spreadsheets? Then you understand databases!

Spreadsheet:
  - Rows = individual records
  - Columns = fields/attributes
  - Sheets = tables
  - Workbook = database

Database = Super-powered spreadsheet that:
  - Handles millions of rows
  - Multiple users simultaneously
  - Guarantees data integrity
  - Finds data instantly

Why Not Just Use Files?

The File Storage Problem

Store users in a text file:
  Alice,alice@email.com,28
  Bob,bob@email.com,35
  Carol,carol@email.com,22

Problems:
  ✗ Finding user 50,000? Scan entire file
  ✗ Two people editing? Conflicts!
  ✗ Data validation? You're on your own
  ✗ Power outage mid-write? Corrupted data

Databases Solve These

✓ Indexes for fast lookups
✓ Concurrent access handling
✓ Data type validation
✓ Transaction safety
✓ Backup and recovery
✓ Query language (SQL)

Relational Databases

Tables with Relationships

Users Table:
  ┌────┬───────┬────────────────────┐
  │ id │ name  │ email              │
  ├────┼───────┼────────────────────┤
  │ 1  │ Alice │ alice@example.com  │
  │ 2  │ Bob   │ bob@example.com    │
  └────┴───────┴────────────────────┘

Orders Table:
  ┌────┬─────────┬────────┬────────────┐
  │ id │ user_id │ amount │ date       │
  ├────┼─────────┼────────┼────────────┤
  │ 1  │ 1       │ 50     │ Jan 15     │
  │ 2  │ 1       │ 75     │ Jan 20     │
  │ 3  │ 2       │ 25     │ Jan 22     │
  └────┴─────────┴────────┴────────────┘

user_id LINKS to Users table.
This is a RELATIONSHIP.

Why Relationships?

Without relationships:
  Store user info in EVERY order
  Alice changes email? Update EVERY order!

With relationships:
  Store user info once
  Orders link to users by ID
  Change once, reflected everywhere

Key Concepts

Primary Key

Unique identifier for each row:
  ┌────────────┬───────┐
  │ id (PK)    │ name  │
  ├────────────┼───────┤
  │ 1          │ Alice │
  │ 2          │ Bob   │
  └────────────┴───────┘

No two rows can have the same primary key.

Foreign Key

Link to another table's primary key:
  ┌────┬─────────────┐
  │ id │ user_id(FK) │
  ├────┼─────────────┤
  │ 1  │ 1           │ ← Points to Users.id = 1
  │ 2  │ 1           │
  └────┴─────────────┘

Ensures data integrity (can't order for non-existent user).

Indexes

Like a book's index:
  "Show me all Alices"

  Without index: Scan all 1 million rows
  With index: Jump directly to Alices

Dramatically speeds up queries on indexed columns.

ACID Properties

What makes databases reliable:

A - Atomicity
    "All or nothing"
    Transfer $100: debit AND credit both happen, or neither

C - Consistency
  "Rules are consistently enforced"
    Balance can't go negative if that's a rule

I - Isolation
    "No interference"
    Two transfers at same time don't mess each other up

D - Durability
    "Survives crashes"
    Once committed, data survives power outage

Types of Databases

Relational (SQL)

Tables with relationships
Query with SQL
Strong consistency

Examples: PostgreSQL, MySQL, SQLite
Often used for: Most applications, complex queries

Document (NoSQL)

JSON-like documents
Flexible schema
Horizontal scaling

Examples: MongoDB, CouchDB
Often used for: Flexible data, rapid development

Key-Value

Simple key → value pairs
Super fast
Limited queries

Examples: Redis, DynamoDB
Commonly used for: Caching, sessions

Graph

Nodes and relationships
Optimized for connections
Complex relationship queries

Examples: Neo4j, Amazon Neptune
Commonly used for: Social networks, recommendations

Common Operations

Create a Table

Define structure:
  users table with id, name, email columns
  id is primary key (unique identifier)
  name is required (NOT NULL)
  email is often enforced as unique

Insert Data

Add a row:
  id=1, name="Alice", email="alice@example.com"

Query Data

Find something:
  "Give me all users with name containing 'Alice'"
  "Give me orders over $100 from last week"

Update Data

Change something:
  "Change user 1's email to new@example.com"

Delete Data

Remove something:
  "Delete user with id 1"
  "Delete all orders older than a couple of years"

Database Scaling

Vertical Scaling (Scale Up)

Bigger server:
  More RAM, faster CPU, more storage

Limits:
  Eventually hit hardware limits
  Single point of failure
  Expensive at the top end

Horizontal Scaling (Scale Out)

More servers:
  Distribute data across machines
  Add servers as needed

Complexity:
  Data distribution (sharding)
  Consistency challenges
  Network overhead

Replication

Copies of data:
  Primary: handles writes
  Replicas: handle reads

Benefits:
  Read scalability
  Fault tolerance
  Geographic distribution

DatabaseTypeOften Used For
PostgreSQLRelationalGeneral purpose, complex queries
MySQLRelationalWeb applications
SQLiteRelationalEmbedded, small apps
MongoDBDocumentFlexible schemas
RedisKey-ValueCaching, real-time
CassandraColumnMassive scale
Neo4jGraphRelationships

Common Mistakes

1. No Backups

Data loss = business loss.
Automate backups. Test restores!

2. No Indexes on Queried Columns

Slow queries on large tables.
Index columns you search by frequently.

3. Storing Everything in One Table

Leads to data duplication and update anomalies.
Normalize your data (split into related tables).

4. Ignoring Security

Use parameterized queries (prevent SQL injection).
Limit database user permissions.
Encrypt sensitive data.

FAQ

Q: SQL or NoSQL?

Start with SQL unless you have a specific reason for NoSQL. SQL is more commonly needed.

Q: Which database should I learn first?

PostgreSQL - full-featured, free, industry standard.

Q: Cloud database or self-hosted?

Cloud (AWS RDS, Azure SQL) for most cases. Simpler to manage.

Q: How much data can a database handle?

Modern databases handle billions of rows. The limit is usually your hardware and query design.


Summary

Databases are organized systems for storing and managing data reliably.

Key Takeaways:

  • Tables, rows, columns (like spreadsheets)
  • Relationships link data between tables
  • Primary keys uniquely identify rows
  • ACID ensures reliability
  • SQL databases for structured data
  • NoSQL for flexibility and scale
  • Indexes speed up queries
  • Backups are essential

Databases are the backbone of almost every application!

Leave a Comment

Comments (0)

Be the first to comment on this concept.

Comments are approved automatically.