The Checkout Lane Analogy
At a grocery store:
One lane: Long line, slow checkout
Multiple lanes: Shorter lines, faster service
A store manager directs customers to lanes with shorter lines.
A load balancer is that manager. It distributes requests across multiple servers so no single server gets overwhelmed.
What Is Load Balancing?
Distribute incoming traffic across multiple servers.
Without a load balancer:
All traffic → One server → Overloaded!
With a load balancer:
Traffic → Load Balancer → Server 1
→ Server 2
→ Server 3
Spread the load, handle more traffic.
Why Load Balance?
1. Performance
One server: 1000 requests/second max
Three servers: 3000 requests/second!
More servers = More capacity.
2. Reliability
Server 1 dies?
Load balancer removes it from rotation.
Traffic continues to servers 2 and 3.
No single point of failure.
3. Scaling
Traffic increasing?
Add more servers.
Load balancer includes them automatically.
Scale horizontally.
How It Works
          ┌─────────────────┐
          │     Clients     │
          └────────┬────────┘
                   │
                   ▼
          ┌─────────────────┐
          │  Load Balancer  │
          │                 │
          │ - Health checks │
          │ - Routing algo  │
          └────────┬────────┘
                   │
     ┌────────────┼────────────┐
     │            │            │
     ▼            ▼            ▼
┌─────────┐  ┌─────────┐  ┌─────────┐
│Server 1 │  │Server 2 │  │Server 3 │
└─────────┘  └─────────┘  └─────────┘
Load Balancing Algorithms
Round Robin
Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1 (repeat)
Simple, fair, even distribution.
Doesn't account for server capacity.
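The rotation above is easy to sketch with the standard library. The server names here are hypothetical placeholders:

```python
from itertools import cycle

# Hypothetical server pool; names are illustrative.
servers = ["server1", "server2", "server3"]
rotation = cycle(servers)

# Four requests: server1, server2, server3, then wrap back to server1.
assignments = [next(rotation) for _ in range(4)]
print(assignments)  # ['server1', 'server2', 'server3', 'server1']
```

Real load balancers track the rotation position across all clients, but the wrap-around behavior is the same.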
Weighted Round Robin
Server 1: 4 cores (weight 4)
Server 2: 2 cores (weight 2)
Server 3: 2 cores (weight 2)
Server 1 gets twice as much traffic as each smaller server (half of the total).
Accounts for different server sizes.
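A naive weighted rotation can be sketched by repeating each server proportionally to its weight. (Production balancers typically use a smoother interleaving, e.g. nginx's smooth weighted round robin, but the per-cycle proportions are the same.)

```python
from collections import Counter
from itertools import cycle, islice

# Hypothetical weights matching the core counts above.
weights = {"server1": 4, "server2": 2, "server3": 2}

# Naive weighted rotation: repeat each server `weight` times per cycle.
rotation = cycle([s for s, w in weights.items() for _ in range(w)])

# Over one full cycle of 8 requests, server1 handles half the traffic.
first_cycle = list(islice(rotation, 8))
print(Counter(first_cycle))  # server1: 4, server2: 2, server3: 2
```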
Least Connections
Server 1: 10 active connections
Server 2: 5 active connections
Server 3: 8 active connections
Next request → Server 2 (least busy)
Better for long-running connections.
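The selection rule is just a minimum over live connection counts. The counts here are hypothetical:

```python
# Hypothetical live connection counts per server.
active = {"server1": 10, "server2": 5, "server3": 8}

def least_connections(pool):
    """Pick the server with the fewest active connections."""
    return min(pool, key=pool.get)

target = least_connections(active)
active[target] += 1  # the routed request adds one connection
print(target)  # server2
```

Because the count updates as connections open and close, a server stuck with slow, long-lived requests naturally receives fewer new ones.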
IP Hash
hash(client IP) → Server
Same client often hits the same server.
Useful for session affinity.
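A minimal sketch of the idea, hashing the client IP into a server index. Note that adding or removing a server reshuffles most clients; consistent hashing is the usual mitigation:

```python
import hashlib

servers = ["server1", "server2", "server3"]

def ip_hash(client_ip, pool):
    """Map a client IP to a server; same IP -> same server while the pool is stable."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return pool[int(digest, 16) % len(pool)]

# The same client always lands on the same server.
first = ip_hash("203.0.113.7", servers)
second = ip_hash("203.0.113.7", servers)
assert first == second
```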
Random
Pick server randomly.
Simple.
Works well with many servers.
Types of Load Balancers
Layer 4 (Transport)
Works at TCP/UDP level.
Routes based on IP and port.
Very fast.
Doesn't understand HTTP content.
Can't route based on URL or headers.
Layer 7 (Application)
Works at HTTP level.
Routes based on URL, headers, cookies.
More flexible.
/api/* → API servers
/static/* → CDN
Slightly more overhead.
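Path-based routing like the rules above boils down to a prefix-match table. This is a simplified sketch (real L7 balancers also match on headers, methods, and hostnames); the pool names are hypothetical:

```python
# Hypothetical path-prefix -> backend pool table.
routes = {
    "/api/": "api-servers",
    "/static/": "cdn",
}
DEFAULT_POOL = "web-servers"

def pick_pool(path):
    """Longest-prefix match on the URL path, falling back to a default pool."""
    for prefix in sorted(routes, key=len, reverse=True):
        if path.startswith(prefix):
            return routes[prefix]
    return DEFAULT_POOL

print(pick_pool("/api/users"))     # api-servers
print(pick_pool("/static/a.css"))  # cdn
print(pick_pool("/home"))          # web-servers
```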
Comparison
| Aspect | Layer 4 | Layer 7 |
|---|---|---|
| Speed | Faster | Slightly slower |
| Routing | Mostly IP/port | URL, headers, cookies |
| SSL termination | Typically no (passthrough) | Yes |
| Content inspection | No | Yes |
Health Checks
Load balancer checks if servers are healthy.
/health endpoint returns 200 OK? Healthy!
Returns 500 or timeout? Unhealthy!
Unhealthy server removed from rotation.
When healthy again, added back.
Health Check Configuration
Check interval: every few seconds
Timeout: short
Unhealthy threshold: 3 failures
Healthy threshold: 2 successes
Fine-tune based on your needs.
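The threshold logic can be sketched as a small state machine, using the example thresholds above (3 failures mark a server down, 2 successes bring it back):

```python
UNHEALTHY_THRESHOLD = 3  # consecutive failures before removal
HEALTHY_THRESHOLD = 2    # consecutive successes before re-adding

class HealthTracker:
    """Tracks one server's health from consecutive check results."""

    def __init__(self):
        self.healthy = True
        self.fails = 0
        self.oks = 0

    def record(self, check_ok):
        """Record one health-check result (True = 200 OK, False = error/timeout)."""
        if check_ok:
            self.fails = 0
            self.oks += 1
            if not self.healthy and self.oks >= HEALTHY_THRESHOLD:
                self.healthy = True
        else:
            self.oks = 0
            self.fails += 1
            if self.healthy and self.fails >= UNHEALTHY_THRESHOLD:
                self.healthy = False

t = HealthTracker()
for ok in [False, False, False]:  # three failures -> removed from rotation
    t.record(ok)
print(t.healthy)  # False
for ok in [True, True]:           # two successes -> added back
    t.record(ok)
print(t.healthy)  # True
```

Requiring consecutive failures avoids flapping: one dropped packet doesn't eject a healthy server.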
Session Persistence (Sticky Sessions)
The Problem
User logs in → Server 1 (session created)
Next request → Server 2 (no session!)
The session isn't found, so the user appears logged out!
Solutions
1. Sticky sessions:
The same user is consistently routed to the same server.
Uses cookie or IP hash.
2. Shared session store:
Sessions in Redis/database.
Any server can read session.
3. Stateless design:
JWT tokens contain all user info.
No server-side sessions.
#3 is often a strong option for scalability.
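The stateless idea can be sketched with a signed token: any server holding the shared secret can verify the token, so no server-side session store is needed. This is a simplified stdlib sketch, not a real JWT implementation (production code should use a vetted JWT library):

```python
import base64
import hashlib
import hmac
import json

SECRET = b"shared-secret"  # hypothetical key, held by every server

def issue(payload: dict) -> str:
    """Issue a signed token carrying the user info (simplified JWT-style)."""
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify(token: str) -> dict:
    """Any server can verify with the shared secret; no session lookup."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    return json.loads(base64.urlsafe_b64decode(body))

token = issue({"user": "alice"})
print(verify(token))  # {'user': 'alice'}
```

Because the token itself carries the user info, it doesn't matter which server behind the balancer receives the next request.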
Popular Load Balancers
| Tool | Type | Often Used For |
|---|---|---|
| Nginx | Software | General purpose |
| HAProxy | Software | High performance |
| AWS ALB/NLB | Cloud | AWS applications |
| Google Cloud LB | Cloud | GCP applications |
| Traefik | Software | Container-native |
| F5 | Hardware | Enterprise |
Common Patterns
Active-Passive
One active load balancer, with a second on hot standby.
Active fails → Passive takes over.
Active-Active
Multiple load balancers.
DNS or another balancer distributes traffic across them.
Full redundancy.
Global Load Balancing
Route to nearest datacenter.
US user → US datacenter
EU user → EU datacenter
Uses DNS or Anycast.
Common Mistakes
1. No Health Checks
Dead server still receiving traffic.
Users get errors.
Configure health checks.
2. Session Problems
Forgot about session affinity.
Users randomly logged out.
Use sticky sessions or shared store.
3. Single Load Balancer
Load balancer becomes single point of failure!
Use multiple LBs or managed cloud service.
4. Ignoring Capacity
Round robin with mixed server sizes.
Small server overwhelmed.
Use weighted algorithms.
FAQ
Q: Hardware vs software load balancer?
Software (Nginx, HAProxy) is flexible and cheap. Hardware (F5) for extreme performance.
Q: Can I load balance databases?
Yes, but carefully. Read replicas can be load balanced. Writes usually go to primary.
Q: How many servers behind a load balancer?
Depends on traffic. Start with 2-3 for redundancy. Scale as needed.
Q: What about WebSockets?
Use sticky sessions or Layer 7 load balancer that understands WebSocket upgrades.
Summary
Load balancing distributes traffic across servers for performance, reliability, and scalability.
Key Takeaways:
- Spread load across multiple servers
- Algorithms: round robin, least connections, etc.
- Layer 4 (fast) vs Layer 7 (flexible)
- Health checks remove failing servers
- Handle sessions with sticky or shared stores
- Run redundant load balancers to avoid a single point of failure
Load balancing is the foundation of scalable web architecture!