Load Balancing: L4 vs L7 & Algorithms

Load balancing distributes incoming traffic across multiple backend servers so that no single one becomes a bottleneck. It’s also how you get high availability: if one server dies, the load balancer stops sending traffic to it.

Layer 4 vs Layer 7

Load balancers operate at one of two layers of the OSI model:

	Layer 4 (Transport)	Layer 7 (Application)
What it sees	IPs and ports only	Full HTTP, headers, paths, cookies
Speed	Very fast (can reach millions of connections/sec)	Slower (must terminate TLS and parse HTTP)
Routing decisions	By IP/port hash	By URL path, host, header, cookie
Examples	AWS NLB, HAProxy TCP mode	NGINX, Envoy, AWS ALB, Cloudflare
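
To make the table concrete, here is a minimal Python sketch of the two kinds of routing decision. It is an illustration, not how any listed product is implemented: the backend addresses, the hostnames, the pool names, and the functions route_l4 and route_l7 are all hypothetical.

```python
import hashlib

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backend pool


def route_l4(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Layer 4: only addresses and ports are visible, so hash the connection tuple."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]


# Hypothetical Layer 7 rules: (host, path prefix, pool name).
L7_RULES = [
    ("api.example.com", "/v1/", "api-pool"),
    ("www.example.com", "/static/", "static-pool"),
]


def route_l7(host: str, path: str, default: str = "web-pool") -> str:
    """Layer 7: the parsed HTTP request is visible, so match on host and path."""
    for rule_host, prefix, pool in L7_RULES:
        if host == rule_host and path.startswith(prefix):
            return pool
    return default
```

The Layer 4 function never looks past the connection tuple, while the Layer 7 function needs a parsed request before it can decide, which is where the extra cost in the table comes from.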

The classic algorithms

Round-robin: rotate through servers in order. Simple, fair when all servers are equal.
Least connections: send the next request to the server with the fewest open connections. Better when request times vary wildly.
IP hash: a hash of the client’s IP picks the server. Provides session stickiness without cookies, though the mapping shifts for some clients whenever servers are added or removed.
Weighted: bigger servers get a higher share. Useful for heterogeneous fleets.
Power-of-two-choices: pick two servers at random, send to the less loaded. Surprisingly close to optimal at very low cost.
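
The algorithms above are small enough to sketch in a few lines of Python each. This is a simplified illustration under stated assumptions: the Backend class and its active counter are hypothetical, the caller is assumed to keep active up to date, and the weighted variant is the naive "repeat by weight" form rather than the smoother schemes real load balancers tend to use.

```python
import hashlib
import itertools
import random


class Backend:
    """Hypothetical backend record used only for this sketch."""

    def __init__(self, name: str, weight: int = 1):
        self.name = name
        self.weight = weight
        self.active = 0  # open connections, maintained by the caller


def round_robin(backends):
    """Return an iterator that cycles through the backends in order."""
    return itertools.cycle(backends)


def weighted_round_robin(backends):
    """Naive weighting: each backend appears `weight` times per cycle."""
    expanded = [b for b in backends for _ in range(b.weight)]
    return itertools.cycle(expanded)


def least_connections(backends):
    """Pick the backend with the fewest open connections."""
    return min(backends, key=lambda b: b.active)


def ip_hash(backends, client_ip: str):
    """Map a client IP to a backend; stable only while the pool is unchanged."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def power_of_two_choices(backends, rng=random):
    """Sample two distinct backends (needs at least two) and keep the less loaded."""
    a, b = rng.sample(backends, 2)
    return a if a.active <= b.active else b
```

Note that power_of_two_choices inspects only two counters per request, whereas least_connections scans the whole pool; that difference is the "very low cost" mentioned above.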

Health checks are everything

A load balancer is only as good as its health checks. The classic mistakes:

Checking / instead of a real /health endpoint — your homepage might 200 while the database is on fire.
Health endpoint checks downstream dependencies — one slow database takes the whole fleet out of rotation simultaneously.
Overly aggressive thresholds: a single failed check pulls a node, shifting its load onto the remaining nodes and risking thundering herds and cascading failures.
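
One common guard against the threshold mistake is to require several consecutive results before changing a node's state. The following Python sketch shows the idea; the HealthTracker class, the fall and rise parameter names, and their default values are assumptions chosen for illustration, not any particular load balancer's defaults.

```python
class HealthTracker:
    """Track one backend's health using consecutive-result thresholds."""

    def __init__(self, fall: int = 3, rise: int = 2):
        self.fall = fall     # consecutive failures before marking unhealthy
        self.rise = rise     # consecutive successes before marking healthy again
        self.healthy = True
        self._streak = 0     # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feed one check result and return the (possibly updated) health state."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with the current state
            return self.healthy
        self._streak += 1
        threshold = self.fall if self.healthy else self.rise
        if self._streak >= threshold:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy
```

With fall=3, one dropped probe leaves the node in rotation; only three failures in a row remove it, and it must then pass two checks in a row to return.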

The sweet spot: a lightweight /health that checks only local liveness, plus a separate /ready for orchestrators.
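
As a rough sketch of that split, the Python below answers /health from local state only and lets /ready consult a dependency. The database_reachable function is a hypothetical placeholder, and the response bodies are arbitrary choices made for this example.

```python
from http.server import BaseHTTPRequestHandler


def database_reachable() -> bool:
    """Hypothetical dependency probe; replace with a real, time-bounded check."""
    return True


def check(path: str):
    """Return (status_code, body) for a probe path."""
    if path == "/health":
        # Liveness only: the process is up and able to answer.
        return 200, b"ok"
    if path == "/ready":
        # Readiness: may consult dependencies before accepting traffic.
        return (200, b"ready") if database_reachable() else (503, b"not ready")
    return 404, b"not found"


class ProbeHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around check()."""

    def do_GET(self):
        status, body = check(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, one could serve the handler with http.server.HTTPServer(("127.0.0.1", 8080), ProbeHandler).serve_forever(); the address and port are arbitrary.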

Connection draining

When you remove a backend (deploy, scale-in), the load balancer should stop sending it new connections but let existing ones finish. AWS calls this “deregistration delay”; NGINX’s graceful shutdown behaves similarly, letting workers finish in-flight requests before exiting. Always set this: without it, every deploy means dropped requests.
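
The draining behavior described above can be modeled as a small state machine. This Python sketch is illustrative only: the DrainingBackend class, its method names, and the 30-second default timeout are assumptions, and timestamps are passed in explicitly so the logic stays easy to test.

```python
class DrainingBackend:
    """A backend that can be drained before it is removed."""

    def __init__(self, name: str, drain_timeout: float = 30.0):
        self.name = name
        self.drain_timeout = drain_timeout  # seconds to wait for in-flight work
        self.active = 0                     # open connections
        self.drain_started = None           # timestamp when draining began

    def open_connection(self) -> bool:
        """Admit a new connection unless the backend is draining."""
        if self.drain_started is not None:
            return False
        self.active += 1
        return True

    def close_connection(self) -> None:
        self.active = max(0, self.active - 1)

    def start_drain(self, now: float) -> None:
        """Stop admitting new connections from this moment on."""
        self.drain_started = now

    def safe_to_remove(self, now: float) -> bool:
        """True once draining has begun and connections finished or timed out."""
        if self.drain_started is None:
            return False
        timed_out = now - self.drain_started >= self.drain_timeout
        return self.active == 0 or timed_out
```

The timeout matters because a single long-lived connection would otherwise block the removal indefinitely.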

Global load balancing

One load balancer per region only takes you so far. For a global service you layer GeoDNS or Anycast on top: users hit the closest region, and inside that region a regional LB picks a server. Cloudflare, Fastly, and the major clouds all offer this as a managed service.
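
The first tier of that two-tier decision can be sketched as "pick the closest healthy region". The Python below uses measured latency as a stand-in for closeness, which is an assumption made for illustration: GeoDNS typically works from geography and Anycast from network routing. The pick_region function and the region names are hypothetical.

```python
def pick_region(latency_ms: dict, healthy: set):
    """Choose the lowest-latency healthy region for a client, or None if none is up."""
    candidates = {region: ms for region, ms in latency_ms.items() if region in healthy}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical client whose measured latency to each region is known.
client_latency = {"us-east": 20, "eu-west": 90, "ap-south": 210}
nearest = pick_region(client_latency, healthy={"us-east", "eu-west", "ap-south"})
fallback = pick_region(client_latency, healthy={"eu-west", "ap-south"})
```

Once a region is chosen, the regional load balancer takes over and picks a server with one of the algorithms covered earlier.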

What to learn next

Load balancers usually sit at the edge of a cloud VPC and often pair with a CDN. Understanding TCP and TLS is essential for debugging them.
