Foundations & Layering of Load Balancing
A comprehensive deck covering key concepts, motivations, and techniques in load balancing.
1. What is the primary purpose of a load balancer?
To distribute client requests across multiple backend servers so that no single server becomes a bottleneck. This improves scalability, availability, and latency, and it allows zero-downtime deployments.
2. Name the four core motivations for deploying load balancing.
High availability, horizontal scalability, performance/latency smoothing, and zero-downtime (rolling) deployments.
3. How do load balancers contribute to high availability?
They perform health checks and automatically reroute traffic away from failed or overloaded servers.
4. What is a Virtual IP (VIP)?
A service IP address that clients connect to; it is owned by the load balancer, not any single server, abstracting a pool of backend hosts behind one constant endpoint.
5. At which OSI layer does L3 load balancing operate and what information does it use?
Network layer (Layer 3); it makes decisions based solely on IP addresses.
6. Give one technology commonly used for L3 global load balancing.
IP anycast (via BGP advertisements).
7. Which four-tuple does an L4 load balancer typically examine?
Source IP, source port, destination IP, destination port.
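A minimal sketch of how an L4 balancer might turn that four-tuple into a backend choice. The pool addresses and the `pick_backend` helper are hypothetical, and SHA-256 stands in for whatever hash a real balancer uses.

```python
import hashlib

# Hypothetical backend pool, for illustration only.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def pick_backend(src_ip, src_port, dst_ip, dst_port, backends=BACKENDS):
    """Map a flow's four-tuple to one backend, deterministically."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

# Every packet of the same flow yields the same tuple, hence the same backend.
flow = ("203.0.113.7", 51544, "198.51.100.10", 443)
first = pick_backend(*flow)
second = pick_backend(*flow)
```

Plain modulo hashing like this remaps most flows when the pool size changes, which is one reason consistent hashing is used instead.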
8. Why are L4 load balancers generally faster than L7 load balancers?
They treat application data as opaque bytes, avoiding deep packet inspection and heavy parsing.
9. List two common transport protocols an L4 balancer can distribute.
TCP and UDP (also QUIC, which runs over UDP).
10. What extra capability does an L7 load balancer have over an L4 balancer?
It parses application-layer messages (e.g., HTTP) and can route based on URLs, headers, cookies, or body content.
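A rough sketch of content-aware routing, assuming the request has already been parsed into a path and a header dict. The pool names, the `X-Canary` header, and the `route` function are made up for illustration.

```python
# Hypothetical routing table: path prefix -> backend pool name.
ROUTES = [("/api/", "api-pool"), ("/static/", "static-pool")]
DEFAULT_POOL = "web-pool"

def route(path, headers):
    """Choose a pool from the parsed request; an L4 balancer never sees these fields."""
    # Header-based rule first (hypothetical canary flag).
    if headers.get("X-Canary") == "1":
        return "canary-pool"
    # Then the first matching path prefix.
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

Real HTTP header names are case-insensitive; this sketch assumes exact-case keys to stay short.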
11. In HTTP load balancing, what header does an L7 proxy often inject to preserve the client’s IP?
X-Forwarded-For.
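A small sketch of the usual convention: each proxy appends the address of the peer it received the request from, so the leftmost entry is the original client. The `add_forwarded_for` helper and the plain-dict header model are assumptions for illustration.

```python
def add_forwarded_for(headers, peer_ip):
    """Return a copy of headers with peer_ip appended to X-Forwarded-For."""
    updated = dict(headers)
    existing = updated.get("X-Forwarded-For")
    # Append to an existing chain, or start a new one.
    updated["X-Forwarded-For"] = f"{existing}, {peer_ip}" if existing else peer_ip
    return updated
```

A backend should trust this header only when it comes from a known proxy, since clients can send their own value.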
12. Describe the basic request/response flow involving a client, load balancer, and backend server.
DNS resolves → client connects to VIP → LB selects healthy backend → forwards request → backend responds → LB relays response to client.
13. What is zero-downtime deployment in the context of load balancing?
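Taking servers out of rotation one at a time (or in small batches), letting their in-flight connections drain, upgrading them, and returning them to the pool while traffic continues to flow to the other healthy servers.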
14. Which load-balancing layer is typically used for SSL/TLS termination and why?
Typically L7, because the proxy must decrypt the traffic before it can inspect the HTTP request and route on its content.
15. Explain stateless vs stateful load balancer designs.
Stateless LBs make deterministic choices (e.g., hashing) without storing per-connection state; stateful LBs track active sessions to support features like stickiness but add complexity.
16. What is session affinity (a.k.a. sticky sessions)?
A policy ensuring successive requests from the same client are routed to the same backend server.
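One way to sketch affinity is a stateful table from client identifier to backend. The `StickyBalancer` class is hypothetical, and the client ID could be a cookie value or a source IP, depending on the deployment.

```python
import itertools

class StickyBalancer:
    """Assign each new client round-robin, then keep returning the same backend."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)
        self._assigned = {}  # client id -> backend (the per-client state)

    def pick(self, client_id):
        if client_id not in self._assigned:
            self._assigned[client_id] = next(self._cycle)
        return self._assigned[client_id]

lb = StickyBalancer(["app-1", "app-2"])
```

The table is exactly the state that must be replicated or rebuilt if this balancer fails over.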
17. In global DNS-based load balancing, what limits the speed of fail-over?
DNS caching (TTL); clients may continue using old IPs until the cache expires.
18. Name one scenario where a simple L4 balancer is preferable to L7.
Distributing database TCP connections where content-aware routing is unnecessary and low latency overhead is critical.
19. How does Anycast achieve global load distribution?
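The same IP prefix is advertised from multiple locations; routing protocols (typically BGP) deliver each client to the announcement that is nearest in routing terms, which is not always the geographically closest one.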
20. What is Direct Server Return (DSR) and at which layer is it used?
An L4 technique where inbound traffic passes through the LB but the backend sends its responses directly to the client, bypassing the LB and reducing its bandwidth usage.
21. Which load-balancing tier typically handles DDoS protection and why?
Edge (global or regional) LBs, because they are the first point of contact and can filter malicious traffic before it reaches internal systems.
22. Define health check in load-balancing terminology.
A periodic test (ping, TCP connect, HTTP GET, etc.) used by the LB to verify backend server readiness; failed checks remove servers from rotation.
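A minimal sketch of the TCP-connect variant, using only the standard library. The function names and the one-second timeout are assumptions; a real balancer would also run this on a timer and require several consecutive failures before removing a server.

```python
import socket

def tcp_health_check(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat as unhealthy.
        return False

def healthy_backends(backends, check=tcp_health_check):
    """Keep only the (host, port) pairs that pass the check."""
    return [(host, port) for host, port in backends if check(host, port)]
```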
23. What is the main trade-off of placing multiple LB layers in a request path?
Additional complexity and latency versus specialized, context-aware decision-making at each tier (global vs local vs service-level).
24. Which load-balancing algorithm cycles through the backends in order, giving each an equal turn?
Round-robin.
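A sketch of round-robin with the standard library; the backend names are placeholders.

```python
import itertools

def round_robin(backends):
    """Yield backends in order, wrapping around forever."""
    return itertools.cycle(backends)

rr = round_robin(["app-1", "app-2", "app-3"])
picks = [next(rr) for _ in range(5)]
# picks == ["app-1", "app-2", "app-3", "app-1", "app-2"]
```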
25. Which algorithm routes new flows to the backend with the fewest current connections?
Least-connections (or least-load).
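A small sketch of least-connections bookkeeping. The `LeastConnections` class is hypothetical; ties go to the backend listed first.

```python
class LeastConnections:
    """Track open connections per backend and pick the least loaded one."""

    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        # min() returns the first backend with the lowest count.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when the connection closes.
        self.active[backend] -= 1
```

Unlike round-robin, this needs per-connection state, so the balancer must see connections close as well as open.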
26. How does consistent hashing help with cache or shard affinity?
It deterministically maps the same request key to the same backend, and when a backend is added or removed only a small fraction of keys move, which reduces cache misses and rebalance churn.
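A compact hash-ring sketch, assuming SHA-256 for placement and 100 virtual nodes per backend. The class, the backend names, and both parameters are illustrative choices, not a reference implementation.

```python
import bisect
import hashlib

def _hash(value):
    """Place a string on the ring as a 64-bit integer."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, backends, vnodes=100):
        # Each backend owns several points so load spreads more evenly.
        self._ring = sorted(
            (_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

keys = [f"user-{n}" for n in range(1000)]
before = HashRing(["cache-1", "cache-2", "cache-3"])
after = HashRing(["cache-1", "cache-2", "cache-3", "cache-4"])
moved = [k for k in keys if before.lookup(k) != after.lookup(k)]
```

Adding `cache-4` only inserts new points on the ring, so every key in `moved` lands on `cache-4` and all other keys keep their old backend.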
27. Why might a system choose edge L7 plus internal L4 layering?
Edge L7 handles TLS termination and smart routing, while internal L4 fans out uniform traffic with minimal overhead.
28. Give two examples of software-based L7 reverse proxies commonly used as load balancers.
NGINX and Envoy (also HAProxy, Traefik).
29. What protocol feature makes QUIC interesting for modern L4 balancing?
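QUIC multiplexes streams over UDP with built-in encryption and faster connection setup, but still appears as UDP to an L4 balancer. Because a QUIC connection is identified by a connection ID rather than by the four-tuple, a client can change address mid-connection, which four-tuple hashing alone does not handle.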
30. In a blue-green deployment, how does a load balancer enable quick rollback?
Traffic can be switched back from the new (green) pool to the old (blue) pool with a simple configuration update.
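A toy sketch of that switch, modelling the configuration as a single "active pool" pointer. The `BlueGreenRouter` class and the addresses are hypothetical.

```python
class BlueGreenRouter:
    """Send all traffic to whichever named pool is currently active."""

    def __init__(self, pools, active):
        self.pools = pools
        self.active = active

    def switch_to(self, name):
        if name not in self.pools:
            raise KeyError(name)
        self.active = name  # the one-line "configuration update"

    def backends(self):
        return self.pools[self.active]

router = BlueGreenRouter(
    {"blue": ["10.0.1.1", "10.0.1.2"], "green": ["10.0.2.1", "10.0.2.2"]},
    active="blue",
)
router.switch_to("green")  # cut over to the new release
router.switch_to("blue")   # roll back: the old pool is still running
```

Rollback is fast only because the old pool stays up until the new one is trusted.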
31. What document introduced Google’s high-performance software LB called Maglev?
The USENIX NSDI 2016 paper 'Maglev: A Fast and Reliable Software Network Load Balancer.'
32. How does ECMP (Equal-Cost Multi-Path) relate to L3 load balancing?
Routers use ECMP to distribute flows across multiple next hops with equal route cost, effectively balancing at the IP layer.
33. Why can DNS-based GSLB be considered stateless once resolution completes?
After a client gets an IP, the DNS system is no longer in the direct traffic path; routing decisions are embedded in the cached answer.
34. What is the typical downside of stateful LB fail-over?
Active sessions may be lost or require state replication, increasing complexity and recovery time.
35. Summarize the main advantage and disadvantage of L7 over L4 load balancing.
Advantage: application-aware, granular routing and security. Disadvantage: higher CPU/latency overhead and lower raw throughput.