Traffic Distribution & Data-Path Mechanics in Load Balancers Quiz

Q1. A load balancer uses round-robin to distribute requests to two servers. Server A has double the CPU and memory of Server B. Under round-robin, both servers get equal traffic and Server B overloads. Which algorithm would better utilize Server A’s extra capacity by sending it more requests?
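
For intuition, a minimal Go sketch of weight-proportional scheduling: each backend is repeated in proportion to a static capacity weight. The server names and the 2:1 weights are assumed for illustration.

```go
package main

import "fmt"

// Backend pairs a server with a static capacity weight.
type Backend struct {
	Name   string
	Weight int
}

// WeightedRR cycles through backends, repeating each one
// in proportion to its weight.
type WeightedRR struct {
	backends []Backend
	idx, rem int
}

func (w *WeightedRR) Next() Backend {
	b := w.backends[w.idx]
	w.rem++
	if w.rem >= b.Weight { // this backend has had its share; move on
		w.rem = 0
		w.idx = (w.idx + 1) % len(w.backends)
	}
	return b
}

func main() {
	lb := &WeightedRR{backends: []Backend{
		{"server-a", 2}, // double the capacity, double the share
		{"server-b", 1},
	}}
	for i := 0; i < 6; i++ {
		fmt.Println(lb.Next().Name) // a a b a a b ...
	}
}
```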




Q2. Two servers have identical specs, but clients connected to Server X tend to stay connected much longer than those on Server Y. Under equal distribution, Server X becomes overloaded with many open connections. Which load-balancing algorithm alleviates this by routing new clients to the server with fewer active connections?
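
A minimal Go sketch of picking by live connection count rather than by turn order; the pool and the counts are illustrative.

```go
package main

import "fmt"

// Backend tracks how many connections it currently has open.
type Backend struct {
	Name   string
	Active int
}

// pickLeastConnections returns the backend with the fewest active
// connections, so long-lived clients parked on one server stop
// attracting all new arrivals.
func pickLeastConnections(pool []*Backend) *Backend {
	best := pool[0]
	for _, b := range pool[1:] {
		if b.Active < best.Active {
			best = b
		}
	}
	best.Active++ // the new client now counts against it
	return best
}

func main() {
	pool := []*Backend{
		{Name: "server-x", Active: 40}, // many long-lived connections
		{Name: "server-y", Active: 5},
	}
	fmt.Println(pickLeastConnections(pool).Name) // server-y
}
```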




Q3. A web application stores user session data in memory on each server. Through the load balancer, a user’s later requests sometimes hit a different server and lose their session. What LB feature should be enabled to ensure each user’s requests go to the same backend that holds their session?
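
One simple way to realize the feature in question, sketched in Go: deterministically hash an LB-issued session identifier to a backend, so the same ID always lands on the same server. Real LBs often instead record the chosen backend directly in a cookie; the pool and session ID here are made up.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

var backends = []string{"app-1", "app-2", "app-3"} // illustrative pool

// stickyBackend maps a session identifier (e.g. the value of an
// LB-issued cookie) to a fixed backend, so every request carrying
// the same ID reaches the server holding that session's memory.
func stickyBackend(sessionID string) string {
	h := fnv.New32a()
	h.Write([]byte(sessionID))
	return backends[h.Sum32()%uint32(len(backends))]
}

func main() {
	// Same session ID -> same backend on every call.
	fmt.Println(stickyBackend("sess-abc123"))
	fmt.Println(stickyBackend("sess-abc123"))
}
```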




Q4. An Envoy proxy sits between clients and microservice instances. Clients make many short HTTP/1.1 requests, each opening a new TCP connection to the backend. If Envoy reuses persistent connections (keep-alives) to the microservices instead of opening a new connection per request, what is the primary performance benefit?
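
A Go sketch of the reuse pattern on the backend side of a proxy: a pooled `http.Transport` keeps idle connections per host, so repeated short requests skip the TCP (and TLS) handshake and slow-start ramp. The backend URL and pool sizes are assumptions for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	// Keep idle TCP connections to each backend and reuse them,
	// instead of paying a fresh handshake per short request.
	transport := &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 10,
		IdleConnTimeout:     90 * time.Second,
	}
	client := &http.Client{Transport: transport}

	for i := 0; i < 3; i++ {
		resp, err := client.Get("http://backend.internal:8080/health") // hypothetical URL
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		io.Copy(io.Discard, resp.Body) // drain fully so the connection can be reused
		resp.Body.Close()
		fmt.Println("status:", resp.Status)
	}
}
```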




Q5. A data-center load balancer forwards incoming packets to backend servers but is configured so that each server’s response bypasses the LB and goes directly back to the client. What is this high-performance forwarding mode called?




Q6. In a NAT-44 “full-proxy” load balancer mode, the LB accepts a client’s connection and opens a new connection to the server, replacing the source IP. What source IP will the backend server typically see for all incoming requests in this mode?
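
A tiny full-proxy sketch in Go: the listener terminates the client's connection and dials the backend over a second, proxy-sourced connection, which is why the backend sees the LB's own address as the peer. The port and backend address are hypothetical.

```go
package main

import (
	"io"
	"log"
	"net"
)

// handle terminates the client's TCP connection and opens a
// *separate* connection to the backend. The backend's peer address
// is the proxy's source IP, not the original client's.
func handle(client net.Conn, backendAddr string) {
	defer client.Close()
	backend, err := net.Dial("tcp", backendAddr) // new, proxy-sourced connection
	if err != nil {
		log.Println("backend dial failed:", err)
		return
	}
	defer backend.Close()
	go io.Copy(backend, client) // client -> backend
	io.Copy(client, backend)    // backend -> client; returns when either side closes
}

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go handle(conn, "10.0.0.5:9000") // hypothetical backend address
	}
}
```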




Q7. In a microservices deployment, an Envoy load balancer is using a “least request” algorithm: it randomly picks two servers and forwards the new request to whichever has fewer active connections. What is the main advantage of this “power of two choices” approach over simple round-robin?
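
The question already describes the mechanism; here is a minimal Go sketch of two-choice sampling over an in-memory pool (names and counts are illustrative). Note that only two entries are inspected per decision, not the whole pool.

```go
package main

import (
	"fmt"
	"math/rand"
)

type Backend struct {
	Name   string
	Active int
}

// pickP2C samples two distinct backends at random and returns the one
// with fewer in-flight requests: near-least-loaded placement without
// scanning (or locking) the entire pool.
func pickP2C(pool []*Backend) *Backend {
	i := rand.Intn(len(pool))
	j := rand.Intn(len(pool) - 1)
	if j >= i {
		j++ // ensure the two samples differ
	}
	a, b := pool[i], pool[j]
	if b.Active < a.Active {
		a = b
	}
	a.Active++
	return a
}

func main() {
	pool := []*Backend{{"s1", 3}, {"s2", 9}, {"s3", 1}}
	fmt.Println(pickP2C(pool).Name)
}
```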




Q8. A global service has servers in multiple regions. The load balancer continuously measures each server’s response times and dynamically sends more traffic to the fastest-responding servers and less to slower ones. Which distribution algorithm is this?
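
A sketch of one common implementation of this idea in Go: each backend keeps an exponentially weighted moving average of observed latency, and new requests go to the lowest score. The smoothing factor, names, and latencies are assumed values.

```go
package main

import (
	"fmt"
	"time"
)

// Backend keeps a smoothed view of its observed response time;
// faster servers score lower and win.
type Backend struct {
	Name string
	EWMA float64 // smoothed latency in milliseconds
}

// observe folds a new latency sample into the running average.
func (b *Backend) observe(sample time.Duration) {
	const alpha = 0.2 // weight of the newest sample (assumed value)
	ms := float64(sample.Milliseconds())
	b.EWMA = alpha*ms + (1-alpha)*b.EWMA
}

// pickFastest routes the next request to the backend with the
// lowest smoothed response time.
func pickFastest(pool []*Backend) *Backend {
	best := pool[0]
	for _, b := range pool[1:] {
		if b.EWMA < best.EWMA {
			best = b
		}
	}
	return best
}

func main() {
	us := &Backend{Name: "us-east", EWMA: 30}
	eu := &Backend{Name: "eu-west", EWMA: 120}
	eu.observe(80 * time.Millisecond)                 // eu improves but is still slower
	fmt.Println(pickFastest([]*Backend{us, eu}).Name) // us-east
}
```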




Q9. An array of L4 load balancer instances uses stateless ECMP at the network layer to distribute flows (each packet’s 5-tuple hash decides the backend). The LBs do not share session state. What happens if one load balancer instance suddenly fails?
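
A sketch of the stateless property the question hinges on: placement is a pure function of the 5-tuple, so any instance (or its replacement after a failure) computes the same answer, provided the backend set is unchanged. FNV and the addresses are illustrative; real ECMP hashing lives in routers and NICs.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// flowHash maps a connection's 5-tuple to a backend index. It is pure
// and stateless: no per-flow table is consulted, so nothing needs to
// be replicated between LB instances.
func flowHash(srcIP, dstIP string, srcPort, dstPort uint16, proto uint8, n int) int {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s|%s|%d|%d|%d", srcIP, dstIP, srcPort, dstPort, proto)
	return int(h.Sum32()) % n
}

func main() {
	// The same flow always hashes to the same backend slot,
	// no matter which instance evaluates it.
	fmt.Println(flowHash("198.51.100.7", "203.0.113.10", 53211, 443, 6, 4))
	fmt.Println(flowHash("198.51.100.7", "203.0.113.10", 53211, 443, 6, 4))
}
```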




Q10. A service initially used source-IP affinity for stickiness (e.g. Kubernetes Service `sessionAffinity: ClientIP`). However, many users share the same public IP (behind NAT), overloading one server. Switching to cookie-based session affinity helped because:
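
A small Go contrast of the two affinity keys: hashing the shared NAT IP pins every user behind it to one server, while hashing a per-user cookie value spreads them across the pool. Pool names, the IP, and the cookie values are made up.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

var pool = []string{"s1", "s2", "s3"} // illustrative pool

// pick hashes an affinity key to a backend.
func pick(key string) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	return pool[h.Sum32()%uint32(len(pool))]
}

func main() {
	// Three users behind one NAT share a source IP but carry
	// distinct LB-issued cookies.
	ip := "203.0.113.50"
	cookies := []string{"u-ann", "u-bob", "u-cho"}

	for _, c := range cookies {
		fmt.Printf("ip-affinity -> %s, cookie-affinity -> %s\n", pick(ip), pick(c))
	}
	// IP affinity pins all three to one server; cookie affinity keys
	// on per-user state, so they can spread across the pool.
}
```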




Q11. A web API is served via an L7 proxy load balancer. Clients use HTTP/1.1, so the LB opens a new backend TCP connection for each request. This overhead is high. The team enables HTTP/2 (or HTTP/3/QUIC) between the LB and servers. Why does this improve efficiency?
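
A Go sketch of the upgrade on the LB-to-server hop: with HTTP/2 (negotiated via ALPN over TLS here), many concurrent requests multiplex as streams over a single connection instead of one TCP connection per in-flight HTTP/1.1 request. The backend URL is hypothetical.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

func main() {
	// ForceAttemptHTTP2 re-enables h2 negotiation when a custom
	// TLSClientConfig is supplied.
	transport := &http.Transport{
		ForceAttemptHTTP2: true,
		TLSClientConfig:   &tls.Config{}, // h2 is selected via ALPN during the TLS handshake
	}
	client := &http.Client{Transport: transport}

	resp, err := client.Get("https://backend.internal/api") // hypothetical URL
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	// Reports "HTTP/2.0" when the server supports h2; all concurrent
	// requests to this host then share one multiplexed connection.
	fmt.Println("negotiated protocol:", resp.Proto)
}
```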




Q12. A load balancer is set up in Direct Server Return mode, but the back-end servers reside in a different L3 network (no shared subnet with the LB). To still allow servers to respond directly to clients, what forwarding method could the LB use?




Q13. A distributed cache system needs the load balancer to consistently send the same client (or key) to the same backend server to maximize cache hits. It also wants minimal disruption when servers are added/removed (only a small portion of clients remap on changes). Which load balancing technique is best suited for this?
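
A minimal Go consistent-hash ring with virtual nodes: the same key always lands on the same server, and removing a server remaps only the slice of keys it owned. Server names and the replica count are illustrative.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Ring is a minimal consistent-hash ring: each server owns several
// points on a circle; a key routes to the first point clockwise.
type Ring struct {
	points []uint32
	owner  map[uint32]string
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func NewRing(servers []string, replicas int) *Ring {
	r := &Ring{owner: map[uint32]string{}}
	for _, s := range servers {
		for i := 0; i < replicas; i++ {
			p := hash32(fmt.Sprintf("%s#%d", s, i)) // virtual node point
			r.points = append(r.points, p)
			r.owner[p] = s
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Get returns the server owning the first ring point at or after
// the key's hash, wrapping around the circle.
func (r *Ring) Get(key string) string {
	h := hash32(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0 // wrap around
	}
	return r.owner[r.points[i]]
}

func main() {
	ring := NewRing([]string{"cache-1", "cache-2", "cache-3"}, 100)
	fmt.Println(ring.Get("user:42")) // stable across calls and unrelated removals
}
```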




Q14. After a new server instance is added to a pool, the load balancer immediately starts sending it a full share of traffic. The new server, with cold caches and not yet optimized, becomes overwhelmed. What load balancer feature would help avoid this by gradually increasing traffic to a new or recovered backend?
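
A sketch of the ramp in Go: a new backend's effective weight grows from near zero to full over a warm-up window, which is how slow-start features are commonly modeled. The 5-minute window and the weights are assumed values.

```go
package main

import (
	"fmt"
	"time"
)

// rampWeight returns a backend's effective weight during its warm-up
// window: near zero when the server joins, growing linearly to the
// full weight, so a cold-cache instance is eased into traffic.
func rampWeight(full int, joined time.Time, window time.Duration, now time.Time) int {
	age := now.Sub(joined)
	if age >= window {
		return full // warm-up over: normal share of traffic
	}
	w := int(float64(full) * float64(age) / float64(window))
	if w < 1 {
		w = 1 // always eligible for a trickle of traffic
	}
	return w
}

func main() {
	joined := time.Now()
	window := 5 * time.Minute // assumed warm-up period
	for _, after := range []time.Duration{30 * time.Second, 2 * time.Minute, 6 * time.Minute} {
		fmt.Printf("after %v: weight %d/100\n",
			after, rampWeight(100, joined, window, joined.Add(after)))
	}
}
```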




Q15. Under heavy load, clients occasionally time out when opening connections through the LB to a backend service. Investigation shows the application isn’t accepting new TCP connections fast enough. Which tuning knob is most likely to help in this scenario?
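
The knob at issue is typically the TCP listen backlog (together with the kernel cap, e.g. net.core.somaxconn on Linux). A Unix-only Go sketch that passes an explicit, larger backlog to listen(2); the port and backlog value are illustrative.

```go
package main

import (
	"fmt"
	"syscall"
)

func main() {
	// The listen backlog is the queue of completed handshakes waiting
	// for accept(). Raising it absorbs accept bursts instead of
	// letting clients time out on connect.
	fd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_STREAM, 0)
	if err != nil {
		panic(err)
	}
	addr := syscall.SockaddrInet4{Port: 8080} // 0.0.0.0:8080
	if err := syscall.Bind(fd, &addr); err != nil {
		panic(err)
	}
	// Explicit, larger backlog (the kernel clamps it to somaxconn).
	if err := syscall.Listen(fd, 4096); err != nil {
		panic(err)
	}
	fmt.Println("listening with backlog 4096")
	syscall.Close(fd)
}
```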




Q16. A load balancer is proxying large file downloads to many users. Some clients on slow networks cause the LB to buffer a lot of data in memory while sending to them, pushing memory limits. How can this be mitigated through tuning?
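
One mitigation pattern, sketched in Go: stream through a small fixed buffer instead of accumulating the response in memory for slow readers. Real LBs expose this as buffer-size or watermark settings; the origin URL here is hypothetical.

```go
package main

import (
	"io"
	"log"
	"net/http"
)

// proxyDownload streams a large backend response to the client
// through a bounded buffer, so a slow client throttles the copy
// (backpressure) rather than forcing the proxy to hold the file.
func proxyDownload(w http.ResponseWriter, r *http.Request) {
	resp, err := http.Get("http://backend.internal/big-file") // hypothetical origin
	if err != nil {
		http.Error(w, "upstream error", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	buf := make([]byte, 32*1024) // bounded per-connection buffer
	if _, err := io.CopyBuffer(w, resp.Body, buf); err != nil {
		log.Println("copy aborted:", err)
	}
}

func main() {
	http.HandleFunc("/download", proxyDownload)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```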




Q17. A high-traffic website must handle millions of concurrent connections with minimal added latency. Architects are debating using a cluster of L7 proxy load balancers versus a simpler stateless L4 (ECMP) load balancing approach. What is a key reason to choose the stateless L4 ECMP design in this case?




Q18. A load balancer in SNAT mode (using one virtual IP to source-NAT backend connections) is nearing the ~64k port limit to one backend server (its ephemeral source ports are being exhausted). What is a practical way to increase the number of concurrent connections the LB can handle to that server?
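
A sketch of the usual remedy, in Go: give the LB several SNAT source IPs and spread connections across them, since the ~64k ephemeral-port budget applies per source IP toward a given backend address and port. The addresses are assumed for the example.

```go
package main

import (
	"fmt"
	"net"
)

// Each SNAT source IP contributes its own ~64k ephemeral ports toward
// one backend; rotating dials across several local addresses
// multiplies the concurrent-connection budget.
var snatIPs = []string{"10.0.0.11", "10.0.0.12", "10.0.0.13"} // assumed LB-owned addresses

// dialVia opens a backend connection sourced from one of the SNAT IPs.
func dialVia(i int, backend string) (net.Conn, error) {
	d := net.Dialer{
		LocalAddr: &net.TCPAddr{IP: net.ParseIP(snatIPs[i%len(snatIPs)])},
	}
	return d.Dial("tcp", backend)
}

func main() {
	conn, err := dialVia(0, "10.0.1.5:9000") // hypothetical backend
	if err != nil {
		fmt.Println("dial failed (expected outside a lab):", err)
		return
	}
	defer conn.Close()
	fmt.Println("connected from", conn.LocalAddr())
}
```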




system-design