Everything about Load Balancer with Cheat Sheet

Load balancers play a significant role in any system design. The beauty of the design is that every engineer takes it for granted that it will work. It is one of the best system designs, and there is a lot to learn from this simple yet powerful solution. The load balancer (LB) acts as a middleman between clients and servers: the client sends requests to the LB, which internally, via its physical NICs and virtual IPs (VIPs), forwards them to the attached web servers.

Load Balancers

In a nutshell, load balancing is the distribution of network traffic across multiple servers according to one of several algorithms. Load balancing is handled by a tool or application called a load balancer. A load balancer can be either hardware-based or software-based. Hardware load balancers require the installation of a dedicated load-balancing device; software-based load balancers can run on a server, on a virtual machine, or in the cloud.

Suppose all the traffic to a service converges on only a few machines. This will not only overload them, increasing the application's latency and killing its performance, but will also eventually bring them down.

Load balancing helps us avoid all this mess. If a server goes down while user requests are being processed, the load balancer automatically routes future requests to other up-and-running servers in the cluster. This keeps the service as a whole available.

We can balance the load at each layer of the system to get full scalability and redundancy. We can add LBs at three places:

Between the user and the web server
Between web servers and an internal platform layer, like application servers or cache servers
Between the internal platform layer and the database.

Benefits of Load Balancing

Users experience faster, uninterrupted service. Users won't have to wait for a single struggling server to finish its previous tasks. Instead, their requests are passed immediately to a more readily available resource.
Service providers experience less downtime and higher throughput. Even a complete server failure won't affect the end-user experience as the load balancer will route around it to a healthy server.
Load balancing makes it easier for system administrators to handle incoming requests while decreasing user wait time.
Intelligent load balancers provide benefits like predictive analytics that determine traffic bottlenecks before they happen. As a result, the intelligent load balancer gives an organization actionable insights. These are key to automation and can help drive business decisions.
System administrators experience fewer failed or stressed components. Instead of a single device performing a lot of work, load balancing has several devices each performing a little.

Types of Load Balancers

In the seven-layer Open Systems Interconnection (OSI) model, network firewalls operate at layers one to three (L1 Physical, L2 Data Link, and L3 Network). Meanwhile, load balancing happens at layers four to seven (L4 Transport, L5 Session, L6 Presentation, and L7 Application).

Load balancers have different capabilities, which include:

L4: directs traffic based on data from network and transport layer protocols, such as IP address and TCP port.
L7: adds content switching to load balancing. It allows routing decisions based on attributes like HTTP header, uniform resource identifier, SSL session ID, and HTML form data.
GSLB: Global Server Load Balancing extends L4 and L7 capabilities to servers in different locations.
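
The difference between L4 and L7 decisions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Request class, the pool names, and the routing tables below are all hypothetical and not taken from any particular load balancer. The L4 function looks only at transport-level data (here the destination port), while the L7 function does content switching on the URL path.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    # Visible at L4 (network and transport layers).
    src_ip: str
    dst_port: int
    # Visible only at L7 (application layer).
    path: str = "/"
    headers: dict = field(default_factory=dict)


# Hypothetical server pools, for illustration only.
L4_POOLS = {80: ["web-1", "web-2"], 443: ["tls-1", "tls-2"]}
L7_POOLS = {"/api/": ["api-1", "api-2"], "/static/": ["cdn-1"]}
DEFAULT_POOL = ["web-1", "web-2"]


def l4_pool(req: Request) -> list:
    """L4 decision: choose a pool using only the destination port."""
    return L4_POOLS.get(req.dst_port, DEFAULT_POOL)


def l7_pool(req: Request) -> list:
    """L7 decision: content switching on the URL path prefix."""
    for prefix, pool in L7_POOLS.items():
        if req.path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

For example, a request for /api/users on port 443 would go to the TLS pool under the L4 rule, but to the API pool under the L7 rule, because only the L7 balancer can see the path.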

DNS Load Balancing: DNS-based load balancing is a specific type of load balancing that uses DNS to distribute traffic across several servers.

Load-balancing algorithm techniques

Different load-balancing algorithms can be used depending on how the load should be distributed. The algorithms consider two aspects of a server: (i) server health and (ii) predefined conditions.

Round Robin Algorithm: the round-robin (RR) algorithm distributes requests to the servers in circular order. There are two variants of Round Robin:
Weighted Round Robin: each server is assigned a weight depending on its capacity. The load is distributed cyclically in proportion to these preassigned weights.
Dynamic Round Robin: forwards requests to servers based on server weights calculated in real time.
Least Connections: distributes the load by choosing the server with the fewest active transactions (connections).
Weighted least connections: distributes the load based on two factors: the number of currently active connections to each server and the relative capacity of the server.
Source IP hash: a server is selected based on a unique hash key. The hash key is generated from the source and destination IP addresses of the request, and clients are assigned to servers based on that key.
URL hash: URL hashing is a load-balancing method typically used when load-balanced servers serve content that is mainly (but not necessarily) unique per server. It is used, for example, in a deployment where a pool of cache servers responds to requests for content. Besides source IP and URL hash, similar hashing algorithms can key on other request attributes, such as cookies or HTTP methods.
Least response time: the backend server with the fewest active connections and the lowest average response time is selected. It ensures quick response time for end clients.
Least bandwidth method: backend servers are selected based on the server's bandwidth consumption, i.e., the server consuming the least bandwidth is selected (measured in Mbps).
Custom load method: the backend servers are chosen based on their load. CPU usage, memory, and server response time are considered when calculating the server load. This algorithm is suitable for predictable and stable traffic; it is less suitable when traffic is uneven or changes suddenly.
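
Several of the algorithms above fit in a few lines each. The following Python sketch is a simplified illustration, not how any specific load balancer implements them; the function names and the server names used with them are hypothetical. The hash-based picker uses a plain modulo, which is an assumption made for brevity: it remaps most keys whenever the pool size changes, which is why production systems often prefer consistent hashing.

```python
import hashlib
import itertools


def round_robin(servers):
    """Yield servers in a repeating circular order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Yield servers cyclically; each appears `weight` times per cycle."""
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def weighted_least_connections(active, capacity):
    """Pick the server with the lowest connections-to-capacity ratio."""
    return min(active, key=lambda s: active[s] / capacity[s])


def hash_pick(key, servers):
    """Map a key (source IP, URL, cookie, ...) to a server via a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

As a usage example, least_connections({"a": 5, "b": 2, "c": 7}) returns "b", and hash_pick called twice with the same source IP and the same server list always returns the same server, which is what gives hash-based methods their stickiness.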

Web-server monitoring

Load balancers conduct continuous health checks on servers to ensure they can handle requests. If a server or group of servers performs slowly, the load balancer sends it less traffic. If necessary, the load balancer removes unhealthy servers from the pool until they recover. Server failover is crucial for reliability: a server crash could bring down a website or application if there is no backup. Failover must take place quickly to avoid a gap in service. Some load balancers even trigger the creation of new virtualized application servers to cope with increased demand.
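
The bookkeeping behind health checks can be sketched as follows. This is a minimal illustration under assumptions: the ServerPool class, its max_failures threshold, and the rule of removing a server after that many consecutive failed checks are all hypothetical choices, and the actual probe (a TCP connect or an HTTP request, for instance) is left out so that only the pool logic is shown.

```python
class ServerPool:
    """Tracks consecutive failed health checks per server (illustrative only)."""

    def __init__(self, servers, max_failures=3):
        self.failures = {s: 0 for s in servers}
        self.max_failures = max_failures  # assumed threshold, not a standard value

    def record_check(self, server, ok):
        """Reset the counter on a passed check; increment it on a failed one."""
        self.failures[server] = 0 if ok else self.failures[server] + 1

    def healthy(self):
        """Servers still eligible to receive traffic."""
        return [s for s, n in self.failures.items() if n < self.max_failures]
```

A server drops out of healthy() after max_failures failed checks in a row and returns as soon as one check passes again, which mirrors the "remove until it recovers" behavior described above.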

Single point of failure

Even in a complex, robust system design, the load balancer can itself become a single point of failure. If it goes down, your system goes down as well. Here are some approaches to achieve high availability and avoid a single point of failure.

Active/Passive: consider deploying a pair of load balancers configured in high-availability (H/A) mode. The primary load balancer distributes the network traffic to the most suitable server, while the second load balancer operates in listening mode, constantly monitoring the primary and ready at any time to step in and take over the load-balancing duties should the primary fail. Operating load balancers in Active/Passive mode makes it possible to maintain uninterrupted customer service. Another advantage of this configuration is the ability to deal with planned or unplanned service outages.

Active/Active: two or more load balancers share the network traffic load and, working as a team, distribute it to the network servers. The load balancers can also remember the information users request and keep it in a cache. If a user returns looking for the same information, the user is directed to the load balancer that previously served them, and the information is provided again from the cache without the network server having to respond. This process reduces network traffic load.
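
The standby's monitoring role in an active/passive pair can be sketched with a simple heartbeat timeout. Everything here is a hypothetical illustration: the FailoverPair class, the load balancer names, and the three-second timeout are assumptions, and timestamps are passed in explicitly rather than read from a clock so the logic is easy to follow. Real deployments typically rely on a dedicated protocol or tool for this.

```python
class FailoverPair:
    """Promotes the standby when the primary's heartbeat goes silent."""

    def __init__(self, primary, standby, timeout=3.0):
        self.primary = primary
        self.standby = standby
        self.timeout = timeout  # seconds of silence tolerated (assumed value)
        self.last_heartbeat = 0.0
        self.active = primary

    def heartbeat(self, now):
        """Record that the primary reported itself alive at time `now`."""
        self.last_heartbeat = now

    def check(self, now):
        """Return the active balancer, promoting the standby on timeout."""
        silent_for = now - self.last_heartbeat
        if self.active == self.primary and silent_for > self.timeout:
            self.active = self.standby
        return self.active
```

Note that this sketch never fails back automatically: once the standby is promoted it stays active, which is one possible design choice rather than a rule.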

Security

TLS traffic is decrypted at the load balancer. When a load balancer decrypts traffic before passing the request on, it is called TLS termination. This saves the web servers from each having to do the decryption themselves. However, because the traffic between the load balancer and the web servers is then unencrypted, it can expose the application to possible attacks. The risk is lower when the load balancer is within the same data center as the web servers.

Note: For FIPS 140-2 compliance, you need TLS termination on both load balancers and web servers.

Concluding with the cheat sheet

By popular demand following Principles & Best practices of REST API Design, here is the cheat sheet for load balancers. Remember to share it with interested folks.

That concludes this article; I hope you have learned something new today. Thank you!