Load balancing distributes a workload across multiple servers to improve performance. Server clustering, on the other hand, combines multiple servers to function as a single entity.
Both load balancing and server clustering coordinate multiple servers to handle a greater workload, and these technologies are often used together. Server clusters generally require identical hardware to function, but load balancers can be used to distribute workload to different types of servers and can be more easily integrated into existing architecture.
These technologies have several characteristics in common:
- To an external device, either technology usually appears to be a single machine that manages all of the requests.
- Both technologies often incorporate reverse-proxy technologies that allow for a single IP address to redirect traffic to different IP or MAC addresses.
- Both technologies were developed for managing a data center’s physical servers, but have been extended to applications, virtual servers, cloud servers, and container technology. For brevity, we’ll simply use ‘server’ as a shorthand for the collective technologies.
When resource constrained, an IT manager may need to choose between these two concepts, but in practice, these technologies will often be implemented together. To understand the nuance and the use cases, let’s examine each concept in more detail.
Load balancing seeks to avoid overstressing any single device by splitting up jobs or traffic flow. For this article, we will focus on the concept as it applies to servers; however, the concept is also used in networking.
Load balancers can be simple or sophisticated. Simple load balancers consist of DNS Round Robin and OSI Layer 3/Layer 4 (L3/L4) load balancers which work at the IP and TCP layers. More sophisticated load balancers usually distribute work based upon application data (OSI Layer 7).
A DNS Round Robin load balancer simply directs traffic to a series of IP addresses on a list, one after another. This approach is most often used as a simple way to distribute traffic across several servers hosting a website.
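The rotation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular DNS server implements it: the `RoundRobinResolver` class and the addresses (taken from the reserved documentation range 192.0.2.0/24) are assumptions made for the example.

```python
from itertools import cycle


class RoundRobinResolver:
    """Hypothetical resolver that hands out addresses one after another."""

    def __init__(self, addresses):
        if not addresses:
            raise ValueError("at least one address is required")
        # cycle() repeats the list forever, wrapping back to the start.
        self._pool = cycle(list(addresses))

    def resolve(self):
        # Return the next address in the rotation.
        return next(self._pool)


# Example addresses are placeholders, not real servers.
resolver = RoundRobinResolver(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
answers = [resolver.resolve() for _ in range(4)]
print(answers)
```

Note that the rotation knows nothing about server health or load: the fourth request goes back to the first address whether or not that server is still responding.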
L3/L4 Load balancers will route traffic to servers based upon the type of data, port, or protocol. For example, video conferencing data, HTTPS web traffic, and Voice Over Internet Protocol (VoIP) traffic will communicate through different ports and a load balancer can redirect traffic coming in on each port to a different server. A simple L3/L4 implementation will simply route the traffic, but a more sophisticated implementation may check for server status and be able to route port traffic to more than one destination.
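As a rough sketch of the port-based routing just described, the following Python models a table that maps each incoming port to one or more destinations and skips any destination marked as down. The `PortRouter` class, the backend names, and the port assignments are hypothetical; a real L3/L4 balancer forwards packets rather than returning names.

```python
class PortRouter:
    """Hypothetical L4-style router: chooses a backend by destination port."""

    def __init__(self, routes):
        # routes maps a port number to an ordered list of backend names.
        self._routes = {port: list(backends) for port, backends in routes.items()}
        self._down = set()

    def mark_down(self, backend):
        # Record a failed health check for this backend.
        self._down.add(backend)

    def mark_up(self, backend):
        # Record that the backend has recovered.
        self._down.discard(backend)

    def route(self, port):
        # Return the first healthy backend for the port, or None if the
        # port is unknown or every backend for it is down.
        for backend in self._routes.get(port, []):
            if backend not in self._down:
                return backend
        return None


# Assumed example: HTTPS on 443 has two destinations, VoIP signalling on 5060 has one.
router = PortRouter({443: ["web-1", "web-2"], 5060: ["voip-1"]})
first_choice = router.route(443)
router.mark_down("web-1")
fallback_choice = router.route(443)
print(first_choice, fallback_choice, router.route(5060), router.route(8080))
```

Removing the `mark_down` bookkeeping gives the simple implementation mentioned above, which routes by port without checking server status.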
Application load balancing typically requires sophisticated load balancing that inspects the data and monitors the status of destination servers. Application load balancing reroutes data to servers optimized for specific data or to servers with more capacity at that time.
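To make the contrast with port-based routing concrete, here is a minimal sketch of an application-aware balancer. It inspects the request path, selects the pool of servers assigned to that kind of content, and picks the healthy server with the fewest active requests. The class name, the path prefixes, and the least-connections policy are assumptions chosen for illustration; real products offer many other policies.

```python
class ApplicationBalancer:
    """Hypothetical L7 balancer: routes by URL path, then by current load."""

    def __init__(self, pools):
        # pools maps a path prefix to the servers optimized for that content.
        self._pools = {prefix: list(servers) for prefix, servers in pools.items()}
        self._active = {s: 0 for servers in self._pools.values() for s in servers}
        self._healthy = set(self._active)

    def set_health(self, server, healthy):
        # Update the monitored status of a destination server.
        if healthy:
            self._healthy.add(server)
        else:
            self._healthy.discard(server)

    def pick(self, path):
        # Use the longest prefix that matches the request path.
        matches = [p for p in self._pools if path.startswith(p)]
        if not matches:
            return None
        prefix = max(matches, key=len)
        candidates = [s for s in self._pools[prefix] if s in self._healthy]
        if not candidates:
            return None
        # Least connections: choose the server with the most spare capacity.
        server = min(candidates, key=lambda s: self._active[s])
        self._active[server] += 1
        return server

    def release(self, server):
        # Call when a request finishes so the server's load count drops.
        self._active[server] = max(0, self._active[server] - 1)


balancer = ApplicationBalancer({"/video": ["video-1", "video-2"], "/": ["web-1"]})
picks = [balancer.pick("/video/a"), balancer.pick("/video/b"), balancer.pick("/index.html")]
balancer.set_health("video-1", False)
after_failure = balancer.pick("/video/c")
print(picks, after_failure)
```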
Load balancers can be implemented for a specific application or specific hardware. Alternatively, load balancers may manage traffic for a wide variety of destinations including local devices, cloud resources, containers, and server clusters. The market for complex load balancers is large enough that Gartner has published a dedicated Magic Quadrant for Application Delivery Controllers, so there is a wide variety of options to consider.
Server Clustering combines multiple resources to make them appear to be a single resource. For example, several servers can be connected to appear as a single server and share a single IP address. This technology can be used to improve resilience and performance.
The simplest clusters use a pair of redundant servers. More advanced clusters contain many different machines, each connected and exchanging information about states and other resources. Each server, virtual server, or container within a cluster is called a node. While it is technically possible for clusters to contain different types of nodes, performance may be diminished if the nodes are not identical.
IT managers can build four different types of server clusters to emphasize a specific benefit:
- High Availability Server Clusters
- Load Balancing Clusters
- High Performance Clusters
- Storage Clusters
High Availability Clusters prioritize resilience over other benefits and can be implemented in either Active-Passive or Active-Active architecture. Active-Passive architecture uses primary nodes that handle the workload with secondary nodes in hot standby. If the primary node crashes, software immediately switches the workload to the secondary node. Active-Active architecture will distribute the workload to all available nodes and is also the architecture used for the other types of server clusters.
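The Active-Passive behavior can be sketched as follows. This is a toy model under stated assumptions: the `ActivePassiveCluster` class and node names are hypothetical, and failure is reported by an explicit call, whereas real clusters typically detect it through heartbeats between nodes.

```python
class ActivePassiveCluster:
    """Hypothetical two-node cluster with one active node and one hot standby."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self._failed = set()

    def report_failure(self, node):
        # Record the failure; if the active node failed and the standby is
        # still healthy, promote the standby so it takes over the workload.
        self._failed.add(node)
        if node == self.active and self.standby not in self._failed:
            self.active, self.standby = self.standby, self.active

    def handle(self, request):
        # All requests go to whichever node is currently active.
        if self.active in self._failed:
            raise RuntimeError("no healthy node available")
        return f"{self.active} handled {request}"


cluster = ActivePassiveCluster("node-a", "node-b")
before = cluster.handle("request-1")
cluster.report_failure("node-a")
after = cluster.handle("request-2")
print(before)
print(after)
```

An Active-Active design would instead spread `handle` calls across every healthy node, which is why it doubles as the basis for the other cluster types.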
Load Balancing Clusters prioritize balancing the jobs among all of the servers in the cluster and incorporate load balancing software in the controlling node.
High Performance Clusters use multiple servers to perform a specific task extremely quickly and facilitate data-intensive projects such as live-streaming and real-time data processing. High Performance Clusters or Load Balancing Clusters may also have multiple network connections to increase data flow.
Storage Clusters provide enormous storage arrays, sometimes in support of High Performance Clusters, but always in a support role for other servers or clusters. There are many regulations that require certain types of data (credit card databases, personal information, etc.) to be isolated from the internet, so storing this data on storage clusters allows for further separation and additional security.
As with load balancing, there are a large number of providers, and there are Gartner Quadrants focused on subsets of server clustering technology such as Hyperconverged Infrastructure Software. Hyperconverged Infrastructure focuses on fully virtual deployment of the entire infrastructure: server clusters, controllers, network connections, and more. IT managers can choose among many nuanced deployments within the four overarching categories, so they must carefully consider current and future needs when designing their architecture.
Pros & Cons of Each Approach
If load balancing and clustering can often be used together, do they still have pros and cons? Of course. However, managing the pros and cons may be less about picking one technology over another and more about what combinations suit the requirements.
Load balancing can be simpler to deploy in established environments with different types of servers because clustering usually requires identical servers within the cluster. Server clusters are self-contained and managed by a controller automatically, but load balancers require additional networking expertise to set up and manage the different types of connected servers. Server clusters require node managers and node agents to communicate within the cluster, which occupies bandwidth and processing on the servers, but load balancers can operate independently of the destination servers and thus consume fewer resources.
Server clusters will be more resilient than load balancing for applications. If a consumer is processing a website purchase on a clustered system, the user may be able to continue the transaction even if one server in the cluster fails. If the system is load balanced without clustering, the user’s state will likely be lost and the consumer will be forced to re-enter data to complete the transaction.
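The difference in resilience comes down to where the user's state lives. The sketch below is a simplified illustration, assuming that clustered nodes share session state while independently load-balanced servers each keep their own; the `Node` class and the plain dictionaries standing in for session storage are hypothetical.

```python
class Node:
    """Hypothetical web server that keeps shopping carts in a session store."""

    def __init__(self, name, session_store):
        self.name = name
        self.sessions = session_store

    def add_to_cart(self, session_id, item):
        self.sessions.setdefault(session_id, []).append(item)

    def cart(self, session_id):
        return self.sessions.get(session_id, [])


# Clustered: both nodes read and write the same (shared or replicated) state.
shared_state = {}
cluster_a, cluster_b = Node("cluster-a", shared_state), Node("cluster-b", shared_state)
cluster_a.add_to_cart("session-1", "ticket")
# cluster-a fails; cluster-b picks up the same session.
clustered_cart = cluster_b.cart("session-1")

# Load balanced without clustering: each server has private state.
solo_a, solo_b = Node("solo-a", {}), Node("solo-b", {})
solo_a.add_to_cart("session-1", "ticket")
# solo-a fails; the balancer sends the user to solo-b, which has no record.
balanced_cart = solo_b.cart("session-1")

print(clustered_cart, balanced_cart)
```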
These solutions aren’t cheap and are not as simple to set up as individual servers. They only serve a purpose as the organization begins to grow and the needs would overwhelm a simple architecture. Each deployed server increases the complexity and cost for setup, security, monitoring, maintenance, and future upgrades. Virtualization and containers can lower some of these costs, but also require expensive specialized knowledge.
Consider two use cases: a website for a municipality and a website streaming pay-per-view sports content.
A municipality may want a new website that connects to established departments and provides remote access for employees. In this situation, deploying a load balancer in front of the web server would be the priority for two reasons. First, the municipality likely has low web traffic, so a single server meets its needs adequately. Second, deploying a load balancer would not require revising the architecture for the existing departments, and it can route traffic by IP address (departments) or by port (VPN access) quite inexpensively.
When streaming pay-per-view content, dedicated high-performance server clusters will need to be deployed to livestream the most popular events, and storage clusters will need to be deployed to hold credit card transaction data and to archive footage for later broadcast. The provider may also need to deploy load balancers in front of multiple server clusters if demand is high, or within the server clusters to manage the delivery of the video data.
Ultimately, as an organization grows, it will find itself struggling to meet evolving demands. The proper solution for that problem will be the one that solves the problem inexpensively and provides flexibility for change. In many organizations, simple load balancers may be the first purchase because they play well with existing legacy architecture. However, for growing websites, server clusters may be a more practical way to manage surges in demand with high uptime.