Load-Balancing as a Network Primitive

With the emergence of large web applications in the late 1990s, there was much interest in load-balancing to spread incoming requests across a set of identical web servers. Load-balancers usually exploit some existing network mechanism (e.g. DNS or anycast) so that they work without altering the network logic. Many commercial load-balancing products sit on the path of incoming requests and spread them over a set of servers. Load-balancing is also increasingly used for tasks beyond balancing web requests; for example, CDNs use it to serve content from multiple servers. It has become a commonly used element of scale-out network services.

Current load-balancing methods make a number of assumptions about the services:

Our work is premised on the following observation: load-balancing is essentially congestion-aware routing, where the congestion in question is both network and server congestion. This leads us to believe that load-balancing should be an integral property of the network. If we think of the network datapath as implementing a small, basic set of plumbing primitives (e.g. forward a packet to one or more ports), then load-balancing fits nicely into this model. It just means that the datapath has to decide intelligently, dynamically, and quickly which outgoing port to send a request to, and make sure that all the packets associated with the request follow the same path to the same server. We therefore seek a solution with the following characteristics:

To this end, we have built Aster*x, a prototype distributed load-balancer. Aster*x is premised on the belief that every switch and router should easily be able to do load-balancing, and that it is cheap to do so. Our approach builds on the growing trend towards software-defined networking, such as OpenFlow/NOX. In these approaches, the network switches are treated as a dumb, minimal flow-based datapath, under the control of a remote, software control plane. OpenFlow is the common, narrow, vendor-agnostic interface to the flow switches; NOX is the control plane upon which services like Aster*x are built.
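The split between a minimal flow-based datapath and a remote control plane can be illustrated with a small Python sketch. This is not the OpenFlow or NOX API, and it is not Aster*x code: the names `FlowKey`, `Switch`, and `make_least_loaded_controller` are hypothetical, and the "least-loaded port" rule is only a stand-in for whatever decision the control plane actually makes. The sketch shows one idea: the switch only matches flows against a table, asks the controller on a miss, and then keeps every later packet of that flow on the same port.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical flow identifier (a simplified 5-tuple)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


@dataclass
class Switch:
    """A 'dumb' datapath: it matches flows and forwards, nothing more."""
    controller: Callable[[FlowKey], int]          # remote control plane (assumed interface)
    flow_table: Dict[FlowKey, int] = field(default_factory=dict)
    misses: int = 0

    def forward(self, flow: FlowKey) -> int:
        """Return the output port for a packet belonging to `flow`."""
        port = self.flow_table.get(flow)
        if port is None:
            # Table miss: defer the decision to the controller, then cache it
            # so all later packets of this flow follow the same path.
            self.misses += 1
            port = self.controller(flow)
            self.flow_table[flow] = port
        return port


def make_least_loaded_controller(ports: List[int]) -> Callable[[FlowKey], int]:
    """Illustrative control-plane policy: pick the port with the fewest flows."""
    load = {p: 0 for p in ports}

    def decide(flow: FlowKey) -> int:
        port = min(ports, key=lambda p: load[p])
        load[port] += 1
        return port

    return decide


switch = Switch(controller=make_least_loaded_controller([1, 2]))
flow_a = FlowKey("10.0.0.1", 1234, "10.0.0.100", 80, "tcp")
flow_b = FlowKey("10.0.0.2", 1234, "10.0.0.100", 80, "tcp")
print(switch.forward(flow_a), switch.forward(flow_a), switch.forward(flow_b))
```

In this sketch the controller is consulted only once per flow; the second packet of `flow_a` is handled entirely from the switch's table.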

Aster*x treats each individual request, or a bundle of aggregated requests, as a flow, and decides how to route the flow. The flow could be, for example, a single HTTP request, or it could be all the requests for a particular service. Aster*x can decide whether to route each individual request, or use ECMP-like oblivious load-balancing in any combination. For example, it could choose to send all HTTP requests to one pool of servers, and all video requests to another pool; and then do ECMP-like load-balancing over each pool. Or it could choose to do oblivious load-balancing over different regions of a data center, then do careful, per-request load-balancing within one region. The key here is that Aster*x can be used flexibly to define how the load-balancing is done, under the control of the service or application. It is not pre-defined by a fixed-function load-balancer. Aster*x has the following characteristics:
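As an aside on the routing options described in the paragraph above, the combination of per-service pools, ECMP-like oblivious balancing, and careful per-request balancing might be sketched as follows. Everything here is an assumption made for illustration: the pool and server names, the mapping from destination port to service, and the use of a SHA-256 hash as the ECMP-like selector are not taken from Aster*x.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

# Hypothetical flow: (src_ip, src_port, dst_ip, dst_port)
Flow = Tuple[str, int, str, int]

# Hypothetical server pools, one per service.
POOLS: Dict[str, List[str]] = {
    "http": ["web-1", "web-2", "web-3"],
    "video": ["vid-1", "vid-2"],
}

# Assumed classification rule: destination port identifies the service.
SERVICE_BY_PORT: Dict[int, str] = {80: "http", 554: "video"}


def oblivious_pick(flow: Flow, servers: List[str]) -> str:
    """ECMP-like choice: hash the flow so the same flow always maps to the same server."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def careful_pick(servers: List[str], load: Dict[str, int]) -> str:
    """Per-request choice: pick the server with the lowest reported load."""
    return min(servers, key=lambda s: load.get(s, 0))


def route(flow: Flow, load: Optional[Dict[str, int]] = None) -> str:
    """Choose a pool by service, then balance within it.

    With no load information the choice is oblivious (hash-based);
    with load information it is a careful per-request decision.
    """
    servers = POOLS[SERVICE_BY_PORT[flow[3]]]
    if load is None:
        return oblivious_pick(flow, servers)
    return careful_pick(servers, load)


http_flow: Flow = ("10.0.0.1", 40000, "192.0.2.10", 80)
video_flow: Flow = ("10.0.0.2", 40001, "192.0.2.10", 554)
print(route(http_flow))
print(route(video_flow, load={"vid-1": 5, "vid-2": 1}))
```

The point of the sketch is that the policy lives in ordinary software (`route`), so a service could swap the classifier or the selection rule without changing the datapath.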

Comparing the performance of random and smart load-balancing with Aster*x

Demo: Aster*x load-balancing HTTP traffic in a CDN-like WAN environment