Load Balancing 101: Load Balancer Power

The first step into the realm of load balancing, speccing out the hardware requirements, can be among the most confusing. The performance characteristics that matter for load balancers aren’t the same as those that matter for other networking devices, such as routers and firewalls.

In addition, the metrics that web sites deal with (unique visitors, hits per day) don’t immediately translate into a measured specification for load balancers, so picking an appropriately powered load balancer can be daunting, and often becomes a guessing game.

It need not be a guessing game, however. This article will discuss which performance metrics are important to load balancers and which are not, as well as how to take metrics that are typically available to a website and translate them into the metrics that will help you spec out an appropriate load balancer.

Load Balancer Metrics

First, it’s important to know what sort of performance metrics are used in the load balancing industry. For the most part, there are three primary categories.

  • Connection rate
  • Bandwidth (measured in bits per second)
  • Concurrent connections

Of the three, bandwidth is typically the most talked-about metric. I’m often asked, “Can X load balancer push Y amount of bandwidth?” But while it’s the most talked about, bandwidth alone is the least useful in determining a load balancer’s performance characteristics. The reason is that although a load balancer is a network device, its performance characteristics are more akin to those of a web server than those of a router, and it is burdened not so much by bandwidth as by connection rate.

Pushing pure bandwidth with a low rate of connections (such as with large file HTTP downloads or FTP) is not much of a resource burden for load balancers. A Pentium II system running at 300 MHz with a decent network card is capable of pushing 100 Mbps. Even modest modern load balancers are capable of pushing 100 Mbps or more in low-connection-rate traffic.

In fact, the only limiting factors in bandwidth are the Ethernet interface speed (100 Mbps with Fast Ethernet versus 1,000 Mbps for Gigabit Ethernet) and the hardware platform’s internal bandwidth, which is typically restricted by the bus (PCI or PCI Express). Again, even modest modern hardware is capable of pushing 100 Mbps or more.

It’s the connection rate, the most important metric, that determines a load balancer’s performance capabilities. This is because accepting a connection and then possibly delving into the HTTP protocol to look for cookies and other header information is CPU-intensive. You are far more likely to hit the connection rate limitation than any other performance limit.

Connection Rates and Bandwidth

Take two traffic profiles: one site serving HTTP downloads (files several megabytes in size), and another site serving up very small text files.

Even at one connection per second, the HTTP download site could easily fill a 100 Mbps line, and would do so without really burdening the load balancer. With the small file profile, you can reach tens of thousands of connections per second while not going above 50 Mbps, and drive the load balancer to its limits.
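The contrast between the two profiles can be made concrete with a little arithmetic. The following is a minimal sketch; the file sizes and connection rates are illustrative assumptions chosen to match the rough figures above, not measurements from any real site, and protocol overhead is ignored.

```python
def bandwidth_mbps(connections_per_second: float, bytes_per_connection: float) -> float:
    """Approximate payload bandwidth in megabits per second."""
    return connections_per_second * bytes_per_connection * 8 / 1_000_000


# Hypothetical profile 1: one 12.5 MB download started every second.
download_mbps = bandwidth_mbps(1, 12_500_000)

# Hypothetical profile 2: 20,000 requests per second for 300-byte text files.
small_file_mbps = bandwidth_mbps(20_000, 300)

print(f"download site:   {download_mbps:.0f} Mbps at 1 connection/s")
print(f"small-file site: {small_file_mbps:.0f} Mbps at 20,000 connections/s")
```

Under these assumptions the download site saturates Fast Ethernet with a single connection per second, while the small-file site stays under 50 Mbps despite a connection rate 20,000 times higher.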

An additional factor with connection rate is that a given load balancer will typically be able to service a higher rate of Layer 4 (just IP and port) load balancing connections than Layer 5-7 connections (looking at cookies, URLs, and other protocol-aware functions).

The third metric is simultaneous (concurrent) connections, measuring the number of active connections that the load balancer is servicing. This metric typically isn’t much of a factor anymore. Some older ASIC-based load balancers, such as those from Alteon, had hard limits on the number of simultaneous connections, but the numbers they could support were pretty high for the time and in most cases more than sufficient. With x86 hardware, it’s largely a function of how much memory is installed, and with memory cheap these days, it’s not really a problem.

Website Metrics

Page views per day is the best website metric to get, because it fairly accurately represents the load the load balancer would face in terms of connections. While each page view typically involves serving up multiple objects through HTTP 1.1, we only care about the number of HTTP connections (and as a result, connections per second), and not the number of objects pulled, to determine the performance needs of a web site. From page views per day, you can easily calculate page views per second, which represents connections per second.

If you don’t have page views, you’ll need to use the data you have available to come to something similar. Here are two fairly easy ways to get an appropriate figure.

The first uses SNMP. Microsoft distributes an SNMP service with its server operating systems, and it’s easy to install an SNMP daemon (such as net-snmp) on a Linux/Unix-type system. With that installed, there is a particular metric that can be reported, called tcpPassiveOpens, which is the measure of how many TCP connections (and thus HTTP connections) a web server receives.

You can use an application like MRTG or PRTG (for Windows) to graph this rate over time. The OID, or Object ID, for tcpPassiveOpens is 1.3.6.1.2.1.6.6.0, which will work with virtually any SNMP-enabled network device, regardless of underlying manufacturer or operating system.
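If you would rather compute the rate yourself than graph it, two readings of the counter taken a known interval apart are enough. The sketch below assumes you have already fetched the two values (for example with net-snmp's snmpget command); the function name and sample numbers are hypothetical. It also assumes the counter is 32 bits wide and wraps at most once between readings. Keep in mind that tcpPassiveOpens counts every inbound TCP connection on the host, so other services on the same server will inflate the figure.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: tcpPassiveOpens is a 32-bit counter


def connections_per_second(first_count: int, second_count: int,
                           interval_seconds: float) -> float:
    """Rate of new inbound TCP connections between two counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Modulo arithmetic absorbs a single counter wrap between the samples.
    delta = (second_count - first_count) % COUNTER32_MODULUS
    return delta / interval_seconds


# Hypothetical samples taken 300 seconds apart.
rate = connections_per_second(1_200_000, 1_217_400, 300)
print(f"{rate:.0f} connections per second")
```

Sampling every five minutes, as in this example, smooths out short bursts; a shorter interval gives a better view of the peak.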

Log Files

Another way to obtain the HTTP connection rate is to look at the log files. Count the number of HTML pages served in a day, ignoring images (unless images are typically served by themselves). They’re ignored because images and other in-page content are typically pulled over the same HTTP 1.1 connection that pulled the HTML page itself.

Take a site that experiences 5 million HTTP connections per day, which is a good amount of traffic. Dividing 5,000,000 by 24 hours per day, by 60 minutes per hour, and by 60 seconds per minute, we get a figure of about 58 hits per second on average.
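Getting the daily page count out of a log file can be automated. The following is a minimal sketch that assumes the server writes Common Log Format (or a superset of it) and that HTML pages can be recognized by their path suffix; the regular expression, the suffix list, and the sample lines are all illustrative assumptions to adapt to your own site.

```python
import re

# Matches the request line in Common Log Format, e.g. "GET /index.html HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')

# Assumption: these suffixes identify HTML pages on the site being measured.
PAGE_SUFFIXES = ("/", ".html", ".htm")


def count_page_requests(log_lines) -> int:
    """Count requests for HTML pages, skipping images and other in-page content."""
    pages = 0
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a recognizable request line
        path = match.group(1).split("?", 1)[0]  # drop any query string
        if path.lower().endswith(PAGE_SUFFIXES):
            pages += 1
    return pages


# Made-up log lines for illustration.
sample = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /logo.png HTTP/1.1" 200 5120',
    '192.0.2.7 - - [10/Oct/2023:13:55:40 +0000] "GET /news/?page=2 HTTP/1.1" 200 4810',
]
print(count_page_requests(sample))
```

Because the function accepts any iterable of lines, an open file object for one day's log can be passed to it directly.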

Because traffic tends to peak during certain periods and taper off during others, you’ll want to figure out that peak. To do this, a good rule of thumb is to multiply the average number of hits per second by 3 to get a good estimation of what the traffic will be like during the peak, which in our case is 174 hits per second.

Average connection rate = hits per day / 86,400 seconds per day

Peak connection rate = average connection rate x 3
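The same calculation as a short sketch, using the 5 million hits per day from the example. The function names are hypothetical, and the peak factor of 3 is the rule of thumb from above rather than a measured value.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(hits_per_day: float) -> float:
    """Average connections per second over a full day."""
    return hits_per_day / SECONDS_PER_DAY


def peak_rate(hits_per_day: float, peak_factor: float = 3.0) -> float:
    """Estimated peak connections per second (rule-of-thumb multiplier)."""
    return average_rate(hits_per_day) * peak_factor


hits = 5_000_000
print(f"average: {average_rate(hits):.0f}/s, peak: {peak_rate(hits):.0f}/s")
```

If your traffic is concentrated in a few hours of the day, a larger peak factor may be more appropriate.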

These calculations are of course estimations, but they should get you in the same ballpark in terms of your performance requirements.

Gigabit Ethernet versus Fast Ethernet

Another consideration is whether to purchase a Gigabit Ethernet capable load balancer or a Fast Ethernet load balancer. Typically, the biggest reason to use Gigabit Ethernet isn’t to push hundreds of megabits per second; it’s to push 101 megabits per second or more. If you’re close to 100 Mbps, even at 60 Mbps, you may want to consider moving to a Gigabit load balancer to allow for growth, or splitting your site up among multiple 100 Mbps load balancers.

If you’re not sure what type of bandwidth your site will be expecting, you can simply take your connection rate and multiply it by the typical page size. This will get you bandwidth in bytes per second (page size is typically measured in bytes, not bits), so multiply that figure by 8 to get bits per second.

Also, many web log munging programs will tell you how much traffic has passed in a given day, so again, divide by 86400 seconds in a day, and multiply by 8 to get bits per second that way (this method is probably more accurate than the above).
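Both bandwidth estimates can be sketched in a few lines. The page size, daily byte total, and function names here are hypothetical; the 60 Mbps threshold is the rough "consider Gigabit" point mentioned earlier. Note that the daily-total method yields an average over the whole day, so you may still want to apply a peak multiplier to it.

```python
SECONDS_PER_DAY = 86_400


def mbps_from_rate(connections_per_second: float, page_bytes: float) -> float:
    """Bandwidth estimate from connection rate and typical page size in bytes."""
    return connections_per_second * page_bytes * 8 / 1_000_000


def mbps_from_daily_bytes(bytes_per_day: float) -> float:
    """Average bandwidth from the total bytes a log analyzer reports for one day."""
    return bytes_per_day / SECONDS_PER_DAY * 8 / 1_000_000


def consider_gigabit(estimated_mbps: float, threshold_mbps: float = 60.0) -> bool:
    """True when the estimate is close enough to 100 Mbps to plan for Gigabit."""
    return estimated_mbps >= threshold_mbps


# Hypothetical inputs: 174 connections/s at peak, 50,000-byte pages,
# and 216 GB transferred in a day.
peak_mbps = mbps_from_rate(174, 50_000)
daily_avg_mbps = mbps_from_daily_bytes(216_000_000_000)

print(f"peak estimate:  {peak_mbps:.1f} Mbps, gigabit? {consider_gigabit(peak_mbps)}")
print(f"daily average:  {daily_avg_mbps:.1f} Mbps, gigabit? {consider_gigabit(daily_avg_mbps)}")
```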

Load balancing performance doesn’t need to be a mystery, and speccing load balancing hardware can be done with confidence. With the concepts, calculations, and tools from this article, you should be able to make a much more informed decision with regard to your load balancing needs.

About tony

Tony is an IT instructor, pilot, scuba diver, marathon runner, and vegan.